
Friday, February 24, 2012

Best approach to encrypt data?

Hi,

I want to encrypt certain data, such as passwords, SSNs, and credit card info, before saving it in the database. This encrypted data should also be queryable with standard SQL statements, for example:

select * from users where userid=454 and password = 'encrypted data'

The encryption mechanism could live in a .NET application. The code that does the encryption/decryption should also be protected, so that it doesn't work if it falls into the wrong hands.

Can anyone suggest the best way to accomplish this?

thanks,
dapi

You can start with the following links

http://www.microsoft.com/technet/prodtechnol/sql/2005/sqlencryption.mspx

http://articles.techrepublic.com.com/5100-22-5083541.html

Hope this helps

|||

I would suggest reading "Writing Secure Code", Chapter 6. It has a number of examples of how to encrypt data on a Windows platform, and where people make common mistakes. The book is fairly cheap ($0.10 per page) and probably available at the library. I understand C# has a rather extensive crypto library, but knowing how and why is probably better than just "plugging and chugging".
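A further note on the password column specifically: since the example query compares with simple equality, a one-way hash (rather than reversible encryption) is usually the better fit - you store the hash, then hash the login attempt and compare. Below is a minimal sketch using PBKDF2. It is in Java rather than .NET (the rough .NET equivalent is Rfc2898DeriveBytes), and the class name, iteration count, and fixed demo salt are illustrative only - in practice the salt should be random per user and stored alongside the hash.

```java
import java.security.spec.KeySpec;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordHash {
    // Derive a 256-bit hash from a password and salt using PBKDF2.
    // Iteration count and salt handling here are for illustration only.
    static String hash(String password, byte[] salt) throws Exception {
        KeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 10000, 256);
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        byte[] derived = f.generateSecret(spec).getEncoded();
        return Base64.getEncoder().encodeToString(derived);
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = "fixed-demo-salt".getBytes(); // demo only; use SecureRandom per user
        String stored = hash("s3cret", salt);
        // Login check: recompute with the same salt and compare.
        System.out.println(stored.equals(hash("s3cret", salt)));  // true
        System.out.println(stored.equals(hash("wrong", salt)));   // false
    }
}
```

The Base64 string can be stored in the password column and compared with a plain equality test in the WHERE clause, as in the original query.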

Hope that helps,

John

Thursday, February 16, 2012

Behavior question about updateable resultsets...

I have a question regarding a certain behavior of updateable
resultsets. If I update a column using any of the updateXXX methods and
then try to use the getXXX methods from the same column to see if it
updated the results locally and not on the server, I get the same old
value. I have to call updateRow() but that updates the underlying
database and still gives me the old value until I execute the same
query again and get a new resultset. Maybe the code below will clarify
my question:
Connection con = null;
Statement stmt;
ResultSet rst;
Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
System.out.println("Getting connection.");
con = DriverManager.getConnection(url);
System.out.println("Connection successful.");
String st = "select age,sname,snum FROM student;";
stmt =
con.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
rst = stmt.executeQuery(st);
rst.last();
System.out.print(rst.getInt(1)+" ");
System.out.print(rst.getString(2)+" ");
System.out.print(rst.getLong(3)+"\n");
rst.updateInt(1,23);
rst.updateRow();
System.out.print(rst.getInt(1)+" ");
System.out.print(rst.getString(2)+" ");
System.out.print(rst.getLong(3)+"\n");
The output is:
Getting connection.
Connection successful.
25 Edward Baker 578875478
25 Edward Baker 578875478
If I run the same code again, I get:
Getting connection.
Connection successful.
23 Edward Baker 578875478
23 Edward Baker 578875478
Any/all help is appreciated
Thanks
Devansh Dhutia
University of Iowa
This is a bug and it does not have a trivial fix. I would like to encourage
you to file this using the product feedback website (below).
The problem here is that there are two mutually exclusive places where
column values transit through the driver. The first, used only by getters,
is the columns array (which lives on the statement). The second, used
only by setters, is the colParam array (which also lives on the
statement). The columns array is read-only; the colParam array is
write-only...
Note that the JDBC spec provides (in section 27.1.22, p. 718 - 719 JDBC API
Tutorial and Reference, Third edition, (Fisher, Ellis, Bruce)) that a result
set's own updates need not be visible to it. Obviously we would not like
this to be the default behavior but it is going to take a lot of work and it
would help to have customer feedback that clarified why this behavior should
be changed.
Entering a bug:
Go to http://lab.msdn.microsoft.com/produc...k/default.aspx
Product/Technology:
SQL Server 2005
Category:
JDBC Driver
Make sure to add JDBC SqlServer 2005 to the bug title.
Angel Saenz-Badillos [MS] DataWorks
This posting is provided "AS IS", with no warranties, and confers no
rights. Please do not send email directly to this alias.
This alias is for newsgroup purposes only.
I am now blogging: http://weblogs.asp.net/angelsb/
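One possible workaround (a sketch only - it depends on the driver actually supporting refreshRow() for sensitive, updatable result sets, which is not guaranteed) is to re-fetch the current row after updateRow(), so the getters read fresh values without re-running the whole query:

```java
rst.updateInt(1, 23);
rst.updateRow();    // push the change to the database
rst.refreshRow();   // re-fetch the current row from the server
System.out.print(rst.getInt(1) + " ");  // may now show 23, driver permitting
```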

behavior of SQL on joined queries

Hi all,

Currently our product has a setup that stores information about
transactions in a transaction table. Additionally, certain transactions
pertain to specific people, and extra information is stored in another
table. So for good or ill, things look like this right now:

create table TransactionHistory (
    TrnID int identity (1,1),
    TrnDT datetime,
    --other information about a basic transaction goes here.
    --All transactions have this info
    Primary Key Clustered (TrnID)
)

Create Index TrnDTIndex on TransactionHistory(TrnDT)

create table PersonTransactionHistory (
    TrnID int,
    PersonID int,
    --extended data pertaining only to "person" transactions goes
    --here. Only Person transactions have this
    Primary Key Clustered (TrnID),
    Foreign Key (TrnID) references TransactionHistory (TrnID)
)

Create Index TrnPersonIDIndex on PersonTransactionHistory(PersonID)

A query about a group of people over a certain date range might fetch
information like so:

select * from TransactionHistory TH
inner join PersonTransactionHistory PTH
on TH.TrnID = PTH.TrnID
where PTH.PersonID in some criteria
and TH.TrnDT between some date and some date

In my experience, this poses a real problem when trying to run queries
that use both date and PersonID criteria. If my guess is correct, this
is because SQL Server is forced to do one of two things:

1 - Use TrnPersonIDIndex to find all transactions which match the person
criteria, then for each do a lookup in the PersonTransactionHistory to
fetch the TrnID and subsequently do a lookup of the TrnID in the clustered
index of the TransactionHistory Table, and finally determine if a given
transaction also matches the date time criteria.

2 - Use TrnDTIndex to find all transactions matching the date criteria,
and then perform lookups similar to the above, except for personID instead
of datetime.

Compounding this is my suspicion (based on comparing performance when
I specify which indexes to use in the query vs. when I let SQL Server
decide itself) that SQL Server sometimes chooses a very suboptimal course. (Of
course, sometimes it chooses a better course than I do - the point is I want
it to always be able to pick a good enough course that I don't have
to bother specifying.) Perhaps the table layout is making it difficult for
SQL Server to find a good query plan in all cases.

Basically I'm trying to determine ways to improve our table design here to
make reporting easier, as this gets painful when running reports for
large groups of people during large date ranges. I see a few options based
on my above hypothesis, and am looking for comments and/or corrections.

1 - Add the TrnDT column to the PersonTransactionHistory Table as
well. Then create a foreign key relationship of PersonTransactionHistory
(TrnID, TrnDT) references TransactionHistory (TrnID, TrnDT) and create
indexes on PersonTransactionHistory with (TrnDT, PersonID) and
(PersonID, TrnDT). This seems like it would let SQL Server make
much more efficient execution plans. However, I am unsure if SQL server
can leverage the FK on TrnDT to use those new indexes if I give it a query
like:

select * from TransactionHistory TH
inner join PersonTransactionHistory PTH
on TH.TrnID = PTH.TrnID
where PTH.PersonID in some criteria
and TH.TrnDT between some date and some date

The trick being that SQL Server would know that it can use PTH.TrnDT and
TH.TrnDT interchangeably because of the foreign key (this would support all
the preexisting queries that explicitly named TH.TrnDT - any that
didn't explicitly specify the table would now have ambiguous column
names...)
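Concretely, option 1 might look like the following DDL (index names are illustrative, and note that SQL Server requires a unique constraint or unique index on TransactionHistory (TrnID, TrnDT) before it will accept the composite foreign key):

```sql
create table PersonTransactionHistory (
    TrnID int,
    TrnDT datetime,
    PersonID int,
    Primary Key Clustered (TrnID),
    Foreign Key (TrnID, TrnDT) references TransactionHistory (TrnID, TrnDT)
)

Create Index TrnDTPersonIDIndex on PersonTransactionHistory(TrnDT, PersonID)
Create Index PersonIDTrnDTIndex on PersonTransactionHistory(PersonID, TrnDT)
```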

2 - Just coalesce the two tables into one. The original intent was to save
space by not requiring extra columns about Persons for all rows, many of
which did not have anything to do with a particular person (for instance a
contact point going active). In my experience with our product, the end
user's decisions about archiving and purging have a much bigger impact
than this, so in my opinion efficient querying is more important than
space. However I'm not sure if this is an elegant solution either. It also
might require more changes to existing code, although the use of views
might help.

We also run reports based on other criteria (columns I replaced with
comments above) but none of them are as problematic as the situation
above. However, it seems that if I can understand the best way to solve
this, I will be able to leverage that approach if other types of reports
become problematic.

Any opinions would be greatly appreciated. Also any references to good
sources regarding table and index design would be helpful as well (online
or offline references...)

thanks,
Dave

|||

Metal Dave (metal@spam.spam) writes:
> create table TransactionHistory (
> TrnID int identity (1,1),
> TrnDT datetime,
> --other information about a basic transaction goes here.
> --All transactions have this info
> Primary Key Clustered (TrnID)
> )
> Create Index TrnDTIndex on TransactionHistory(TrnDT)
> create table PersonTransactionHistory (
> TrnID int,
> PersonID int,
> --extended data pertaining only to "person" transactions goes
> --here. only Person transactions have this
> Primary Key Clustered(TrnID),
> Foreign Key (TrnID) references TransactionHistory (TrnID)
> )
> Create Index TrnPersonIDIndex on PersonTransactionHistory(Person)

Given your query, it could be a good idea to have the clustered index
on TrnDT and PersonID instead. The main problem now with the queries
is that SQL Server will have to make a choice between Index Seek +
Bookmark Lookup on the one hand, and Clustered Index Scan on the other.
This is a guessing game that does not always end up the best way.

Of course, you may have other queries that are best off with clustering
on the Pkey, but this does not seem likely. (Insertion may however
benefit from a monotonically increasing index. A clustered index on
PersonID may cause fragmentation.)
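A sketch of that change in DDL (index names are illustrative, not from the thread; the existing clustered primary keys would first have to be recreated as nonclustered):

```sql
-- TransactionHistory: cluster on the date instead of the primary key
Create Clustered Index TrnDTClustered on TransactionHistory(TrnDT)

-- PersonTransactionHistory: cluster on PersonID
Create Clustered Index PersonIDClustered on PersonTransactionHistory(PersonID)
```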

> 1 - Add the TrnDT column to the PersonTransactionHistory Table as
> well. Then create a foreign key relationship of PersonTransactionHistory
> (TrnID, TrnDT) references TransactionHistory (TrnID, TrnDT) and create
> indexes on PersonTransactionHistory with (TrnDT, PersonID) and
> (PersonID, TrnDT). This seems like it would let SQL Server make
> much more efficient execution plans. However, I am unsure if SQL server
> can leverage the FK on TrnDT to use those new indexes if I give it a query
> like:
> select * from TransactionHistory TH
> inner join PersonTransactionHistory PTH
> on TH.TrnID = PTH.TrnID
> where PTH.PersonID in some criteria
> and TH.TrnDT between some date and some date

Well, take a copy of the database and try it!

(But first try changing the clustered index.)

> 2 - Just coalesce the two tables into one. The original intent was to save
> space by not requiring extra columns about Persons for all rows, many of
> which did not have anything to do with a particular person (for instance a
> contact point going active).

Depends a little on the ratio. If the PersonTransactionHistory is 50%
of all rows in the main table, collapsing into one is probably the best.
If it's 5%, I don't think it is.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

|||

On Tue, 26 Oct 2004, Erland Sommarskog wrote:

> Given your query, it could be a good idea to have the clustered index
> on TrnDT and PersonID instead. The main problem now with the queries
> is that SQL Server will have to make a choice between Index Seek +
> Bookmark Lookup on the one hand, and Clustered Index Scan on the other.
> This is a guessing game that does not always end up the best way.
> Of course, you may have other queries that are best off with clustering
> on the Pkey, but this does not seem likely. (Insertion may however
> benefit from a montonically increasing index. A clustered index on
> PersonID may cause fragmentation.)

My intuition agrees with you regarding the index in this case. I'm
pretty sure the clustered bookmark scan kills us on many reports. However
I haven't looked with enough depth at the wide variety of queries we use
to know for sure where I should put the clustered index so I'm reserving
judgement for now. I'd also like to study a bit more first so that I
don't replace one hasty decision with another - it might solve an
individual problem but exacerbate others.

For instance, I think

select * from PersonTransactionHistory PTH
inner join TransactionHistory TH on PTH.TrnID = TH.TrnID
where PTH.PersonID = 12345

would be harmed by moving the TH clustered index from TH.TrnID to
TH.TrnDT, as it would now have to make the same lookup vs scan choice in
order to perform the join. Does that sound reasonable? And since it's
rare for us to access PTH without the inner join to TH, there are probably
many queries like this.

> > 1 - Add the TrnDT column to the PersonTransactionHistory Table as
> > well. Then create a foreign key relationship of PersonTransactionHistory
> > (TrnID, TrnDT) references TransactionHistory (TrnID, TrnDT) and create
> > indexes on PersonTransactionHistory with (TrnDT, PersonID) and
> > (PersonID, TrnDT). This seems like it would let SQL Server make
> > much more efficient execution plans. However, I am unsure if SQL server
> > can leverage the FK on TrnDT to use those new indexes if I give it a query
> > like:
> > select * from TransactionHistory TH
> > inner join PersonTransactionHistory PTH
> > on TH.TrnID = PTH.TrnID
> > where PTH.PersonID in some criteria
> > and TH.TrnDT between some date and some date
> Well, take a copy of the database and try it!

I appreciate the value of experimentation and normally would do that, but
if it didn't work, that wouldn't necessarily prove to me that I wasn't
simply doing something wrong, like not making the foreign key specific
enough or putting something in my query that made SQL Server ignore this
potentially valuable relationship. So I was basically wondering if there
were any good docs on what types of information SQL Server will and
will not leverage in its choices, or whether someone familiar with those
rules had some feedback off the top of their head.

> > 2 - Just coalesce the two tables into one. The original intent was to save
> > space by not requiring extra columns about Persons for all rows, many of
> > which did not have anything to do with a particular person (for instance a
> > contact point going active).
> Depends a little on the ration. If the PersonTransactionHistory is 50%
> of all rows in the main table, collapsing into one is probably the best.
> If it's 5%, I don't think it is.

It's probably between 20% and 40% depending on the particular
installation. Is your rationale that for 50% the space saved is
negligible whereas for 5% it is not? For me it's more about limiting
the changes to the client software (definitely keeping the tables
separate) vs. speeding up queries (possibly coalescing) rather than a space
consideration. I did a test once and recall discovering we took up nearly
as much or more space with our indexes as with our tables, so
coalescing might make a big space difference anyway. (This amount of index
space surprised me, but I'm not sure if there is a good rule of thumb for
how much space indexes should take.)

Rereading the post I probably should have just asked for good table design
references right up front. Any takers?

Thanks for the feedback.

Dave

|||

Metal Dave (metal@spam.spam) writes:
> For instance, I think
> select * from PersonTransactionHistory PTH
> inner join TransactionHistory TH on PTH.TrnID = TH.TrnID
> where PTH.PersonID = 12345
> would be harmed by moving the TH clustered index from TH.TrnID to
> TH.TrnDT, as it would now have to make the same lookup vs scan choice in
> order to perform the join. Does that make sound reasonable? And since it's
> rare for us to access PTH without the inner join to TH, there are probably
> many queries like this.

Let's assume for the example that the clustered index in PTH is on PersonID.
Then the join against TH on TrnID will be akin to Index Seek + Bookmark
Lookup, no matter if the index on TrnID is clustered or not. In both
cases you would expect a plan with a Nested Loop join, which means that
for each row in PTH you look up a row in TH. The only difference if the index
on TrnID is non-clustered is that you will get a few more reads for
each access. Which indeed is not negligible, since it multiplies with the
number of rows for PersonID.

And just like "SELECT * FROM tbl WHERE nonclusteredcol = @val" has a
choice between index seek and scan, so does this query. Rather than
nested loop, the optimizer could go for a hash or merge join, which would
mean a single scan of TH. I would guess that the probability for this is
somewhat higher with a NC index on TrnID.

Of course, you can opt to change only PTH, if you like.

> I appreciate the value of experimentation and normally would do that but
> if it didn't work that wouldn't necesarily prove to me that I wasn't
> simply doing something wrong like not making the foreign key specific
> enough or putting something in my query which made SQL server ignore this
> potential valuable relationship. So I was basically wondering if there
> were any good docs regarding what types of information SQL Server will and
> will no leverage in its choices or whether someone familiar with those
> rules had some feedback off the top of their head.

SQL Server does look at constraints, but really how intelligent it is,
I have not dug into. Thus, my encouragement of experimentation.

> It's probably between 20% and 40% depending on the particular
> installation. It's your rationale that for 50% the space saved is
> negligible whereas for 5% is is not?

Actually, I was thinking more in terms of performance, but space and
performance are related. My idea was that with 50%, the space saved is not
worth the extra complexity, and performance may suffer. With 5%, you save a
lot of space, since PTH would be a small table.

Your concern of having to change the client is certainly not one to be
neglected, and if this is costly in development time, I don't think it's
worth it.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Beginners SqlDataSource SelectCommand Question

Hi folks,

I'm having problems using the SqlDataSource to return certain values from a SQL database. I have two DropDownLists. When the first DDL's selected index changes, I need to take its value (an int id), use it to look up a string in a different column of the table for that record, and then use that string to automatically select a value in the second DDL. I think my problem is that I'm not sure how to get the SelectCommand to actually return the value, let alone as a string!

Here is my code:

{
string selectedProject = ProjectDDL.SelectedValue;
SqlDataSource temp = new SqlDataSource();
temp.ConnectionString = rootWebConfig.ConnectionStrings.ConnectionStrings["DBConnectionString"].ToString();

temp.SelectParameters.Add("id", selectedProject);
temp.SelectCommand = "SELECT [username] FROM dbo.projects WHERE id = @id";

//temp.Select(); ?

//AssigneeField.SelectedValue = string returned from Select Statement!
}

I'd appreciate it if someone could point me in the right direction.

Thanks,

Ally

Hi,

you can use either a command object with a data adapter, or a data reader, to retrieve data from your command. Assuming you are only returning one string value, a data reader would be lighter on your system:

string result = "";

SqlCommand command = new SqlCommand(
    "SELECT CategoryID, CategoryName FROM dbo.Categories", connection);
command.Parameters.Add(...);
connection.Open();

SqlDataReader reader = command.ExecuteReader();
while (reader.Read())
{
    // do your processing with the data here
    result = reader.GetString(1); // CategoryName
}

reader.Close();
return result;

I'm sorry about any remaining syntax errors as I'm not C#-trained...
Hope this helps...
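Since only a single value is needed here, SqlCommand.ExecuteScalar() is an even shorter path. A sketch under the assumptions of the original post (the DBConnectionString name, dbo.projects query, and the ProjectDDL/AssigneeField controls all come from the question; error handling omitted):

```csharp
string connStr = ConfigurationManager
    .ConnectionStrings["DBConnectionString"].ConnectionString;

using (SqlConnection connection = new SqlConnection(connStr))
using (SqlCommand command = new SqlCommand(
    "SELECT [username] FROM dbo.projects WHERE id = @id", connection))
{
    command.Parameters.AddWithValue("@id", ProjectDDL.SelectedValue);
    connection.Open();

    // ExecuteScalar returns the first column of the first row, or null
    object result = command.ExecuteScalar();
    if (result != null)
        AssigneeField.SelectedValue = (string)result;
}
```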

|||

Excellent, exactly what I needed, thanks! I was having problems with SqlCommand.ExecuteReader() constantly throwing an exception:

System.InvalidOperationException: Invalid attempt to read when no data is present.

I'd accidentally omitted the initial SqlDataReader.Read() call, and fought with it for ages until I realised the reader doesn't move on to the first record until Read() is called for the first time - makes sense I guess! All working now :)

Thanks again,

Ally