Hi,
I know it's quite common to use BIT fields for boolean values.
CREATE TABLE tblTest (
door BIT DEFAULT 0,
accept BIT DEFAULT 0
)
instead of:
CREATE TABLE tblTest (
door SET('close','open') DEFAULT 'close',
accept SET('no','yes') DEFAULT 'no'
)
(By the way, I use MSSQL and MySQL, not sure if I'm using the right
datatypes for MSSQL)
I reckon the first uses less storage space, but the meaning of the values in
the latter is more unanimous.
So what's best practice? I'm tempted to use the latter, but I almost always
see the first used everywhere.
I'm definitely interested in what CELKO has to say about this.
LisaMy opinion, FWIW
-Comarison operators with bit datatypes will be faster
-You don't have to worry about differences in collations when comparing
-More efficient storage on disk
-You could always add the unanimous-ness (which I'm positive is not a word!)
in a select that extracts the data out of these fields like
select case when door = 0 then 'closed' else 'open' end from tblTest
Proper T-SQL syntax for the character-based table would be
CREATE TABLE tblTest (
door varchar(5) CHECK (door IN ('close','open')) DEFAULT 'close',
accept varchar(5) CHECK (accept IN ('no','yes')) DEFAULT 'no'
)
--
"Lisa Pearlson" wrote:
> Hi,
> I know it's quite common to use BIT fields for boolean values.
> CREATE TABLE tblTest (
> door BIT DEFAULT 0,
> accept BIT DEFAULT 0
> )
> instead of:
> CREATE TABLE tblTest (
> door SET('close','open') DEFAULT 'close',
> accept SET('no','yes') DEFAULT 'no'
> )
> (By the way, I use MSSQL and MySQL, not sure if I'm using the right
> datatypes for MSSQL)
> I reckon the first uses less storage space, but the meaning of the values
in
> the latter is more unanimous.
> So what's best practice? I'm tempted to use the latter, but I almost alway
s
> see the first used everywhere.
> I'm definitely interested in what CELKO has to say about this.
> Lisa
>
>|||Here we use the latter one cuz too many different programmers working off
the same database and each programmer has their own programs that they are
responsible for.
I prefer to use the first one to save size.
"Lisa Pearlson" <no@.spam.plz> wrote in message
news:O1u6wZDKGHA.3984@.TK2MSFTNGP14.phx.gbl...
> Hi,
> I know it's quite common to use BIT fields for boolean values.
> CREATE TABLE tblTest (
> door BIT DEFAULT 0,
> accept BIT DEFAULT 0
> )
> instead of:
> CREATE TABLE tblTest (
> door SET('close','open') DEFAULT 'close',
> accept SET('no','yes') DEFAULT 'no'
> )
> (By the way, I use MSSQL and MySQL, not sure if I'm using the right
> datatypes for MSSQL)
> I reckon the first uses less storage space, but the meaning of the values
> in the latter is more unanimous.
> So what's best practice? I'm tempted to use the latter, but I almost
> always see the first used everywhere.
> I'm definitely interested in what CELKO has to say about this.
> Lisa
>|||Lisa Pearlson wrote:
> Hi,
> I know it's quite common to use BIT fields for boolean values.
> CREATE TABLE tblTest (
> door BIT DEFAULT 0,
> accept BIT DEFAULT 0
> )
> instead of:
> CREATE TABLE tblTest (
> door SET('close','open') DEFAULT 'close',
> accept SET('no','yes') DEFAULT 'no'
> )
> (By the way, I use MSSQL and MySQL, not sure if I'm using the right
> datatypes for MSSQL)
> I reckon the first uses less storage space, but the meaning of the values
in
> the latter is more unanimous.
> So what's best practice? I'm tempted to use the latter, but I almost alway
s
> see the first used everywhere.
> I'm definitely interested in what CELKO has to say about this.
> Lisa
My main objection to BIT is that its behaviour is too strange and
counter-intuitive. For example, can you predict or explain the results
of the following INSERT statement and the SELECT statement?
CREATE TABLE T (x BIT NOT NULL);
INSERT INTO T VALUES (2);
SELECT MAX(x) FROM T;
In most cases BIT's saving on storage is probably modest. I prefer
concise, readable status codes (CHAR(1) for example) and when I need to
add a third status I don't need to change the datatype I just change a
CHECK constraint.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Like David, I usually use a 1 character flag. In my experience, when I
initially think I have a binary status, it is really a multi-varied
flag. In other words, as soon as I code to tell if the door is open or
closed, someone else wants to know if it's locked.
Payson
Lisa Pearlson wrote:
> Hi,
> I know it's quite common to use BIT fields for boolean values.
> CREATE TABLE tblTest (
> door BIT DEFAULT 0,
> accept BIT DEFAULT 0
> )
> instead of:
> CREATE TABLE tblTest (
> door SET('close','open') DEFAULT 'close',
> accept SET('no','yes') DEFAULT 'no'
> )
> (By the way, I use MSSQL and MySQL, not sure if I'm using the right
> datatypes for MSSQL)
> I reckon the first uses less storage space, but the meaning of the values
in
> the latter is more unanimous.
> So what's best practice? I'm tempted to use the latter, but I almost alway
s
> see the first used everywhere.
> I'm definitely interested in what CELKO has to say about this.
> Lisa|||> In other words, as soon as I code to tell if the door is open or
> closed, someone else wants to know if it's locked.
I love the analogy. :-)
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Payson" <payson_b@.hotmail.com> wrote in message
news:1138914380.158598.148590@.f14g2000cwb.googlegroups.com...
> Like David, I usually use a 1 character flag. In my experience, when I
> initially think I have a binary status, it is really a multi-varied
> flag. In other words, as soon as I code to tell if the door is open or
> closed, someone else wants to know if it's locked.
> Payson
> Lisa Pearlson wrote:
>|||Lisa--
Great question. One way of clearing this up is by naming your columns in a
way where it won't be ambiguous. If you know that a Door only has two
states, Open and Closed, the meaning will be clear with a BIT datatype if yo
u
use a convention like "IsDoorOpen". That way, it's obvious what "1" and "0"
mean. If it's possible that a door will have more than 2 states in the
future, use a single character {[O]pen, [C]losed, [A]jar} to represent the
state of the door. In that case, you might call your column DoorStatus, or
DoorState... Likewise, you can name your "Accept" column "IsAccepted" and
clarify the meaning there. At my company, we use "Is" to prefix just about
all of the boolean values that we put in our code for just this reason.
HTH
-Dave Markle
"Lisa Pearlson" wrote:
> Hi,
> I know it's quite common to use BIT fields for boolean values.
> CREATE TABLE tblTest (
> door BIT DEFAULT 0,
> accept BIT DEFAULT 0
> )
> instead of:
> CREATE TABLE tblTest (
> door SET('close','open') DEFAULT 'close',
> accept SET('no','yes') DEFAULT 'no'
> )
> (By the way, I use MSSQL and MySQL, not sure if I'm using the right
> datatypes for MSSQL)
> I reckon the first uses less storage space, but the meaning of the values
in
> the latter is more unanimous.
> So what's best practice? I'm tempted to use the latter, but I almost alway
s
> see the first used everywhere.
> I'm definitely interested in what CELKO has to say about this.
> Lisa
>
>|||>> In other words, as soon as I code to tell if the door is open or closed,
someone else wants to know if it's locked.
I love the analogy. :-) <
Me, too!! I think I will steal it in the next edition of one of my
books! Hey, Imitation is the sincerest form of flattery; Plagiarism is
the sincerest form of imitation!|||Stop writing code as if you were an assembly language programmer in
1957.
Machine level things like a BIT or BYTE datatype have no place in a
high level language like SQL. SQL is a high level language; it is
abstract and defined without regard to PHYSICAL implementation. This
basic principle of data modeling is called data abstraction.
Bits and Bytes are the <i>lowest<i> units of hardware-specific,
physical implementation you can get. Are you on a high-end or low-end
machine? Does the machine have 8, 16, 32, 64, or 128 bit words? Twos
complement or ones complement math? Hey, the standards allow decimal
machines, so bits do not exist at all!! What about NULLs? To be an
SQL datatype, you have to have NULLs, so what is a NULL bit? By
definition a bit, is on or off and has no NULL.
What does the implementation of the host languages do with bits? Did
you know that +1, +0, -0 and -1 are all used for BOOLEANs, but not
consistently (look at C# and VB from the same vendor)? That means
<i>all<i> the host languages -- present, future and not-yet-defined.
Surely, no good programmer would ever write non-portable code by
getting to such a low level as bit fiddling!!
There are two situations in practice. Either the bits are individual
attributes or they are used as a vector to represent a single
attribute. In the case of a single attribute, the encoding is limited
to two values, which do not port to host languages or other SQLs,
cannot be easily understood by an end user, and which cannot be
expanded. Use CHAR(1) which will move.
In the second case what some Newbies, who are still thinking in terms
of second and third generation programming languages or even punch
cards, do is build a vector for a series of "yes/no" status codes,
failing to see the status vector as a single attribute. Did you ever
play the children's game "20 Questions" when you were young? Bingo!!
Imagine you have six components for a loan approval, so you allocate
bits in your second generation model of the world. You have 64 possible
vectors, but only 5 of them are valid (i.e. you cannot be rejected for
bankruptcy and still have good credit). For your data integrity, you
can:
1) Ignore the problem. This is actually what <i>most<i> newbies do.
2) Write elaborate CHECK() constraints with user defined functions or
proprietary bit level library functions that cannot port and that run
like cold glue.
Now we add a 7-th condition to the vector -- which end does it go on?
Why? How did you get it in the right place on all the possible
hardware that it will ever use? Did all the code that references a bit
in a word by its position do it right after the change?
You need to sit down and think about how to design an encoding of the
data that is high level, general enough to expand, abstract and
portable. For example, is that loan approval a hierarchical code?
concatenation code? vector code? etc? Did you provide codes for
unknown, missing and N/A values? It is not easy to design such things!
Get a copy of SQL PROGRAMMING STYLE and look at the chapters on design
encoding schemes.|||An excellent post Celko. I wish more of your posts were like this.
i.e. You focused more on the solution and the reasons behind it than
insulting folks who are doing it wrong.
When you make posts like this one we can all learn a little bit, without
being offended in the process.
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1139090746.005907.99750@.g14g2000cwa.googlegroups.com...
> Stop writing code as if you were an assembly language programmer in
> 1957.
> Machine level things like a BIT or BYTE datatype have no place in a
> high level language like SQL. SQL is a high level language; it is
> abstract and defined without regard to PHYSICAL implementation. This
> basic principle of data modeling is called data abstraction.
> Bits and Bytes are the <i>lowest<i> units of hardware-specific,
> physical implementation you can get. Are you on a high-end or low-end
> machine? Does the machine have 8, 16, 32, 64, or 128 bit words? Twos
> complement or ones complement math? Hey, the standards allow decimal
> machines, so bits do not exist at all!! What about NULLs? To be an
> SQL datatype, you have to have NULLs, so what is a NULL bit? By
> definition a bit, is on or off and has no NULL.
> What does the implementation of the host languages do with bits? Did
> you know that +1, +0, -0 and -1 are all used for BOOLEANs, but not
> consistently (look at C# and VB from the same vendor)? That means
> <i>all<i> the host languages -- present, future and not-yet-defined.
> Surely, no good programmer would ever write non-portable code by
> getting to such a low level as bit fiddling!!
> There are two situations in practice. Either the bits are individual
> attributes or they are used as a vector to represent a single
> attribute. In the case of a single attribute, the encoding is limited
> to two values, which do not port to host languages or other SQLs,
> cannot be easily understood by an end user, and which cannot be
> expanded. Use CHAR(1) which will move.
> In the second case what some Newbies, who are still thinking in terms
> of second and third generation programming languages or even punch
> cards, do is build a vector for a series of "yes/no" status codes,
> failing to see the status vector as a single attribute. Did you ever
> play the children's game "20 Questions" when you were young? Bingo!!
> Imagine you have six components for a loan approval, so you allocate
> bits in your second generation model of the world. You have 64 possible
> vectors, but only 5 of them are valid (i.e. you cannot be rejected for
> bankruptcy and still have good credit). For your data integrity, you
> can:
> 1) Ignore the problem. This is actually what <i>most<i> newbies do.
> 2) Write elaborate CHECK() constraints with user defined functions or
> proprietary bit level library functions that cannot port and that run
> like cold glue.
> Now we add a 7-th condition to the vector -- which end does it go on?
> Why? How did you get it in the right place on all the possible
> hardware that it will ever use? Did all the code that references a bit
> in a word by its position do it right after the change?
> You need to sit down and think about how to design an encoding of the
> data that is high level, general enough to expand, abstract and
> portable. For example, is that loan approval a hierarchical code?
> concatenation code? vector code? etc? Did you provide codes for
> unknown, missing and N/A values? It is not easy to design such things!
> Get a copy of SQL PROGRAMMING STYLE and look at the chapters on design
> encoding schemes.
>
Thursday, March 8, 2012
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment