Saturday, February 25, 2012

Best design method to allow for "dynamic" records

First, a quick overview of my project. I'm designing a vehicle
tracking system that takes data from multiple types of GPS devices,
stores the data in a common database, and allows the user to view
device locations in real time or create reports on previous activity.
Currently we're only using one type of device, but I'm trying to
futureproof the app so I don't have to redesign it down the road.
Since the devices have different capabilities, I'm trying to come up
with the best method to store their data in a common format. For
example, say I have two devices, DeviceA and DeviceB. Both can report
their latitude, longitude, speed, heading, and a timestamp. DeviceA
can also report an odometer value, whether or not it has a GPS fix,
and telematics data. DeviceB cannot report those values. Along with
this data, every record will be tagged with address information on the
server side. Down the road, we may have DeviceC, DeviceD, etc with
new capabilities.
Now, the question is, how can I store all of this information in a way
that is simple to search/order/etc that is also "dynamic"? I've
thought about using three tables...one would contain "basic" record
info...latitude, longitude, speed, heading, timestamp, and a reason
code for why the record was sent (ignition on/off, start/stop, etc).
A second table would contain address information (street number,
street name, city, state, ZIP) and would be linked back to the basic
table. The third table would contain a single field with an XML
fragment detailing the rest of the information for that record
(odometer value, GPS fix status, telematics, etc).
Problem is, I can't find any method to allow me to search through XML
contained in a column other than a full text search. I also have the
problem of creating a result set containing all of the dynamic columns
(or selected ones only) to be returned to my ASP.NET application for
reporting.
Other methods I have thought about...storing all "extra" info in a
huge table with columns for each value. This would result in a lot of
wasted space (NULL values everywhere for devices that don't support
those features) and new columns would have to be added each time a new
device is supported (if new features are provided). Yet another
method...store values as key/value pairs in a separate table. This
would be rediculously slow though...I would have to generate some
heavy-duty dynamic SQL (crosstab query, basically) to pull all of the
values I need and dump them into a result set.
The data volume will be large (50K-60K records per day) and reporting
needs to be fairly responsive (web based reporting system...generate a
result set from SQL Server, typically using a date/time range, with
the required fields and pass back to data access layer for final
processing). So...out of the three methods I've thought about...any
comments or thoughts about which would be the best way to go? Are
there any other methods I should take into consideration? I know SQL
Server 2005 is supposed to have much improved XML support if I go that
route, but that's not an option at this point...I'm stuck with using
2000 for now.
Thanks for any help you can offer...it will be greatly appreciated!
Why dont you have a table like this for the extra info
(Vehicle ID, Capability ID, Value)
Vehicle ID + Capability ID will be the primary key. You can avoid the NULLs this way as you have rows for only those capabilities the vehicle has. If you think you dont add vehicles often to the system, then you can remove the VehicleID from the table and
create a table for every Vehicle with (CapabilityID and Value) as columns.
Is this a possible option?
Chandra
"Jeff L." wrote:

> First, a quick overview of my project. I'm designing a vehicle
> tracking system that takes data from multiple types of GPS devices,
> stores the data in a common database, and allows the user to view
> device locations in real time or create reports on previous activity.
> Currently we're only using one type of device, but I'm trying to
> futureproof the app so I don't have to redesign it down the road.
> Since the devices have different capabilities, I'm trying to come up
> with the best method to store their data in a common format. For
> example, say I have two devices, DeviceA and DeviceB. Both can report
> their latitude, longitude, speed, heading, and a timestamp. DeviceA
> can also report an odometer value, whether or not it has a GPS fix,
> and telematics data. DeviceB cannot report those values. Along with
> this data, every record will be tagged with address information on the
> server side. Down the road, we may have DeviceC, DeviceD, etc with
> new capabilities.
> Now, the question is, how can I store all of this information in a way
> that is simple to search/order/etc that is also "dynamic"? I've
> thought about using three tables...one would contain "basic" record
> info...latitude, longitude, speed, heading, timestamp, and a reason
> code for why the record was sent (ignition on/off, start/stop, etc).
> A second table would contain address information (street number,
> street name, city, state, ZIP) and would be linked back to the basic
> table. The third table would contain a single field with an XML
> fragment detailing the rest of the information for that record
> (odometer value, GPS fix status, telematics, etc).
> Problem is, I can't find any method to allow me to search through XML
> contained in a column other than a full text search. I also have the
> problem of creating a result set containing all of the dynamic columns
> (or selected ones only) to be returned to my ASP.NET application for
> reporting.
> Other methods I have thought about...storing all "extra" info in a
> huge table with columns for each value. This would result in a lot of
> wasted space (NULL values everywhere for devices that don't support
> those features) and new columns would have to be added each time a new
> device is supported (if new features are provided). Yet another
> method...store values as key/value pairs in a separate table. This
> would be rediculously slow though...I would have to generate some
> heavy-duty dynamic SQL (crosstab query, basically) to pull all of the
> values I need and dump them into a result set.
> The data volume will be large (50K-60K records per day) and reporting
> needs to be fairly responsive (web based reporting system...generate a
> result set from SQL Server, typically using a date/time range, with
> the required fields and pass back to data access layer for final
> processing). So...out of the three methods I've thought about...any
> comments or thoughts about which would be the best way to go? Are
> there any other methods I should take into consideration? I know SQL
> Server 2005 is supposed to have much improved XML support if I go that
> route, but that's not an option at this point...I'm stuck with using
> 2000 for now.
> Thanks for any help you can offer...it will be greatly appreciated!
>
|||Why dont you have a table like this for the extra info
(Vehicle ID, Capability ID, Value)
Vehicle ID + Capability ID will be the primary key. You can avoid the NULLs this way as you have rows for only those capabilities the vehicle has. If you think you dont add vehicles often to the system, then you can remove the VehicleID from the table and
create a table for every Vehicle with (CapabilityID and Value) as columns.
Is this a possible option?
Chandra
"Jeff L." wrote:

> First, a quick overview of my project. I'm designing a vehicle
> tracking system that takes data from multiple types of GPS devices,
> stores the data in a common database, and allows the user to view
> device locations in real time or create reports on previous activity.
> Currently we're only using one type of device, but I'm trying to
> futureproof the app so I don't have to redesign it down the road.
> Since the devices have different capabilities, I'm trying to come up
> with the best method to store their data in a common format. For
> example, say I have two devices, DeviceA and DeviceB. Both can report
> their latitude, longitude, speed, heading, and a timestamp. DeviceA
> can also report an odometer value, whether or not it has a GPS fix,
> and telematics data. DeviceB cannot report those values. Along with
> this data, every record will be tagged with address information on the
> server side. Down the road, we may have DeviceC, DeviceD, etc with
> new capabilities.
> Now, the question is, how can I store all of this information in a way
> that is simple to search/order/etc that is also "dynamic"? I've
> thought about using three tables...one would contain "basic" record
> info...latitude, longitude, speed, heading, timestamp, and a reason
> code for why the record was sent (ignition on/off, start/stop, etc).
> A second table would contain address information (street number,
> street name, city, state, ZIP) and would be linked back to the basic
> table. The third table would contain a single field with an XML
> fragment detailing the rest of the information for that record
> (odometer value, GPS fix status, telematics, etc).
> Problem is, I can't find any method to allow me to search through XML
> contained in a column other than a full text search. I also have the
> problem of creating a result set containing all of the dynamic columns
> (or selected ones only) to be returned to my ASP.NET application for
> reporting.
> Other methods I have thought about...storing all "extra" info in a
> huge table with columns for each value. This would result in a lot of
> wasted space (NULL values everywhere for devices that don't support
> those features) and new columns would have to be added each time a new
> device is supported (if new features are provided). Yet another
> method...store values as key/value pairs in a separate table. This
> would be rediculously slow though...I would have to generate some
> heavy-duty dynamic SQL (crosstab query, basically) to pull all of the
> values I need and dump them into a result set.
> The data volume will be large (50K-60K records per day) and reporting
> needs to be fairly responsive (web based reporting system...generate a
> result set from SQL Server, typically using a date/time range, with
> the required fields and pass back to data access layer for final
> processing). So...out of the three methods I've thought about...any
> comments or thoughts about which would be the best way to go? Are
> there any other methods I should take into consideration? I know SQL
> Server 2005 is supposed to have much improved XML support if I go that
> route, but that's not an option at this point...I'm stuck with using
> 2000 for now.
> Thanks for any help you can offer...it will be greatly appreciated!
>
|||Hi Jeff,
SQL 2005 in effect adds the ability to XQuery the data on a column; there is
the "XML" column type that allows this. Also the performance is quite good
since (from what I understand) data is optimized and indexed based on the
XSD information given when defining this field.
Of course this approach is not an option since Yukon is still many months
away.
See below on the poinst I suggest you to follow... and a mid-way solution
that can help!
Ciao,
Adriano

>...store values as key/value pairs in a separate table. This
> would be rediculously slow though...I would have to generate some
> heavy-duty dynamic SQL (crosstab query, basically) to pull all of the
> values I need and dump them into a result set.
Yes this kind of normalization is very good because you don't rely on actual
fields to store information, thus reducing space wasting.
Performance-speacking: SQL Server 2000 has a great set of features you can
use to improve querying speed.
1) Indexed views: You can perform aggregations on this table using Indexed
Views in order to have real-time view of your data in a "de-normalized" and
summarized way, where necessary.
2) Mantain only last-month data here so you have last-month queries quite
fast; move the oldest ones in a parallel "history "table. Report this table
only when explicitly requested by the user.

> The data volume will be large (50K-60K records per day) and reporting
> needs to be fairly responsive (web based reporting system...generate a
> result set from SQL Server, typically using a date/time range, with
> the required fields and pass back to data access layer for final
> processing). So...out of the three methods I've thought about...any
> comments or thoughts about which would be the best way to go? Are
> there any other methods I should take into consideration? I know SQL
> Server 2005 is supposed to have much improved XML support if I go that
> route, but that's not an option at this point...I'm stuck with using
> 2000 for now.
Another solution?
Build many tables, one for each "device type".. .where you can store "extra"
information without wasting space.

> Thanks for any help you can offer...it will be greatly appreciated!

No comments:

Post a Comment