Hi,
I wondered if anyone had any thoughts on the best way to go about
splitting up a san array for a 500gb fact table. Its going to need to
be partitioned to allow for the overnight processing to complete on
time but what is the best way to split the san array for it.
I have about 10-13 san disks available for the table which leaves me
enough space on the other disks for the other database objects and
tempdb, logs etc.
The table will be partitioned into 13 logical weeks but would it be
best to allocate one disk per partition or have a 13 disk raid group
and put all 13 partitions on that and have it striped?
Any thoughts?
Thanks
Ian.
I am *far* from an expert, but here's my thoughts...
For having each partition on a separate disk or spindle, the question is are
you going to be using that partitioned table in parallel? Meaning, are you
going to be accessing or modifying several, if not all 13 weeks, at the same
time? If so, then it would make sense (depending on the processing power of
the SAN - EMC's DMX would be able to handle this all in parallel) to
separate the partitions into distinct spindles.
If you're not sure about parallel operations, then I would probably through
them into a JBOD (Just a Bunch Of Disks) set up like your second
alternative.
Here's my school of thought on this: You separate your TEMPDB (Very
important in SQL Server 2005), your LOGS, your INDEXES, and your TABLES onto
separate sets of spindles. That's usually a good starting point for an EMC
type of SAN. From there, you would then get into further tuning to see if
you would receive any additional benefit of separating objects (Tables,
Partitions, Indexes, etc) onto distinct spindles... remember you're
increasing management of the disk system when you do that, so it's helpful
to determine the benefit, if any.
FYI, I originated from the Oracle School of Thought, so it might clash with
some SQL Server admins ;)
JASON
"ianwr" <ianwrigglesworth@.yahoo.co.uk> wrote in message
news:1186730967.614276.138730@.z24g2000prh.googlegr oups.com...
> Hi,
> I wondered if anyone had any thoughts on the best way to go about
> splitting up a san array for a 500gb fact table. Its going to need to
> be partitioned to allow for the overnight processing to complete on
> time but what is the best way to split the san array for it.
> I have about 10-13 san disks available for the table which leaves me
> enough space on the other disks for the other database objects and
> tempdb, logs etc.
> The table will be partitioned into 13 logical weeks but would it be
> best to allocate one disk per partition or have a 13 disk raid group
> and put all 13 partitions on that and have it striped?
> Any thoughts?
> Thanks
> Ian.
>
|||Hi Ian,
thanks for your question around distributing data across spindles on a SAN.
Based on our experiences with many SQL 2005 data warehousing customers, I
would encourage you to distribute your data across all spindles. This should
give you good IO parallelism in case your query touches only one partition,
and it should also give you similarly high parallelism for a query that
touches many partitions. Depending on how large your data in a single
partition is and how many cores you have in your system, you want to avoid
cases where you exercise only one spindle and have many cores idle waiting
for data from the IO subsystem.
As Jason pointed out, it certainly makes sense to keep log and tempdb on
separate sets of spindles. You might want to experiment with index and data
on the same set of spindles. Usually, this already works sufficiently well as
compared to separate index and data. AS always, further tuning may be needed
depending on the characteristics of your workload.
Hope this makes sense and helps you with your SAN configuration.
Best regards,
Torsten Grabs
Program Manager
Microsoft SQL Server Query Processor
"Jason Fay" wrote:
> I am *far* from an expert, but here's my thoughts...
> For having each partition on a separate disk or spindle, the question is are
> you going to be using that partitioned table in parallel? Meaning, are you
> going to be accessing or modifying several, if not all 13 weeks, at the same
> time? If so, then it would make sense (depending on the processing power of
> the SAN - EMC's DMX would be able to handle this all in parallel) to
> separate the partitions into distinct spindles.
> If you're not sure about parallel operations, then I would probably through
> them into a JBOD (Just a Bunch Of Disks) set up like your second
> alternative.
> Here's my school of thought on this: You separate your TEMPDB (Very
> important in SQL Server 2005), your LOGS, your INDEXES, and your TABLES onto
> separate sets of spindles. That's usually a good starting point for an EMC
> type of SAN. From there, you would then get into further tuning to see if
> you would receive any additional benefit of separating objects (Tables,
> Partitions, Indexes, etc) onto distinct spindles... remember you're
> increasing management of the disk system when you do that, so it's helpful
> to determine the benefit, if any.
> FYI, I originated from the Oracle School of Thought, so it might clash with
> some SQL Server admins ;)
> --
> JASON
>
> "ianwr" <ianwrigglesworth@.yahoo.co.uk> wrote in message
> news:1186730967.614276.138730@.z24g2000prh.googlegr oups.com...
>
>
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment