Sunday, March 11, 2012

Best practice for DR/Standby server

We use SQL2000 Standard Edition SP3a. We have a
Production Server and a DR Server that is off site.
Currently I have SPs coded that do log shipping to the DR
Server every hour from the log backups on the production
server. Is this the best way to keep the DR Server up to
date? I've explored Replication, but it seemed much more
complex and structure changes don't get replicated which
means a new shipment of the entire Production DBs anytime
a structure change is made. It seems like Replication
would be tough to support for this. There is also
Hardware Replication with the Production SAN to the SAN at
the DR site. That's not something I'm very familar with
or how it works exactly. Is Hardware Replication a better
way to do it? I've been told that they may need real time
updates to the DR Server in the future and I'm not sure if
that is realistic or not. I still think log shipping is
best. I could take it down to shipping every 15 min if
needed. Could someone post some opinions/suggestions for
me.
Thanks,
Van
Van,
If your DR server is only used for DR purposes this is what I would do - you should have less than 30 secs down time if you try what I propose.
1) Place your SQL Server data and log files into two separate LUNs on the SAN at your primary site. Start SQL Server and make sure it's running normally. You may even want to try putting the OS on the SAN too - up to you.
2) Image the server using an imaging tool and restore the image to your secondary server and attach it to your secondary SAN at your DR site.
3) Stop SQL Server service on the secondary server.
4) Set up replication at the block level using SAN replication software - replicate the LUNs that the log and data files reside on. You will need to do a full mirror to start with, and then stream replication from then on.
5) When the primary site goes down, make sure that replication gets stopped (it will stop anyway), perform a failover on your SAN to the secondary site, then start SQL Server on your secondary server. You will find that SQL Server will start normally and
there will be much, much, much less data loss than log shipping. Make sure that replication cannot go from source --> secondary should the source come back online, otherwise you will get data file corruption.
6) Step 5 can be scripted so that you can just "push a button" to fail over. The entire process to fail over shouldn't take more than 30 seconds - 1 minute.
The failback process once the primary is available again is a reverse of the above but in a more controlled manner.
I have implemented this solution across the atlantic and it works very, very well indeed, as long as the process is scripted with all manual steps removed. Also, make sure a DR test is performed regularly. Say goodbye to log shipping.
Mark Allison, SQL Server MVP
http://www.markallison.co.uk

No comments:

Post a Comment