Texas Christian University (TCU), a leading private university in Fort Worth, Texas, has implemented LifeKeeper® for Exchange, from SteelEye Technology Inc., to achieve high availability and disaster recovery assurance for the university’s email systems. Bryan Lucas, Server Administrator for TCU, will be available to answer questions regarding his organisation’s implementation at SteelEye’s booth (#307) at this week’s Microsoft Exchange Connections Conference in Orlando.
Texas Christian University is a leading private teaching and research institution located on a 260-acre campus five miles from the heart of downtown Fort Worth. With over 8,000 students pursuing 98 undergraduate majors and 20 graduate degrees in 59 areas, including 11 doctoral fields of study, TCU has an operating budget of approximately $230 million.
The business requirement
In 2005, TCU sought a mechanism to achieve redundancy for its email servers within “The Bunker,” the university IT team’s name for its standby data centre. TCU sought not only to protect against downtime by ensuring high availability on its Microsoft Exchange 2003 servers, but also to achieve comprehensive and rapid failover capabilities for disaster recovery scenarios.
TCU had long recognised the need to keep its email systems running in the event of a disaster. The trouble was that Microsoft Exchange’s native tools had been designed for clustering within a single data centre and were less than ideal for creating clusters across geographically separated sites. In 2003, with these concerns in mind, TCU set out to achieve high availability via replication over IP. In its first attempt, it implemented a combined software and hardware solution from Network Appliance. By early 2005, however, that solution had disappointed on several fronts, and TCU was forced to look for a replacement.
“It was an expensive solution that just didn’t work very well with Exchange 2000. We had real concerns about support from Microsoft as well as upgrading it to Exchange 2003. Its data replication mechanism didn’t handle multiple storage groups and databases efficiently. And the recovery wasn’t very clean, often taking as much as two hours to bring a recovery site live,” said Lucas. “You just don’t have that type of time when you’re talking about protecting against a disaster.”
By October 2005, TCU remained concerned about its ability to recover quickly from a disaster, a prospect that now seemed more real than ever given the havoc Hurricane Katrina had wreaked upon other southern universities in August of the same year. During his search for a better solution, Lucas caught wind of SteelEye’s LifeKeeper for Exchange while attending a training session for Microsoft Exchange 2003.
“I hadn’t previously seen a specific solution for our particular challenge, which we were referring to among colleagues at the time as ‘geographic dispersed clustering over IP’,” said Lucas. “When another student at my Exchange training class tipped me off about SteelEye, I practically ran to their website to check out LifeKeeper.”
At first, Lucas was guarded about the much lower price tag associated with LifeKeeper® for Exchange in comparison with TCU’s incumbent solution.
“We were looking at costs that were around a tenth of what we had put forward for what we had in place, so I wanted to get a chance to talk to SteelEye – LifeKeeper frankly sounded a little too good to be true,” noted Lucas. “We have a very complex environment that made me think we were pretty unique. But after spending some time on the phone with a very knowledgeable rep, I was really impressed by their expertise.”
Lucas and his team travelled to one of SteelEye’s facilities in South Carolina to witness LifeKeeper for Exchange work in a live environment on SteelEye’s own corporate email system.
“They let us rip it apart and put it together and it worked great,” said Lucas. “These guys were obviously gurus – they knew what they were talking about. You really do just load the software on top of what you’ve got running and there’s no hardware needed. We were ready to move on it.”
Once TCU had selected LifeKeeper, SteelEye sent a technician onsite for a planned three- to four-day engagement aimed at getting an initial server pair up and running.
“It was supposed to take four days, but in about two and a half, we had downloaded, installed, tested and created all the documentation for the system,” noted Lucas.
As a next step, Lucas and his team moved the Exchange services for their own department onto a LifeKeeper-based server for a 45-day test comprising about 40 user mailboxes. During that period, SteelEye helped review and optimise LifeKeeper for TCU’s environment.
Shortly after the initial successful test period, the team began the next phase of its pilot implementation, moving 200 more email accounts from its School of Business onto LifeKeeper-protected servers. For 30 days, the team alternated between running the production and failover servers live. Because the solution performed flawlessly, TCU then rolled it out for the entire faculty, and has made a June 2006 rollout to all of the university’s Exchange servers a top priority, including those that host the accounts of the more than 8,000 students on campus.
“A few years ago, I might have looked at this project and said the primary benefit was in reducing the cost for assuring continuity, because this was a highly cost-effective project,” said Lucas. “Since Katrina, it doesn’t really matter what it costs. We realise more than ever this is a critical service for us, and we almost can’t put a price tag on what it’s worth to have a solution like LifeKeeper that we can rely on.”
As the clearest evidence of the benefits, Lucas points to a critical success factor for disaster recovery: keeping downtime to an absolute minimum through swift failovers.
“We literally can failover an entire server with over 1,000 users in 15 minutes, and most of that is the time it takes Exchange itself to start up. LifeKeeper itself probably takes less than four minutes to complete its role in the failover. It’s just outrageous,” exclaimed Lucas. “We have a complicated environment with some unique network components and if a solution can work for us, it can work for nearly anybody.”