Optimising Data Protection In Linux Environments

May 10, 2006

The need to balance business continuity with cost has created a new set of storage best practices that blend backup and recovery with leading-edge data replication to accommodate varying information-availability requirements. Conventional data protection methodologies are often insufficient: traditional backup and recovery technologies can be unreliable and slow, and they require an inordinate amount of administration. By adding replication to data-intensive Linux environments, however, enterprises of all sizes can improve the efficiency, simplicity and effectiveness of their data protection strategies while substantially strengthening business operations.

Bytes, Blocks, Volumes or Files—Best-of-Class Data Protection

Replication lets users create real-time copies of critical data in multiple locations. Depending upon the environment and requirements, replication can be deployed in several different ways. Historically, enterprises with no tolerance for data gaps have invested millions of dollars in array-based replication facilities and dedicated high-speed lines to provide block-level replication between two geographically close sites. Although some of these solutions offered asynchronous modes, most relied on synchronous replication over the dedicated high-speed network. While this provided high levels of data availability, the costs associated with deployment and maintenance often exceeded typical budgets.

The next wave of replication facilities leveraged advances in networking and the proliferation of IP as the network protocol of choice. The first of these solutions came either as appliances or as volume-based replication facilities tied directly into proprietary file systems and volume managers. While the increased speed and dependability of standard IP provide enterprises with economical deployment options, certain configuration limitations of block-based appliances and volume-based replication reduce overall effectiveness. In fact, this approach requires proprietary hardware and software configuration at both locations. As a result, block-based solutions are best suited for infrastructure replication rather than protecting and ensuring the availability of applications and files.

Additionally, this approach takes little account of the relationship between physical storage and the type of data and application consuming it. Both are important considerations when managing business-critical applications and associated data. Fortunately, a new type of software-based solution has emerged that utilises IP lines and standard servers and disks to replicate relevant file changes at the byte level. This approach works across any distance, providing both the scalable performance and the application relevance required for replicating growing Linux environments. When implemented in real time, these asynchronous replication offerings can provide enterprises with minimal data gaps, cost-effective disaster recovery, and environmental and geographic flexibility at an acceptable price point. Furthermore, these solutions let users take advantage of existing legacy investments in storage, servers and infrastructure.
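The byte-level idea can be sketched in a few lines: compare a file's current contents against its last replicated state in fixed-size blocks and ship only the blocks that changed. This is a simplified illustration of the technique, not any vendor's implementation; the block size and the SHA-1 comparison are arbitrary choices for the sketch.

```python
import hashlib

BLOCK = 4096  # compare files in fixed-size blocks

def changed_ranges(old: bytes, new: bytes, block: int = BLOCK):
    """Yield (offset, data) pairs for blocks that differ between old and new."""
    for off in range(0, max(len(old), len(new)), block):
        o, n = old[off:off + block], new[off:off + block]
        if hashlib.sha1(o).digest() != hashlib.sha1(n).digest():
            yield off, n  # only changed blocks cross the wire

def apply_ranges(replica: bytes, ranges, new_len: int) -> bytes:
    """Patch the remote replica with the changed blocks only."""
    buf = bytearray(replica[:new_len].ljust(new_len, b"\x00"))
    for off, data in ranges:
        buf[off:off + len(data)] = data
    return bytes(buf)
```

Only the changed blocks traverse the network, which is why byte- or block-delta replication scales with the rate of change rather than the size of the data set.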

Hybrid, Real-Time Replication Solves Critical Business Challenges

For the highest levels of availability, disaster recovery and business continuity, this emerging class of integrated, real-time replication solutions provides continuous, cross-platform data replication for heterogeneous IT environments over standard IP networks. In this fashion, customers can centrally manage both real-time and scheduled replication across multiple sites for a variety of server platforms, storage devices and applications. This flexibility reduces deployment costs and total cost of ownership (TCO) by enabling customers to create a tiered solution that matches system and network resources to the value of the data they need to protect. Additionally, this versatile approach to replication delivers strong support for heterogeneous environments with ever-increasing Linux deployments.

Clearly, leading-edge data replication offers compelling business benefits for a variety of applications, including: bolstering business continuity and disaster recovery; streamlining consolidated backups; easing server migrations; improving content distribution; and optimising geographically dispersed SAN clusters.

Bolstering Business Continuity and Disaster Recovery – Data replication to a remote site is a key component of any disaster preparedness and data availability plan, because it helps mitigate costs of disruption by minimising data loss and enabling rapid recovery. It also ensures compliance with both internal guidelines and external regulations. It’s extremely important to streamline the business continuity process by automatically failing over to a secondary site in the case of a disruption.

A replication solution that is independent of operating systems, devices and file systems delivers the most cost-effective tiered replication capabilities. Additionally, network costs can be reduced by deploying solutions that only replicate data changes while taking advantage of unique data compression, bandwidth limit throttling as well as configurable replication data streams.
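The change-only transfer described above is usually paired with compression and a bandwidth cap so replication traffic never starves production links. The sketch below illustrates the idea with a simple token-bucket pacer; `Throttle` and `send_change` are hypothetical names for this illustration, not a real product's API.

```python
import time
import zlib

class Throttle:
    """Token-bucket limiter: caps replication traffic at `rate` bytes/sec."""
    def __init__(self, rate: float):
        self.rate = rate
        self.allowance = rate
        self.last = time.monotonic()

    def consume(self, nbytes: int) -> None:
        now = time.monotonic()
        # refill the bucket for the elapsed interval, capped at one second's worth
        self.allowance = min(self.rate, self.allowance + (now - self.last) * self.rate)
        self.last = now
        if nbytes > self.allowance:
            time.sleep((nbytes - self.allowance) / self.rate)  # pace the sender
            self.allowance = 0
        else:
            self.allowance -= nbytes

def send_change(payload: bytes, throttle: Throttle) -> bytes:
    """Compress a change record, then pace it to respect the bandwidth cap."""
    wire = zlib.compress(payload)
    throttle.consume(len(wire))
    return wire  # in a real system this would be written to the network socket
```

Because only compressed deltas are paced through the bucket, the effective WAN footprint is a fraction of the raw change rate.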

Streamlining Consolidated Backups – Taking servers offline to back them up is costly in both time and effort, but with data protection at stake, organisations have had little choice. IT organisations face shrinking backup windows alongside growing high-availability requirements, and they must support heterogeneous servers and geographically dispersed locations where large amounts of data reside but far fewer technical resources are available. Consolidating the effort by replicating data to one or more dedicated backup servers, then backing up from the replicas, is far more efficient.

With newer, software-based replication, enterprises can consolidate critical file content from multiple servers to one or more backup servers. This unique many-to-one replication enables organisations to keep online mirror copies of critical data without doubling the size and cost of the entire infrastructure.
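A minimal sketch of this many-to-one consolidation follows, assuming a simple checksum comparison to skip files whose replicas are already current. The `consolidate` function is illustrative only, not a vendor API.

```python
import hashlib
import shutil
from pathlib import Path

def consolidate(sources, backup_root):
    """Mirror each source tree into its own subdirectory of one backup server.

    `sources` maps a server name to its data directory; only files whose
    content differs from the existing replica are copied (many-to-one).
    """
    copied = []
    for name, src in sources.items():
        dest_root = Path(backup_root) / name
        for f in Path(src).rglob("*"):
            if not f.is_file():
                continue
            dest = dest_root / f.relative_to(src)
            if dest.exists() and (
                hashlib.sha256(dest.read_bytes()).digest()
                == hashlib.sha256(f.read_bytes()).digest()
            ):
                continue  # replica already current, skip the transfer
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest)  # copy content and metadata
            copied.append(dest)
    return copied
```

A second pass over unchanged sources copies nothing, which is the property that keeps the consolidated backup server from doubling infrastructure cost.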

Cross-platform replication also allows users to back up distributed Windows application servers to a centralised Linux server (or any other combination) over local-, metropolitan- or wide-area networks for improved configuration flexibility and manageability. Non-disruptive backups keep production servers online and offload the backup burden. Additionally, users can pause replication on the source server to create a point-in-time snapshot on the target server, which can then be used for offline backup to tape. When done, replication resumes, starting with the changes that were automatically journaled on the production server during the pause.
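The pause/snapshot/resume cycle can be sketched as follows. The `Replicator` class is a toy in-memory model, assumed for illustration: it journals writes on the source while replication is paused and replays them on resume, which is the behaviour described above, not any specific product's design.

```python
class Replicator:
    """Toy model of pause-and-journal replication to a backup target."""

    def __init__(self):
        self.target = {}    # the replica on the backup server (path -> bytes)
        self.journal = []   # changes buffered on the source during a pause
        self.paused = False

    def write(self, path, data):
        if self.paused:
            self.journal.append((path, data))  # journal on the production server
        else:
            self.target[path] = data           # replicate in real time

    def pause(self):
        """Freeze the target and hand back a point-in-time snapshot for tape."""
        self.paused = True
        return dict(self.target)

    def resume(self):
        """Replay journaled changes so the target catches up, then continue."""
        self.paused = False
        for path, data in self.journal:
            self.target[path] = data
        self.journal.clear()
```

The snapshot taken at `pause()` stays internally consistent no matter what the production server writes during the tape backup, because those writes land in the journal instead of the target.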

Easing Server Migrations – Migrating servers from one platform to another can provide performance improvements, security, scalability, lower cost and other valuable benefits, as well as competitive advantages such as rapid platform adoption with minimal application disruption. Yet the difficulty of making the change while providing continuous service and safeguarding mission-critical applications and data often forces companies to delay or forgo migration altogether.

The ability to install replication on existing production and new target servers without disruption or special hardware becomes especially important during server migrations. By enabling real-time replication of data to the target platforms, companies can realise potential cost savings and other benefits of server migration but without the downtime to mission-critical applications.

Improving Content Distribution – Many organisations maintain multiple sites across different types of networks. These networks often are low bandwidth and high latency links that provide basic connectivity, but are not sufficient to carry large amounts of data. In addition, individual locations may be secured behind corporate firewalls, further complicating the need to exchange data between working groups.

By propagating data efficiently and securely across existing IP networks, enterprises can improve intra-organisation knowledge transfer.

The ability to maintain data consistency between remote sites, even over high-latency, long-haul networks, is highly advantageous. Consolidating IT administration enhances functionality and productivity while lowering TCO. Real-time data replication greatly increases efficiency because there is no need to traverse large data stores to determine what has changed. Equally important in Linux environments is the ability to scale along with an organisation's storage needs, regardless of the size of the underlying data.

Optimising Geographically Dispersed SAN Clusters – High application performance often requires large server clusters accessing a Storage Area Network (SAN), which limits data access to the local data centre. Connecting these SAN clusters improves information access, collaboration and data availability, but server clusters cannot span long distances effectively, and SANs are constrained by the distance and cost limitations of Fibre Channel. Scaling SAN clusters to accommodate data growth can be an increasingly complex task.

Real-time replication can overcome the distance limitations of clusters such as Oracle9i RAC, Red Hat GFS and PolyServe Matrix Server by complementing the cluster software with replication capabilities. It ensures that clustered or grid computing environments with large amounts of rapidly changing critical data remain universally available, even in the event of multiple node failures anywhere in the system, over any distance. Additionally, it scales along with data growth regardless of the storage used in each datacentre and the distance between nodes, making it possible to extend the SAN over IP networks well beyond the distance limitations of Fibre Channel.

Exercising Enterprise-Wide Replication Options

Businesses have numerous options when developing or enhancing their data protection strategies. Applications and their associated data are central to a company’s ability to leverage its technology infrastructure. Making them available continuously provides an immediate impact on the bottom line.

Traditional tape-based backup and recovery solutions have been the foundation upon which many data protection strategies have been created. However, business drivers such as business continuity, disaster recovery, and consolidation of backup and deployment of new Linux servers and technologies have continued to drive the evolution of progressive, leading-edge data protection strategies.

When selecting the optimal replication approach, it's important to closely evaluate core business objectives. Often, heterogeneous environments with growing, enterprise-class Linux deployments will seek maximum file and application availability while also striving for flexibility and scalability. In the long run, cross-platform replication delivers the most effective backup consolidation. Real-time, byte-level implementation minimises data exposure while providing scalable performance and asynchronous replication, which can be customised to make the most of available resources and to protect business-critical applications and data.
