Automation is the new mantra in business continuity

Sendil Kumar Venkatesan, VP – IT, Shriram Value Services Ltd, discusses what it took to put in place a sound disaster recovery system.

Shriram Value Services is an IT/ITES organisation that is part of the USD 10 billion Shriram Group. The Chennai-based company embarked on a business continuity exercise after the floods that crippled the city a few years ago. Sendil Kumar Venkatesan, who heads IT at Shriram Value Services, spoke to us on what it took to put in place a sound disaster recovery system. Here are some excerpts from the interview.

Disaster recovery used to be a massive concern with respect to business continuity at one point in time. How has the whole concept evolved over the years?
We manage IT for our group companies in the finance and insurance sectors, which in turn manage assets in excess of USD 20 billion. In such a scenario, business continuity plays a major role. After Chennai was ravaged by floods a few years ago, we wanted to strengthen our business continuity process. As part of this initiative, over the last two years we have enhanced both our data centre and our disaster recovery site with automation and redundancy at every level. We have designed things so that there is no fixed DC-DR split; instead, an application can be live in either of the two data centre locations. We reduced the recovery time objective (RTO) from 3 hours to 45 minutes, and the recovery point objective (RPO) from 30 minutes to less than 10 minutes, using asynchronous replication.
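To make the two targets concrete: RPO bounds how much recent data can be lost (the gap between the last replicated write and the failure), while RTO bounds how long the application may stay down. The following is a minimal sketch of how a drill could be checked against the figures quoted above. The function names and the drill timestamps are hypothetical and are not taken from Shriram's tooling.

```python
from datetime import datetime, timedelta

# Targets quoted in the interview.
RTO_TARGET = timedelta(minutes=45)
RPO_TARGET = timedelta(minutes=10)


def data_loss_window(last_replicated_at: datetime, failure_at: datetime) -> timedelta:
    """RPO exposure: writes made after the last replicated point are lost."""
    return failure_at - last_replicated_at


def recovery_duration(failure_at: datetime, restored_at: datetime) -> timedelta:
    """RTO measurement: how long the application was unavailable."""
    return restored_at - failure_at


def drill_report(last_replicated_at: datetime, failure_at: datetime,
                 restored_at: datetime) -> dict:
    """Compare one drill's measurements against both targets."""
    rpo = data_loss_window(last_replicated_at, failure_at)
    rto = recovery_duration(failure_at, restored_at)
    return {
        "rpo_minutes": rpo.total_seconds() / 60,
        "rto_minutes": rto.total_seconds() / 60,
        "rpo_met": rpo < RPO_TARGET,    # "less than 10 minutes"
        "rto_met": rto <= RTO_TARGET,
    }


# Hypothetical drill: last replication at 10:00, failure at 10:07,
# application live at the other site by 10:50.
report = drill_report(
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 10, 7),
    datetime(2024, 1, 1, 10, 50),
)
print(report)
```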

How is automation helping business continuity and what business benefits can one realise from it?
The biggest advantage is that our RTO and RPO have come down. Secondly, with an automated solution we need not worry about whether senior staff are available during a disaster. A junior staff member can manage the movement of applications between data centres, which makes things much easier for us in a real disaster. For disaster management, our focus was more on technology and less on human resources. In the process, we have created an executive dashboard where we can view status, utilisation and other required details on one screen. Further, automation helps us simplify our processes and meet compliance requirements.

What were the challenges that prompted you to shift to automation of business continuity planning?
We faced a range of challenges, as we handle multiple companies that are highly regulated and need to comply with the requirements of the Reserve Bank of India (RBI) and the Insurance Regulatory and Development Authority. We classified our functions into four critical areas.
Business applications: This involved getting the settings and configuration of each application right, linking servers and replacing IP addresses with domain names, among other tasks.
Application servers: As our app servers are in a virtualised environment, we enabled replication of all virtual machines from one data centre to the other, so that during a disaster drill we can power them on as required.
Database servers: We classified the databases so that replication happens wherever it is needed.
DNS: We obtained our IP addresses from the Indian Registry for Internet Names and Numbers (IRINN), so that we do not need to change the IP addresses behind our domain names during a disaster or when we change Internet service provider.
This classification helped us streamline the activity.
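The point about replacing IP addresses with domain names in application settings can be illustrated with a minimal sketch. When an application refers to a name rather than an address, a failover only has to update the name-to-address mapping, and the application configuration stays untouched. The host name, addresses, site labels and function names below are all hypothetical (the addresses come from the ranges reserved for documentation), and a plain dictionary stands in for a real DNS service.

```python
# Hypothetical addresses for the same service at each site.
SITE_ADDRESSES = {
    "primary": "192.0.2.20",
    "secondary": "198.51.100.20",
}

# Stand-in for DNS: maps a service name to the currently live address.
dns_records = {"loans.example.com": SITE_ADDRESSES["primary"]}

# The application is configured with a name, never with an IP address.
APP_CONFIG = {"service_host": "loans.example.com"}


def resolve(host: str, records: dict) -> str:
    """Look up the address currently published for a host name."""
    return records[host]


def fail_over(records: dict, host: str, site: str) -> dict:
    """Return new records with the host pointed at the chosen site."""
    updated = dict(records)
    updated[host] = SITE_ADDRESSES[site]
    return updated


before = resolve(APP_CONFIG["service_host"], dns_records)
dns_records = fail_over(dns_records, "loans.example.com", "secondary")
after = resolve(APP_CONFIG["service_host"], dns_records)

print(before, "->", after)
```

Had the address been written directly into `APP_CONFIG`, every application would need to be reconfigured during a failover, which is the kind of manual step the automation described here aims to remove.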

What were the hurdles you faced when transitioning to the business continuity automation solution? How did you overcome them?
As this was very much a business need, we got support from management for the investment required. We designed our DR site to be identical to our DC. As we manage multiple companies and multiple partners, integration took time. We went about it in a phased manner, and we have now completed the move for all our applications.

What are the innovations you’d like to see in business continuity automation solutions in the near future?
Currently we have an active–passive DC, because the second site is far away and replicated asynchronously. We are exploring options that will help us bring the recovery time objective down to nil, and we want to have an active–active data centre. We are also exploring a hybrid cloud that integrates our private and public clouds.