Colocation for Disaster Recovery

I have spent the last 15 years with web hosting and data centers consuming my every day. Over the years I have had the privilege of seeing a lot of data centers and hosting operations in person. Early on, when we didn't have data centers of our own, we would frequently tour facilities as we shopped for data centers to grow Hivelocity in. More recently, we tour data centers while exploring new markets and acquisition opportunities. What holds true with every data center is that you never know what you are going to get. Running a proper data center is not an easy or inexpensive proposition. If the team behind the infrastructure is not giving 100% every day, I can assure you they are missing an important detail that could adversely affect your business one day. I have some anecdotal stories to share that will shed light on exactly why taking a close look is so critical.

Let me start by saying that no matter the type of business you run, the data you store on your servers is vital to your ongoing success and profitability. Given how important this asset is, you would be surprised by how many businesses continue to keep their IT infrastructure in a closet or fail to implement any disaster recovery strategy. You might wish to believe that a natural disaster, plumbing disaster or network failure would never happen to your company, but nobody is immune from these threats. If you experience a fire, a flood or even a simple power surge at your office, not having a disaster recovery plan in place could cause your business to fail.



To protect data, companies should consider moving some, or potentially all, of their server environment to a colocation data center. Colocation facilities are specifically built to house server equipment for businesses. The beauty of putting your company's server gear in a data center is that it relocates your critical data away from your business's primary location while providing a much more robust network, so requests to the server perform as if it were still right down the hallway.

So here is the bottom line: when considering colocation for your primary or disaster recovery site, we recommend taking a tour of the facility and asking the following questions:

Is the colocation facility secure?

What was security like when you entered the facility? Is there 24-hour monitoring onsite? Is there a system tracking everyone who enters and leaves the facility? If so, you are in the right place. Watch for cameras throughout the facility, personnel monitoring those cameras and posted policies. Colocation facilities with SSAE-16 compliance should have sound security policies in place that are ideal for keeping your systems physically secure.

Insider's perspective: Having cameras all over the facility is great, but the cameras actually have to work. I once toured a facility that was equipped with dummy cameras throughout. Part of being PCI compliant is not only having cameras covering every entry point of the facility but also archiving the footage for a minimum of 90 days. When I asked our tour guide how long footage was archived, I was told, "zero days, none of those cameras are working." As a data center owner/operator I can certainly see how this happens. We just spent $8,000 on the storage systems alone to archive the footage at our new data center. Then you have the cameras, the software, the man-hours to install and manage the system, etc. Like I said, running a data center is not inexpensive, and it is not a business where you can really skimp on things.

What type of power plan does the facility have in place?

Businesses that are considering a colocation data center should inquire about the power feeding the facility. Take note of how many sources of power feed the facility, whether it has generators, and how long it can run on generator power in the event of a commercial power loss. In case of a power failure, the colocation facility should be well equipped to continue running at full speed for several days.

Insider's perspective: Sadly, I am the person who looked like a jerk on this one. Early on we colocated at a third-party data center, a few of them actually. One day I was giving a tour of one of these facilities to a potential customer. I had been assured a few days earlier by said data center that the UPS had just been outfitted with brand new batteries, as everyone there was well aware the backup systems were getting a bit long in the tooth. During my tour, as we passed through the power room, I took the opportunity to boast about these new batteries. Our potential customer leaned over to take a closer look and, to my shock, asked me, "why are all of the batteries date stamped 2003?" Oh, if only the year were 2003 and not 2008 when I was giving this tour. This third-party data center had gone on the cheap and outfitted their UPS with used batteries that were already past end-of-life. No, we did not land that customer.

Are there systems in place to handle water or fire?

Behind electrical failure, water damage is the most common reason for data loss. When moving to a colocation facility for disaster recovery, the facility should have a leak detection system in place. The colocation facility's leak detection system will detect water and immediately set off a chain reaction to keep servers protected. If the data center is in a high-rise, ask what kind of company is located on the floor above it. If they tell you there is nothing but restrooms and showers on the floor above, you might want to think on that one.

Insider's perspective: We started our company in 2002 in a basement with one rack in a closet. We opened our first data center sometime in 2004. It was about the size of our conference room today, but way bigger than our former closet. We had a raised floor but no leak detection. One day we noticed one of our CRACs (computer room air conditioners) was not blowing cold air. Part of our inspection involved pulling up the raised floor that surrounded the unit. The CRAC's condensate pan drain had become clogged, causing the unit to have issues AND allowing condensation to collect and eventually drip water onto the ground. We became aware of the situation quickly and were able to mitigate the water with a towel. Had the CRAC not had issues, causing us to pull up those floor tiles, it certainly could have become a much bigger problem requiring a much, much bigger towel. Today we have leak detection outfitted in each of our raised floor data centers.



Do they perform regular preventive maintenance on critical assets?

A data center is filled with lots of expensive and complex equipment, all of it critical to achieving 100% uptime. If this equipment is not tested and maintained regularly, problems are discovered not in advance but at the time of failure…and then outage. CRAC units should be serviced every 3 months. This involves changing out the filters and inspecting the belts, the motors, etc. Generators should be run every week or two just to hear them run and keep the fluids moving. Just like with your car, you can often recognize an issue with an engine just by turning it on and listening to it. Oil changes, fuel tests and full inspections should also happen regularly. The UPS and batteries should be inspected often, with infrared cameras used to look for hot spots and bad capacitors. There is much more to a proper PM schedule than this; the important thing to remember is that preventive maintenance should be done regularly. Ask to see the data center's maintenance logs. You should see most items happening every 3 months.
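To make that log review concrete, here is a minimal sketch (in Python, with hypothetical log data and a hypothetical `overdue_assets` helper of my own invention) of the check you are effectively doing while flipping through a facility's maintenance records: for each asset, no gap between service dates, including the time since the last service, should exceed roughly a quarter.

```python
from datetime import date, timedelta

# Hypothetical maintenance log: asset name -> dates it was serviced.
# These entries are illustrative, not from any real facility.
log = {
    "CRAC-1": [date(2014, 1, 10), date(2014, 4, 8), date(2014, 7, 2)],
    "Generator": [date(2014, 1, 5), date(2014, 6, 20)],  # gap > 3 months
}

MAX_GAP = timedelta(days=92)  # roughly a quarterly cadence

def overdue_assets(log, as_of):
    """Flag assets whose service intervals, or time since their
    last service, exceed the expected quarterly cadence."""
    flagged = []
    for asset, dates in log.items():
        dates = sorted(dates)
        gaps = [b - a for a, b in zip(dates, dates[1:])]
        gaps.append(as_of - dates[-1])  # time since most recent service
        if any(g > MAX_GAP for g in gaps):
            flagged.append(asset)
    return flagged

print(overdue_assets(log, date(2014, 9, 1)))  # → ['Generator']
```

In this sample data the generator went nearly six months between services, so it gets flagged; the CRAC stayed inside the quarterly window. A facility with healthy PM logs should come back clean under exactly this kind of check.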

Insider's perspective: Unfortunately, I have a lot of stories I could share here. Too many facilities adopt a reactive strategy with their equipment rather than a proactive one. I have been in data centers that were 95° in temperature…often. I have been in data centers that had fans scattered throughout so there was some air movement…constantly. Both scenarios were due to their CRAC units frequently failing. I have known data centers to do load transfers in front of big potential clients only to have the generator fail within seconds of the power transfer. All of these issues could have and should have been avoided with regular preventive maintenance…and money. At the end of the day, all of these problems cost the data centers way more money than the cost of regular preventive maintenance.


Does the facility offer smart hands and are they really smart?

If you have ever toured a data center yourself, you likely saw a big, cold building (hopefully cold) with lots of power but often with very few personnel. Usually the few people you do see are a sales guy to give the tours, maybe a front desk person to sign you in, and one or two people in the NOC staring at monitors. At night it is often just one techie type, or maybe just a security person who doubles as a reboot specialist if someone needs it. If the lights are on, the CRACs are blowing and the network is up, there is typically not much for these people to do. "Power and pipe" is what they call it: "if the power is up and the pipe (network) is up, we are doing our job." Managing the servers, fixing the problems and replacing hardware is left up to you at a "power and pipe" facility. Usually colocated customers are on top of things and happy to drive down to the facility now and again when something involves hands-on work. However, driving down to the facility is not always an option or convenient for the customer. Techies take vacations, people get sick, and if something is down at 3am, that is a terrible time for anyone. This is when "remote hands" or "smart hands" comes in handy…damn, good pun. This is also when you may discover "smart hands" is really a glorified mall cop doubling as your tech for the night. Good luck.

Insider's perspective: I will start by saying we have legitimate Linux/Windows/Virtualization/Network/etc. experts on site 24/7/365. With that said, we have a POP in LA where we do some network peering and have some SparkNode hypervisors. About a year ago we resorted to flying one of our guys to Los Angeles to replace some memory in a hypervisor because the $200/hour "smart hands" provided at the data center had spent a full day trying to accomplish this very simple task with no success. Mind you, this facility is one of the top 10 best connected buildings in the entire world…they are kind of a big deal. Once in Los Angeles, our guy had the memory added and the server up in about 15 minutes. The point is, some of the biggest and baddest data centers in the world have no idea how to fix a server. To their credit, nowhere do the words "server", "service" or "smart" appear within "power and pipe". So on your tour you should ask to speak to one of their technicians or their support manager and get a gauge of their knowledge. Ask whether any of them work overnight. Techs will tell you the truth; the sales guy will tell you what you want to hear.


In Conclusion:

Businesses should plan for everything, including the possibility that disaster will strike. I can almost guarantee you that even the data centers described above had more in place than your office server closet currently does. A business with a good disaster recovery plan, including a proper colocation data center, is much more likely to maintain regular business operations than one without. Colocation will help to prevent substantial losses while likely reducing operational costs.

At Hivelocity, we know what it takes to create a secure and stable environment for your valuable servers. Choose from any of our three facilities and experience Hivelocity’s enterprise level colocation.