1. Approach your solution from the point of view of, “What if it never failed in the first place?”
Peak Hosting’s focus is on an architecture where component failure does not lead to service interruption. Cloud architectures have the tail wagging the dog: companies are now rewriting their software to account for failures in the cloud. Not only do you pay for downtime and additional software development, but service failures can also cost you your reputation. Why pay to run multiple cloud instances of your software at the same time, when you can run on infrastructure that was designed so you never fail in the first place?
2. Design your infrastructure to be 2N.
A 2N architecture means dual power supplies, redundant hard drives, dual PDUs, dual UPSs, dual generators, dual networks, dual NICs…if it’s possible to put two of something into a system, we have it there. This isn’t theoretical. It’s not an add-on or an option. It’s a basic tenet of our design philosophy. Over a third of the cost of our systems is in redundancy. Why? Because these parts will eventually break, but component failure doesn’t need to lead to system failure. The industry standardizes on a 1N architecture and promises to fix it quickly if it fails (which it will). This really means, “I’ll replace your hard drive with a brand new one,” but how many hours or days will you spend getting your code, your configuration, and your data back onto that system?
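To make the 1N versus 2N argument concrete, here is a minimal sketch of the arithmetic. It assumes the two copies of a component fail independently and that either copy alone can carry the load. The function names and the 99% figure are illustrative assumptions, not measured values from any vendor.

```python
HOURS_PER_YEAR = 24 * 365


def redundant_availability(single: float, copies: int = 2) -> float:
    """Availability of `copies` independent components where any one suffices.

    Assumes failures are independent, which real hardware only approximates.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single = 0.99  # hypothetical availability of one component
    for n in (1, 2):
        a = redundant_availability(single, n)
        print(f"{n}N: availability={a:.6f}, "
              f"downtime={expected_downtime_hours(a):.2f} h/yr")
```

Under these assumptions, doubling a component squares its probability of being unavailable, which is why a second copy pays for itself so quickly.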
3. Extensively test the network.
One of the advantages of a 2N network is redundancy. But that redundancy does no good if the redundant items don’t work properly. We test our deployed 2N network environment to ensure it has sub-second failover. This means that if something does fail, the redundant hardware takes over in less than one second, so downtime is essentially zero.
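One way to check a sub-second failover claim is to probe the network at a fixed interval while a component is deliberately pulled, then look at the longest gap between successful probes. The sketch below only does the analysis step. The sample format (timestamp in seconds, success flag) and the name `longest_outage` are assumptions made for illustration; it is not Peak Hosting’s test tooling.

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Longest gap in seconds between two successful probes that were
    separated by at least one failed probe.

    Failures before the first success, or after the last one, are not
    counted because their true duration cannot be known from the samples.
    """
    longest = 0.0
    last_ok: Optional[float] = None
    saw_failure = False
    for timestamp, ok in samples:
        if ok:
            if saw_failure and last_ok is not None:
                longest = max(longest, timestamp - last_ok)
            last_ok = timestamp
            saw_failure = False
        else:
            saw_failure = True
    return longest


if __name__ == "__main__":
    # Hypothetical probe log: one probe every 100 ms, two probes lost.
    probes = [(0.0, True), (0.1, True), (0.2, False),
              (0.3, False), (0.4, True), (0.5, True)]
    gap = longest_outage(probes)
    print("sub-second failover" if gap < 1.0 else "failover too slow")
```

The measured gap can only be as precise as the probe interval, so the interval should be well under the one-second target.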
4. Use only top name equipment, such as Dell, HP, Cisco, F5 and EMC.
If your company only cares about buying the cheapest hardware, you’re going to have high failure rates, which means costly downtime. That’s why Peak Hosting partners with companies such as Dell, HP, Cisco, F5 and EMC. By purchasing our hardware from the best vendors with the best reputations, we ensure we have quality products delivered every single time. We can’t stop a motherboard from failing, but we can absolutely stop motherboard failure from interrupting your users’ experience.
5. Burn in your infrastructure for 72 hours before putting it into production.
We burn in our servers for at least three full days before turning them over to customers, because even if you buy from the highest quality vendors like HP or Dell (as we do), there’s always the possibility of something going wrong. Testing our servers before they are released to you ensures you get the highest quality hardware that’s been tested to the fullest extent possible. To accomplish this, we run a CPU, memory, and disk-intensive synthetic workload that stresses these components by simulating 100% utilization of the system. This not only detects faulty components, but has the added benefit of “heat cycling” the system to ensure that manufacturing defects (such as weak solder joints) are detected before the system is placed into a production environment. This comprehensive interrogation of the system allows us to discover any issues with the equipment…before your customers do!
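For readers who want a feel for what a synthetic burn-in workload looks like, here is a small single-process sketch. It is not the tool we run. All function names and sizes are assumptions, and a real burn-in would launch one worker per core, use far larger buffers, and loop for the full 72 hours.

```python
import hashlib
import os
import tempfile
import time


def cpu_stress(seconds: float) -> int:
    """Spin on SHA-256 hashing for roughly `seconds`; return rounds done."""
    deadline = time.monotonic() + seconds
    digest = b"burn-in"
    rounds = 0
    while time.monotonic() < deadline:
        digest = hashlib.sha256(digest).digest()
        rounds += 1
    return rounds


def memory_check(size_bytes: int) -> bool:
    """Fill a buffer with a known pattern and verify it reads back intact."""
    pattern = bytes(range(256))
    buf = bytearray(pattern * (size_bytes // 256))
    return all(buf[i:i + 256] == pattern for i in range(0, len(buf), 256))


def disk_check(size_bytes: int) -> bool:
    """Write random data to a temporary file and verify its checksum.

    Note: the read may be served from the OS page cache, so this is a
    simplification of a true disk surface test.
    """
    data = os.urandom(size_bytes)
    expected = hashlib.sha256(data).hexdigest()
    with tempfile.NamedTemporaryFile() as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
        f.seek(0)
        return hashlib.sha256(f.read()).hexdigest() == expected


def burn_in_pass(cpu_seconds: float, mem_bytes: int, disk_bytes: int) -> dict:
    """Run one pass of all three checks and report the results."""
    return {
        "cpu_rounds": cpu_stress(cpu_seconds),
        "memory_ok": memory_check(mem_bytes),
        "disk_ok": disk_check(disk_bytes),
    }
```

A scheduler would call `burn_in_pass` repeatedly and pull the server from the queue on the first failed check.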