Over the past month I’ve been researching cloud providers (mostly by searching, and the results have been #fail) to understand what they use to “assure” trust. In other words – if I’m a company of sufficient size that risk outweighs convenience, I want to make sure that if I use the cloud, my site will be secure, my information will be protected with the privacy controls I require for my business (be they HIPAA/HITECH, SOX, PCI, etc.), and the site will maintain good availability via service levels.
First, let’s tackle what constitutes transparency for the customer (the customer being someone who is going to place a business ON the cloud provider’s systems). As a potential customer I’m likely to want to see the following:
- SLAs – What are the SLAs? Are they five 9’s? Four 9’s? What are the aggregated service levels if I’m a typical customer? So if I use your messaging, storage, and compute services, do I still get an aggregate of five 9’s, or was one of the services at a lower level? Is there a service level credit? For example, GoGrid provides a 10,000% credit for downtime while others just credit your minutes of lost time. Several providers will also actually pay the customer for business lost, though this is provided via an insurance policy that the business could otherwise acquire on its own.
- External Audits – The rage seems to be SAS70 Type II audits, which are designed to provide an element of financial systems trust by having a third-party certified auditor look at security policies and procedures, determine what controls are in place, and measure their effectiveness as they relate to an audit of their financial statements (see SAS70 details). This is great if you are trying to make sure the provider has controls in place that support SOX requirements, but what about general PII/PCI controls or HIPAA? What about their security posture? I’d argue that this is a good “business” control type of audit. What is still needed is a security audit and a privacy audit to complete the picture. Minimally, as a customer I would want to see the SAS70 audit results and the mitigation plans. While many of the providers have this audit done, none of them currently provide the results on their websites, and none would provide the details to me via email. One provider did provide a list of the SAS70 control “buckets” that they include in their audit, but they would not let me publish it on a blog or post it to the web – to quote them, it would “lessen its value” (not sure how that is, but they provided it for my research). One last note: I found that ADP would provide the results of their SAS70 audits to their paying customers for the areas that aligned to the services they used.
- Internal Audits/Assessments – It is assumed that security and privacy policies need to be in place as SOP for data center operations. Along with the policies come internal audits/assessments that are performed with due rigor and regularity. The international standard ISO/IEC 27001:2005 covers security assessments and is considered by many to be one of the best and most thorough. There are several other well-known security assessments, such as OCTAVE from Carnegie Mellon, which uses a Bayesian model based on quantitative analysis of qualitative data and is designed to be used by internal resources. The US Government has also developed a series of standards, among them NIST SP800-53. The results of these audits are not generally available to “prospects” or actual customers, are not published, and are not necessarily “honest”. For example – I ran across this statement on risk assessments from Pivot Point Security:
At this point Risk Assessments are a lot like a bikini; “What they reveal is suggestive, but what they conceal is vital”. Worse, it’s easy (and common) to make what they reveal what you want them to reveal.
Having performed and participated in OCTAVE, ISO/IEC 27001, NIST SP800-53, and COBIT audits myself, I found out a few things in the process. Purely internal or purely external assessments introduce too much bias. OCTAVE was designed to be run internally because the subject matter expertise lies within the organization, and employees/security staff have a better understanding of “asset value” (and I’m not going to get into the whole debate on the usefulness of ALE/ROSI valuation methods). The internal bias could potentially be mitigated by having an external firm provide oversight and guidance. Perhaps the best (and most expensive) method would be to have both an internal and an external audit performed and compare them for patterns and gaps. When I ran operations for a managed service provider, we would have internal assessments run on off cycles from the external ones. The external ones would go through a “rough pass” phase, allow us to fix the most egregious problems, and have a final pass run, and then the results would be provided to customers who were both requesting and paying. If they weren’t both, they didn’t see our security audit results.
One last comment on risk assessments – there is a new method named FAIR, which Hoff pointed out, developed by Jack Jones of Risk Management Insight, that takes a different (and refreshing) approach to assessment. While most assessment methods rely heavily on interviewing and subjective qualitative data, FAIR uses quantitative analysis for asset valuation and threat impact, and also uses Monte Carlo simulations to pinpoint where the threats are most probable. This seems to be unique because it is far more quantitative, which makes it potentially far more machine readable/executable (I’ll expand on why this is important in future blog posts).
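To make the Monte Carlo idea a bit more concrete, here is a minimal sketch in Python. It is not the FAIR method itself, just the general shape of a simulation-based loss estimate: sample how often a loss event happens in a year, sample how much each event costs, and look at the resulting distribution. The function names, the triangular distributions, and every number below are my own assumptions for illustration.

```python
import random
import statistics


def simulate_annual_loss(freq_range, magnitude_range, trials=10_000, seed=42):
    """Return a sorted list of simulated annual losses.

    freq_range and magnitude_range are (low, most_likely, high) estimates.
    The triangular distribution is an illustrative assumption, not part of FAIR.
    """
    rng = random.Random(seed)
    f_low, f_mode, f_high = freq_range
    m_low, m_mode, m_high = magnitude_range
    losses = []
    for _ in range(trials):
        # Number of loss events in this simulated year, rounded to a whole count.
        events = round(rng.triangular(f_low, f_high, f_mode))
        # Each event gets its own sampled loss magnitude.
        losses.append(
            sum(rng.triangular(m_low, m_high, m_mode) for _ in range(events))
        )
    losses.sort()
    return losses


def percentile(sorted_values, pct):
    """Pick the value at the given percentile from an already sorted list."""
    index = min(len(sorted_values) - 1, int(len(sorted_values) * pct / 100))
    return sorted_values[index]


# Hypothetical estimates: 0 to 6 events per year (most likely 2),
# costing 5,000 to 250,000 each (most likely 40,000).
losses = simulate_annual_loss((0, 2, 6), (5_000, 40_000, 250_000))
print(f"mean annual loss: {statistics.mean(losses):,.0f}")
print(f"90th percentile:  {percentile(losses, 90):,.0f}")
```

The point of the exercise is that the output is a distribution rather than a single number, which is what makes the result comparable across assessments and easy for a machine to consume.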
- Employee Certifications/Expertise – If you go back a bit in time to ASP/MSPs vs. hosting, there was a line of demarcation that showed up when you wanted help. The MSP had excellent subject matter expertise on the services they provided, all the way up through the stack to whatever level they provided. If they were a database MSP, they had experts in database, security, backup, etc. at your disposal. If they were a hosting provider, they stopped at the lowest level – they knew a lot about power/pipe/ping (physical security, power, cooling, core network), but if you needed OS support they may or may not have provided it, and if they did, it was not included in the service.
When looking at cloud providers you should look at the experience and certification levels of the staff as part of your investigation. I would also suggest looking at “where” the talent is and what hours they work. For example, if they use a “follow-the-sun” method, the staff you are working with during your normal workday may do a hand-off when the clock strikes 5:00PM, and you have to re-educate someone new who may want to take too much creative license with what the focus of the troubleshooting effort should be. No matter what – find out if the people working in support have names that are followed by the alphabet soup we are all accustomed to in the IT industry (CISSP, CIPP, CNE, CNA, MCSE, RSA/CA, etc.) and make sure they have these certs from reputable organizations such as Cisco, ISC2, etc.
Also consider using a provider that is ITIL certified or at least has ITIL certified staff members. Why? Well, for one, ITIL was designed to improve the quality of service management by creating a framework of best practices for organizations to establish a service desk and a services catalog, and to measure service levels against. The latest version, ITIL v3, includes the use of third-party providers, extending the standard into the cloud/MSP/ASP world.
- Miscellaneous – The final set of things to look at: have these guys been in business a while? Are they solvent? What outage/security events have they had? Are they willing to provide you with the things listed above (and anything else you need) to make a good decision? Also make sure you really understand their billing model – some providers charge you for the “max used” or “burst” rate for the month while others do some averaging. Some include or group services together (such as DNS being part of network usage) while others are 100% à la carte and you need to pay for each service separately. Perhaps someday we’ll see finer-grained metering systems (due to competition), as with networks, which tend to use 95th percentile billing that allows for short bursts.
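Since the billing model matters so much, here is a rough sketch in Python of how the same month of usage can turn into three different billable rates. The function name and the sample data are hypothetical, and the 95th percentile rule shown (sort the samples, throw away the top 5%, bill the highest one left) is the common network convention as I understand it, so check your provider’s actual terms.

```python
def billable_rates(samples_mbps):
    """Compare three ways of turning usage samples into a billable rate."""
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # 95th percentile: discard the top 5% of samples and bill the highest
    # remaining one. Integer math computes ceil(0.95 * n) - 1 without floats.
    p95_index = max(0, (95 * n + 99) // 100 - 1)
    return {
        "max": ordered[-1],
        "average": sum(ordered) / n,
        "p95": ordered[p95_index],
    }


# Hypothetical month: steady 10 Mbps with a short burst to 100 Mbps.
samples = [10] * 97 + [100] * 3
rates = billable_rates(samples)
print(rates)
```

With these made-up samples the burst covers only 3% of the month, so the 95th percentile rate stays at 10 Mbps, while a “max used” model would bill the whole month at 100 Mbps.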
One final thought on this tome I’ve written (assuming you read this far!) – consider what happens with your cloud provider when they are part of a set of service providers. For example, if you are using one provider who gives you an easy portal to set up and manage your cloud infrastructure, then another behind that who provides the core services (storage, compute), then another for backup/DR, then another for security – start to think about the complexity (=risk). Does this aggregated service (what I like to call the service-stream) still have the security level, SLA, etc. that you had when you started? Do they have the same privacy standards and requirements? Are the protections transitive? Are they willing to test an outage and share the results, or actually include you in the process (like you do internally when you test your DR plan)?
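The SLA part of that question can be roughed out with simple arithmetic. Here is a small Python sketch, assuming every provider in the service-stream must be up for your site to be up and that their failures are independent (both are simplifying assumptions on my part). The provider names and availability figures are hypothetical.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def composite_availability(availabilities):
    """Availability of a chain in which every service must be up.

    Assumes failures are independent, so the availabilities multiply.
    """
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


# Hypothetical service-stream: portal, compute, storage, backup/DR.
stream = {"portal": 0.999, "compute": 0.9999, "storage": 0.99999, "backup": 0.999}
total = composite_availability(stream.values())
print(f"composite availability: {total:.5%}")
print(f"expected downtime: {downtime_minutes_per_year(total):.0f} minutes/year")
```

The aggregate is always lower than the weakest single service in the chain, so a five 9’s storage layer does not buy you five 9’s overall if the portal in front of it only offers three.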
In the end – you need to decide: is the provider making it easy for you to understand how they do business with you? Are they open to sharing the controls/methods/etc.? Or do you have to work really hard to find out what they are really doing on your behalf? Don’t accept the thinly veiled answer that it is for your protection that they won’t provide the information. You are the customer, though if you are just using a free service, you get what you pay for. If you are a real paying customer, you don’t deserve to be treated with obscurity or directed to talk to someone else. The cloud is supposed to be self-service and automated – it is up to the providers to include in that service an easy way for potential/paying customers to get the answers they need to make their stockholders and customers happy.
Attached are the results of my looking at various providers via search and the web. It is incomplete – some sites had everything in one place, making it easy to find. Where there are empty spaces, it is because after trying for hours I gave up. Could be my search skills are not what they should be – but I think if a 25+ year IT vet can’t find stuff easily on the web or with search, then you are losing customers already.
RMI & FAIR – http://www.riskmanagementinsight.com/