Applying Risk Analysis Methods to University Systems
W R Chisnall
The words "Risk Analysis" are used today in several different contexts. In safety critical situations such as the design and operation of nuclear power plants or oil and gas rigs, risk analysis is part of the process of making the chance of a disaster as small as possible. The same thinking applies to the design of aircraft blind landing systems and modern "fly-by-wire" avionics. In these cases the consequences of an accident are so horrendous, and therefore costly, that the chance of one happening must be made almost vanishingly small. The problem is to build a system where components are replicated and human actions are checked so that overall the system will meet its reliability targets.
If, on the other hand, you are a manager on a civil engineering or software development project you will want to know how the actual cost and time to completion of your project might differ from the nominal values. Large projects are composed of hundreds or even thousands of individual jobs, each of which will have had a separate estimate made of its likely duration and cost. But as the work proceeds these individual jobs will each take more or less time, and cost more or less, than was originally estimated. And the separate jobs are not all independent; some cannot be started until certain others have been completed. Once further constraints such as limited access to scarce resources are taken into account, it is easy to understand how project slippages can get out of hand. In these circumstances every project manager wants to understand how sensitive his project is to the accuracy of the original estimates and how much freedom of action he will have if things start to go wrong.
Risk analysis in the computer security context is different again. It is accepted that, in a computer system, both equipment and staff may fail, often in ways that are difficult to predict. There may be natural disasters and there may also be deliberate attacks against the system. Countermeasures are, of course, available and the most common ones are found in most installations. But very few installations are set up to safety critical standards to ensure uninterruptible working or to be totally impregnable against hacking or denial of service attacks. Instead they tend to focus on being reasonably resistant to attacks and able to restore normal working as soon as possible after an incident. The issue, of course, is one of cost. What is spent on countermeasures should be appropriate to the risks and to the costs that might arise following any disruption to normal service.
In this paper I shall discuss a particular risk analysis method that I have been using and which originated in UK government circles. I shall highlight some of the areas in which its use in academia differs from how it is used in the civil service and in commerce and discuss some of the benefits that would arise if it were applied more widely across the higher education sector.
Risk Analysis in Government
The UK Government operates very many commercial computer systems either directly or through its various agencies. In the mid 1980s, computer security became recognised as a subject that needed to be taken seriously, even in non-military circumstances, and there was the inevitable competition for limited funds to spend on improved countermeasures. In 1985 the Central Computer and Telecommunications Agency (CCTA), part of the Treasury, studied existing methods of carrying out security reviews so that it could recommend one for use in government departments. None of the methods investigated met all the requirements so a new one was developed to meet the specification written for the study. This became known as CRAMM, the CCTA Risk Analysis and Management Method. Originally it was just that - a method, but soon the method was implemented as a computer program that would run on standard PCs and the package was made available to public and private organisations.
CRAMM's aims are to:
- Ensure that security requirements are fully analysed and documented for any type or size of IT system.
- Avoid unnecessary expenditure on unjustified security measures which can arise through the use of subjective and pragmatic risk assessments.
- Avoid inconsistencies associated with improvised risk assessments.
- Involve and aid management in planning and implementing security throughout the various stages spanning the life cycle of IT systems.
- Aid security reviewers to plan and carry out assessments in a reasonable time.
- Reduce the need for clerical effort by implementing the method as a software tool for standard PCs.
These aims have, in general, been met. But CRAMM's critics said that it nevertheless betrayed its origins by being unnecessarily long-winded and prone to generating large amounts of paper output. It was also seen as being good for large systems with lots of data and many users but unwieldy for the typical systems found in smaller companies. A further criticism was that the system was designed for government-style administrative operations and this flavoured all the interactions with the reviewer and the customer.
More recently, following several internal government reorganisations, the range of available risk analysis packages was reviewed. And CRAMM, in an updated form, again emerged victorious. There are now two major versions. One is for UK Government use only, including the military sector, and includes classified countermeasures in its database. But alongside this is the commercial product, freely available to anyone wishing to buy a licence. Both products are now the responsibility of the UK Security Services - with the names, addresses and telephone numbers of the relevant management staff freely available.
The General Method
Computer security is about three things:
- That information is only disclosed to those who are authorised to receive it.
- That information can only be modified by those authorised to do so.
- That information and other IT resources are available to authorised users when needed.
Security risk analysis and management consists of two related but separate activities.
Risk analysis involves the identification and assessment of the levels of risks calculated from the known values of assets and the levels of threats to, and vulnerabilities of, those assets.
Risk management involves the identification, selection and adoption of countermeasures justified by the identified risks to assets and the reduction of those risks to acceptable levels.
There are three principal types of asset involved in an operational IT system:
- Physical i.e. equipment, buildings and staff
- Software i.e. the system and application software
- Data i.e. the information stored and processed
Valuing the physical assets is relatively easy; one simply records the replacement cost. In many cases it may not be possible to buy exact replacements for lost or destroyed items but it is usually possible to find functionally equivalent pieces of equipment - often at less than the original price. And it isn't necessary to be very precise. CRAMM reduces all items to a non-linear "value scale" of between 1 and 10. For example, anything valued at less than 1K UKP scores 1; for values between 1K UKP and 10K UKP the scale value is 2. Losses of over 30M UKP are scored as a 10.
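As a concrete illustration, this banding can be expressed as a simple lookup, sketched below in Python. Only three of the bands are quoted above (under 1K UKP scores 1, 1K to 10K scores 2, over 30M scores 10); the intermediate thresholds in the sketch are assumptions chosen to keep the scale roughly logarithmic, not CRAMM's actual table.

    # Sketch of a CRAMM-style non-linear value scale.
    # Only three bands are quoted in the text; the remaining
    # thresholds are assumed here purely for illustration.
    BANDS = [
        (1_000, 1),        # under 1K UKP (from the text)
        (10_000, 2),       # 1K - 10K UKP (from the text)
        (100_000, 3),      # assumed
        (300_000, 4),      # assumed
        (1_000_000, 5),    # assumed
        (3_000_000, 6),    # assumed
        (10_000_000, 7),   # assumed
        (20_000_000, 8),   # assumed
        (30_000_000, 9),   # assumed
    ]

    def value_scale(replacement_cost: float) -> int:
        """Map a replacement cost in UKP onto the 1-10 value scale."""
        for upper_bound, score in BANDS:
            if replacement_cost < upper_bound:
                return score
        return 10          # losses over 30M UKP score 10 (from the text)

    print(value_scale(750))         # -> 1
    print(value_scale(5_000))       # -> 2
    print(value_scale(50_000_000))  # -> 10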
This use of a scale of values is important since it allows intangible losses to be equated with those which have a simple cash cost associated with them. We shall see how this is achieved when valuing the data assets is discussed.
Buildings and staff are listed as physical assets and one can readily see how losses in these categories can be just as serious as equipment losses. But risk assessments can easily get too big to manage and one golden rule is to define, at the beginning, the scope of an assessment; and for the purposes of this paper I shall exclude buildings and staff from the discussion.
Similarly, it is easy to understand the value of software to an IT system. An installation that uses standard packaged software which is properly licensed and supported is at little risk since in the worst case new copies can be obtained from the vendor. But sites using bespoke software which may be old, written in an obscure programming language and inadequately documented are clearly much more vulnerable. An example of this is the "millennium" problem - even COBOL has become obscure to many of today's programmers.
To value data assets, the method looks at the impacts of the accidental or deliberate:
- Disclosure of data
- Modification of data
- Unavailability of data
- Destruction of data
There are many possible impacts which may be relevant:
- Political or corporate embarrassment
- Loss of commercial confidentiality
- Infringement of personal privacy
- Personal safety hazard
- Failure to meet legal obligations
- Financial loss
- Disruption to activities
The CRAMM method leads the reviewer through all combinations of the elements from the two lists above for each data asset that has been identified.
For example, the total loss of a company's payroll file would cause considerable embarrassment and disruption to activities but would not cause a personal safety hazard. It is unlikely to cause a financial loss directly, although there would be considerable cost associated with the disruption to normal activities while the file was rebuilt.
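This walk-through can be pictured as filling in a grid per data asset, as in the Python sketch below. The incident and impact lists are those given above; the scores recorded for the payroll file are hypothetical, chosen only to mirror the example just described.

    # Sketch of the combination walk-through for one data asset.
    # The reviewer records an impact level for every pairing of
    # incident type and impact type (0 = no impact).
    from itertools import product

    INCIDENTS = ["disclosure", "modification", "unavailability", "destruction"]
    IMPACTS = [
        "embarrassment", "commercial confidentiality", "personal privacy",
        "personal safety", "legal obligations", "financial loss", "disruption",
    ]

    assessment = {pair: 0 for pair in product(INCIDENTS, IMPACTS)}

    # Hypothetical scores for the total loss of a payroll file:
    assessment[("destruction", "embarrassment")] = 3    # considerable embarrassment
    assessment[("destruction", "disruption")] = 4       # rebuilding disrupts activities
    assessment[("destruction", "personal safety")] = 0  # no safety hazard

    worst = max(assessment.values())
    print(f"Overall sensitivity of the payroll file: {worst}")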
A different example, and one which actually happened, concerns the deliberate modification by a hacker of patient treatment data in a hospital system. In this case at least one patient died. There would also have been direct financial loss to meet compensation claims and extreme corporate embarrassment.
CRAMM deals with all these circumstances by using a series of guidelines which map the scale of the impact onto the scale of 1 to 10 as used for simple asset values. One example is the "Embarrassment Guideline" as shown below:
Effect                                  Value
Contained in department                   1
Other departments aware                   2
Public made aware                         3
Complaints to Members of Parliament       5
Widespread adverse publicity              7
Calls for Minister to resign              9
Minister obliged to resign               10
This is one table where the civil service wording is most obvious. But substituting "director" for "Member of Parliament" and "Managing Director" for "Minister" makes it quite usable in industry and commerce. It is also clear how it could easily be made compliant with the management structures in universities and other higher education establishments.
The equivalent "Personal Safety Guideline" is shown below:
Effect                                  Value
Minor injury to an individual             2
More serious injury to an individual      4
Injury to several people                  6
Death of an individual                    8
Death of several people                  10
(Cynics have pointed out that it is apparently less serious to kill someone than to call for a Minister of the Crown to resign)
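In effect, these guidelines are lookup tables, and could be encoded as in the Python sketch below. The substitutions suggested above for industry or academia are then just a matter of editing the keys. This is an illustration of the idea, not the CRAMM implementation itself.

    # Sketch encoding the two guidelines above as lookup tables.
    EMBARRASSMENT = {
        "contained in department": 1,
        "other departments aware": 2,
        "public made aware": 3,
        "complaints to Members of Parliament": 5,
        "widespread adverse publicity": 7,
        "calls for Minister to resign": 9,
        "Minister obliged to resign": 10,
    }

    PERSONAL_SAFETY = {
        "minor injury to an individual": 2,
        "more serious injury to an individual": 4,
        "injury to several people": 6,
        "death of an individual": 8,
        "death of several people": 10,
    }

    def impact_value(guideline: dict[str, int], effect: str) -> int:
        """Look up the 1-10 value for a described effect."""
        return guideline[effect]

    print(impact_value(EMBARRASSMENT, "public made aware"))  # -> 3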
In making an assessment of a particular data asset, it is important that the reviewer does not make his own judgements about the possible impacts. He should instead interview the "data owner" and extract the information from him, preferably without exposing the scoring tables. In this way the assessment becomes a collaborative effort, with the reviewer simply the master of the process.
Threats and Vulnerabilities
When all the data assets have been examined it is necessary to consider the Threats and Vulnerabilities. The threats considered are:
- Natural disasters, e.g. fire and flood
- Deliberate threats from outsiders
- Deliberate threats from staff
- IT equipment failures
- Errors by staff
The likelihood of a threat manifesting is assessed by reference to known conditions and recent experience. For example, computer installations in earthquake zones or in the basement of a building below the flood level of a nearby river would be considered to have a significant threat level. Computer installations in buildings which are open to the public are at risk, as are installations using old equipment and with a poor staff training record.
Vulnerabilities also need to be considered, and it is frequently difficult to separate a lack of vulnerability from the application of a countermeasure. For example, a computer housed in a wooden building where waste paper is poorly managed is very vulnerable to fire. Appropriate countermeasures would be the installation of fire detection and extinguishing equipment - but these would not reduce the intrinsic vulnerability. Another example, especially relevant in universities, is the need to secure the computer installations themselves in buildings which have public access.
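The paper does not give CRAMM's actual rule for combining these factors, but the general idea of deriving a risk level from an asset's value and the assessed threat and vulnerability levels can be sketched as follows in Python. The weighting and the 1-5 output range here are illustrative assumptions, not CRAMM's own matrix.

    # Illustrative sketch of combining asset value, threat and
    # vulnerability into a risk level. The weighting is an assumption;
    # CRAMM's real combination matrix is not given in the text.
    LEVELS = {"low": 0, "medium": 1, "high": 2}

    def risk_level(asset_value: int, threat: str, vulnerability: str) -> int:
        """Return a risk level from 1 (negligible) to 5 (severe)."""
        # Start from the 1-10 asset value, then step up for higher
        # threat and vulnerability levels.
        raw = asset_value / 2 + LEVELS[threat] + LEVELS[vulnerability]
        return max(1, min(5, round(raw / 2)))

    # A valuable system (scale 8) in a flood-prone basement:
    print(risk_level(8, "high", "high"))  # -> 4
    # The same system with low threat and low vulnerability:
    print(risk_level(8, "low", "low"))    # -> 2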
At this stage the CRAMM process has information about the physical installation and the totality of the systems that run on it and their overall sensitivities. The package goes into its "expert system" mode and makes reference to its database of countermeasures to find those which are known to be effective in the circumstances that have been identified. These are listed, cross referenced against the particular threats, and presented as recommendations to management.
For example, baseline countermeasures which are generated in almost all assessments include taking back-ups of data and using passwords. In slightly riskier situations the use of machine-generated passwords and the formal examination of audit logs might be recommended. At a higher level still, the installation of trusted firewalls, encrypted message transfers and the positive vetting of operations staff might be suggested.
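The expert-system step can be pictured as matching recommended countermeasures to the computed risk level. The Python sketch below hard-codes the three tiers just mentioned and hypothetical risk thresholds for them; CRAMM's real database is far larger and cross-references countermeasures against specific threats.

    # Sketch of tiered countermeasure recommendation. The tiers mirror
    # the examples in the text; the thresholds are assumptions.
    TIERS = [
        (1, ["data back-ups", "passwords"]),                       # baseline
        (3, ["machine-generated passwords", "audit log review"]),  # riskier
        (5, ["trusted firewalls", "encrypted message transfers",
             "positive vetting of operations staff"]),             # high risk
    ]

    def recommend(risk: int) -> list[str]:
        """List every countermeasure whose tier threshold the risk meets."""
        measures = []
        for threshold, tier in TIERS:
            if risk >= threshold:
                measures.extend(tier)
        return measures

    print(recommend(2))  # baseline only
    print(recommend(5))  # all three tiers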
As with all consultancy reports, management reserves the right to accept or reject all or part of the report. Countermeasures cost money, some a great deal of money, and management may have important knowledge that was outside CRAMM's data gathering exercise and which, in its judgement, affects CRAMM's conclusions. Or it may just decide to accept the risks.
So, what are the advantages of using a method such as CRAMM? Well, it injects a strong measure of objectivity into the risk analysis process. Universities are multi-faceted institutions. Gone are the days when a university had a single computer installation. There are the machines which support the business functions of a university, those which are used by researchers, often on a faculty by faculty basis, and those which have moved into the basic teaching and learning processes. Institutions are being pressed to operate more and more effectively as businesses while the sources of revenue depend increasingly on quality assessments made of the teaching and research processes - certainly in the UK. And usually the entire campus is wired into the global internet with all the additional risks that brings.
CRAMM enables the relative risks and threats to be assessed so that countermeasures appropriate to the particular system can be selected. It can also be used to show how the risks change with time as the systems evolve. But perhaps most importantly it provides new insights for IS Directors and other university managers about the ever increasing importance of computer based systems in academic life.
The University of Manchester
Manchester M13 9PL
Copyright EUNIS 1997