Cyber security – is it a science?

Tuesday, 25 April, 2017


Chris Few, foreseeti UK business manager, discusses the ability of the cyber security profession to make testable predictions of the mean time between system security compromises.

Given the growing importance to society of good cyber security, whether the subject is a science or not is more than just a pedantic question.  Good science develops hypotheses which are tested by experiment; those which survive become theories, able to make testable predictions.  As cyber-attacks become ever more prevalent, society would surely value scientists who can answer the question, ‘if I connect this system to the Internet, how long is it likely to be before it is compromised?’ It is a simple, reasonable question, but not one that the cyber security profession has good answers to.

A good scientific answer would first address the probability of the system being attacked.  It would draw on a mass of data on the frequency of cyber-attacks on a wide array of systems.  From this data it would know what factors influence the rate of attacks, and so infer what rate and attacker profile is most likely for the system in question.  Secondly, from a vulnerability analysis of the system and data on the time to compromise comparable systems, it would deduce the likely rate of successful attacks.  A very good answer would help the questioner understand what sort of compromise was most likely (denial of service, ransomware or discreet exfiltration of intellectual property) so that they could judge the probable business impact.  Cyber security is a long way from being able to do this.
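To make the shape of such an answer concrete, here is a minimal sketch in Python.  It rests on simplifying assumptions that are mine, not established results: attacks arrive as a Poisson process at a constant rate, and each attack succeeds independently with a fixed probability.  The function and parameter names are illustrative only.

```python
import math


def mean_time_to_compromise(attacks_per_year: float, p_success: float) -> float:
    """Expected years until the first successful attack.

    Assumption (illustrative): attacks arrive as a Poisson process and each
    succeeds independently with probability p_success, so successful attacks
    form a Poisson process with rate attacks_per_year * p_success.
    """
    if attacks_per_year <= 0 or not 0 < p_success <= 1:
        raise ValueError("need a positive attack rate and 0 < p_success <= 1")
    return 1.0 / (attacks_per_year * p_success)


def prob_compromised_within(years: float, attacks_per_year: float,
                            p_success: float) -> float:
    """Probability of at least one successful attack within the given period."""
    rate = attacks_per_year * p_success
    return 1.0 - math.exp(-rate * years)
```

Under these assumptions the two inputs map directly onto the two steps described above: the attack rate would come from attack-frequency data, and the success probability from vulnerability analysis.  Real attack rates are unlikely to be constant, so this is a starting point for a model, not a validated one.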

In contrast, the insurance industry has a good understanding of how likely you are to suffer from a wide variety of accidents tomorrow, the Met Office makes detailed weather forecasts and the medical profession can tell you your probability of contracting various diseases over the next year.  Most engineering disciplines use computer aided design tools to predict the properties of new product or system designs before they are built.

To be fair to the cyber security profession, it can predict some things.  Cryptographers can predict how long it is likely to take to recover a key from a stream of ciphertext.  It is feasible to predict system resilience to denial of service attacks.  But in a world of big data and data analytics, the cyber security profession has yet to develop the models and collect the data to answer some basic but important questions.

So, is this situation likely to change and what will be the drivers to change?  One change is the increasing ability to model an IT system from a security perspective, identify the most vulnerable attack path through it and give some measure of the difficulty for an attacker to complete the path.  This capability has developed through much academic work on the use of attack graphs.  These graphs don’t have x and y axes, but nodes and connections between them.  In one form, a node represents a privilege level on a device and a connection represents a possible attack step from one device and privilege level to another.  Attack graphs can capture considerable security data in a form that enables automated analysis of attack paths.
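The attack graph form described above can be sketched in a few lines of Python.  This is an illustration under my own assumptions, not the method of any particular tool: each node is a (device, privilege) pair, each connection carries an assumed effort in days, and the "most vulnerable" path is taken to be the one with the lowest total effort, found with Dijkstra's algorithm.  The devices and effort values are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Node = Tuple[str, str]                       # (device, privilege level)
Graph = Dict[Node, List[Tuple[Node, float]]]  # node -> [(next node, effort in days)]


def easiest_attack_path(graph: Graph, start: Node, target: Node):
    """Return (total_effort, path) for the lowest-effort path from start to
    target, or (inf, []) if the target cannot be reached."""
    best = {start: 0.0}
    prev: Dict[Node, Node] = {}
    queue = [(0.0, start)]
    while queue:
        effort, node = heapq.heappop(queue)
        if node == target:
            # Walk the predecessor links back to the start.
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return effort, path[::-1]
        if effort > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for nxt, step_effort in graph.get(node, []):
            total = effort + step_effort
            if total < best.get(nxt, float("inf")):
                best[nxt] = total
                prev[nxt] = node
                heapq.heappush(queue, (total, nxt))
    return float("inf"), []


# Hypothetical system: all devices and effort values are invented for illustration.
example: Graph = {
    ("internet", "none"): [(("web", "user"), 2.0), (("mail", "user"), 5.0)],
    ("web", "user"): [(("web", "root"), 3.0), (("db", "user"), 6.0)],
    ("web", "root"): [(("db", "user"), 1.0)],
    ("mail", "user"): [(("db", "user"), 4.0)],
    ("db", "user"): [(("db", "admin"), 2.0)],
}

effort, path = easiest_attack_path(example, ("internet", "none"), ("db", "admin"))
# effort is 8.0: internet -> web user -> web root -> db user -> db admin
```

Summing effort along a path is a deliberate simplification; a probabilistic model would attach a distribution to each attack step instead of a single number.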

A pioneering group in this field has been at the Swedish Royal Institute of Technology, commonly known as KTH.  In 2014, their landmark academic paper reported a measure of system security derived from an attack graph based model which correlated with the mean time for penetration testers to compromise systems.  This is a significant step towards answering our earlier question, ‘how long is it likely to be before my Internet connected system is compromised?’  If we can model the system and derive a measure of security with a known correlation to time to compromise, we can predict how long a particular type of attacker is likely to take to compromise it.  And in good scientific practice, we can test our predictions by experiment and refine our models accordingly.

However, there are still some big practical problems.  One problem is the practicality of modelling typical IT systems in sufficient detail to make predictions that have a significant correlation with experimental results.  There is substantial progress in this area too.  The KTH techniques have been packaged into a commercial Computer Aided Design tool which makes the development and analysis of cyber security models much easier.  IT systems of a scale commonly found in a small to medium enterprise often translate to models with over a million attack paths but the CAD based approach enables an exhaustive search for the most vulnerable path in less than a minute.

Building a model by ‘hand’ is still time consuming and can be prone to error, but there is much scope for automation of this as well.  System security data derived from vulnerability scans, network traffic or configuration files can be parsed and input to the model generator.  Models with thousands of objects have already been automatically generated using this approach.  The future development of bespoke software agents to collect model specific information would enable automated generation of increasingly accurate models of existing systems.

This still leaves the issue of knowing what type of attacker profile a particular IT system is likely to face.  Knowing that a highly skilled attacker can probably compromise your system with 10 days of effort is not enough to know how likely that is to happen.  But, in principle, we could collect data or experiment to find out.  There are many potential sources of data on attacks, for example intrusion detection systems, honeynets, firewalls, security operation centres and anti-virus systems.  To enable predictions of the rate of successful attacks on particular IT systems, we would need to collect data on the attack rates on many systems, and for each system measure the factors that affect attack rates.  These might include how attractive a target was, how vulnerable it was and how well publicised it was.  From this we should be able to draw correlations between attack rates and the factors that influence them.  This would bring cyber security more clearly into the realm of science, with the ability to make evidenced predictions of the probable time to compromise (and time between compromises) of an Internet connected device or system.  Cyber security already draws heavily from computer science; to make this transition it will also need to draw from data science and actuarial science.
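As a small illustration of the analysis step, the following Python sketch computes the Pearson correlation coefficient between one candidate factor and observed attack rates across a set of systems.  The choice of Pearson correlation is my assumption; the article does not prescribe a statistic, and real attack data may call for rank-based or regression methods instead.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples.

    xs might hold a factor score per system (for example, how well publicised
    each system is) and ys the attack rate observed on the same system.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 or -1 would suggest the factor is worth including in a predictive model; a value near 0 would suggest it explains little, at least linearly.  Correlation alone does not establish that the factor causes the change in attack rate.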

Is such data collection and analysis likely to happen?  It seems likely to me that it will.  Knowledge of factors that correlate with attack rates would be of value to the cyber insurance industry and to information owners who need to judge how well to protect their information.  For organisations that acquire data on cyber-attacks as part of their normal business, it would make sense to sell it, suitably anonymised, to intermediaries when they can.

For cyber security to be fully embraced by the scientific community it probably still needs a few more attributes.  It needs a means of recognising who is a qualified scientist whose predictions should be trusted.  Cyber security has various professional bodies but, understandably, the membership criteria do not currently test the ability to model IT systems and do the statistical analysis to predict time between compromises.  It also needs some new standards: what is the minimal information that an attack graph needs to hold for it to be considered fit for its purpose?  How should the competence of cyber security scientists be tested?  Under its Horizon 2020 initiative, the European Commission is already funding work by foreseeti and others in this area.

Perhaps the biggest test for admission into the scientific community is widespread trust by society in the science produced.  In the case of cyber security this would imply adoption of measures of security by legislators and regulators.  The UK Data Protection Act and the EU General Data Protection Regulation require that personal data is protected but there is no measure of an adequate level of protection.  In contrast, for example, regulations for pollution commonly do have quantitative limits on acceptable levels, based on scientific studies of the effects of pollution.  The cyber security profession has some way to go to reach this level of acceptance as a scientific body but there are encouraging signs that it is on the path to doing so.

For people working in the field of cyber security who wish to see it flourish as a scientific field there are increasing opportunities to contribute.  There is an emerging business opportunity for gathering, processing and selling data on cyber-attacks.  The availability of powerful CAD tools for cyber security analysis is an opportunity for security architects and designers to show the quality of their work.  For those in education and training, the same tools provide excellent training aids: challenge the student to model a system, study the predicted attack paths and propose design improvements.  Information risk owners and risk managers should demand attack graph based models from their IT system providers.  If the providers cannot produce them, how well do they really understand their own systems?  For academics and students at MSc or PhD level, there are many potential improvements to modelling techniques to be devised and tested.  Directors and regulators with cyber security responsibilities should ask for attack graph based measures of cyber security.

Overall it is an exciting time to work in the field of cyber security.  Collectively, there is much we can do for the long-term benefit of society by fully embracing best scientific practice.  foreseeti aims to remain at the heart of this transition.