This paper grew out of other areas I've been working on lately, namely aviation medicine, human factors, and the study of aviation accidents attributed to human error. Many of the concepts of aviation safety can also be applied to IT security, and this paper outlines some initial thoughts on the matter.
While aviation is extremely safe (well, safer than travelling in a car), it should be noted that in aviation there is little room for mistakes. Gravity is very unforgiving. It is a similar situation with IT security: if an administrator makes a simple mistake, there is a horde of hackers, crackers, spammers, and automated malware ready to dish out punishment for it, with very little regard for the poor administrator.
All computer systems are managed, operated, and used by people. Contrary to what popular science fiction may have you believe, computers do not have a mind of their own; they are designed, managed, and used by real people. This paper explores how human error has led to major problems in managing IT security systems, and how most problems are increasingly caused by human factors rather than by any technical or environmental failure.
Here's a theory: in any reasonably well organised IT operation, most security failures will be caused by the operations personnel.
To anyone in management, or the casual user in the street, this may defy belief. But it comes about because it's quite easy to implement a security system that is sufficient to "do the job", yet very hard to maintain it.
Systems with a single administrative domain are particularly vulnerable. These are the systems where management has been fully centralised into a single tool or administration interface.
Let's look at a typical example. Company xyz has an Internet firewall consisting of two redundant nodes in a High Availability (H/A) configuration, both on uninterruptible power supplies, in separate racks, etc. However, they are managed as a single unit, i.e. a single security policy configuration is loaded onto both systems. Here, the hardware and infrastructure are extremely reliable; they are very unlikely to fail. But there is a single point of administration: if an administrator makes a bad configuration and loads it onto the firewall cluster, the whole environment fails. The environment has a weak link: the people who operate it can (and most likely will) make mistakes that have a severe impact.
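The failure mode above can be sketched in a few lines. This is a minimal, hypothetical model (the node names, policy format, and `deploy` function are illustrative, not any real firewall's API): one policy object is pushed to every node in the cluster, so the redundant hardware offers no protection against an administrative mistake.

```python
# Hypothetical sketch of a single administrative domain: the same
# policy is replicated to every node, so one bad rule change takes
# effect cluster-wide at once.

def deploy(policy, nodes):
    """Push the same policy to every node; return each node's active policy."""
    return {node: dict(policy) for node in nodes}

# Redundant hardware: two nodes, separate racks, separate power.
nodes = ["fw-primary", "fw-standby"]

# An administrator's mistake: a default-deny policy with the
# allow rules accidentally left empty.
bad_policy = {"default": "deny", "allow": []}

state = deploy(bad_policy, nodes)

# Both nodes now enforce the same mistaken policy. Redundancy
# protects against hardware failure, not administrative error.
assert all(p == bad_policy for p in state.values())
```

The point is that the replication mechanism is working exactly as designed; it is the single point of administration, not any component failure, that brings the environment down.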
I have had first-hand experience of such a situation. A large company in New Zealand had a pair of load balancers to manage traffic volumes. They were in a dual-redundant configuration to prevent failures. However, they formed a single administrative domain and automatically shared and replicated a common configuration. During one period, a misconfiguration disabled all traffic passing through these devices, and it wasn't caused by any attacker.
In another example, a networking company decided against using DNS for host-name to IP address mapping and instead used local hosts files on every system. To keep everything up to date, they implemented a distribution system in which the hosts file was copied from a single internal system onto all other systems in the network.
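A rough sketch of such a scheme follows. The host names, file contents, and `distribute` function are hypothetical, invented to illustrate the structure: one master file copied verbatim to every system, so any error in the master is replicated network-wide on the next push.

```python
# Hypothetical model of the hosts-file distribution scheme: one master
# file is copied to every system in the network. In a real deployment
# this copy would be something like rcp/scp or rsync; here we simply
# model the replication.

master_hosts = (
    "10.0.0.1  mailserver\n"
    "10.0.0.2  fileserver\n"
)

def distribute(master, systems):
    """Copy the master hosts file verbatim to every listed system."""
    return {system: master for system in systems}

fleet = distribute(master_hosts, ["host-a", "host-b", "host-c"])

# Every system ends up with an identical copy -- including any
# mistake that has crept into the master.
assert all(content == master_hosts for content in fleet.values())
```

As with the firewall cluster, the distribution mechanism itself is the single administrative domain: a mistake on the one master system propagates everywhere, with no independent copy left to fall back on.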