Gary Lees arrived fresh and early to work on Monday 17 July after a week’s leave, only to be greeted in the corridor by the chief operating officer sharing that “there had been a bit of a disaster” and there was a briefing in his office at eight o’clock.
The “bit of a disaster” was a fire in the wee hours of that morning in Lakes DHB’s main computer server room at the Rotorua Hospital site. The fire had started in the somewhat ironically named uninterruptible power supply (UPS), which had more than interrupted the DHB’s IT system – all electronic services served by that main computer server were out of action, leaving the Rotorua and Taupo hospital sites and community service teams without access to electronic patient management systems, the internet, email or even voicemail.
As a result, everybody was about to start the new week with blank screens.
Lees says everyone in the DHB’s executive team is trained to operate within the CIMS (coordinated incident management system) structure, but he has a special interest in the area and had recently completed a postgraduate diploma in emergency management at AUT. So when it was agreed that a formal emergency operation centre was needed, the DHB’s director of nursing and midwifery volunteered to become incident controller, set up an emergency control centre and called the first of the three-hourly incident meetings for 9am.
The situation was fairly grim. How long it would take to get ‘all systems go’ again was a big unknown. Meanwhile, two hospitals had to continue operating safely.
“We couldn’t get anything to come up on any computer screen anywhere in the system,” says Lees. People could phone each other, but couldn’t leave voicemail messages, and there was no intranet or access to the electronic management systems that staff now took for granted. In all, 54 software products used by DHB clinical and administration staff were unavailable, including electronic meal ordering.
Lees says the good news was that the laboratory was on a different computer system, so blood reports were still available, and radiology could still do scans and imaging, so those diagnostic tools weren’t affected.
The problem was how to share results and information that people were now accustomed to receiving and storing electronically.
The DHB’s business continuity plan meant it already had on-hand templates for paper versions of all the electronic forms commonly used by the DHB.
The team quickly got a laptop and printer to work printing off templates; IT staff were sent running about the hospital to set up ward printers to be used as photocopiers, and the communications officer was set up to be able to write and print staff bulletin updates. Lees says by about 10am the hospital had switched into paper mode and was tracking admissions and discharges on paper, as well as taking meal orders.
To get paper from A to B and back again – and keep staff updated – the incident team ramped up the existing internal post system by bringing in more runners, including nurses in roles such as nurse educators, who could be pulled out of their normal duties to help out during the IT crisis.
Fortunately the DHB’s patient records were not fully digitised, so it had paper files containing paper copies of most, if not all, the patient information and reports held electronically up to the time of the outage.
Lees says it was also lucky that the medical records office was in the habit on Friday afternoons of printing off the electronic theatre and outpatients lists for the first days of the following week. This meant the DHB had paper lists of who was due to present, and it made the clinical decision to postpone some surgery and appointments until the impact of the outage was clearer.
The emergency department was a clinical priority so it was loaned the incident control team’s stand-alone wifi hotspot, which can connect up to six laptops or tablets, so that it could access national systems to check or create NHI (National Health Index) numbers.
Also supporting ED was the “absolutely brilliant” primary health organisation, Rotorua Area Primary Health Services, which turned up at ED with some of its own computer equipment, allowing ED staff to regain their usual access to GP patient records, but directly via the PHO’s network.
The DHB’s pharmacy found its own quick fix by logging in to Taranaki DHB’s ePharmacy system using a mobile phone, the 4G network and a laptop, so it was quickly back dispensing medicines as normal. Lees says that staff in other areas popped home and got their laptops and printers. He expects a number also used personal mobile data and devices to access clinical apps or support systems – improvisation was the order of the day.
Getting back on track
Day one had started with some clinical concern and uncertainty about attempting business as usual in the midst of an IT outage, says Lees. But by midday people were “much happier”.
By the end of that first day the two hospitals were functioning, slowly and with some inconvenience – and with some inevitable concern about missing something stored electronically – but overall Lees said there was an “amazing response” by everyone. “We were so, so pleased with how the nurses, allied health, doctors, support staff and admin people all pulled together.”
By Tuesday, surgery and outpatients were basically back to normal and the hospital’s free public wifi system, provided by an external company, was up and running. It wasn’t secure so it couldn’t be used for transferring patient data, but it did give staff with mobile devices and laptops free access to the internet, and the DHB’s external website could be used to share outage updates.
While the emergency control centre worked on keeping the hospital up and running under a paper-only regime, the specialist IT team was swiftly seeking expert advice on how best to get the IT system back up and running.
The DHB had to track down and fly in forensic cleaning experts to clean up the smoke and soot in the server room, and it had to find, hire and install a replacement UPS. This meant IT staff working round the clock, and neighbouring DHBs lent IT and emergency planning staff.
The first time the switch was flipped on the replacement UPS, the air-conditioning failed and the DHB had to wait another 24 hours to get replacement parts to ensure that the room had the consistent temperature and humidity control required by the sensitive server equipment. The IT team also started working on setting up a backup secondary server to run a skeleton IT system and was on standby to redeploy 50 desktop computers or laptops – to replace computer monitors that couldn’t run without the main server – to every ward or outpatient clinic room to run the patient management system.
Fortunately, when the switch was flipped again on the Wednesday night, the full system came back to life.
With the health system nationwide moving steadily towards full electronic health records, the Lakes outage has highlighted the vulnerability of a health service being reliant on a single server site.
Having a backup server offsite sounds a good idea in theory, but Lees says he was told that there was about $2 million of IT ‘kit’ in the server room, so duplicating that wasn’t an option for small DHBs. Also, it wasn’t the server itself that failed – all data was safely backed up and not at risk of being lost – it was the loss of a guaranteed uninterrupted power supply to the server that was the major issue. (The wisdom of not plugging the server straight into mains power was brought home that week by a storm that cut power to Rotorua homes and created a power spike big enough to have blown everything in the server room.)
The July IT outage has prompted Lakes to speed up joining the regional data centre being developed in Hamilton for the Midland region DHBs. This is part of the national infrastructure platform that aims to increase the security and reliability of DHBs’ IT infrastructure and reduce the risk of critical outages.
“Business as usual”
Everyone involved was relieved that the outage lasted just three inconvenient and challenging paper-shuffling days.
Returning to “business as normal”, however, turned out to be not just a matter of successfully flipping the switch.
Electronic patient management systems were up and running on the Thursday, but patients who had been admitted on paper couldn’t be discharged electronically until somebody manually uploaded the information from the paper admission forms. This meant that paper discharges had to continue until the hospital caught up with inputting the backlog of paper forms.
Hiccups emerged when staff inputting the forms discovered gaps in the information – that electronic templates would normally prompt nursing or clerical staff to complete – and these gaps were time-consuming to fill once the patient was long gone home.
This has prompted another lesson learnt for Lees, who says if ever Lakes had to revert to paper forms again it would set up a small team to check over filled-in forms immediately and send them straight back if gaps were spotted.
Lees and the incident control team finally handed over to a recovery manager at 2.30pm on the Thursday after an intense and challenging three and a half days.
Time to take a deep breath… and catch up on all those emails that spilled into his mailbox after Lakes rejoined the digital world.
Advice and lessons learned
- Have a business continuity plan – even if the reality is different, having worked through different ways of responding to emergencies is invaluable.
- Use a CIMS (Coordinated Incident Management System) approach right from the beginning of a major incident – don’t think you can manage it with your normal processes.
- Set up a system so you can save and still access electronic surgery and outpatient client lists if an outage occurs.
- Keep paper templates of electronic forms up to date and consider having a stockpile of pre-printed paper forms.
- Know how many standalone desktop computers and laptops you can deploy, i.e. ones that aren’t reliant on the main server to operate.
- Have more than one stand-alone wifi hotspot device on-hand for emergencies.
- Move the UPS (uninterruptible power supply) outside of the server room to remove the risk of a fire in the UPS damaging the server.
- Big picture – pursue shifting to regional IT data and infrastructure models sooner rather than later to increase security and reduce the risk of critical local outages disrupting electronic services.
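For IT teams wanting to act on the advice about keeping surgery and outpatient lists accessible during an outage, the following is a minimal sketch of one possible approach, not a description of what Lakes DHB uses. It assumes bookings can be exported from the patient management system as simple records; the function name `snapshot_upcoming` and the fields `date`, `list` and `patient` are hypothetical and chosen only for illustration.

```python
import csv
import io
from datetime import date, timedelta


def snapshot_upcoming(bookings, today, days_ahead=3):
    """Return CSV text of bookings dated from today to today + days_ahead.

    `bookings` is a list of dicts with the hypothetical keys
    "date" (datetime.date), "list" (e.g. "theatre") and "patient".
    """
    cutoff = today + timedelta(days=days_ahead)
    # Keep only bookings inside the window, ordered for easy reading on paper.
    upcoming = sorted(
        (b for b in bookings if today <= b["date"] <= cutoff),
        key=lambda b: (b["date"], b["list"], b["patient"]),
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "list", "patient"])
    for b in upcoming:
        writer.writerow([b["date"].isoformat(), b["list"], b["patient"]])
    return buffer.getvalue()
```

Run on a schedule, the returned text could be written to a standalone laptop or printed, which is the automated equivalent of the Friday afternoon printout that helped during the outage.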