This automatically-generated transcript is taken from the IT Pro Podcast episode ‘Behind the scenes of the Solarwinds hack’. To listen to the full episode, click here. We apologise for any errors.
Adam Shepherd
Hello, I'm Adam Shepherd.
Jane McCallion
And I'm Jane McCallion.
Adam
And you're listening to the IT Pro Podcast.
Jane
This week we've got something a little bit special for you. We're taking a deep dive into one of the most significant cyber attacks of the past year, speaking to those who were on the frontlines of the incident response efforts and post-hack cleanup: Solarwinds CEO Sudhakar Ramakrishna and CISO Tim Brown.
Adam
Supply chain attacks have always been a danger for businesses. But as digital tools become more pervasive, they pose more and more of a threat. The idea behind this kind of attack is simple. When a hacker targets a particular organisation, rather than attempting to break through their security directly, they can attack one of their target's partners or suppliers, using them as a Trojan horse to smuggle malicious code into the network of their real target.
Jane
In late 2020, this tactic was used to attack thousands of entities at once, from companies like Microsoft and Deloitte to the US government. It all leads back to infrastructure monitoring giant Solarwinds; the company's Orion platform is used by a vast number of customers in both the private and public sectors to optimise IT performance. And its deeply embedded position within the data centres of potential high value targets made it an ideal candidate for the attack.
Adam
The incident has been blamed on operatives working for Russian State Security, who compromised the Orion platform and used it as a backdoor to access the IT systems of Solarwinds customers and deploy further malware.
Jane
Exploring the technical details of an attack like this can be useful for preventing similar attacks in future. But the impact of these incidents goes beyond the technical. What you often don't hear about is the effect they have on people who have to clean them up. The Incident Response teams responsible for putting out the fires and keeping the wheels on.
Adam
We spoke to Solarwinds CISO Tim Brown to find out what it was like on the ground as the hack was being discovered.
So thanks for joining us, Tim.
Tim Brown
Oh, you're welcome. Glad to be here.
Jane
So Tim, can I ask when did you first suspect that there had been an intrusion?
Tim
So yeah, December 12 was kind of the date that lives in infamy for us. Yeah, that was a Saturday morning. FireEye's CEO called our CEO and said, hey, you know, we believe you're shipping tainted code, we know that you've, you know, we've seen the code, we've decompiled the code, so we knew it was there. And, you know, I quickly got a call that said, hey, the CTO for FireEye is going to contact you. And, you know, he contacted me, like an hour later. And we went into details. So that was our first indication of, you know, anything. And that's really when, you know, our whole incident response started and everything started from there.
Jane
Yeah, it's a classic, isn't it? Of course, it's gonna happen to you on a, on a Saturday. What a nice way to wake up to your weekend.
Tim
Yep. Yeah. We didn't have a Christmas or New Year's, that was for sure. Yeah. So yeah, Saturday morning. We worked, essentially, Saturday outside the office; Sunday, we were all in the office. Basically, in the office for a couple of weeks straight. I think the first, you know, literally the first time we had a little bit of time off was, you know, Christmas Day. So it's just one of those types of times where, you know, there's just so much to do, so many little things to do, so many things you have to have right. You know, we were, you know, writing financial, you know, 10k information, you know, at two in the morning to get it right. A lot of response needed to happen in the first few weeks.
Adam
So how many people were on those kind of initial incident response squads?
Tim
Yeah. So it's a good way to put it: the squads, right? I'm glad you use that term, because it wasn't just one incident response team working on everything. Right? We split up the incident response to, hey you're dealing with customers and had all the customer questions that we've got to have, and all of those things. Oh, you're dealing with financial data, and all the financial implications that we had in a public company. Oh, you're dealing with the incident itself and the investigation into our internals, our investigation. So each of those squads ended up working kind of separate, but you know, together coming back with information nightly. So on each one of the squads is probably - from a leadership perspective, the team was probably 10 to 15 type of folks. Then, of course, we had our board then we had the technical teams underneath. And the technical teams underneath; we mobilised everybody possible, right? So we probably had, you know, from the engineering team looking at things, you know, 50, 60, folks, at least.
Adam
Oh, my God. Wow
Tim
Yeah. Then the IT team, probably a good 20, 30 of the IT folks, were looking at things. So essentially, everything stops and everything gets... if you're not working on this you're working on something that can be associated with it. So a big response just to get, you know, the right answers at the right time. Yeah, both figure out what happened, and then start working on our response to what happened.
Jane
How did you decide how, what squads to make, and who would go in them? And were they all created at the same time? Or was there an element of shift over the course of the incident response?
Tim
There was a number of people that shared different areas of the response, right? So I think there was some natural creation of them fairly quickly. That, okay, you know, you're responsible for the marketing communications, you're responsible for the customers, right? Then you have to remember, we have a very large set of customers, right? Solarwinds has 200,000-plus customers around the world. So everybody is calling, everybody wants information. Everybody wants data. So our support teams needed to be ready to handle a huge influx of calls, so we had to make sure we staffed those. So I guess that the squads kind of developed themselves in some ways, but we knew that we needed these levels of, hey, you're going to need customer response, right? You're going to need customer outreach, you're going to need marketing communications, you're going to need, you know, technical investigation, you're going to need engineering investigation on the products. So they somewhat built themselves. But I think some of the partners that we had in place really helped us with the definition of some of those things.
Adam
So how did the partners help you with those kinds of definitions? And are there any kind of specific partners that you'd like to call out?
Tim
Yeah, sure. So we've been very open about who we've been using. Right. So DLA Piper has been, you know, our legal counsel, and they also have a very good forensics and cyber team inside of DLA. So they helped by being a good coordinator for us. And yeah, so under their paper, we ended up bringing in CrowdStrike as a first level partner focused on the macro investigation, great focus on the macro investigation, so very good threat hunting, great skills on that side, and good technical general skills. What happened though, is that when we looked at the development organisations, and since this happened inside of a supply chain in a build environment, we needed somebody with specialised skills to really understand those build environments. So we brought in KPMG's forensics team. KPMG, not necessarily known as a forensic company, but a very strong team of about 150 or so people that they have. And they really focused on the micro inspections. So where CrowdStrike's macro inspection was instrumenting everything, pulling in data, is a threat actor still here, KPMG focused on, all right, this happened to build environments, so let's check every build environment in the company. Let's check to make sure it's not anywhere else. Oh, let's make sure, let's see if we can find it. Let's see if we can find a build system from that period of time. So that separation really helped a great deal as well. All right, where the, you know, the threat hunt team is threat hunting, the forensic team is really looking at the deep details of your development environment where this occurred.
Adam
So you've talked about some of the broader elements of the response strategy. But what were the initial stages like when you first realised that something was up? What were the first things that you had to do?
Tim
Yeah, so first things were, you know, normally when we have an incident of any type, we have to prove it. Right, we have to prove that it occurred, we have to prove that it was there. Yeah. Luckily, in this instance, the proof came in with the report, right? There was proof that said, hey, and it was clear, right? Yeah, I decompiled this code and this code was here. That shouldn't be there. You shipped that. So in that case, proof was not the problem. So the initial thing was, okay, well, how do we help the customers? How do we get them information? We were able to, you know, quickly figure out what builds were affected using the same method they used, the FireEye guys used, to prove that something was there. So although we didn't know all the details, we knew that three builds were affected; we found that out quickly. But then how do we get in touch with any customers that had downloaded those three builds? How do we inform them? How do we get them out of harm's way? Because we didn't know the command and control servers had already been taken down, we didn't know how much risk was being faced by those customers at that point in time. That was stage one, get committed.
Adam
And you have to assume the worst right at that point, you have to assume that it has been a full compromise, until you can say otherwise.
Tim
Yeah, it's still in the wild, it's still active, all those things; assume the worst and say, hey, let's get our customers either to shut this down, to upgrade to a different version, to understand what's going on. So that's where a lot of work went in the first, you know, days, with some of the national defender organisations like CISA and NCSC, right, was really to have them amplify the message: if you're running these versions, stop, right, if you're running these versions, upgrade to a version that wasn't affected or use a version that wasn't affected. If you're not running these versions, and you haven't been running these versions, you're not affected. So that was one of the first things to focus on, the customer. Then Sudhakar will probably bring up some information around the financial side of the world, right? We are a public company. Therefore, putting out information to the financial services in the street was also a critical thing; we had to issue, you know, what's in the US called 10Ks, right, to be able to explain, hey, this happened. So very important to be able to do that. And do it accurately. So that's where when I say that, you know, we were doing messaging until, you know, two or three in the morning, a lot of that was, you know, making sure that every, you know, every comma was in the right place, every period was in the right place, the information was as clear as possible. At the same time, focusing on getting the customer ready, and in a good place.
Adam
What was the atmosphere like inside the business, particularly in the technical teams, once you realised that Orion had been compromised?
Tim
Yeah. So the technical teams were really mad. Right? You know, they were just pissed off, right? They were upset; this happened on their watch, how did this happen? How did this occur? You know, how could they disrupt my product? Because there's a lot of ownership right? If you, if you build code, you know, you own it, right? It's your it's your baby.
Adam
It's your baby.
Tim
Yeah, exactly. It is your baby. So to have somebody break into your house, and corrupt your baby, and change it was, you know, a very difficult situation for folks. So they wanted to do whatever was necessary to both resolve the problem, to understand the problem and understand the problem deeply, understand the incident deeply. So everything from you know, how did this happen? But how do we make it right? And how do we make it so it can't happen again? So that was really the tone across the environment. And that tone continues, right? That can cause continuous trauma. Hey, let me do what I can do. Right? Let me do what I can do to help. Oh, you need me to do something different, or become you know, something different to give you more visualisation, something different to, you know, increase security in their environment. There's, you know, an attitude of Yes, right, inside the organisation.
Adam
Because that's every engineer's worst nightmare, right? Getting that call. It's, it's the one call that you never, ever want to get.
Tim
Absolutely, absolutely. And you know, you do equate it to somebody having, yeah, a break-in to their house, right? It's like, wow, this is, you know, somebody broke in and disrupted my environment. You know, something with my name on it got tainted. So, but you move forward, you go forward with it, right? But you go forward in a better way.
Jane
Yeah, Tim. And I think that's really important. It's something that you have raised a couple of times is, you know, how your marketing and communications teams are involved, and the amount of external messaging that goes on cause I think it's very easy to get into the mindset that this is an IT problem or an application problem. And therefore, the solution is entirely with techies rather than actually, yes, it is. But it's also a major problem for your customers. And so you also need crisis comms going on, internal comms and all that kind of thing. Is that sort of a fair sort of assessment of it all?
Tim
The comms is so important, right? Internally we had comms going on to the company, right? All the company was concerned; they were concerned about, hey, what is this? What does this mean? What does this mean to the company? How do we comms to the company, and so that was kind of the start of the model. But then our customers? Absolutely our customers needed to know what to do, and it wasn't a technology conversation per se; it was what should you do, you know, as a customer, what should you investigate as a customer, you know, what is correct and what is false? There was a lot of information that was floating around in media, some good, some bad, some wrong, some correct. So it was very hard for people to get what was, you know, truly happening. One of the issues that we had was that the name of the company was cool enough that you remembered it. Right? So it's actually bad, because you know, Microsoft first named this attack, called it Nobelium, right, or something. Right. But nobody talked about the Nobelium SC attacker. They talked about the Solarwinds attack, the Solarwinds attacker. So the Solarwinds attacker often got confused with Solarwinds. So people would say Solarwinds attacker, they would say Solarwinds attacks the, you know, Department of Justice or something, right? They would say those types of things, as opposed to the Solarwinds attacker did this; Solarwinds attacks email, right. So that created a lot of misinformation out there simply because the name was easier to remember. And the details kind of got hidden in things. So communications to everyone is really critical. We also took the approach to be as transparent as possible with our customers, with what we knew, with the amount of information we were flowing. By no means were we perfect, right, at getting information out. But it absolutely was our plan to be as transparent and open and sharing as much as we could.
Jane
Do you feel, Tim, that the existing incident response plans that you had in place were adequate for dealing with the scale of attack that you ultimately faced?
Tim
Sure, great question. They, yeah, the incident response processes that we had, you know, essentially we test all the time. So we test for low level incidents, as well as larger level incidents. So the company itself was very prepared to handle an incident. Now this incident was at a scale we had never seen, right? So we were efficient enough, right? We knew who to call, we knew how to come together. We knew how to start the investigation. We had, you know, folks on the phone very quickly, we had the right people in the room very quickly. So I think in most cases, the Incident Response Plan was, you know, pretty good and pretty prepared. One of the things that really came out was that we didn't necessarily have the right way to communicate to all of our customers. We communicated via website, we communicated via, you know, our own blogs, we communicated information for people, but they had to go pull it. We sent mailings out to everybody that we had on mailing lists, but we didn't necessarily have the right contacts within the environment; we had the sales contacts for the environment. So that's some of the places we've improved, right, we've put a security contact field on every one of our Salesforce records. So every customer can have a security contact, and we can use that to communicate security information. So I guess our efficiency was not as good as what it could have been. But you can always find improvements. But overall, you know, we were fairly prepared on the incident response process answer breach pipes chain, which not like we scrambled around for days, right? I mean, you know, we had our teams ready on Saturday, they were in process on Saturday. On Sunday, we were issuing, you know, all sorts of information; I think Saturday, we may have issued information. So it wasn't like we were waiting days to be able to get things going. So in that way, we were pretty well prepared.
Adam
So how long did it take from the initial kind of discovery and incident response? How long did it take until you were satisfied that the immediate threat had passed?
Tim
The first one was, are they still there? Right? Are they still in the environment? Was this an event in the past, or was it an event that was ongoing? So you know, first few days were a little crazy, right? So after FireEye called us, Microsoft called us again that same day or the day after. FireEye had said that we were going to get a call from Microsoft, which we set up, I believe, Sunday, Monday at the latest. And essentially, that our email system, O365, had been compromised. And with that, we were able to rectify that very quickly, you know, within minutes of a call with them. It was rectified; essentially an application, a rogue application, had been installed. So that was removed. With that, then the question is, is the threat actor still there? So that starts with, you know, the CrowdStrike investigation of hey, instrument everything, right? And probably we were looking deep, right? We were looking, are they still here? Are they still here? Are they still here, across the environment, and doing things like changing all credentials, all of those things, just tightening everything down. That was probably a couple of weeks, to get to that, hey, everything is tightened down, every user is tightened down. What we were able to isolate really quickly was the development environments; we were able to isolate those enough to be able to put out an important build that needed to get put out. And that was relatively fast, we were able to isolate those environments, and isolate the build environments and the development community into a kind of a sandbox. But I would say from a, hey, are we comfortable they're not here, are we comfortable that, you know, nowhere in the company is affected? That was probably, you know, a couple, three, you know, we kept looking, right? So a couple, three weeks maybe.
Adam
Okay, that's, that's pretty good, I would say, certainly for something at this kind of scale. So obviously, Solarwinds does a lot of security stuff from a kind of products perspective; were there any learnings that you took from your own kind of incident response experience in dealing with this attack that fed into, you know, the development or feature set or anything like that, of any of Solarwinds' existing platforms or products?
Tim
So, yeah, I mean, if you think, you know, not right at that point in time, right. But you know, when we look further on what was missing, right, what do we need to have? You have visibility within the environment; really critical. You know, the Orion suite provides you visibility and provides you information across the entire environment. And it helps you with understanding what is available, what's going on, what's going on with your network, how are things configured, all of those things become very important. So from a product perspective, we kind of merged some of the IT functions into the security function. And we see that those can come together a little bit tighter. You know, we also look at the build environments and how we build. That's been one of the major kind of improvements from a secure by design perspective, is to build an assumed breach model. So build assuming that you have a developer that's rogue, just build with assuming that you have a component that could be, you know, corrupt and that's the model and build with those things as an assumption and then you can put safeguards in against them. So for example, our build environments are three, right, you have to have three different build environments, therefore not any one person has access to all three, therefore you would need to have collusion between three in order to effectively corrupt a build now. Those type of concepts are really critical. They get put, give resilience into the environment.
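[Editor's illustration] The three-environment idea Tim describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Solarwinds' actual build system: it assumes builds are reproducible (the same source yields byte-identical output), and the names `digest`, `builds_agree`, `odd_ones_out` and the `env_a`/`env_b`/`env_c` labels are hypothetical.

```python
import hashlib
from collections import Counter


def digest(artifact: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def builds_agree(artifacts: dict[str, bytes]) -> bool:
    """True only if every independent environment produced identical bytes."""
    return len({digest(a) for a in artifacts.values()}) == 1


def odd_ones_out(artifacts: dict[str, bytes]) -> list[str]:
    """Name the environments whose output differs from the most common digest."""
    digests = {env: digest(a) for env, a in artifacts.items()}
    most_common, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(env for env, d in digests.items() if d != most_common)


# Hypothetical outputs: two environments match, one has been tampered with.
clean = b"example build output"
tampered = clean + b"<injected>"
outputs = {"env_a": clean, "env_b": clean, "env_c": tampered}

print(builds_agree(outputs))   # a release would be blocked when this is False
print(odd_ones_out(outputs))   # environments to investigate
```

With a check like this, a single compromised environment shows up as a mismatch; an attacker would need to alter all three outputs identically to pass unnoticed, which is the collusion requirement described above.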
Adam
Yeah, it's that idea of you know, not if but when, you know, which I think is is very important, particularly in security, for companies to appreciate, you know, it is going to happen at some point. Statistically, you are going to be breached at some point. And you should, as much as possible, design your security with that in mind.
Tim
Yep, absolutely, and design it with the people in mind, design it with the, you know, I was talking to another government entity a while back, and we were talking about zero trust. And, you know, our zero trust models had moved more towards the cloud, and more towards zero trust in the cloud, because that's kind of the natural model. Right? But with this event, we start thinking zero trust internally, right, shifted to not just external but really internally zero trust, and what does that mean? You know, not having one entity being able to do harm to others, having checks and balances across the entire system in the entire environment, yet still being able to do business, still being able to, you know, do the right things, but at the same time, having appropriate safeguards to be able to, you know, circumvent these types of attacks.
Adam
So, Tim, if you could go back and do it again, is there anything about your response to the Orion attack that you would have done differently?
Tim
So there's, you know, a number of things, right? I talked about the communications before, right? That probably was one of the bigger ones, right, is how do we communicate to customers in a clear, concise fashion about what to do? Yeah, I wish we had very solid information on day one, right? Just a little bit clearer on the communications about, you are not affected, please don't spend incredible amounts of money investigating whether you're affected or not, because a lot of customers, all customers, with the news, with everything going on with the big media blitz...
Adam
They panic, right?
Tim
It was panic; it was Solarwinds. Oh no, Solarwinds, am I using Solarwinds, right? So we had people who did not have an affected build whatsoever still spending a lot of money doing investigation; we had people not even using Orion, who were using some of our other products, still spending money doing investigation. So if we could have gotten that message out faster and clearer, I think that would have been something that would have been, you know, beneficial to the world. And we talked a lot to the national defenders about that, right? And how we could try to get better on the communications of, you know, of facts and, okay, you're running this, you don't have to worry, right? But it took time, and within that window of time, you know, the amount of work that people did within that window of time till the truth came out was, you know, just incredible. So I think that's one of the things that, you know, everybody needs to kind of focus on: think about the customer, think about what they're going through with this, not just about what you're going through, try to be clear, concise, and then get the, you know, the appropriate, you know, groups to be able to amplify that message so that the customers have enough comfort to understand they were not affected. Yeah, one of the things that's not heavily known, right, or it's there, is that the Solarwinds event had the potential, you know, we went high on our numbers. So I think it was 28,000, we said, customers had downloaded the product. When we look at the actual numbers of customers that went to a secondary attack, the maximum, total maximum, is under 100.
Adam
Wow.
Jane
That's not bad.
Tim
Yeah. So when you have thousands and thousands of customers out there all doing work to figure out that they weren't affected, it's, you know, it ruined a lot of people's Christmas, right? Not just mine, right? Every one of the IT departments around the world that was investigating it, you know, which is terrible, right? And the amount of work that they had to do to prove that they were not affected, essentially. So those are some of the things that I think we could have tried to do better. That's something I think everybody needs to think about, right? You know, how do you communicate clearly? What's the message you're going to communicate? How do you get it out to everybody in the right way? How do you have it so that they trust it? Because you know, media won't necessarily be your ultimate friend during those moments to put out good information.
Adam
Well, Tim, thank you so much for joining us. And thank you for sharing your perspective on what is an absolutely fascinating example of real world incident response.
Tim
Absolutely. Glad to be here.
Jane
Almost a year has passed since news of the hack first broke. In that time, there have been a number of investigations into its causes and the methodologies used, as well as concerted efforts to remediate the effects.
Adam
While the initial security flaws in Orion have now been repaired, the process of responding to an attack of this scale doesn't stop just because the perpetrators have been ejected from your systems. There are many tough questions that have to be asked and for the leader of any company that falls victim to this sort of hack, reassuring both customers and internal stakeholders can be a daunting task.
Jane
Solarwinds CEO Sudhakar Ramakrishna stepped into the role in January this year, replacing outgoing CEO Kevin Thompson, less than a month after news of the Orion attacks was made public.
Sudhakar, thank you very much for joining us on the show.
Sudhakar Ramakrishna
Thank you, Jane. Thank you for having me.
Jane
So I think it is probably fair to say that the circumstances under which you joined the company were less than ideal. But just how challenging were those first few weeks?
Sudhakar
Less than ideal is one of the understatements that I've heard about my joining the company. Most people were really thinking, what was I thinking, joining the company? So the first few weeks, quite frankly, the first few months, were quite challenging, I would say all the way through to April. Solarwinds, as you know, is a ubiquitous company, more than 300,000 customers leverage our software. Equally Solarwinds was a company that was very comfortable not being in the spotlight, we were mostly focused on serving customer needs, and then moving on about our life, so to speak. And we were thrust into the limelight, in not so flattering circumstances, as you well know. So dealing with both internal aspects of how do you kind of handle that spotlight, good or bad, and then obviously taking care of your customers, addressing government authorities. And then most importantly, solving the problems were the challenging aspects of the first several months. I would say luckily, in some ways, I've had some experience dealing with security breaches in the past. And so there is an approach that I generally take, and that served me well here as well.
Adam
Can you share some details around that approach?
Sudhakar
Yeah, definitely, Adam. First and foremost, I've always been a believer that when you have any incident of this kind, not just a security incident, but let's say a major quality issue, a major issue that impacts customers, the first and foremost thing that you do is be transparent about it, meaning come out and say what happened. That's the most important thing. People talk about trust. And a significant part of trust is transparency. The second aspect of it is, I would say humility, and when I say humility, it is a matter of constantly learning, especially in security. I don't think there's one company in the world that can say I won't be breached, or many people have been breached and simply don't know. Right. In fact, in the security space, there is a joke that goes the following way, which is there are two kinds of companies: those that have been breached, and those that have been breached but do not know about it, is kind of the joke that runs. So coming back to how you deal with it, it's transparency. It is humility. And it's a sense of urgency that you have to drive in terms of making progress, but maintaining a sense of calm, meaning that there is a procedure and a process that you have to go through to drive what I like to say, obligation before opportunity. So you have an obligation of taking ownership, taking responsibility, solving the problem. And then as you do it, you will be able to address greater opportunities in the future. So it was a lot of communication internally, it was a lot of communication externally. But on a foundation of structure, calm and action.
Jane
While the technical response was well underway, by the time that you took up your position as CEO, Sudhakar, there was still presumably a lot to do in the aftermath. And what were the biggest items on your to do list in those early months?
Sudhakar
Yeah. So you are right in highlighting that the technical response, so to speak, was fairly quick after we found out about the issue. In fact, in about 48 hours, our teams produced a patch. And at that point, I wasn't even at the company. It is one thing to produce a patch, it's another thing to actually get customers to deploy the patch, especially in a premises-based environment. Equally, the important thing to recognise there is a lot of customers, given the supply chain attack nature of this particular incident, first and foremost, did not fully appreciate what that meant. And then the second piece is, if it happened to you, could it happen to me, and what else might be happening? So there's a lot more to, call it customer management, customer engagement and customer success more broadly, than simply giving them a patch. That's one aspect of it. The second aspect of it is, what do we do as a business? What did you learn about it? And what are you going to do to make systemic improvements? So on one hand, it was a nation state attack, and no company might be immune to a nation state attack, as was evidenced by, let's say, much larger breaches and much different breaches. So for instance, the Microsoft Exchange breach was attributed to China. And so it is not a matter of how many resources you have, how talented you are, when a nation state that has infinite resources, or significant resources, I should say, is after you. One can take that as comfort and use that as an excuse and say, I couldn't have done anything differently. Or you can take the approach of, okay, what did we even learn from even this situation? And what can you do about it? And so that's how we came up with this initiative, I would call it, called secure by design. That's an initiative I've used previously in other companies. But in this particular case, given the scope of the challenge, it was much broader and much wider.
And so we use that as a rallying cry across the organisation to become better. And that also became the vehicle by which we communicated to the outside world, be it customers, be it partners, be it the government. That has served us really well because, as I mentioned earlier, obligation before opportunity, and now many customers are trusting us with, call it bigger deployments, because of how we are dealing with the secure by design construct.
Adam
So you mentioned that customer management was one of your kind of primary focuses; what were the first initial meetings with customers and partners like, how did they feel about Solarwinds following the attack?
Sudhakar
I'll categorise it in two ways. Let's say between January and March, and maybe even on to April, in most of my customer discussions you obviously apologise for the inconvenience that you caused them, regardless of whether it was a nation state or not. So that's about taking ownership and being transparent. Most of the questions were about what happened and why did it happen, as opposed to what did you learn. But since about the April timeframe, given that a bunch of enterprise vendors came out and said, this happened to me, that happened to me, since the time we came out, I think there is an appreciation that this could happen to anybody. This is an industry-wide problem. This is not a Solarwinds-specific issue. In fact, since I'm talking to the two of you in the UK, I was talking to the UK Cyber Security Centre, the director of the UK Cyber Security Centre. And he said that at the same time that we were researching our issue, they were researching more than a handful of supply chain attacks in the world. Obviously, I can't go into the details of it. But that gives you a sense for that. And so, starting in April, I would say customer conversations were more about what did you learn? How have you improved? How can we apply your learnings into our environments? Because many of my customers are also producers of software, and they could be impacted by supply chain challenges as well. So it's become more educational, more informative, although the first four months were anything but easy.
Jane
It's always difficult to say with anything, whether that's IT or just anything; it does seem like supply chain attacks are on the rise, they are increasing in popularity. Is that what you feel as well, Sudhakar, or is that just because it happened to you, now there is a lot more reporting about it, and it becomes a self-fulfilling cycle?
Sudhakar
As you know, supply chain attack as a security attack construct is a fairly old technique. It's not a new technique per se, except that it's becoming more and more prevalent nowadays. And has the incident rate increased? I would say yes, in many ways the incident rate has increased. But I also think that companies like us that have been much more forthcoming and transparent about this is what happened to us have also emboldened others to come out. And this is one of my key focus areas working with the government and the regulators: do not indulge in victim shaming, especially as it relates to security issues, because the more I hold that information, the more I'm actually impacting my customers, because they could be the subjects of these challenges. And so it's not to take away responsibility from software vendors, because we all bear responsibility to improve constantly and do everything we can to avoid it. But when there are issues like this, we need to come out earlier rather than later. Because as they say, the first few hours in a crime are very important. Similarly, in a security incident, the first few days are very important to understand what happened and what you can do to fix it.
Adam
One of the really interesting points that Tim brought up is that there were a bunch of customers who hadn't really kind of fully read through the advisories and guidance that Solarwinds put out, and were doing investigations into whether or not they were affected, despite the fact that they weren't using Orion at all. You know, that costs money, it costs time, it introduces a bunch of unnecessary stress and, you know, overheads, and more and greater information sharing can help eliminate that kind of thing in these kinds of incidents.
Sudhakar
Absolutely. That's one reason. And then another aspect of it is, while this incident got a lot of press, as I'm sure you've seen, we reported alongside the government that maybe fewer than 100 customers were actually able to reach the secondary server. And one of the main reasons I would highlight is many customers would have probably configured Orion the right way, which was: don't give it access to the external world through your firewall. When you don't have it, the software or the malware is inert. It can do no damage. So just following some of these practices helps, but then there's a learning and teaching for us as well. How do we automatically do that for a customer? How do we recognise when a customer's got things misconfigured and be able to flag it for them? So those are all things that we will on an ongoing basis implement in our software.
Adam
So speaking of which, as a CEO coming into this kind of situation, did you find that the hack was useful as a learning opportunity? Did you make any changes to the organisation's top level incident response or security strategies, for example, to try and mitigate similar attacks in the future?
Sudhakar
Definitely. As you know, Adam, there is always a learning opportunity in a crisis if you choose not to rationalise it. And so we took the same approach. At Solarwinds, there's a few things that we did almost immediately; one of them I proposed even before I actually came on board, because between the time the incident happened and before I came on board, there was I think a gap of about three weeks, but I had to get involved because it was important for me to hit the ground running essentially on January 4. So the first thing we did was form a technology and cybersecurity committee of our board. So oftentimes, there is the audit committee, there's a compensation committee and the nom and gov committee, but there's not as much focus on technology and cybersecurity. So we founded a three member committee, which included two sitting members of our board who are both CIOs of very large corporations, and myself. And so the idea being technology and cybersecurity at the board level should be almost at the same level as audit and comp and other committees. And so we meet more frequently there to provide them an update, and there's oversight there, which signifies the importance of it. The second piece of it is, you met Tim Brown. Tim has authority, I would say right now, to stop any software release from going out of the business if he feels the security risk of that is more than acceptable. So he's an independent voice, so to speak, and an independent authority. That highlights how significant security has become for us, and should be for every enterprise out there. And then the third is the focus on secure by design, which has three broad categories of focus for it. One is how do we improve consistently our infrastructure security, especially as we become more and more hybrid, which is what more and more customers of ours are becoming. That's number one.
Number two is how do we change our build processes to change the attack surface and reduce the threat surface, thereby shrinking the opportunity for supply chain attacks. And three, how do you change the build processes themselves, such that you're designing in security, as opposed to, let's call it, testing security post fact. So those are the three initiatives and investment areas we've made inside the organisation.
Jane
We've spoken a fair amount about your customers and speaking with your customers and how they were doing and just how you sort of dealt with, I guess, the philosophy to an extent of Solarwinds post attack. Coming into the business, how did you find the situation was internally with staff in terms of morale and that kind of thing? Because I can personally imagine, you know, in the initial crucible of the incident being very adrenaline fired and galvanised to sort this problem, but that kind of upbeat mentality, was it there in the first place? Or am I making assumptions? And did it continue, or were there any problems with low morale following such a high profile attack?
Sudhakar
Yeah, I would definitely not categorise it as upbeat. I will be transparent with you. Like I mentioned earlier, we are not a company that enjoyed the limelight; we were very happy and we are still very happy to serve the customers and make progress in our business. I would say there were a lot of parts of the organisation that were shell shocked that this happened to us. That's one description I would provide. There were some parts of the organisation, let's call it the engineering organisation, that were actually angry that somebody was able to break into my piece of code, and I consider that to be a healthy thing because you will learn from that feeling and you improve going forward. But the vast majority of the organisation I would say was tired, because the organisation went through this and we had to work incredibly hard just to kind of apply those patches or help those with those patches. And it's not just the engineering issue. The customer support teams are constantly on the phone. The execs are constantly on the phone. You go to a cocktail party and everybody's asking, hey, I saw this about Solarwinds, what happened? Everybody is doing this. Everybody is maybe well meaning, but it's very tiresome. And so those would be the three things that I would attribute to the team when I came on board, I would say, and that's one reason why when you come into a situation like that, the most important attribute you can demonstrate is empathy, I would say, in terms of understanding what happened, so that you can build from there, as opposed to coming in with very preconceived notions. I'll speak for myself, which is, when I accepted the job, I was framing in my head based on Solarwinds as I knew it, and started prioritising: okay, what could I do first, what would I do second, what's my 100 day plan, so to speak?
Guess what, as soon as I learned about this incident, all that went out the window, and I had to deal with a completely new set of issues and challenges. And so I had to make adjustments myself in terms of how I approach the problem, as well. But I'll give the team a lot of credit. Because, yes, while I came in and started contributing, there's no way I could have done it by myself without the team actually doing it. And the attitude was very positive. Once we started highlighting what we needed to do first, what we needed to do second, they need to see progress. And they need to have a sense of direction and purpose. And so really, that's what we tried doing. And we rallied around the theme of customer success, obligation, and just doing the right thing, by being transparent. And if we were criticised, let's say in the initial stages, for being transparent, and taking ownership of it, I was equally confident that if we stayed the course and kept doing the right things, good things will happen, even though at that point in time, it may not seem that way.
Jane
Yeah.
Adam
So it's been almost a year since the initial discovery of the attack in kind of early to mid December 2020. Have there been any ongoing effects or ongoing repercussions, let's say, from the attack? You know, the initial intrusion has now been sorted, you know, customers have all been communicated with and handled and those relationships nurtured, but have there been any kind of knock-on effects from it?
Sudhakar
Let me start with the positive side of it. I do believe that today, we are a better company than we were a year ago. We were a great company a year ago; we are a better company today for the incident. Because as I described, through secure by design, we are now not only delivering powerful and simple solutions, but powerful, simple and more secure solutions. Just as an aside, I was with our partners in EMEA and APJ just in the last two weeks. And one of the key points that our partners are making to our customers is you should deploy Solarwinds with greater confidence now, because it's probably more secure than it ever was before. So that was a positive out of this whole thing. As you go through the negatives of it, I would say that, as we learn about various security incidents, I do feel that there is a greater opportunity for the community to work much better together. So I've been trying to, call it in my spare time, work with other industry leaders and regulators to see what we can do to create this notion of a community vigil, so to speak. So coming from the security industry, I had an unwritten rule, or the security industry has an unwritten rule, I should say, which is to say, if I find an issue in your company's software, the first thing I will do is to inform you of it so that you can fix it and protect your customers. If I go out into the world and say there's an issue with your software, then I've exposed a lot of your customers. So if we truly care about customers at large, we need to do more and more of those types of things. And I think I would say the industry is inconsistent, let's say, in that, and commercial considerations take precedence, let's say, sometimes over doing the right things. I totally understand it. But I would say that this is an opportunity for us to learn and continue to do better.
Adam
Yeah, and I think things like bug bounty programmes are a great example of that kind of philosophy in action. You know, companies need to have bug bounty programmes, they need to talk about them, they need to make them public, you know, they need to offer as reasonable compensation for those bug bounties as they can without bankrupting themselves. Yep. But they need to support and incentivise that kind of knowledge sharing.
Sudhakar
Absolutely. Since you mentioned that, in the context of the three pillars of secure by design, as it relates to infrastructure security and improving our posture, under the guidance of Tim Brown we actually have a bug bounty programme as well. And it's funny that on a monthly basis, when I meet him for status updates on secure by design, we actually talk about how much we gave out last month. And if he has given something out, I consider that to be a good news story.
Adam
Absolutely.
Sudhakar
That means that the issues that were reported are fixed and paid for.
Adam
Absolutely.
Sudhakar
So we are definitely a believer in that. Since we brought that up, another thing that we do is we have, call it, teams within Tim's organisation that create synthetic attacks against us. The tools, techniques and procedures that they use are not known to many people in the organisation; they are known only to a few of us. And we learn from that. Suppose I send you, Adam, a link, and you click on it. By the end of the day, I know how many people clicked on it, how many people are falling prey to phishing attacks, because spear phishing attacks, as you know, from an initial indicator of compromise perspective, still represent one of the highest percentages of attacks. So we do synthetic attacks that way. So these are all some of the learnings that we have taken and implemented, and we share them now with the community. And we're also writing a comprehensive white paper on what we call secure by design, where some of our build processes and enhancements are quite unique. But we felt, let's put it out there, let the industry use it, and let the customers use it so that we can all benefit from it.
Jane
I do wonder if there's sort of an element of embarrassment, maybe, for companies who, rather than kind of going, oh, thank you so much security researcher, citizen developer, whatever, for finding this hole in our defences, instead freak out and try to call the police on them and crack down on them and everything. If it's partly not understanding what's going on and, yeah, partly an element of embarrassment, having been metaphorically caught in public with your pants down, I guess.
Sudhakar
Yeah, I think there is a large element of that, to be honest with you. One is that piece of it. The other piece of it is rationalisation. And equally, I used the phrase victim shaming; I think there is quite a bit of that that goes on, which causes people to not want to come forward and say this. A quick anecdote on a previous company: we had a security incident that was found by a researcher; not at Solarwinds, but at a different company. And the same issue happened to be in four different vendors' technologies at that point in time.
Adam
Wow.
Sudhakar
And I'm guessing here that it's because we probably were all using some flavour of open source. Right? You may know this fact, but I'll highlight this. Common security vulnerabilities are known to be in open source software on average for four plus years before they are discovered.
Adam
I think we all remember the Heartbleed OpenSSL bug.
Sudhakar
Exactly, exactly right. So when we came to know about it, we engaged with the security researcher, we got it resolved, we took care of our customers. And to Jane's point, the rest of them were maybe either unwilling or, for whatever reason, did not address it. And we were at the Black Hat conference and the security researcher actually presented. All he wanted was acknowledgement of the community, so to speak. And he mentioned our company as a company that did it the right way, and then some other companies as not even acknowledging the issue, much less fixing the issue. So it kind of goes back to what are some of my principles, and the principle of transparency served us well then, and I carried that forward into Solarwinds as well.
Adam
So Sudhakar, what advice would you have for other CEOs, whether they're newly coming into a business or are already in place, who are facing a similar situation in terms of coordinating the strategic response to a breach?
Sudhakar
Definitely, Adam. First, I will highlight there is some, if I can use the word, incongruousness that exists between what IT professionals and security professionals live every single day as it relates to budgets and technology, and education, and lack of direction, and that against management's need and goal of never having security issues. So oftentimes, and I'm part of it, you get into a situation and you realise that I have not given Tim Brown and team the tools needed to succeed. And yet, I'm disappointed when a security incident happens. So first things first is to acknowledge that whether you're coming into a hot issue, like the one that I entered, or not, security has to be paramount in everybody's minds; it's unfortunate, but it's true, right. And when it happens to you, the best thing to do would be to be transparent about it, and to own up to it, and share it with both the public as well as your customers, and most importantly, do something about it. And as you go through the storm, it may seem like it's not worth it to do that, you might as well have hidden from it. But I think longer term that causes more damage than not. I would also say use the community. Broadly speaking, ask them what they did, share with them what you learnt. And I can go back to the following principles: ownership, transparency, humility, and action. Those are the four things that you need, and try to avoid stereotypical things. For instance, when you come into a security situation the way I did, the fashionable thing to do is to look around and say, who should I fire? Right. And the important thing to do is dig your heels in and really understand. You can do it quickly, but dig your heels in and understand, because there is an opportunity to learn and to serve, and not do random things in the name of acting.
Adam
Yeah, I think that's so important. The temptation to come in and immediately start taking action to fix things, you know, to have an impact, must be so intense for somebody coming into a situation like that. But fast action is not necessarily the same thing as the correct action.
Sudhakar
Yeah, sometimes I like to repeat the phrase go slow to go fast, so to speak, which is be a little bit more thoughtful at the beginning. And then you can go much faster, because you don't have to rework a bunch of things. And there's always a balance. But the most important thing is communicate constantly; communicate constantly what's happening, what you're doing, what you're thinking, where you're going, and get everybody aligned as best as you can. I'd just like to emphasise, first of all, thanks for this opportunity to talk about the Solarwinds experience and what we're doing about this. I truly believe, being a member of the community at large, that this is a much larger issue than one company, as has been evidenced by more and more companies talking about this. And it is important for the entire community, and when I say community, it is us as vendors, the partners that we have, the customers, the regulators and the government, to work together. Because as I like to say, especially in the context of a foreign state, they have a large number of resources and they don't have scruples. So we from a security standpoint have to be right every single time. They have to be right once to create a lot of damage. So it's very asymmetric. And I feel like we are doing ourselves a disservice by being uncoordinated and not sharing, and using this as a competitive advantage, or a one company issue, as opposed to a broader industry-wide issue. So that's a piece that we need to continue to work on improving.
Jane
So thank you very much Sudhakar for sharing your insight and your experience with us.
Sudhakar
Thank you very much. I enjoyed doing this.
Adam
It's rare to get such an in depth look at how an organisation fared during a major attack like this. So we'd like to thank Tim Brown and Sudhakar Ramakrishna from Solarwinds for taking the time to speak to us.
Jane
As both of our guests this week highlighted, getting hit by a cyber attack is a matter of when, not if. With that in mind, it's more important than ever that we be open and honest when discussing these incidents and how we respond to them.
Adam
You can find more information about this topic in the show notes and even more on our website, itpro.co.uk.
Jane
You can also follow us on Twitter where we are @ITPro, as well as Facebook, LinkedIn, and YouTube.
Adam
Don't forget to subscribe to the IT Pro Podcast wherever you find podcasts to never miss an episode. And if you're enjoying the show, leave us a rating and review. We'll be back next week with more analysis from the world of IT. But until then, goodbye.
Jane
Bye.