The "free speech" social media platform Parler has been catapulted into the headlines this month, after it was accused of being used to plot an insurrection and was subsequently shut down as one provider after another terminated services and refused to do business with the platform.
It has also been reported that, before the service was taken offline when Amazon Web Services cut off hosting, up to 70TB of user data from the platform was downloaded and archived by a group of hacktivists.
This is a fast-moving story, and not all of the facts are clear at the time of writing, so some of the detail below may not be 100% accurate. Parler is also not a "typical" business in many respects, and the wider political and social context is the subject of ongoing discussion and debate. But the story holds important lessons for both risk management and systems architecture, and those lessons apply much more widely.
Managing Supplier Risk
The most obvious risk, and the one that ultimately took Parler down, was their dependency on the Amazon Web Services (AWS) infrastructure for hosting.
Parler's CEO claimed that Parler "prepared for events like this by never relying on amazons [sic] proprietary infrastructure and building bare metal products".
However, running a social media platform able to support tens or hundreds of millions of users requires significant hosting resources - and other than Amazon there are only a small number of providers (Microsoft Azure, Google, Oracle) able to provide comparable scalability.
"Preparing" for a risk requires more than simply having a theory that you will be able to "move to a new supplier". To truly prepare, organisations must plan out and test their disaster recovery and business continuity plans. Testing only individual small-scale failure scenarios (such as the failure of an individual server, or fail-over of a single component from one location to another) does not demonstrate that you will be able to recover service quickly in the event of a large-scale failure.
For a company which was created specifically to challenge the control of "big tech" over internet platforms, not properly preparing for Parler themselves being "de-platformed" was a critical mistake. At the time of writing Parler had not yet been able to reinstate service and was still offline.
A Problem Shared is a Problem … Squared?
Most companies don't have to deal with "internet scale", and will never see a "multiple suppliers refuse to do business with us due to toxic political associations" risk materialise and leave them unable to continue operations.
But many organisations experienced a similar effect first-hand last year, when the Covid lockdowns forced large parts of the workforce into working from home. Organisations' business continuity plans for office closures were based on using secondary locations, often shared sites provided via an arrangement with a provider such as Sungard.
When the coronavirus struck, those plans - however well tested - were useless, because the secondary locations had also been closed down. Organisations raced to implement alternatives, making emergency changes to VPN and IVR configurations to allow almost all staff to work from home, and to put in place new management and oversight processes to ensure that staff were working safely, securely and effectively.
Like Parler's hosting supplier risk, this was an example of a "common-mode" risk - where the same event takes out both the primary arrangement and its planned replacement. Such risks can occur in many situations - for example if two services rely on the same underlying cloud infrastructure.
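One way to look for common-mode risks is to record what each service depends on and check whether a primary and its fallback share anything further down the chain. The following is a minimal sketch of that idea only: the service names and the dependency map are invented for illustration, and a real inventory would have to be built from contracts and supplier disclosures.

```python
from typing import Dict, Set

# Hypothetical dependency map: each entry lists what a service relies on directly.
DEPENDENCIES: Dict[str, Set[str]] = {
    "primary-site": {"cloud-a"},
    "standby-site": {"hosting-reseller"},
    "hosting-reseller": {"cloud-a"},  # the reseller runs on the same cloud
    "head-office": {"city-centre-building"},
    "recovery-office": {"out-of-town-building"},
}


def transitive_dependencies(service: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Return everything the service relies on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [service]
    while stack:
        current = stack.pop()
        for dependency in graph.get(current, set()):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def common_mode_risks(primary: str, fallback: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Return the dependencies shared by a primary and its planned fallback."""
    return transitive_dependencies(primary, graph) & transitive_dependencies(fallback, graph)


if __name__ == "__main__":
    print(common_mode_risks("primary-site", "standby-site", DEPENDENCIES))
    print(common_mode_risks("head-office", "recovery-office", DEPENDENCIES))
```

In this made-up example the standby site looks independent at first glance, but it shares the same cloud provider through a reseller. A map like this only catches shared suppliers, though: the Covid case shows that two physically separate offices can still be closed by the same external event.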
Effectively managing supplier risk requires a careful and thorough assessment of the risks that exist throughout the supply chain, so that responses to the reasonable worst-case scenarios can be developed, planned and tested (to the extent possible).
Multiple Single Points of Failure
It was the termination of Parler's AWS hosting which brought the social media platform down completely. But by that stage, the platform was already hugely affected by other suppliers who had refused to do business with them. Like many organisations, Parler's operations were dependent on many different third-party services.
Google and Apple both blocked new downloads of the Parler app from their respective app stores. The SMS provider Twilio, the authentication and identity provider Okta and the customer service/support provider Zendesk also all terminated Parler's services.
Parler was actually able to continue operating in at least some form even after it was pulled from the app stores - existing users could continue to use the app, and new and existing users could also access Parler via the website.
Making Decisions Under Pressure
But the termination of SMS services by Twilio proved very damaging to Parler. Parler used Twilio as part of the registration process, to verify that the user had provided a valid phone number - deterring people from creating fake and spam accounts. When Twilio terminated SMS services for Parler, Parler decided to disable that check, so that new users could continue to register.
This appears to have been a conscious choice - they could instead have decided to block all new user registrations until they could find an alternative SMS provider, or implement a different anti-spam mechanism. Parler probably decided it was more important to "protect service" by keeping the doors open.
Unfortunately, once it became known that Parler was no longer validating phone numbers on registration, people were able to create as many fake accounts as they wished. Automated scripts were written which created millions of fake accounts - at which point Parler did what they should have done in the first place, and blocked new user registrations.
This reinforces how important it is to have properly documented and understood plans for what happens in the event of an incident - rather than leaving those decisions to be made under pressure, in the heat of the moment.
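A documented plan of this kind can also be written into the system itself, so that the behaviour when a supplier disappears is decided in advance. The sketch below shows a "fail closed" registration policy: if phone verification is unavailable, registration is paused rather than the check being skipped. All of the names here (`register`, `VerifierUnavailable`, `Decision`) are hypothetical, and nothing is implied about how Parler's registration code was actually structured.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    REJECT = "reject"
    REGISTRATION_CLOSED = "registration_closed"


class VerifierUnavailable(Exception):
    """Raised by a verifier when the SMS supplier cannot be reached or has cut off service."""


def register(phone: str, verify_phone: Callable[[str], bool]) -> Decision:
    """Apply the pre-agreed policy: never register a user without verification."""
    try:
        verified = verify_phone(phone)
    except VerifierUnavailable:
        # Fail closed: pause new sign-ups instead of silently dropping the check.
        return Decision.REGISTRATION_CLOSED
    return Decision.ALLOW if verified else Decision.REJECT


def supplier_terminated(phone: str) -> bool:
    """Stand-in verifier for the case where the SMS supplier has withdrawn service."""
    raise VerifierUnavailable("SMS supplier has terminated service")


if __name__ == "__main__":
    print(register("+15550100", lambda phone: True))
    print(register("+15550100", supplier_terminated))
```

Whether to fail closed or fail open is a business decision, and either can be defended. The point is that the choice is made and written down calmly, in advance, rather than in the middle of an incident.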
Parler's registration screens did in fact have another level of protection against fake accounts being created - the user was required to solve a "CAPTCHA" puzzle, a Completely Automated Public Turing test to tell Computers and Humans Apart. Parler used an older style of CAPTCHA which asks the user to re-type letters and numbers displayed in a slightly distorted image.
The aim of this test was to make it harder for attackers to use a script, or "bot", to create huge numbers of fake accounts. Unfortunately, this older style of CAPTCHA puzzle can be trivially solved by computers nowadays, so the only way for Parler to stop the fake accounts from being created was to switch off registration completely.
Buy Vs Build
In several areas, Parler appears to have built its own solutions rather than adopting and re-using standard mechanisms. This aligns with what the Parler CEO claimed, about "bare metal" solutions, which minimise dependencies on Amazon and third-party services.
Organisations are frequently faced with these sorts of buy-vs-build decisions. Deciding to "build your own" can be the right answer - particularly for key differentiating capabilities. But it comes with a cost - not just the initial build cost, but the work required to keep that component updated as technology moves forward. This is particularly important for security-related components, where a single mistake can result in catastrophic failure. Buying an off-the-shelf / third-party solution leverages the skills, time and expertise of a specialist provider, allowing the client to get on with delivering their products and services.
As with many high-impact security breaches, it was a combination of multiple individual mistakes which led to hackers being able to download all the user-submitted content from Parler. Even after Parler had shut down new user registration, hackers were able to use the fake accounts they had created to "scrape" posts, images and video submitted to the site. They were even able to access the "upload" copies of images and videos, complete with the metadata which records exactly where and when each picture was taken - enabling analysts to track where those users had been.
Through a combination of operational and architecture mistakes, the social media platform which had claimed to be "privacy-focussed" ended up being the source of a major leak of personal data.
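The metadata problem is a good illustration of how small such an architectural safeguard can be. In JPEG files, EXIF metadata (including GPS position and capture time) is carried in "APP1" segments ahead of the image data, and these can be dropped before a file is stored or served. The sketch below does this with the standard library only; the function name is invented for illustration, it handles only the simple case of well-formed JPEG files, and a production service would more sensibly rely on a maintained image-processing library.

```python
import struct

SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG file
APP1 = 0xE1        # segment type that carries EXIF (and XMP) metadata
SOS = 0xDA         # start of scan: compressed image data follows


def strip_app1_segments(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte stream with all APP1 segments removed."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos < len(jpeg):
        if pos + 4 > len(jpeg) or jpeg[pos] != 0xFF:
            raise ValueError("malformed segment header")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Everything from here on is image data: copy it unchanged.
            out += jpeg[pos:]
            return bytes(out)
        # The two-byte length field counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(jpeg):
            raise ValueError("malformed segment length")
        if marker != APP1:
            out += jpeg[pos:end]
        pos = end
    raise ValueError("no image data found")
```

A function like this would be called once at upload time, so that the copy kept on the server never contains the location data in the first place. Video files and other image formats store metadata differently and would need their own handling.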
Assessing Risks, Taking Action
Organisations should regularly assess their key systems and infrastructure with a clear focus on their real exposure to operational and reputational risks that might arise due to security or infrastructure failures. But it's not enough to simply identify risks - effective plans must be put in place as to how those risks will be dealt with should they be realised.