Rare Insight Into GCHQ's Comms Spies

Since Edward Snowden leaked documents taken from the US National Security Agency, another intelligence agency which specializes in spying on electronic communications has endured a lot of collateral damage. If the NSA is a global Big Brother, listening to the private conversations of the German head of government and thousands of ordinary French citizens, then Britain’s Government Communications Headquarters (GCHQ) is Big Brother’s little brother. Snowden’s revelations suggested the British electronic eavesdropping organization had eagerly shared secrets with its larger American sibling, and often done its bidding. As a consequence, the profile of Britain’s network spies has risen worldwide. But how accurate were Snowden’s revelations about GCHQ? Last week afforded a rare glimpse behind the veil of secrecy, with a British Parliamentary committee issuing a lengthy report into the workings of GCHQ and other British intelligence agencies.

Before we dive into the details of the report, we should lay some groundwork. In many respects, it makes practical sense for the NSA and GCHQ to have a cooperative relationship. In general, the British and American governments are confronted by common enemies. Britain sits at one end of many submarine cables that carry a lot of communications traffic, some of which will be of interest to American spies. US spies have a larger budget and more resources, but they are also subject to laws for which there may be no equivalent in Britain. These are the fundamentals that make Snowden’s revelations so credible. Public trust has been shaken. That is why British parliamentarians have been under unprecedented pressure to review how GCHQ’s work affects privacy and security. As a consequence, the Intelligence and Security Committee of the British Parliament has issued a 149-page report entitled Privacy and Security: A modern and transparent legal framework. Does it make the workings of GCHQ transparent? Err… no. No intelligent individual could expect so much. But discussing the legal boundaries of GCHQ’s work does give hints about how data is gathered, and the technological capabilities of GCHQ. It also indicates that Britain’s parliamentarians either struggle to understand how technology works, or they only want to present dumbed-down conclusions in a vocabulary designed to reassure the general public. The key insights follow.

Politicians believe that the profit motive makes comms businesses too eager to protect customer privacy.

Some spy bosses have complained that encryption is interfering with their ability to protect the public, and that comms providers have gone too far by actively ‘marketing’ the privacy features of products and services. It does not seem to occur to them that such marketing only works if there is already widespread demand for privacy. In this report, the politicians side with the spooks and against business. By so doing, they are also siding against the customers of these businesses, although they are too coy to say so.

As a result of the Snowden allegations, technology companies have improved privacy protections and strengthened the encryption offered to their customers. The extent to which a Communications Service Provider (CSP) can assure their users that their communications cannot be read by the intelligence Agencies has become a part of their marketing strategy.
…However, while CSPs may be primarily concerned about commercial advantage, the growing use of encryption also raises moral and ethical issues. The effect of increased privacy controls has been to place some of the communications of their users beyond the reach of law enforcement and intelligence officers and even, in some cases, beyond the reach of the law courts: should CSPs be providing an opportunity for terrorists and others who wish to do us harm to communicate without inhibition?

Comms providers outside the UK are subject to special criticism:

…overseas CSPs generally do not comply with UK interception warrants (although recent legislation – the Data Retention and Investigatory Powers Act, passed in 2014 – compels CSPs to comply, the Government has not, as yet, sought to enforce compliance). This is having a significant impact on the Agencies’ ability to use this capability. As our introduction to this Report has outlined, protection of their users’ privacy is increasingly a market differentiator for technology companies and therefore (generally) they are not willing to cooperate with UK intelligence Agencies.

Without Snowden’s leaks, this committee would never have written its report.

The leak by Edward Snowden of stolen intelligence material in June 2013 led to allegations regarding the UK Agencies’ use of intrusive capabilities – in particular those relating to GCHQ’s interception of internet communications. This Committee investigated the most serious of those allegations – that GCHQ were circumventing UK law – in July 2013. We concluded that that allegation was unfounded. However, we considered that a more in-depth Inquiry into the full range of the Agencies’ intrusive capabilities was required – not just in terms of how they are used and the scale of that use, but also the degree to which they intrude on privacy and the extent to which existing legislation adequately defines and constrains these capabilities.

The committee usually prefers words that ‘transparently’ describe numbers, instead of just stating the numbers themselves.

… only a very tiny percentage (***%) of the communications that GCHQ collect are ever opened and read by an analyst.

The report is full of unnecessary redactions, putting ***’s in places where it is hard to imagine the risk to national security of simply stating the original number. If numbers like this really are ‘tiny’, or ‘small’, or ‘next to nothing’, then what conceivable danger is posed by letting us know the actual number? Or to put it another way: if we are allowed to know something is tiny, but it is dangerous to tell us the exact number, then why not compromise by stating a range within which the tiny number falls, so we can judge if it is tiny for ourselves?

The committee is so keen to reassure the public that they say facile things at times.

Our Inquiry has shown that the Agencies do not have the legal authority, the resources, the technical capability, or the desire to intercept every communication of British citizens, or of the internet as a whole: GCHQ are not reading the emails of everyone in the UK.

Well, obviously not. You hardly need a formal review by well-paid politicians to conclude that not every email is read by a government spook. I do not have time to read every email that winds up in my spam folder, so I would be surprised if taxpayer-funded spies had time for that. Even the East German Stasi, at the height of their powers, did not have the manpower to review every single communication between every single East German citizen, and they worked at a time before email and other internet messaging services prompted huge growth in the volume of communication. The real issue is not whether human beings are employed to read everything, but whether we are approaching a stage where machines might effectively review everything, picking out any and every message which might interest their human masters, and doing so according to arbitrary criteria.

I was also disappointed that the committee repeatedly chose to set low expectations. Security forces do not need to read every email to be reading far more than they should.

The quote above also illustrates another recurring problem with the report, which repeatedly talks about the ‘internet as a whole’. It would be incredible to suggest one nation’s communications espionage infrastructure might have accessed every cable in every nation. Even the NSA cannot do that – which is why they cooperate with GCHQ. Repeated references to the ‘whole’ internet leave me suspicious that the politicians are deliberately downplaying the scale of GCHQ’s activities, both in the UK and abroad.

GCHQ works on the principle of collecting a lot of communications because only a small amount will be useful.

GCHQ’s bulk interception systems operate on a very small percentage of the bearers that make up the internet. We are satisfied that they apply levels of filtering and selection such that only a certain amount of the material on those bearers is collected. Further targeted searches ensure that only those items believed to be of the highest intelligence value are ever presented for analysts to examine: therefore only a tiny fraction of those collected are ever seen by human eyes.

This quote is more revealing than the one above. Obviously GCHQ analysts cannot read every email. But the agency collects a lot more than its people could read, on the basis that any of the unread communications might prove useful, and nobody can know which until the messages have been collected and then filtered using automated criteria.

And if you were in any doubt about whether GCHQ interception is focused on submarine cables, the committee helpfully explained what they mean by a ‘bearer’:

Internet communications are primarily carried over international fibre optic cables. Each cable may carry several ‘bearers’ which can carry up to 10 gigabits of data per second.
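To put that figure in perspective, here is a back-of-envelope calculation of my own; it does not come from the report. It assumes a single bearer running flat out at the quoted 10 gigabits per second for a full day, which real traffic will rarely do, and it uses decimal units (1 TB = 10^12 bytes).

```python
# Upper bound quoted by the committee for a single bearer.
BEARER_GBIT_PER_S = 10
SECONDS_PER_DAY = 86_400


def bearer_bytes_per_day(gbit_per_s: float = BEARER_GBIT_PER_S) -> float:
    """Bytes carried in one day by a bearer running at full capacity (assumed)."""
    bits_per_day = gbit_per_s * 1e9 * SECONDS_PER_DAY
    return bits_per_day / 8


tb_per_day = bearer_bytes_per_day() / 1e12
print(f"{tb_per_day:.0f} TB per day at full utilisation")  # prints "108 TB per day ..."
```

Even one fully loaded bearer would carry on the order of a hundred terabytes a day, which is why the filtering described in the report has to be automated.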

So long as communications are intercepted ‘in bulk’, the spy agencies can say they are not ‘targeting’ anyone, and so sidestep restrictions on targeted surveillance.

…we have established that bulk interception cannot be used to target the communications of an individual in the UK without a specific authorisation naming that individual, signed by a Secretary of State.

It seemingly did not occur to the committee that if you gather enough information in ‘bulk’, you increase the chances of gathering data about individuals the spy agencies would like to target, but are not authorized to target. In fact, that is the point of the exercise – to get an insight into who should be watched more closely. The report goes on to make that exact point:

The examples GCHQ have provided, together with the other evidence we have taken, have satisfied the Committee that GCHQ’s bulk interception capability is used primarily to find patterns in, or characteristics of, online communications which indicate involvement in threats to national security. The people involved in these communications are sometimes already known, in which case valuable extra intelligence may be obtained (e.g. a new person in a terrorist network, a new location to be monitored, or a new selector to be targeted). In other cases, it exposes previously unknown individuals or plots that threaten our security which would not otherwise be detected.

And again:

GCHQ’s bulk interception capability is used either to investigate the communications of individuals already known to pose a threat, or to generate new intelligence leads…

GCHQ uses the bulk data it has intercepted like recruitment agencies use LinkedIn. And just like any recruitment agency, they suffer an (undisclosed) number of false positives.

Have you ever had a recruitment consultant pester you about a job which does not suit you, or which you are not qualified for, because they did a string search on LinkedIn and your profile contained words that matched their search? This is hardly rocket science. The parliamentary committee admitted GCHQ does the same thing, though they described it in a roundabout fashion:

The first of the major processing systems we have examined is targeted at a very small percentage of the ‘bearers’ that make up the internet. As communications flow across those particular bearers, the system compares the traffic against a list of ‘simple selectors’. These are specific identifiers relating to a known target. Any communications which match are collected.
…The processing system then runs both automated and bespoke searches on these communications in order to draw out communications of intelligence value. By performing complex searches combining a number of criteria, the odds of a ‘false positive’ are considerably reduced.
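The report does not say how any of this is implemented, so the following is only a toy sketch of the general idea, using made-up messages. The `Message` type, the selector list, the keyword and every address are hypothetical. It shows why a single keyword search throws up a false positive, and why requiring several criteria at once reduces them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    body: str


# Hypothetical 'simple selectors': identifiers tied to a known target.
SIMPLE_SELECTORS = {"target@example.org"}


def matches_simple_selector(msg: Message) -> bool:
    """True if either end of the communication is a listed identifier."""
    return msg.sender in SIMPLE_SELECTORS or msg.recipient in SIMPLE_SELECTORS


def mentions_keyword(msg: Message) -> bool:
    """A crude string search, like a recruiter trawling LinkedIn."""
    return "blast" in msg.body


def matches_all(msg: Message, criteria) -> bool:
    """A 'complex search': every criterion must hold at once."""
    return all(criterion(msg) for criterion in criteria)


messages = [
    Message("target@example.org", "a@example.com", "meeting at noon"),
    Message("b@example.com", "c@example.com", "the python course was a blast"),
    Message("d@example.com", "target@example.org", "blast radius estimates attached"),
]

collected = [m for m in messages if matches_simple_selector(m)]
keyword_only = [m for m in messages if mentions_keyword(m)]
combined = [
    m for m in messages
    if matches_all(m, [matches_simple_selector, mentions_keyword])
]

print(len(collected), len(keyword_only), len(combined))
```

The keyword search alone flags the innocent message about a python course. The combined search does not, but only because the selector list happened to be right, and the report gives no figure for how often that assumption fails.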

The committee downplays the risk that data collected by GCHQ might be used for inappropriate purposes.

The system does not permit GCHQ analysts to search these communications freely (i.e. they cannot conduct fishing expeditions).

I do not doubt this point is true, but it is wrong to emphasize this point whilst failing to discuss all the ways that employees or contractors, whilst working for GCHQ, might ‘fish’ for data. Typical intelligence analysts may only be able to access data through a carefully designed front-end that imposes controls over their access. But it would be daft to suggest that nobody in GCHQ has access to the raw data, and hence that nobody in GCHQ could even attempt a fishing expedition. Somebody has to manage their information technology. The likeliest obstacle to that person conducting a fishing expedition is the volume of data to be searched and the processing resources needed to accomplish such a search. Apart from that, the issue is who has both the necessary skills and access to the raw data, how those people were selected for their role, and how they are monitored when performing it. The report is weak on this subject, commenting in a disjointed way about the existence of audit trails, disciplinary procedures, and the laws which people would be breaking if they did engage in a fishing expedition. By conflating the obstacles to abuse with penalties for abuse, they gloss over evidence that abuse is technically possible…

…to date there has only been one case where GCHQ have dismissed a member of staff for misusing access to GCHQ’s systems.

So there has been at least one case where system controls did not prevent real abuse taking place. And telling us one person was dismissed tells us nothing about the strength of GCHQ’s internal disciplinary procedures.

People invent stupid legal jargon for things that already have their own stupid technical jargon… but whatever name you give to CDRs, GCHQ likes to gather them freely.

While much of the recent controversy has focused on GCHQ’s interception of emails, there has also been concern over the use the Agencies make of Communications Data (CD). This encompasses the details about a communication – the ‘who, when and where’ – but not the content of what was said or written.
CD… is used to develop leads, focus on those who pose a threat and illuminate networks. However, concerns have been raised as to whether the distinction between data and content is still meaningful, and also whether changes in technology mean that CD is now just as intrusive as content.
…while the volume of CD available has made it possible to build a richer picture of an individual, this remains considerably less intrusive than content. It does not therefore require the same safeguards as content does.
…there is a ‘grey’ area of material which is not content, but neither does it appear to fit within the narrow ‘who, when and where’ of a communication, for example information such as web domains visited or the locational tracking information in a smartphone. This information, while not content, nevertheless has the potential to reveal a great deal about a person’s private life – his or her habits, tastes and preferences – and there are therefore legitimate concerns as to how that material is protected.
We have therefore recommended that this latter type of information should be treated as a separate category which we call ‘Communications Data Plus’. This should attract greater safeguards than the narrowly drawn category of Communications Data.

The report repeatedly tells us that GCHQ can only intercept a ‘small’ part of the internet, and that they read only a ‘tiny’ proportion of emails. It is silent about the volume and proportion of CDRs that are collected and analysed. You can draw your own conclusions as to why they said nothing about that topic.

British legislators invented new jargon for things which already had their own technical jargon, but they fault Americans for doing the same. Lots of jargon causes confusion, but the solution is apparently more British jargon.

A further complicating factor in this debate is the confusion as to what is treated as CD and what is treated as content. The confusion is caused, in part, by many commentators using the term ‘metadata’ for information that does not appear to fall neatly into either category. ‘Metadata’ is a term commonly used in the USA, but it has… no bearing on the UK system of interception. For example, in the UK a record of a website visited… is treated as CD, whereas the full web address, which includes the precise words searched for… is treated as content. Both of these, however, might be referred to as ‘metadata’. This Committee has previously noted this confusion… and has already recommended greater clarity and transparency around the different categories of information.

GCHQ has increasing access to cables which are wholly outside of the UK.

While the number of communications links that GCHQ can access can increase or decrease over time, the overall trend is upwards. 40% of the links that GCHQ can access enter or leave the UK, with 60% entirely overseas.

When intercepting in bulk, GCHQ chooses which cables to spy on by picking samples… and spying on them.

[Bearers] are chosen on the basis of the possible intelligence value of the traffic they carry… To establish this, GCHQ conduct periodic surveys, lasting a few seconds or minutes at a time, on these bearers.

Privacy campaigners complain that bulk interception has never been shown to work. They will not be reassured by this report.

GCHQ have provided case studies to the Committee demonstrating the effectiveness of their bulk interception capabilities…
Example 1: ***
82. ***.
83. ***.
Example 2: ***
84. ***.
85. ***.
86. ***.
Example 3: ***
87. ***.
88. ***.
89. ***.

The committee confuses or ignores the risks of machines reading emails.

[Privacy campaigners] considered that the collection of internet communications was intrusive in and of itself, even if the content of those communications was not looked at by either a computer or a person [my emphasis]. They argued that this threatens people’s fundamental rights and has a pervasive ‘chilling effect’ on society as a whole:
“The objection is to both – the collection and interrogation without an appropriate framework. There is nothing passive about GCHQ collecting millions and millions of communications of people in this country… even if human beings are not processing those communications and it is being done by machines [my emphasis], that is a physical interception – a privacy infringement – and a model of blanket interception that we have not traditionally followed in this country.”

For reasons that baffle me, the report repeatedly fails to deal with the issue of whether it is fine for machines to automatically read huge volumes of emails, even when that issue is explicitly brought to their attention. The topic is raised in the section quoted above, but the report immediately jumps to the committee’s criticism that it would be too dangerous not to do any automated reading of electronic correspondence. They never comment on how much automated reading might or might not be appropriate in a free society.

Obtaining and analysing large volumes of CDRs and similar data is now an established and long-practiced norm for spy agencies.

The Home Secretary has stated that “Communications data has played a significant role in every Security Service counter-terrorism operation over the last decade”. The Director General of MI5 has explained that “comms data is very often one of the early means we can use to determine whether and where to focus our investigation”.
In our 2013 Report, we said that “it is clear to us from the evidence we have been given that CD is integral to the work of the… Agencies”…

The spies collect every kind of customer data that telcos possess.

Agencies collect the following three categories of CD:
• traffic data – information attached to, or comprised in, the communication which tells you something about how the information is sent (e.g. an address or routing information). It includes caller line identity, dialled number identity, cell location data, and other details of the location or ‘address’ (whether postal address or electronic address) of a sender or recipient of a communication;
• service use information – this includes billing and other types of service use information such as call waiting and barring, redirection services and records of postal items; and
• subscriber information – includes any information (that is not traffic data or service use information) that is held by the CSP ‘in connection with’ the provision of the service. This could include the name and address of the subscriber, bank details and details of credit cards etc. attached to a user’s account.

The spies collect telco data in every way imaginable.

The Agencies collect CD either directly from CSPs, from their own interception of traffic (running over fibre optic cables or via satellites and other more traditional forms of communication), or from overseas partners.

GCHQ can determine if you are a terrorist just by discovering who you phone and what websites you visit.

GCHQ have established that they can analyse CD to find patterns in it that reflect particular online behaviours that are associated with activities such as attack planning, and to establish links.
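The report gives no detail about what this analysis involves. As a generic illustration of how links can be established from ‘who and when’ records alone, with no content at all, here is a minimal sketch. The records, names and the two-hop limit are all invented for the example, and nothing here is claimed to be GCHQ’s method.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, timestamp). No content is present.
records = [
    ("alice", "bob", "2015-03-01T09:00"),
    ("bob", "carol", "2015-03-01T09:05"),
    ("carol", "dave", "2015-03-02T18:30"),
    ("erin", "frank", "2015-03-02T19:00"),
]


def build_contact_graph(records):
    """Map each identifier to the set of identifiers it has communicated with."""
    graph = defaultdict(set)
    for caller, callee, _when in records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def within_hops(graph, start, max_hops):
    """Everyone reachable from `start` in at most `max_hops` communications."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        frontier = {n for node in frontier for n in graph[node]} - seen
        seen |= frontier
    return seen - {start}


graph = build_contact_graph(records)
print(sorted(within_hops(graph, "alice", 2)))  # prints ['bob', 'carol']
```

Carol has never spoken to Alice, yet two hops are enough to associate them. That is the sense in which CD can ‘illuminate networks’, and why the amount of it being collected matters.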

GCHQ can probably decrypt your encrypted comms.

Terrorists, criminals and hostile states increasingly use encryption to protect their communications. The ability to decrypt these communications is core to GCHQ’s work, and therefore they have designed a programme of work – *** – to enable them to read encrypted communications. There are three main strands to this work:
i) ***;
ii) developing decryption capabilities ***; and
iii) ***.

The committee had concerns about one of the techniques used to hack encrypted systems, but we are not allowed to know what the technique is.

GCHQ need to be able to read the encrypted communications of those who might pose a threat to the UK. We recognise concerns that this work may expose the public to greater risk and could have potentially serious ramifications (both political and economic). We have questioned GCHQ about the risks of their work in this area. They emphasised that much of their work is focused on improving security online. In the limited circumstances where they do *** they would only do so where they are confident that it could not be ***. However, we are concerned that such decisions are only taken internally…

Do you feel reassured yet?

Throughout the report, it is clear these politicians have disdain for telcos. They think they are run by selfish businessmen who prioritize profit over the interests of the public. The role of the telco is to be a conduit for information that spies want to obtain, and if telcos exercise a legal right to refuse, they are chastised for doing so. The politicians blame marketing for the fact that some telco customers demand privacy, and they assert that telco customers must be pacified with more ‘reassurance’ and improved ‘transparency’, although this report provides little of either.

Technologically sophisticated readers would have already guessed most of the facts corroborated by this report: that the electronic spies focus on submarine cables; that they collect in bulk but then filter using searches; that they indiscriminately build up CDR databases to map who speaks to whom; that they actively develop the means to decrypt encrypted comms; and that spy agencies are not run by reckless buffoons who failed to implement basic system controls. As such, the purpose of the report seems to be to tell people they should feel reassured, without giving them any actual reassurance… because anything that might be genuinely reassuring was inevitably rendered as ***.

The politicians repeatedly state that modern electronic communications are being used by terrorists and other enemies of the people, and hence the proliferation of communication technology increases risk. But there is no balance to this analysis. Technology is neutral. Better technology can help both the state and its enemies. Whilst Snapchat could be used by terrorists to send secure messages, Big Data technology greatly increases the state’s ability to collect, store and search through data. A balanced analysis of the privacy and security risks would not solely focus on the downsides of cheaper, better, more private and more widespread comms. We should also highlight the benefits of improved technology, and the areas where technology shifts the balance of power towards the state, away from citizens and privately-owned businesses.

In East Germany, human spies had to read letters and listen to phone calls to learn what was being communicated. Today, a machine can read emails. Imagine if the Stasi possessed the technology available today. Should we jump to the conclusion that improvements in communications technology would have hurt their spy network more than improvements in hardware storage and software analytics would have helped them? No. That is why I find this report to be biased. The writers only consider the downsides of technology from the spy’s perspective, and not how spies have been empowered by new technology.

The most important revelation is that the UK’s electronic spy agency has access to an increasing number of communications lines, and that 60% of these have no connection to the UK. Surveillance of international communications is a growth business. Changes in technology make it possible to fuel that growth at diminished cost. We must continue to be wary of all risks, and we need telcos to be strong consumer champions, providing balance when politicians do not.
