The Censorship Arms Race
Internet censorship by authoritarian governments prohibits free and open access to information for millions of people around the world. Attempts to evade such censorship have turned into a continually escalating race to keep up with ever-changing, increasingly sophisticated internet censorship. Censoring regimes have had the advantage in that race because researchers must manually search for ways to circumvent censorship, a process that takes considerable time.
On November 14, 2019, a paper titled “Geneva: Evolving Censorship Evasion Strategies,” by Dave Levin, Kevin Bock, George Hughey and Xiao Qian was introduced at the Association for Computing Machinery’s 26th Conference on Computer and Communications Security in London. Levin, Bock and Hughey were computer scientists at the University of Maryland and Xiao Qian was a computer scientist at UC Berkeley. In addition, seven UMD undergraduate students worked on this project as part of Levin’s Breakerspace lab in the UMD Department of Computer Science.
The paper introduced a new artificial intelligence system, called Geneva (short for Genetic Evasion), that could automatically learn how to detect and evade internet censorship by repressive regimes. Geneva was designed to circumvent government censorship in two ways. The program could make the censor regard offending content as safe, or it could make it appear that the traffic had been successfully censored when it had not. To keep users from violating restrictions on anti-censorship software, Geneva could run in the background of a web browser without being fully installed.
Tested in China, India and Kazakhstan, Geneva found dozens of ways to circumvent censorship by exploiting gaps in censors’ logic and finding bugs that the researchers say would have been virtually impossible for humans to find manually.
At the time of its introduction, Dave Levin, an assistant professor of computer science at UMD and senior author of the paper, said, “With Geneva, we are, for the first time, at a major advantage in the censorship arms race. Geneva represents the first step toward a whole new arms race in which artificial intelligence systems of censors and evaders compete with one another. Ultimately, winning this race means bringing free speech and open communication to millions of users around the world who currently don’t have them.”
Under the Hood
All information on the internet is broken into data packets by the sender’s computer and reassembled by the receiving computer. One prevalent form of internet censorship used by authoritarian regimes works by monitoring the data packets sent during an internet search. The censor blocks requests that either contain flagged keywords (such as “Tiananmen Square” in China) or prohibited domain names (such as “Wikipedia” in many countries).
When Geneva is running on a computer that is sending out web requests through a censor, Geneva modifies how data is broken up and sent, so that the censor does not recognize forbidden content or is unable to censor the connection.
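The idea of changing how data is broken up can be illustrated with a deliberately simplified sketch. It assumes a naive censor that inspects each packet in isolation and never reassembles the stream; the keyword, the function names, and the chunk size are all hypothetical, and real censors (and the strategies Geneva actually finds) are considerably more subtle.

```python
# Toy model only: a censor that looks at each packet on its own.
FORBIDDEN = [b"forbidden-keyword"]  # hypothetical flagged keyword


def naive_censor_allows(packets):
    """Return True if no single packet contains a flagged keyword."""
    return not any(word in pkt for pkt in packets for word in FORBIDDEN)


def segment(payload, size):
    """Split a payload into consecutive chunks of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def reassemble(packets):
    """What the receiving computer does: join the chunks back together."""
    return b"".join(packets)


request = b"GET /search?q=forbidden-keyword HTTP/1.1\r\n\r\n"

whole = [request]               # sent as one packet: the keyword is visible
pieces = segment(request, 20)   # sent in 20-byte pieces: the keyword is split

print(naive_censor_allows(whole))
print(naive_censor_allows(pieces))
print(reassemble(pieces) == request)
```

In this toy setting the single-packet request is blocked, while the segmented request passes the per-packet check and still arrives intact at the receiver. A censor that reassembles traffic before inspecting it would not be fooled by this particular trick.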
Known as a genetic algorithm, Geneva is a biologically inspired type of artificial intelligence that Levin and his team developed to work in the background as a user browses the web from a standard internet browser. Like biological systems, Geneva forms sets of instructions from genetic building blocks. But rather than using DNA as building blocks, Geneva uses small pieces of code. Individually, the bits of code do very little, but when composed into instructions, they can perform sophisticated evasion strategies for breaking up, arranging, or sending data packets.
Geneva evolves its genetic code through successive attempts (or generations). With each generation, Geneva keeps the instructions that work best at evading censorship and kicks out the rest. Geneva mutates and crossbreeds its strategies by randomly removing instructions, adding new instructions, or combining successful instructions and testing the strategy again. Through this evolutionary process, Geneva can identify multiple evasion strategies very quickly.
As noted in an article from the Open Technology Fund, to automate this evasion detection process, the Geneva team developed a “survival of the fittest” theory under which the algorithm can use any combination of its four packet-manipulation building blocks (duplicate, fragment, tamper, and drop) to try to circumvent a censor. “Unsurprisingly, at the beginning of the process every strategy the algorithm employs will fail. But the failures of certain strategies won’t be quite as bad as others and the ‘children’ of those strategies can go on to try again, and again, until eventually one actually succeeds in slipping past the censor undetected. Success in hand, the algorithm can begin again to test and develop a new strategy while the identified success is able to be deployed through Geneva’s separate engine component.”
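The evolutionary loop described above can be sketched in a few lines. This is a toy under stated assumptions, not the Geneva team's code: a strategy is modeled as a flat list of the four named building blocks, and `mock_fitness` is a made-up scoring function standing in for "how close did this strategy come to getting past a censor." Every function name and parameter here is hypothetical.

```python
import random

# The four building blocks named in the article.
ACTIONS = ["duplicate", "fragment", "tamper", "drop"]


def mock_fitness(strategy):
    """Hypothetical score; a stand-in for testing against a real censor."""
    target = ["duplicate", "tamper"]  # made-up 'winning' strategy
    score = sum(1 for a, b in zip(strategy, target) if a == b)
    return score - abs(len(strategy) - len(target))  # penalize wrong length


def mutate(strategy, rng):
    """Randomly add, remove, or replace one instruction."""
    s = list(strategy)
    choice = rng.choice(["add", "remove", "replace"])
    if choice == "add" or not s:
        s.insert(rng.randint(0, len(s)), rng.choice(ACTIONS))
    elif choice == "remove":
        s.pop(rng.randrange(len(s)))
    else:
        s[rng.randrange(len(s))] = rng.choice(ACTIONS)
    return s


def crossover(a, b, rng):
    """Combine the front of one parent with the back of another."""
    return a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]


def evolve(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(ACTIONS)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the half that scored best, discard the rest.
        population.sort(key=mock_fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return max(population, key=mock_fitness)


best = evolve()
print(best, mock_fitness(best))
```

Because the best-scoring strategies always survive into the next generation, the top score can never get worse over time. Early strategies mostly fail, but the less-bad ones pass their instructions on to their "children," which is the dynamic the Open Technology Fund article describes.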
“This completely inverts how researchers typically approach the problem of censorship,” said Levin, who holds a joint appointment in the University of Maryland Institute for Advanced Computer Studies. “Ordinarily we identify how a censorship strategy works and then devise strategies to evade it. But now we let Geneva figure out how to evade the censor, and then we learn what censorship strategies are being used by seeing how Geneva defeated them.”
The team tested Geneva in the laboratory against mock censors and in the real world against real censors. In the lab, the researchers developed censors that functioned like those known from previous research to be deployed by autocratic regimes. Within days, Geneva identified virtually all the packet-manipulation strategies that had been discovered by previously published work.
The researchers plan to release their data and code in the hopes that it will provide open access to information in countries where the internet is restricted. The team acknowledges that there may be many reasons why individuals living under autocratic regimes might not want or be able to install the tool on their computers. However, they remain undeterred. The researchers are exploring the possibility of deploying Geneva on the computer supplying the blocked content (known as the server) rather than on the computer searching for blocked content (known as the client). That would mean websites such as Wikipedia or the BBC could be available to anyone inside countries that currently block them, such as China and Iran, without requiring the users to configure anything on their computer.
“If Geneva can be deployed on the server side and work as well as it does on the client side, then it could potentially open up communications for millions of people,” Levin said. “That’s an amazing possibility, and it’s a direction we’re pursuing.”
The Geneva Site
Since the presentation of the Geneva paper in November 2019, the researchers have developed a Geneva site. One of the key components on the site is the listing of the open source code for Geneva. The site notes that there are two main components: 1) the Geneva engine, which runs the strategies that Geneva has discovered on network traffic, and 2) the full genetic algorithm that powers Geneva. This allows anyone to try out strategies the team has published, write their own strategies manually, or evolve new strategies against real world censors. The site also lists Geneva documentation.
Currently, only Linux is supported, with Windows support in alpha.
An Update on Geneva
I contacted Dave Levin in December 2021, two years after the introduction of his paper in London, and asked for a brief update on Geneva. He replied with the following email on December 9, 2021.
“In the two or so years since Geneva first came out, we’ve had quite a few developments: we found that it can run purely at a server, without requiring clients to download any extra software. So, for instance, if your blog were getting censored, you could run Geneva at your web server and clients in censoring regimes would be able to access it without the clients themselves having to download or install anything. We’ve also been expanding different ways that Geneva can circumvent censorship.”
“Also in that time (since November 2019) new forms of censorship have arisen in many countries, notably China, Iran, and Kazakhstan. In each of these cases, it took Geneva about one hour to discover circumvention techniques.”
“Perhaps most importantly, we’ve been teaming up with other anti-censorship efforts like Psiphon and TunnelBear VPN to help enable them to better circumvent censorship by incorporating Geneva.”
“Geneva’s findings are helping people today, but mainly through integration with these other tools. We have not yet embarked on making our own plug-and-play tool for users (the main reason because we would want to make extra certain that we are not increasing any user’s attack surface, and there can be a big gap between research prototype and security-hardened software that is resilient against powerful nation-state adversaries).
“That said, as you probably saw, all of our code is open source and I think the students have done a fantastic job with it.”
Fighting Censorship on a Global Scale
Interestingly enough, Geneva was introduced in November 2019, just a few months before the pandemic took hold in March 2020. As the researchers note, an internet censorship “arms race” was already under way at the time, as evaders tried to keep up with ever-changing, increasingly sophisticated state censorship. AI systems were also already being used in China to censor online content.
An article in The Internet of Business from the Cambridge Innovation Institute, titled “The Great Firewall: China looks to AI to censor online material,” notes: “One of the country’s newest industries, the mass enforcement of internet censorship, is well on the way to becoming automated. Access to global internet giants from across the Pacific, including Google, Facebook, and Twitter is completely restricted. Yet the number of people in China with internet access through mobile devices has grown dramatically in recent years: from 420 million in 2012 to 753 million in 2017, according to the government’s own statistics. This has created an enormous challenge for state censors. Sifting through millions of images and videos every day is no longer feasible without the help of AI.”
China continues to impose tight censorship. As reported in The Daily Swig: Cybersecurity News & Views, “Chinese censors have begun blocking TLS connections with the Encrypted Server Name Indication (ESNI) extension, in an attempt to reassert state-led censorship controls. Security researchers reported that, starting from late July 2020, the Great Firewall (GFW) of China has been blocking ESNI, one of the foundational privacy-enhancing features of TLS 1.3. Academics have documented various client- and server-side workarounds to China’s heightened security controls, but any relief these hacks offer may only be temporary.”
While the Geneva team continues to battle China’s increasingly automated censors, it has had success in a number of areas. For example, the Open Technology Fund (OTF), an independent non-profit organization committed to advancing global Internet freedom, reports that Geneva detected a new form of censorship in Iran in February 2020. The article notes that ahead of its February 21 parliamentary elections, the Iranian government quietly deployed a second network censorship system that ran in parallel to its existing censorship infrastructure. Rather than blocking specific forbidden network communications (blacklisting), this new system allowed only a small list of approved network communications (whitelisting). This whitelisting posed a threat to almost all existing censorship-evasion and privacy tools used by journalists and activists in the country, and due to its design, it was challenging to detect, measure, or study from outside the country. The censorship system was detected by the Geneva team, which had gained the support of the OTF by then. The team deployed its artificial intelligence (AI) algorithm to learn how to defeat the new whitelister. In just two hours, Geneva had discovered three different ways to defeat the whitelister. As the article notes, “These findings are helping tool developers and activists protect themselves as the Iranian government continues to invest in their censorship and information controls infrastructure.”
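The difference between blacklisting and whitelisting, and why the latter threatens tools a censor has never seen before, can be shown with a small sketch. The traffic labels below are hypothetical placeholders chosen for illustration; they are not a description of what the Iranian system actually matched on.

```python
# Hypothetical traffic labels, for illustration only.
BLOCKED = {"known-vpn"}              # blacklist: what the censor forbids
APPROVED = {"dns", "http", "https"}  # whitelist: the only things it permits


def blacklist_allows(kind):
    """Everything passes unless it is explicitly forbidden."""
    return kind not in BLOCKED


def whitelist_allows(kind):
    """Nothing passes unless it is explicitly approved."""
    return kind in APPROVED


for kind in ["https", "known-vpn", "new-circumvention-tool"]:
    print(kind, blacklist_allows(kind), whitelist_allows(kind))
```

A brand-new circumvention tool slips past the blacklist simply because the censor has not heard of it yet, but the whitelist rejects it by default. That default-deny behavior is what made the new system a threat to almost all existing tools at once.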
China, India and Kazakhstan were the original test nations for Geneva, but since 2019 the system has been tested in a number of other nations. Over the same period, the Geneva project, under the leadership of computer scientist Dave Levin, has grown from seven students in 2019 to more than two dozen in his Breakerspace lab at the University of Maryland.
Areas of Censorship
It is important to clarify what type of censorship Geneva involves itself with. In general, this involves censorship by nation states of information coming in and going out of the nation. Geneva does not concern itself with domestic censorship inside nations. In an email to me, Levin observes there are basically three broad areas of censorship:
- Restricting Internet access, for instance by shutting off access to the Internet completely. Such a censor blocks all access to the Internet for some period of time.
- Restricting the ability to access certain sites/send certain keywords over the Internet. Such a censor selectively blocks access to certain parts of the Internet.
- Influencing content through misinformation, disinformation, or restriction of what information is shown to users. Such a censor effectively blocks access to information by, for instance, overloading users with conflicting (or simply too much) information. (Russia is well known to do this with various online efforts.)
Levin says, “The kind of censorship that Geneva targets is #2: if you cannot connect to the Internet (#1) there’s not much we can do for you, and if you can successfully interact with, download from, and upload to a given website (#3), then Geneva considers its job done.” He continues, “I feel pretty comfortable in saying that the US, England, and Australia do not take part in disconnection (#1) or selective blocking (#2) styles of censorship, certainly not at a national scale like we see in the countries we’ve mainly studied (China, Iran, India, Kazakhstan, and Russia). Censoring nation-states impose nation-wide policies for (nearly) all communication through (nearly) all ISPs. The US government, anyway, does not. In fact, the US government (especially through the State Department’s DRL) is quite committed to ensuring and promoting human rights around the world, both online and offline.”
He notes there certainly are some websites that are blocked by the US government. “For instance, DoD firewalls block access to certain kinds of websites within their own networks—but from what I’ve seen, these are typical kinds of policies at most workplaces, blocking things like pornography and gambling websites.”
Levin makes clear that Geneva targets censorship by nation-states on a global level. As he notes, “the U.S. government is quite committed to ensuring and promoting human rights around the world, both online and offline.” In effect, Geneva stays away from #3 above, that is, domestic censorship that influences content through misinformation, disinformation, or restriction of what information is shown to users.
Targeting authoritarian states is certainly a tremendously worthwhile goal for Geneva. In targeting nations like China and Iran, Geneva performs the invaluable service of allowing information to flow in and out of these states. However, the censorship that many Americans are concerned with is censorship inside nations rather than between them: the censorship of misinformation, or the restriction of information within national borders, as a means for governments to control their populations. Even if one accepts the argument that the U.S. government is committed to ensuring and promoting human rights around the world, it is difficult to believe that it is committed to non-censorship of information within the United States, especially since the beginning of the pandemic.
It is a fact that big tech corporations like Twitter, Facebook and Google are attacking, silencing, and censoring millions of Americans. This creates something of a paradox for Geneva, it seems to me. In effect, Geneva attempts to break down the censorship of authoritarian nations around the world but has little interest in breaking down censorship inside nations.
Geneva is probably wise to stay away from internal censorship. Going after it would immediately make the project a political player, and so far the team has stayed out of politics by keeping to its goal of promoting human rights around the world. The goal of the Geneva project is to foster a new type of global communication in which censorship by authoritarian states becomes smaller and smaller.
Open Source Code / Still the Great Uncensored Part of the Internet
Yet what if the AI of Geneva could be turned toward the internal aspects of censorship rather than the external aspects related to nation-states? Geneva is focused on the great walls that surround information coming into and going out of nation-states like China. What if it instead used AI to remove censorship of information inside nations? This is perhaps a great opportunity for Geneva, but one that carries the danger of becoming a political player.
Yet there is one thing that places Geneva far ahead of others in this battle against censorship. It has not gotten into internal censorship within America but, at the same time, it has provided source code that others could perhaps use to deal with this censorship. So many things have changed on the Internet since it began. Yet perhaps the most democratic aspect of the Internet is source code.
In effect, providing the source code to something is like sharing a truth one has discovered, uncensored by anything or anyone. There is no better way to transmit knowledge to others in this digital age. This is the way that Geneva has grown and will continue to grow.
Geneva has a huge challenge in staying one step ahead of authoritarian nations in the censorship wars.
Can AI knock down censorship in nations like China, Iran and North Korea? This is the big question for Geneva and for all the world. Geneva moves forward into a future of censored and uncensored information about the world. It takes on the grand censors of information today in nation states.
But computer scientists see what Geneva is doing and undoubtedly wonder about its application to internal censorship within nations. Perhaps the most valuable thing that Geneva provides to these scientists is its open-source code.
They have the open-source code of Geneva to work with. As the Internet has become more and more about censorship, open-source code is perhaps the only remaining uncensored part of the Internet.
Visit Geneva’s Breakerspace Lab