Jumping The Great Firewall

Project Director: Laura Kurgan
Data Visualization: Dan Taeyoung

Part of the Spatial Information Design Lab, in collaboration with The Brown Institute for Media Innovation, in partnership with PEN American Center and ProPublica.

This project visualizes a relatively new phenomenon: online free expression in China. It examines an innovative strategy employed by users of Weibo, a Twitter-like micro-blogging platform: in order to avoid government censorship, bloggers post text as images. Images are much more difficult for automated search programs to analyze, which allows image-based content to spread more widely before it is detected and removed. Taking advantage of this, some users are now turning writing into images, taking screenshots of their own and others' controversial posts before they're removed, then posting and re-posting them. The project visualizes Weibo posts that were posted and deleted between September 8th and November 13th, 2013.

Use of the Internet in China is policed (watched over, censored, and punished) by a human and technological program that has been nicknamed 'The Great Firewall'. The aim is to keep politically unacceptable or "sensitive" content (words and articles about the Tiananmen Square massacre, for example) invisible to Chinese Internet users. Twitter and Facebook are largely blocked, as are many news outlets and human rights web sites; web searches are seriously curtailed; sensitive words are blocked; and blog posts and other online content are routinely removed. For many Chinese users who wish to access blocked web sites, the only option is a Virtual Private Network (VPN), a virtual leap over the Great Firewall.

We examined a different strategy that has emerged in Weibo blogging, where users can insert images directly into their postings, without links. Images are much more difficult for automated search programs to analyze, which allows image-based content to spread more widely before it is detected and removed. Taking advantage of this, some users are now turning writing into images, taking screenshots of their own and others' controversial posts before they're removed, then posting and re-posting them. Visualized here are many such deleted posts from September 8th to November 13th, 2013.

Research for this investigation was conducted in collaboration with a team at the Spatial Information Design Lab and the Brown Institute for Media Innovation, in partnership with PEN American Center and ProPublica. The ProPublica article "China's Memory Hole: The Images Erased From Sina Weibo" uses a methodology similar to ours.

Data collection method

How do you observe a disappearance?

To detect a deleted post, you need to 1) first know that it exists, and 2) then know that it has disappeared. There are no databases for disappearance. Instead, observing a disappearance requires you to continually check for an absence. Like librarian Jessamyn West's 'library warrant canary', observing disappearance requires you to 'watch very closely'.

Creating an archive of 'deleted posts' was itself only possible by first creating a comprehensive archive of posts, and then 'watching each post closely' for potential deletion. Around 700 Weibo users were tracked. Each user's posts that were less than a week old were checked for disappearance roughly every 5 minutes. A series of Python scripts on a cloud server used rotating API keys to bypass Sina Weibo's API rate limit. When a post was reported as having disappeared specifically due to censorship (rather than user-initiated deletion), we logged the post into a database, along with any metadata (repost count, follower count) previously recorded.
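
The checking pass described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scripts: the function name find_disappeared, the fetch_status callable, and its three return values ("ok", "censored", "user_deleted") are all hypothetical stand-ins for whatever the real API calls and responses looked like. The sketch only shows the shape of the loop: skip posts older than a week, spread requests across several API keys in round-robin order, and keep the previously recorded metadata for any post reported as censored.

```python
import itertools
from datetime import timedelta

MAX_AGE = timedelta(days=7)  # only posts younger than a week are rechecked


def find_disappeared(tracked, fetch_status, api_keys, now):
    """Return (post_id, metadata) pairs for posts reported as censored.

    tracked:      dict mapping post_id -> metadata dict with a "created" datetime
    fetch_status: hypothetical callable (post_id, api_key) -> status string,
                  assumed to return "ok", "censored", or "user_deleted"
    api_keys:     list of API keys, used in round-robin order
    now:          current time as a datetime
    """
    keys = itertools.cycle(api_keys)  # rotate keys so no single key is overused
    disappeared = []
    for post_id, meta in tracked.items():
        if now - meta["created"] > MAX_AGE:
            continue  # too old to keep watching
        status = fetch_status(post_id, next(keys))
        if status == "censored":
            # keep the metadata recorded while the post was still visible
            disappeared.append((post_id, meta))
    return disappeared
```

Under these assumptions, a scheduler would call find_disappeared roughly every 5 minutes and write the returned pairs to a database. User-initiated deletions are deliberately ignored, matching the distinction drawn in the text.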

For the purposes of visualization, data was anonymized as much as possible to protect the original users. The post timestamps were shifted by random amounts, and anonymous usernames were created to replace the original usernames.
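
One way such an anonymization step might look is sketched below. The function name anonymize, the field names "user" and "created", the alias format, and the one-hour maximum shift are assumptions made for illustration; the text does not specify how large the random shifts were or how the replacement usernames were generated.

```python
import random
from datetime import timedelta


def anonymize(posts, max_shift_minutes=60, seed=None):
    """Return copies of posts with aliased usernames and jittered timestamps.

    posts: list of dicts with at least "user" (str) and "created" (datetime).
    max_shift_minutes: assumed bound on the random shift, in either direction.
    """
    rng = random.Random(seed)
    aliases = {}  # original username -> anonymous alias
    anonymized = []
    for post in posts:
        user = post["user"]
        if user not in aliases:
            # the same original user always maps to the same alias
            aliases[user] = f"user_{len(aliases) + 1:04d}"
        shift = timedelta(
            minutes=rng.uniform(-max_shift_minutes, max_shift_minutes)
        )
        # build a new dict so the original records are left untouched
        anonymized.append(
            {**post, "user": aliases[user], "created": post["created"] + shift}
        )
    return anonymized
```

Keeping a stable alias per user preserves the ability to see that several deleted posts came from the same account, while the shifted timestamps make it harder to match a visualized post to the original. In this sketch the aliases are numbered in order of first appearance, so shuffling the input first would avoid leaking that order.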