In 1937, the year that George Orwell was shot in the neck while fighting fascists in Spain, Julian Chen was born in Shanghai. His parents, a music teacher and a chemist, enrolled him in a school run by Christian missionaries, and like Orwell he became fascinated by language. He studied English, Russian, and Mandarin while speaking Shanghainese at home. Later he took on French, German, and Japanese. In 1949, the year Mao Zedong came to power and Orwell published 1984, learning languages became dangerous in China. In the purges of the late 1950s, intellectuals were denounced, sent to labor camps, and even executed. Chen, who by then was a student at prestigious Peking University, was banished to a Beijing glass factory.
Chen’s job was to cart wagons full of coal and ash to and from the factory’s furnace. He kept his mind nimble by listening to his coworkers speak. At night, in the workers’ dormitory, he compiled a sort of linguistic ethnography for the Beijing dialect. He finished the book around 1960. Soon after, Communist Party apparatchiks confiscated it.
His fortunes improved after Mao’s death, when party leaders realized that China’s economy needed intellectuals in order to develop. Chen went back to school, and in 1979, at the age of 42, his test scores earned him a spot in the first group of graduate students to go abroad in decades. He moved to the US and earned a PhD in physics at Columbia University. At the time, America offered more opportunity than China, and like many of his peers, Chen stayed after graduation, taking a job at IBM in physical science research. IBM had developed some of the world’s first speech recognition software, which allowed professionals to haltingly dictate messages without touching a keyboard, and in 1994 the company started looking for someone to adapt it to Mandarin. It wasn’t Chen’s area, but he eagerly volunteered.
Right away, Chen realized that in China speech recognition software could offer far more than a dictation tool for office workers; he believed it stood to completely transform communication in his native tongue. As a written language in the computer age, Chinese had long posed a unique challenge: There was no obvious way to input its 50,000-plus characters on a QWERTY keyboard. By the 1980s, as the first personal computers arrived in China, programmers had come up with several workarounds. The most common method used pinyin, the system of romanized spelling for Mandarin that Chinese students learn in school. Using this approach, to write cat you would type “m-a-o,” then choose 猫 from a drop-down menu that also included characters meaning “trade” and “hat,” and the surname of Mao Zedong. Because Mandarin has so many homophones, typing became an inefficient exercise in word selection.
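That word-selection bottleneck is easy to see in miniature. The sketch below is purely illustrative: a toy lookup from a typed pinyin syllable to its homophone candidates, with a hypothetical three-entry dictionary standing in for a real input-method database.

```python
# Toy pinyin input method: one typed syllable maps to many homophonous
# characters, so the user must pick from a candidate drop-down list.
# The dictionary below is a hypothetical stand-in, not real IME data.
CANDIDATES = {
    "mao": ["猫", "贸", "帽", "毛"],  # cat, trade, hat, Mao (surname)
}

def lookup(pinyin: str) -> list[str]:
    """Return the homophone candidates for a typed syllable."""
    return CANDIDATES.get(pinyin, [])

def choose(pinyin: str, index: int) -> str:
    """Simulate the user selecting the index-th entry in the drop-down."""
    return lookup(pinyin)[index]

print(choose("mao", 0))  # prints: 猫
```

Every syllable typed this way demands an extra selection step, which is why dictation looked so appealing by comparison.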
To build his dictation engine, Chen broke Mandarin down into its smallest elements, called phonemes. Then he recruited 54 Chinese speakers living in New York and recorded them reading articles from People’s Daily. IBM’s research lab in Beijing added samples from an additional 300 speakers. In October 1996, after he had tested the system, Chen flew to China to demonstrate the resulting software, called ViaVoice, at a speech technology conference.
In a packed room festooned with gaudy wallpaper, Chen read aloud from that day’s newspaper. In front of him, with a brief delay, his words appeared on a large screen. After he finished, he looked around to see people staring at him, mouths agape. A researcher raised her hand and said she wanted to give it a go. He handed over the microphone, and a murmur ran through the crowd. ViaVoice understood her too.
ViaVoice debuted in China in 1997 with a box that read, “The computer understands Mandarin! With your hands free, your thoughts will come alive.” That same year, President Jiang Zemin sat for a demonstration. Soon PC makers across China—including IBM’s rivals—were preinstalling the software on their devices. The era of freely conversing with a computer was still a long way off, and ViaVoice had its limitations, but the software eased the headache of text entry in Chinese, and it caught on among China’s professional class. “It was the only game in town,” Chen recalls.
But for some scholars who had stayed in China, it stung that a researcher working for an American company had been the one to make a first step toward conquering the Chinese language. China, they felt, needed to match what Chen had done.
Among those motivated by IBM’s triumph was Liu Qingfeng, a 26-year-old PhD student in a speech recognition lab at the prestigious University of Science and Technology of China, in Hefei. In 1999, while still at USTC, Liu started a voice computing company called iFlytek. The goal, it seemed, was not just to compete with IBM and other foreign firms but to create products that would restore Chinese pride. Early on, Liu and his colleagues worked out of the USTC campus; later they moved elsewhere in Hefei. It was a second-tier city—USTC had been relocated there during the Cultural Revolution—but staying meant iFlytek was close to the university’s intellectual talent.
When Liu explained his business concept to Kai-Fu Lee, then the head of Microsoft Research Asia, Lee warned that it would be impossible to catch up with American speech recognition giants. In the US, the industry was led by several formidable companies in addition to IBM and Microsoft, including BellSouth, Dragon, and Nuance Communications, which had recently spun off from the nonprofit research lab SRI International. These companies were locked in a slog to overcome the limitations of early-2000s computing and build a voice-computer interface that didn’t exasperate users, but they were far ahead of Chinese competitors.
Liu didn’t listen to Lee’s warnings. Even if voice-interface technology was a crowded, unglamorous niche, Liu’s ambition gave it a towering moral urgency. “Voice is the foundation of culture and the symbol of a nation,” he later said, recounting iFlytek’s origin story. “Many people thought that they”—meaning foreign companies—“had us by the throat.” When some members of his team suggested that the company diversify by getting into real estate, Liu was resolute: Anyone who didn’t believe in voice computing could leave. Nuance was building a healthy business helping corporate clients begin to automate their call centers, replacing human switchboard operators with voice-activated phone menus (“To make a payment, say ‘payment’”). iFlytek got off the ground by doing the same sort of work for the telecommunications company Huawei.
iFlytek went public in 2008 and launched a major consumer product, the app iFlytek Input, in 2010. The following year, Apple’s iPhone began to carry Siri, which had been spun out of SRI International and acquired by Apple. But while Siri was a “personal assistant”—a talking digital concierge that could answer questions—iFlytek Input was far more focused. It allowed people to dictate text anywhere on their phones: in an email, in a web search, or on WeChat, the super app that dominates both work and play in China.
Like any technology trained on interactions with human speech, Input was imprecise in the beginning. “With the first version of that product, the user experience was not that good,” said Jun Du, a scientist at USTC who oversaw technical development of the app. But as data from actual users’ interactions with the app began to pour in, Input’s accuracy at speech-to-text transcription improved dramatically.
As it happened, Siri and Input were relatively early arrivals in a coming onslaught of mature voice-interface technologies. First came Microsoft’s Cortana, then Amazon’s Alexa, and then Google Assistant. But while iFlytek launched its first generation of virtual assistant, Yudian, in 2012, the company was soon training much of its AI firepower on a different challenge: providing real-time translation to help users understand speakers of other dialects and languages. Later versions of Input allowed people to translate their face-to-face conversations and get closed captioning of phone calls in 23 Chinese dialects and four foreign languages. When combined with China’s large population, the emphasis on translation has allowed the company to collect massive amounts of data.
Americans might tap Alexa or Google Assistant for specific requests, but in China people often use Input to navigate entire conversations. iFlytek Input’s data privacy agreement allows it to collect and use personal information for “national security and national defense security,” without users’ consent. “In the West, there are user privacy problems,” Du says. “But in China, we sign some contract with the users, and we can use their data.” Voice data can be leaky in China. The broker Data Tang, for example, describes specific data sets on its website, including one containing nearly 100,000 speech samples from 3- to 5-year-old children.
In 2017, MIT Technology Review named iFlytek to its list of the world’s 50 smartest companies, and the Chinese government gave it a coveted spot on its hand-picked national “AI team.” The other companies selected that year were platform giants Baidu, Alibaba, and Tencent. Soon after, iFlytek signed a five-year cooperation agreement with MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), a leading AI lab. The company’s translation technology is used by the Spanish football club RCD Espanyol, and it signed an exclusive deal to provide automated translation for the 2022 Beijing Winter Olympics. As of mid-April, iFlytek was valued on the Shenzhen Stock Exchange at $10.8 billion, and it claims to have 70 percent of the Chinese voice market, with 700 million end users. Nuance was valued at $5.3 billion at the same time. In China, the company’s other major competitors in voice computing are mainly platforms like Alibaba and Baidu.
Two decades after Julian Chen intuited that voice computing would revolutionize how people interact with computers in China, its impact there is indeed dramatic. Every day, WeChat users send around 6 billion voice texts, casual spoken messages that are more intimate and immediate than the typical voicemail, according to 2017 figures. Because WeChat caps the messages at one minute, people often dash them off in one long string. iFlytek makes a tablet that automatically transcribes business meetings, a digital recorder that generates instantaneous transcripts, and a voice assistant that is installed in cars across the country.
Consumer products are important to iFlytek, but about 60 percent of its profits come from what is described in the company’s 2019 semiannual report as “projects involving government subsidies.” These include an “intelligent criminal investigation assistant system,” as well as big data support for the Shanghai city government. Such projects bring access to data. “That might be everything that’s recorded in a court proceeding, call center data, a bunch of security stuff,” says Jeffrey Ding, a scholar at Oxford University’s Future of Humanity Institute who studies AI governance in China. Liu, iFlytek’s founder and CEO, is a delegate to the National People’s Congress, China’s rubber-stamp parliament. “He has a very good relationship with the government,” Du says.
Liu has a vision that voice computing will someday penetrate every sphere of society. He recently told an interviewer for an online state media video channel: “It will be everywhere, as common as water and electricity.” That’s a dream that aligns neatly with the Chinese Communist Party’s vision for a surveillance state.
One day this past fall, I tested out a recent model of the Translator, an instant translation device made by iFlytek, with a man I’ll call Al Cheng. The Translator, a device powered by a Qualcomm Snapdragon chip, works offline for major world languages. Cheng and his wife live in a congested city in southern China, but every other year they travel to the Midwest to visit family. To get exercise, they walk half a mile each morning to the mall. But Cheng, who likes to hold forth on art and culture in Mandarin, Cantonese, and Hakka, does not speak any English. Much of the time while in the US, he is unhappily silent. He is exactly the sort of person who needs a Translator.
I met Cheng one morning in the mall’s central atrium, near an antique Chevrolet pickup truck that held hay and flowers. (“Chrysanthemums,” Cheng noted, approvingly.) When I told him the price of the Translator (around $400), he was skeptical. “Too expensive,” he said, shaking his head. But as we sat down outside Caribou Coffee to play around with it, his skepticism gave way to admiration. We held the device alongside the Baidu Translate app on his phone, taking turns speaking phrases in various languages in an attempt to stump it. In Mandarin, the Translator understood that Cheng’s accented “mingnisuda” was Minnesota. It got my name, despite Cheng pronouncing it “Mala.” When I spoke English, both translation tools could handle the metaphor “I’m feeling blue,” but only the Translator got that “I got up on the wrong side of the bed” was about my mood, not where I placed my feet. The most magical moment came when Cheng recited a couplet from the eighth-century poet Zhang Jiuling. Baidu translated the lines, nonsensically, as “At sea, the moon and the moon are at this time.” The Translator offered up an accurate and genuinely poetic translation:
As the bright moon shines over the sea; / From far away you share this moment with me.
When Cheng switched to Cantonese, the results were more mixed. (The Translator rendered an idiom for the English language as “chicken farm.”) But the mere fact that the device supported the language impressed him.
iFlytek’s translation mission goes far beyond helping travelers, business people, and urban elites. It has developed products for ethnic minorities and people in rural areas where many people do not speak Mandarin, and it is constantly improving its handling of dialects. In 2017 it launched what it calls the Dialect Protection Plan. When I first came across a news report about it, I laughed out loud at the Orwellian name. The Chinese Communist Party has spent decades attacking language noun by noun, verb by verb—censoring terms it deems dangerous, undermining dialects and minority languages, and bludgeoning Mandarin with ideological drivel. (The Chinese cultural critic Li Tuo dubbed such clunky phrasing Maospeak, in reference to the Newspeak of 1984.) Tech companies have aided in the assault on language.
An iFlytek spokesperson said in an email that the goal of the company’s work on dialects was to “protect our ways of communication.” iFlytek has devoted special attention to Uighur and Tibetan, which are spoken by ethnic minorities that have been singled out for persecution by Beijing. China Daily reported that in one promotion for the Dialect Protection Plan, executives encouraged users of iFlytek Input to record themselves speaking their native language, in exchange for a chance to win an iPhone.
iFlytek’s campus sits far outside Hefei’s city center, on a street lined with drab apartment buildings. Nearly half of the company’s 11,000 employees work in a guarded compound spanning 31 acres. The rest are scattered at offices throughout China, with a few in other parts of the world. Like Silicon Valley tech companies, iFlytek buses in staff, provides food and entertainment, and projects a lofty mission. Everywhere on the campus—on walls, merchandise, and the stall doors in the squat toilets—is the slogan “Empower the world with artificial intelligence.” When I arrived there last spring, I was greeted by a photograph of Xi Jinping.
A spokesperson led me to a café on the campus for a chat over bubble tea. iFlytek does not list any media contacts on its website, and I had managed to arrange this visit only after spending several hours cold-calling the company’s customer service lines. After a few dead ends, an agent took pity and connected me with the spokesperson, who accepted my request to visit. (Another spokesperson later responded to a list of questions sent to Chartwell Strategy Group, a DC-based lobbying firm that iFlytek engaged to manage its communications in the US.)
Surrounded by blond wood, I slurped up tapioca balls as my host explained the company’s consumer products. She wore a flouncy shirt with a sewn-on vest, dangly earrings, and platform shoes—an outfit that reflected iFlytek’s aesthetic, which is cute, whimsical, and even silly. One version of its child companion robot, the Alpha Egg, has polka dots and little antennae and speaks in a cartoonish alien voice. Its virtual assistant for drivers, Flying Fish, is depicted in ads as a cuddly shark in a scuba mask. The robot it markets to hospitals to assist with patient queries looks like the love child of C-3PO and EVE, the machine in the animated film WALL-E. (“Perhaps more than any people in the world,” says Chen Xiaoping, the director of USTC’s Center for Artificial Intelligence Research, the lab that helped develop the medical assistant, Chinese people “really like robots.”) As the spokesperson explained it, iFlytek’s products were all in service of convenience and fun.
Fun is also a means of subversion in China, especially when it comes to language. In the early 2000s, as online censors banned certain characters, computer users got around the state by switching to homophones. To mock the notion of a “harmonious society,” a Maospeak phrase popularized during Hu Jintao’s rule, they joked about crustaceans—hexie, river crab, is pronounced similarly to hexie, harmony. “Serve the people” became “smog the people”; both phrases contain the syllable wu, and the two characters sat side by side in an input drop-down menu.
Alarmed by the proliferation of online sarcasm, the central government went so far as to ban homophones and other wordplay. So dissidents turned to other means of dissemination. “Activists saw new opportunities in video as it became easier and cheaper on cameras and phones to record, view, and also distribute,” said Dechen Pemba, a Tibetan human rights activist in London who edits the site High Peaks Pure Earth. But by the late aughts, the Communist Party had embarked on a quest to master speech technologies—one that ran in parallel with iFlytek’s growth as a consumer voice company.
In 2009, Meng Jianzhu, the head of China’s Ministry of Public Security, traveled to Hefei and visited iFlytek’s headquarters. According to a report posted on the central government’s website, he spoke there of the need for “public security organs to closely cooperate with technology companies” to create “prevention and control systems.” As the CCP has amped up its surveillance capabilities over the past decade, it has installed millions of cameras, introduced electronic ID cards and real-name registration online, and built tech-driven “smart” cities. iFlytek’s technology has helped the government to integrate audio signals into this network of digital surveillance, according to Human Rights Watch.
The company is emblematic of a broader Chinese government effort called “military-civil fusion,” which aims to harness advances in China’s tech sector for military might. “iFlytek is contributing to military-civil fusion quite actively,” says Elsa Kania, a fellow at the Center for a New American Security in Washington, DC, who studies artificial intelligence in China. “There are elements of the company that pursue consumer applications, but the public security, policing, and defense-oriented applications appear to be significant as well.” The company has promoted its products to the People’s Liberation Army, according to testimony that Kania presented to Congress last year. She adds, “It’s not clear that there are firewalls or divisions” between consumer and other state-oriented applications. (The spokesperson reached through Chartwell Strategy Group said that iFlytek does not develop military technologies and would not comment on the company’s security work or on whether data gathered through iFlytek’s consumer products is firewalled from its government projects.)
For the CCP, monitoring speech appears to be about more than censorship. “The collection of voice and video data assists with identifying people, networks, how people speak, what they care about, and what are the trends,” says Samantha Hoffman, an analyst at the Australian Strategic Policy Institute’s Cyber Centre in Canberra.
iFlytek has patented a system that can sift through large volumes of audio and video in order to identify files that have been copied or reposted—part of an operation that the patent explains as “very important in information security and monitoring public opinion.” iFlytek responded that “analyzing audio and video data can have a number of potential applications, including identifying popular songs, detecting spam callers, etc.”
But iFlytek does enable security work. In 2012 the Ministry of Public Security purchased machines from iFlytek focused on intelligent voice technology. The ministry chose Anhui province, where iFlytek is headquartered, as one of the pilot locations for compiling a voice-pattern database—a catalog of people’s unique speech that would enable authorities to identify speakers by the sound of their voice.
The project relies on an iFlytek product called the Forensic Intelligent Audio Studio, a workstation that includes speakers, a microphone, and a desktop tower. The unit, which according to a 2016 local government procurement announcement sells for around $1,700, can identify people based on the unique characteristics of their voices. An iFlytek white paper uploaded online in 2013 touts voiceprint or speaker recognition as the “only biometric identification method that can be operated remotely,” noting that “in the defense field, voiceprint identification technology can detect whether there are key speakers in a telephone conversation and then track the content of the conversation.” The workstation can take a snippet of audio, compare it against the voices of 200 speakers, and pick out the person talking in under two seconds, according to the white paper.
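The matching step the white paper describes—comparing one audio snippet against a gallery of enrolled speakers—is, at its core, a nearest-neighbor search over voiceprint vectors. The sketch below is a simplified illustration under that assumption; iFlytek’s actual models and features are not public, and real systems extract embeddings from audio rather than using the synthetic vectors shown here.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two voiceprint vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(query: np.ndarray, gallery: dict[str, np.ndarray]) -> str:
    """Return the enrolled speaker whose voiceprint best matches the query."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))

# Enroll 200 synthetic "speakers," then identify a noisy sample of one of them.
rng = np.random.default_rng(0)
gallery = {f"speaker_{i}": rng.normal(size=64) for i in range(200)}
query = gallery["speaker_42"] + 0.05 * rng.normal(size=64)
print(identify(query, gallery))  # prints: speaker_42
```

Because each comparison is a single dot product, scanning a 200-speaker gallery is trivially fast, which is consistent with the sub-two-second claim in the white paper.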
Other countries also use voiceprint recognition for intelligence purposes. According to classified documents leaked by Edward Snowden, the National Security Agency has long used the tool to monitor terrorists and other targets. NSA analysts used speaker recognition, for example, to confirm the identities of Saddam Hussein, Osama bin Laden, and Ayman al-Zawahiri in audio files, and the FBI has a research arm devoted to the technology. Nuance once sold a system called Nuance Identifier, which it said allowed law enforcement to “perform searches against millions of voiceprints within seconds.” The US Bureau of Prisons reportedly collects and stores prisoners’ voiceprints in order to monitor their phone calls.
In 2017, Human Rights Watch published a report detailing iFlytek’s government work. Maya Wang, a researcher with the rights group, says that the company’s tools are an essential part of the party’s plan to “build a digital totalitarian state”—a charge the company calls “baseless and absurd.” iFlytek’s voice biometric technologies make “tracking and identifying individuals possible,” Wang says. At some point, the noble effort to reclaim the Chinese language and ease communication became indistinguishable from one to control it.
The company, like many US tech firms, maintains that its technology is “end-use agnostic.”
iFlytek’s work has come under particular suspicion in regions that pose a threat to the party’s rule. One focus is greater Tibet, the culturally distinct part of western China where people have long fought for sovereignty. In Lhasa, iFlytek cofounded a lab at Tibet University that focuses on speech and information technology. The company says the goal of the lab is “the preservation and greater understanding of minority dialects and to help protect Tibetan culture.” The company also makes a Tibetan input app called Dungkar, which means “conch shell,” an auspicious symbol in Tibetan Buddhism.
According to Human Rights Watch, iFlytek’s technology appears to enable surveillance in Xinjiang, a region in northwest China populated by the predominantly Muslim Uighur minority group. In recent years, the Chinese government has tightened its grip on Uighurs, interning more than a million people in camps and farming out others to factories as forced labor. Residents have been made to install nanny apps on their phones, give biometric data at regular security checkpoints, and host cultural inspectors in their homes. In official materials, these inspectors are called, with no apparent hint of irony, “big sisters and big brothers.”
The crackdown is perhaps most intense in Kashgar, an ancient city on the Silk Road that was once a major destination for tourists and is now home to at least a dozen internment camps. In 2016 police in Kashgar contracted with an iFlytek subsidiary to purchase 25 voiceprint terminals. According to the procurement agreement, the technology would be used to collect speech samples for biometric dossiers that also include photos, fingerprints, and DNA samples. The same subsidiary helped Kashgar University integrate big data on its campus for the purpose of ensuring its “safe and stable operation” in a “multiethnic” environment, according to the subsidiary’s website.
In May 2016, iFlytek signed a strategic cooperation agreement with the agency that operates prisons in Xinjiang. It is unclear exactly how iFlytek’s technology is used in this context, but a post on Sohu, a Chinese platform, stated that iFlytek’s work would “ensure the security and stability of prisons.”
A group of US scholars met with a member of iFlytek’s leadership in Beijing last summer and pushed him on the company’s work in Xinjiang: “He characterized it as, ‘We’re helping the government to better understand Uighurs through providing those language capabilities,’” one security analyst who was there told me.
Wang, of Human Rights Watch, says the fact that iFlytek builds “both benign, commercial applications as well as surveillance applications is precisely what makes them very problematic.” iFlytek’s data from government projects is likely used to improve its consumer products—and vice versa. “They can train and perfect their AI systems on lots of samples, collected through not only their commercial but also through their military and policing applications,” she says. Every time a traveler speaks into the Translator, their words feed an algorithmic black box. All told, iFlytek’s technologies promise to dramatically reshape life for people in China and elsewhere, by turning an individual’s voice into both a crucial time-saver and an inescapable marker of their identity.
In recent years, iFlytek entered a stage of international expansion, brokering research partnerships with universities in Canada, New Zealand, and the United States. On the occasion of its agreement with MIT’s CSAIL, lab director Daniela Rus said the partnership would focus on addressing “the biggest challenges of the 21st century” by “finding ways to better harness the strengths of both human and artificial intelligence.” Critics of the partnership had a more cynical view: iFlytek gave an undisclosed sum of money in return for the prestige of the MIT brand.
The announcement came eight months after Human Rights Watch exposed iFlytek’s work in Xinjiang, and as awareness of the indoctrination camps spread, some MIT researchers grew alarmed. Alan Lundgard, a graduate student at CSAIL, told me that he learned his work would be funded by iFlytek only after he started his position at the lab. When he emailed a CSAIL administrator to explain that he had moral objections to receiving money from the company, the administrator responded that Lundgard could find other funding for his work. If he didn’t, the administrator said, he would have to return the payments he had already received.
Last summer, after press reports revealed that sex offender Jeffrey Epstein and the Saudi state had funded other labs at MIT, students and staff staged a series of protests, and CSAIL’s agreements with Chinese tech companies were thrust into the spotlight. “The concerns about surveillance activities in China are very real,” says Roger Levy, the director of MIT’s Computational Psycholinguistics Laboratory. “MIT needs to take very seriously that, when we enter into an engagement with another entity, we are lending it a kind of credibility.”
In October, the Department of Commerce placed iFlytek on the US government’s Entity List of companies subject to export restrictions. In response, Liu Qingfeng posted a defiant missive on iFlytek’s site, in Chinese, that did little to dispel perceptions of close government ties. “Without the revolutionary martyrs giving their blood, today there would be no modern China,” it read. “Without the prosperity and development of modern China, iFlytek could not have debuted on the industrial stage … There is no force that can stop our confidence and pace in building a beautiful world with artificial intelligence!” The next day, US secretary of state Mike Pompeo touched on the crackdown in Xinjiang in a speech. “The pages of George Orwell’s 1984 are coming to life there,” he said.
Last fall, in response to emailed questions, a CSAIL spokesperson said that the addition of iFlytek to the Entity List had triggered a review by MIT, but that in the interim CSAIL would continue with the partnership. In April, after WIRED sent more inquiries, MIT announced that it had terminated the relationship in February. The university would not disclose the rationale. As for Lundgard’s funding, a spokesperson responded that “the Institute must strike a balance between the funding available to carry out research and the individual preferences of researchers.”
When I mentioned iFlytek’s work to a friend in Shanghai, she said it reminded her of the story “City of Silence,” by the Chinese science fiction writer Ma Boyong. The story is set in a future society where speech is tightly controlled. The people are clever at adapting to each new limit, turning to homonyms and slang to circumvent censors, and in time the authorities realize that the only way to truly control speech is to publish a List of Healthy Words, forbid all terms not on the list, and monitor voice as well as text. Anytime the protagonist leaves the house, he has to wear a device called the Listener, which issues a warning whenever he strays from the list of approved words. The realm of sanctioned speech dwindles day by day.
Eventually the protagonist discovers the existence of a secret Talking Club, where, in an apartment encircled by lead curtains, members say whatever they want, have sex, and study 1984. Feeling alive again, he realizes that he has been suppressing “a strong yearning to talk.” This brief encounter with hope is squelched when the authorities develop radar dishes that can intercept signals through lead curtains. By the end of the story, there are no healthy words left, and the hero walks the city mutely, alone with his thoughts. “Luckily, it was not yet possible to shield the mind with technology,” Ma writes.
The Chinese surveillance state has sometimes involved what the scholar Jathan Sadowski calls “Potemkin AI”—technology that seems all-powerful but is not. But on a practical level, whether the technology is as accurate as advertised makes little difference. When people have the impression that the state can locate them using just a few seconds of intercepted audio, they begin to self-censor. Big Brother is internalized.
I reflected on this last April, while visiting Shanghai. One day I took the metro to the massive National Exhibition and Convention Center, at the city’s western edge, to attend the Shanghai Auto Show. iFlytek was one of the exhibitors, and when I reached its booth a video was playing on a large screen. It showed a clean-cut young man getting behind the wheel of a red sedan. “Hi Peter!” a voice said, as a screen attached to the dashboard flashed his picture. Peter beamed as if he had been waiting his whole life for his car to recognize him.
White characters flashed across a neon background in rapid succession, so fast as to be almost subliminal. Understands your needs. Establishes your feelings. The intelligent interactive auto system of the future.
A sales associate named Xing Xiaoling led me to a small station to try out the auto assistant for myself. We put on headphones. “Flying Fish, hello!” she said, and the screen woke up.
“I want to listen to a song,” Xing said, and a saccharine pop number blasted into my ears. She showed me how to buy airplane tickets to Beijing with a few simple voice cues, a feature available to users who connect their Alipay or WeChat Pay mobile payments accounts. Xing added that Flying Fish was always at the ready.
iFlytek’s virtual assistants are often called “China’s Siri,” but Xing thought that comparison did the company a disservice. “With Siri, you have to say ‘Hey, Siri’ every time,” she told me. “It’s very mechanical.” In the US, companies like Apple have fought hard against the perception that their devices are always listening. In China, though, it was a selling point. “You only have to wake it up one time, and then it’s awake,” Xing said of Flying Fish.
I captured the conversation with my digital recorder and took notes as she spoke. When I glanced back at the screen I saw that I wasn’t the only one who had made a recording. Right there, on the intelligent interactive auto system of the future, was a complete record of all our words.
This article appears in the June issue.