You, a person who’s currently on the English-speaking internet in The Year of The Pandemic, have definitely seen public service information about Covid-19. You’ve probably been unable to escape seeing quite a lot of it, both online and offline, from handwashing posters to social distancing tape to instructional videos for face covering.
But if we want to avoid a pandemic spreading to all the humans in the world, this information also has to reach all the humans of the world—and that means translating Covid PSAs into as many languages as possible, in ways that are accurate and culturally appropriate.
It’s easy to overlook how important language is for health if you’re on the English-speaking internet, where “is this headache actually something to worry about?” is only a quick Wikipedia article or WebMD search away. For over half of the world’s population, people can’t expect to Google their symptoms, nor even necessarily get a pamphlet from their doctor explaining their diagnosis, because it’s not available in a language they can understand.
This health-language gap isn’t unique to Covid. Wuqu’ Kawoq|Maya Health Alliance is a nonprofit in Guatemala that’s been providing health support in indigenous Mayan languages such as Kaqchikel and Kʼicheʼ for the past 13 years. An early client of Wuqu’ Kawoq was a Kaqchikel-speaking woman who knew she had diabetes—she could repeat the name that the Spanish-speaking doctors had told her, but a big part of managing diabetes is carefully balancing one’s blood sugar through what one eats, which an opaque, untranslated name didn’t help her with. That is, until Wuqu’ Kawoq developed a name for diabetes in Kaqchikel—kab’kïk’el, literally “sweet blood,” in consultation with medical professionals. The new terminology made it easy for Wuqu’ Kawoq’s health workers to explain how to manage the disease in her native tongue: Your blood is too sweet, you need to make it less sweet, by eating less sweet things. With this information, the woman was able to go back and explain to her family how they needed to cook to help her.
Like diabetes, Covid is for the moment a lifestyle illness—until we have a vaccine or other treatments, the best way we currently have of managing it is through changing the way we live. All those handwashing and social distancing posters. A doctor can give a pill or a shot to someone who doesn’t understand how it works, but since we don’t have that yet for SARS-CoV-2, we’re facing what the Epidemic Intelligence Service program of the Centers for Disease Control and Prevention considers a communications emergency—what the WHO calls an “infodemic.”
In the past few months, Wuqu’ Kawoq has expanded from its usual mandate (primary care issues like diabetes, midwifery, child malnutrition, and accompanying its indigenous clients to Spanish-speaking hospitals for interpretation and advocacy) to looping in translators on telemedicine phone calls with doctors and producing Covid podcasts in Mayan languages to air on local radio—the most effective way of disseminating information at a distance in rural areas where internet service isn’t always available.
That’s just one of many Covid translation projects that have been springing up all over the world. Adivasi Lives Matter has been making info sheets in languages of India including Kodava, Marathi, and Odia. The government of Australia’s Northern Territory has been producing videos in First Nations languages including Yolŋu Matha, Pintupi-Luritja, and Warlpiri. Seattle’s King County has been producing fact sheets in languages spoken by local immigrant and refugee communities, such as Amharic, Khmer, and Marshallese. VirALLanguages has been producing videos in languages of Cameroon, including Oshie, Aghem, and Bafut, starring well-known community members as local “influencers.” Even China, which has historically promoted Mandarin (Putonghua) as the only national language, has been putting out Covid information in Hubei Mandarin, Mongolian, Yi, Korean, and more.
According to a regularly updated list maintained by the Endangered Languages Project, Covid information from reputable sources (such as governments, nonprofits, and volunteer groups that clearly cite the sources of their health advice) has been created in over 500 languages and counting, including over 400 videos in over 150 languages. A few of these projects are shorter, more standardized information in a larger variety of global languages, such as translating the five WHO guidelines into posters in over 220 languages or translating the WHO’s mythbuster fact sheets into over 60 languages. But many of them, especially the ones in languages that aren’t as well represented on the global stage, are created by individual, local groups who feel a responsibility to a particular area, including governments, nonprofits, and volunteer translators with a little more education or internet access.
There are still gaps: South Africa’s government has been criticized on social media for doing briefings mostly in English, rather than in at least two of its 10 other official languages: an Nguni language (such as Zulu or Xhosa) and a Sotho language (such as Setswana or Sesotho). England has faced legal proceedings for not including a British Sign Language interpreter in its regular government briefings the way that Scotland, Wales, and Northern Ireland have. (Many other countries have also been proactive about including sign language interpreters, from the Netherlands to New Zealand.)
But by and large, there is a recognition that language is a vital part of the Covid response, an understanding that’s come from hard-gained experience. When respiratory illness experts talk about precursors to Covid-19, they tend to talk about SARS and MERS; when language experts talk about the pandemic, there are two different precedents that keep coming up: the 2010 earthquake in Haiti and the Ebola epidemics in Western Africa (2013–2016) and the Democratic Republic of the Congo (since 2018).
In both cases, the language spoken by locals wasn’t a language widely spoken by aid workers. In Haiti, this led to an initiative called Mission 4636, where Haitians could text requests for assistance—such as spotting someone trapped inside a building, or needing medical assistance—to the 4636 SMS shortcode, and volunteers from the Haitian diaspora around the world would translate tens of thousands of requests from Haitian Creole into English and forward them to English-speaking aid workers on the ground, within an average of 10 minutes.
For the Ebola epidemics, the language challenge multiplies. In the DRC alone, there are at least seven major languages: French, Kikongo (Kituba), Lingala, Swahili, Tshiluba, Francophone African Sign Language, and American Sign Language, and still more smaller languages that are common in particular areas, according to a map created by Translators Without Borders. A recent study by Translators Without Borders points to what these resources should look like, reflecting what we could term the universal human desire to WebMD your illness: “Study participants voiced frustration with information like ‘You have to go early to the Ebola treatment centre to be cured.’ They want a more detailed and sophisticated explanation of how the treatment drugs work, and why they were selected. […] People want details on complex issues to inform their decisions, and they want them presented in what they referred to as ‘community language’—meaning in a language and style they understand, using words and concepts they are familiar with.”
Not understanding the community language can be negligent—relying on lingua francas like French and Swahili disproportionately harms women in the DRC, who are more likely to only speak Nande and other local languages. It can even be counterproductive. Rob Munro, who’s worked on the language tech response for both the Haiti earthquake and Ebola, told me a story out of Sierra Leone during the Ebola crisis, where naive do-gooders swooped in to create public service announcements about Ebola, only to find that, on the advice of the Mende-speaking party in power at the time, they’d put Mende announcements on loudspeakers in a Temne-speaking area, thereby stoking conspiracy theories that the virus was being used as a tool for suppressing political rivals.
Linguistic competence is just as important for Covid: providing sufficient context about how a disease works allows people to figure out reasonable precautions in unanticipated circumstances, and putting out this information in appropriate community languages also helps convince people that the advice is reputable and should be followed. Not to mention that as countries ramp up contact tracing to help with reopening, this too will need to happen in all the languages spoken in a community. (The current demand for Spanish-speaking contact tracers in the US is just a beginning.)
But in a pandemic, the challenge isn’t just translating one or a handful of primary languages in a single region—it’s on a scale of perhaps thousands of languages, at least 1,000 to 2,000 of the 7,000+ languages that exist in the world today, according to the pooled estimates of the experts I spoke with, all of whom emphasized that this number was very uncertain but definitely the largest number they’d ever faced at once.
Machine translation might be able to help in some circumstances, but it needs to be approached with caution. Here’s an example of what can go wrong with a phrase as simple as “wash your hands.” The Japanese equivalent of “wash your hands” as provided by Google Translate is 手を洗いなさい (te o arainasai), which I’m told is technically grammatical but also a style appropriate for a parent to say to a child. Certainly appropriate in some circumstances, but also liable to leave a bad impression (“reduce compliance” in public-health speak) on posters aimed at adults.
So I challenged my Twitter followers to find any language they were fluent in where the Google Translate version of “wash your hands” was specifically in the style appropriate for a public service announcement or poster. Again, many languages did produce grammatical results, but for the European languages, the website tended to return the informal, singular forms of “you” (the “tu” or “du” forms). Informal versions are often appropriate in speech—but not typical for official posters, where most speakers expected impersonals (“Handwashing required”) or polite forms like “vous” or “usted” or “Sie.” From over a dozen languages, we found just two where the results were the right register for a sign: Korean and Swahili. Appropriateness may seem trivial, but imagine your doctor asking you, an adult, if your tummy has an owie rather than asking if you have abdominal pain. It just … doesn’t really inspire confidence.
That’s not to say that machine translation isn’t helpful for some tasks, those where getting the gist quickly is more important than the nuanced translations humans excel at, such as quickly sorting and triaging requests for help as they come in, or keeping an eye on whether a new misconception is bubbling up. But humans need to be kept in the loop, and both human and machine language expertise needs to be invested in during calmer times so that it can be used effectively in a crisis.
The bigger issue with machine translation is that it’s not even an option for many of the languages involved. Translators Without Borders is currently translating Covid information into 89 languages, responding to specific requests of on-the-ground organizations, and 25 of them (about a third) aren’t in Google Translate at all. Machine translation disproportionately works for languages with lots of resources, with things like news sites and dictionaries that can be used as training data. Sometimes, like with French and Spanish, the well-resourced languages of former colonial powers also work as lingua francas for translation purposes. In other cases, there’s a mismatch between what’s easy to translate by machine versus what’s useful to TWB: TWB has been fielding lots of requests for Covid info in Kanuri, Dari, and Tigrinya, none of which are in Google Translate, but hasn’t seen any yet for Dutch or Hebrew (which are in Google Translate but don’t need TWB’s help—they have national governments already producing their own materials).
Google Translate supports 109 languages, Bing Translate has 71, and even Wikipedia only exists in 309 languages—figures that pale in comparison to the 500+ languages on the list from the Endangered Languages Project, all human-created resources. Anna Belew, who’s been compiling the list since mid-March, tells me that she’s been adding a dozen or so languages every day and that if anything, it’s an undercount—the list deliberately excludes well-resourced national languages like Dutch (unless they’re also lingua francas, like French), based on similar priorities to TWB. Of course, it’s easier to translate a few documents than to create a whole machine translation system, but the first can also help with the second.
A crisis like the pandemic can expose both the flaws and the potential that are already present in a system. On the one hand, fewer trips by cars and planes means improved air quality and reduced carbon emissions, a potential opportunity to address another big intractable societal problem in the process of reopening. On the other hand, the people who’ve been disproportionately impacted by Covid have been those who were already marginalized, including migrant workers, refugees, and indigenous people—a different sort of big social problem that reopening is only making worse.
The flaw in the linguistic structure of the internet is that tech platforms have been primarily supporting around 30-100 major, wealthier languages—figures that haven’t increased significantly since I started tracking them in 2016 while writing Because Internet. The potential is that distributed networks of translators, both professional and volunteer, have been able to make Covid information available in over 500 languages within a few months. In the early days of the web, it may have been justified to assume that internet users were all comfortable in a few dominant languages, but that situation has demonstrably changed—grassroots efforts have, in a few months, created resources in almost twice as many languages as Wikipedia has in 19 years, in almost five times as many languages as Google Translate has in 14. These numbers demonstrate that sufficient numbers of speakers are reachable via the internet for way more languages than Silicon Valley typically supports—and tech platforms need to figure out how to catch up to this new reality. People deserve full linguistic access to more than just Covid PSAs.
In the long run, Translators Without Borders aims to help with this tech problem too, through a project known as the Translation Initiative for Covid-19 (TICO-19). TWB is working with researchers at Carnegie Mellon and a who’s who of major tech companies including Microsoft, Google, Facebook, and Amazon (though with the notable exception of Apple) to translate Covid-related materials in 36 languages through these companies’ networks of translators (and on their dimes). The next stage will be to repurpose this newly translated material as training data—the massive amounts of text and recordings needed in each language as raw materials for tools like machine translation and automatic speech recognition.
It’s not 500, and it’s not even TWB’s longer list of 89, but every piece helps. “I just wish,” says Antonis Anastasopoulos, a postdoc at CMU who’s working on TICO-19, “that all of these other great initiatives releasing translations in underrepresented languages would also release their data in open-licensed, plain-text form, alongside those PDFs or image files that are easy to share on social media but hard for machines to read.”
Here again, pre-existing relationships are critical—TICO-19 was able to spin up so quickly because Translators Without Borders had been working on a similar, smaller project since 2017 under the name Gamayun, working with tech companies to translate materials in 10 key underrepresented languages and repurpose them as training data, to get tech product support in key languages like Kanuri (for internally displaced people in northeast Nigeria) and Rohingya (for Rohingya refugees in Bangladesh).
Just as our best efforts at fighting the virus are a whole bunch of small, unglamorous decisions by many people—staying home, washing hands, painstakingly testing vaccine candidates—the same thing is true on the communications side. There’s still a role for tech—farming out poster templates and video scripts to translators, keeping track of which languages are up to date so that effort isn’t duplicated, sending posters and videos through family WhatsApp groups. All this would have been impossible in a pre-internet era, especially with social distancing. But these efforts rely on humble, human-mediated tools like shared spreadsheets and email lists and phone cameras, not whiz-bang artificial intelligence swooping in to save the day.
The historian and novelist Ada Palmer has pointed out that this is the first pandemic in human history where we’ve had an understanding of diseases and hygiene, where we’ve actually known what we needed to do to hold it off for long enough to develop a vaccine, making social distancing a realistic strategy, even as it upends all our lives. This is also, therefore, the first pandemic in human history where we have the power and the responsibility to share this understanding, a network of linguistic care which ultimately spans every corner of the globe.
Photographs: John Moore/Getty Images; Alberto Pizzoli/Getty Images