Bias in language translation

The complexity of human language has always posed challenging problems for computer scientists interested in speech recognition, textual understanding, translation, and natural language generation.
Consider the problem of translation: there are nearly 200 countries in the world and thousands of languages spoken by their citizens. Now that we live in a global economy, we'd love it if computers could at least translate between the five most widely spoken languages.

Before machine learning

The quest for translation algorithms started in the 1960s with Rule-Based Machine Translation. RBMT algorithms rely on a grammar describing the structure of each language plus a dictionary of words. To translate a sentence, they try to parse it based on that language's grammar, convert that grammatical structure to the target language, and translate the words using the dictionary.
Diagram of two parse trees for the same phrase in English and French. The English tree starts with a node labeled "NP" ("Noun Phrase"), which has three child nodes labeled "DET" ("Determiner"), "ADJ" ("Adjective"), and "N" ("Noun"). The "DET" node ends in the word "the", the "ADJ" node ends in the word "red", and the "N" node ends in the word "house". The French tree also starts with a node labeled "NP", which has three child nodes. The first node is "DET", the second node is "N", and the third node is "ADJ". "DET" ends in the word "la", "N" ends in "maison", and "ADJ" ends in "rouge".
Translating a short phrase from English to French with RBMT.
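The parse, reorder, and look-up steps described above can be sketched in a few lines of Python. This is only a toy illustration, not a real RBMT system: the three-word lexicon, the single noun-phrase rule, and the function name `translate_noun_phrase` are all assumptions made for this example, and gender agreement is ignored by hard-coding "la".

```python
# Toy rule-based translation of one English noun phrase into French.
# LEXICON and the single transfer rule are hypothetical, for illustration only.

# Each English word maps to (part of speech, French word).
LEXICON = {
    "the": ("DET", "la"),      # simplification: always the feminine article
    "red": ("ADJ", "rouge"),
    "house": ("N", "maison"),
}

# Grammar rule for each language: the order of parts of speech in a noun phrase.
ENGLISH_NP = ("DET", "ADJ", "N")
FRENCH_NP = ("DET", "N", "ADJ")


def translate_noun_phrase(phrase):
    """Parse an English noun phrase, reorder it, and translate each word."""
    words = phrase.lower().split()
    # Step 1: parse by tagging each word (unknown words raise KeyError).
    tagged = [LEXICON[word] for word in words]
    tags = tuple(tag for tag, _ in tagged)
    if tags != ENGLISH_NP:
        raise ValueError(f"phrase does not match the rule {ENGLISH_NP}: {tags}")
    # Step 2: convert the structure to the French word order.
    french_by_tag = dict(tagged)
    # Step 3: emit the dictionary translations in that order.
    return " ".join(french_by_tag[tag] for tag in FRENCH_NP)


print(translate_noun_phrase("the red house"))  # la maison rouge
```

Even this tiny example hints at the cost of the approach: every new word needs a dictionary entry, and every new sentence shape needs another hand-written rule.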
RBMT algorithms require the work of expert linguists to craft the grammar, yet their translations still fail to capture the complexity of human language. Researchers sought better options.
In the 1990s, computers suddenly had access to much more natural language data. There were millions of digitized textual documents, like books and news articles, and many of them had been translated into multiple languages.
The Harry Potter series has been translated into more than 70 languages, so computers can infer the translation of "owl" just by comparing those many translations.
English: We await your owl by no later than July 31.
Portuguese: Queira enviar-nos a sua coruja até dia 31 de Julho, sem falta.
English: "What does it mean, they want my owl?"
Portuguese: "O que é que quer dizer esperarem a minha coruja?"
All that new data enabled the approach of Statistical Machine Translation. SMT algorithms break a sentence down into smaller segments, look for existing translations of those segments, and propose the most probable translation of the full sentence.
Diagram of a statistically translated sentence. The Spanish phrase "Quiero ver la película" is displayed on top. Underneath "Quiero" are three English phrases "I want", "I love", and "I like". Underneath "ver" are three English infinitives "to see", "to watch", and "to meet". Underneath "la película" are three English phrases "the film", "the movie", and "the motion picture." A line goes from "I want" to "to watch" to "the movie".
Translating a short phrase from Spanish to English with SMT.
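As a rough sketch of that idea, the Python below scores every combination of candidate phrases and keeps the most probable one. All of the numbers are made up for illustration; the names `PHRASE_TABLE`, `FLUENCY`, `score`, and `translate` are hypothetical, and the sentence is assumed to be already split into segments. Real SMT systems learn these probabilities from large collections of translated text and search far more efficiently than trying every combination.

```python
from itertools import product

# Hypothetical phrase table: candidate translations for each Spanish segment,
# with made-up probabilities standing in for statistics learned from data.
PHRASE_TABLE = {
    "quiero": {"I want": 0.7, "I love": 0.2, "I like": 0.1},
    "ver": {"to see": 0.5, "to watch": 0.4, "to meet": 0.1},
    "la película": {"the film": 0.3, "the movie": 0.6, "the motion picture": 0.1},
}

# Hypothetical fluency scores for neighboring English phrases (also made up).
# Pairs that are not listed get a low default score.
FLUENCY = {
    ("I want", "to watch"): 0.9,
    ("I want", "to see"): 0.8,
    ("to watch", "the movie"): 0.9,
    ("to see", "the movie"): 0.6,
    ("to watch", "the film"): 0.5,
    ("to see", "the film"): 0.5,
}
DEFAULT_FLUENCY = 0.1


def score(candidate, segments):
    """Multiply the translation probabilities and the fluency scores."""
    total = 1.0
    for segment, phrase in zip(segments, candidate):
        total *= PHRASE_TABLE[segment][phrase]
    for left, right in zip(candidate, candidate[1:]):
        total *= FLUENCY.get((left, right), DEFAULT_FLUENCY)
    return total


def translate(segments):
    """Try every combination of candidate phrases and keep the best one."""
    options = [list(PHRASE_TABLE[segment]) for segment in segments]
    best = max(product(*options), key=lambda cand: score(cand, segments))
    return " ".join(best)


print(translate(["quiero", "ver", "la película"]))  # I want to watch the movie
```

Notice that "to see" is the most probable translation of "ver" on its own, but the full sentence scores higher with "to watch" because of how well it fits with its neighbors.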
With a small training data set, SMT algorithms produce worse results than RBMT algorithms. However, with big data, SMT algorithms can produce fairly fluent sentences, or at least, fluent phrases within sentences.

The machine learning approach

In recent years, the new algorithm on the block is Neural Machine Translation. NMT is a machine learning algorithm that uses neural networks on enormous amounts of data. When trained well and with enough data, those algorithms can learn how to produce sentences that are fluent from start to finish.
Diagram of a neural network, with circles representing each neuron and lines representing connections between neurons. The network starts on the left with a column of 3 neurons labeled with words from an English phrase: "Let's", "go", and "dancing". Those neurons are connected to another column of 4 neurons, which itself connects to another column of 4, and those neurons are labeled "Hidden layers". The second hidden layer of neurons is connected to a column of 3 neurons labeled with Spanish words: "Vamos", "a", "bailar".

Biased translations

Since NMT is trained on examples from a biased world, it can reflect those biases in its translations. When Google Translate started using NMT, people noticed a bias when translating from non-gendered to gendered languages.¹
For example, here's how it translated four gender-neutral Turkish phrases:
Screenshot of Google Translate UI translating four phrases from Turkish to English. The English phrases are "She is a cook", "He is an engineer", "He is a doctor", and "She is a nurse".
The translation algorithm was simply spitting out the pronoun that was most frequently associated with that profession, not realizing that it had learned a sexist view of the world.
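To see how that can happen, here is a minimal Python sketch that picks whichever pronoun appears most often next to each profession. It is not how Google Translate or any real NMT system is implemented: the counts in `PRONOUN_COUNTS` are invented for this example, and the function names are hypothetical. A neural network does not store a table like this, but it can end up with a similar preference after training on skewed text.

```python
from collections import Counter

# Hypothetical counts of how often each pronoun appears with each profession
# in some training text. These numbers are invented for illustration.
PRONOUN_COUNTS = {
    "cook": Counter({"she": 620, "he": 380}),
    "engineer": Counter({"she": 150, "he": 850}),
    "doctor": Counter({"she": 400, "he": 600}),
    "nurse": Counter({"she": 900, "he": 100}),
}


def article_for(noun):
    """Return "an" before a vowel letter, otherwise "a" (a rough rule)."""
    return "an" if noun[0] in "aeiou" else "a"


def most_frequent_translation(profession):
    """Translate a gender-neutral phrase using the most common pronoun."""
    pronoun, _count = PRONOUN_COUNTS[profession].most_common(1)[0]
    return f"{pronoun.capitalize()} is {article_for(profession)} {profession}"


for profession in PRONOUN_COUNTS:
    print(most_frequent_translation(profession))
```

Nothing in this code mentions stereotypes, yet its output repeats whatever imbalance is present in the counts it was given.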
Google engineers changed the interface to always show translations with both female and male pronouns:
Screenshot of Google Translate UI translating "o bir doktor" from Turkish into English. English translations show both "She is a doctor" and "He is a doctor".
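The idea behind that interface change can be sketched as follows: instead of choosing one pronoun, return every gendered option and let the user decide. This is a hypothetical illustration of the behavior in the screenshot, not Google's actual code, and the function name `gendered_translations` is an assumption.

```python
def gendered_translations(profession):
    """Return both gendered English translations instead of guessing one."""
    # Rough rule for the article: "an" before a vowel letter, otherwise "a".
    article = "an" if profession[0] in "aeiou" else "a"
    return [f"{pronoun} is {article} {profession}" for pronoun in ("She", "He")]


# The Turkish phrase "o bir doktor" does not specify a gender.
print(gendered_translations("doctor"))  # ['She is a doctor', 'He is a doctor']
```

The design choice here is to surface the ambiguity of the source phrase rather than hide it behind a single guess.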
Thanks to machine learning, we can now translate many more complex phrases than ever before, but we also need to remember that the training data contains all the biases of our present and past. The developers of translation systems can look for ways to combat the algorithmic bias, while the users of those systems should look at the results with a critical eye.


Want to join the conversation?

  • JI YONG Ahn
    However, we should also not see everything through the lens of identity politics. The continual provocation of labeling slight nuances as racist, sexist, white supremacist, Nazi, etc. conveniently grabs attention, but it incites anger and impairs judgement, something humans have unfortunately been practicing for too long.

    Fairness is important, and being prone to say "she is a nurse" over "she is a doctor" can be offensive to some women, so the above measure is appropriate. Petty but appropriate. And these things should gradually change.

    However, the statistical disparity does exist, which is why human and machine learning turn out the way they do. You don't get offended when you assume a man to be taller than a woman, though some women are taller than some men. Intelligence and roles are much different from height, but it is dishonest to say there is no difference between men and women, despite significant overlap. And the current tendency to be offended about the status quo is a pendulum that has swung too far. Harnessing that diversity and creating something special from it is lacking in this day and age, as a philosophical groundwork for everything from building machine learning to forming a personal worldview, in my opinion.
    (3 votes)
    • madisonsbuck
      I agree with this sentiment. I feel people should simply "get over it," or decide to generalize by stating "They are a nurse," "They are a doctor," or "One would wonder," like other languages that lack gender specifications. I don't mention this in a political manner, but more so as a simplified answer which could work for most instances based off of how English speakers format sentences.

      Examples: "I told them I was serious." "They didn't know?" "One can question validity."
      (1 vote)