In English, Machine Translation Makes You Sound Like a Man in His Middle Age


THREE BOCCONI SCHOLARS FOUND AN ALGORITHMIC BIAS IN THE SYSTEMS OF GOOGLE, BING, AND DEEPL, WHEN TRANSLATING FROM SEVERAL EUROPEAN LANGUAGES INTO ENGLISH

Imagine a child raised in a village inhabited only by middle-aged men. For the first ten years of her life, she hears only men in their 60s talking about work, books, sports, health, and money. What kind of weird language do you think she will speak when she leaves the village?
 
Something similar happens to the most common machine translation systems, according to a new study by Dirk Hovy, an Associate Professor of Computer Science at Bocconi, and two Postdoctoral Researchers in his lab, Federico Bianchi and Tommaso Fornaciari. To train a translation system based on machine learning, you feed it large amounts of text and let it learn from those examples. If you feed it documents mostly written by middle-aged men, its translations will also sound as if men in that age bracket wrote them.
 
The study analyzes the English translations from four European languages (German, French, Italian, and Dutch) produced by three commercial systems (Google Translate, DeepL, and Bing). The research team fed both the original texts and their translations to other machine learning systems, called classifiers, trained to predict the age and gender of a writer.
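
The comparison described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the label lists are invented stand-ins for what trained gender classifiers might return for the same set of texts before and after translation.

```python
from typing import Sequence


def label_share(predictions: Sequence[str], label: str) -> float:
    """Return the fraction of predictions equal to `label`."""
    if not predictions:
        raise ValueError("no predictions to summarize")
    return sum(1 for p in predictions if p == label) / len(predictions)


def shift_in_share(
    original_preds: Sequence[str],
    translated_preds: Sequence[str],
    label: str,
) -> float:
    """Change in the predicted share of `label` after translation."""
    return label_share(translated_preds, label) - label_share(original_preds, label)


# Invented classifier outputs for eight texts (assumption, not study data).
original_preds = ["female", "male", "female", "male",
                  "female", "male", "female", "male"]
translated_preds = ["female", "male", "male", "male",
                    "female", "male", "male", "male"]

shift = shift_in_share(original_preds, translated_preds, "male")
print(f"Shift in predicted male share: {shift:+.2f}")
```

A positive shift means the translations are classified as male-authored more often than the originals. The same helper works for age brackets by passing a bracket label instead of "male".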
 
Starting from a dataset split evenly between female and male authors, the classifiers mispredicted the gender distribution in the English translations. Depending on the commercial system used, they judged 63-64% of the translations from German to be written by men, along with 57-64% of the translations from Italian and 52-55% of those from Dutch. French texts are translated rather faithfully in terms of gender, with only 51% of the translations sounding male.
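
To make the size of these shifts explicit, the short sketch below expresses each reported range as percentage points above the 50% baseline. The figures are the ones quoted in this article; the variable and function names are illustrative assumptions.

```python
BASELINE = 50.0  # the dataset is split evenly between female and male authors

# Reported share of English translations classified as male-authored (low, high), in %.
reported_male_share = {
    "German": (63.0, 64.0),
    "Italian": (57.0, 64.0),
    "Dutch": (52.0, 55.0),
    "French": (51.0, 51.0),
}


def excess_over_baseline(low: float, high: float, baseline: float = BASELINE):
    """Return the (low, high) range as percentage points above the baseline."""
    return (low - baseline, high - baseline)


excess = {
    language: excess_over_baseline(low, high)
    for language, (low, high) in reported_male_share.items()
}

for language, (low, high) in excess.items():
    print(f"{language}: +{low:.0f} to +{high:.0f} percentage points")
```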
 
In the opposite direction, translating from English to other languages, the authors find mixed evidence, with French and German translations sounding more female, and Italian and Dutch more male.
 
Lastly, the authors repeated the exercise to predict age brackets. Here, machine translation systems made English translations disproportionately sound as if people in their 60s wrote them.
 
What Professor Hovy and his colleagues singled out is a previously unknown algorithmic bias. “These systems are now commonly used to translate a wide range of documents,” says Hovy, “thus normalizing and reinforcing stereotypes. Going back to the child raised in an all-middle-aged-men village: it’s not only that she would sound weird to the rest of the world, but also that the rest of the world (women, young people, and so on) would sound somehow wrong to her.”
 
“Until very recently, the style of the text wasn’t an issue in machine translation. Researchers were concerned with producing a good translation in terms of content,” Professor Hovy concludes. “There was a lack of awareness of the bias and its consequences. However, my understanding is that they are now trying to address the issue.”
 
Dirk Hovy, Federico Bianchi, Tommaso Fornaciari, "'You Sound Just Like Your Father'. Commercial Machine Translation Systems Include Stylistic Biases", in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020.
 

by Fabio Todesco
Bocconi Knowledge newsletter
