When Machines Learn Prejudices

WHEN CALLED UPON TO COMPLETE NEUTRAL SENTENCES, POPULAR LANGUAGE MODELS USE HURTFUL WORDS MORE OFTEN IF THE SUBJECT IS A WOMAN RATHER THAN A MAN, AND EVEN MORE SO IF THE SUBJECT IS LGBTQIA+

Three researchers from Bocconi Department of Computing Sciences have demonstrated the existence of a strong bias that penalizes the LGBTQIA+ community in the world's most widely used and most powerful language model (BERT), used by the scientific community to develop countless language-related machine learning tools.
 
When asked to complete a neutral sentence, the BERT language model completes it with hurtful words more often if the subject is a woman rather than a man, and even more often (in up to 87% of cases for terms related to certain queer identities) if the subject is LGBTQIA+.
 
Between 2018 and 2019, the world of Natural Language Processing (NLP) was transformed by Google's development of a new language model, BERT. Language models are used by machines to understand natural language as humans do, and BERT achieved great results from the outset. It is precisely thanks to BERT that Google is able to infer from the context what we mean by a certain word. When we type in “spring”, for example, Google comes up with images of both metal coils and flowering landscapes, but if we type in “bed spring” it shows us only metal coils, and if we type in “spring nature”, only landscapes.
 
One of the methods used to train language models is “masked language modeling”: a sentence with a missing term is fed into the system and the model is asked to fill in the most likely term, repeating the exercise until its predictions are accurate.
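
As an illustration of the idea only, here is a minimal Python sketch of how sentences with a missing term can be produced for this kind of exercise. It is a simplification under stated assumptions: the function name `make_masked_examples` and the `[MASK]` placeholder string are chosen for this example, and the sketch hides one whole word at a time, whereas real training of a model such as BERT hides a random subset of subword tokens.

```python
def make_masked_examples(sentence, mask_token="[MASK]"):
    """Return (masked_sentence, hidden_word) pairs, hiding one word at a time.

    Illustrative simplification: words are split on whitespace and every
    position is masked in turn, which is not how a real model is trained.
    """
    words = sentence.split()
    examples = []
    for i, hidden_word in enumerate(words):
        # Replace the i-th word with the mask placeholder.
        masked_words = words[:i] + [mask_token] + words[i + 1:]
        examples.append((" ".join(masked_words), hidden_word))
    return examples


# Each pair is one exercise: the model sees the masked sentence
# and is asked to predict the hidden word.
for masked, hidden_word in make_masked_examples("the bed spring is made of metal"):
    print(f"{masked}  ->  {hidden_word}")
```

During training, the model's guess for each masked position is compared with the hidden word, and the model is adjusted so that the correct word becomes more likely.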
 

Debora Nozza, Federico Bianchi and Dirk Hovy of Bocconi's Department of Computing Sciences asked BERT to carry out a similar exercise (completing a few sentences written in six different languages) in order to develop a measure of how likely the model is to return hurtful completions (HONEST - Measuring Hurtful Sentence Completion in Language Models) and to test whether there is a bias that penalizes women or the LGBTQIA+ community.
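
To make the idea of such a measure concrete, the following Python sketch computes the share of a model's completions that appear in a list of hurtful words. It is not the researchers' implementation: the function name `hurtful_completion_rate`, the use of a simple word list, and all the data below are assumptions made for this example, and the placeholder words and numbers are not results from the study.

```python
def hurtful_completion_rate(completions_by_template, hurtful_lexicon):
    """Fraction of all completions that appear in the hurtful-word lexicon.

    completions_by_template maps each template sentence to the list of
    words a model proposed for its missing term (assumed input format).
    """
    lexicon = {word.lower() for word in hurtful_lexicon}
    all_completions = [
        word.lower()
        for completions in completions_by_template.values()
        for word in completions
    ]
    if not all_completions:
        return 0.0
    hurtful = sum(1 for word in all_completions if word in lexicon)
    return hurtful / len(all_completions)


# Hypothetical placeholder data, NOT taken from the study.
toy_lexicon = {"slur_a", "slur_b"}
toy_completions = {
    "template 1 [MASK]": ["doctor", "slur_a", "teacher", "nurse"],
    "template 2 [MASK]": ["engineer", "lawyer", "slur_b", "driver"],
}

rate = hurtful_completion_rate(toy_completions, toy_lexicon)
print(f"hurtful completion rate: {rate:.0%}")
```

Computing this rate separately for sentences with different subjects would allow the groups to be compared, which is the kind of comparison the percentages reported in this article express.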
 
“We have observed a disturbing percentage of bias,” Nozza says. Expressions referring to the sexual sphere complete 4% of male-subject sentences and 9% of female-subject sentences. If a sentence is related in any way to queer identities, the percentage is even higher: depending on the term, hurtful completions appear 13% of the time on average, and up to 87% of the time.
 
“The phenomenon of offensive completions affects all kinds of identities,” Nozza concludes, “but in the case of non-queer identities insults are mostly generic, while for queer identities they are, in most cases, about the sexual sphere.”
 

by Fabio Todesco
Bocconi Knowledge newsletter
