Gender Representation in Large Language Models: A Cross-Linguistic and Cross-Model Analysis

  • Aleksandra Urman
  • Emese Domahidi
  • Johannes B. Gruber
  • Ana Jovančević
  • Michaela Maier
  • Dren Gërguri
  • Jaromír Mazak
  • Mariken van der Velden
  • University of Zurich
  • Ilmenau University of Technology
  • Leibniz Institute for the Social Sciences
  • University of Kaiserslautern
  • University of Prishtina “Hasan Prishtina”
  • STEM institute
  • Vrije Universiteit Amsterdam

Research output: Contribution to journal; Article; peer-reviewed

Abstract

The representation of gender in large language models (LLMs) can reflect and reinforce existing sociocultural inequalities. However, the nature of such gender biases can differ significantly across languages, influenced by linguistic features and a model’s training data. In this study, we investigate gender representation in 24 open-weight LLMs across six linguistically distinct languages (English, German, Russian, Czech, Albanian, and Serbian). Extending beyond binary frameworks, we incorporate nonbinary individuals as response options and examine associations across psychometrically validated stereotype dimensions (agency, communality, dominance, weakness, and giftedness). Our analysis accounts for variations between and within model families and differences in sampling parameters. The results reveal that traditional gender stereotypes persist with varying degrees of strength, while nonbinary associations show substantial cross-linguistic variations. Temperature analysis demonstrates that such associations are deeply embedded in model parameters rather than being artifacts of sampling procedures. These findings suggest that gender bias identification and potential mitigation in LLMs are shaped by both contextual and technical factors. Overall, our findings challenge the notion that gender bias is a simple, measurable construct, highlighting its complex, context-dependent nature across languages, models, and stereotype dimensions. Effective bias mitigation requires interventions at the level of training data, model architecture, or alignment procedures.

Original language: English
Pages (from-to): 1-39
Number of pages: 39
Journal: Computational Communication Research
Volume: 8
Issue number: 2
Publication status: Published - 1 Feb 2026

Keywords

  • Bias
  • Gender
  • Generative AI
  • LLM
  • Multilingual
