Are LLMs smarter in some languages than others?

Track:
PyData: LLMs
Type:
Poster
Level:
intermediate
Room:
Exhibit Hall
Start:
13:00 on 11 July 2024
Duration:
60 minutes

Abstract

Have you ever asked yourself if Large Language Models (LLMs) perform differently across various languages? I have.

In this poster session, I will show how tokens, embeddings, and the LLMs themselves behave across 30 different languages, and how the choice of language affects pricing and other model characteristics.

Spoiler:

  • Greek is the most expensive language to process for most models.
  • Asian languages are cheaper to process on Gemini.
  • You can save up to 15% of tokens by removing diacritics.
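The diacritics trick can be sketched with the standard library alone. A minimal example (the actual token savings depend on the model's tokenizer; the 15% figure is the poster's measurement, not something this snippet proves):

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Remove combining diacritical marks (e.g. Czech háčky and čárky).

    NFD decomposition splits accented letters into a base letter plus
    combining marks; dropping the marks leaves plain ASCII-like text.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Stripped text tends to tokenize into fewer tokens, because BPE
# vocabularies typically contain far more ASCII subwords than accented ones.
print(strip_diacritics("Příliš žluťoučký kůň úpěl ďábelské ódy"))
# → Prilis zlutoucky kun upel dabelske ody
```

To measure the savings for a given model, you would compare token counts of the original and stripped text with that model's tokenizer (e.g. `tiktoken` for OpenAI models).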

Resources


The speaker

Pavel Král

Pavel is a Python developer specializing in LLM integrations built on Django and Kubernetes, with a focus on their practical use and measurable benefits. He presented at Prague Python Pizza, where he talked about how LLMs work with different languages, and he will also present at PyConSK.