Since the page didn't load for me several times, and the title is ambiguous, here's the Abstract: Large language models (LLMs) have recently made vast advances in both generating and analyzing textual data. Technical reports often compare LLMs’ outputs with “human” performance on various tests. Here, we ask, “Which humans?” Much of the existing literature largely ignores the fact that humans are a cultural species with substantial psychological diversity around the globe that is not fully captured by the textual data on which current LLMs have been trained. We show that LLMs’ responses to psychological measures are an outlier compared with large-scale cross-cultural data, and that their performance on cognitive psychological tasks most resembles that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines rapidly as we move away from these populations (r = -.70). Ignoring cross-cultural diversity in both human and machine psychology raises numerous scientific and ethical issues. We close by discussing ways to mitigate the WEIRD bias in future generations of generative language models.
As an aside: Last year a student of mine (we're at a U.S. college) told me that his teenage cousins back in Mongolia were all learning English in order to use ChatGPT.
This was submitted 30 months ago. Still interesting. I'd be curious whether this got 'worse' or 'better' with newer models.
2023, and using some kind of in-page pdf reader from 2002
It’s 2025 today, Andy :)
/usr/bin/humans, presumably
is it a steppe up?