Large Language Models Pass the Turing Test
šĀ https://arxiv.org/pdf/2503.23674
Cameron R. Jones and Benjamin K. Bergen from the Department of Cognitive Science at UC San Diego have published a thought-provoking paper titled "Large Language Models Pass the Turing Test." This research demonstrates that LLMs can consistently pass Alan Turing's test of machine intelligence
Alan Turing's test evaluates whether a machine can imitate a human well enough to be indistinguishable in a text-only conversation. Historically controversial, it has gained renewed importance as a benchmark that is flexible and interactive, unlike static AI evaluations.
Four systems were tested: ELIZA (a 1960s chatbot), GPT-4o, LLaMa-3.1-405B, and GPT-4.5. Models were tested with and without a PERSONA prompt, which instructed them to behave like a shy, internet-savvy young person. This prompt was crucial to their performance.
284 participants engaged in 8 five-minute text conversations, each time with one human and one AI. The interrogators judged which participant was human. The same test was run across two populations (undergraduates and Prolific users), ensuring replicability.
Results:
GPT-4.5 with PERSONA: Chosen as human 73%Ā of the timeābetter than real humans.
LLaMa-3.1 with PERSONA: 56%, sometimes outperforming humans.
Without PERSONA: GPT-4.5 and LLaMa were chosen 36ā38% of the time.
Baselines (ELIZA and GPT-4o): Chosen only ~21ā23% of the time.
Participants relied most on:
Small talkĀ (e.g. hobbies, feelings),
Gut instinctĀ (āfelt more humanā),
Linguistic styleĀ (e.g. slang, typos).
Future versions of the test could extend beyond 5 minutes or use expert interrogators. Despite cultural advantages, even university students were fooled most of the time. The test now reflects social intelligenceĀ and humanlikeness, more than factual knowledge. Interrogators rarely quizzed knowledgeāthey judged personality, emotion, and conversational flow.
These models could act as indistinguishable substitutesĀ for humans in short interactions. This raises ethical and social concernsāfrom job automation to misinformation to undermining real human connections. As AI becomes more humanlike, we might need to redefine and refine our own humanity. The challenge isnāt just for machines to fool usābut for humans to stay recognizably human.




