Study: AI-Generated Reviews Deceive Humans and Detectors, Eroding Trust in Online Platforms

Technology April 28, 2024


A recent study by Yale School of Management professor Balázs Kovács reveals that AI-generated restaurant reviews can pass the “Turing test” by fooling both human readers and AI detectors, potentially undermining the credibility of online reviews.

SuchScience reports that online reviews have become integral to consumer decision-making, with most consumers relying on them to make informed choices. However, sophisticated AI language models now threaten the trustworthiness of these reviews. Professor Kovács ran two experiments, with a diverse group of 301 participants split between them, to investigate whether AI-generated reviews can deceive human readers and AI detectors.

In Study 1, participants were shown a mix of real Yelp reviews and AI-generated counterparts created by OpenAI’s GPT-4. Shockingly, they correctly identified the source only about 50 percent of the time, no better than random chance. Study 2, in which GPT-4 created entirely fictional reviews, yielded even more striking results: participants classified AI-generated reviews as human-written 64 percent of the time.
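To make "no better than random chance" concrete, here is a minimal sketch of an exact two-sided binomial test against a 50 percent guessing rate. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the study's own statistical analysis may differ.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value under a null success rate p.

    Sums the probability of every outcome that is no more likely
    than the observed one.
    """
    pmf = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # The small tolerance keeps symmetric outcomes from being dropped
    # because of floating-point rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts, NOT the study's data: 100 judgments each.
at_chance = binomial_two_sided_p(50, 100)    # half of the judgments correct
off_chance = binomial_two_sided_p(64, 100)   # a 64-to-36 split

print(f"50 of 100: p = {at_chance:.3f}")
print(f"64 of 100: p = {off_chance:.4f}")
```

With these illustrative counts, a 50-of-100 result is fully consistent with guessing, while a 64-to-36 split would be unlikely under pure coin-flipping.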

Kovács also tested leading AI detectors designed to distinguish human-written from AI-generated text. He fed 102 of the reviews to Copyleaks, a publicly available AI-text recognition tool, which labeled every one of them as human-written and so failed to identify the AI-generated content. Even GPT-4 could not reliably distinguish human-written reviews from its own AI-generated ones when asked to rate, on a scale from 0 to 100, how likely each review was to be AI-generated.
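One common way to judge whether 0-to-100 likelihood ratings separate two groups is the area under the ROC curve (AUC): the probability that a randomly chosen AI-generated review gets a higher rating than a randomly chosen human-written one. The sketch below is an assumption about how such ratings could be evaluated, not the method the study reports, and the scores are invented for illustration.

```python
def auc(human_scores: list[float], ai_scores: list[float]) -> float:
    """Probability that an AI-generated review is rated higher than a
    human-written one, counting ties as half a win.

    1.0 means perfect separation; 0.5 means the ratings carry no signal.
    """
    wins = 0.0
    for ai in ai_scores:
        for human in human_scores:
            if ai > human:
                wins += 1.0
            elif ai == human:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


# Hypothetical ratings, NOT taken from the study.
human_ratings = [20, 35, 50, 65]
ai_ratings = [25, 35, 45, 60]

print(f"AUC = {auc(human_ratings, ai_ratings):.2f}")
```

A detector whose AUC sits near 0.5 is effectively guessing, which is the sense in which a rater "cannot reliably distinguish" the two kinds of review.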

The findings have far-reaching implications for review platforms, businesses, and consumers. Unscrupulous actors could exploit AI to generate fake reviews, eroding trust in online platforms and disproportionately affecting small businesses that rely heavily on genuine reviews. The study serves as a wake-up call for review platforms to rethink their authentication mechanisms and for policymakers to consider regulatory action to enforce transparency.