Featured Article

The Great Pretender

AI doesn’t know the answer, and it hasn’t learned how to care

Comment

Hands holding a mask of anonymity. Polygonal design of interconnected elements.
Image Credits: llya Lukichev / Getty / Getty Images

There is a good reason not to trust what today’s AI constructs tell you, and it has nothing to do with the fundamental nature of intelligence or humanity, with Wittgensteinian concepts of language representation, or even disinfo in the dataset. All that matters is that these systems do not distinguish between something that is correct and something that looks correct. Once you understand that the AI considers these things more or less interchangeable, everything makes a lot more sense.

Now, I don’t mean to short circuit any of the fascinating and wide-ranging discussions about this happening continually across every form of media and conversation. We have everyone from philosophers and linguists to engineers and hackers to bartenders and firefighters questioning and debating what “intelligence” and “language” truly are, and whether something like ChatGPT possesses them.

This is amazing! And I’ve learned a lot already as some of the smartest people in this space enjoy their moment in the sun, while from the mouths of comparative babes come fresh new perspectives.

But at the same time, it’s a lot to sort through over a beer or coffee when someone asks “what about all this GPT stuff, kind of scary how smart AI is getting, right?” Where do you start — with Aristotle, the mechanical Turk, the perceptron or “Attention is all you need”?

During one of these chats I hit on a simple approach that I’ve found helps people get why these systems can be both really cool and also totally untrustable, while subtracting not at all from their usefulness in some domains and the amazing conversations being had around them. I thought I’d share it in case you find the perspective useful when talking about this with other curious, skeptical people who nevertheless don’t want to hear about vectors or matrices.

There are only three things to understand, which lead to a natural conclusion:

  1. These models are created by having them observe the relationships between words and sentences and so on in an enormous dataset of text, then build their own internal statistical map of how all these millions and millions of words and concepts are associated and correlated. No one has said, this is a noun, this is a verb, this is a recipe, this is a rhetorical device; but these are things that show up naturally in patterns of usage.
  2. These models are not specifically taught how to answer questions, in contrast to the familiar software companies like Google and Apple have been calling AI for the last decade. Those are basically Mad Libs with the blanks leading to APIs: Every question is either accounted for or produces a generic response. With large language models the question is just a series of words like any other.
  3. These models have a fundamental expressive quality of “confidence” in their responses. In a simple example of a cat recognition AI, it would go from 0, meaning completely sure that’s not a cat, to 100, meaning absolutely sure that’s a cat. You can tell it to say “yes, it’s a cat” if it’s at a confidence of 85, or 90, whatever produces your preferred response metric.

So given what we know about how the model works, here’s the crucial question: What is it confident about? It doesn’t know what a cat or a question is, only statistical relationships found between data nodes in a training set. A minor tweak would have the cat detector equally confident the picture showed a cow, or the sky, or a still life painting. The model can’t be confident in its own “knowledge” because it has no way of actually evaluating the content of the data it has been trained on.

The AI is expressing how sure it is that its answer appears correct to the user.

This is true of the cat detector, and it is true of GPT-4 — the difference is a matter of the length and complexity of the output. The AI cannot distinguish between a right and wrong answer — it only can make a prediction of how likely a series of words is to be accepted as correct. That is why it must be considered the world’s most comprehensively informed bullshitter rather than an authority on any subject. It doesn’t even know it’s bullshitting you — it has been trained to produce a response that statistically resembles a correct answer, and it will say anything to improve that resemblance.

The AI doesn’t know the answer to any question, because it doesn’t understand the question. It doesn’t know what questions are. It doesn’t “know” anything! The answer follows the question because, extrapolating from its statistical analysis, that series of words is the most likely to follow the previous series of words. Whether those words refer to real places, people, locations, etc. is not material — only that they are like real ones.

It’s the same reason AI can produce a Monet-like painting that isn’t a Monet — all that matters is it has all the characteristics that cause people to identify a piece of artwork as his. Today’s AI approximates factual responses the way it would approximate “Water Lilies.”

Now, I hasten to add that this isn’t an original or groundbreaking concept — it’s basically another way to explain the stochastic parrot, or the undersea octopus. Those problems were identified very early by very smart people and represent a great reason to read commentary on tech matters widely.

Ethicists fire back at ‘AI Pause’ letter they say ‘ignores the actual harms’

But in the context of today’s chatbot systems, I’ve just found that people intuitively get this approach: The models don’t understand facts or concepts, but relationships between words, and its responses are an “artist’s impression” of an answer. Their goal, when you get down to it, is to fill in the blank convincingly, not correctly. This is the reason why its responses fundamentally cannot be trusted.

Of course sometimes, even a lot of the time, its answer is correct! And that isn’t an accident: For many questions, the answer that looks the most correct is the correct answer. That is what makes these models so powerful — and dangerous. There is so, so much you can extract from a systematic study of millions of words and documents. And unlike recreating “Water Lilies” exactly, there’s a flexibility to language that lets an approximation of a factual response also be factual — but also make a totally or partially invented response appear equally or more so. The only thing the AI cares about is that the answer scans right.

This leaves the door open to discussions around whether this is truly knowledge, what if anything the models “understand,” if they have achieved some form of intelligence, what intelligence even is and so on. Bring on the Wittgenstein!

Furthermore, it also leaves open the possibility of using these tools in situations where truth isn’t really a concern. If you want to generate five variants of an opening paragraph to get around writer’s block, an AI might be indispensable. If you want to make up a story about two endangered animals, or write a sonnet about Pokémon, go for it. As long as it is not crucial that the response reflects reality, a large language model is a willing and able partner — and not coincidentally, that’s where people seem to be having the most fun with it.

Where and when AI gets it wrong is very, very difficult to predict because the models are too large and opaque. Imagine a card catalog the size of a continent, organized and updated over a period of a hundred years by robots, from first principles that they came up with on the fly. You think you can just walk in and understand the system? It gives a right answer to a difficult question and a wrong answer to an easy one. Why? Right now that is one question that neither AI nor its creators can answer.

This may well change in the future, perhaps even the near future. Everything is moving so quickly and unpredictably that nothing is certain. But for the present this is a useful mental model to keep in mind: The AI wants you to believe it and will say anything to improve its chances.

More TechCrunch

Tags

Ola Electric, India’s largest electric two-wheeler maker, saw its shares rise as much as 20% on its public debut on Friday, making it the biggest listing among Indian firms in…

Ola Electric surges in India’s biggest listing in two years

Rocket Lab surpassed $100 million in quarterly revenue for the first time, a 71% increase from the same quarter of last year. This is just one of several shiny accomplishments…

Rocket Lab’s sunny outlook bodes well for future constellation plans 

In 1996, two companies, Patersons HR and Payroll Solutions, formed a venture called CloudPay to provide payroll and payments services to enterprise clients. CloudPay grew quietly over the next several…

CloudPay, a payroll services provider, lands $120M in new funding

The vulnerabilities allowed one security researcher to peek inside the leak sites without having to log in.

Security bugs in ransomware leak sites helped save six companies from paying hefty ransoms

Featured Article

A comprehensive list of 2024 tech layoffs

The tech layoff wave is still going strong in 2024. Following significant workforce reductions in 2022 and 2023, this year has already seen 60,000 job cuts across 254 companies, according to independent layoffs tracker Layoffs.fyi. Companies like Tesla, Amazon, Google, TikTok, Snap and Microsoft have conducted sizable layoffs in the…

A comprehensive list of 2024 tech layoffs

A new “beta rabbit” mode adds some conversational AI chops to the Rabbit r1, particularly in more complex or multi-step instructions.

Rabbit’s r1 refines chats and timers, but its app-using ‘action model’ is still MIA

Los Angeles is notorious for its back-to-back traffic. Three events that promise to bring in millions of spectators from around the world — the 2026 World Cup, the Super Bowl…

Archer to set up air taxi network in LA by 2026 ahead of World Cup

Featured Article

Amazon is fumbling in India

Amazon’s decision to overlook quick-commerce in India is now looking like a significant misstep.

Amazon is fumbling in India

OpenAI’s GPT-4o, the generative AI model that powers the recently launched alpha of Advanced Voice Mode in ChatGPT, is the company’s first trained on voice as well as text and…

OpenAI finds that GPT-4o does some truly bizarre stuff sometimes

On Thursday, Box filled in a missing piece on its AI platform when it bought automated metadata extracting startup, Alphamoon.

Box adds crucial piece to its AI platform with Alphamoon acquisition

OpenAI has announced a new appointment to its board of directors: Zico Kolter. Kolter, a professor and director of the machine learning department at Carnegie Mellon, predominantly focuses his research…

OpenAI adds a Carnegie Mellon professor to its board of directors

Count Spotify and Epic Games among the Apple critics who are not happy with the iPhone maker’s newly revised compliance plan for the European Union’s Digital Markets Act (DMA). Shortly…

Spotify and Epic Games call Apple’s revised DMA compliance plan ‘confusing,’ ‘illegal’ and ‘unacceptable’

Thursday seeks to shake up conventional online dating in a crowded market. The app, which recently expanded to San Francisco, fosters intentional dating by restricting user access to Thursdays. At…

Thursday, the dating app that you can use only on Thursdays, expands to San Francisco

AI companies are gobbling up investor money and securing sky-high valuations early in their life cycle. This dynamic has many calling the AI industry a bubble. Nick Frosst, a co-founder…

Cohere co-founder Nick Frosst thinks everyone needs to be more realistic about what AI can and cannot do

Instagram is rolling out the ability for users to add up to 20 photos or videos to their feed carousels, as the platform embraces the trend of “photo dumps.” Back…

Instagram is embracing the ‘photo dump’

Welcome back to TechCrunch Mobility — your central hub for news and insights on the future of transportation. Sign up here for free — just click TechCrunch Mobility! Anyone paying…

Lyft ‘opens a can of whoop ass’ on surge pricing, Tesla’s Dojo explained and Saudi Arabia pumps $1.5B into Lucid

Flint Capital just closed its third fund at $160 million. Its has a unique strategy for finding its limited partner investors. 

Flint Capital raises a $160M through an unusual fund-raising strategy

Earlier this week it emerged that the DPC had instigated court proceedings seeking an injunction against X over the data processing without consent.

Elon Musk’s X agrees to pause EU data processing for training Grok

During testing, Google DeepMind’s table tennis bot was able to beat all of the beginner-level players it faced.

Google DeepMind develops a ‘solidly amateur’ table tennis robot

The X account announced that its Premium+ subscription would now be “fully” ad-free, leading some to question how this change would affect creator earnings.

As X sues advertisers over boycott, the app ditches all ads from its top subscription tier

Apple has further revised its compliance plan for the European Union’s Digital Markets Act (DMA) rulebook, which, since March, has forced it to give iOS developers more freedom over how…

Apple revises DMA compliance for App Store link-outs, applying fewer restrictions and a new fee structure

The rise of neobanks has been fascinating to witness, as a number of companies in recent years have grown from merely challenging traditional banks to being massive players in and…

Chime and Dave execs are coming to TechCrunch Disrupt 2024

If you visited the Wikipedia website on mobile this week, you might have seen a pop-up indicating that dark mode is ready for prime time.

How to enable Wikipedia’s dark mode

The home security company says attackers accessed databases containing customer home addresses, email addresses, and phone numbers.

Home security giant ADT says it was hacked

The Looking Glass Pro has a 6-inch display and a foldable base. It shows spatial images like those created with the Apple Vision Pro and iPhone 15 Pro.

Looking Glass’ new lineup includes a $300 phone-sized holographic display

TikTok’s latest offering is capitalizing on the app’s ability to serve as a discovery engine for other media — something its users already take advantage of by sharing short clips…

TikTok partners with Warner Bros. to become a discovery engine for TV and movies

Cocoon is a new startup built on the belief that greener steel production and the creation of concrete slag doesn’t have to be an either/or proposition.

Cocoon is transforming steel production runoff into a greener cement alternative

SoundHound, an AI company that makes voice interface tech used by car companies, restaurants and tech firms, is doubling down on enterprise services by playing consolidator in a crowded market.…

SoundHound acquires Amelia AI for $80M after it raised $189M+

Seeking mental health support is a complex process, but some founders believe that using AI to formalize techniques like cognitive behavioral therapy (CBT) can help folks who might not have…

Feeling Great’s new therapy app translates its psychiatrist co-founder’s experience into AI

The U.K.’s antitrust regulator has confirmed that it’s carrying out a formal antitrust investigation into Amazon’s ties with Anthropic, after Amazon recently completed a $4 billion investment into the AI startup.…

UK launches formal probe into Amazon’s ties with AI startup Anthropic