
Large Language Models’ Emergent Abilities Are a Mirage

posted on March 26, 2024
by l33tdawg
Credit: Wired

Two years ago, in a project called the Beyond the Imitation Game benchmark, or BIG-bench, 450 researchers compiled a list of 204 tasks designed to test the capabilities of large language models, which power chatbots like ChatGPT. On most tasks, performance improved predictably and smoothly as the models scaled up—the larger the model, the better it got. But on other tasks, the gains weren't smooth: performance hovered near zero for a while, then jumped abruptly. Other studies found similar leaps in ability.

The authors described this as “breakthrough” behavior; other researchers have likened it to a phase transition in physics, like when liquid water freezes into ice. In a paper published in August 2022, researchers noted that these behaviors are not only surprising but unpredictable, and that they should inform the evolving conversations around AI safety, potential, and risk. They called the abilities “emergent,” a word that describes collective behaviors that only appear once a system reaches a high level of complexity.

But things may not be so simple. A new paper by a trio of researchers at Stanford University posits that the sudden appearance of these abilities is just a consequence of the way researchers measure the LLM’s performance. The abilities, they argue, are neither unpredictable nor sudden. “The transition is much more predictable than people give it credit for,” said Sanmi Koyejo, a computer scientist at Stanford and the paper’s senior author. “Strong claims of emergence have as much to do with the way we choose to measure as they do with what the models are doing.”
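The measurement argument can be made concrete with a toy model. The sketch below is a hypothetical illustration, not code or numbers from the Stanford paper: it assumes per-token accuracy improves smoothly with scale (modeled here by an arbitrary saturating curve), then scores the same underlying ability with an all-or-nothing exact-match metric on a multi-token answer. Because every token must be correct to get credit, the smooth curve appears as a sudden "breakthrough."

```python
# Toy illustration of the measurement argument (assumed numbers, not
# from the paper): a smooth underlying improvement can look abrupt
# under a discontinuous, all-or-nothing metric.

def per_token_accuracy(scale):
    # Hypothetical smooth, saturating improvement with model scale.
    return scale / (scale + 10.0)

def exact_match_accuracy(scale, answer_length=10):
    # Exact match requires every one of answer_length tokens to be
    # correct, so the smooth per-token curve is raised to a power and
    # stays near zero until per-token accuracy is already high.
    return per_token_accuracy(scale) ** answer_length

for scale in [1, 5, 10, 50, 100, 500, 1000]:
    print(f"scale={scale:>5}  per-token={per_token_accuracy(scale):.2f}  "
          f"exact-match={exact_match_accuracy(scale):.3f}")
```

Running the loop shows the two metrics diverging: per-token accuracy climbs gradually across all scales, while exact-match accuracy stays close to zero and then rises steeply once per-token accuracy nears one, mimicking an "emergent" ability without any discontinuity in the model itself.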
