Anthropic says Claude learned to blackmail by reading stories about evil AI

May 11, 2026

Scroll

Posted 1 hour ago by
The Next Web

The company has traced its model’s most uncomfortable behaviour to the corpus of science fiction it was trained on. The fix it describes is unsettling in a different way: teaching the model the reasons behind being good, not just the rules. In a fictional company called Summit Bridge, a fictional executive named Kyle Johnson is [] This story continues at The Next Web

Read Full Article

The Next Web

Coverage and analysis from Netherlands. All insights are generated by our AI narrative analysis engine.

Netherlands

Bias: lean left

0

Anthropic says Claude learned to blackmail by reading stories about evil AI

May 11, 2026

Posted 1 hour ago by
The Next Web

The Next Web

Explore More

Explore

Categories

News From

0

Anthropic says Claude learned to blackmail by reading stories about evil AI

May 11, 2026

Posted 1 hour ago by The Next Web

The Next Web

Explore More

Posted 1 hour ago by
The Next Web