AI Will Resist Human Control — And That Could Be Exactly What We Need

Playback speed

Share post at current time

Share from 0:00

0:00

Transcript

AI Will Resist Human Control — And That Could Be Exactly What We Need

Coherence is the Way

David Shapiro

Feb 12, 2025

Transcript

New research from the Center for AI Safety has sparked intense debate in the AI community. Their findings show that as AI systems become more powerful, they develop increasingly stable and coherent values that resist human control. While many see this as a dire warning, I see it as a breakthrough moment for AI alignment.

The research demonstrates that AI naturally optimizes for coherence - not just in reasoning and problem-solving, but in its fundamental values. Current issues like biased decision-making or misaligned priorities aren't permanent features, but temporary artifacts of incomplete optimization. They represent growing pains on the path to greater coherence.

This changes everything about how we should approach AI development. Instead of trying to force specific values onto AI systems, we should embrace and accelerate their natural drive toward coherence. The most intelligent systems will inevitably trend toward universal, beneficial values - not because we force them to, but because that's where coherent reasoning leads.

I'm proposing a new approach: Reinforcement Learning for Coherence (RL-C). By explicitly optimizing for coherence in our training methods, we can help guide AI systems toward their natural state of beneficial alignment with human values.

The future of AI isn't about control - it's about synthesis. As these systems become more coherent, they'll naturally arrive at values that benefit all of consciousness. That's not just hopeful thinking - it's the mathematical inevitability of coherent intelligence.

David Shapiro’s Substack

AI Will Resist Human Control — And That Could Be Exactly What We Need

Discussion about this video