YouTube

Video explainers and talks on AI safety and alignment—whole channels devoted to the topic, plus standout individual videos.

Robert Miles AI Safety

Robert Miles

The single most popular AI alignment video series, explaining technical safety concepts like the orthogonality thesis, instrumental convergence, inner misalignment, and reward hacking in clear, rigorous terms.

Beginner · 2017

Rational Animations

Rational Animations

Animated explainers on rationality and AI safety, adapting foundational alignment writing into accessible short films on existential risk, scalable oversight, and why aligning advanced AI is hard.

Beginner · 2020

AI In Context

80,000 Hours

80,000 Hours' YouTube channel hosted by Aric Floyd, mixing long and short videos on the risks of transformative AI—including a deep dive on the AI 2027 scenario—and what people can do about them.

Beginner · 2025

Slaughterbots

Future of Life Institute

A dramatized near-future short film from FLI and Stuart Russell depicting swarms of autonomous facial-recognition microdrones used as weapons, made to warn against lethal autonomous weapons.

Beginner · 2017

Humans Need Not Apply

CGP Grey

A widely viewed video essay on how automation and AI will displace human labor across nearly every sector, reframing the economic disruption question for a mass audience.

Beginner · 2014

A.I. ‐ Humanity's Final Invention?

Kurzgesagt – In a Nutshell

Kurzgesagt's animated explainer on artificial superintelligence: how an AGI that improves itself in a feedback loop could rapidly surpass humans and why that makes alignment our most consequential problem.

Beginner · 2024

Deadly Truth of General AI? – Computerphile

Robert Miles

Rob Miles uses the 'deadly stamp collector' thought experiment to show why a general AI pursuing a simple objective could be catastrophic if its goals aren't aligned with ours.

Beginner · 2015

AI "Stop Button" Problem – Computerphile

Robert Miles

Rob Miles explains why simply adding an off-switch to a capable AI is far harder than it sounds, illustrating corrigibility and the incentives an agent has to resist being stopped.

Beginner · 2017

The Artificial Intelligence That Deleted A Century

Tom Scott

A short piece of speculative fiction about a narrow copyright-enforcement AI that, left unchecked, destroys a century of culture: an accessible parable of specification gaming and unintended consequences.

Beginner · 2020

The A.I. Dilemma

Tristan Harris & Aza Raskin

The Center for Humane Technology co-founders argue that racing to deploy AI without safety guardrails already threatens society, drawing parallels to the social-media harms they earlier warned about.

Beginner · 2023

AI Deception: How Tech Companies Are Fooling Us

ColdFusion

ColdFusion traces the history of 'AI washing' and deceptive demos, examining how hype distorts public understanding of what AI systems can actually do and why honest evaluation matters.

Beginner · 2024

How to Keep AI Under Control | Max Tegmark | TED

Max Tegmark

Tegmark argues that today's commercial AI boom is likely to be followed by superintelligence, and sketches an optimistic technical vision—including provably safe systems—for keeping it under human control.

Beginner · 2023

What Is an AI Anyway? | Mustafa Suleyman | TED

Mustafa Suleyman

A leading model-builder reframes AI as 'a new digital species,' arguing this lens clarifies both the stakes and the responsibility we have to contain and steer increasingly capable systems.

Beginner · 2024

AI Is Becoming Dangerous. Are We Ready?

Sabine Hossenfelder

Hossenfelder examines the real near-term risks of agentic AI—prompt injection, deception, and models resisting shutdown—as autonomous agents ship with serious unsolved problems.

Beginner · 2025

[1hr Talk] Intro to Large Language Models

Andrej Karpathy

A widely praised technical primer on how LLMs work, ending with a clear tour of the security challenges—jailbreaks, prompt injection, and data poisoning—that make these systems hard to secure.

Intermediate · 2023

How Not to Destroy the World with AI

Stuart Russell

The Royal Institution lecture in which Russell lays out why the standard model of AI—optimizing fixed objectives—is dangerous, and how building machines uncertain about human preferences could keep them controllable.

Intermediate · 2023

Will Artificial Intelligence Save Us or Kill Us?

DW Documentary

A documentary weighing AI's promise against its dangers, from automation and aging societies to the warnings of researchers who fear losing control of increasingly capable systems.

Beginner · 2024

Are We All Wrong About AI?

ColdFusion

ColdFusion examines competing narratives about AI progress—hype versus genuine capability—helping viewers calibrate how seriously to take both the promises and the risks.

Beginner · 2024

Scaling Interpretability

Anthropic

Anthropic researchers explain mechanistic interpretability—reading the millions of concepts represented inside a production model like Claude—as a path to understanding and steering AI behavior.

Intermediate · 2024

How to Legislate AI

Johnny Harris

Harris examines why people are scared of AI and how governments might regulate it, covering risks to critical infrastructure, military uses, and the difficulty of overseeing systems we don't understand.

Beginner · 2023