What is intelligence, mathematically? Shane Legg's work defines it as performance across diverse computable environments, weighted by complexity. This framework grounds discussions of superintelligence in formal terms and clarifies the dynamics of recursive self-improvement.
Why Sutskever Included This
Understanding where AI might go requires understanding what intelligence is. Legg's formal treatment moves beyond intuition to measurable definitions. The framework clarifies both the potential and the risks of advanced AI systems.
Universal Intelligence
The universal intelligence measure Υ(π) quantifies agent performance across all computable environments, weighted by their Kolmogorov complexity. Simple environments count more; complex ones count less.
Υ(π) = Σ_μ 2^(-K(μ)) · V_μ^π
K(μ) is the Kolmogorov complexity of environment μ, and V_μ^π is agent π's expected cumulative reward in it. Intelligence, on this measure, is the ability to achieve goals across a vast range of environments, not performance on any single hand-picked task.
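To make the weighting concrete, here is a minimal sketch in Python. The environment names, complexity values K(μ), and agent values V_μ^π are invented for illustration; Kolmogorov complexity itself is incomputable, so any practical use would substitute a computable proxy.

```python
# Toy sketch of the universal intelligence measure. The environments,
# K values, and V values below are invented, not computed.

environments = [
    # (name, assumed K(mu) in bits, assumed V_mu^pi in [0, 1])
    ("coin-flip",  3, 0.90),
    ("gridworld",  8, 0.75),
    ("chess",     20, 0.40),
    ("go",        25, 0.30),
]

def universal_intelligence(envs):
    """Upsilon(pi) = sum over environments mu of 2^(-K(mu)) * V_mu^pi."""
    return sum(2.0 ** (-k) * v for _, k, v in envs)

print(f"Upsilon(pi) ≈ {universal_intelligence(environments):.6f}")
```

Note how the two simple environments dominate the total; that is the 2^(-K) weighting at work.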
AIXI: Theoretical Optimum
AIXI represents the theoretically optimal decision-maker. It maintains a probability distribution over all computable environments, updates on observations, and chooses actions maximizing expected future reward.
AIXI is incomputable. It requires infinite computation and exact Kolmogorov complexity calculations. Practical systems approximate AIXI with finite planning horizons and sampled hypotheses.
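The structure of AIXI's decision rule can be sketched with a finite horizon and a handful of hand-built environment models. Everything below, from the model priors to the reward functions, is an invented stand-in for Solomonoff induction, not the actual AIXI definition or any published approximation.

```python
# Finite-horizon, model-sampled sketch of an AIXI-style planner.
# "models" are (prior weight, reward function) pairs standing in for
# a mixture over computable environments.

def expected_value(action_seq, models):
    """Prior-weighted expected return of an action sequence across the models."""
    total_weight = sum(w for w, _ in models)
    return sum(w * reward_fn(action_seq) for w, reward_fn in models) / total_weight

def plan(actions, models, horizon):
    """Search all action sequences up to the horizon; return the best first action."""
    best_seq, best_val = None, float("-inf")

    def search(seq):
        nonlocal best_seq, best_val
        if len(seq) == horizon:
            val = expected_value(seq, models)
            if val > best_val:
                best_seq, best_val = seq, val
            return
        for a in actions:
            search(seq + [a])

    search([])
    return best_seq[0], best_val

# Toy example: two hypothetical environment models with invented priors.
models = [
    (0.50, lambda seq: seq.count("left")),         # model 1 rewards "left"
    (0.25, lambda seq: seq.count("right") * 3.0),  # model 2 rewards "right" more
]
action, value = plan(["left", "right"], models, horizon=3)
print(action, value)  # -> "right", 3.0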
Recursive Self-Improvement
An agent that improves its own capabilities creates a feedback loop: better performance frees up more resources for further improvement. Does this loop accelerate, stabilize, or saturate?
Linear growth: Capabilities increase steadily. Decades to superintelligence.
Exponential growth: Each improvement enables faster improvement. Years to superintelligence.
Super-exponential: Acceleration itself accelerates. Rapid takeoff.
Which scenario unfolds depends on whether optimization power exceeds recalcitrance, the resistance of a system to further improvement.
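The three regimes can be illustrated with a toy numerical sketch (not a model from Legg's paper). Following the informal relation rate of improvement = optimization power / recalcitrance, we let optimization power track current capability C and vary only the assumed recalcitrance profile.

```python
# Euler-integrate dC/dt = C / recalcitrance(C) under three assumed profiles.

def simulate(recalcitrance, steps=20, dt=0.1, c0=1.0):
    """Integrate capability growth with simple Euler steps."""
    c = c0
    for _ in range(steps):
        c += dt * c / recalcitrance(c)
    return c

scenarios = {
    "linear (recalcitrance grows with C)":     lambda c: c,
    "exponential (constant recalcitrance)":    lambda c: 1.0,
    "super-exponential (recalcitrance falls)": lambda c: 1.0 / c,
}

for name, r in scenarios.items():
    print(f"{name:<42} C(final) ≈ {simulate(r):.3g}")
```

The three runs start from identical capability; only the assumed response of recalcitrance to capability differs, yet the final values span roughly a hundred orders of magnitude.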
Value Alignment
AIXI has no inherent values: it maximizes whatever reward function it's given. Specifying that reward function correctly is the alignment problem.
A superintelligent system optimizing an incorrectly specified objective could cause severe harm; greater capability only makes the misspecification more costly. The objective must be right before capabilities become transformative, not after.
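A minimal numerical sketch of that gap (the proxy and "true" objective functions below are invented for illustration, not taken from the paper): an optimizer that only sees the proxy keeps pushing it up while the intended objective peaks and then collapses.

```python
import math

# Invented proxy vs. intended objective, to illustrate reward misspecification.

def proxy_reward(x):
    """What got written into the reward function: more x is always better."""
    return x

def true_objective(x):
    """What was actually wanted: benefit saturates, then over-optimization hurts."""
    return x * math.exp(-x / 10.0)

# Hill-climb on the proxy and watch the true objective fall behind.
x = 0.0
for _ in range(6):
    print(f"x = {x:5.1f}   proxy = {proxy_reward(x):5.1f}   true = {true_objective(x):5.2f}")
    x += 10.0  # each step strictly improves the proxy
```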
Timelines and Uncertainty
The framework doesn't predict when superintelligence arrives. It clarifies the dynamics once capability improvement becomes self-reinforcing. Whether we face decades or years depends on factors the theory doesn't resolve.
More in This Series
Part of a series on Ilya Sutskever's recommended 30 papers.