From Pigeons to AI: How Skinner's Research Shaped Modern Machine Learning
In the midst of World War II, as physicists raced to unlock the secrets of the atom for the Manhattan Project, American psychologist B.F. Skinner embarked on his own clandestine government endeavor. His goal was not a more destructive weapon, but rather a more precise one. Inspired by a flock of birds flying in formation alongside his train, Skinner envisioned them as “devices” with exceptional vision and maneuverability that could guide missiles.
Initially experimenting with crows, which proved uncooperative, Skinner turned to the more amenable pigeon, giving birth to “Project Pigeon.” Though ordinary pigeons, Columba livia, were hardly considered intelligent, they proved remarkably adept in the lab. Skinner trained them with food rewards for pecking at specific targets on aerial photographs, eventually envisioning them strapped into a warhead, steering by pecking at a live image projected onto a screen. The military never deployed these avian kamikazes, yet Skinner’s experiments profoundly shaped his view: the pigeon, he declared in 1944, was “an extremely reliable instrument” for studying the fundamental processes of learning, a practical creature that “can be made into a machine.”
While many trace the origins of artificial intelligence to science fiction or thought experiments like the Turing test, a less celebrated, yet equally pivotal, precursor lies in Skinner’s mid-20th-century pigeon research. Skinner championed “association”—the trial-and-error process of linking an action to a reward or punishment—as the foundational building block of all behavior, not just in pigeons but across all living organisms, including humans. His “behaviorist” theories fell out of favor with psychologists and animal researchers in the 1960s, but they found an unexpected new home in computer science, ultimately laying the groundwork for many of today’s leading AI tools from companies like Google and OpenAI.
These companies increasingly employ a form of machine learning whose core concept, reinforcement, is directly derived from Skinner’s school of psychology. Its principal architects, computer scientists Richard Sutton and Andrew Barto, were awarded the 2024 Turing Award, widely regarded as the Nobel Prize of computer science, for their contributions. Reinforcement learning has enabled computers to drive vehicles, solve complex mathematical problems, and famously defeat grandmasters in games like chess and Go. Crucially, it achieves these feats not by mimicking the intricate workings of the human mind, but by supercharging the simple associative processes observed in the pigeon brain.
Sutton has termed this a “bitter lesson” from 70 years of AI research: human intelligence has not served as the ideal model for machine learning. Instead, it is the seemingly humble principles of associative learning that power algorithms capable of simulating or even outperforming humans across diverse tasks. If AI truly is on the verge of autonomous action, then our future digital overlords might resemble “rats with wings” with planet-sized brains more than they resemble us.
The recent triumphs of AI are now prompting some animal researchers to re-examine the evolution of natural intelligence. Johan Lind, a biologist at Stockholm University, highlights the “associative learning paradox”: the process is often dismissed by biologists as too simplistic to produce complex animal behaviors, yet it is celebrated for generating human-like capabilities in computers. This re-evaluation suggests a far greater role for associative learning in intelligent animals like chimpanzees and crows, and indeed, a previously underestimated complexity in creatures long considered simple-minded, such as the common pigeon.
Skinner’s work, building on Ivan Pavlov’s late 19th-century discoveries of classical conditioning, extended the principles of conditioning from involuntary reflexes to an animal’s entire behavior. He theorized that “behavior is shaped and maintained by its consequences,” meaning an action with desirable results would be “reinforced” and likely repeated. He systematically reinforced behaviors, teaching rats to manipulate marbles and pigeons to play simple tunes. Skinner argued that this “operant conditioning” was the universal building block of behavior, advocating for a psychology focused solely on observable, measurable actions, without reference to an “inner agent.”
However, Skinner’s ideas, particularly their application to human language in his 1957 book Verbal Behavior, drew a scathing critique from Noam Chomsky, which helped shift psychology’s focus towards innate “cognitive” abilities like logic and symbolic thinking. Biologists also pushed back, arguing that species evolved specific, often inherited, behaviors tailored to their habitats, rather than relying on a single elementary mechanism.
By the 1970s, when Sutton delved into Skinner’s work, many researchers had moved on from pigeons to larger-brained animals in search of more sophisticated cognitive behaviors. Yet Sutton found these “old experiments” uniquely instructive for machine learning, noting that engineering had nothing equivalent to “instrumental learning.” Earlier attempts at AI, often termed “symbolic AI,” tried to mimic human thinking by hand-coding elaborate rules. These programs struggled with basic tasks like pattern recognition and proved too brittle for complex problem-solving.
Pigeon research, however, offered an alternative path. A 1964 study demonstrated that pigeons could learn to distinguish between photographs with and without people, simply by being rewarded for pecking the correct images. This suggested that concepts and categories could be learned through associative learning alone, without explicit rules.
When Sutton began collaborating with Andrew Barto on AI in the late 1970s, their aim was to create a “complete, interactive goal-seeking agent” akin to a pigeon or rat, capable of exploring and influencing its environment. Their approach, which they dubbed “reinforcement learning,” centered on two functions: searching for actions and remembering which actions yielded rewards in specific situations. In 1998, their seminal book, Reinforcement Learning: An Introduction, solidified the concept. As computing power surged over the next two decades, it became possible to “train” AI systems, essentially running the AI “pigeon” through millions of trials.
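In code, those two functions reduce to a handful of lines. The sketch below is a minimal illustration, not Sutton and Barto’s implementation: an “AI pigeon” faces three keys in a simulated Skinner box, and the payoff probabilities, exploration rate, and trial count are all invented for this example.

```python
import random

# A simulated Skinner box: three keys, each paying food with a different
# hidden probability. These numbers are invented for illustration.
REWARD_PROB = [0.1, 0.8, 0.3]

values = [0.0] * len(REWARD_PROB)  # memory: estimated payoff of each key
counts = [0] * len(REWARD_PROB)    # how many times each key was pecked
EPSILON = 0.1                      # search: chance of a random exploratory peck

for trial in range(10_000):
    # Search: usually peck the best-known key, occasionally try another.
    if random.random() < EPSILON:
        key = random.randrange(len(REWARD_PROB))
    else:
        key = max(range(len(REWARD_PROB)), key=lambda k: values[k])

    # The box delivers food (1) or nothing (0).
    reward = 1 if random.random() < REWARD_PROB[key] else 0

    # Memory: nudge the key's estimated value toward the observed outcome.
    counts[key] += 1
    values[key] += (reward - values[key]) / counts[key]

print([round(v, 2) for v in values])  # estimates approach REWARD_PROB
```

Scale that loop up to millions of trials and vastly larger spaces of situations, and you have the training regime behind the breakthroughs that followed.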
This led to breakthroughs like Google DeepMind’s AlphaGo Zero in 2017. Built entirely through reinforcement learning, AlphaGo Zero started with no knowledge of the game Go, yet achieved “superhuman performance” within 40 days, even pioneering new strategies. Its creators noted that it rediscovered millennia of human Go knowledge and developed novel insights, all by simply being rewarded for wins and penalized for losses.
Today, reinforcement learning is increasingly integrated into consumer-facing AI products, including advanced chatbots. While early generative AI models were built through “supervised learning,” training on human-provided examples, reinforcement learning now fine-tunes their outputs and is even used to train “reasoning” models by providing incentives rather than explicit instructions. However, many computer scientists, including Sutton, dismiss claims of AI “reasoning” as marketing, arguing that these models rely solely on search and memory to form associations and maximize rewards, not genuine cognition. Yet Sutton and his colleagues contend that the pigeon’s method—trial-and-error learning for rewards—is powerful enough to drive behavior exhibiting “most if not all abilities that are studied in natural and artificial intelligence,” including the full richness of human language.
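A toy example can make “incentives rather than instructions” concrete. The sketch below is an assumption-laden illustration, not any lab’s actual pipeline: a two-choice policy is never told which answer is right, only rewarded when it stumbles on it, and a simple policy-gradient step (the classic REINFORCE rule) makes the rewarded choice more likely.

```python
import math
import random

logits = [0.0, 0.0]  # the policy's current preferences over two answers
LR = 0.1             # learning rate, chosen arbitrarily for the demo

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for step in range(2_000):
    probs = softmax(logits)
    # Sample an answer from the current policy.
    answer = random.choices([0, 1], weights=probs)[0]
    # Incentive, not instruction: answer 0 earns a reward, answer 1 does not.
    reward = 1.0 if answer == 0 else 0.0
    # REINFORCE update: raise the log-probability of rewarded answers.
    for k in range(2):
        grad = (1.0 if k == answer else 0.0) - probs[k]
        logits[k] += LR * reward * grad

print(softmax(logits))  # probability of the rewarded answer nears 1.0
```

Nothing in the loop ever states the right answer; the preference emerges, pigeon-style, from reward alone.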
If computers can achieve such feats with a pigeon-like brain, then some animal researchers question whether pigeons themselves deserve more credit. Psychologist Ed Wasserman of the University of Iowa trained pigeons to succeed at a complex categorization task that stumped undergraduate students. The students fruitlessly searched for rules, while the pigeons simply developed an intuitive “sense” for the categories through practice and association. Wasserman even trained pigeons to detect cancerous tissue and heart disease symptoms in medical scans with an accuracy comparable to experienced doctors. He finds it puzzling that associative learning is often deemed a crude mechanism, insufficient for the intelligence of animals like apes or crows.
Lind, the biologist, echoes this sentiment, finding it ironic that associative processes, fundamental to AI’s progress, are considered too simplistic for biological intelligence. He cites Sutton and Barto’s work in his biological research and proposes that flexible behaviors like social learning and tool use could arise from associative learning, rather than requiring complex cognitive mechanisms.
Some may feel uneasy about a revival of behaviorist theory, but to argue that animals learn by association is not to label them simple-minded. Scientists like Lind and Wasserman acknowledge the role of instinct and emotion in animal behavior. Their point is that associative learning is a far more potent, even “cognitive,” mechanism than many of their peers believe. As psychologist Robert Rescorla, whose work influenced both Wasserman and Sutton, put it, association isn’t a “low-level mechanical process” but “a primary means by which the organism represents the structure of its world.”
This is true even for a laboratory pigeon, carefully controlled within an experimental box. The pigeon’s learning extends beyond the immediate task, building a comprehensive model of its environment and the relationships between its parts. This shared mechanism prompts a crucial question, amplified by AI’s rise: How do we attribute sentience to other living beings? Pigeons in drug-discrimination tasks, for instance, demonstrate the ability to experience and differentiate internal states, raising the question of whether this is “tantamount to introspection.”
Though AI and animals share associative mechanisms, there is more to life than behavior and learning. A pigeon deserves ethical consideration not just for how it learns, but for what it feels. A pigeon can experience pain and suffering; an AI chatbot cannot, regardless of how convincingly it might simulate sentience. The significant investments in AI research now compel a similar commitment to understanding animal cognition and behavior, not only to distinguish true sentience from convincing performance, but also to gain deeper insights into ourselves. After all, humans, too, often learn by association, particularly for complex, intuitive tasks like a sommelier discerning wine nuances, or Wasserman’s students eventually mastering his categorization experiment—not by rules, but by feel. The humble laboratory pigeon, it turns out, is not just in our computers; its learning engine is fundamental to our own brains, powering some of humankind’s most impressive achievements.