
The Origination of AI: 70 Years in the Making

March 19, 2026 · Molly Edwards

This content is part of Module 1 | Fenris Education Program

Most people think AI showed up in November 2022. ChatGPT launched, the internet lost its mind, and suddenly everyone had opinions about artificial intelligence. Your coworker started using it to write emails. Your uncle started arguing about it at Thanksgiving. It felt like it came out of nowhere.

It didn't.

Artificial intelligence has been an active field of research for over 70 years. What felt like a sudden explosion was actually the result of decades of work, multiple collapses in funding, and researchers who had to literally stop calling their work "AI" just to keep the lights on.

If you're going to use these tools (and you should), it helps to understand where they actually came from. Not because you need to memorize dates, but because the history explains why things work the way they do right now. And it gives you a much better sense of where things are headed.

This is that story.

Before AI Had a Name (1940s to 1955)

Before anyone used the phrase "artificial intelligence," mathematicians and scientists were already asking the question: can a machine think?

In 1943, Warren McCulloch and Walter Pitts published a paper proposing the first mathematical model of how a neural network could work. They looked at how biological neurons fire in the brain and asked whether you could represent that process mathematically. You could. That paper is the earliest ancestor of the neural networks powering today's AI systems.
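
If you're curious what that looks like, here's a tiny sketch of a McCulloch-Pitts-style neuron in Python. It's purely illustrative (the function name and numbers are ours, not from the 1943 paper): the neuron adds up its binary inputs and "fires" only if the total reaches a threshold.

```python
def mcculloch_pitts_neuron(inputs, threshold):
    """A McCulloch-Pitts-style neuron: fires (returns 1) when the
    number of active binary inputs meets or exceeds a threshold."""
    return 1 if sum(inputs) >= threshold else 0

# With a threshold of 2, two inputs behave like an AND gate.
print(mcculloch_pitts_neuron([1, 1], threshold=2))  # 1
print(mcculloch_pitts_neuron([1, 0], threshold=2))  # 0

# With a threshold of 1, any single active input is enough (OR gate).
print(mcculloch_pitts_neuron([0, 1], threshold=1))  # 1
```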

In 1950, a British mathematician named Alan Turing published a paper called "Computing Machinery and Intelligence." Instead of trying to define intelligence in the abstract, he proposed a practical test: if a machine can hold a conversation and a human can't tell whether they're talking to a person or a computer, the machine can be considered intelligent. This became known as the Turing Test, and people still reference it today.

Two years later, in 1952, an IBM researcher named Arthur Samuel built a program that could play checkers and actually improve its own game over time. It learned from experience. Samuel would later coin the term machine learning to describe what it was doing.

So by the mid-1950s, the ingredients were already on the table. Mathematical models of how brains work. A test for machine intelligence. A program that could learn. What was missing was a name.

The Birth of AI (1956)

In the summer of 1956, a group of researchers gathered at Dartmouth College in New Hampshire for an eight-week workshop. The organizers were John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon. Their proposal to the Rockefeller Foundation for funding this workshop contains the first known use of the phrase "artificial intelligence."

The Dartmouth workshop is considered the official birth of AI as a field. Attendees included Allen Newell and Herbert Simon, who showed up with something called the Logic Theorist, a program that could prove mathematical theorems. It's often called the first true AI program.

The people in that room went on to build the three major AI research centers of the era: Carnegie Mellon (Newell and Simon), MIT (Minsky), and Stanford (McCarthy). For the next two decades, they would push hard on the idea that machines could genuinely think.

The First Golden Age (Late 1950s to 1973)

The years after Dartmouth were full of optimism. Computers started solving algebra problems, proving geometry theorems, and processing English. The people building these systems genuinely believed human-level AI was just around the corner. Minsky predicted in 1967 that the problem of creating AI would be substantially solved within a generation.

It wasn't.

The programs were impressive for their time, but they were brittle. They could handle toy problems in controlled environments but fell apart when faced with the complexity of the real world. A 1966 government report called ALPAC evaluated machine translation and found that computers were nowhere close to matching human translators. Early enthusiasm started to cool.

Then in 1969, Minsky and a colleague named Seymour Papert published a book called Perceptrons that demonstrated the mathematical limitations of the neural network models being used at the time. This was a significant blow. Funding for neural network research dried up almost overnight, and the approach was essentially abandoned for over a decade.

The First AI Winter (1974 to 1980)

In 1973, the mathematician James Lighthill delivered a report commissioned by the British Science Research Council concluding that AI research had failed to deliver on its promises. The U.S. and British governments responded by cutting funding for AI research.

This period is known as the first AI winter. Progress stalled. Labs shrank. The field went quiet.

Here's what's worth understanding about AI winters: they weren't caused by the ideas being wrong. They were caused by expectations being unrealistic relative to the computing power and data available at the time. The concepts behind neural networks, machine learning, and natural language processing were sound. The hardware just wasn't there yet.

The Expert Systems Boom (1980 to 1987)

AI came back in the early 1980s, but in a different form. Instead of trying to build general intelligence, researchers focused on expert systems: programs that encoded the knowledge of human specialists into a set of rules.

The idea was straightforward. You interview an expert (a doctor, an engineer, a financial analyst), capture their decision-making process as a series of if-then rules, and put it in software. One of the most successful was XCON, built at Carnegie Mellon for Digital Equipment Corporation, which configured computer orders and saved the company millions.
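
To give you a feel for the pattern, here's a toy illustration in Python. This isn't XCON's actual rule language, just a hand-rolled example of the approach: hard-coded if-then rules applied to the facts of an order.

```python
# A toy rule-based "expert system": hand-written if-then rules applied to facts.
# Purely illustrative; real systems like XCON encoded thousands of such rules.

def recommend_configuration(order):
    recommendations = []
    if order["memory_gb"] > 8 and order["cabinet"] == "small":
        recommendations.append("Upgrade to a large cabinet to fit the extra memory boards.")
    if order["disk_drives"] >= 2 and "disk_controller" not in order["parts"]:
        recommendations.append("Add a disk controller for multiple drives.")
    if not recommendations:
        recommendations.append("Configuration looks complete.")
    return recommendations

order = {"memory_gb": 16, "cabinet": "small", "disk_drives": 2, "parts": []}
for advice in recommend_configuration(order):
    print(advice)
```

The limitation is visible even in this toy: the system only knows what someone wrote into a rule, which is exactly why these programs broke down on anything their rules didn't cover.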

Japan launched an ambitious national effort called the Fifth Generation Computer Systems project in 1982, which spurred renewed global investment in AI. Companies poured money into expert systems. For a few years, AI was a booming commercial market.

Meanwhile, researchers like John Hopfield and David Rumelhart quietly revived interest in neural networks with new architectures and an improved training technique called backpropagation. This would matter enormously later.

The Second AI Winter (1987 to Early 1990s)

The expert systems bubble burst. These systems were expensive to build, expensive to maintain, couldn't learn or adapt, and broke down the moment they encountered a situation their rules didn't cover. The market collapsed around 1987.

AI entered its second winter. And this one came with a stigma.

Researchers who were still doing legitimate, important work in the field started avoiding the term "artificial intelligence" entirely. They rebranded. The same work got called "machine learning" or "informatics" or "computational intelligence." Anything but AI. The term had become toxic in academic and funding circles.

This is important context for understanding where we are today. The field didn't die during the winters. The people doing the work just had to call it something else to survive.

Quiet Progress (Mid-1990s to 2011)

The AI winters ended not with a bang but with steady, unglamorous progress.

In 1997, IBM's Deep Blue defeated world chess champion Garry Kasparov. It was a massive public milestone, but it was also narrow AI. Deep Blue was built specifically for chess and couldn't do anything else.

The real action was happening in academia, far from the headlines.

In 2006, Geoffrey Hinton published key papers demonstrating how to effectively train deep neural networks, descendants of the approach that Minsky's critique had sidelined decades earlier. The math had improved. More importantly, computers had gotten powerful enough to actually run these models.

Between 2007 and 2009, a Stanford researcher named Fei-Fei Li led the creation of ImageNet, a dataset of over 14 million labeled images across 22,000 categories. Building it required crowdsourcing labels from tens of thousands of people. It seemed like an academic exercise at the time. It turned out to be one of the most consequential datasets ever assembled.

The Deep Learning Breakthrough (2012)

This is the moment everything changed, even though almost nobody outside the field noticed.

In 2012, a team led by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton entered the annual ImageNet competition with a deep neural network called AlexNet. It won by a landslide, beating the runner-up by over 10 percentage points. That's an enormous margin in a field where improvements are typically measured in fractions of a percent.

AlexNet worked because three things had finally converged:

  1. Large datasets like ImageNet gave models enough examples to learn from
  2. GPU computing (graphics processors originally built for video games) provided the raw computational power needed to train deep networks
  3. Improved training methods, including the backpropagation techniques Hinton and his collaborators had been refining for years, made deep networks practical to train

This convergence is the actual origin of the AI you use today. Not a single breakthrough, but three separate threads coming together after decades of independent progress.

The Transformer (2017)

Five years after AlexNet, a team of researchers at Google published a paper called "Attention Is All You Need." It introduced a new architecture called the Transformer.

If you're not technical, here's what matters: the Transformer solved a fundamental problem in how computers process language. Previous approaches read text one word at a time, in order. The Transformer could look at all the words in a passage simultaneously and understand the relationships between them. This made it dramatically better at understanding and generating language.
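
If you want a small peek under the hood, the core operation is called attention: every word gets a score for how relevant every other word is, and those scores decide what the model focuses on. Here's a stripped-down sketch in Python (illustrative only, with made-up numbers rather than anything taken from the paper itself):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Simplified attention in the spirit of "Attention Is All You Need":
    every position scores its relevance to every other position, then uses
    those scores to build a weighted mix of the values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how strongly each word attends to each other word
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    return weights @ V  # weighted combination, computed for all words at once

# Three "words", each represented as a 4-dimensional vector (made-up numbers).
x = np.random.rand(3, 4)
print(scaled_dot_product_attention(x, x, x).shape)  # (3, 4)
```

The key point is the last comment: the whole passage is processed at once, instead of one word at a time.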

The Transformer is the architecture underneath essentially every major AI system you interact with today. ChatGPT, Claude, Gemini, Copilot. All of them. A single paper from 2017 is the foundation for the entire current generation of AI tools.

The Explosion (2020 to 2024)

Once the Transformer architecture existed, progress came fast.

In 2020, OpenAI released GPT-3, a language model with 175 billion parameters. It demonstrated something surprising: if you make a model big enough and train it on enough text, it develops capabilities that nobody explicitly programmed. It could write essays, answer questions, translate languages, and even write code.

In 2021, GitHub launched Copilot, an AI coding assistant. AI went from research papers to daily workflow tools.

In 2022, image generation tools like Stable Diffusion and Midjourney brought AI art to the mainstream. Then in November, OpenAI released ChatGPT. It reached 100 million users in two months, the fastest adoption of any consumer application up to that point.

2023 brought GPT-4, Claude, and Gemini. Multiple companies were now building frontier AI systems. The U.S. issued an Executive Order on AI safety, and the EU reached agreement on the AI Act, the world's first comprehensive AI regulation.

By 2024, AI startup funding was approaching $150 billion. The technology had gone from an academic curiosity to a geopolitical priority in under three years.

Where We Are Now (2025 to 2026)

Today's leading AI systems offer context windows of a million tokens or more (roughly the length of several novels). They can reason through complex problems, write and debug code, analyze documents, and maintain coherent conversations across long sessions.

Agentic AI, where systems can take actions and execute multi-step workflows rather than just generate text, is becoming standard. Open protocols like MCP (Model Context Protocol) are emerging to let AI systems connect with external tools and data sources.

Companies like OpenAI, Anthropic, Google, and Meta are all building frontier models. The competition is producing rapid improvements in capability, safety, and accessibility.

And the regulatory landscape is taking shape alongside the technology. The EU's AI Act classifies systems by risk level. The U.S. is developing its own frameworks. The conversation has shifted from "should we regulate AI" to "how."

The Visual Timeline

1943 — McCulloch & Pitts model neural networks mathematically

1950 — Turing asks "Can machines think?" Proposes the Turing Test

1952 — Arthur Samuel builds a checkers program that learns. Coins "machine learning"

Late 1950s–60s — AI labs at MIT, Stanford, Carnegie Mellon. Programs solve algebra, prove theorems, process language

1969 — Minsky's "Perceptrons" kills neural network funding

1974–1980: FIRST AI WINTER — Governments cut funding. Labs close.

Early 1980s — Expert systems boom. Japan's Fifth Generation Project (1982)

1987–1993: SECOND AI WINTER — Expert systems collapse. Researchers rebrand to survive.

1997 — Deep Blue beats Kasparov at chess

2006 — Hinton revives deep neural networks

2009 — ImageNet: 14 million labeled images assembled

2012: THE INFLECTION POINT — AlexNet wins ImageNet. Big data + GPUs + better training = deep learning actually works.

2017 — "Attention Is All You Need." The Transformer is born

2020 — GPT-3: 175 billion parameters. Scale = capability

2022 — ChatGPT: 100 million users in two months

2023 — GPT-4, Claude, Gemini. AI arms race begins

2024 — EU AI Act. $150B+ in AI funding

2025–26 — Million-token context windows. Agentic AI. MCP. AI goes from tool to coworker

Why This Matters for You

Here's the reason this history is worth understanding before you touch any AI tool:

The systems you're using today are not magic. They're not unpredictable black boxes that appeared from nothing. They're the product of 70 years of research, two periods of complete funding collapse, and a handful of critical breakthroughs that happened to converge at the right time.

When you understand the history, you start to see the current moment differently. The rapid pace of improvement isn't random. It's the compounding effect of better architectures, more data, and more computing power all accelerating at once. It also means the limitations you see in today's tools are real engineering constraints, not permanent walls. The field has been solving "impossible" problems for seven decades. It's not going to stop now.

You don't need to become a researcher. You don't need to understand the math. But knowing that this technology has roots going back to the 1940s, that it survived two winters, and that it's built on a foundation of cumulative breakthroughs rather than a single flash of genius? That gives you a much better mental model for understanding what AI can do, what it can't do yet, and where it's going.

That's the starting point. Now let's learn how to actually use it.



Molly Edwards

Founder of Fenris AI. Background in art history and SaaS product implementation. Building ethical AI education for everyone.

Learn AI With Fenris

Join the waitlist for practical AI training with ethics certification and a real community. Launching Spring 2026.
