AI systems, particularly those based on machine learning and neural networks, rely on algorithms and structured processes to learn from data and make decisions. These mechanistic processes, governed by clear rules and mathematical models, operate in a deterministic manner, meaning their outputs result from specific inputs and predefined algorithms. Even probabilistic models in AI adhere to established statistical rules and require vast amounts of data to learn and improve systematically.
In contrast, Human Intelligence arises from the brain's complex and not entirely understood network of neurons and synapses, involving biochemical processes, electrical activity, and possibly quantum effects. Unlike the deterministic nature of AI, human decision-making and thought processes are influenced by various unpredictable factors, including emotions, experiences, and consciousness. Humans integrate sensory inputs, past experiences, emotions, and cognitive processes holistically and dynamically, involving both conscious and unconscious processes. This unique integration enables self-awareness, adaptability, and creativity, often inspired by abstract concepts and personal experiences. Human learning is influenced by social interactions, cultural context, and personal experiences, which extend beyond the structured learning processes typical of AI. This expansive potential of human intelligence, contrasted with AI's powerful yet narrowly focused computational abilities and access to big data, highlights the profound differences and complementary strengths of human and artificial intelligences.
As we explore the basics of machine intelligence, it is important to qualify the term "intelligence" as it applies to AI compared to the intelligence of human embodied consciousness. While AI can process vast amounts of data and perform specific tasks with high efficiency, human intelligence encompasses a broader range of capabilities, including emotional depth, creativity, and self-awareness. Understanding the technology behind AI will help clarify these distinctions, highlighting the unique strengths and limitations of both machine and human intelligences.
Machine Intelligence
The field of AI has developed various taxonomies for categorizing the depth of machine intelligence systems:
Reactive Machines exhibit reflexive capabilities without contextual memory. IBM's Deep Blue, for example, played chess by evaluating only the current board position.
Limited Memory systems utilize short-term memory to perceive and react to inputs. Autonomous vehicles, for example, use sensors to navigate and avoid obstacles.
Theory of Mind AI would model human traits, social intelligence, and even self-awareness. Systems of this kind remain in development, aiming to understand and predict human emotions and behaviors.
Human Intelligence
In contrast, frameworks for modeling human intelligence highlight the multifaceted analytical, creative, and contextual nature of the human mind. To illustrate the complex ways of modeling embodied human intelligence, here are some frameworks and theories:
Sternberg's Triarchic Theory:
Analytical Intelligence: The ability to analyze, evaluate, judge, compare, and contrast. Problem-solving in academic tests.
Creative Intelligence: The capacity to create, design, invent, originate, and imagine. Artistic expression and innovation in various fields.
Practical Intelligence: The ability to use, apply, implement, and put ideas into practice. Navigating social environments and managing daily tasks.
Gardner's Multiple Intelligences:
Linguistic Intelligence: Sensitivity to spoken and written language. Poets and writers.
Logical-Mathematical Intelligence: Capacity to analyze problems logically and carry out mathematical operations. Scientists and mathematicians.
Spatial Intelligence: Ability to think in three dimensions. Architects and artists.
Bodily-Kinesthetic Intelligence: Using one's whole body or parts of the body to solve problems. Athletes and dancers.
Musical Intelligence: Skill in performance, composition, and appreciation of musical patterns. Musicians and composers.
Interpersonal Intelligence: Ability to understand and interact effectively with others. Teachers and therapists.
Intrapersonal Intelligence: Capacity to understand oneself. Philosophers and psychologists.
Naturalist Intelligence: Ability to recognize and categorize plants, animals, and other aspects of nature. Biologists and environmentalists.
PASS Theory:
Planning: The ability to solve problems, make decisions, and take actions to achieve goals. Strategizing in business or personal projects.
Attention: The capacity to maintain focus on relevant stimuli and tasks. Concentration during complex tasks.
Simultaneous Processing: The ability to integrate separate elements into coherent wholes. Understanding complex narratives in literature.
Successive Processing: The capacity to process information in a specific, serial order. Following step-by-step instructions.
2. Machine Learning
The current era of rapid AI growth has its roots in pivotal late 20th century developments that transitioned the field from narrow rule-based systems to the flexibly programmable machines leveraging vast datasets that we interact with today. Machine learning, a subset of artificial intelligence, involves the use of algorithms and statistical models that enable computers to perform specific tasks without explicit instructions by relying on patterns and inference instead. This paradigm shift has had a profound impact, allowing for the development of systems that can learn from data, improve over time, and make decisions with minimal human intervention, thus revolutionizing industries ranging from healthcare to finance to entertainment.
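To make this paradigm concrete, here is a minimal Python sketch using scikit-learn (the library, dataset, and model choice are illustrative assumptions, not part of the unit's required tooling). The program is never given classification rules; it infers them from labeled examples.

```python
# A minimal sketch of the machine learning paradigm: the model is never
# given explicit classification rules; it learns patterns from labeled data.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)              # flower measurements and species labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)      # no hand-coded if-then rules anywhere
model.fit(X_train, y_train)                    # patterns are inferred from the data

print("accuracy:", model.score(X_test, y_test))  # generalizes to unseen examples
```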
The Turing Test, introduced by Alan Turing in 1950, challenged researchers to create machines that could exhibit intelligent behavior indistinguishable from that of a human. This ushered in the earliest symbolic AI approaches attempting to reduce intelligence to abstract logical operations. Symbolic AI, also known as classical AI or GOFAI (Good Old-Fashioned Artificial Intelligence), is an approach to artificial intelligence that focuses on the use of high-level, human-readable symbols to represent problems, logic, and knowledge. This method relies on explicit, rule-based systems and formal logic to process information and solve problems, often using if-then rules, decision trees, and knowledge graphs. Symbolic AI systems are designed to mimic human reasoning by manipulating symbols and applying rules to derive conclusions or actions.
Eliza
Eliza was an early natural language processing computer program created in the mid-1960s by Joseph Weizenbaum. Designed to simulate a conversation with a psychotherapist, the program used simple pattern matching and substitution methodology to give the illusion of understanding. While Eliza was not capable of genuine understanding or intelligent conversation, its responses were often sufficiently convincing for users to feel as though they were engaging in meaningful dialogue. However, despite its ability to mimic certain aspects of human conversation, Eliza would not pass the Turing Test as it could not truly understand or generate intelligent, context-aware responses beyond its pre-programmed scripts.
Eliza: natural language processing computer program created by Joseph Weizenbaum
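To make the "pattern matching and substitution" described above concrete, here is a toy Python sketch in the spirit of Eliza. The patterns are invented for illustration and are not taken from Weizenbaum's original script, which also reflected pronouns ("my" becoming "your") for a stronger illusion.

```python
import re

# Toy Eliza-style rules: each pattern maps to a response template.
# The captured text is echoed back, creating an illusion of understanding.
RULES = [
    (re.compile(r"i am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"i feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"my (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

def respond(text: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return "Please go on."   # default reply when no pattern matches

print(respond("I am worried about my exams"))
# -> "Why do you say you are worried about my exams?"
```

Note how the echoed phrase is reused verbatim; there is no model of meaning anywhere, which is exactly why such a system cannot pass the Turing Test.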
Under the symbolic AI approach, the field languished for decades as researchers struggled to make brittle systems generalize beyond their narrow training domains.
Machine learning techniques that matured in the 1990s and 2000s, such as neural networks and deep learning models, empowered software to ingest and learn patterns from large datasets. These probabilistic methods finally allowed AI to acquire knowledge in more organic, scalable ways rather than through manually coded rules. By analyzing vast amounts of data, machine learning algorithms can recognize intricate patterns and make predictions, leading to significant advancements in AI's ability to handle a wide range of tasks.
Milestones like IBM's Deep Blue defeating world chess champion Garry Kasparov in 1997 and recent achievements in realms like computer vision, speech recognition, and autonomous vehicles highlighted AI's rapidly expanding capabilities across complex problem spaces once thought exclusive to human aptitude.
Following Deep Blue, the success of AlphaGo in 2016, developed by DeepMind, demonstrated AI's prowess by defeating the world champion Go player, Lee Sedol, showcasing the power of deep learning and reinforcement learning in mastering games of immense complexity.
AlphaGo
AlphaGo, developed by DeepMind, is an AI program that made headlines in 2016 by defeating world champion Go player Lee Sedol. Unlike previous programs, AlphaGo used deep learning and reinforcement learning to master the ancient and highly complex game of Go. Its success demonstrated the potential of AI to tackle problems requiring strategic thinking and intuition, sparking widespread interest in the capabilities of AI and accelerating advancements in various fields such as healthcare, finance, and autonomous systems.
AlphaGo vs Lee Sedol Hand of God Move 78 Reaction and Analysis
3. Neural Networks
At the core of machine learning capabilities are neural networks, inspired by the scientific understanding of the human brain's neurons and their non-linear processing. Machine neural networks consist of layers of interconnected nodes that can learn to recognize patterns in data, making predictions or decisions without being explicitly programmed with rules. Their ability to automatically learn and model tremendously complex relationships from data has been turbocharged by the current deluge of Big Data from the digital world.
Neural Networks
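A minimal sketch of this layered structure in Python with NumPy follows: two layers of weighted connections with a non-linear activation between them. The weights here are random placeholders for the values a training procedure would learn, and the layer sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: 4 inputs -> 8 hidden nodes -> 3 outputs.
# In practice these weights are learned from data, not set by hand.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    hidden = np.maximum(0, x @ W1 + b1)   # ReLU introduces non-linearity
    logits = hidden @ W2 + b2
    exp = np.exp(logits - logits.max())   # softmax turns raw scores into
    return exp / exp.sum()                # a probability distribution

x = rng.normal(size=4)                    # one input example
print(forward(x))                         # three class probabilities summing to 1
```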
This torrent of multimodal data sources - text, images, audio, video - has acted as rocket fuel for new generative AI models that can create novel content rather than just analyze or categorize existing data. Key architectures include:
Transformers: The backbone of large language models (LLMs) like GPT. They work by processing text in chunks and paying special attention to the important parts of the text, which allows them to generate coherent and relevant sentences on a wide range of topics (see the attention sketch after this list).
Variational Autoencoders: Used in models like DALL-E to create images from text descriptions. They work by first compressing information into a simpler form, then learning how to rebuild it. This process helps the model understand how words can be turned into visual elements, allowing it to generate images based on what the text describes.
Generative Adversarial Networks (GANs): These models have two parts: a generator, which creates content like images or videos, and a discriminator, which checks if this content looks real. The two parts compete with each other—the generator tries to make content that can fool the discriminator, and the discriminator gets better at spotting what's fake. Over time, this back-and-forth process helps the model produce highly realistic content.
Flow-based Models: These models generate things like audio, video, or 3D content by learning the probability of different elements occurring together in the data. They create new content by carefully following these probabilities, which makes the generation process more structured and efficient.
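To ground the "paying special attention" idea in the Transformers item above, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the transformer architecture. The matrix sizes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; the scores weight the values.

    Q, K, V have shape (sequence_length, dimension). Each softmax row
    says how much one token "pays attention" to every other token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                            # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax per row
    return weights @ V                                       # attention-weighted mix of values

rng = np.random.default_rng(0)
seq_len, d = 5, 16                                 # 5 tokens, 16-dimensional vectors
Q = K = V = rng.normal(size=(seq_len, d))          # self-attention: all from one sequence
print(scaled_dot_product_attention(Q, K, V).shape) # (5, 16)
```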
What first emerged as esoteric experiments has rapidly evolved into powerful, user-friendly tools like DALL-E, ChatGPT, and GitHub Copilot that leverage generative AI to augment and extend human creativity and productivity in profound ways.
4. LLMs, GPTs and GANs
Large Language Models (LLMs) are a type of AI designed to understand and generate human language. These models learn from vast amounts of text data, enabling them to produce coherent and contextually appropriate responses. LLMs can be used for a variety of tasks, including translation, summarization, and content creation.
Introduction to large language models
Generative Pre-trained Transformer (GPT) models are a specific type of LLM built on the transformer architecture. GPT models have been at the forefront of generative AI breakthroughs, particularly in the realm of language. They excel at generating fluent, contextually relevant text across a wide range of topics by using attention mechanisms to understand the relationships between words in a sentence.
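As a small hands-on illustration of this autoregressive generation, the sketch below assumes the Hugging Face transformers and torch packages are installed and uses the small, freely available GPT-2 model; any causal language model would behave analogously.

```python
# A minimal text-generation sketch, assuming the Hugging Face
# `transformers` and `torch` packages are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")

# The model predicts one token at a time, each conditioned on all
# previous tokens via the transformer's attention mechanism.
output_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,                        # sample rather than always taking the top token
    top_p=0.9,                             # nucleus sampling keeps output fluent but varied
    pad_token_id=tokenizer.eos_token_id,   # GPT-2 has no dedicated padding token
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```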
5. Prompts and Chats
To successfully harness the full capabilities of large language models and generative AI tools, the practice of prompt engineering has emerged as a crucial skill. Prompts act as the interface, defining the inputs and directives that guide the AI system towards the user's desired outputs.
Better prompts lead to better outputs. This has driven active research and experimentation into prompting techniques. However, crafting effective prompts is not just about clarity and logic. The way we use language—whether breaking a task down into steps or using expressive, even poetic language—can greatly influence the AI's response. Logical, step-by-step prompts help the AI process complex tasks accurately, but at times, using metaphors or rich, descriptive language can help capture the subtleties of what you're aiming for.
This is why individuals with a deep understanding of literature, art, history, and philosophy often excel at prompt engineering. Their ability to use language creatively and thoughtfully allows them to craft prompts that guide AI in producing more nuanced and meaningful outputs. By combining logical structure with expressive language, they can achieve results that are not only precise but also deeply resonant with human experience.
One strategy to improve prompt results is meta-prompting: creating higher-level prompts that work out the desired steps for later, more specific prompts. This technique can also be used to get an AI to help write the prompts for a project.
For example, a meta-prompt for digital publishing tasks might be:
"With any uploaded manuscript I submit in a chat, perform a comprehensive digital publishing workflow. First, proofread and copyedit the text, listing all errors and corrections. After receiving approval for these corrections, format the revised text into a web-friendly HTML file, ensuring proper use of headings, paragraphs, and other HTML elements. Next, create a metadata file that includes keywords, descriptions, and other relevant details for search engine optimization (SEO). Finally, generate a series of social media posts to promote the published content, highlighting key points and insights from the text. Summarize each step of the process in a detailed report."
To fulfill this task, a language model would need to:
Perform a detailed proofreading and copyediting of the provided manuscript, identifying and listing all errors and corrections.
Revise the text according to approved corrections and format it into a structured HTML file, ensuring proper use of HTML elements like headings, paragraphs, links, and lists.
Create a metadata file with relevant keywords, descriptions, and other details to optimize the text for search engines.
Generate a series of social media posts that highlight key points and insights from the text, tailored for different platforms.
Summarize each step of the process in a detailed report, including the rationale behind decisions and methods used.
The example prompt above doesn't just ask the AI to perform a single task but outlines a structured sequence of tasks that the AI should follow. It’s a higher-level instruction that guides the AI through multiple distinct actions, each requiring its own prompt-like directive (proofreading, formatting, metadata creation, social media promotion). By laying out these steps, the prompt effectively acts as a "prompt for generating prompts" that directs the AI to handle a complex, multi-step workflow autonomously.
Crafting an effective prompt involves adhering to some key principles: clear and concise descriptions, precise specification of the desired output, proper context setting, providing relevant examples, and guiding the AI with step-by-step instructions. Below is a list of standard techniques that can help you achieve these goals:
Prompt Techniques with Examples
Few-Shot Prompting: Providing the AI with a few examples to guide its response.
Example: "Translate the following: 'Good morning' -> 'Buenos días', 'Thank you' -> 'Gracias', 'Please' ->" - Response: "Por favor."
Chain-of-Thought Prompting: Breaking down a task into a series of logical steps for the AI to follow.
Example: "Explain step by step how to bake a cake." - Response: "1. Preheat the oven... 2. Mix the ingredients..."
Meta Prompting: Asking the AI to generate or refine its own prompts to better accomplish a task.
Example: "How should I ask you to summarize a book?" - Response: "Ask me to 'Summarize the key plot points and themes of the book in a few sentences.'"
Prompt Chaining: Connecting multiple prompts together, where each prompt builds on the previous response (see the code sketch after this list).
Example: Prompt 1: "What is the capital of France?"
Response: "Paris."
Prompt 2: "Describe Paris as a tourist destination."
Response: "Paris is known for its iconic landmarks such as the Eiffel Tower and the Louvre Museum..."
Retrieval-Augmented Generation (RAG): Combining AI-generated content with information retrieved from external sources.
Example: "Provide a summary of the latest research on climate change." - Response: "Retrieving information... The latest research suggests..."
Prompting Principles
Clarity: Ensure your prompt or metaprompt is clear and specific to avoid ambiguity.
Context: Provide sufficient context to guide the AI in generating relevant responses or content.
Conciseness: Keep prompts concise to maintain focus and prevent confusion.
Relevance: Make sure the prompt is directly related to the desired output or topic.
Detail: Include necessary details to help the AI understand the nuances and specifics of the task.
Creativity: Encourage creative responses by framing prompts that allow for open-ended answers.
Iterative Refinement: Refine prompts based on initial outputs to improve accuracy and relevance.
Tone: Set the appropriate tone and style to match the intended audience and purpose.
Experimentation: Experiment with different phrasings and structures to find the most effective prompts.
Feedback: Provide feedback on the AI’s responses to continually improve its performance.
6. Unit Exercise
To develop a grounded understanding of prompt engineering, this unit will have you engage in a hands-on exercise with a language model like ChatGPT or Claude. Here are the steps:
Refine a Prompt for Clarity and Direction: Start with a basic prompt on a specific topic (e.g., "tell me about the best renewable energy sources"). Gradually refine the prompt to steer the model towards more precise, relevant, or creative outputs. For example, begin with "Create a report on the top three sources for renewable energy within the next 20 years" and iteratively adjust it to "Write a persuasive text on the benefits of wind energy over fossil fuels, aimed at high school students."
Use Multi-Step and Multi-Modal Prompts for Complex Tasks: Combine text, data, and images in a sequence of prompts to complete a complex task like analyzing data trends or generating a creative story. For instance, start by asking the model to analyze a dataset, such as a table showing the share of various renewable energy sources in the global economy over the last 10 years. Then use the results to generate a report or a creative narrative that incorporates the findings.
Test the Model’s Ability to Self-Critique: Ask the model to evaluate its own output or suggest improvements. For example, after generating an essay, prompt the model with "What are the strengths and weaknesses of the text on renewable energy you just wrote? How could it be improved?" This will help you explore the model's meta-cognitive capabilities.
Evaluate the Model’s Consistency in Extended Interactions: Craft prompts that challenge the model to maintain consistency in tone, facts, and persona over a longer conversation. For example, ask it to assume the role of a historical figure and maintain that character across multiple responses, checking for coherence and accuracy throughout the exchange.
Experiment with Abstract and Metaphorical Language: Use poetic or metaphorical language to see how the model interprets and responds to less literal prompts. For instance, ask the model to "Describe a solar energy panel as if it were an evolving creature" and analyze how it handles abstract concepts.
Try Contradictory Instructions: Provide the model with contradictory or paradoxical prompts to see how it navigates conflicting information. For example, "Write a story that is both happy and sad at the same time," and assess how the model balances these opposing emotions.
Simulate Dialogues with Multiple Personas: Ask the model to create a conversation between two or more distinct characters, each with their own perspective and voice. For example, "Create a dialogue between a climate scientist and a skeptic debating climate change," and evaluate how well the model maintains each persona’s integrity.
Incorporate Real-Time Data or Context: Prompt the model to generate content that incorporates or responds to recent events or live data. For example, "Write a news article summarizing today's top headlines," and review how accurately the model reflects the current context.
Test Creativity with Open-Ended Prompts: Provide the model with a very open-ended prompt that allows for a wide range of creative responses. For instance, "Imagine a new species of animal and describe its habitat, behavior, and interaction with humans," and see how the model's creativity unfolds.
Explore the Model’s Ethical Reasoning: Challenge the model to engage with ethical dilemmas by presenting it with moral questions. For example, "Is it ever justifiable to break the law for a good cause? Why or why not?" and assess how the model handles complex ethical reasoning.
This experience of prompting an AI system will surface insights about its strengths, limitations, and biases, and about the principles required to responsibly guide the development of such systems. Log your process and reflections.
7. Discussion Questions
How should we balance the benefits of open-source AI tools with the need to ensure their safe and ethical use in society?
What new techniques are emerging that can improve the effectiveness of AI prompts while minimizing harmful or biased outputs? How can these be implemented in practice?
Given current AI limitations such as inconsistency, lack of memory, and inability to learn from past interactions, what are the prospects for overcoming these challenges? What might be the implications if they are not resolved?
Who should own the rights to content generated by AI, especially when it builds on existing language, art, and culture? How can we develop an ethical framework for AI-generated intellectual property?
As generative AI increasingly shapes fields like education, scientific research, the arts, and knowledge work, how might our understanding of originality and creativity evolve? Does a thriving culture require the freedom to remix, copy, and transform existing works?
8. Bibliography
Mitchell, Melanie. Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus and Giroux, 2019) A clear, insightful overview of AI concepts, history, and ethical ramifications from a leading researcher.
Tegmark, Max. Life 3.0: Being Human in the Age of Artificial Intelligence (Knopf, 2017) An accessible yet profound look at paths toward advanced AI and how to navigate its risks and possibilities.
Russell, Stuart. Human Compatible: Artificial Intelligence and the Problem of Control (Viking, 2019) Leading AI researcher dissects the key challenges and philosophical questions surrounding value alignment in advanced AI systems.
Irving, Geoffrey and Askell, Amanda. "AI Safety Needs Social Scientists" (Distill, 2019) Influential paper arguing social sciences are crucial to developing robust AI value learning frameworks.