Dynamic AI
Co-Creation

A Human-Centered Approach
by Will Luers

Created through the Digital Publishing Initiative at The Creative Media and Digital Culture program, with the support of the OER Grants at Washington State University Vancouver.

Chapter 2: AI Foundations

1. Types of Intelligence

AI systems, particularly those based on machine learning and neural networks, rely on algorithms and structured processes to learn from data and make decisions. These mechanistic processes, governed by clear rules and mathematical models, operate in a deterministic manner, meaning their outputs result from specific inputs and predefined algorithms. Even probabilistic models in AI adhere to established statistical rules and require vast amounts of data to learn and improve systematically.

In contrast, Human Intelligence arises from the brain's complex and not entirely understood network of neurons and synapses, involving biochemical processes, electrical activity, and possibly quantum effects. Unlike the deterministic nature of AI, human decision-making and thought processes are influenced by various unpredictable factors, including emotions, experiences, and consciousness. Humans integrate sensory inputs, past experiences, emotions, and cognitive processes holistically and dynamically, involving both conscious and unconscious processes. This unique integration enables self-awareness, adaptability, and creativity, often inspired by abstract concepts and personal experiences. Human learning is influenced by social interactions, cultural context, and personal experiences, which extend beyond the structured learning processes typical of AI. This expansive potential of human intelligence, contrasted with AI's powerful yet narrowly focused computational abilities and access to big data, highlights the profound differences and complementary strengths of human and artificial intelligences.

As we explore the basics of machine intelligence, it is important to qualify the term "intelligence" as it applies to AI compared to the intelligence of human embodied consciousness. While AI can process vast amounts of data and perform specific tasks with high efficiency, human intelligence encompasses a broader range of capabilities, including emotional depth, creativity, and self-awareness. Understanding the technology behind AI will help clarify these distinctions, highlighting the unique strengths and limitations of both machine and human intelligences.

Machine Intelligence

The field of AI has developed various taxonomies for categorizing machine intelligence systems by their depth of capability:

  • Reactive Machines exhibit reflexive capabilities without contextual memory. IBM's Deep Blue, for example, played chess by evaluating only the current board position.
  • Limited Memory systems use short-term memory to perceive and react to inputs. Autonomous vehicles, for example, rely on recent sensor data to navigate and avoid obstacles.
  • Theory of Mind AI would model human traits, social intelligence, and even self-awareness. Advanced systems in development aim to understand and predict human emotions and behaviors.

Human Intelligence

In contrast, frameworks for modeling human intelligence highlight the multifaceted analytical, creative, and contextual nature of the human mind. Several influential frameworks and theories illustrate the complexity of embodied human intelligence:

Sternberg's Triarchic Theory:

  • Analytical Intelligence: The ability to analyze, evaluate, judge, compare, and contrast. Problem-solving in academic tests.
  • Creative Intelligence: The capacity to create, design, invent, originate, and imagine. Artistic expression and innovation in various fields.
  • Practical Intelligence: The ability to use, apply, implement, and put ideas into practice. Navigating social environments and managing daily tasks.

Gardner's Multiple Intelligences:

  • Linguistic Intelligence: Sensitivity to spoken and written language. Poets and writers.
  • Logical-Mathematical Intelligence: Capacity to analyze problems logically and carry out mathematical operations. Scientists and mathematicians.
  • Spatial Intelligence: Ability to think in three dimensions. Architects and artists.
  • Bodily-Kinesthetic Intelligence: Using one's whole body or parts of the body to solve problems. Athletes and dancers.
  • Musical Intelligence: Skill in performance, composition, and appreciation of musical patterns. Musicians and composers.
  • Interpersonal Intelligence: Ability to understand and interact effectively with others. Teachers and therapists.
  • Intrapersonal Intelligence: Capacity to understand oneself. Philosophers and psychologists.
  • Naturalist Intelligence: Ability to recognize and categorize plants, animals, and other aspects of nature. Biologists and environmentalists.

PASS Theory:

  • Planning: The ability to solve problems, make decisions, and take actions to achieve goals. Strategizing in business or personal projects.
  • Attention: The capacity to maintain focus on relevant stimuli and tasks. Concentration during complex tasks.
  • Simultaneous Processing: The ability to integrate separate elements into coherent wholes. Understanding complex narratives in literature.
  • Successive Processing: The capacity to process information in a specific, serial order. Following step-by-step instructions.

2. Machine Learning

The current era of rapid AI growth has its roots in pivotal late 20th century developments that transitioned the field from narrow rule-based systems to the flexibly programmable machines leveraging vast datasets that we interact with today. Machine learning, a subset of artificial intelligence, involves the use of algorithms and statistical models that enable computers to perform specific tasks without explicit instructions by relying on patterns and inference instead. This paradigm shift has had a profound impact, allowing for the development of systems that can learn from data, improve over time, and make decisions with minimal human intervention, thus revolutionizing industries ranging from healthcare to finance to entertainment.
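
To make "learning from data" concrete, here is a minimal sketch, assuming scikit-learn is installed, in which a classifier is never given explicit rules for telling iris species apart: it infers a pattern from labelled examples and is then evaluated on examples it has not seen.

    # A minimal sketch of learning from data rather than explicit rules,
    # assuming scikit-learn is installed.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = DecisionTreeClassifier().fit(X_train, y_train)  # infer patterns from examples
    print(model.score(X_test, y_test))  # accuracy on unseen data, typically above 0.9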

The Turing Test, introduced by Alan Turing in 1950, challenged researchers to create machines that could exhibit intelligent behavior indistinguishable from that of a human. This ushered in the earliest symbolic AI approaches attempting to reduce intelligence into abstract logical operations. Symbolic AI, also known as classical AI or GOFAI (Good Old-Fashioned Artificial Intelligence), is an approach to artificial intelligence that focuses on the use of high-level, human-readable symbols to represent problems, logic, and knowledge. This method relies on explicit, rule-based systems and formal logic to process information and solve problems, often using if-then rules, decision trees, and knowledge graphs. Symbolic AI systems are designed to mimic human reasoning by manipulating symbols and applying rules to derive conclusions or actions.

Eliza

Eliza was an early natural language processing computer program created in the mid-1960s by Joseph Weizenbaum. Designed to simulate a conversation with a psychotherapist, the program used simple pattern matching and substitution methodology to give the illusion of understanding. While Eliza was not capable of genuine understanding or intelligent conversation, its responses were often sufficiently convincing for users to feel as though they were engaging in meaningful dialogue. However, despite its ability to mimic certain aspects of human conversation, Eliza would not pass the Turing Test as it could not truly understand or generate intelligent, context-aware responses beyond its pre-programmed scripts.

Eliza: natural language processing computer program created by Joseph Weizenbaum

Despite these early efforts, the symbolic AI approach languished for decades as researchers struggled to make brittle rule-based systems generalize beyond their narrow domains.

Machine learning techniques developed in the 1990s and 2000s, such as neural networks and deep learning models, empowered software to ingest and learn patterns from large datasets. These probabilistic methods finally allowed AI to acquire knowledge in more organic, scalable ways, rather than through manually coded rules. By analyzing vast amounts of data, machine learning algorithms can recognize intricate patterns and make predictions, leading to significant advancements in AI's ability to handle a wide range of tasks.

Milestones like IBM's Deep Blue defeating world chess champion Garry Kasparov in 1997, along with more recent achievements in computer vision, speech recognition, and autonomous vehicles, highlighted AI's expanding capability across complex problem spaces once thought exclusive to human aptitude.

Following Deep Blue, AlphaGo, developed by DeepMind, demonstrated AI's prowess in 2016 by defeating world champion Go player Lee Sedol, showcasing the power of deep learning and reinforcement learning in mastering games of immense complexity.

AlphaGo

AlphaGo, developed by DeepMind, is an AI program that made headlines in 2016 by defeating world champion Go player Lee Sedol. Unlike previous programs, AlphaGo used deep learning and reinforcement learning to master the ancient and highly complex game of Go. Its success demonstrated the potential of AI to tackle problems requiring strategic thinking and intuition, sparking widespread interest in the capabilities of AI and accelerating advancements in various fields such as healthcare, finance, and autonomous systems.

AlphaGo vs Lee Sedol Hand of God Move 78 Reaction and Analysis

3. Neural Networks

At the core of machine learning capabilities are neural networks, inspired by the scientific understanding of the human brain's neurons and their system of non-linear processes. Machine neural networks consist of layers of interconnected nodes that can learn to recognize patterns in data, making predictions or decisions without being explicitly programmed with rules. Their ability to automatically learn and model tremendously complex relationships from data has been turbocharged by the current deluge of Big Data from the digital world.
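
As a toy-sized illustration of layers of interconnected nodes learning a pattern from data, the sketch below trains a tiny two-layer network on the XOR function using NumPy; the layer sizes, learning rate, and iteration count are arbitrary choices for this example, not values from the text.

    import numpy as np

    # A tiny two-layer neural network that learns XOR from four examples.
    rng = np.random.default_rng(42)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer weights
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output layer weights
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    for _ in range(20000):
        # Forward pass: inputs flow through interconnected "nodes".
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backpropagation: nudge weights to reduce prediction error.
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
        W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

    print(out.round(2))  # typically approaches [0, 1, 1, 0] after training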

Neural Networks

This torrent of multimodal data sources - text, images, audio, video - has acted as rocket fuel for new generative AI models that can create novel content rather than just analyze or categorize existing data. Key generative architectures include:

  • Transformers: The backbone of large language models (LLMs) like GPT. They work by processing text in chunks and paying special attention to the important parts of the text, which allows them to generate coherent and relevant sentences on a wide range of topics.
  • Variational Autoencoders: Used in models like DALL-E to create images from text descriptions. They work by first compressing information into a simpler form, then learning how to rebuild it. This process helps the model understand how words can be turned into visual elements, allowing it to generate images based on what the text describes.
  • Generative Adversarial Networks (GANs): These models have two parts: a generator, which creates content like images or videos, and a discriminator, which checks if this content looks real. The two parts compete with each other: the generator tries to make content that can fool the discriminator, and the discriminator gets better at spotting what's fake. Over time, this back-and-forth process helps the model produce highly realistic content (a minimal training-loop sketch follows this list).
  • Flow-based Models: These models generate things like audio, video, or 3D content by learning the probability of different elements occurring together in the data. They create new content by carefully following these probabilities, which makes the generation process more structured and efficient.
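
To make the generator-versus-discriminator competition concrete, here is a minimal sketch, assuming PyTorch, that trains a toy GAN to imitate samples from a simple one-dimensional Gaussian; the network sizes, learning rates, and target distribution are arbitrary illustrative choices.

    import torch
    import torch.nn as nn

    # Toy GAN: learn to generate samples resembling a 1-D Gaussian.
    generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
    discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
    g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()

    for step in range(2000):
        real = torch.randn(32, 1) * 2.0 + 3.0   # "real" data samples
        fake = generator(torch.randn(32, 8))     # generated samples

        # Discriminator learns to label real as 1 and fake as 0.
        d_opt.zero_grad()
        d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
                 loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
        d_loss.backward()
        d_opt.step()

        # Generator learns to make the discriminator call its output real.
        g_opt.zero_grad()
        g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
        g_loss.backward()
        g_opt.step()

    print(generator(torch.randn(5, 8)).detach().squeeze())  # samples drifting toward the target distribution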

What first emerged as esoteric experiments has rapidly evolved into powerful, user-friendly tools like DALL-E, ChatGPT, and GitHub Copilot that leverage generative AI to augment and extend human creativity and productivity in profound ways.

4. LLMs, GPTs and GANs

Large Language Models (LLMs) are a type of AI designed to understand and generate human language. These models learn from vast amounts of text data, enabling them to produce coherent and contextually appropriate responses. LLMs can be used for a variety of tasks, including translation, summarization, and content creation.

Introduction to large language models

Generative Pre-trained Transformer (GPT) models are a specific type of LLM built on the transformer architecture. GPT models have been at the forefront of generative AI breakthroughs, particularly in the realm of language. They excel at generating fluent, contextually relevant text across a wide range of topics by using attention mechanisms to understand the relationships between words in a sentence.
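
To show what the attention mechanism does mechanically, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside transformer models; the random vectors stand in for word embeddings and are purely illustrative.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Each token's query is compared against every key; the resulting
        # weights decide how much of each value flows into the output.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = softmax(scores, axis=-1)      # each row sums to 1
        return weights @ V, weights

    # Toy self-attention over 4 "tokens", each a 3-dimensional vector.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))
    output, weights = scaled_dot_product_attention(x, x, x)
    print(weights.round(2))  # how strongly each token attends to the others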

5. Prompts and Chats

To successfully harness the full capabilities of large language models and generative AI tools, the practice of prompt engineering has emerged as a crucial skill. Prompts act as the interface, defining the inputs and directives that guide the AI system towards the user's desired outputs.

Better prompts lead to better outputs. This has driven active research and experimentation into prompting techniques. However, crafting effective prompts is not just about clarity and logic. The way we use language—whether breaking a task down into steps or using expressive, even poetic language—can greatly influence the AI's response. Logical, step-by-step prompts help the AI process complex tasks accurately, but at times, using metaphors or rich, descriptive language can help capture the subtleties of what you're aiming for.

This is why individuals with a deep understanding of literature, art, history, and philosophy often excel at prompt engineering. Their ability to use language creatively and thoughtfully allows them to craft prompts that guide AI in producing more nuanced and meaningful outputs. By combining logical structure with expressive language, they can achieve results that are not only precise but also deeply resonant with human experience.

One strategy to improve prompt results is meta-prompting: creating higher-level prompts that work out the desired steps for later, more specific prompts. This technique can also be used to have an AI help write the prompts for a project.

For example, a meta-prompt for digital publishing tasks might be:

"With any uploaded manuscript I submit in a chat, perform a comprehensive digital publishing workflow. First, proofread and copyedit the text, listing all errors and corrections. After receiving approval for these corrections, format the revised text into a web-friendly HTML file, ensuring proper use of headings, paragraphs, and other HTML elements. Next, create a metadata file that includes keywords, descriptions, and other relevant details for search engine optimization (SEO). Finally, generate a series of social media posts to promote the published content, highlighting key points and insights from the text. Summarize each step of the process in a detailed report."

To fulfill this task, a language model would need to:

  • Perform a detailed proofreading and copyediting of the provided manuscript, identifying and listing all errors and corrections.
  • Revise the text according to approved corrections and format it into a structured HTML file, ensuring proper use of HTML elements like headings, paragraphs, links, and lists.
  • Create a metadata file with relevant keywords, descriptions, and other details to optimize the text for search engines.
  • Generate a series of social media posts that highlight key points and insights from the text, tailored for different platforms.
  • Summarize each step of the process in a detailed report, including the rationale behind decisions and methods used.

The example prompt above doesn't just ask the AI to perform a single task but outlines a structured sequence of tasks that the AI should follow. It’s a higher-level instruction that guides the AI through multiple distinct actions, each requiring its own prompt-like directive (proofreading, formatting, metadata creation, social media promotion). By laying out these steps, the prompt effectively acts as a "prompt for generating prompts" that directs the AI to handle a complex, multi-step workflow autonomously.
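
A multi-step workflow like this can also be orchestrated programmatically as a chain of prompts. The sketch below is only an illustration: ask_llm() is a hypothetical placeholder for whatever language-model API you use, and the prompt wording is condensed from the example above.

    # Hypothetical sketch of chaining prompts for a publishing workflow.
    def ask_llm(prompt: str) -> str:
        # Placeholder: swap in a real call to your chosen language model.
        return f"[model response to: {prompt[:60]}...]"

    def publishing_workflow(manuscript: str) -> dict:
        results = {}
        # Step 1: proofread and copyedit, listing errors and corrections.
        results["edits"] = ask_llm(
            "Proofread and copyedit the following manuscript, listing every "
            "error and correction:\n\n" + manuscript)
        # Step 2: format the revised text as web-friendly HTML.
        results["html"] = ask_llm(
            "Format this revised text as semantic HTML with headings, "
            "paragraphs, and lists:\n\n" + results["edits"])
        # Step 3: generate SEO metadata.
        results["metadata"] = ask_llm(
            "Write SEO keywords and a description for:\n\n" + results["html"])
        # Step 4: draft social media posts highlighting key insights.
        results["posts"] = ask_llm(
            "Write three short social media posts promoting this piece:\n\n"
            + results["edits"])
        return results

    print(publishing_workflow("Once upon a time...")["metadata"])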

Crafting an effective prompt involves adhering to some key principles: clear and concise descriptions, precise specification of the desired output, proper context setting, providing relevant examples, and guiding the AI with step-by-step instructions. Below is a list of standard techniques that can help you achieve these goals (a short code sketch of few-shot prompting follows the list):

Prompt Techniques with Examples

  • Few-Shot Prompting: Providing the AI with a few examples to guide its response.
    Example: "Translate the following: 'Good morning' -> 'Buenos días', 'Thank you' -> 'Gracias', 'Please' ->" - Response: "Por favor."
  • Chain-of-Thought Prompting: Breaking down a task into a series of logical steps for the AI to follow.
    Example: "Explain step by step how to bake a cake." - Response: "1. Preheat the oven... 2. Mix the ingredients..."
  • Meta Prompting: Asking the AI to generate or refine its own prompts to better accomplish a task.
    Example: "How should I ask you to summarize a book?" - Response: "Ask me to 'Summarize the key plot points and themes of the book in a few sentences.'"
  • Prompt Chaining: Connecting multiple prompts together, where each prompt builds on the previous response.
    Example:
    Prompt 1: "What is the capital of France?"
    Response: "Paris."
    Prompt 2: "Describe Paris as a tourist destination."
    Response: "Paris is known for its iconic landmarks such as the Eiffel Tower and the Louvre Museum..."
  • Retrieval-Augmented Generation (RAG): Combining AI-generated content with information retrieved from external sources.
    Example: "Provide a summary of the latest research on climate change." - Response: "Retrieving information... The latest research suggests..."
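
As a small code illustration of few-shot prompting, the sketch below assembles a prompt from a handful of example translations before appending the new phrase; the build_few_shot_prompt() helper is hypothetical and not tied to any particular model or API.

    # Building a few-shot prompt: examples first, then the new case.
    examples = [
        ("Good morning", "Buenos días"),
        ("Thank you", "Gracias"),
    ]

    def build_few_shot_prompt(new_phrase: str) -> str:
        lines = ["Translate the following English phrases into Spanish."]
        for english, spanish in examples:
            lines.append(f"'{english}' -> '{spanish}'")
        lines.append(f"'{new_phrase}' ->")  # the model completes this line
        return "\n".join(lines)

    print(build_few_shot_prompt("Please"))
    # Sent to a language model, this prompt should be completed with "Por favor."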

Prompting Principles

  • Clarity: Ensure your prompt or metaprompt is clear and specific to avoid ambiguity.
  • Context: Provide sufficient context to guide the AI in generating relevant responses or content.
  • Conciseness: Keep prompts concise to maintain focus and prevent confusion.
  • Relevance: Make sure the prompt is directly related to the desired output or topic.
  • Detail: Include necessary details to help the AI understand the nuances and specifics of the task.
  • Creativity: Encourage creative responses by framing prompts that allow for open-ended answers.
  • Iterative Refinement: Refine prompts based on initial outputs to improve accuracy and relevance.
  • Tone: Set the appropriate tone and style to match the intended audience and purpose.
  • Experimentation: Experiment with different phrasings and structures to find the most effective prompts.
  • Feedback: Provide feedback on the AI’s responses to continually improve its performance.

6. Unit Exercise

To develop a grounded understanding of prompt engineering, this unit has you engage in a hands-on exercise with a language model such as ChatGPT or Claude.

This experience prompting an AI system will surface insights about its strengths, limitations, and biases, and about the principles required to responsibly guide the development of such systems. Log your process and reflections.

7. Discussion Questions

8. Bibliography
