Classical Conditioning
Ivan Pavlov: Won the Nobel Prize in 1904 for his work on digestive physiology, studying salivation in dogs.
Discovery of Classical Conditioning
Noticed that dogs began to salivate not just at food (dry food = natural salivation), but also at stimuli associated with food (like the dish, the person, or the lab). These neutral stimuli began to trigger salivation after being repeatedly paired with food.
Experiment Setup
Pavlov rang a bell (neutral stimulus) before giving food. After several repetitions, the bell alone triggered salivation.
Unconditioned Stimulus (US): Food – naturally causes a reaction.
Unconditioned Response (UR): Salivation to food – natural, unlearned.
Conditioned Stimulus (CS): Bell – originally neutral, becomes meaningful through pairing.
Conditioned Response (CR): Salivation to the bell – learned response.
Acquisition of Conditioned Responses
At first, a Conditioned Stimulus (CS) (like a bell) does not cause a Conditioned Response (CR) (like salivation). After repeated pairings with the Unconditioned Stimulus (US) (like food), the CS starts to trigger the CR. Learning is gradual—the strength of the CR builds up over time with more CS-US pairings.
Second-Order Conditioning
Once a CS (e.g., light) has been paired with a US (e.g., food) to elicit a CR (salivation), a new neutral stimulus (e.g., a bell) can be paired with the CS (light) to also trigger the CR—even without the US. This is called second-order conditioning. Example: If the sight of a dentist causes fear due to painful experiences (US), then related cues (dentist's office, voice, etc.) can also trigger fear. It explains how fears or emotional responses spread through associations.
Extinction
If the CS is presented without the US repeatedly, the CR gradually weakens and disappears. This is called extinction. Example: If a bell is rung but no food follows, over time, the dog will stop salivating to the bell. Extinction is not the same as forgetting: forgetting is slow; extinction can happen in just a few trials. Evidence: After a delay (with no exposure), the CR can return (spontaneous recovery).
Spontaneous Recovery
After extinction, if the animal is given a rest, the CS can again trigger the CR when presented. Shows that extinction doesn’t erase the original learning—it just suppresses it. Spontaneous recovery means the memory is still there; the animal is testing whether the CS is informative again.
Reconditioning
If the animal is conditioned again after extinction, it learns much faster than the first time. Suggests that some memory of the original learning remains.
Real-Life Example: Exposure Therapy
Used to treat phobias and anxiety. The feared stimulus (CS) is presented without danger or trauma (US). Over time, anxiety (CR) decreases = extinction. However, after therapy ends, anxiety can return = spontaneous recovery. Not a failure of therapy, just a sign that more sessions are needed.
Generalization
Definition: The tendency of a learned response (CR) to occur in the presence of stimuli that are similar, but not identical, to the original conditioned stimulus (CS).
Example: A dog trained to salivate at a specific tone will also salivate (less strongly) to other, similar tones.
Generalization Gradient: The more different a new stimulus is from the original CS, the weaker the conditioned response becomes.
Discrimination
Definition: The ability to distinguish between different stimuli, responding only to the CS+ (which is followed by the US) and not to similar but non-predictive stimuli (CS–).
Example: If a red light (CS+) signals a boat horn (US), and an orange light (CS–) never does, a person will eventually tense up only to the red light.
CS– Role: It signals the absence of the US, becoming an inhibitor that reduces the likelihood of the CR.
CS as a “Signal”
The CS works best when it predicts the arrival of the US.
Timing Matters:
Forward pairing (CS before US, short delay): Most effective.
Simultaneous pairing (CS and US at the same time): Less effective.
Backward pairing (US before CS): Least effective.
Analogy: Like a caution sign before a dangerous curve. Just before the curve = effective (CS predicts US). Too early = ineffective (CS too far ahead of US). During or after the curve = useless or confusing (simultaneous or backward pairing).
Contingency vs. Contiguity
Contiguity means the CS (Conditioned Stimulus) and US (Unconditioned Stimulus) occur close in time. Contingency means the CS predicts the likelihood of the US. Key Insight: Learning doesn’t happen just because two things happen close together (contiguity); instead, learning depends on whether the CS provides useful information about the US (contingency). 🐶 Example: The dog hears a metronome and gets food. Many other things (light, noise) are also present. But only the metronome reliably signals the arrival of food. That’s contingency.
The Role of Information
Animals (and humans) learn only from stimuli that give reliable info about what’s going to happen. If a stimulus is always present, regardless of whether the US comes or not (like light fixtures), it gives no predictive value and won’t be learned as a signal.
Experiment on Rats – The Role of Predictive Value
Group A: Shock sometimes follows the bell, but it also happens just as often without the bell → no contingency, no learning. Group B: Shock more likely after the bell than without → some contingency, learning occurs. Conclusion: Even an imperfect predictor (40% chance) can cause conditioning if it increases the likelihood of a US compared to baseline.
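To make contingency concrete, here is a minimal Python sketch of it as a difference between two conditional probabilities. The probability values are hypothetical, chosen only to mirror the two groups above, and the function name is my own:

```python
# Contingency = P(US | CS) - P(US | no CS).
# Positive -> the CS predicts the US; zero -> no contingency, no learning.
# All probability values below are illustrative, not from the original study.

def contingency(p_us_given_cs: float, p_us_given_no_cs: float) -> float:
    return p_us_given_cs - p_us_given_no_cs

# Group A: shock just as likely without the bell -> no contingency.
group_a = contingency(p_us_given_cs=0.40, p_us_given_no_cs=0.40)

# Group B: shock more likely after the bell -> contingency, so learning
# occurs even though the bell is only an imperfect (40%) predictor.
group_b = contingency(p_us_given_cs=0.40, p_us_given_no_cs=0.10)

print(f"Group A contingency: {group_a:+.2f}")  # +0.00
print(f"Group B contingency: {group_b:+.2f}")  # +0.30
```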
The Absence of Contingency
When tones and shocks are randomly paired, there’s no way to know when a shock is coming → no conditioning happens. If shocks only follow tones, even inconsistently (e.g. 50% of the time), animals learn because the tone predicts something. Key Concept: Unpredictability leads to chronic stress. When there’s a danger signal (e.g. tone), there’s also a sense of safety when it’s absent. Random shocks = constant anxiety.
Rescorla-Wagner Model
Learning happens when there’s a surprise, and it stops when things become predictable.
Core Idea
Your brain is constantly trying to predict what will happen. If something happens that’s unexpected, your brain says: "Whoa! I didn’t see that coming — I need to learn from this!"
How It Works
Let’s say the brain keeps a “score” of how much it expects the US (e.g., food or shock) after a CS (like a tone).
🧾 The Learning Formula:
Change in learning = How surprising the US is = (What actually happened) – (What was expected)
Or: ΔV = λ – V
Where:
ΔV = change in strength of learning
λ = the actual outcome (Was there food/shock? How strong?)
V = what was expected (How much did the animal already think the food/shock was coming?)
Key Points
Learning = surprise. No surprise = no learning.
Prediction gets updated each time based on error (difference between expected and actual outcome).
Eventually, when predictions are perfect, learning stops.
If a new CS (like a light) adds no new information, it won’t be learned — this explains blocking (an advanced concept, but tied to this model).
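To see the update rule in action, here is a minimal Python sketch of acquisition, extinction, and blocking under the formula above. One assumption: a learning rate α is added (the full Rescorla-Wagner model scales the prediction error by learning-rate parameters) so learning is gradual rather than one-shot; the value 0.3 is purely illustrative.

```python
# Minimal sketch of the Rescorla-Wagner update: dV = alpha * (lambda - V).
ALPHA = 0.3  # assumed learning rate, for illustration only

def rw_update(v: float, lam: float, alpha: float = ALPHA) -> float:
    """One trial: expectation moves toward the outcome by alpha * surprise."""
    return v + alpha * (lam - v)

v = 0.0  # no expectation of the US before training

# Acquisition: CS is always followed by the US (lambda = 1).
for _ in range(10):
    v = rw_update(v, lam=1.0)
print(f"After acquisition: V = {v:.2f}")  # climbs toward 1.0; gains shrink

# Extinction: CS presented alone (lambda = 0); the expectation decays.
for _ in range(10):
    v = rw_update(v, lam=0.0)
print(f"After extinction: V = {v:.2f}")   # falls back toward 0.0

# Blocking: a tone already predicts the US fully (V = 1), so a light
# presented alongside it shares a prediction error of zero and learns nothing.
v_tone, v_light = 1.0, 0.0
for _ in range(10):
    error = 1.0 - (v_tone + v_light)  # error is computed on the compound
    v_tone += ALPHA * error
    v_light += ALPHA * error
print(f"Blocked light: V = {v_light:.2f}")  # stays at 0.00
```

Note how each print line matches a phenomenon from the notes: diminishing gains during acquisition (less surprise each trial), decay during extinction, and zero learning for the redundant light (blocking).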
CR and UR are NOT always the same
UR (Unconditioned Response) happens naturally after the US (Unconditioned Stimulus). CR (Conditioned Response) is a learned reaction to the CS (Conditioned Stimulus). They might look different even though they're both “responses.”
Example:
Rat + shock (US) → jumps and squeals (UR)
Flashing light (CS) → freezes and heart slows (CR = preparing for shock)
CR = “Get ready!” response. Not a direct copy of the UR—it's an anticipatory adjustment.
CR helps the body prepare
Animals aren’t reacting randomly — the CR prepares them for what's coming.
Example:
Tone → signals food is coming → dog moistens mouth
Light → signals shock → animal freezes, alert
📌 This preparation makes the animal more efficient or safe.
Conditioning and Drug Tolerance
Repeated drug use (like heroin) leads to tolerance: you need more to feel the same effect. Why? Because the body learns to compensate in advance. How?
US = heroin
UR = drug’s biological effects (pain relief, dry mouth, good mood)
CS = sight of needle, drug environment
CR = opposite of the drug’s effects (more pain sensitivity, bad mood, wet mouth)
📌 CR is a compensatory response = the body is trying to stay in balance (homeostasis).
Drug Craving = CR with no US
If a user sees the CS (needle, place, routine) but no drug arrives, the CR still happens. So they feel pain, depression, cravings — the opposite of what the drug would’ve done. 📌 Craving = your body bracing for a drug that doesn’t come.
Insight Learning
What is Insight Learning?
Insight learning is a type of learning that happens suddenly, through understanding relationships between different parts of a problem, rather than through trial-and-error.
It involves a sudden realization—an “Aha!” moment—where the solution just clicks.
Insight is not based on conditioning or reinforcement, but on cognitive restructuring of the problem.
Background & Theorist: Wolfgang Köhler
Köhler was a Gestalt psychologist who studied problem-solving in chimpanzees.
Gestalt psychology emphasizes holistic processing—how we perceive whole patterns, not just bits and pieces.
Köhler’s Famous Experiments with Chimps
🐒 Example 1: Sultan and the Stick. Setup: A chimpanzee named Sultan was placed in a cage with a banana just out of reach, and sticks nearby. Process: Sultan tried reaching it unsuccessfully, then stopped and seemed to think. Insight: Suddenly, Sultan used one stick to pull another closer and joined them to reach the banana. 👉 He did not arrive at this through gradual trial-and-error—it came suddenly.
🐒 Example 2: Box Stacking. Chimps were given boxes and a hanging banana. They stacked the boxes and climbed them to get the banana, showing understanding of spatial relationships.
Key Characteristics of Insight Learning
Suddenness: The solution appears all at once (the “aha!” experience).
Understanding: Involves grasping the structure of the problem, not random attempts.
No trial-and-error: Unlike conditioning, it doesn’t rely on repeated errors or reinforcement.
Transferability: The solution or principle can often be applied to new but similar problems.
Requires mental reorganization: The learner reinterprets the problem and mentally restructures the elements.
Insight Learning vs. Trial-and-Error Learning
Insight learning: cognitive and sudden. Trial-and-error learning: behavioral and gradual.
Insight learning: based on perception and problem analysis. Trial-and-error learning: based on repeated attempts and failures.
Insight learning: may take longer to reach, but the solution arrives all at once. Trial-and-error learning: gradual improvement over time.
Example: solving a riddle by rethinking it (insight) vs. trying keys one by one to open a lock (trial-and-error).
📌 Implications of Insight Learning
Shows the Role of Cognition: Learning isn’t always about reinforcement—thinking matters.
Applies to Humans & Animals: Though common in humans (especially in problem-solving), it has been shown in chimpanzees, birds, and other species.
Relevance in Education: Encourages the design of learning environments that foster critical thinking rather than rote memorization. Helps explain creative problem-solving in real-life situations.
Real-Life Examples: Figuring out a tricky riddle after staring at it for a while. A child suddenly realizing how to tie shoelaces after watching but not previously succeeding. An inventor seeing a solution to a problem after stepping away from it and then suddenly “seeing” the answer.
Operant/Instrumental Conditioning
What is Instrumental Conditioning?
Also known as Operant Conditioning. Involves learning voluntary behaviors (as opposed to automatic reflexes). Behavior is initiated by the organism, not triggered by an external stimulus. The outcome of the behavior (its consequence) shapes future behavior.
Classical vs. Operant Conditioning
Classical: Involuntary/reflexive responses (e.g., salivation). Instrumental: Voluntary actions (e.g., pressing a lever to get food).
Thorndike and the Law of Effect
Puzzle Box Experiment:
Hungry cats were placed in a box with a mechanism (like a loop or lever) to escape and access food. First attempts: random behaviors (biting, scratching). Eventually, by trial and error, they hit the correct action. With repetition, escape time gradually decreased—indicating learning.
Insight or Gradual Learning?
Thorndike found no sudden "Aha!" moment. The learning curve was gradual, not abrupt → suggests no reasoning, just reinforcement.
The Law of Effect:
Behavior followed by a reward → strengthened. Behavior followed by no reward or punishment → weakened. The animal doesn’t need insight or understanding—just consequences.
📌 Learning = responses are "stamped in" (if rewarded) or "stamped out" (if not).
🔁 Parallel to Natural Selection: Like evolution, successful behaviors "survive" and useless ones fade. No conscious direction—just selection based on outcomes.
Skinner and Operant Behavior
Key Ideas: Skinner distinguished operant from classical conditioning. Classical: Response is elicited by a stimulus. Operant: Response is emitted voluntarily by the organism. He called these voluntary responses operants because they operate on the environment. 📍 Core principle: Behavior + Positive Consequence = More Likely in Future. Behavior + Negative Consequence = Less Likely.
The Skinner Box: A controlled chamber where animals (rats, pigeons) could perform behaviors like pressing a lever or pecking a key for food. Allowed for rapid, repeated trials. Measured response rate = # of behaviors per unit of time. ✅ Advantage: More efficient than Thorndike's puzzle box (didn't need to reset after each trial).
Differences Between Classical and Instrumental Conditioning
Classical Conditioning: Learning about the relationship between two stimuli (e.g., bell and food). The response is automatic or reflexive (UR). Instrumental Conditioning (Operant Conditioning): Learning about the relationship between a response and its consequence (reinforcer or punishment). The response is voluntary. Despite differences, both involve learning relationships among events and share key phenomena (like extinction, generalization, and discrimination).
Learning Trials and Extinction
In classical conditioning: CS followed by US leads to learning. In instrumental conditioning: Response followed by reinforcer leads to learning. Extinction happens when reinforcement stops: the behavior gradually weakens or disappears.
Generalization and Discrimination
Generalization: After learning a response to one stimulus (S+), animals often respond similarly to similar stimuli. The further the test stimulus is from the original S+, the weaker the response (seen in pigeons trained with light colors).
Discrimination: Animals learn to distinguish between stimuli that signal different outcomes. S+ (positive discriminative stimulus): Signals reinforcement. S– (negative discriminative stimulus): Signals no reinforcement. Example: A child behaves better when parents are around (S+) than when they’re not (S–).
Complex Discriminations
Animals (like pigeons) can make surprisingly complex discriminations: water vs. non-water pictures, trees vs. non-trees, recognizing individual humans from varied angles. Shows learning goes beyond simple sensory cues and includes abstract categories.
Shaping (Successive Approximations)
Used to teach complex or unlikely behaviors. Reinforcement is given step-by-step as the animal’s behavior gradually approximates the desired action. Example: To teach a rat to press a high lever, reinforce being near the lever → facing it → raising head → touching lever → pressing lever.
What is a Reinforcer?
Primary Reinforcers: Naturally rewarding (e.g., food, water). Social Reinforcers: Praise, smiles, etc. Conditioned Reinforcers: Gain value by being associated with primary reinforcers (e.g., money). Some reinforcers are informational or experiential (e.g., watching a toy train, using a wheel).
Behavioral Contrast
The effectiveness of a reinforcer depends on context and past experience. A reward may seem large or small depending on what the subject was used to before. Example: 16 food pellets feel small after 60 but generous after 4.
Intrinsic Motivation and the Overjustification Effect
Overjustification Effect: External rewards can reduce intrinsic interest in an activity. Example: Children who initially liked drawing became less interested after being rewarded with “Good Player” certificates and then having those rewards removed.
Schedules of Reinforcement
Partial Reinforcement: Behavior is reinforced only sometimes, not every time. Effect: Leads to greater resistance to extinction (we keep trying even if not always rewarded).
Types of Reinforcement Schedules
Ratio Schedules (based on number of responses):
Fixed Ratio (FR): Reinforcement after a set number of responses. E.g., FR 2 → reward after every 2 responses.
Variable Ratio (VR): Reinforcement after a varying number of responses (average-based). E.g., VR 10 → reward on average after 10 responses (might be 5, then 15, etc.). Common in gambling (e.g., slot machines).
Interval Schedules (based on passage of time):
Fixed Interval (FI): First response after a fixed time is rewarded. E.g., FI 3 minutes → reward given after 3 minutes if a response is made.
Variable Interval (VI): The time interval changes; the average time determines the reward schedule. E.g., VI 8 minutes → reward on average after 8 minutes.
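To illustrate how the two ratio schedules differ, here is a minimal Python sketch (function names and parameter values are my own, for illustration; interval schedules would track elapsed time instead of response counts) that marks which responses earn a reward:

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

def fr_rewards(n_responses: int, ratio: int) -> list[bool]:
    """Fixed Ratio: every `ratio`-th response is rewarded."""
    return [r % ratio == 0 for r in range(1, n_responses + 1)]

def vr_rewards(n_responses: int, mean_ratio: int) -> list[bool]:
    """Variable Ratio: reward after a random count that averages `mean_ratio`."""
    rewards = []
    since_last, next_at = 0, random.randint(1, 2 * mean_ratio - 1)
    for _ in range(n_responses):
        since_last += 1
        hit = since_last >= next_at
        rewards.append(hit)
        if hit:  # reset the count and draw a new random requirement
            since_last, next_at = 0, random.randint(1, 2 * mean_ratio - 1)
    return rewards

# FR 2: perfectly predictable. VR 3: unpredictable, like a slot machine.
print("FR 2:", fr_rewards(10, ratio=2))
print("VR 3:", vr_rewards(10, mean_ratio=3))
```

The unpredictability of the VR output is what ties this back to partial reinforcement: since the very next response could always be the rewarded one, the behavior is highly resistant to extinction.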
Contingency in Instrumental Conditioning
Contingency: Behavior must predict the reward—not just follow it. Similar to classical conditioning: prediction, not just pairing, is key. The likelihood of reward must be greater with the behavior than without it. 🔧 Control Matters: Organisms like having control over outcomes. When individuals can predict and influence rewards, learning is enhanced.
Experiment: Infants and Mobiles
Group 1: Infants could control the mobile by moving their heads. Result: Enjoyed and engaged with the mobile. Group 2: The mobile moved independently, not due to the infant’s action. Result: Infants lost interest. Key Point: Even 2-month-old infants prefer control and enjoy mastery.
Learned Helplessness
The Dog Study: Group A: Dogs could stop shocks by pressing a panel. Group B: Received the same shocks, but had no control—shocks were inescapable. Later task: Jump a barrier to avoid shock. Group A: Learned and escaped quickly. Group B: Became passive and didn’t try to escape—even though it was possible.
Learned Helplessness: When a previous lack of control leads to a belief that future attempts are useless. Leads to passivity, even when escape or success is possible.
Latent Learning
Edward Tolman: Believed learning is more than behavior change—it's acquiring knowledge.
Latent Learning:
Learning that occurs without any obvious reinforcement and does not immediately manifest in behavior. The learned knowledge becomes apparent only when there is motivation to demonstrate it.
Classic Experiment (Tolman & Honzik, 1930):
Rats explored a maze for 10 days with no reward → no visible change in behavior. On Day 11, food was introduced at the goal box → rats quickly and accurately ran to the food. This showed they had formed a mental map of the maze during unrewarded exploration.
Mental Maps (Cognitive Maps):
Internal representations of the environment. Allow organisms to navigate spaces efficiently even without direct reinforcement.
Implications
Learning ≠ Immediate Behavior Change: Just because behavior hasn’t changed doesn’t mean learning hasn’t happened. Supports the Cognitive Perspective: Emphasizes the role of internal cognitive processes, not just stimulus-response links. Challenge to Behaviorism: Opposes the strict behaviorist view (e.g., Thorndike’s law of effect) that learning only occurs via reinforcement. Practical Relevance: Students may learn a lot during lectures without immediately demonstrating it; skills and knowledge can emerge when they become relevant or useful. Used by Many Species: Not limited to humans—many animals develop mental maps for foraging, navigation, etc.
Observational Learning
What is Observational Learning?
Learning by watching others and imitating their behavior.
Also called social learning or vicarious learning.
No direct experience or reinforcement needed—just observation.
Once thought to be uniquely human, but now observed in many animals too. 📚 Examples: Monkeys learn fear by watching another monkey react fearfully. Pigeons imitate behaviors like pecking or stepping to get rewards after watching others.
Mirror Neurons
Special neurons in the frontal lobe near the motor cortex. They fire both when you perform an action and when you see someone else perform the same action. They help with understanding others’ actions and imitating them. Found in many species, including humans.
Human Imitation Starts Early
Infants imitate facial expressions within the first month of life. Later, they mimic a wide range of behaviors. Two types of imitation: Mimicry: Copying the exact behavior. Modeling: Learning general rules or what behavior is "okay" in a setting.
Deferred imitation: Imitating an action after a delay, not right after seeing it. The behavior is observed first, then reproduced later, sometimes hours, days, or even weeks afterward.
Bandura's Bobo Doll Experiment
Albert Bandura wanted to test whether children learn aggressive behavior by observing adults.
Experiment Setup
Participants: Preschool children (around 3–6 years old). Groups: Aggressive model: Kids watched an adult physically and verbally attack a Bobo doll (e.g., hitting, kicking, saying "pow!"). Non-aggressive model: Kids watched an adult play quietly and nicely with toys. Control group: Kids saw no model.
Results
After watching the adult, kids were taken to a room with toys, including a Bobo doll. Children who saw the aggressive model were more likely to imitate the aggression — even using the same actions and words. Some children went beyond imitation, showing new aggressive behaviors.
Key Findings
Children learn social behavior like aggression through observation, not just direct reinforcement. Modeling matters: kids imitate what they see, especially if the model is powerful or similar to them. This learning can be immediate or delayed (deferred imitation). Boys showed more physical aggression than girls, but both imitated the behavior.
4 Steps of Observational Learning
Attention: You have to notice the behavior.
Retention: You have to remember the behavior.
Reproduction: You must be able to replicate it.
Motivation: You must want to do it.
Vicarious reinforcement: Learning by watching someone else get rewarded for a behavior, which makes us more likely to perform that behavior ourselves.
Characteristics of the Model That Affect Learning:
Perceived Similarity: People are more likely to imitate models who are similar to them (in age, gender, interests, etc.).
Perceived Competence: If the model appears skilled or knowledgeable, observers are more likely to imitate them.
Status and Prestige: Models with high social status (celebrities, teachers, respected peers) are more influential.
Warmth and Nurturance: Models who are kind, friendly, and caring tend to be imitated more often.
Power or Authority: Models who hold authority or power (like parents or police officers) can strongly influence behavior.
Consistency of Behavior: Consistent behavior across situations makes a model more trustworthy and worth copying.
Reinforcement or Punishment Observed: If the model is rewarded, the observer is more likely to imitate. If the model is punished, the behavior is less likely to be copied.