Reinforcement
In classical (Pavlovian) conditioning, where the response has no effect on whether the stimulus will occur, a reinforcer produces an immediate response without any training or conditioning. When meat is offered to a hungry dog, the dog does not learn to salivate; the behavior occurs spontaneously. Similarly, a negative reinforcer, such as an electric shock, produces an immediate, unconditioned escape response.
To produce a classically conditioned response, the positive or negative reinforcer is paired with a neutral stimulus until the two become associated with each other. Thus, if the sound of a bell repeatedly accompanies a negative stimulus such as an electric shock, the experimental subject will eventually be conditioned to produce an escape or avoidance response to the sound of the bell alone.
Once conditioning has created an association between a certain behavior and a neutral stimulus, such as the bell, this stimulus itself may serve as a reinforcer to condition future behavior. When this happens, the formerly neutral stimulus is called a conditioned reinforcer, as opposed to a naturally positive or negative reinforcer, such as food or an electric shock.
In operant conditioning (as developed by B. F. Skinner), positive reinforcers are rewards that strengthen a conditioned response after it has occurred, such as feeding a hungry pigeon after it has pecked a key. Negative reinforcers are unpleasant stimuli that are removed when the desired response has been obtained. The application of negative reinforcement may be divided into two types: escape and avoidance conditioning.
In escape conditioning, the subject learns to escape an unpleasant or aversive stimulus (a dog jumps over a barrier to escape electric shock). In avoidance conditioning, the subject is presented with a warning stimulus, such as a buzzer, just before the aversive stimulus occurs and learns to act on it in order to avoid the unpleasant stimulus altogether.
Reinforcement may be administered according to various schedules. A particular behavior may be reinforced every time it occurs, which is referred to as continuous reinforcement. In many cases, however, behaviors are reinforced only some of the time, which is termed partial or intermittent reinforcement.
Reinforcement may also be based on the number of responses or scheduled at particular time intervals. In addition, it may be delivered regularly or irregularly. These two variables combine to produce four basic types of partial reinforcement.
In fixed-ratio (FR) schedules, reinforcement is provided following a set number of responses (a factory worker is paid for every garment he assembles). With variable-ratio (VR) schedules, reinforcement is provided after a variable number of responses (a slot machine pays off after varying numbers of attempts).
Fixed-interval (FI) schedules reinforce the first response made after a set interval has elapsed since the previous reinforcement (contest entrants are not eligible for a prize if they have won one within the past 30 days). Finally, with variable-interval (VI) schedules, the first response is reinforced after an interval of varying length has elapsed since the previous reinforcement.
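The four partial-reinforcement schedules can be read as simple decision rules: each one decides, response by response, whether a reinforcer is delivered. The Python sketch below is one possible illustration, not a standard implementation. The class names, the `respond` method, the `run` helper, the arbitrary time units, and the use of uniform random draws for the variable schedules are all assumptions made for this example.

```python
import random


class FixedRatio:
    """FR: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: reinforce after a varying number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement drawn uniformly from 1 .. 2*mean_n - 1.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI: reinforce the first response once `interval` has elapsed
    since the previous reinforcement."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI: like FI, but the required interval varies around mean_interval."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last = 0.0
        self.required = self._draw()

    def _draw(self):
        # Assumption: interval drawn uniformly from 0 .. 2*mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.required:
            self.last = t
            self.required = self._draw()
            return True
        return False


def run(schedule, response_times):
    """Return, for each response time, whether that response was reinforced."""
    return [schedule.respond(t) for t in response_times]


# Every third response is reinforced, regardless of timing.
print(run(FixedRatio(3), range(6)))
# → [False, False, True, False, False, True]

# Only the first response after 30 time units have elapsed is reinforced.
print(run(FixedInterval(30), [10, 20, 30, 35, 59, 61]))
# → [False, False, True, False, False, True]
```

In the ratio schedules the time argument is ignored and only the response count matters. In the interval schedules the count is ignored and only the time elapsed since the last reinforcer matters. That difference is the distinction the text draws between reinforcement based on the number of responses and reinforcement based on time.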