Classical Conditioning and the Clicker

How do you get paid? 

Does your boss stagger into your office groaning under the weight of fourteen cabbages, a lemon, six rolls of toilet paper, a jar of coffee, a few cans of petrol and twenty kilos of dog food? 

Probably not.

When you pay your bond, do you deliver three chickens, a pair of shoes and ten metres of fencing to the bank manager?

I doubt it.

I would love to deliver a cow, four goats, six snoek and a ton of good-quality, well-rotted manure to the Receiver of Revenue to pay my taxes.

But somehow I don’t think it would work.

 

So how do you get paid?

You probably get given a piece of paper.  Or possibly a lot of pieces of paper.  You work your butt off for an entire month, and you get a piece of paper in exchange.  And you keep on doing it!  Are you nuts?

Possibly not.

The piece of paper may be a cheque, or notification of a bank transfer, or you may be paid in cash, but whichever way you are paid, the piece of paper you receive represents the goods you can buy with it.  This is the function of money in our society.  It is a medium of exchange with which we can purchase the goods and services we require.

Why are you satisfied to receive money and not the actual goods and services you have earned?  Because you trust money as a medium of exchange.  Why do you trust it?  Because you have always been able to exchange it for what you wanted in the past.  In psycho-speak, you have been classically conditioned to form an association between money and the things you can buy with it.

Classical conditioning as a means of learning owes its discovery and terminology to the Russian physiologist Ivan Pavlov (1849-1936).  Pavlov performed extensive studies on the basic reflexes of animals and in particular, the salivary reflex in dogs.  In his experiments, he would place food in a dog’s mouth and observe the salivary response.

One day, he noticed that his dogs started salivating at the sight of food.  This observation led him to formulate his famous bell-ringing experiment, in which he discovered the process of classical conditioning.  It goes something like this:

First, Pavlov noted that salivation was a basic reflex (as opposed to a learned behaviour).  He then noted that no learning (conditioning) needed to take place for food to cause salivation.  He therefore called the food the unconditioned stimulus (US) and the salivary response the unconditioned response (UR). 

(A stimulus is simply something external which acts as a signal to an animal, and a response is what the animal does after experiencing the stimulus.  For example, if you stick a pin (stimulus) into someone’s arm, he will flinch (response).  The pain response is another example of a reflex, or unconditioned response; you don’t have to learn that having a pin stuck in your arm is painful.  It just is.)

In psychology textbooks, which are all nearly unreadable, this is usually written as follows:

            US ----------> UR

In English, the unconditioned stimulus (e.g. food) causes the unconditioned response (e.g. salivation).

Pavlov then added a stimulus, or signal, which meant nothing to the animal, so he called it a neutral stimulus (NS).  In his experiments, he used a bell.

To start with, he rang the bell and the dog, quite sensibly, didn’t salivate.  This showed him that the bell was in fact neutral (scientists have an obsession with proving things that are obvious to everyone else).  In psycho-speak:

            NS ----------> no observable response

Next, he rang the bell and gave the dog the food at the same time (in fact, the bell was rung immediately before the food was presented).  He did this several times.  Each time, the dog salivated.  Schematically:

            NS + US ----------> UR

The next bit is the interesting bit.  Once Pavlov had paired the bell with the food a few times, he rang the bell without presenting the food and observed the dog – and the dog salivated.  The dog had made the association that the bell meant that food would surely follow.  The bell had thus become a conditioned stimulus (CS) and the salivary response had become a conditioned response (CR):

            CS ----------> CR

Pavlov called salivation in response to the sound of a bell a conditioned response because the dog had to learn (or be conditioned) that the bell was associated with food.

Only a dog, you may be thinking, would be stupid enough to learn to drool at the sound of a bell.  Indeed.  It goes without saying that you, of course, have never felt even the slightest twinge of pleasure at the sight of a $100.00 bill – which is, after all, only a piece of paper!

Classical conditioning is an extremely powerful phenomenon which is at the heart of many of our instinctive emotional reactions, irrational fears and superstitions.  Properly understood and applied, though, it is also a very powerful therapeutic tool, as it tackles the ‘gut’ reaction underlying many problem behaviours.  It is particularly successful in the rehabilitation of aggressive dogs – and aggressive people! 

But how do we apply it to training?

One of the characteristics of the conditioned stimulus, e.g. the bell, is that the dog appears to perceive it as a cue that the unconditioned stimulus, e.g. the food, will soon appear.  After very little conditioning, if the dog hears the conditioned stimulus, he will prick his ears up and look for the unconditioned stimulus.

Try it.  Do you have a ballpoint pen of the kind that has a nib which clicks up and down?  (A Parker Jotter or similar pen will do very well.)  Line up 20 treats on the counter in your kitchen (a frank cut into small pieces should work).  Click the pen and immediately give your dog a treat.  Repeat this – click, treat, click, treat, click, treat – until you’ve used up 10 of your treats.  Remember to click first and then treat.

Now click once, but don’t give the treat immediately.  What does your dog do?

In all probability, he pricks up his ears, looks at you and says: “where’s my treat, then?” You have just classically conditioned a ballpoint pen as a conditioned stimulus.  To strengthen the association, carry on clicking and treating until you’ve used up all the treats.

Now remember that a positive reinforcer is anything which makes it more likely that a behaviour will be repeated (See Training without Pain on this site).  In other words, if the dog is reinforced after performing a behaviour you want, such as a sit, it’s more likely that he will sit again in the future.  Reinforcers such as food, water or sex constitute primary reinforcers because they meet a basic physiological need. 

Because you have conditioned the ballpoint pen as a conditioned stimulus which means ‘treat coming’, you can now ask, lure or coax Fido to sit, click the pen and then treat, and Fido will understand perfectly well when he hears the click that he is being reinforced for sitting, and that the treat (the primary reinforcer) is on its way.  The pen has become a secondary, or conditioned, reinforcer (abbreviated CR from here on, not to be confused with the conditioned response above).

The most powerful conditioned reinforcer in our own lives is, of course, money.  People have been known to kill for it.  The association between money and reward is so strong for us that we work quite happily for money without ever thinking about precisely which primary reinforcer we are going to translate it into.

With persistence, the click takes on the same sort of meaning to the dog that money has for us, and develops tremendous reinforcing power in its own right.  It is important in training to maintain that association, and we do that by honouring the promise made by the click and treating every time we click.  After all, how would you feel if you walked into a shop and they refused to take your $100.00 bill?

In practice, we don’t use a ballpoint pen.  We use a clicker, which is basically a strengthened version of a child’s metal cockroach, but in fact we could use anything – a light, a bell, a whistle (deaf dogs can in fact be trained very successfully using a flash of light as a conditioned reinforcer).  There is no magic at all in the clicker; it does, however, have the advantages of

  • producing a sound which is dissimilar to most other sounds in the dog’s environment (take this one with a pinch of salt; Slug turns up looking for a treat every time I try to clip my toenails!)  

  • producing the same sound every time

  • producing a sound which is clearly audible to the dog at quite a distance

  • being quick and easy to operate (a reflex click will come out a lot faster than ‘good boy’ will)

These characteristics are important if the dog is not to become confused, and are the reasons the clicker has taken off as the conditioned reinforcer of choice in modern dog training.  The clicker’s popularity is also the reason that training using operant conditioning has, somewhat unfortunately, become known as ‘clicker training’.  I repeat:  there is nothing magic about the clicker, and adding a clicker to your normal training repertoire will buy you little or nothing.  Using the clicker correctly as part of a training system based on operant conditioning will, however, bring about astonishing levels of speed and accuracy, as well as improving your dog’s mental health (and his relationship with you) out of recognition.

At this point you may be wondering why anyone would bother with a conditioned reinforcer when it has to be paired with a food treat (primary reinforcer) anyway.  Why not just give the treat and have done with it?

If there are three things that are critical to successful training, they are: timing, timing and timing.  And there are three reasons for using a conditioned reinforcer (CR) such as a clicker: timing, timing and timing.

Although the concepts of operant conditioning were discovered in the laboratory by B. F. Skinner and his students, one of the first major applications in the real world was in the training of dolphins in aquaria.  It’s difficult to punish a dolphin if he does something you don’t like; he just swims away from you.  Choke chains don’t work on dolphins.

Furthermore, if a dolphin does something the trainer does want, such as a jump or a splash, by the time the trainer manages to get the treat (usually a fish) to him, he will probably have done several other things in between and may not even associate the reward with whatever it was the trainer liked.  He may eventually learn through trial and error that jumping will earn him a fish, but fine points like ‘jump high and to the left’ will be impossible to train. 

The use of conditioned reinforcers (a high-pitched whistle in this case) revolutionized dolphin training and made possible the almost unbelievably precise exhibitions we have come to expect from them.  The desired behaviour could be precisely marked using the CR at the moment it occurred, and because the dolphin had been conditioned to the whistle, he knew that he had been rewarded, and that the fish would follow; the likelihood of him repeating that behaviour thus increased. 

Yes, that’s all very well, you may be saying, but my dog isn’t swimming around under water when I train him; he’s right here next to me.  I beg to disagree.  First of all, any training which goes beyond the most basic involves some distance work.  Secondly, even when you’re right next to your dog, you have very little time to respond to a behaviour before he produces the next one; dogs move fast.  Studies show that for a dog to associate a reinforcer with a particular behaviour, the reinforcer needs to follow the behaviour within one second, and preferably within four-tenths of a second.   By the time you’ve mumbled ‘good boy’ and grabbed for the cheese, Fido could be over the hills and far away!
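To make those timing figures concrete, here is a minimal sketch in Python.  The one-second and four-tenths-of-a-second limits are the ones quoted above; the function name `timing_quality`, the labels it returns and the example delays are all made up for illustration.

```python
def timing_quality(delay_seconds):
    """Classify the gap between the behaviour and the reinforcer (or click).

    The 1.0 s and 0.4 s limits are the figures quoted in the text;
    the function name and labels are illustrative assumptions.
    """
    if delay_seconds < 0:
        raise ValueError("the reinforcer cannot come before the behaviour")
    if delay_seconds <= 0.4:
        return "ideal"        # within four-tenths of a second
    if delay_seconds <= 1.0:
        return "acceptable"   # still within one second
    return "too late"         # the dog has probably moved on


# Hypothetical delays: a reflex click versus 'good boy' plus a grab for the cheese.
for label, delay in (("click", 0.3), ("'good boy' and cheese", 2.5)):
    print(label, "->", timing_quality(delay))
```

The point of the sketch is simply that a marker which lands inside the window does the job, and a treat that arrives after it does not.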

This becomes particularly important when shaping fine distinctions in behaviour; you might want to reinforce Fido for having a foot in a certain position, or being halfway into a sit.  Being able to mark the correct behaviour at the precise instant it occurs is probably the biggest advantage offered by a CR. 

The clicker thus has two very important functions in its role as conditioned reinforcer:

  1. as a cue that the treat is on its way, and

  2. as an event marker which marks the instant the desired behaviour occurred

The latter usage is critical when shaping behaviours – you may wish to mark a slightly straighter sit, a slightly faster trot, a raised head, pricked ears, you name it.  With the clicker, you can mark anything the dog is physically capable of doing, and this is what gives clicker training its astonishing accuracy and precision.

Can your dog chase his tail on command?  If not, try this.  (Tricks are a good place to start clicker training so you can hone your skills without adding some unwanted…um…variations to your obedience exercises!)

Get out plenty of treats.  Decide which way you want your dog to spin – left or right.  Let’s assume you’ve chosen the left.

Say nothing.  I repeat, say nothing.  This is not command-based training.  Your tone of voice is irrelevant.  In fact, you don’t even need a voice!

Start by clicking and treating (C/T) every time the dog looks to the left.  An eye movement is enough.  Keep going until he’s looking to the left at least 8 times out of every 10 trials. 

Up the criteria slightly.  Now you want him to turn his head slightly.  C/T for a slight head turn, don’t C/T for just an eye movement.  Count the number of successes and failures.  If he’s getting fewer than 2 out of 10 right, you’ve raised the criteria too sharply and he doesn’t know what to do – go back a step.  If he’s getting between 2 and 8 out of 10 right, he’s learning, but he hasn’t got it yet.  Keep going at this level.  If he’s getting more than 8 out of 10 right, he knows what you want and you can up the criteria again.  Perhaps you can C/T for a slightly sharper turn of the head.
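If you like to see rules written down, the out-of-10 rule of thumb above can be sketched as a small Python function.  This is only an illustration: the name `next_step` and the wording of the labels are invented here, and the thresholds are simply the 2 and 8 out of 10 quoted above, scaled in case you count a different number of trials.

```python
def next_step(successes, trials=10):
    """Decide what to do after a block of trials, using the rule of thumb above.

    The 2-in-10 and 8-in-10 thresholds come from the text; the function
    name and return labels are illustrative assumptions.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    low = 2 * trials / 10    # fewer than this: the step was too big
    high = 8 * trials / 10   # more than this: he knows what you want
    if successes < low:
        return "go back a step"
    if successes > high:
        return "raise the criteria"
    return "stay at this level"


for score in (1, 2, 5, 8, 9):
    print(score, "out of 10:", next_step(score))
```

Note that a score of exactly 2 or exactly 8 counts as ‘still learning’ here, which is how the rule reads when taken literally.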

Build your steps up gradually, asking for a sharper and sharper head turn, then a paw movement, then both paws, then a body bend and so on.  It’ll probably take a while and will seem quite slow compared to conventional training; but the important thing is that true learning is taking place.  There will be a point where your dog realizes that what he does influences whether you C/T or not; this is really exciting and you can expect to be jumped all over, several times.  His behaviours have now become truly operant; he deliberately operates on his environment in order to obtain a benefit.  Suddenly, your dog is training you, and just how intelligent he really is becomes astoundingly obvious!

Once he’s spinning away like a top, you can put the behaviour under stimulus control.  Add a verbal cue (we don’t call it a command any more) such as ‘Spin’ just before you C/T.  Gradually introduce the cue earlier and earlier.  The dog will associate it with the reward and will start offering the spinning behaviour whenever you say ‘Spin’.  Once this is completely reliable (at least 9 out of 10 successes in several trials), you can stop rewarding freely offered spins.  The dog will learn that you only C/T if he spins after you have given the cue, and that offering spins without the cue is pointless.  The ‘spin’ cue has become a discriminative stimulus.  OK, you can call it a command if you really, really want to.
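The ‘completely reliable’ test above can also be sketched in a few lines of Python.  The 9-out-of-10 figure is from the text; reading ‘several trials’ as the three most recent blocks of ten is purely an assumption made for this example, as is the function name `cue_is_reliable`.

```python
def cue_is_reliable(block_scores, min_blocks=3, required=9):
    """Return True once each of the last `min_blocks` blocks of ten cued
    trials scored at least `required` successes.

    `required=9` is the figure from the text; `min_blocks=3` is an assumed
    reading of 'several trials', and the function name is hypothetical.
    """
    if len(block_scores) < min_blocks:
        return False  # not enough blocks recorded yet
    recent = block_scores[-min_blocks:]
    return all(score >= required for score in recent)


# Hypothetical record of successes out of 10 after giving the 'Spin' cue.
scores = [7, 9, 10, 9]
print("Stop rewarding uncued spins?", cue_is_reliable(scores))
```

Looking only at the most recent blocks means an early wobble does not count against the dog once he has clearly got the idea.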

Clicker training seems slow and painstaking at the beginning, but speeds up dramatically as the dog gets the idea.  Watching someone free-shape an experienced clicker dog is quite an experience; the dog starts experimenting freely to find the desired behaviour, and new exercises emerge with remarkable speed – and are remembered!  Morgan Spector, author of Clicker Training for Obedience, estimates that it takes about a third as long to put a clicker-trained dog through its obedience titles as it does a conventionally trained dog.  

The clicker training movement is busy revolutionizing the dog training world.  Suddenly it is possible to train accurate, reliable behaviours with no punishment or coercion; in fact, the training is done entirely hands-free, and the lead has become obsolete except as a safety measure.  The biggest operant conditioning success story comes straight from Skinner’s labs:  Keller and Marian Breland, two of Skinner’s most influential students, formed a company called Animal Behavior Enterprises, later joined by Robert Bailey (Marian subsequently became Marian Breland Bailey), which over the years has trained something like fifteen thousand animals of every species from cockroaches to elephants, including many dogs.  To improve your training skills, the Baileys offer a chicken training camp where the objective is to train a chicken to play a four-note tune on a xylophone – and believe me, it can be done.  In all their vast experience, they estimate that they have used punishment between six and nine times, and then only because their clients (who include the United States Department of Defense) insisted on it.  It raises some uncomfortable ethical questions about our obsession with punishment-based training, doesn’t it?

The clicker dog training movement was pioneered by Karen Pryor, a marine biologist and dolphin trainer who recognized that the principles of operant conditioning could be applied to dogs as easily as to any other species.  Her book, Don’t Shoot The Dog!, is a must-read for anybody wanting more information.  Amongst other things, she points out that dolphin trainers, who are accustomed to using positive reinforcement correctly every day, usually have exceptionally nice, well-behaved children! 

Clicker training has a very sound basis in scientific theory and uses a lot of scientific terminology, which may seem daunting to some people; yes, it is important to understand at least the basic concepts of operant conditioning.  It doesn’t really mix with traditional methods, and requires you to abandon much of what you have done in the past.

But once begun, clicker training, an art, a science and a sport, is a journey which is so enjoyable and rewarding for both you and your dog that I have never heard of anyone turning back!
