Loopy Training 2 (email from Alex K.)

  • Think positive!
  • *****
  • Administrator
  • 29211 posts
    • Click for Balance
« on: 15 December 2009, 17:14:10 »
created on: 18 Oct 2009 11:29

This is the second half of an email. The first part is about
Hip Shoulder Shoulder, and you can find it here

-----------------------------------------------------------

Loopy Training
You would think that this would be a great place to end this post, but you'd be wrong.  Before I plunge headlong into my next project, I want to catch people up on this year's clinic theme.  Every year the clinics focus on a particular aspect of the work.  They have a general theme that runs through all the clinics even though the horses are at very different stages in the work.  This year's theme is loopy training.  The description I just gave you for teaching hip-shoulder-shoulder is a great example of loopy training.  Once I've defined what loopy training is, you'll be able to read back through the first half of this post, and you'll recognize the loops in the training steps I outlined.  I deliberately didn't write them out in loop form to start with.  That's a task I want you to do later.

So what is loopy training? In a very real sense loopy training is what we've been doing right along.  It's nothing new.  We just didn't call what we were doing by that name.  We also might not have recognized with any deliberate intent that we were using loops.  The reason for giving something a name is that it draws attention to it and creates greater clarity, or at least awareness, around an idea.

The Naming of Things
I backed off a bit from the word clarity because sometimes what a name does is raise even more questions about a subject.  The whole field of operant conditioning is a great example.  Skinner didn't invent positive and negative reinforcement.  But putting labels on these concepts made them something we could focus on more directly and ask more questions about.  Just look at the number of books and articles that have been written about operant conditioning, and you'll see how many questions have been asked since Skinner attached names to these concepts.

In my work the first thing that I attached a name to was the t'ai chi wall.  I was referring to a rope handling technique, one of many that I use.  I gave it a name as a form of shorthand, a way to refer quickly to a particular action down the rope.  What I didn't count on was how naming it turned it into a separate entity.  It gave the t'ai chi wall a status above other unnamed rope handling techniques.  It gave it a life that the other techniques did not have.  

At first I was not at all sure I liked the result.  It seemed to separate this technique from everything else, to raise it above other rope handling skills in the magnitude of importance.  But over time as other names evolved and the whole process became integrated into an overall style of rope handling that I refer to now as t'ai chi rope handling, I've come to appreciate the naming process and the role that it plays in the evolution of ideas.

So for all those people who have been telling me I need to come up with a new name for the work I do, that clicker training just doesn't describe it completely enough, here's the next entry into the naming game: loopy training!  If you're in one of those situations where people are giving you a hard time for clicker training, you can look them straight in the face and say you aren't just a clicker trainer.  You're a loopy clicker trainer!  Hmm.  I think not.  We'll have to keep looking for that perfect name.  We'll keep loopy training to ourselves - at least for now.  We'll let it refer to a specific training strategy, one that gives us a way to identify good clicker training.

Loopy Training Defined
So what is loopy training? If you're at all familiar with clicker training, you've seen this phrase:

Behavior => click => reinforcement.

That's the basic premise of clicker training.  We want to make a certain behavior more likely to occur in the future, so we mark the occurrence of that behavior with an easily perceived, distinctive signal.  Then we link that signal to the presentation of something the animal will actively work for.  When the connection is made, the animal repeats the behavior in order to get the handler to click and reinforce him.

When the behavior is happening consistently, the handler can attach a cue to it so the phrase becomes:
Cue => behavior => click => reinforcement.

But these simple phrases should really be seen more as loops.  It isn't the single presentation of the click followed by the reinforcer that strengthens the desired behavior.  It is the repetition of the entire sequence.  So what we have is:
Behavior => click => reinforcement
=> behavior => click => reinforcement
=> behavior => click => reinforcement
=>

And once a cue has been attached:
Cue => behavior => click => reinforcement
=> cue => behavior => click => reinforcement
=> cue => behavior => click => reinforcement
=>
In other words we have loops.

The Evolution of the Term
The idea of loopy training grew out of several converging threads.  One was based on observations I was making at the Clicker Expos.  At the Expos I watched some elegant examples of shaping.  With dogs food delivery is often done by tossing the food out away from the dog.  This creates an automatic reset back to the beginning of the behavior.  For example, if a dog is being reinforced for going to a mat, tossing the food out away from the mat creates another opportunity for the dog to find the mat.  Click => toss the treat out away from the mat.  The dog leaps off his mat to retrieve the food and turns back immediately to repeat the clickable behavior - landing his feet on the mat.  

That's what happens when training is going well - when the dog understands the whole food retrieval part of the equation.  Click, glance at the handler to see which way the food is being tossed, go get it, then return immediately to the task.  Perfect.

Except at the Expos I was also watching novice handlers working with dogs who didn't understand the treat retrieval part of the equation.  They didn't always track the food toss so they missed getting any treat and quickly became discouraged.  Or they found the treat, but then kept hunting for more.  Ever hopeful, they obviously thought one piece of hot dog on the ground must mean that there would be more.  Their brains had switched from the puzzle-solving, engage-with-your-person mode into the follow-your-nose, single-focused hunt mode.  When they were satisfied that no more hot dogs were to be found, they remembered that their human was there and that he/she could generally be counted on for some entertainment.  But by then the connection with the earlier behavior was a distant and all but forgotten memory.

« Last edit: 15 December 2009, 17:17:58 by Muriel »
Everything comes to him who can wait.

« Reply #1 on: 15 December 2009, 17:15:17 »
Kay Laurence Video
Efficient, well understood food delivery was clearly a key element to training success.  When I was visiting with Jesús Rosales-Ruiz from the University of North Texas last winter, he showed me a video clip that Kay Laurence has up on her web site (learningaboutdogs.com).  It's an elegant bit of training.  The handler is sitting in a chair to train her dog.  The base behavior is a recall.  Each time the dog approaches the handler, she clicks, then she tosses the food out away from the chair.  The dog has perfect food retrieval.  He spots the throw.  Goes straight to the spot where the treat lands, grabs it up, and heads straight back towards the handler.  There's no time lost while he sniffs around for other goodies.  He's learned that the best chance he has for finding more is to return to his handler.  It's a clean, efficient loop of behavior:
The dog approaches the handler => click => leads to food toss => food retrieval
=> the dog approaches the handler => click => leads to food toss => food retrieval
=> the dog approaches the handler => click => leads to food toss => food retrieval
=>

Once this loop was well established, the handler placed a mat a couple of feet out in front of her chair.  Now the dog was passing over the mat on his way back to her.  The click occurred as his front paws landed on the mat. 

Once landing on the mat was firmly established, the handler began to change the direction of the food toss.  She had been throwing it out towards twelve o'clock, directly in front of her so all the dog had to do was return on a direct line to her chair.  Now she tossed the food out towards three o'clock, then nine, and finally into the hardest spot towards six so he had to turn around to get to the mat.

The dog clearly was making a deliberate deviation from a direct line back to the handler to land on the mat.  It was very cute.  Sometimes he would jump two paws together in an enthusiastic "got it!" landing on the mat.  A couple of times the handler delayed the click slightly so she could begin to create a stay on the mat. 

The key to this exercise was the food delivery.  Especially when contrasted with some of the training I'd seen where the dog was uncertain about the food delivery, it was clear that inefficient, poorly executed or poorly understood food delivery broke the links in the behavior => click => reinforcement => loop.  This clip clearly shows the importance of working out your food-delivery mechanics.

The "Poisoned Cues" Videos
Kay's clip was a great example of a clean training loop.  Another example can be seen in the "Poisoned Cue" DVD.  I've described the Poisoned Cue research many times, and we now have the DVD so I won't go into great detail here.  I would urge you, however, to get the DVD and watch the video clips.  It isn't just poisoned cues that they illustrate so well.  They also show you very clearly how good clicker training works.

In the experiment two cues were taught.  Both meant "come", but they were taught in different ways.  The cue "ven", meaning come, was taught with positive reinforcement.  The dog was shaped through a series of approximations to approach the handler when he heard the word "ven".

The cue "punir" also means come, but it was taught with a combination of correction and positive reinforcement.  If the dog did not respond promptly enough, he was dragged by his harness over to the handler, and then he was clicked and reinforced.  The question being examined was whether you could shape equally well with both cues.

The experiment was conducted in a small office.  The floor was marked off in tile-sized grids.  In one set of trials the cue "ven" was given each time the dog stepped onto a particular grid to the left of the handler.  The question was whether you could shape behavior using this cue.  In the "Poisoned Cue" DVD you watch as a tail-wagging little poodle lands on the chosen square, hears the word "ven", and immediately heads over to the handler.  When he's on the grid directly in front of her, click, she delivers a treat.  The dog gets his goody, and then, tail still wagging, eyes bright, head up, he goes straight back to the magic grid.

In the contrasting trial the magic square is to the right of the handler.  When you watch this video clip, you can easily think you are watching a different dog.  In fact in presentations I do not tell people in advance that it is the same dog.  They think they are watching two different poodles.  The "ven" dog is an obviously happy, tail wagging, eager dog.   The "punir" dog is depressed.  The tail is drooped, his body animation is gone.  He wanders around the room, looking as though he is deliberately avoiding the square that leads to the click and a treat. 

When he finally does land on it and he hears the cue "punir", he heads back to the handler.  He clearly knows the correct response to the cue.  But unlike in the "ven" clip, he doesn't immediately return to the square.  He wanders around the room, looking somewhat aimless and listless.  It's a look I've seen in horses which is why I think it is so important to watch these video clips.  We need to recognize this "punir" state so we can head it off at the pass.  If we understand the dynamic behind the behaviors we're seeing, we won't take it quite so personally when our horses "blow us off" by wandering around the arena sniffing manure piles.  Yes, it may just be checking the day's messages, "give me a moment, I'll be right with you".  Or it could be avoidance behavior signaling the presence of poisoned cues in the system.

In the "punir" clip there are no clean loops.  The dog wanders around the room. He hesitates when he hears the cue. He hesitates before leaving the proximity of the handler.  Rates of reinforcement are low, and there's a lot of extra, unwanted behavior thrown into the mix. 

In the "ven" clip you see an example of efficient, clean loops. And in the "punir" clip you see the absence of loops.  Great contrast.

« Reply #2 on: 15 December 2009, 17:15:34 »
Clean Loops: The Hallmark of Good Clicker Training
In the Poisoned Cue research all corrections were stopped mid-way through the trials.   After this point, if you were to look in on a training session, both the "ven" and the "punir" conditions would look as though only positive reinforcement was being used.  You would see only clicks and treats, no corrections.  But you would also see two very different results.  In the "ven" condition you would see a happy, tail-wagging dog and clean, efficient loops.  In the "punir" state you would see a depressed dog and lots of unwanted behavior.

Jesús' conclusion: loops are a hallmark of good clicker training.  When you train under "ven" conditions, you will see loops developing.  The absence of clean loops may mean some of the cues you are using have become poisoned.

At one of the spring clinics one of the participants coined a wonderful expression: "aiming for ven".  I talked earlier about coming up with a name for the work.  "Aiming for Ven" would certainly be a strong candidate.  It would take a bit of explaining, but that's all right.  Imagine what the world would be like if more people understood what this phrase refers to.

Jackpots: An Accidental Result
Jesús shared with me another study that one of his graduate students was currently working on. This one was on jackpots.  In this experiment the researcher was looking at the effect of jackpots on training.  Jackpots get a very mixed press.  Some people see them as an effective way to enhance training.  Others see them as a distraction.  Who is right?  That's what Jesús' graduate student wanted to find out.

The first question to be answered was what is a jackpot?  Kristy, the graduate student, sent out questionnaires to trainers and received back an assortment of different answers.  For the purpose of her experiment she settled on a jackpot being an increase in the number of reinforcers given for a particular behavior.

During the experiment Kristy sat perched up on a tall kitchen stool.  She had taught her dog, a Boston terrier, to go out to the far end of the room and touch a target that was hanging on the wall.  Click.  She would drop a piece of kibble into a food bowl that was positioned next to the stool.  The dog would return promptly to the bowl to collect his treat, then he would head back to the target.  It was a clean, efficient loop.

At predetermined points in the training, Kristy would click, and instead of dropping one piece of kibble into the bowl, she would drop a handful.  Some of the kibble landed neatly in the bowl, but a few pieces would always bounce out, and her dog would waste precious time hunting down every scrap.  Her data was falling apart, but she couldn't tell if it was a function of the food delivery, or the distraction of getting more than he normally expected.

So she changed her food delivery system.  She got a long section of PVC pipe.  Now instead of dropping the treat directly into the bowl, she would click and drop it through the pipe.  It seemed like an elegant solution to the problem of the kibble bouncing out of the bowl.

She videotaped her first session using the PVC pipe.  The dog went out as usual and touched the target.  Click, he turned around, hesitated, looked confused, wandered to the bowl, got his treat.  Went back, but with noticeably less enthusiasm, touched the target, click.  Kristy dropped the kibble into the tube.  The dog hesitated, looked lost and finally wandered out of the room. 

What had gone wrong?  Kristy was a very skilled and experienced clicker trainer.  She was sure that her dog understood the meaning of the click, but that's not what his behavior was telling her.  He was leaving the training area in the same way that the dogs in the treatless clicks experiments left.  Break the promise of the click and the training falls apart.  But Kristy hadn't broken her promise.  From her point of view, she had delivered the treat.  The problem was her dog didn't know how to read her treat delivery signals.

He was used to hearing the click and looking up at Kristy's hands.  He watched her drop the treat into his bowl so he knew where it was going to be.  When she dropped it instead into the PVC pipe, he didn't understand that form of treat delivery.  That gesture meant nothing to him.  He'd heard the click, but from his point of view the promise of reinforcement had not been kept.

The Click Redefined
This accidental result of the breakdown in the food delivery system made Jesús think that perhaps we needed to rethink a bit the nature of the click. 

We're used to thinking in terms of the single phrase:

Behavior => click => reinforcer.
I've already described how we need to think of this not as a single phrase but as a repeating loop.

When the behavior becomes predictable and consistent, we can attach a cue to it.  We should all be familiar with this sentence:
Cue => behavior => click => reinforcer.

In this model of clicker training the cue and the click serve two different functions.  The cue acts like a green light that becomes attached to a behavior.  It tells the animal not only which behavior is most likely to be reinforced but when.  Perform the behavior off cue, and nothing happens.  Wait for the green light cue and that same behavior will earn a click and a treat. 

The click in this model acts as a "yes answer" signal.  It tells the horse that he got the right response, and he's going to get reinforced for it.

But Jesús was saying we could simplify the model.  Let's see how.  We've already seen that the effect of cues works in two directions.
The effect of the cue <= Cue => The effect of the cue.

Does this still make your head spin?  Lots of people stumble over this statement.  Let's look at it in more detail.

We'll begin by going back to the "ven" and "punir" video clips.  The cue "ven" was a signal the poodle understood.  When he heard it, he responded promptly.  He broke off what he was doing and headed straight over to the handler.  The cue clearly had an effect on the behavior that occurred immediately after it.  But the cue also affected the behavior that was occurring as the cue was given.  It made it more likely that it would be repeated.

Jesús would have us think of the click as simply another cue.  The click signals food is coming.  In the Kay Laurence video the dog clearly understood the meaning of the click.  When he heard it, he looked at his handler.  The motion of her hand told him where the food would land.  Without the orienting signal of the click, her hand motion might have been missed.  To the dog the click meant "look at your handler".  After he got his treat, he could afford to take his eyes off the handler and focus instead on the mat because the click would let him know when to look at her again.

The click as an orienting signal is one effect that it has.  It also has another.  It makes whatever was occurring just prior to the click more likely to occur again.  If the dog was jumping on the mat at the moment he heard the click, then jumping on the mat will have a high probability of being repeated.  After all, the dog wants to get his handler to sound the magic clicker.  It's a predictor of treats, and it turns on a fun moment of the hunting game.

If we think of the click as a cue, then that should help us as we think about using other cues to shape behavior, as well.  We know we can shape with a click and a treat.  If other cues are also predictors of good things, why shouldn't we be able to shape with them as well?  In other words we should be able to form linked chains of behaviors where the cue for the next behavior serves to reinforce the preceding behavior.

Here's another way of looking at this. The dog has learned that "ven" means come.  Coming always produces treats, so "ven" is a reliable predictor of goodies.  The dog wants the handler to say the word "ven", so any behavior he finds that makes it more likely that the handler will say the word "ven" is likely to be repeated.  Let's suppose you have a golden retriever puppy.  It's spring so he's shedding, and it's also muddy outside.  He's just been outside for a run.  I'm sure you can picture the state of his coat! 

He comes rushing into your house, and because he's been out playing and having a grand time chasing sticks, he's ready now for a nap.  Before you can stop him, he makes a beeline for your newly upholstered couch.  Of course you call him off of it.  He's an eager student so he comes dashing over to you, expecting a click and a treat.  How can you resist all that eager, golden retriever enthusiasm?  You click and give him the last tidbits of treats you have in your pocket.  What have you just reinforced? Coming.  And what else?  Jumping up on the couch.  He's learned that it's a great way to get you to say the magic "come" word.

Solving the Training-Chain Puzzle
So there's the bind.  And it's always been a stumbling block for me when I start thinking about the linkages between behaviors, especially when I start thinking about chains.  Unless you are starting right at the moment of birth, there's always something that precedes the something you're clicking the animal for.  So how do you avoid reinforcing behaviors you don't want?  You tighten up your loop.  You click your dog when he comes, but, as he leaves, and before he can head back to the couch, you call him to you again.  You make your loop so tight and small that it excludes any unwanted behaviors.  When your loop is clean, you make it a little more complex, adding duration or new behaviors to the loop, but always building the loop so it remains clean.

« Reply #3 on: 15 December 2009, 17:15:59 »
Tightening the Loop: The Grown-Ups are Talking
That's how we begin with clicker training.  Think about the foundation lesson, the grown-ups are talking, please don't interrupt.  When you stand next to your new-to-clicker-training horse with your hands folded in front of you covering up your treat pockets, what does he do?  He mugs you, of course.  He doesn't know this part of the rule book yet.  He just knows he can smell goodies in your pockets, and it's worth nudging around your hands to see if he can find them.  So he nudges and nudges you before moving his nose out away from your body.  Click and treat.  Haven't you just made a little chain?  Nudge my person, move my nose away from her arm, click => treat.  Nudge my person, move my nose away from her arm, click => treat.

Well, yes, that is possible, but that's not what happens.  The reason we don't get caught up in this little circle of unwanted behavior is because we tighten up the loop.  The handler puts a piece of duct tape on the back of her hand that is closest to her horse.  As soon as she delivers the treat with her other hand, she touches the duct tape.  It becomes her target.  If she can touch the duct tape before her horse can move his nose back to mug her, click, he gets another treat.  It's a very simple way to tighten up the loop so the unwanted mugging behavior is excluded.

Now here's the elegant piece that thinking in loops adds to the training.  In these early clicker lessons, I have the handler put a small number of treats into her pouch.  When the treats are gone, I have her step away from her horse to count out another small handful.  While she is reloading her pouch, I have her assess her horse's training.  I don't just want people clicking and treating.  I want them observing and thinking. 

Assessing the horse's performance is a skill I want the handler to become very practiced in.  I want her to notice all the little details.  What did the horse do well?  Were there any unwanted behaviors creeping into the training?  What does his behavior tell you about what you need to work on next?  These are just a few of the questions I ask.  What they all boil down to is how clean was your loop? 

If it was clean throughout all segments of it, you can move on.  You can make the loop a little more complicated.  For example, you might choose to withhold the click an instant longer so he keeps his nose away from your treat pouch just a little bit longer. A clean loop tells you you're ready for this next step.

When the simple grown-ups loop becomes well established, you could add a second behavior to your loop.  You might ask him to back up (assuming this behavior had already been introduced to him).  You could then reinforce backing, not with a click and a treat, but by going into your grown-ups are talking stance.  This very distinctive stance will quickly take on meaning for your horse.  It becomes a cue for him to stop and look straight ahead so his nose is away from your treat pocket.  Click and treat.  Your loop is becoming more complex.

Clearing out the Poisoned Cue Loops with Clicker Training
As long as both behaviors have been taught in a "ven" way, you'll get a clean, error-free loop.  But suppose you are working with a crossover horse.  He's learned backing in a very different way.  So now when you ask him to back, he goes into a "punir" type of body posture.  Yes, he backs, but his steps are too quick because he's learned to hurry back to avoid a reprimand.  You have the behavior, but the loop really isn't clean.  How do you clean up the loop?  Well, one way would be to reteach backing.  What the poisoned cue research showed was it wasn't the behavior that was poisoned, but the cues that were associated with it.  So change the cue to create a "ven" form of backing. 

The emotional effects of a poisoned cue linger long after the active use of corrections has stopped.  That's particularly an issue with a long-lived species like the horse.  My horses are full of the archeological layers that are their training.  They forget nothing.  Want to know what I was working on in 1996?  Peregrine can show you.  How about 1988?  It's still in there.  All the blind alleys, mistakes, lessons learned the hard way - they are still there.  We've just learned together how to avoid triggering those old unwanted responses, how to manage things better so those buttons aren't pushed.  Clicker training has helped with that.  It by-passes the poisoned cue effect of the old cues by retraining the behaviors in new ways.  It creates the "ven" emotional state of happy, exuberant, eager-to-please horses.

This is why it is so important with crossover animals to retrain all the familiar, easy behaviors even though your horse already "knows that".  You may want to get on and ride.  Your goal with the clicker may be to work on polishing your transitions.  Instead you find yourself teaching your horse to stand on a mat.  Everything is everything else.  Mat work is all about balance and smooth transitions.  It resolves so many of the issues that interfere with good riding balance, but it does it in a totally unfamiliar way for the horse.  The cues that evolve will be "ven" cues, attached to a "ven" attitude.  The only question is: is the rider ready for a "ven" horse? 

That's a serious question.  So many of us have only known shut down "punir" type animals.  They appear quiet and mannerly, and for many this is their concept of a good horse.  Sparkle in the eye, opinions, enthusiasm, these are scary concepts.  So the beginning stages of clicker training are designed to ease people into an appreciation of the clicker-trained horse.  I want manners, absolutely, and that's what the foundation lessons give me.  But I also want "ven".  That's the added element of joy that has attracted so many of us to clicker training and that gives us a very real sense that this training is fundamentally different from the other forms of horse training that we have encountered.

More on Poisoned Cues: Lumping
Monitoring your loops is one way to watch out for poisoned cues.  And cleaning up your loops is a way to by-pass them.  In the poisoned cue DVD Jesús talks about the mixing of negative reinforcement with positive reinforcement.  That's what creates the ambiguity and poisons the cues.  In other words when the horse is given a go forward cue, he doesn't know if he's going to get a pat on the neck indicating he made the right choice, or a jab in the sides from a set of spurs because he didn't respond fast enough, or perhaps a tug on the bit because he responded too much.  He never knows if the go forward cue is going to lead to good things or bad, and it is this ambiguity that creates the "punir" side effects.

But wait a minute.  We use negative reinforcement in our clicker training and we seem to create happy "ven" horses.  Or are we just fooling ourselves, and all we really have is a sugar coating on top of "punir" cues?  This is the question I asked of Jesús during his presentation for the DVD.  His response was no, what we are doing with the horses is different.  And it's different because we are shaping with negative reinforcement.

I would say this a little differently.  Poisoned cues emerge out of lumping.  In the "punir" training of the poodle, the problem wasn't the use of negative reinforcement per se.  It was the way in which the negative reinforcement was applied.  The dog wasn't shaped using small approximations.  He was dragged over to the handler. 

Pressure and release of pressure provides wonderful information.  It is a safe, kind, effective training tool - when it is used with refinement.  When the training steps are made small enough, the pressure remains below the threshold where it might become painful or fear inducing.  Information creates "ven" learning and enthusiasm.  Pain and fear create "punir" survival tactics.  One belongs in a training setting.  The other does not.

So how does this relate to loops? 

« Reply #4 on: 15 December 2009, 17:16:26 »
Back to Hip-Shoulder-Shoulder
Let's loop around (no pun intended) to the beginning of this overlong post.  I began by describing hip-shoulder-shoulder.  Suppose you've been beetling away at it, but you weren't sure what you were doing, and you're new to the t'ai chi rope handling techniques.  When you slid down the lead, you used a bit too much push, too much muscle force.  Instead of going to the point of contact as shown on the "Shaping on a Point of Contact" DVD, you pushed past that and forced your reluctant horse to back.  The result is he's grown resentful of the lead.  He's developed a "punir" type of attitude toward backing.  Not only does he look grumpy, he also tries to keep you from asking for backing by grabbing at the lead before you can get to the snap.

You also never really perfected grown-ups.  When you first started clicker training you thought of the foundation lessons as something of a check list.  You did a little grown-ups, got the basic idea and then moved on. If you don't keep him busy, he still mugs your pockets, and he's a little grabby when he takes his treats.

You've got a lot to clean up before you can build much of a hip-shoulder-shoulder loop.

Earlier I wrote:
"So let's look at a bit more of the preparation.  It's the same preparation you went through to get three-flip-three.  You used the lead to ask your horse to walk forward.  Click and treat.  Repeat this simple request until your horse understands this slide-to-the-snap go-forward cue."

After all this discussion of loops do you recognize what this is?  It's an example of a simple loop.  In fact it's the base kernel of the hip-shoulder-shoulder loop (and a lot of other useful loops besides.)  I'll make this clearer by writing it out as a loop:

Cue => behavior => click => reinforcement
=> cue => behavior => click => reinforcement
=> etc.

So we have:

Go-forward cue => shift of weight forward => click => treat
=> go-forward cue => shift of weight forward => click => treat
=> etc.

How clean is this loop?  Maybe the horse I just described starts to bite at the lead when the handler starts to slide down it.  Okay.   Let's find a loop that is clean.

Grown-ups are talking.  How clean is that?  You may have to go back to the basics of putting duct tape on your hand to clean up your grown-ups.  And you may have to spend some time developing grown-ups into a solid, fully functional behavior, but it will be time well spent.  (If you want an image of a horse who is wonderfully solid in grown-ups, watch Nikita in the third hour of the "Lesson 1: Getting Started with the Clicker" DVD.  Even when I tried to tempt her into mugging with bad mechanics, she kept her good manners intact.)

Let's suppose you've focused some training time on grown-ups, and you now have a grown-ups loop that is working for you.  The cue for grown-ups has evolved out of the shaping process.  Every time you go into your grown-ups stance, your horse responds by going into the corresponding grown-ups, head-away-from-your-body position.  Click, treat, back to your grown-ups-are-talking position.

It's a clean loop.  When a loop is clean,  you know you can add a new element to the equation.  But remember, in good shaping you add only one new element at a time.  So you might choose to lengthen out the time you ask him to stay in grown-ups.  Or you might begin clicking the moment his ears begin to flick forward.  But you wouldn't go after both at the same time.

Keeping Things in Balance
When a loop is clean, not only can you add a new element to the equation, but you really should.  Staying too long on one thing can create problems.  It can get elements of your training out of balance.  Remember for every exercise you teach, there is an opposite exercise you must teach to keep things in balance.  Stay too long on a mat, and your horse won't want to leave the mat.  Work too long on head down, and your horse will think that's the only clickable option available to him.  So clean loops tell you not only that you can move on, but that you should.

Grown-ups is becoming a clean loop.  Yes, you could continue to work on it, to develop it further, but during this training session, your goal is hip-shoulder-shoulder, so you're going to connect your grown-ups-are-talking loop to an ask down the lead.  You already know the lead rope can cause problems, so you're going to tighten down your "go-forward" loop so it excludes any unwanted behavior.

That may mean that you begin by turning slightly toward your horse out of grown-ups as you slide your leading hand an inch down the rope. Click, treat, back to grown-ups.  Your aim is to slide all the way down the lead so you can ask your horse to step forward, but first you need to get him comfortable with the initial activation of the lead.  As your loop at this stage becomes clean, you'll go a little further down the lead. Clean loop by clean loop, you'll expand the distance you can slide down the lead.



So here's your initial loop:
Grown-ups stance => grown-ups behavior => click => treat => turn and slide a short way down the lead => horse remains in neutral grown-ups balance, mouth away from the lead => click => treat => grown-ups stance
=> etc.

You are surrounding the new request with a familiar, easy, comforting behavior.  As you expand down the lead, you'll continue to surround the exercise with easy, familiar behaviors.  The more carefully you build your loop, the bigger the security blanket surrounding the new pieces will become.

Going Micro to Avoid Poisoned Cues
And you will be avoiding poisoned cue territory because you will be chunking down below the level at which the poisoned cue questions arose.  I'm going to make a generalization here.  Poisoned cues are a function of lumping.  You can create a "punir-like" experience by being a positive reinforcement lumper, just as much as a negative reinforcement lumper.  If you ask for too much too fast, you will get uncertainty, hesitation, a reluctance to play the game.

The research looked at the effect of combining negative reinforcement lumping with positive reinforcement, but lumping in general creates unwanted side effects.  When you see someone using clicks and treats, but the horse is getting frustrated, angry, anxious, grabby, you are often seeing the effects of too much lumping on the part of a novice trainer.  One of the great strengths of clicker training is it teaches us to be splitters. So even if you start out making this sort of mistake, learning to split resolves the problem because it takes you into micro steps - below the layer that created the problem.

(And I would also venture to say that lumping with negative reinforcement creates more detrimental, long lasting side effects than lumping with positive reinforcement.  If you are a novice trainer, it is much better to make your lumping/gross timing mistakes as a clicker trainer than as a force-based trainer.  Any graduate students want to take that one on?)

Homework
So loop yourself back to the first half of this post and give yourself an assignment.  Go through the steps I wrote out for developing hip-shoulder-shoulder.  You'll see each step is just an expansion of the preceding loop.  Write out these instructions in the loop format.  Look at how the exercise expands increment by increment into the full loop we call hip-shoulder-shoulder.

Here's part two of the assignment.  Now that you've written out the basic steps, ask yourself: are these steps refined enough for your horse?  Would you need to add steps within steps to keep your loops clean?

How clean does the loop need to be initially for your horse to succeed?  You won't expect perfect loops all the time, especially not when you are first teaching a new element, but how messy can a loop be and for how long before you know you've asked for too big a jump in criterion?

And how clean is clean?  That's a question I can't answer for you.  You have to decide.  Clean enough is always in the eye of the beholder. If you are working with your straightforward, easy-going weekend riding horse, clean enough may be defined very differently than it would be for the two-year-old stud colt you are bringing on as your next performance superstar. As you gain more experience you may find yourself going back over your loops many times, polishing them up and making them even cleaner yet.

Just as you have to keep dusting out your tack room, with any horse from time to time you also have to go back and clean up your training loops. (I'd say your house instead of your tack room, but we're all horse people.  I know most of us keep a cleaner barn than we do a house.) But the point is stuff creeps in.  Little things begin to mess up loops.  The good news about resistance is it will get bigger.  If you don't notice a little unwanted clutter piling up in your loops, don't worry about it.  Over time the clutter will build up to the point where you can't help but notice. 

If you're one of those people who waits until you can no longer get through the tack room door before you do a spring cleaning, you may also tolerate a fair amount of unwanted behavior in your loops before you take note. As you build up a relationship with your horse, you'll know how much tolerance you can have for messy loops.  With some horses you'll be at ease with very casually formed loops.  With other, more reactive, or more sensitive, challenging horses, you'll want to build and maintain your loops with great care.

Loops begin with tight, clean kernels.  The foundation lessons provide us with great base loops. You can add behaviors into loops.  You can make existing behaviors within a loop more complex (for example, adding ears forward to grown-ups.) You can connect one loop to another.  You can have a loop that contains multiple clicks and treats.  You can build loops that depend upon the effect of cues working in two directions so the presentation of a cue reinforces the preceding behavior and the click and treat occurs at the end of a long chain that loops back on itself. 

A completed hip-shoulder-shoulder sequence would be an example of this latter type of loop.

The cue to go forward starts the loop in motion.
The request for the jaw reinforces the go forward response.
The second request for the jaw reinforces the first give of the jaw.
The third request for the jaw reinforces the second.
The ask for the hip reinforces the third give of the jaw.
The first step back reinforces the give of the hip.
The second step back reinforces the first step back.
Click! leading to a treat reinforces the whole unit.
The request to go forward starts another loop in motion and reinforces the loop that preceded it.

If it's all built out of clean loops and "ven" cues, the result will be a happy horse who glides across the dance floor of your arena with the skill and grace of Gene Kelly. 

Happy dancing!

Alexandra Kurland
theclickercenter.com