Train her well: slave learning theory

"Learning Theory" is a discipline of psychology that attempts to explain how an organism learns. It consists of many theories of learning, including instincts, social facilitation, observation, formal teaching, memory, mimicry, classical and operant conditioning. It is these last two that are of most interest to animal trainers, and likewise of interest to those who train slaves. These types of conditioning are not isolated only to animals, but are applicable to behavioral modification in humans as well.

Classical Conditioning Theory

Classical Conditioning is the type of learning made famous by Pavlov's experiments with dogs. The gist of the experiment is this: Pavlov presented dogs with food, and measured their salivary response (how much they drooled). Then he began ringing a bell just before presenting the food. At first, the dogs did not begin salivating until the food was presented. After a while, however, the dogs began to salivate when the sound of the bell was presented. They learned to associate the sound of the bell with the presentation of the food. As far as their immediate physiological responses were concerned, the sound of the bell became equivalent to the presentation of the food.

Classical conditioning is used by trainers for two purposes: to condition (train) autonomic responses, such as drooling, producing adrenaline, or reducing adrenaline (calming), without using the stimuli that would naturally create such a response; and to create an association between a stimulus that normally would not have any effect on the subject and a stimulus that would.

Stimuli that subjects react to without training are called primary or unconditioned stimuli (US). They include food, pain, and other "hard-wired" or "instinctive" stimuli. We do not have to learn to react to an electric shock, for example. Pavlov's dogs did not need to learn about food.

Stimuli that subjects react to only after learning about them are called secondary or conditioned stimuli (CS). These are stimuli that have been associated with a primary stimulus. In Pavlov's experiment, the sound of the bell meant nothing to the dogs at first. After its sound was associated with the presentation of food, it became a conditioned stimulus. If a warning buzzer is associated with the shock, the animals will learn to fear it. This concept is identical for humans. For example, the alarm clock going off every morning has 'conditioned' many people to wake suddenly, within several seconds of the clock going off. At first, this sound only disturbed sleep, but after a while, it was associated with the need to wake up and go to work.
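The way the bell gradually takes on the meaning of the food can be sketched with a few lines of code. This is only a rough illustration: it uses the Rescorla-Wagner update rule, a standard textbook model that is not part of the discussion above, and the learning rate and number of pairings are made-up values chosen for the example, not measurements.

```python
def associative_strength(pairings, learning_rate=0.3, max_strength=1.0):
    """Return the bell-food association strength after each pairing.

    Rescorla-Wagner style update: every pairing closes a fixed fraction
    of the gap between the current strength and the maximum.
    learning_rate and max_strength are assumed, illustrative values.
    """
    strength = 0.0
    history = []
    for _ in range(pairings):
        strength += learning_rate * (max_strength - strength)
        history.append(strength)
    return history


if __name__ == "__main__":
    for trial, value in enumerate(associative_strength(10), start=1):
        print(f"pairing {trial:2d}: strength {value:.2f}")
```

The numbers rise quickly at first and then level off, which matches the description above: the first few pairings do most of the work, and later ones mostly maintain the association.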

Secondary stimuli are things that the trainee has to learn to like or dislike. Examples include school grades and money. A slip of paper with an "A" or an "F" written on it has no meaning to a person who has never learned the meaning of the grade. Yet students work hard to gain "A's" and avoid "F's". A coin or piece of paper money has no meaning to a person who doesn't use that sort of system. Yet people have been known to work hard to gain this secondary reinforcer.

Application

Classical conditioning is very important to animal trainers, because it is difficult to supply an animal with one of the things it naturally likes (or dislikes) in time for it to be an important consequence of the behavior. For example, it is hard to reward your slave with a piece of chocolate while she's at home, cleaning the house, and you're at work. So trainers will associate something that's easier to "deliver" with something the subject wants through classical conditioning. Some trainers call this a bridge (because it bridges the time between when the animal performs a desired behavior and when it gets its reward). Marine mammal trainers use a whistle. Many other trainers use a clicker, a box with a metal tongue that makes a click-click sound when you press it. It is common for many Masters and Mistresses to use certain words and phrases (such as "Good girl").

You can condition the subject to the noise (such as a clicker) by clicking it and delivering some desirable reward, many times in a row. Simply click the clicker, pause a moment, and give the dog (or other animal) the treat. After you've done this a few times, you may see the animal visibly startle, look towards the treat, or look to you. This indicates that she's starting to form the association. It's called "creating a conditioned reinforcer". The click sound becomes a signal for an upcoming reinforcement. As a shorthand, some clicker trainers will say that the click = the treat. How could you apply this to slave training? If, for instance, you are training your slave to respond to the words "beautiful slave," you may, over the period of a week or two, say these words and stroke the slave's hair, neck, or breast gently. You must associate the words with the action by keeping it consistent, every time, at the same speed. Within a few weeks, the words "beautiful slave" will cause the slave to relax, or smile, and become very warm to you within a matter of seconds.

Operant Conditioning

Classical conditioning forms an association between two stimuli. Operant conditioning forms an association between a behavior and a consequence. (It is also called response-stimulus or RS conditioning because it forms an association between the subject's response [behavior] and the stimulus that follows [consequence].)

Four Possible Consequences

There are four possible consequences to any behavior. They are:

Something Good can start or be presented
Something Good can end or be taken away
Something Bad can start or be presented
Something Bad can end or be taken away

Consequences have to be immediate, or clearly linked to the behavior. With verbal humans, we can explain the connection between the consequence and the behavior, even if they are separated in time. For example, you might tell a friend that you'll buy dinner for them since they helped you move, or a parent might explain that the child can't go to summer camp because of her bad grades. With very young children, humans who don't have verbal skills, and animals, you can't explain the connection between the consequence and the behavior. For an animal, the consequence has to be immediate. The way to work around a delay is to use a bridge (see above). In many training cycles with slaves, you are actively trying to get the slave to react, not think-then-react, producing immediate obedience, so immediate consequences work well for this type of training too.

Technical Terms

The technical term for start or "be presented" is positive or additive, since it's something that's added to the trainee's environment.

The technical term for "end or be taken away" is negative or subtractive, since it's something that's subtracted from the trainee's environment.

Anything that increases a behavior - makes it occur more frequently, makes it stronger, or makes it more likely to occur - is termed a reinforcer. Often, an animal (or person) will perceive "starting Something Good" or "ending Something Bad" as something worth pursuing, and they will repeat the behaviors that seem to cause these consequences. These consequences will increase the behaviors that lead to them; they are reinforcers. These are consequences the trainee will work to attain, so they strengthen the behavior.

Anything that decreases a behavior - makes it occur less frequently, makes it weaker, or makes it less likely to occur - is termed a punisher. Often, an animal (or person) will perceive "ending Something Good" or "starting Something Bad" as something worth avoiding, and they will not repeat the behaviors that seem to cause these consequences. These consequences will decrease the behaviors that lead to them; they are punishers.

Applying these terms to the four possible consequences:

Something Good can start or be presented, so behavior increases = Additive Reinforcement (R+)
Something Good can end or be taken away, so behavior decreases = Subtractive Punishment (P-)
Something Bad can start or be presented, so behavior decreases = Additive Punishment (P+)
Something Bad can end or be taken away, so behavior increases = Subtractive Reinforcement (R-)

or:

	Reinforcement (behavior increases)	Punishment (behavior decreases)
Additive (something added)	Additive Reinforcement: something added increases behavior	Additive Punishment: something added decreases behavior
Subtractive (something removed)	Subtractive Reinforcement: something removed increases behavior	Subtractive Punishment: something removed decreases behavior

Remember that these definitions are based on the actual effect on the behavior in question: a consequence must strengthen or reduce the behavior to be defined as a reinforcement or a punishment. Pleasures meant as rewards that do not strengthen a behavior are indulgences, not reinforcement; aversives meant to weaken a behavior that do not actually weaken it are simply ineffective.
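The four labels can also be written down as a small lookup. This is just a sketch to make the naming rule concrete; the function name and its string arguments are invented for the example. Note that it takes the measured effect on the behavior as its input, not the trainer's intention, which is the point made above.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the operant-conditioning quadrant for an observed consequence.

    stimulus_change: "added" or "removed" (what happened to the stimulus)
    behavior_change: "increases" or "decreases" (the effect actually seen)
    Both argument conventions are hypothetical, chosen for this sketch.
    """
    kinds = {"added": ("Additive", "+"), "removed": ("Subtractive", "-")}
    effects = {"increases": ("Reinforcement", "R"),
               "decreases": ("Punishment", "P")}
    if stimulus_change not in kinds or behavior_change not in effects:
        raise ValueError("expected added/removed and increases/decreases")
    kind, sign = kinds[stimulus_change]
    effect, letter = effects[behavior_change]
    return f"{kind} {effect} ({letter}{sign})"


if __name__ == "__main__":
    for change in ("added", "removed"):
        for result in ("increases", "decreases"):
            print(change, result, "->", classify_consequence(change, result))
```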

Positive (Additive) Reinforcement

This is possibly the easiest, most effective consequence for a trainer to control (and easy to understand, too!). Additive reinforcement means starting or adding Something Good, something the trainee likes or enjoys. Because the trainee wants to gain that Good Thing again, they will repeat the behavior that seems to cause that consequence.

Examples of additive reinforcement:

The dolphin gets a fish for doing a trick. The worker gets a paycheck for working. The dog gets a piece of liver for returning when called. The slave gets to go out for dinner for cleaning the house. The cat gets comfort for sleeping on the bed. The wolf gets a meal for hunting the deer. The child gets dessert for eating her vegetables. The dog gets attention from his people when he barks. The child gets ice cream for begging incessantly. The slave gets a hug for pouting. The dog gets to play in the park for pulling her owner there. The snacker gets a candy bar for putting money in the machine.

Secondary additive reinforcers and Bridges

A primary additive reinforcer is something that the trainee does not have to learn to like. It comes naturally, no experience necessary. Primary R+s usually include food, often include sex, the chance to engage in instinctive behaviors, and for social animals, the chance to interact with others.

A secondary additive reinforcer is something that the subject has to learn to like. The learning can be accomplished through Classical Conditioning or through some other method. A paycheck is a secondary reinforcer - just try writing a check to reward a young child for potty training!

Trainers will often create a special secondary reinforcer they call a bridge. A bridge is a stimulus that has been associated with a primary reinforcer through classical conditioning. This process creates a conditioned additive reinforcer, often called a conditioned reinforcer or CR for short. Trainees that have learned a bridge react to it almost as they would to the reward that follows (such as saying "beautiful slave" will get an immediate reaction, though the slave hasn't been touched).

Schedules of Reinforcement, and Extinction

A schedule of reinforcement determines how often a behavior is going to result in a reward. There are five kinds: fixed interval, variable interval, fixed ratio, variable ratio, and random.

A fixed interval means that a reward will occur after a fixed amount of time, for example, every five minutes. Paychecks are an example of fixed interval reinforcement.

A variable interval schedule means that reinforcers will be distributed after a varying amount of time. Sometimes it will be five minutes, sometimes three, sometimes seven, and sometimes one. My e-mail account works on this system - at varying intervals I get new mail (for me this is a Good Thing!).

A fixed ratio means that if a behavior is performed X number of times, there will be one reinforcement on the Xth performance. For a fixed ratio of 1:3, every third behavior will be rewarded. This type of ratio tends to lead to lousy performance with some animals and people, since they know that the first two performances will not be rewarded, and the third one will be no matter what. This is not a schedule of reinforcement I would suggest for slaves. A fixed ratio of 1:1 means that every correct performance of a behavior will be rewarded.

A variable ratio schedule means that reinforcers are distributed based on the average number of correct behaviors. A variable ratio of 1:3 means that on average, one out of every three behaviors will be rewarded. It might be the first. It might be the third. It might even be the fourth, as long as it averages out to one in three. This is often referred to as a variable schedule of reinforcement or VSR (in other words, it's often assumed that when someone writes "VSR" they are referring to a variable ratio schedule of reinforcement). This is the most effective schedule for use on slaves, because it incorporates motivation to keep the behavior consistent. If you were rewarding your slave with ice cream every few days or so for keeping the laundry done, you might find the laundry is consistently done and done well, in anticipation that "today" is the ice cream day.

With a random schedule, there is no correlation between the trainee's behavior and the consequence. With human training it is important that the trainee understand the connection between the behavior and result or the lack of logical connections will cause the behavior to change drastically in order to find something else that works better.
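The difference between a fixed ratio and a variable ratio is easy to see in a short simulation. This is a minimal sketch, not a definitive model: the function names are made up for the example, and the variable ratio is approximated by rewarding each response with probability 1/ratio (strictly speaking a "random ratio" schedule), which gives the right average without a predictable pattern.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Reward every `ratio`-th response. Returns one boolean per response."""
    return [(i % ratio) == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, ratio, seed=0):
    """Reward each response with probability 1/ratio.

    On average one response in `ratio` is rewarded, but which one is
    unpredictable. The seed is an assumption of this sketch and is only
    there to make the run repeatable.
    """
    rng = random.Random(seed)
    return [rng.random() < 1.0 / ratio for _ in range(n_responses)]


if __name__ == "__main__":
    print("fixed 1:3:", fixed_ratio(9, 3))
    rewards = variable_ratio(3000, 3)
    print("variable 1:3 reward rate:", sum(rewards) / len(rewards))
```

With the fixed schedule the rewarded responses fall in a regular pattern the trainee can learn to predict; with the variable schedule the long-run rate is about the same, but no single response can be written off in advance.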

If reinforcement fails to occur after the subject performs behavior that has been reinforced in the past, the behavior might go away entirely. This process is called extinction. The subject sees that whatever s/he is doing has failed to work, so they will stop or begin a new behavior in order to look for the treat again. A variable ratio schedule of reinforcement makes the behavior less vulnerable to extinction. If you're not expecting to gain a reward every time you accomplish a behavior, you are not likely to stop the first few times your action fails to generate the desired consequence. Slot machines use variable reinforcement to avoid extinction: "OK, I didn't win this time, but next time I'm almost sure to win!"

When a behavior that has been strongly reinforced in the past no longer gains reinforcement, you might experience what's called an extinction burst. This is when the animal performs the behavior over and over again, in a burst of activity. Extinction bursts are something for trainers to watch out for! This is less common in humans, but with slaves you may see it in some sexual situations. The slave has learned how to perform fellatio a certain way, with a certain movement of tongue or mouth. If suddenly, this learned skill no longer works to produce a pleasurable sound from her Master, or an orgasm, she may perform it harder and faster. If this extra burst of behavior fails to bring the reward, it will stop altogether. This is why you must remain consistent with rewards.

One of the animal trainers I learned with has cautioned against needlessly using variable schedules. Most useful behaviors, he points out, will get some sort of reinforcement every time. You might not always click and treat your dog for sitting on cue, but you will always reward it with some recognition and praise ("Good dog!"). If there are some circumstances where you will be unable to deliver any reinforcement (during a long sequence of behaviors, or when the animal is out of contact), then you will need to build a buffer against extinction with a VSR. Otherwise, don't bother. This is difficult to do in a situation where you will not be with the subject 24 hours a day.

Cautions in using positive reinforcement:

If the subject is afraid or fearful while performing the behavior, you may be rewarding the fear. For instance, if you get into a loud argument with someone else, in front of the slave, and specifically tell her to "remain quiet," she may associate being fearful of her Master yelling with being quiet. Then when you tell her later that she was good for remaining quiet, the next time you need her to answer you when you're reprimanding her, she may say nothing at all.

The timing must be precise. The reward must follow directly after the behavior and not several minutes afterward.

The reward has to be sufficient to motivate a repetition. Mild praise won't be enough for some slaves. Others require the larger displays of approval, etc.

Reinforcements can become associated with the person giving them. If the slave realizes that she can't get any rewards without you present, she will not be motivated to act when you are not there.

Subjects can get sated with the reward you're offering when they've had enough, and it will no longer be motivating. This is common when using orgasms as a reward: after one or two, the slave is not inspired to continue behaving well to gain another one.

Reinforcers increase behavior. If you don't want your trainee actively trying out new behaviors ("throwing behaviors at the trainer"), don't use random reinforcement. Use a positive reinforcement to train a subject to do something. This is very, very common in slaves who will try a plethora of behaviors to get attention, from dressing nicely, wearing their owner's favorite scent, acting up, etc.

Negative (Subtractive) Punishment

Subtractive punishment is reducing behavior by taking away Something Good. If the subject was enjoying or depending on Something Good, she will work to avoid having it taken away, and she is less likely to repeat a behavior that results in the loss of a Good Thing. This type of consequence is a little harder to control.

Examples

The child has his crayons taken away for fighting with his sister. The window looking into the other monkey's enclosure is shut when the first monkey bites the trainer. "This car isn't getting any closer to Disneyland while you kids are fighting!" The dog is put on leash and taken away from the park for not coming to the owner when the owner called. The slave is taken out of a store for back-talking her owner.

Secondary Subtractive Punishers

Trainers seldom go to the trouble of associating a particular cue with negative punishment. Such a cue is sometimes called a "delta" stimulus. Some dog owners make the mistake of calling their dogs in the park and then using the negative punishment of taking the dog away from the fun. "Fido, come!" then becomes a conditioned negative punisher. The most common ones for use with slaves are statements like "Go upstairs and sit on the bed and wait for me" when the slave is in trouble. The statement itself (not the actual waiting) comes to produce strong nervousness.

Additive Punishment

Positive punishment is something that is applied to reduce a behavior. The term "positive" often confuses people, because in common terms "positive" means something good, so we will refer to it as additive punishment. Also keep in mind that with these terms, it is not the trainee that is "punished" (treated badly to pay for some moral wrong), but the behavior that is punished (reduced). Additive punishment, when applied correctly, is the most effective way to stop unwanted behaviors. Its main flaw is that it does not teach specific alternative behaviors.

Examples

Both our society and nature seem to have a great fondness for additive punishment, in spite of all the problems associated with it (see below). The peeing on the rug (by a puppy) is punished with a swat of the newspaper. The driver's speeding results in a ticket and a fine. The baby's hand is burned when she touches the hot stove. Walking straight through low doorways is punished with a bonk on the head. A slave tries to get out of her restraints, and falls over. In all of these cases, the consequence (the additive punishment) reduces the behavior's future occurrences.

Secondary Additive Punishers

Because an additive punisher, like other consequences, must follow a behavior immediately or be clearly connected to the behavior to be effective, a secondary additive punisher is very important. (This is especially true if the punisher is going to be something highly aversive or painful.) Many trainers actively condition the word "No!" with some punisher, to form an association between the word and the consequence. The conditioned punisher (CP+) is an important part of training with Operant Conditioning.

Cautions in using Additive Punishment:

Behaviors are usually motivated by the expectation for some reward, and even with a punishment, the motivation of the reward is often still there. For example, a predator must face some considerable risk and pain in order to catch food. A wolf must run over rough ground and through bushes, and face the hooves, claws, teeth, and/or horns of their prey animals. They might be painfully injured in their pursuit. In spite of this, they continue to pursue prey. In this case, the motivation and the reward far outweigh the punishments, even when they are dramatic.

The timing of a positive punishment must be precise. It must correspond exactly with the behavior for it to have an effect. (If a conditioned punisher is used, the CP+ must occur precisely with the behavior). If your slave forgets her verbal protocols at dinner, and multiple times you punish her by taking her out of the restaurant for it, she will form an association with going to dinner = messing up, and will become very stressed when going out after that. However, if you give her a command to be silent for an extended period of time, she will associate the verbal correction with the verbal mistake.

The aversive must be sufficient to stop the behavior in its tracks - and must be greater than the reward. The more experience the trainee has with a rewarding consequence for the behavior, the greater the aversive has to be to stop or decrease the behavior. If you start with a small aversive (mild spanking or a stern talking-to) and build up to a greater one (hard whipping or full-on yelling), your trainee may become adjusted to the aversive and it will not have any greater effect.

Punishments may become associated with the person supplying them. The dog that was hit after chewing on the furniture may still chew on the furniture, but he certainly won't do it when you're around! This applies to slaves to a large degree.

Physical punishments can cause physical damage, and mental punishments can cause mental damage. You should only apply as much of an aversive as it takes to stop the behavior. Do not over-punish for a little infraction, and do not continually punish. Slaves especially need a sense of closure; she knelt on rice for 15 minutes, and now it's over with; hug her and get on with it. I recommend the use of reinforcers with slaves after a punishment has been endured and their behavior has now been corrected. This is helpful for slaves to realize that the behavior is now acceptable and you are no longer upset.

Punishers suppress behaviors. Use additive punishment to train a subject not to do something; it will not train the subject to do something else instead.

Subtractive Reinforcement

Subtractive reinforcement increases a behavior by ending or taking away Something Bad or aversive. By making the trainee's circumstances better, you are rewarding it and increasing the likelihood that it will repeat the behavior that was occurring when you ended the Bad Thing.

In order to use subtractive reinforcement, the trainer must be able to control the Bad Thing that is being taken away. This often means that the trainer must also reapply the Bad Thing, and reapplying a Bad Thing might reduce whatever behavior was going on when the Bad Thing was reapplied. And reducing a behavior by reapplying a Bad Thing is additive (positive) punishment.

Examples

The choke collar is loosened when the dog moves closer to the trainer. The whipping stops when the slave apologizes. The reins are loosened when the horse slows down. The car buzzer turns off when you put on your seatbelt. Dad continues driving towards Disneyland when the kids are quiet. You stop pulling the slave's hair when she finally kneels.

Secondary Subtractive Reinforcers

Trainers seldom go to the trouble of associating a particular cue with subtractive reinforcement, but it can be done. Not generally advisable for slaves, as they will get confused with punishments.

Internal Reinforcers and Punishers

Unfortunately, trainers cannot control all reinforcers and punishers. There are a number of environmental factors that are going to affect the subject's behavior that you have no control over, but which will still be a significant consequence for your trainee.

Some of these come from the subject's internal environment - their own reactions. Relief from stress, pain, or boredom are common reinforcers and some "self-reinforcing" behaviors are actually maintained because of this. Examples: a dog barking because it relieves boredom, or a person chewing on her fingers or smoking a cigarette because it relieves stress. Drivers speed because it is fun. Guilt is an internal punisher that some people experience, and guilt from not pleasing their owner is common in slaves.

This is offered as a learning tool for Masters and Mistresses, and even the slaves and submissives themselves. Sometimes, understanding the tools that are used can make them more effective. The examples here are only that: examples. They are not "the way" nor how you should do it. These are simply guidelines for developing your own training systems.
