Differential Reinforcement and Dieting: The Flaws of the Force-Free Ideology
George is a 30-year-old man. He is 15 kg overweight. He lives on processed junk food and soft drinks, and hasn’t cooked a real meal in years. He is perpetually tired, suffers from high blood pressure, and has very low self-esteem. So, George decides to go on a diet. He consults a renowned nutritionist, who puts him on a plant-based, whole-food diet. No more processed crap, no more added sugars, fats or fast foods.
His motivation for a better, healthier life is at an all-time high. He gets home, goes through his kitchen cabinets and fridge, purging everything. Then, George goes along to the supermarket for his first healthy shopping trip.
He sees signs for chocolate, lollies, chips and biscuits; he ignores them. ‘Nope, that’s the old me! The guy that gets diabetes!’ On he goes to the produce section, filling his trolley up with leafy greens, vegetables, fruits, nuts and seeds.
The first week was hard. The cravings were almost unbearable, and the new diet seemed bland and boring. But he stuck with it. About 10 days in, his palate began to change, and he really started enjoying this new food.
Sure enough, George lost all his excess weight and he was feeling much better.
But sometimes, the cravings came back. When he visited family for Christmas and was faced with the temptations of a festive meal, he caved. When it was his birthday and his co-workers brought him a cake, he had a piece. He went on a date with a woman who wanted a casual pub meal: bangers and mash, with a glass of Coke.
Everywhere he went, some sort of treat was being advertised. Sometimes, he couldn’t help but imagine how good a Big Mac would taste, and compare it to the plate of roast veggies he had in front of him. There was always an event that, for him, was associated with high-sugar, high-fat foods. Sometimes he resisted temptation; other times it was too much.
Slowly, the weak moments became more and more frequent, and George started reverting to his old habits.
Eventually, he put all the weight he had lost back on.
George is like the 95% of people whose diets fail them. Why?
Diets rely on differential reinforcement: an attempt to replace a ‘problem’ behaviour with a ‘productive’ behaviour. Eating unhealthy food is replaced with eating healthy food.
The motivation to eat unhealthy crap comes from a strongly inbuilt genetic drive to consume high-calorie food. Our brain tells us we need that cupcake, that packet of chips, and that glass of Coke. Our gut is full of food-specific bacteria and neurons that scream out for more, more, more when we deprive them of their sustenance. People who eat a lot of sugar have a disproportionate amount of sugar-craving bacteria in their gut. The more we try to starve them, the more they trigger sugar cravings.
George’s environment was not suited to dieting, or to avoiding the food his body wanted so badly. Every time he went outside, he’d see a never-ending cycle of ads. Every time he saw one, his brain lit up: ‘yes, yes that one!’ Even in his own home, he wasn’t safe. Social media, the internet and TV bombarded him with image after image of his ‘trigger’.
Not only was this kind of food innately motivating for him to eat (despite the long list of harmful side-effects of him eating it), but it was also associated with happy times. Social gatherings and special occasions typically revolve around food… junk food. So, George was battling both a genetic predisposition towards craving processed foods and a long reinforcement history for doing so.
His motivation for the positive, productive behaviour of healthy eating (improved quality of life) pales in comparison. The same is true for the vast, vast majority of people who try – and fail – to diet.
So, what does this have to do with dog training? A portion of the training industry (force-free trainers) relies heavily on differential reinforcement to stop unwanted behaviour in dogs.
There are numerous similarities between dogs and George that often – though not always – render such an approach completely ineffective. This occurs, broadly speaking, in two cases: when the problem behaviour has a strong genetic component and/or when it has an extensive reinforcement history.
Take predatory behaviour as an example. For many breeds of dogs, part or all of the predatory sequence (find->chase->catch->kill->eat) has been selectively bred into them to the point of insanity. This drive goes beyond sugar addiction in humans, and will emerge and persist in situations devoid of what we would consider stereotypical ‘triggers’. A lurcher or sighthound bred to pursue small game, for example, may transfer its instinct to the destruction of sheep. This is becoming increasingly problematic in parts of the UK, where sheep often free-roam.
The force-free approach to the sheep-worrying dog would be, first, to control their environment: put the dog on a short or long leash so that they cannot get any reinforcement for the problem behaviour, and avoid areas frequented by sheep (don’t go outside George, don’t turn on the TV or look at your phone so you don’t see images of unhealthy food). Second, an alternative behaviour (the recall) is taught and rewarded heavily. The goal is for the alternative behaviour to outcompete the problem behaviour in terms of value in the dog’s head (take this veggie platter, not that big slice of chocolate cake). However, unless the trainer can produce a live rabbit as a reward for recalling… this is unlikely to effect permanent and reliable behaviour change.
Behaviours with a long reinforcement history, such as leash pulling or jumping on people, are often as resistant to change as genetically reinforced behaviour. A dog with years’ worth of reinforcement for jumping on people (30 years’ worth of birthdays = cake) is unlikely to consistently offer an alternative sitting behaviour with a measly few weeks’ worth of reward (5 years’ worth of birthdays = riced cauliflower and veggies).
When differential reinforcement is combined with the intelligent, measured and targeted use of aversives, however, reliability and consistency are highly likely. If George had caved and opted to eat his favourite meal (a Big Mac, chips and a drink) and subsequently been struck down with food poisoning, his brain would likely have been repulsed by everything associated with that meal henceforth. The safety of the plant-based, whole-food meal that he prepares himself would be a far likelier choice for George in the future, and would be far more appealing too because it has never caused an upset tummy.
For the predatory dog, an aversive stimulus associated with sheep will – when applied correctly – effectively inhibit all predatory behaviour towards the sheep, and the dog will seek to avoid them in the future. This, combined with training a recall (an alternative behaviour to those associated with predation), can be incredibly reliable at extinguishing the problem.
If we could eliminate 100% of the advertisements for unhealthy food that George saw in his day-to-day life, restrict his access at the supermarket so that he could buy only healthy whole foods, have everyone in his life change their cultural leanings towards using junk food for celebrations, have all fast food shops ban George from purchasing their food, and do whatever else it took to prevent his cravings from being triggered, he might eventually kick the addiction. After a year, perhaps? Two? Then we could start removing those environmental controls, slowly reintroducing triggers once he had a better handle on his habits. Maybe after 30 years he could stop living in a bubble.
Controlling the environment of dogs often becomes a life-long strategy for those adopting an aversive-free ideology. Dogs are permanently kept on leash, they’re prevented from getting in the vicinity of their ‘triggers’, and they are thus often denied any freedom to fulfil their physical and behavioural needs.
This is no way to live a happy life for a dog, or their owner. Being the troll under the bridge is just as emotionally exhausting as it is frustrating for the traveller trying to cross.
A behavioural rehabilitation and training plan that relies heavily on long-term management and differential reinforcement without including the deliberate use of aversives is flawed and ineffective, plain and simple.