6.1 Introduction

Some concepts are quite useless in their broadest meaning. Motivation is such a concept. There is, however, a more restricted sense of the word which is useful or even essential in any theory of cognition. This is the idea of motivation as a central state reflecting the combination of internal needs and external possibilities. In the motivational system, these two sources of information come together and they are used to form a decision on what to do. The representation of this decision constitutes the motivational state of the animal (See section 3.2), which tells the animal what it should do.

In this chapter, we will investigate the role of motivation as a determinant of behavior and learning. We will try to show that motivation and emotion play an important role in cognitive processes, a role that has often been overlooked in the past. The traditional dichotomy between cognition and emotion is probably responsible for the lack of interest in motivational theories within cognitive science (LeDoux 1995). This is unfortunate for a number of reasons.

First, there seems to be no clear distinction between cognition, motivation and emotion in our daily activities. For example, the planning of a vacation is usually considered to be a cognitive activity. Every step of the planning process is guided by one's ideas of a good holiday, one's hopes, fears, etc. Without this side of the planning process, the vacation would hardly be worth planning, since it could only result in any of an infinite number of equally valued plans, none of which would fit one's requirements for a good holiday. It appears that all thinking is biased by emotive values in this way.

Next, cognition cannot be understood without reference to motivation. Higher cognitive processes like categorization and problem solving play a role only when they are used to fulfil some goal of the individual, that is, only when they are motivated. This is unfortunately an aspect of cognition that is often ignored.

Last, it is important to realize that motivation precedes cognition. Cognition depends on the motivational system, whereas the motivational system can operate without cognition. This point will be developed further in section 6.8. We will also argue that the role of emotions is to tell the animal what it should have done, when its expectations are inaccurate or its behavior is unsuccessful.

We will see that the motivational state has two important properties: First, the motivational state is central to the whole organism. This means that at all times, the organism is influenced by one single motivational state which selects a single engagement. Second, the motivational state is dynamic. As the internal needs and external possibilities change or are reevaluated, the motivational state may change at any time.

6.2 The Determinants of Motivation

Learning is just one determinant of behavior. The other important determinant is motivation. Some early theories did not consider motivation as a source of behavioral direction. Instead, they focused on learning alone. For example, the experiments of Pavlov (1927) essentially ignored the role of motivation. This is curious, since the classical association of a bell with salivation could only be learned when the dog was hungry. A satiated dog simply does not show the salivation reflex.

In Thorndike's experiments, the role of motivation was even less clear. The cat in its box was given food when it had managed to escape, but this was more or less an afterthought. It was never clearly established whether the cat tried to escape because it wanted food or because confinement is aversive to the animal.

The role of motivation compared to learning was clearly shown in an experiment by Clark (1958). A number of rats were trained to press a bar in order to obtain food. The rats were then placed in different groups and deprived of food for between 1 and 23 hours. The longer the rats had been deprived of food, the faster they pressed the bar to obtain it. Since all the rats had received equal training, the differences could not be due to differences in learning. Consequently, the experiment shows that motivation, or in this case, the deprivation of food, plays a significant role in determining behavior.

Hull (1943) proposed that deprivation induces an aversive state in the organism. This state was called a drive. In his conception, drive increases the overall level of arousal in an animal. Drive was thus considered as a property of need states which motivates behavior. Hull considered drive to have an overall effect on behavior in that it energizes it, but does not determine what behavior is performed. This is usually called a generalized drive and has now been replaced by the concept of arousal (Gray 1975). The theory can be summarized in Hull's law,

$$ {}_{S}E_{R} = D \times V \times K \times {}_{S}H_{R} $$ (Equation 6.2.1)

where D is the drive level, V the stimulus intensity, K the incentive motivation - a concept we will return to below - and SHR is the habit strength that results from learning. They all multiply to produce the reaction potential, SER, which is a measure of the likelihood that an animal will produce a particular response, R, given a stimulus, S.

Later studies have shown that the role of drive is to selectively increase the frequency of behaviors likely to reduce the drive (See Bolles 1967). In Hull's formulation an increased drive caused by deprivation of food would increase the likelihood of any behavior, not only of eating. Taking this into account, we must relate both the drive and the incentive to a particular type of motivation and Hull's law changes to,

$$ {}_{S}E_{R} = D_{m} \times V \times K_{m} \times {}_{S}H_{R} $$ (Equation 6.2.2)

Here, Dm is a specific drive, such as hunger, which relates to a particular motivation, m, for example motivation to engage in food seeking behavior or eating. The incentive motivation must also be related to the particular motivation that is indicated by the subscript m. Later studies have also shown that the interaction between deprivation and learning may be more complicated than multiplication (Bolles 1969), but the basic idea is very sensible and, as we will see below, the concept of motivation can be put to good use as the basis for a theory of choice.
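As a rough numerical illustration of the multiplicative relation described above, the following Python sketch computes a reaction potential from a specific drive, a stimulus intensity, an incentive and a habit strength. The function name and all numerical values are assumptions made for illustration only; they are not taken from Hull or from the experiments cited.

```python
def reaction_potential(drive, stimulus_intensity, incentive, habit_strength):
    """Multiply the four factors described in the text (illustrative sketch)."""
    return drive * stimulus_intensity * incentive * habit_strength

# Two animals with equal training (same habit strength) but different
# levels of food deprivation. All values are made up for illustration.
mild = reaction_potential(drive=0.2, stimulus_intensity=1.0,
                          incentive=0.8, habit_strength=0.5)
severe = reaction_potential(drive=0.9, stimulus_intensity=1.0,
                            incentive=0.8, habit_strength=0.5)

# With zero drive the product is zero, however strong the habit is.
sated = reaction_potential(drive=0.0, stimulus_intensity=1.0,
                           incentive=0.8, habit_strength=0.5)
```

Because the factors multiply, equal training combined with unequal deprivation yields unequal reaction potentials, which is the pattern reported in the bar-pressing experiment described earlier.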

Another possible extension, which we will not investigate further here, is to use vectors for the representations of drives. This makes it possible to have a whole set of motivational states which represent, for example, hunger (McFarland and Sibly, 1972). This corresponds to the idea that one can have hunger for different types of foods.

6.3 Drives

Drive is probably one of the vaguest terms in psychology. It has been used in many different meanings, some of which have gained a very bad reputation. The use of drive that is intended here is as a state or value representing the urgency of a behavior. In some cases, a drive relates to a physiological state produced by deprivation or by increased levels of food, a hormone, etc. In other cases, it relates to the presence of a noxious stimulus such as a loud noise, pain, heat, etc. In these cases, drives are equivalent to need states, but we will also allow constant drives which compete with these more usual drives. Constant drives represent the relative importance of different engagements which do not change over time.

Drives, thus, have the role of representing the relative importance of an engagement at a certain time. This implies that a creature must have one, or possibly many, drives for each of its innate engagements. We can identify at least six different types of drives.

Homeostatic Drives The first type of drive that we will consider is generated by violation of homeostasis. This includes, for example, hunger and thirst, but also the responses to heat and cold. Violation of homeostasis induces a drive signal into the motivational system. The probability of an engagement increases with larger deviation from the optimal level but the drive signal does not directly activate a behavior.

It is instructive to compare this notion of drives with the behavioral model proposed by Powers (1973). Powers' model is built on the assumption that the present action of an organism is a function of its present perceptions and an internal reference perception. These are compared to generate an error signal which facilitates behavior. This is exactly the idea that the discrepancy between the present state of the organism (the present perception) and its desired state (the reference perception) drives behavior. The idea of an error signal is very similar to the idea of a drive. (Compare this idea with the discussion of approach behavior in chapter 4).
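The comparator idea can be sketched in a few lines of Python. This is only a minimal illustration of an error signal computed from a present perception and a reference perception; the function name and the temperature values are our own assumptions and are not part of Powers' model.

```python
def error_signal(perception, reference):
    """Discrepancy between the desired and the present state (illustrative)."""
    return reference - perception

# Hypothetical body temperature example: the larger the deviation from the
# reference value, the larger the signal that would drive corrective behavior.
small_deviation = error_signal(perception=36.5, reference=37.0)
large_deviation = error_signal(perception=35.0, reference=37.0)
no_deviation = error_signal(perception=37.0, reference=37.0)
```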

In Powers' models, this idea is generalized to hierarchically connected comparators. The comparator on each level generates the reference perception to the system below it. What starts out as a mechanism for maintaining homeostasis has turned into a behavioral control system. In the process, the motivational system has disappeared and so has the ability of the system to select among different engagements. Though elegant, Powers' model cannot explain the selection of one engagement among many when several discrepancies are present. However, it deserves recognition as an early and thorough system model of horizontal decomposition of competence (Compare Brooks 1986, and Schnepf 1991).

Powers' model treats behavior as a function of two variables: the present perception and the reference perception. We want to argue that three variables are needed: the present perception, the present motivational state and the 'reference perception'. However, the 'reference perception' need not be explicitly represented since it is essentially fixed by evolution.

Discrepancies can thus be detected directly without reference to the optimal state of the organism. Thus, the comparators do not exist in a physical sense and the reference perceptions are not really perceptions at all. Yet, the organism behaves as if they were present.

Also, as we have already discussed, the role of the present perceptions is not to determine the violation of homeostasis, but to modulate the motivational state. This is especially important in, for example, sexual and maternal behavior. Motivation for these is not brought about by lack of homeostasis.

In the following, we will stick to the simplistic model of homeostatic drives described here, but there are admittedly a number of extensions which could be made. The most important extension is that most animals do not eat because they are hungry, but rather to avoid becoming hungry (Collier 1983). This means that an animal must be able to anticipate its future needs or, alternatively, defend its food intake rather than homeostasis. We can still use the notion of drive in these cases by allowing it to have a less direct relation to immediate physiological need.

Noxious Drives We may also consider the signals from noxious stimuli as drives. This includes, for example, the sensation of pain. Since noxious signals will also constitute external incentives, it is possible to consider the drive for avoidance of noxious stimuli as constant. As we will see below in the discussion of choice mechanisms, there is reason to include constant drives of this type, since it makes it easy to model individual dispositions of an animal as the levels of these constant drives. For example, an animal with a lower level on its 'pain avoidance' drive will appear to be more resistant to pain than one with a higher level on this drive, since it will engage in avoidance more seldom and with less vigor.

Cyclical Drives There are a number of cyclical drives which vary with the time of day or the year. These are not directly controlled by an internal or external stimulus. Instead they are generated by an oscillator. This oscillator in turn is influenced by external stimuli such as the length of the day, odors, or the amount of light in the environment. In this group of drives we find, for example, tiredness and wakefulness, the sexual drives of most animals, and migratory drives.

It is interesting to consider to what extent hunger drives are also of this type. Perhaps hunger is also controlled by an internal oscillator which locks to the cycle which decides the time when an animal usually gets hungry and eats. A mechanism of this type would greatly simplify the anticipation of future hunger (Collier 1983), but we will not consider it further here. It could also potentially account for fluctuation in eating behavior depending on the time of day or the season.

Default Drives The fourth type of drives can be called default drives since they only influence the animal when no external cue is present which commands the animal to do something else. The most important drive of this type is concerned with grooming (Bolles 1984). Such behavior is performed when the animal has nothing else to do. Typically, grooming behaviors consist of a fixed-action pattern (See sections 2.3 and 2.4). Activities of this type are also used as displacement activities, that is, behaviors that are generated when an internal motivational conflict blocks the execution of a more appropriate behavior.

Exploratory Drive Another drive which is similar to default drives, in that it tries to activate behavior when the animal has nothing else to do, is the exploratory drive. The role of this drive is to let the animal try out behaviors, either at random or directed by some exploratory mechanism, to learn about their consequences in a certain environment. The exploratory drive can also interact with the perception of unknown stimuli to produce exploratory behavior which is directed toward a specific stimulus. In chapters 7 and 8, we will see how the exploratory drive interacts with learning and orienting behavior to produce efficient search of a novel environment.

Anticipatory Drives Finally, there is reason to believe that, in higher animals at least, there exists a sixth source of drive signals. Signals of this type are internally generated, but do not relate to any present need of the animal. They can influence any other drive and their purpose is to simulate a drive, typically a homeostatic one, which is not present. This is necessary for the planning of future needs (Gulz 1991). This type of drive will be further considered in chapter 9 in relation to planning and anxiety.

The anticipatory drive is similar to the default drives and the exploratory drives in that it does not need sensory stimulation to become effective. It can be held at a constant level and activate anticipatory planning behavior when the animal has nothing more important to do (See Gulz 1991, and chapter 9).

Figure 6.3.1 Motivation as a function of drive (D1 and D2). (a) The level of motivation (M1 and M2) interacts with the output from the behavior modules (B1 and B2) through multiplication. (b) Competition between the two motivational states allows only one behavior module to control behavior at a time.

Motivation as Competing Drives

We see that drives can be classified as constant or varying. In many cases, it may not be sensible to consider constant drives as drives at all, but in everyday language we talk about a curiosity drive much in the same way as a hunger drive. Although the mechanisms controlled by these two drives are very different, there is an underlying similarity in how the drives are involved and this makes it reasonable to use a similar formalization of both cases.

Let us now look at how the motivational state of an animal is changed by its drives. To a first approximation, the strength of a motivation, m, called Mm, depends only on a corresponding drive, Dm, which gives the following formula for the calculation of motivation:

(Equation 6.3.1)

In figure 6.3.1a, a simple network is drawn which can calculate motivation in this way. The motivational state influences behavior by facilitating the behavior commands sent from the two behavior modules (Werner 1994). In figure 6.3.1b, competition between the two motivational states is introduced as described in section 1.4. As a consequence, only one motivation can be active at a time. This property of the motivational system is useful when the additive composition of the behaviors generated by the different behavior modules is not meaningful. When competition is included in the motivational system (figure 6.3.1b), the creature uses arbitration by central selection, which allows it to perform one behavior at a time (See section 2.3).

Figure 6.4.1 shows how the life space of an animal changes as a result of change in drive level for the network with and without competition. In the simulations, the approach tendency toward a goal of type m was multiplied by the corresponding motivation, Mm. As can be seen, the borders between the basins of attraction are moved away from the goal with the highest corresponding drive level, as could be expected. The animal will consequently be more likely to choose that goal instead of the other. When competition between the two motivational states is added, only goals relating to the dominant motivation are perceived at all.

While this mechanism makes the creature attend to its most important need first, it does not generate optimal behavior. In chapter 7, we will investigate how an attentional mechanism can be added which lets the creature weigh together its needs with the psychological distance to the various goals. With an attentional mechanism, the creature can select the best overall goal without sacrificing its ability to perform one behavior at a time.
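The difference between the two networks in figure 6.3.1 can be sketched as follows. This is a minimal illustration, assuming that each motivation simply equals its drive and that competition is a strict winner-take-all; the function names and values are hypothetical and the sketch does not reproduce the simulations reported here.

```python
def compete(motivation_levels):
    """Winner-take-all: keep the largest motivation and silence the others."""
    winner = max(range(len(motivation_levels)),
                 key=lambda i: motivation_levels[i])
    return [m if i == winner else 0.0
            for i, m in enumerate(motivation_levels)]

def gated_outputs(motivation_levels, behavior_outputs):
    """Multiply each behavior module's output by its motivation level."""
    return [m * b for m, b in zip(motivation_levels, behavior_outputs)]

drives = [0.3, 0.7]            # assumed drive levels D1 and D2
behavior = [1.0, 0.5]          # assumed outputs of modules B1 and B2

# (a) Without competition both modules contribute to behavior.
without_competition = gated_outputs(drives, behavior)

# (b) With competition only the dominant motivation passes its output on.
with_competition = gated_outputs(compete(drives), behavior)
```

In case (b) the output of the first module is zero, so the creature performs one behavior at a time, at the price of ignoring the weaker need entirely.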

6.4 Incentive

Another of Hull's contributions to this area is the idea that motivation is determined by two factors. As described above, the first is the need state or drive. The second is incentive, that is, the presence of an external stimulus which predicts a future reduction of the need of the animal. For example, the presentation of food would constitute an incentive to a hungry animal. By the process of secondary conditioning, other stimuli which predict food can also become incentives as described in section 5.6.

Figure 6.4.1 The life space of a creature with different motivational conditions. (a) The drives corresponding to each of the goals are equal, that is, M0 = M1 (b) The motivation for the left goal is larger than the motivation for the right one, that is, M0 > M1. This makes the left basin of attraction larger. (c) Competition is added which makes the goal with the least motivation disappear entirely.

Schachter (1970) describes a number of interesting studies concerning the distinction between internal and external cues as determinants of behavior. It appears that different individuals are more or less likely to base their actions on internal or external cues respectively. A number of experiments show a striking difference between normal-weight and obese subjects. While the self-reports of hunger feelings in normal-weight subjects coincide closely with stomach contractions recorded by a gastric balloon, obese subjects show no such correlation. Instead, their eating habits seem to be almost exclusively controlled by external cues, or as formulated by Buck (1988), "[t]hey are, in effect, on a 'see-food' diet: if they see food, they eat it".

Although the theory is not without its problems, there exist a number of experiments which show that obese people, and rats for that matter, respond more readily to external stimuli than to internal needs. This is true, not only for food related cues, but for all types of external stimuli. If nothing else, these studies surely motivate the distinction between drive and incentive.

We may distinguish between two types of incentives. The first type can be called external incentive. This type of incentive is directly generated by an external object such as food. In section 1.3, we introduced yet another determinant of motivation which was called internal incentive. This third determinant should be distinguished from external incentive in that it is not directly derived from the external goal stimulus, but it must be generated by a more complex internal process whether this be the result of secondary conditioning, planning or even some innate structure.

Figure 6.4.2 Examples of the four types of incentives for an animal which has innate mechanisms for the recognition of fish and water and the relation between them, but must learn the valence of candy and where to get it.

External incentives are usually closely connected with approach and consummatory behavior since the goal object is present when an external incentive is generated. Secondary incentives, on the other hand, are related to instrumental appetence behavior since the goal object is not yet present.

It is also possible to distinguish between primary and secondary incentives. Primary incentives are available to the animal before any learning takes place, while secondary incentives are the result of learning. Typically, primary incentives are external and secondary incentives are internal. We may thus distinguish between the four types of incentives exemplified in figure 6.4.2.

Taking incentives into consideration, we can now extend our motivational system further. The strength of a motivation can now be calculated by the formula,

(Equation 6.4.1)

where the three inputs are the drive, Dm, the internal incentive, Im, and the external incentive, Em. These three values are all functions of the internal and external stimulus situation of the animal. To allow constant drives, we must also include a constant, dm, which is added to the function Dm. We also add a constant, cm, to the internal and external incentives, which allows the level of motivation to be non-zero even when both incentives are inactive. The calculation of the strength of a motivation is now given by,

(Equation 6.4.2)

The corresponding neural network is shown in figure 6.4.3.

This view of motivation is very similar to the idea of inter-behavior bidding discussed by Sahota (1994). All behaviors, or in this case, engagement modules, can be considered as bidding for the control of the creature at any time. The engagement which places the largest bid according to equation (6.4.2) will be allowed to control the creature. As pointed out by Minsky (1987), a bidding scheme of this type can easily give rise to oscillations in the choice of engagement if drive alone is the only factor influencing the choice. In the next section we will see how such oscillations can be prevented.
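The bidding view can be illustrated with a short Python sketch. The exact form of the motivation formula is an assumption here: we take the constant dm to be added to the drive, the constant cm to be added to the sum of the two incentives, and the two sums to be multiplied, as the surrounding text suggests. The function names and all values are hypothetical.

```python
def motivation_strength(drive, internal_incentive, external_incentive,
                        constant_drive=0.0, constant_incentive=0.0):
    """Assumed form: (d + D) * (c + I + E). Illustrative only."""
    return ((constant_drive + drive) *
            (constant_incentive + internal_incentive + external_incentive))

def winning_engagement(bids):
    """Return the name of the engagement that places the largest bid."""
    return max(bids, key=bids.get)

# The animal is slightly more thirsty than hungry, but food is in sight.
bids = {
    "eat": motivation_strength(drive=0.6, internal_incentive=0.0,
                               external_incentive=0.5),
    "drink": motivation_strength(drive=0.7, internal_incentive=0.0,
                                 external_incentive=0.1),
}
winner = winning_engagement(bids)
```

With these made-up numbers the eating engagement wins the bid although the thirst drive is higher, because the external incentive for food is stronger. A non-zero constant_incentive would let an engagement bid even when no incentive is present.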

Figure 6.4.3 A motivational system which receives incentive signals from the two behavior modules. These signals are composed of the three factors, internal incentive, I, external incentive, E, and a constant factor, c, and gate the signals from the drive receptors.

6.5 Choice and Persistence

There is one large problem with the architecture shown in figure 6.4.3 which is not immediately obvious. Since the creature will always choose the behavior with the largest level of motivation, oscillations can easily occur. Suppose, for instance, that the animal is both hungry and thirsty, but slightly more thirsty than hungry. Assume further that food and water are readily available, but at different locations which require that the animal moves from one place to another to eat or to drink.

Since the creature is more thirsty than hungry, its first behavior will be to approach the location of the water. It will then drink a little and its thirst drive will decrease slightly. This will cause the hunger motivation to win the competition and the creature will set off for the food. Once it has eaten a bite, the thirst will be stronger than the hunger, and it will set off for the water again, and so on. This is obviously not a very clever strategy since most of the time will be spent on the way between the food and the water.
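The oscillation can be reproduced with a toy simulation. The sketch below assumes that each bite or sip reduces the corresponding drive by a fixed amount and that the creature always follows the larger drive; travel time, incentives and the growth of drives over time are deliberately left out, and the function name and parameters are our own.

```python
def count_switches(hunger, thirst, bite=0.1, steps=10):
    """Count how often a purely drive-based choice changes engagement."""
    current = None
    switches = 0
    for _ in range(steps):
        # Always pick the engagement with the larger drive.
        choice = "drink" if thirst > hunger else "eat"
        if current is not None and choice != current:
            switches += 1
        current = choice
        # Consuming a little reduces the chosen drive slightly.
        if choice == "drink":
            thirst -= bite
        else:
            hunger -= bite
    return switches

# Slightly more thirsty than hungry: the choice alternates on every step.
switches = count_switches(hunger=1.0, thirst=1.05)
```

With these values the creature changes engagement after every single bite or sip, which in a world where food and water are far apart would mean that nearly all its time is spent travelling.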

There are two solutions to this problem. The first is the incentive mechanism discussed above (See also McFarland and Bösser 1993). This will make the creature more hungry the closer it comes to the food, which will make it stay at the food longer than without the incentive signal. The incentive mechanism, thus, sets up a positive feedback-loop. This is not enough, however, and in many cases an additional mechanism is needed.

In the second solution, two separate opposing systems are introduced for each drive. For example, to hunger, we add a system for satiety. When the animal becomes hungry, the hunger motivation is held high until it becomes inhibited by the satiety system. This will let the creature eat until it gets satiated and not only until it is no longer hungry. By including this type of hysteresis, the eating behavior will be more persistent.

Figure 6.5.1 The effect of having two opposing systems for the hunger motivation. (a) Only one system is used which increases the level of hunger when the need rises above a set point. This will result in oscillations which cause short bursts of eating behavior. (b) The effect of using two opposing systems instead. A hunger system activates the eating behavior and a satiety system inhibits it again when the need has decreased sufficiently.

In figure 6.5.2, a simple neural circuit is shown which can generate a persistent hunger motivation like the one shown in the bottom graph of figure 6.5.1. A single sensor is used for both hunger and satiety. In general, different sensory systems could be used for the two opposing systems. This is shown in figure 6.5.3. Note that in both networks, the original drive signal is added to the persistent drive level to let the choice in the motivational system reflect the actual level of the drive to some degree. In the hunger system of real animals, many systems of this kind interact with each other to produce the feelings of hunger or satiety. It has been suggested that centers in the hypothalamus play a role similar to these systems (Teitelbaum 1961), although this view is now recognized as too simplistic (Grossman 1979, Collier 1983).

Figure 6.5.2 A network which generates persistent hunger until satiety occurs. The middle node, dp, becomes active when the drive level rises above a certain hunger level and represents the persistent component of the drive. This node keeps itself active using a recurrent connection until it is shut off by the satiety system, d-. The satiety node is inhibited until the drive level decreases below the satiety level. The persistent hunger is added to the original drive signal at d+ to produce a hunger signal which is persistent but still reflects the absolute drive level.

Figure 6.5.3 The general architecture of a network for calculation of persistent motivations. The drive sensor D+ increases motivation and becomes persistent when it increases sufficiently as in figure 6.5.2. The sensor D- activates an inhibitory system which shuts off the persistent motivation.
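The hysteresis produced by the two opposing systems can be sketched as a simple state machine. This is not a model of the network in figure 6.5.2 itself, only of its input-output behavior as described in the caption; the thresholds, the size of the persistent component and the function name are assumptions.

```python
def persistent_hunger(drive_levels, hunger_level=0.6, satiety_level=0.2,
                      persistent_component=1.0):
    """Return the hunger signal for a sequence of drive levels."""
    active = False                    # state of the persistent node
    signals = []
    for drive in drive_levels:
        if drive > hunger_level:
            active = True             # hunger threshold passed: latch on
        elif drive < satiety_level:
            active = False            # satiety reached: shut off
        # The raw drive is added so that the signal still reflects need.
        signals.append(drive + (persistent_component if active else 0.0))
    return signals

# The drive rises above the hunger level and then falls while the animal eats.
trace = [0.5, 0.7, 0.5, 0.3, 0.1, 0.3]
signal = persistent_hunger(trace)
```

In this trace the hunger signal stays high at drive levels 0.5 and 0.3 on the way down, although the same levels do not trigger hunger before the threshold has been passed. The signal only drops once the drive falls below the satiety level.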

6.6 Optimal Choice

How should a motivational system be constructed to let the creature make the optimal choice of behavior with respect to psychological distances and expected rewards? The answer to this question depends on what the overall goal of the creature is. We will consider two possible answers here. The first is that the creature should try to maximize the reward it receives and the second is that it should try to minimize the risk of starvation and death.

If we take the first position, the motivational system needs to calculate the expected reward for each potential action which can be performed in a certain situation. In chapter 5, we saw how instrumental conditioning could be used to estimate the potential reward in each situation with respect to a specific goal. When many goals are present, this mechanism will instead calculate the potential reward with respect to the closest goal of a certain type.

For example, one reinforcement module may deal with one specific hunger, while another may deal with another hunger or, for example, thirst. The choice to be made in the motivational system concerns what type of behavior the creature should engage in. For example, should it activate its engagement module for eating or drinking? Once the choice has been made, the individual engagement systems take care of the actual execution of the behavior. This suggests that internal and external incentive should be calculated by the engagement modules and then sent to the motivational system where they should influence the choice of behavior.

Let us first assume that the drive levels for two engagements, dA and dB, are equal. This implies that the creature should select the engagement which will reach its corresponding goal with the least effort. That is, the creature should select the engagement for which the discounted reward is highest. Recall from sections 4.2 and 5.8 that the value of a reward, RA, received at a goal, gA, is computed by multiplying it with the discounted cost, c(z,gA), of moving from the present location, z, to the goal, gA. If gA and gB are the goals corresponding to the two motivations, the creature should select engagement A if,

$$ R_{A} \, c(z, g_{A}) > R_{B} \, c(z, g_{B}) $$ (Equation 6.6.1)

and engagement B otherwise.

This implies that when learning is included in a creature, incentive is equivalent to primary or secondary reward. Primary incentive can be generated from the primary reward and secondary incentive from the secondary reward. Note however, that incentive may be present even without learning which means that incentive and reward are not identical concepts. Given this similarity, we can define goal situations as primary motivators and stimuli which generate secondary incentive as secondary motivators.

How can we extend this choice to situations where the drive levels are not equal? We could see above, in equation 6.4.2, that the incentive signals should be multiplied with the drive signals. The choice situation above will then be transformed into the following. Select engagement A if,

$$ d_{A} \, R_{A} \, c(z, g_{A}) > d_{B} \, R_{B} \, c(z, g_{B}) $$ (Equation 6.6.2)

and B otherwise. We see that the motivational system described above calculates the optimal choice in this way.
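The choice rule can be sketched in Python as follows. The text does not specify the discounted cost here, so the sketch assumes a simple exponential discount of the form discount ** distance; this form, the function names and the numerical values are illustrative assumptions only.

```python
def discounted_value(reward, distance, discount=0.9):
    """Reward multiplied by an assumed exponential discount for distance."""
    return reward * discount ** distance

def choose_engagement(options, discount=0.9):
    """options maps a name to (drive, reward, distance to the goal)."""
    def score(name):
        drive, reward, distance = options[name]
        return drive * discounted_value(reward, distance, discount)
    return max(options, key=score)

# Equal drives: the nearer goal wins.
equal_drives = {"A": (1.0, 1.0, 2), "B": (1.0, 1.0, 5)}
choice_equal = choose_engagement(equal_drives)

# A sufficiently larger drive for B outweighs the longer distance.
unequal_drives = {"A": (1.0, 1.0, 2), "B": (2.0, 1.0, 5)}
choice_unequal = choose_engagement(unequal_drives)
```

The second case shows how multiplying by the drive level lets an urgent need override the preference for a nearby goal.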

There is one question that we have not considered yet, however. How should the drive level reflect the needs of the creature to make its choices optimal? This question cannot be answered generally but must depend on how the different engagements of the creature relate to each other. One principle that can be formulated, however, is that an animal should minimize its loss rather than maximize its gain. These two strategies may appear identical, but there are many cases where they are different. This is especially true when the outcome of a behavior is uncertain. To survive, an animal should avoid making choices which could result in starvation even if the risk is very small. No matter how much food it can receive by performing a potentially lethal behavior, this gain should never outweigh the risk, if some other possibility exists which will keep the animal alive with almost absolute certainty.

On the other hand, an animal should always choose to eat if possible when it is on the edge of starvation. This implies that drive should not necessarily be a linear function of need. Figure 6.6.1 shows how a hypothetical hunger drive may increase drastically when the need of the animal passes above a certain point.

Figure 6.6.1 Drive as a function of need. The drive signal increases drastically when the need reaches dangerous levels to make certain the animal chooses the appropriate engagement.
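One way to obtain a drive that rises sharply near dangerous need levels is a piecewise linear function. The sketch below is only one possible shape and is not claimed to be the curve in figure 6.6.1; the danger level, the gain and the function name are assumptions.

```python
def hunger_drive(need, danger_level=0.8, emergency_gain=10.0):
    """Drive equals need below the danger level and rises steeply above it."""
    if need <= danger_level:
        return need
    # Above the danger level an extra, much steeper term is added.
    return need + emergency_gain * (need - danger_level)

moderate = hunger_drive(0.5)     # ordinary hunger: drive follows need
critical = hunger_drive(0.9)     # near starvation: drive is strongly amplified
```

With a drive of this shape, the hunger engagement will win against almost any competing engagement once the need becomes critical, while at moderate needs it competes on ordinary terms.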

Principles like this may be the mechanism behind the apparently irrational choices made by human subjects in many experiments. Subjects will prefer to play a game where the gain is smaller, but certain, rather than a game where the gain is higher but at the risk of not winning anything at all (Pitz and Sachs 1984). If the prize was food and the subject was a hungry animal on the edge of starvation, the choice would in fact be rational. However, since the modelling of these types of choice mechanisms requires a representation of certainty, we will not consider them further here.

6.7 Emotion

The description of motivation above raises the question of where emotions fit in. A definition of emotions which fits nicely with the view of motivation presented here is that emotions are "states elicited by reinforcing stimuli" (Rolls 1986, Gray 1982). This definition results in a view of emotions where they can be categorized along four dimensions, arranged as in figure 6.7.1.

We have already encountered these dimensions in chapter 5. We saw that they result from the comparison of actual rewards or punishment with expected ones. This implies that expectations are necessary for emotions along the rage-relief dimension. Without expecting a rewarding situation, how could one get disappointed? The same is true about relief. If we did not expect a punishing event to take place, we would not feel relief when it is omitted.

Figure 6.7.1 The dimensions of emotion. The presentation, omission or termination of reinforcing stimuli form the basis for a number of emotional states (Adapted from Rolls 1986. See also section 5.2 and figure 5.2.1).

As pointed out by Rolls (1986), the type of emotion is also dependent on what behaviors can be performed at a given time. If active behavior is possible when a rewarding stimulus is omitted, the omission may result in anger. If only passive behavior is possible, sadness or depression may result. This aspect of emotions can be compared with the role of the applicability predicate in selecting between preparatory and consummatory behavior (See section 5.3).

For the pleasure-fear dimension, two mechanisms are possible. In the first case, the emotional level directly depends on the intensity of the reinforcing stimuli. In the other case, the emotion also depends on expectations. In this case, the emotion is generated by the reinforcement which results when the reinforcing stimulus is presented. No reinforcement is generated when the reward or punishment is already expected. This implies that emotional states will be weaker the more expected the reward or punishment is. According to this view, a reward which is entirely expected will not result in any pleasure at all.

When emotions are identified with reinforcing signals rather than stimulus intensity, an animal needs to increase the reward it receives to evoke an emotional state of constant intensity. Constant ecstasy will, thus, require a constantly increasing level of the rewarding stimulus, a fact well known to drug addicts, for whom this property takes on hazardous proportions.
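The consequence of identifying emotion with the reinforcement signal can be made concrete with a minimal sketch. It assumes that the expectation is updated with a simple delta rule and that the reinforcement is the difference between the received and the expected reward. The function name and the learning rate are assumptions; the reinforcement module of chapter 5 may compute its signals differently.

```python
def reinforcement_signals(rewards, learning_rate=0.5):
    """Return the reinforcement (reward minus expectation) for each trial.

    The delta-rule update of the expectation is an assumption made for
    this illustration.
    """
    expectation = 0.0
    signals = []
    for reward in rewards:
        signal = reward - expectation
        signals.append(signal)
        # The expectation moves toward the reward that was actually received.
        expectation += learning_rate * signal
    return signals


# A constant reward becomes expected, so the signal fades away.
print(reinforcement_signals([1.0, 1.0, 1.0, 1.0]))
# Only a steadily growing reward keeps the signal constant.
print(reinforcement_signals([1.0, 1.5, 2.0, 2.5]))
```

In the first run the signal halves on every trial, and in the second it stays at the same level only because the reward keeps growing.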

Figure 6.7.2 Emotion as reinforcement. The different states of the reinforcement module can be interpreted as emotions. The primary reinforcers induce fear and pleasure (or expectancy). The omission or termination of fear will induce a state of relief. Omission or termination of an appetitive state induces frustration.

In less dramatic contexts, a creature that strives for pleasurable experiences instead of a low level of motivation will constantly need more reinforcing stimulation. Since this is not always desirable, the creature needs some mechanism which will prevent events from becoming reinforcing when they are dangerous. For example, food should not be reinforcing to a satiated animal. Indeed, too much eating does result in pain. For other motivations, such as exploration, no such system is necessary.

An important property of emotions is that they are motivating in two different senses. In the first sense, they are motivating in the future, since they increase the likelihood of a certain behavior at a later time. For example, a creature that has been rewarded for performing a certain behavior will generate a larger incentive signal the next time it finds itself in the same context. This in turn will motivate the behavior.

However, emotions also appear to be motivating directly, since a reinforcing stimulus often causes a specific behavior to be performed. For instance, the presentation of food will motivate an eating behavior. This consequence need not be generated from the reinforcement signal which constitutes the emotion, however. It may merely coincide with it. It is quite possible for this effect of the reinforcing stimulus to be present although no emotional state is generated.

Figure 6.7.2 shows how the different emotions relate to the reinforcement module introduced in chapter 5. This module receives three types of inputs and calculates six types of emotional signals. The two primary inputs come from perceptual systems which identify fear and expectancy respectively (Compare Panksepp 1986). Secondary input is generated by the contextual input which tells the reinforcement module about prior rewards and punishments. There are three outputs on each side of the reinforcement module: one for primary external incentive, one for secondary internal incentive, and one for the omission or termination of the reinforcing stimulus for the opposing system.

When a punishing situation occurs unexpectedly, it will generate incentive signals for avoidance behavior and start the learning process in the reinforcement module. This state is called 'fear' (Rolls 1990). Stimuli that predict the punishing stimulus will generate secondary avoidance incentive, that is, passive avoidance or behavioral inhibition. When such a stimulus is presented, the resulting state is called 'anxiety' (Gray 1982). When the expected punishment is omitted, the resulting state is called 'relief'.

Figure 6.7.3 shows how the signals from the reinforcement module influence the motivational system. The various outputs are used as incentives or drives for different motivations. On the appetitive side, the internal and external incentives are multiplied with the drive signal that indicates the current need of the creature. The product of these two signals is used to activate motivation m1. When this motivation wins the competition, the creature will approach and consume the reinforcing stimulus. On the aversive side, the primary incentive facilitates motivation m3, which is assumed to have a subthreshold resting activity that corresponds to a constant drive. If the incentive is high enough, the creature will activate an active avoidance behavior.
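The activation of m1 and m3 described above could be sketched as follows. The sketch assumes that the internal and external incentives are summed before they are multiplied with the drive, that the aversive incentive is added to the resting activity of m3, and that both motivations must pass the same threshold to become active. These choices, like the function name and the numerical values, are assumptions made for illustration and are not specified in the text.

```python
def motivational_state(internal_incentive, external_incentive, drive,
                       aversive_incentive, resting=0.3, threshold=0.5):
    """Return the name of the winning motivation, or None if none is active."""
    activations = {
        # Appetitive side: incentive multiplied with drive.
        "m1": (internal_incentive + external_incentive) * drive,
        # Aversive side: subthreshold resting activity plus primary incentive.
        "m3": resting + aversive_incentive,
    }
    active = {name: a for name, a in activations.items() if a > threshold}
    if not active:
        return None
    # The most strongly activated motivation wins the competition.
    return max(active, key=active.get)


print(motivational_state(0.5, 0.5, 0.8, 0.0))  # hungry, food present, no threat
print(motivational_state(0.5, 0.5, 0.8, 0.9))  # same situation with a strong threat
print(motivational_state(0.5, 0.5, 0.0, 0.0))  # satiated creature, no threat
```

Because of the multiplication, an incentive has no effect on m1 when the drive is zero, while the resting activity lets m3 respond to a sufficiently strong incentive without any variable drive signal.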

There are also two motivational states corresponding to the two emotional states of relief and frustration. Both these emotions act as drives for particular motivations. Motivation m2, which is associated with relief, makes the creature act in whatever way is appropriate when it feels relieved (perhaps it smiles). When a stimulus appears that predicts punishment, motivation m4 is activated, which generates passive avoidance. When the creature becomes frustrated, motivation m5 is activated, which may cause the creature to attack the cause of the frustration or, in other cases, an innocent bystander. Since there is no equivalent of behavioral inhibition on the expectancy side of the reinforcement module, the effects of fear and expectancy are not symmetrical.

It is very common for the immediate motivation that accompanies emotions to generate some form of communicative signals, but since we will not deal with social interaction in this book, we will not consider them further here.

Figure 6.7.3 Motivational states influenced by the reinforcement module in figure 6.7.2. Presentation, termination and omission of stimuli activate motivational states for aggression and expressions of emotion. These motivations compete with the more basic ones for appetitive and aversive behaviors.

The property of motivational states to cause frustration or relief when they are changed has been called the motivational rebound effect. This effect rests on the fact that many motivational states seem to come in pairs: tiredness and wakefulness, hunger and satiety, etc. When the need corresponding to the motivational state is satisfied, the opposite motivational state is activated. The phenomenon has been much studied, for example, in relation to drug abuse (Solomon 1980). The introduction of the drug is followed by a state of happiness, but when the drug is removed the opposite feeling arises. The same is true of fear and relief. After being exposed to a situation of extreme fear, the termination of fear does not result in a neutral state but rather in euphoria.

There is, however, more to motivation than can be included in the framework presented above. In agreement with self-attribution theory (Schachter 1964), what a creature may experience as emotions is the result of its internal categorization of the motivational system and its effect on behavior. The motivational system, as such, does not produce emotional experiences. The only function of the motivational system is to direct behavior now and in the future. It may very well produce the external attributes of emotions such as a smile or tears, but the experience of emotion is the result of an introspective process. This places the experience of emotion clearly within the area of cognition.

This view also implies that emotions must be learned from experience. How hunger affects behavior in the direction of food is not initially known to the animal. It is learned from its own behavior and perhaps later learned to be ignored or suppressed. If emotions are viewed in this way, it is not too remote to place love and hate side by side with hunger and tiredness. They must all be experienced and learned before they can become emotions. They differ only in complexity, not in nature.

To conclude, it is possible to consider motivations as states which tell the organism what it should do at a certain time based on its internal needs and external possibilities. Emotion, on the other hand, is concerned with what the animal should have done. When the reward or stimulus situation of the animal is unexpected, an emotional state is activated. This state has two functions. The first is to control learning, which lets the animal cope with the preceding situation in a better way the next time it occurs. The second function is to motivate specific emotional behaviors.

6.8 The Roots of Motivation

In this section, we will present a number of evolutionary steps that may have led to the type of motivational system described above (See also Balkenius 1993). To do this, we have to start with some design principles that seem reasonable from an evolutionary point of view. The first pair of principles is reduplication and variation. Reduplication is a process which makes copies of an already existing structure and variation is the process whereby existing structures can be changed. We want to propose that it is possible to find three architectural principles which superimpose on each other during evolution. Based on their underlying architectural principle, we can define three classes of systems: (1) the subsumption architecture, (2) the centralized control architecture, and (3) the layered architecture.

These architectural principles do not directly describe the motivational system, but the organism as a whole. Each class is necessary for the next to evolve and each subsequent class of systems has a competitive advantage compared to systems in the previous class. This ensures the adaptive value of architectural changes which move a species towards a higher class. The rest of this section describes the different classes in detail and identifies the necessary steps from one class to the next.

The Subsumption Architecture

The simplest system will possess a subsumption architecture (Brooks 1986). Such an architecture consists of a number of distributed systems which control different behaviors. They can, but need not, interact with each other. A subsumption architecture typically consists of a number of layers which can subsume each other. For example, a lower layer can produce walking behavior while a layer on top tells it to walk forward, turn to the left or to the right. Within AI, this approach to the construction of intelligent robots has gained increasing popularity during the last few years (Beer 1990).

An architecture of this kind has three main features. First, it is non-representational. There do not exist any representations of the external world. Instead, the world is used as its own best model (Brooks 1991a). Second, the creature is reactive. It reacts directly to stimuli in the world. For example, a creature can avoid an obstacle simply by changing the motor patterns of its legs when its whisker makes contact with the obstacle. Third, behavior is distributed. Different behavioral modules operate independently of each other to a large extent.
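The idea of one layer subsuming another can be sketched in a few lines. The layer names, the whisker inputs and the command strings are hypothetical and only serve to illustrate the principle; they are not taken from any of the cited robots.

```python
def walking_layer():
    """Lower layer: produces the default walking behavior."""
    return "walk forward"


def avoidance_layer(left_whisker, right_whisker):
    """Upper layer: reacts directly to whisker contact, otherwise stays silent."""
    if left_whisker:
        return "turn right"
    if right_whisker:
        return "turn left"
    return None


def subsumption_step(left_whisker, right_whisker):
    """The upper layer subsumes the lower one whenever it produces a command."""
    command = avoidance_layer(left_whisker, right_whisker)
    return command if command is not None else walking_layer()


print(subsumption_step(False, False))
print(subsumption_step(True, False))
```

No state is stored between calls, which reflects the reactive and non-representational character of the architecture.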

It is now possible to give examples of some simple creatures which use a subsumption architecture. We will try to keep the presentation as simple as possible to emphasize the essential features of each architecture. Of course, real creatures are much more complex.

Our simplest system will simply consist of sensors and effectors. The sensors are connected directly to the effectors and either activate or inhibit them. Typical examples of such creatures are the vehicles invented by Braitenberg (1984). The simple control system of the vehicle is sufficient to produce taxic behavior. In protozoa, the entire control system can reside in a single cell. In larger animals, however, a simple nervous system is required to send the signal from the sensors to the effectors.
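A vehicle of this kind can be sketched with two sensors wired directly to two motors. The wiring options follow the general idea of Braitenberg's vehicles, but the function name, the base speed and the linear form of the connections are assumptions made for this illustration.

```python
def vehicle_step(left_sensor, right_sensor, crossed=True, excitatory=True,
                 base_speed=1.0):
    """Return (left_motor, right_motor) speeds for a two-sensor vehicle.

    Each sensor either activates or inhibits exactly one motor.
    """
    sign = 1.0 if excitatory else -1.0
    if crossed:
        left_input, right_input = right_sensor, left_sensor
    else:
        left_input, right_input = left_sensor, right_sensor
    left_motor = max(0.0, base_speed + sign * left_input)
    right_motor = max(0.0, base_speed + sign * right_input)
    return left_motor, right_motor


# Stimulus on the left side: the left sensor is more strongly activated.
print(vehicle_step(0.8, 0.2))                 # right motor faster: turns toward it
print(vehicle_step(0.8, 0.2, crossed=False))  # left motor faster: turns away
```

Changing only the wiring switches the vehicle between approaching and avoiding the stimulus, which is the sense in which such a simple control system produces taxic behavior.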

In a larger creature, the need arises for a connecting inter-neuron between the sensor and effector. This step does not change the behavioral abilities of our creature but is a requirement for its greater physical size. It is also essential to the future development of the species, as it is the origin of the nervous system. The next two classes can emerge in any order during evolution, but the two steps are both necessary for the evolution of the motivational system.

The introduction of an inter-neuron makes it possible for the sensory signals to interact before they reach the effectors. For example, two sensory signals can converge on a single inter-neuron which will then detect the conjunction of two sensory events. It is also possible for one inter-neuron to inhibit or activate another inter-neuron to produce complex behavior. In these systems, the inter-neuron integrates information from several sources and transforms it in some way instead of just propagating it from one sensor to one effector. Many of the systems described in section 3.4 are of this kind.

A parallel development is the introduction of sensors which can be of two kinds: external and internal. This is the result of a bodily change that places some sensors inside the organism. An external sensor reacts to some external event or state while an internal sensor reacts to the internal state of the organism. For example, the internal sensor of the feeding system could react to hunger and the external sensor could react to food being present. If both sensors were signalling, an eating behavior would be triggered. Computer simulations have shown that such nodes can spontaneously emerge during evolution (Cecconi and Parisi 1993). This is the origin of drives and incentives as defined above.
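The conjunction of an internal and an external sensor on a single inter-neuron can be sketched as a simple threshold unit. The weights, the threshold and the function names are assumptions chosen so that the unit fires only when both sensors signal.

```python
def interneuron(inputs, weights, threshold):
    """Threshold unit: fires (returns True) when the weighted input sum is high enough."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return total >= threshold


def eating_triggered(hunger, food_present):
    """Eating requires both the internal (hunger) and the external (food) signal."""
    return interneuron([hunger, food_present], weights=[1.0, 1.0], threshold=1.5)


print(eating_triggered(1.0, 1.0))  # hungry and food present
print(eating_triggered(1.0, 0.0))  # hungry but no food
print(eating_triggered(0.0, 1.0))  # food present but not hungry
```

In the terminology used above, the hunger input plays the role of the drive and the food input the role of the incentive.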

As in the simpler systems, a number of controlling systems can, in principle, exist in parallel. Each system is governed by its own sensory signals which control its behavior regardless of the other systems. The robots presented by Brooks (1991) are essentially of this complexity. They can produce very complex behavior without any central control mechanism. The central argument of this section is that such an architecture cannot, however, be extended to include cognitive processes without including a central control mechanism.

The Centralized Control Architecture

The creatures described above were characterized by the lack of central control. This is changed in the next class of creatures which exhibit the simplest possible architecture capable of centralized decisions. This is a necessary step toward a higher cognitive system and the origin of the motivational system. A similar position is held by Sjölander (1992) who discusses the need for centralized representations of objects.

When the organism develops more complex behaviors, they will inevitably disturb each other. When this happens, it is essential that the different systems can inhibit each other in order to produce one behavior at a time. In a neural context, this can be achieved by lateral inhibition (See section 3.3). There exist three different forms of lateral inhibition which could produce the desired result: feed-forward inhibition, feedback inhibition and recurrent inhibition (Grossberg 1973). Feed-forward inhibition is primarily used in other contexts to compensate for different overall activity levels (for example in the visual system in the LGN). Feedback inhibition has been used in at least one model of animal behavior which was based on the theories of Lorenz and Tinbergen (See Schnepf 1991).

In the following, we will assume that our creature is using recurrent inhibition to select one behavior instead of the other as described above. The properties of recurrent inhibition are essentially the same as for feedback inhibition, but they are simpler to analyze. In a recurrent network where the lateral inhibitory connections are of the same strength, the behavior that receives the most activation will be selected. One layer collects the activation of each behavioral system and the next layer is responsible for the selection of the behavior with the strongest activation.
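Selection by recurrent inhibition can be illustrated with a small simulation. The sketch below assumes a simple rectified, leaky unit for each behavior, equal inhibitory weights between all units, and Euler integration with a fixed step. These equations and all parameter values are assumptions made for illustration and are not the specific network analyzed by Grossberg (1973).

```python
def recurrent_competition(inputs, inhibition=2.0, dt=0.1, steps=300):
    """Simulate units that inhibit each other with equal strength.

    `inputs` plays the role of the first layer (the collected activation of
    each behavioral system); the returned list is the state of the
    selection layer after `steps` updates.
    """
    activity = [0.0] * len(inputs)
    for _ in range(steps):
        total = sum(activity)
        activity = [
            # Leak, external input, and inhibition from all other units.
            max(0.0, a + dt * (-a + i - inhibition * (total - a)))
            for a, i in zip(activity, inputs)
        ]
    return activity


def selected_behavior(inputs):
    """Index of the behavior that remains active after the competition."""
    activity = recurrent_competition(inputs)
    return max(range(len(activity)), key=activity.__getitem__)


print(recurrent_competition([0.5, 0.9, 0.7]))
print(selected_behavior([0.5, 0.9, 0.7]))
```

With these settings only the unit with the strongest input remains active, while the activity of the other units is driven to zero.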

The Layered Architecture

The next evolutionary step is to include several levels of control. This is not to say that a layered architecture is not possible on the lower levels. Indeed, the subsumption architecture is typically constructed in a number of layers. However, the nature of those layers is very different from the layers discussed here.

At this final stage, the system can develop in a new way. While the basic subsumption architecture, together with the lateral inhibition, keeps the creature alive, evolution can experiment on the next layer of our architecture. One of the most powerful mechanisms invented by evolution is that of learning. A learning ability can evolve on top of the previous system once motivation has been included, since learning has an obvious relation to motivation as we have seen above. Learning needs the motivational state to determine when a certain behavior is good. Without it, the creature could only learn that a behavior is good or bad, but could not relate it to any of its needs.

The learning architecture makes it useful for the animal to develop a more advanced perceptual system. It can supply the learning system with more advanced perceptual cues. Mechanisms for categorization and classification will now give the animal an adaptive advantage. From these capabilities evolve the mechanisms that are usually considered to constitute cognition.

6.9 Conclusion

Words such as drive, incentive, motivation and emotion have had many different meanings within different theories. Even today there is no consensus as to what, for example, an emotion is (LeDoux 1995). What is called drive in one theory may be called emotion or motivation in another. We have tried to use these words in a way that is, at least, consistent with each other while retaining as much as possible of their ordinary meaning. Since our main concern is artificial creatures rather than real animals, one cannot help feeling that these concepts are more complicated in real animals. Clearly, we are nowhere close to the complexity of human emotions. We believe, however, that in taking a design perspective on motivation and emotion, we have shown that these concepts can be used to great utility in the design of artificial creatures.

We have seen that a creature needs a central motivational state which is responsible for the selection of one motivation at a time. This requirement is closely connected to the idea of separate modules for different engagements which is essentially a strategy for task decomposition (Tenenberg, Karlsson and Whitehead 1993).

The choice of motivational state is influenced by three factors called internal and external incentive and drive. The incentives tell the creature about the current possibilities of fulfilling a need, while the drive signal informs it about the urgency of that need. Incentives can also be classified as primary, that is, innate, or as secondary, that is, acquired. Primary incentives were mapped onto the concept of primary motivators, and secondary incentives were considered as secondary motivators. This implies a close coupling between perception and motivation as will be further developed in the next chapter.

The view of drives that we have presented assumes that there is one or several drives for each engagement of an animal. This is thus an attempt to resurrect one of the original meanings of the word drive where it was more or less analogous to the concept of an instinct. While it has been argued that the explanatory value of this drive concept is nil (See Bolles 1967), we hope to have shown that it is quite useful from a design perspective. This usefulness comes from the separation of the drive concept from the engagement it controls.

The idea of pure emotions relating to a small set of biological processes is very similar to the idea of engagement systems described above (See Plutchik 1991). The problem for any such theory of emotion (or motivation) is, of course, to identify exactly which these basic processes are (Ortony, Clore and Collins 1988). We have made no attempt to produce a comprehensive list of engagements, since we do not believe that any such general list exists. The set of engagement systems required by an artificial creature depends critically on what it is supposed to do, and so will its emotions.

It was shown how emotions can be seen as states caused by reinforcing stimuli. Depending on the nature of the stimulus, emotions could be categorized into four basic dimensions: pleasure, fear, frustration and relief. It was also shown how incentives could be generated by the reinforcement module.

As mentioned at the end of chapter 5, the perfect symmetry between the different emotions is a simplification. Since many different engagement systems are involved, it would be more accurate to depict emotions in a three-dimensional space where the third dimension would represent the engagement. Some engagements may not include both the positive and negative side of the diagram shown in section 6.7. For example, an engagement system for aggression may not include the pleasurable side. Such an engagement system may, however, interact with other systems (Compare Konorski 1967).

It has been suggested that the amygdala controls the emotional processes which form associations with primary reward and punishment, while the orbitofrontal cortex may be involved in the detection of mismatch between actual and expected reward or punishment which results in frustration or relief (Rolls 1986, 1995). The amygdala receives inputs from a large variety of sources in the overlying temporal lobe cortex and other areas. Some outputs from the amygdala, as well as from orbitofrontal cortex, are directed towards the hypothalamus which is known to mediate motivational and emotional responses (Rolls 1986).

To summarize, the basic function of motivations is to tell the creature what it should do, while the role of emotions is to tell the creature what it should have done.



This text is an excerpt from:
Natural Intelligence in Artificial Creatures
© 1995 by Christian Balkenius
Lund University Cognitive Studies 37
ISBN 91-628-1599-7
ISSN 1101-8453
ISRN LUHFDA/HFKO--1004--SE