3.1 Introduction

As described in the preceding chapter, there are many ideas from animal learning theory and artificial intelligence that can be used as a starting point for the construction of artificial creatures. In this chapter, we will introduce two areas of research that will play an important role in the rest of the book.

First we will take a look at a number of design principles that have recently been proposed in the field of behavior-based robotics. Systems designed according to these ideas go by the names of situated agents (Agre and Chapman 1987), embedded systems (Kaebling 1993), autonomous agents (Maes 1990), and reactive control (Arkin 1990). The principles introduced here will guide the construction of our artificial creatures. We will also introduce some terminology that will be used in later chapters.

Next, the basic principles behind neural networks will be described with the emphasis on the types of systems that will be used in our artificial creatures. We will see how model neurons can be connected in different ways to perform various functions which will be used later on in the book.

The chapter concludes with a simple example of an artificial creature. In this example, we show how the proposed design principles constrain the construction of an artificial nervous system. The artificial world that will be used throughout the book is also introduced together with the sensors and effectors of the creature.

3.2 Behavior-Based Control

A behavior module is a subsystem that is responsible for one specific coupling between sensors and actuators (figure 3.2.1). A behavior module can, thus, be thought of as a transfer function which transforms its sensory input to an actuator output. In most of the literature on behavior-based control, a behavior module is referred to simply as a behavior, but since this terminology is rather confusing as well as in conflict with its use in biology and psychology, we will use the longer term instead.

Figure 3.2.1 A behavior module is defined as a transformation from sensory input to actuator output.

The use of behavior modules as the basic building block contrasts sharply with the view of traditional AI where control is typically based on a set of goals, a model of the world and a search procedure (Brooks 1991a). The search procedure tries to find an action sequence that changes the state of the world to the desired goal state. If such an action sequence is found, it will be executed step by step in the real world.

In a behavior-based agent, the goal need not be explicitly represented. Instead, behavior modules are selected on an immediate sensory basis in such a way that they are likely to move the agent closer to the goal in the real world. Problems are avoided when they occur. As we will see in later chapters, a behavior-based agent can be augmented with explicit goal representations and planning, but such abilities are not part of its primary repertoire.

The main argument in favor of behavior-based control is that it does not rely on an accurate internal world model. This avoids problems that occur when the internal world model does not agree with the real world. Instead the world is used as its own model in the sense that all actions performed are triggered by the real world as it presents itself at the sensors (Brooks 1990, 1991a, Agre and Chapman 1987). Since the real world is always in agreement with itself, the agent is more likely to react correctly. However, since no internal world model is used, a strictly behavior-based system will not be able to use explicitly represented goals or plans.

Another important aspect of behavior-based systems is that they are grounded. This means that they satisfy the physical grounding hypothesis, which states that to build an intelligent system it is necessary to have its representations grounded in the physical world (Brooks 1991a, Harnad 1990).

The identification between internal behavior modules and external behaviors is somewhat simplified but it expresses the spirit in which our creatures will be built. We will see that behavior in general is an emergent function of a system. That is, there is no strict correspondence between externally observed behavior and internal modules. In the field of behavior-based robotics, emergent functionality is usually thought of as a result of the interaction between the agent and the environment. In a more general setting, emergent properties are such that they cannot be predicted from the parts that constitute the system (Churchland 1984). The behavior of our creatures will be emergent in both senses of the word.

A single behavior module would not be of much use if it were to operate on its own. To construct a system with some level of sophistication, a number of behavior modules must be combined. There seem to be two distinct ways of doing this. The first is borrowed from the subsumption paradigm (Brooks 1986, 1991a, b) and the second has its inspiration in the idea of a central motivational state.

Subsumption

In the subsumption paradigm the control system of an agent consists of a number of behavior modules arranged in a hierarchy. The different layers in the architecture take care of different behaviors. The lower layers control the most basic behaviors of the creature while the higher behavior modules control more advanced functions (figure 3.2.2). This idea is not entirely unlike hierarchical motor patterns as they are used in ethology (Tinbergen 1951/1989, Tyrell 1993).

Figure 3.2.2 A hierarchy of behavior modules.

A typical low level behavior in a subsumption style creature controls activities such as object avoidance, wandering and exploration. On a higher level we may find processes like object identification and planning. Each higher layer is able to monitor and control the underlying layers, but the communication between modules is reduced to a minimum. While traditional systems can be said to be vertically decomposed into processing stages, a system of the present type is horizontally decomposed into behavior modules (Schnepf 1991).

An important type of subsumption hierarchy can be called an appetence system. It consists of at least two parts that control appetence and consummatory behavior. Most parallel engagements can be divided into these two components. For the eating behavior, the appetence behavior consists of searching for, or collecting food, while the consummatory, or terminal, behavior corresponds to the actual eating of the food (Lorenz 1977). The distinction between the two is that the first behavior is instrumental in achieving the second. The appetence and the consummatory behaviors can be organized in a subsumption style hierarchy (figure 3.2.3).

Figure 3.2.3 An appetence system consists of behavior modules for appetence and consummation.

The consummatory behavior, such as eating or sex, is generally very rigid while the appetence behavior can gain a lot from learning. Most of the learning processes that we will investigate in chapter 5 will operate on this part of the behavior.
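As an illustration, the appetence system can be sketched in a few lines of Python. This is only a minimal sketch under our own assumptions: the sensor key food_in_reach, the action names and the function subsumption_step are hypothetical and are not taken from the subsumption literature. Each behavior module is written as a function from sensory input to an actuator output, and a higher module overrides the modules below it whenever it produces an output.

```python
from typing import Callable, Dict, List, Optional

# A behavior module maps sensory input to an actuator output,
# or returns None when it has nothing to contribute.
Sensors = Dict[str, bool]
BehaviorModule = Callable[[Sensors], Optional[str]]


def appetence(sensors: Sensors) -> Optional[str]:
    """Lower layer: always searches for food (hypothetical action name)."""
    return "search_for_food"


def consummation(sensors: Sensors) -> Optional[str]:
    """Higher layer: eats, but only when food is within reach."""
    return "eat" if sensors.get("food_in_reach", False) else None


def subsumption_step(layers: List[BehaviorModule], sensors: Sensors) -> Optional[str]:
    """Return the output of the highest layer that produces one.

    `layers` is ordered from the lowest to the highest layer.
    """
    action = None
    for module in layers:
        proposal = module(sensors)
        if proposal is not None:
            action = proposal  # a higher layer overrides the layers below it
    return action


layers = [appetence, consummation]
print(subsumption_step(layers, {"food_in_reach": False}))
print(subsumption_step(layers, {"food_in_reach": True}))
```

In this simplified scheme the appetence module is active all the time, and the consummatory module takes over only when its triggering stimulus is present. A real subsumption architecture would suppress or inhibit individual signals between modules rather than replace a whole action.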

A glance at the animal learning literature tells us that this basic appetence system is usually much more elaborated in real animals. For example, Timberlake (1983) suggests that the appetitive behaviors of animals are controlled by what he calls behavior systems. There exist different behavior systems that are related to feeding, mating, parenting, bodycare and other activities. In relation to learning, it is important to understand that the learning methods involved in the different parts of a behavior system are very different.

For feeding, the behavior system consists of behaviors for individual foraging, social-approach, investigation, predation, food handling, hoarding, ingestion and possibly rejection of the food. All these behaviors can be further divided into smaller pieces. For example, investigation may consist of approach, sniff, nose, lick and bite (Timberlake 1983). Clearly this view of behavior can easily be reconciled with ideas from the subsumption paradigm (see also Collier 1983).

Parallel Engagements

In our artificial creatures, the parallel to behavior systems in real animals will be engagement modules. Different engagements are controlled by parallel systems. In an artificial creature, behaviors such as eating and sleeping may be implemented as parallel engagements. Each engagement is controlled by its own subsumption hierarchy as shown in figure 3.2.4.

Figure 3.2.4 Each engagement has its own hierarchy of behavior modules.

Figure 3.2.5 Two parallel engagement systems that share two behavior modules.

Since some layers of different engagement modules will usually be the same, it is possible for different engagement modules to share behavior modules. Figure 3.2.5 shows an architecture where two layers are shared by two parallel engagement modules.

It is clear that rather complex systems can be built by combining behavior modules into larger structures in the two ways that we have described so far. When a single hierarchy is used, the subsumption mechanisms are sufficient to choose between the different behavior modules, but when parallel engagements are introduced, some additional mechanism is necessary to restrain different engagement hierarchies from interfering with each other. This is the role of central engagement selection.

Central Engagement Selection

When an agent has different and competing engagements, it must be able to select among these activities in a rational way. To do this, it must evaluate its needs and its possibilities and make a decision about what to do. This decision is represented by a functionally central motivational state that is responsible for activating or inhibiting the appropriate modules for the selected engagement (Gallistel 1980). We will call the part of the agent that handles this decision the motivational module.

It has sometimes been argued that a central motivational state is neither necessary nor very useful in an autonomous agent (Maes 1991). It seems, however, that the problem with central motivational states lies, not in making a central decision, but rather in the view that once a decision is made, it is not changed until its corresponding goal is reached.

The type of motivational state we are advocating is continuously reevaluated and does, thus, represent a transient momentary decision and not a long term commitment. This motivational state can change at any moment to reflect changes in the environment or a new evaluation of old knowledge (Balkenius 1993).

Figure 3.2.6 The determinants of motivation. The motivational state is a function of three factors. Internal drives tell the creature about its current needs, external incentive tells it about goal objects which are directly accessible, and internal incentive tells it about more distant possibilities.

Nor is it necessary that the motivational module is physically located at one particular place in the agent. It needs to be central only in the sense that the decision made has global consequences for the whole creature. It is important not to misunderstand the principle of central engagement selection. It simply states that all behaviors cannot be executed at the same time and that the choice of engagement cannot be made locally. One part of the agent cannot decide to search for food while another decides to dance. However, most engagements, and especially parts of behaviors, are best handled in a distributed fashion using a subsumption architecture. The motivational state is selected as a result of three factors (figure 3.2.6).

It is important to recognize that the difference between external and internal incentive is not whether learning is involved or not. Both processes can depend on learning. For instance, learning that an apple tastes good could make the apple an external incentive at a later stage. Instead it is the presence of the goal object that determines whether we have a case of external or internal incentive.

The three factors are weighted together for each of the engagements of the agent and a decision is formed about what the agent should do. It is quite possible that this decision is changed as the agent tries to pursue its goal. If our creature comes close to water on its way to the food, it may very well change its decision and drink on its way. Once the need for water has decreased, the creature will continue toward the food. This is an example of opportunistic behavior that results from the interaction between a motivational module and learning or a reactive control system. Figure 3.2.7 shows an architecture where two engagements, A and B, centrally compete for activation.
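The opportunistic behavior described above can be sketched as follows. The sketch assumes, purely for illustration, that the three factors are combined as a weighted sum; the text does not commit to a particular combination rule, and the class Engagement, the functions motivation and select_engagement, and all numerical values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Engagement:
    name: str
    drive: float               # internal drive: the current need
    external_incentive: float  # goal object directly accessible to the sensors
    internal_incentive: float  # more distant possibility


def motivation(e: Engagement, w_drive: float = 1.0,
               w_external: float = 1.0, w_internal: float = 0.5) -> float:
    """Weight the three factors together (assumed here to be a weighted sum)."""
    return (w_drive * e.drive
            + w_external * e.external_incentive
            + w_internal * e.internal_incentive)


def select_engagement(engagements: List[Engagement]) -> str:
    """Re-evaluated at every moment: pick the currently strongest engagement."""
    return max(engagements, key=motivation).name


eat = Engagement("eat", drive=0.8, external_incentive=0.0, internal_incentive=0.6)

# On the way to the food, no water in sight.
print(select_engagement([eat, Engagement("drink", 0.5, 0.0, 0.0)]))
# Water appears at the sensors: the decision changes.
print(select_engagement([eat, Engagement("drink", 0.5, 0.9, 0.0)]))
# The need for water has decreased: the creature continues toward the food.
print(select_engagement([eat, Engagement("drink", 0.0, 0.9, 0.0)]))
```

Because the selection is recomputed from the current drives and incentives each time it is called, the decision is a transient one and never a long term commitment.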

Apart from determining what the agent should do, the motivational system plays an important role in directing learning. Since the motivational module is the only subsystem that contains information about what the agent is trying to do, it is in a favorable position to determine the success of the currently generated behavior. When the actual success of a behavior does not match the expectations, the motivational system can generate learning signals to the other parts of the creature. As we will argue in chapter 6, such learning signals make up the basis for emotions. There, we will also return to the role of the motivational module in learning and give a longer argument in favor of its inclusion in artificial creatures.

Figure 3.2.7 A behavioral hierarchy is selected centrally.

Functional Levels

The behavior generated by an animal has its causes both within the animal and externally. Let us introduce a useful distinction between three levels of internal causes of behavior (figure 3.2.8, see also Balkenius 1994b and Gärdenfors and Balkenius 1993). A less sophisticated creature may have only the first level while a more advanced animal may have all three.

The first level refers to the innate properties of the system. At this level we find the fixed motor patterns of many animals and the basic reactive behavior of our artificial creatures. At the second level, the innate behavior becomes adaptable by the introduction of learning. Such learning is a direct consequence of experience with the world. The final level is engaged in self-supervised learning. This type of learning is not governed directly by the external world. Instead it is controlled by knowledge of the external world and the consequences of actions performed in it. Adaptation based on self-knowledge is a special type of learning at this level. In chapter 9, we will return to this last level.

Figure 3.2.8 Three functional levels of behavior control.

It is not necessary that the three levels correspond to physically distinct modules. For instance, in a module built as a neural network with learning nodes, it is not possible to physically distinguish between the innate and the learning level. At a functional level, however, the two levels are still different. We have not been able to find any example of a system where all three levels exist in a single component, but such systems probably exist.

When constructing artificial creatures, we will find it necessary to incorporate the three functional levels both within small subsystems such as a single neuron and at a larger scale where the different levels are handled by distinct modules.

Summary

We have presented a number of architectural principles which are useful as design directions when constructing an artificial creature. The most important principle is that autonomous agents should be behavior-based. Once this principle is established we have to consider how to combine behavior modules into larger systems. We have proposed two possible mechanisms. The first is the subsumption architecture introduced by Brooks (1986) and the second is a central motivational system.

In whatever way the behavior modules are combined, the generated behavior can be caused at three different functional levels. The first level corresponds to innate behavior, the second to learned behavior and the third to behavior which has been rehearsed internally or is governed by self-knowledge of some kind.

The ideas presented are mainly based on the type of mechanisms one would find in biological systems. We believe that the type of overall architecture presented here can be used as a starting point for the development of very advanced and capable autonomous agents. In section 3.4, we will consider a very simple creature constructed according to these principles, but first we need to consider some properties of the artificial neural networks which will be used for the nervous systems of our creatures.

3.3 Artificial Neural Networks: A Brief Introduction

Artificial neural networks are, of course, the obvious choice for the nervous system of an artificial creature. However, there are more reasons than simply the name. The most important is that by using artificial neural networks, the representations used will be of the kind used in real brains. These representations use labeled line coding with graded activity values (Martin 1991). The idea behind labeled line coding is that each line, in this case, the connection from one node to another, has a fixed meaning. This type of representation is easiest to understand in the case of sensory representations. A specific signal means that a specific sensory stimulus is present. The intensity of the signal reflects directly the intensity of the stimulus. The use of labeled line coding does not imply that representations must be local, that is, it is not necessary that one single line codes for one single stimulus. In most cases, a large set of lines will be active for any particular stimulus. This is called a distributed representation. Such representations can be comfortably represented as vectors. We do not want to claim that these representations are the same as those in the brain, but they are certainly similar in many respects. While it would be possible to use this type of representation without the neural networks, we think it would be harder to do so consistently.

The next reason for using artificial neural networks is that the computations will be "in the style of brain" (Arbib, Conklin and Hill 1987). Again, it is not possible to claim that the computations in the brain are the same, but they are probably more similar to those of neural networks than to any other type of computational model. In this context, we also want to mention that we will make no attempt to make our neurons very realistic. Given the complexity of the model we will develop, and the amount of speculation that inevitably will go into it, it would be nearly impossible to try to be true to the detailed facts about neuronal functioning.

When we point out the similarities between an artificial neural circuit and some region of the real brain, this will only be a statement about the function they fulfil, not about a similarity at a neuronal or network level. Having said this, we can now safely introduce the neurons and connections which will be used in this book.

Figure 3.3.1 The neuron model. The node receives three input signals s0…s2 that generate an activity x, which in turn controls the output signal f(x).

A Model Neuron

The model neurons we will use in our artificial creatures can be described by two factors. The first is how they combine their input signals to form their activity level, and the other is how they generate an output signal from this activity level (figure 3.3.1).

The input signals will be combined using both temporal and spatial integration. Let s0…sn be the input signals received by a node x. The activity x changes according to the differential equation,

(Equation 3.3.1)

A node of this type is usually called a leaky integrator since it integrates its input over time and leaks through the term -ax. If a=0, the node does not leak and is simply called an integrator. The term aR describes the resting activity of the node. When no input is received, the node will return to this value. Throughout this book, it will be assumed that R=0 when it is not explicitly mentioned that it has a different value.
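Since equation (3.3.1) is not reproduced in this text, the following sketch assumes a form that is consistent with the description above, namely dx/dt = -ax + aR + s0 + … + sn, and integrates it with a simple Euler step. The function name leaky_integrator_step and the step size eps are our own assumptions.

```python
from typing import Sequence


def leaky_integrator_step(x: float, inputs: Sequence[float],
                          a: float = 0.1, R: float = 0.0,
                          eps: float = 1.0) -> float:
    """One Euler step of the assumed equation dx/dt = -a*x + a*R + sum(inputs)."""
    dx = -a * x + a * R + sum(inputs)
    return x + eps * dx


# Without input, the activity returns to the resting level R.
x = 1.0
for _ in range(200):
    x = leaky_integrator_step(x, [], a=0.1, R=0.2)
print(round(x, 6))

# With a = 0 the node does not leak and simply accumulates its input.
y = 0.0
for _ in range(4):
    y = leaky_integrator_step(y, [0.5, 0.25], a=0.0)
print(y)
```

Under this assumed form, setting a = 1 and eps = 1 with R = 0 makes the new activity equal to the sum of the current inputs, which is how a summing node behaves.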

When the input signals change slowly compared to the time constant of equation (3.3.1), the activity of the node can be approximated by,

(Equation 3.3.2)

In this case, the neuron is called a summing node. When the integrating property of the node is not important, we will use this type of node since it is easier to analyse. Unless it is explicitly mentioned that integrator nodes are used below, all nodes are assumed to be of the summing type. The output signal from a node is calculated by its output function, f(x). We will use four different types of output functions in this book. The first, and simplest, is the linear function,

(Equation 3.3.3)

The second output function is called semi-linear (Amari 1982) and includes a threshold level below which no output is generated,

(Equation 3.3.4)

Our next output function is the threshold function,

(Equation 3.3.5)

And finally, we will use a sigmoid output function (Cohen 1983),

(Equation 3.3.6)

Depending on the parameters n and d, this function can resemble either a semi-linear or a threshold function. In either case, it is continuous, which will be important in some cases. In the neural network literature, the logistic function f(x) = 1/(1 + e^(-x+θ)) is often used as the sigmoid function (for example, Hopfield 1984), but for the purposes of this book equation (3.3.6) has more suitable properties.
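The output functions can be sketched in Python as follows. Because equations (3.3.3) to (3.3.6) are not reproduced in this text, the exact forms are assumptions: in particular, the semi-linear function is assumed to output the amount by which the activity exceeds the threshold, and the threshold function is assumed to output 1 above the threshold. The sigmoid with the parameters n and d is not reconstructed; the logistic function from the neural network literature is shown in its place.

```python
import math


def linear(x: float) -> float:
    """Linear output: the activity is passed on unchanged."""
    return x


def semi_linear(x: float, theta: float = 0.0) -> float:
    """Assumed form: no output below the threshold, linear above it."""
    return x - theta if x > theta else 0.0


def threshold(x: float, theta: float = 0.0) -> float:
    """Assumed form: all-or-none output."""
    return 1.0 if x > theta else 0.0


def logistic(x: float, theta: float = 0.0) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x + theta)), a continuous alternative."""
    return 1.0 / (1.0 + math.exp(-x + theta))


for f in (linear, semi_linear, threshold, logistic):
    print(f.__name__, [round(f(v), 3) for v in (-1.0, 0.0, 0.5, 2.0)])
```

The semi-linear and threshold functions both change abruptly at the threshold, whereas the logistic function changes smoothly around it.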

Figure 3.3.2 The Different Node Types.

There are three distinct types of neurons, or nodes, in the nervous systems of our creatures (figure 3.3.2). The first type is the internal node. These nodes are drawn as circles. The second type is the motor, or output, node, which is drawn as a double circle. These nodes control the behavior of the creature. The last type of node is the input, or sensor, node and these are drawn as squares. The output from a sensor node is set by a sensor on the body of a creature (see section 3.4).

The nodes can be connected together by various types of connections. These are shown in figure 3.3.3. When a signal passes through a connection, it is multiplied by the weight, or strength, of the connection. Weights can be either positive or negative. Connections with positive weights are called excitatory while connections with negative weights are called inhibitory. The excitatory connections are drawn in white while the inhibitory are black.

Figure 3.3.3 The Different Connection Types.

If w0…wn are the weights on the connections leading to node x, the summed input is given by,

(Equation 3.3.7)

Connections can be either fixed or plastic. Plastic weights change when the node to which they are connected receives a learning signal through a learning connection (figure 3.3.4). The change in the plastic connection, w, is governed by a learning rule of the form,

(Equation 3.3.8)

where L(x) is the learning signal transmitted from node x, f(y) is the output of node y, and g(z) is the signal passing through the plastic connection. Various learning rules will be presented as they are needed.

Another type of connection is the facilitating one (figure 3.3.4 c and d). Instead of influencing the activity of another node, a facilitating signal changes the signal in another connection. Let F be a facilitating signal and f(x) be the signal in the facilitated connection from node x to node y with the weight w. The resulting signal is the product of the two signals, that is Ff(x). When this signal reaches node y it will also have been multiplied by the connection strength w. The signal reaching y will, thus, be Ff(x)w.

Figure 3.3.4 Examples of connections. (a) The node x excites the node y. This node in turn inhibits z. (b) The node x sends a learning signal to y which influences the plastic connection, w, between z and y. (c) The signal sent from x to y is facilitated (or gated) by the signal F. (d) The output from node x is facilitated (or gated) by the signal F. (e) The signal sent from x to y is suppressed by the signal S. (f) The output from node x is suppressed by the signal S.

It is also possible for a facilitating connection to facilitate a whole node and not only a connection (figure 3.3.4 d). In this case, it is the output of the node that is multiplied by the facilitating signal. In the example in figure 3.3.4 this gives the same result as facilitation of a single connection, but it is not necessarily the case.

The negative counterpart of facilitation is suppression. This situation is similar to facilitation except that the suppressed signal from x to y is multiplied by (1-S) instead, that is, if f(x) is the signal transmitted in the suppressed connection and w is the connection strength, the resulting signal will be (1-S)f(x)w. This means that the larger the suppressing signal is, the smaller the suppressed signal will get. See figure 3.3.4 e and f. The behavior of facilitated or suppressed nodes and connections is similar to that of sigma-pi units where inputs to a node can interact both additively and multiplicatively (Williams 1986).
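The two multiplicative interactions can be written out directly. The function names below are our own; the expressions Ff(x)w and (1-S)f(x)w are those given in the text.

```python
def facilitated_signal(fx: float, w: float, F: float) -> float:
    """Signal reaching y through a facilitated connection: F * f(x) * w."""
    return F * fx * w


def suppressed_signal(fx: float, w: float, S: float) -> float:
    """Signal reaching y through a suppressed connection: (1 - S) * f(x) * w."""
    return (1.0 - S) * fx * w


# With F = 0 the connection is closed, with F = 1 it passes f(x) * w unchanged.
print(facilitated_signal(0.5, 2.0, 0.0), facilitated_signal(0.5, 2.0, 1.0))
# The larger the suppressing signal, the smaller the transmitted signal.
print(suppressed_signal(0.5, 2.0, 0.25), suppressed_signal(0.5, 2.0, 1.0))
```

A facilitating signal of 0 thus has the same effect as a suppressing signal of 1: in both cases nothing reaches node y.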

A final property of connections is that they delay the signal transmitted in them. We will assume that all connections have a fixed time delay of 1 time unit. This means that if node x sends the signal f(x) to node y through a connection with weight w at time t, the activity of y will be increased by f(x)w at time t+1. Taking this into account we can summarize the calculation of the activities of the nervous system as follows.

Let x0…xn be the activities of all the nodes in the network and let fi(xi) be their output functions. The connection from node xi to xj is called wij. When only summing nodes are used, the successive states of the network are given by,

(Equation 3.3.9)

When neither facilitation nor suppression is used, Wij(t) is set to the connection weight between nodes i and j, that is, Wij(t) = wij(t). If facilitation exists in the connection, Wij(t) is set to,

(Equation 3.3.10)

where Fij is the set of indices of the nodes facilitating the connection between nodes i and j. Similarly, if suppression is used,

(Equation 3.3.11)

where Sij is the set of indices of the nodes suppressing the connection between nodes i and j.
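The update of a network of summing nodes can be sketched as follows. Equations (3.3.9) to (3.3.11) are not reproduced in this text, so the sketch assumes the forms suggested by the surrounding description: the activity of node j at time t+1 is the sum over i of Wij(t) fi(xi(t)), where the effective weight Wij is the weight wij multiplied by the outputs of all facilitating nodes and by one minus the outputs of all suppressing nodes. The function names and the dictionary-based representation are our own.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Edge = Tuple[int, int]  # (i, j): connection from node i to node j


def effective_weight(w_ij: float, outputs: Sequence[float],
                     facilitators: Sequence[int] = (),
                     suppressors: Sequence[int] = ()) -> float:
    """Assumed form of W_ij: the weight gated by facilitation and suppression."""
    W = w_ij
    for k in facilitators:
        W *= outputs[k]
    for k in suppressors:
        W *= 1.0 - outputs[k]
    return W


def network_step(x: Sequence[float],
                 f: Sequence[Callable[[float], float]],
                 w: Dict[Edge, float],
                 facilitators: Dict[Edge, Sequence[int]] = None,
                 suppressors: Dict[Edge, Sequence[int]] = None) -> List[float]:
    """One time step for summing nodes; every connection delays by one unit."""
    facilitators = facilitators or {}
    suppressors = suppressors or {}
    outputs = [f_i(x_i) for f_i, x_i in zip(f, x)]
    new_x = [0.0] * len(x)
    for (i, j), w_ij in w.items():
        W = effective_weight(w_ij, outputs,
                             facilitators.get((i, j), ()),
                             suppressors.get((i, j), ()))
        new_x[j] += W * outputs[i]
    return new_x


identity = lambda v: v
w = {(0, 1): 2.0}  # node 0 excites node 1 with weight 2
# Node 2 gates the connection from node 0 to node 1.
print(network_step([0.5, 0.0, 1.0], [identity] * 3, w, facilitators={(0, 1): [2]}))
print(network_step([0.5, 0.0, 0.0], [identity] * 3, w, facilitators={(0, 1): [2]}))
```

In this sketch, nodes without incoming connections fall back to zero activity after one step. In a creature, such nodes would typically be sensor nodes whose outputs are set from outside the network.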

In all simulations reported below, Euler's method is used to calculate the activities of the integrating nodes. It is well known that this method gives rather large errors (Eldén and Wittmeyer-Koch 1987), but since it has worked well in the simulations in this book, we see no reason to use a more complicated method. It also makes it simple to use both integrating and summing nodes concurrently. The formula (3.3.9) is simply replaced by,

(Equation 3.3.12)

For a summing node ej=aj=1, for a leaky integrator 0

System Properties

When nodes are connected together into networks, it is often interesting to study the long-term behavior of the system. Given that the network receives constant input from the environment, one of three things can happen. In the first case, the activities of the nodes converge to some fixed value. When this happens, the network can be said to have made a choice. The final state is called a fixed point in the state space of the network. In the second case, the network goes through a sequence of states which recur with a fixed time interval. The network is said to have reached a limit-cycle. In the last case, the network exhibits chaotic oscillations. When this happens, the activity of the nodes goes through a never-ending sequence of states that never repeats itself. This last type of behavior is generally undesirable.
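For networks whose states repeat exactly, for example networks of threshold nodes, the first two cases can be told apart by simply iterating the network and waiting for a state to recur. The following sketch does this for an arbitrary update function; the name classify_long_term and the example network, a single threshold node that inhibits itself, are hypothetical illustrations.

```python
from typing import Callable, Optional, Tuple

State = Tuple[float, ...]


def classify_long_term(step: Callable[[State], State], state: State,
                       max_steps: int = 1000) -> Tuple[str, Optional[int]]:
    """Iterate `step` under constant input until a state recurs exactly."""
    seen = {}
    for t in range(max_steps):
        if state in seen:
            period = t - seen[state]
            if period == 1:
                return ("fixed point", 1)
            return ("limit cycle", period)
        seen[state] = t
        state = step(state)
    return ("no repeat found", None)


def self_inhibiting(state: State) -> State:
    """x(t+1) = s - w * f(x(t)) with s = 1, w = 2 and a threshold at 0.5."""
    output = 1.0 if state[0] > 0.5 else 0.0
    return (1.0 - 2.0 * output,)


def constant_drive(state: State) -> State:
    """The same node without self-inhibition: x(t+1) = s = 1."""
    return (1.0,)


print(classify_long_term(self_inhibiting, (0.0,)))
print(classify_long_term(constant_drive, (0.0,)))
```

The check relies on exact repetition of states, so it is not suited to graded activities that only approach a fixed point asymptotically, and it cannot establish that a network is chaotic.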

In the artificial nervous systems which will be developed in this book, it is usually required that parts of the network reach either a stable state or a stable limit-cycle. However, this is generally not a desirable property of the nervous system as a whole. A stable state in the nervous system would mean constant behavior throughout the lifetime of the creature. The entirely reactive creatures we will discuss in chapter 4 are exceptions. Since their internal state always reflects the input they receive, they will always be in a stable state when the input is constant. This is not much of a problem, however. Since the creatures are moving around, their input will change all the time, and so will their internal state.

Another question to ask about the internal state is whether it is stable or not. If the state changes slightly, will it return to the previous fixed point or limit-cycle? If it does, the previous state was stable. Otherwise it was unstable. Since these properties have already been formally studied by many researchers (Amit 1989, Cohen 1983, Kamp 1990), we will not do so here. Instead, we will try to summarize some important architectures which we will use later on in this book.

Oscillation A simple oscillator can be constructed by letting an integrating node with a threshold output function, f(x), inhibit itself when its activity passes over the threshold (figure 3.3.5). Given a constant input signal, s, the activity, x, of the node will gradually build up until the threshold is reached. At this point, the node will generate a brief output pulse which will reset the node and the cycle will repeat itself again. For a fixed threshold, the frequency of the emitted signal will be proportional to the intensity of the input signal.
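The behavior of the oscillator can be sketched with a few lines of simulation. The sketch makes several simplifying assumptions of its own: the node is a pure integrator without leak, the self-inhibition is modelled as an immediate reset of the activity to zero, and the function name count_pulses and all parameter values are hypothetical.

```python
def count_pulses(s: float, theta: float = 1.0,
                 steps: int = 640, eps: float = 1.0 / 64.0) -> int:
    """Count the output pulses of an integrate-and-reset node with constant input s.

    The step size is a power of two so that the arithmetic in the example is exact.
    """
    x = 0.0
    pulses = 0
    for _ in range(steps):
        x += eps * s          # the activity builds up gradually
        if x >= theta:        # threshold reached: emit a brief pulse ...
            pulses += 1
            x = 0.0           # ... and let the self-inhibition reset the node
    return pulses


print(count_pulses(1.0))
print(count_pulses(2.0))
```

Doubling the input doubles the number of pulses emitted in the same period of time. With a leaky integrator, the relation between input and frequency would only be approximately proportional.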

The oscillator is one of the most important types of units used in motor control (Gallistel 1980). Since the creatures in this book will have very limited motoric abilities, the oscillator will not be used to any large extent below.

Figure 3.3.5 A simple oscillator. See the text for further explanation.

Competition One important property of many neural networks is the ability of the nodes to compete with each other and to form a global choice. This process is implemented by letting all the nodes gradually inhibit each other until only one node is at its supra-threshold level. Figure 3.3.6 shows two simple networks with this property. In each network, the three nodes x1…x3 receive different input signals called s1…s3.

Figure 3.3.6 Two simple competitive networks. All signals except the largest are dynamically quenched by recurrent inhibition.

Let us first consider the network to the left. The output of each integrator node xi is given by a sigmoid function. Competition is implemented by an auxiliary node called x0. This node calculates the sum of all the output signals f(xi) and generates recurrent inhibition through the semi-linear output function g(x0). This output signal inhibits the nodes x1…x3 with equal strength. Given that the activities of x1…x3 build up gradually, the activity of x0, and consequently also the level of inhibition g(x0), will gradually increase. When the process stabilizes, only the node with the largest input signal will remain at its supra-threshold level. The output signal of this node, say x1, will be equal to its input signal s1. Figure 3.3.7 shows the development of the output signals in a simple competitive situation. As can be seen, all outputs start to increase, but all except for the strongest are quenched by the recurrent inhibition. At equilibrium, the output g(x0) is equal to the maximum of the input signals. This network can, thus, be used to calculate the maximum of a number of signals.
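The quenching of all signals except the largest can be illustrated with a small simulation. This is only a qualitative sketch under assumptions of our own: the nodes are leaky integrators updated with Euler steps, their output function is a simple rectification in place of the sigmoid, the inhibition from the auxiliary node is scaled by a hypothetical gain c, and the one-step delay through the auxiliary node is ignored. The equilibrium values it produces therefore depend on these choices and should not be read as those of the network in figure 3.3.6.

```python
from typing import List, Sequence


def compete(s: Sequence[float], c: float = 5.0,
            eps: float = 0.05, steps: int = 2000) -> List[float]:
    """Recurrent inhibition through an auxiliary node; returns the final outputs."""
    x = [0.0] * len(s)
    for _ in range(steps):
        outputs = [max(0.0, x_i) for x_i in x]   # assumed rectifying output f
        x0 = sum(outputs)                         # auxiliary node sums the outputs
        inhibition = c * max(0.0, x0)             # semi-linear g, scaled by c
        # All nodes are inhibited with equal strength.
        x = [x_i + eps * (-x_i + s_i - inhibition) for x_i, s_i in zip(x, s)]
    return [max(0.0, x_i) for x_i in x]


out = compete([0.5, 0.8, 1.0])
print([round(o, 4) for o in out])
```

All three activities start to rise, the inhibition grows with them, and first the weakest and then the second weakest node is pushed below zero output, leaving only the node with the largest input active.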

Since networks of this type have been much studied in the neural network literature (Amari 1982), we will not develop the analysis further here. We will only note that the network to the right in figure 3.3.6 shows almost identical behavior to that to the left. In this network, all nodes inhibit all other nodes in proportion to their output signal. This has the same effect as the inhibition from the auxiliary node in the network to the left. The only difference is that the outputs are not summed at an extra node. Since the inhibitory connections protrude to the sides of each node, the network is said to use lateral inhibition. Inhibition of this type is very common in biological neural networks.

Amari and Arbib (1977) and Amari (1982) analyse networks with recurrent inhibition through an extra node. A similar network, but based on different principles, is presented in Trehub (1991). Grossberg (1973) has studied the behavior of networks with lateral inhibition combined with different output functions and update rules. He has also shown how choice can be combined with normalization of the input pattern and contrast enhancement in a single network. These properties require quite complex dynamics and will not be used here. We will instead use different, and simpler, network architectures for these processes (see section 7.2 for an example of normalization).

Figure 3.3.7 The development of the different activity levels in the nodes of the networks in figure 3.3.6.

Cooperation A content addressable memory can be constructed by connecting a number of nodes together as shown in figure 3.3.8. This is called a recurrent network. Each threshold output signal, f(xi), is connected back into the network to all other nodes through plastic connections of varying strength. An input signal, si, will first activate its corresponding node, xi. The output from this node will then activate all other nodes that it is connected to. These, in turn, will propagate the activity to the nodes to which they are connected, and so on. This process is called spreading activation (Rumelhart and McClelland 1981).

Figure 3.3.8 A content addressable memory.

If two nodes, xi and xj, both activate the other in this way, they are said to cooperate. If one node in a set of cooperating nodes is activated, the mutual cooperation will set up a positive feedback loop which will eventually activate all nodes in the set. We call the activity pattern that results a resonant state (Grossberg 1989, Shepard 1984). This property can be used to construct a content addressable memory (Grossberg 1989).

Let us call the input to the network s = <s0, s1, s2, s3> and the output f = <f(x0), f(x1), f(x2), f(x3)>. Let us further assume that we want to store the two patterns a = <1, 1, 0, 0> and b = <0, 0, 1, 1> in the network. This is accomplished by setting the connection from node i to node j, wij, to 1 if there exists a pattern where both positions i and j are 1. To store pattern a, w01 and w10 are both set to 1 since pattern a contains 1s at positions 0 and 1. Similarly, for pattern b, w23 and w32 are set to 1. All other connections, including the connection from each node to itself, are set to 0.

Figure 3.3.9 shows the resulting network. If we set the input to any part of the pattern a, say <1, 0, 0, 0>, the output will be the whole pattern, in this case, <1, 1, 0, 0>. The same is true of the pattern b. If the last part of this pattern is given as input, that is, s = <0, 0, 0, 1>, the output will approach <0, 0, 1, 1>. Since each pattern is associated with itself, a system of this kind is sometimes called an auto-associator, and a content addressable memory is sometimes alternatively called an auto-associative memory.

It is clear that it is easy to store any number of patterns in a memory of this kind as long as they do not overlap. If we try to store two patterns which do overlap, such as <1, 0, 0, 0> and <1, 1, 0, 0>, the larger pattern will always be recalled. This problem gets even more severe if we try to store a number of patterns that partially overlap, say <1, 1, 0, 0>, <0, 1, 1, 0> and <0, 0, 1, 1>. Giving the network any of these patterns as input will recall the pattern <1, 1, 1, 1>, which was never stored at all.
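The storage rule and the recall process can be tried out with a short Python sketch. The function names store and recall are our own, and the update rule (a node switches on as soon as it receives any external or recurrent input) is an assumed simplification of the threshold dynamics.

```python
def store(patterns, n):
    """Set w[i][j] = 1 when some pattern has a 1 at both positions i and j (i != j)."""
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j and p[i] == 1 and p[j] == 1:
                    w[i][j] = 1
    return w

def recall(w, s):
    """Spread activation from the input s until the output pattern stops changing."""
    n = len(s)
    f = list(s)
    while True:
        # A node is on if it receives external input or input from an active node.
        new_f = [1 if s[j] + sum(w[i][j] * f[i] for i in range(n)) > 0 else 0
                 for j in range(n)]
        if new_f == f:
            return f
        f = new_f

w = store([[1, 1, 0, 0], [0, 0, 1, 1]], 4)
print(recall(w, [1, 0, 0, 0]))   # [1, 1, 0, 0]
print(recall(w, [0, 0, 0, 1]))   # [0, 0, 1, 1]

# Partially overlapping patterns merge into a single resonant state.
w_overlap = store([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]], 4)
print(recall(w_overlap, [1, 1, 0, 0]))   # [1, 1, 1, 1]
```

The last call reproduces the failure case: the activity spreads along the chain of overlapping connections until every node is active.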

Figure 3.3.9 A content addressable memory which has learned a number of patterns.

A number of methods have been developed to handle these problems. The most obvious one is to let the connections have varying strengths and to include inhibitory connections which make it possible for the different nodes to compete with each other. In many cases, this increases the storage capacity. Another strategy is to add high-order nodes which recognize specific activity patterns and enhance the storage of these (Cohen and Grossberg 1987). We will return to these problems many times in the discussion of associative learning below. Here we will only present an overview of two different learning strategies which can be used in a network of this type.

There are essentially two methods to set the weights of a recurrent network. The first is to increase the connection weights each time two nodes are active together as in the example above. This method was pioneered by Hebb (1949) and its different variants are often referred to as Hebbian learning rules. The central idea is that temporal contiguity of activation increases the connection, or association, between two nodes. The second method is to let the connections between nodes represent the statistical contingency between them. Networks of this type can be based, for example, on Bayesian decision theory (Lansner and Ekeberg 1989) or on an estimate of the transferred information between the nodes (Balkenius 1992).
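The difference between contiguity and contingency can be made concrete with a small Python sketch. Both functions are hypothetical illustrations: hebbian_weight uses the simplest possible Hebbian increment, and contingency uses the difference between two conditional probabilities as a stand-in measure, which is neither the Bayesian nor the information-based estimate cited above. The sketch also assumes that the first node is both active and inactive at least once in the history.

```python
def hebbian_weight(pairs, rate=0.1):
    """Increase the weight each time both nodes are active together."""
    w = 0.0
    for a, b in pairs:
        w += rate * a * b
    return w

def contingency(pairs):
    """P(b active | a active) - P(b active | a inactive), a stand-in measure."""
    with_a = [b for a, b in pairs if a == 1]
    without_a = [b for a, b in pairs if a == 0]
    return sum(with_a) / len(with_a) - sum(without_a) / len(without_a)

# Node b is always active; node a is active on every other step.
history = [(1, 1), (0, 1)] * 10
print(hebbian_weight(history))   # grows with every co-activation
print(contingency(history))      # zero: a tells us nothing about b
```

In this example the two nodes are often active together, so the Hebbian weight grows, although the activity of one node carries no information about the other.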

The distinction between these two methods for the formation of associations has been much discussed within animal learning theory (see, for example, Mackintosh 1983). It has, however, received little attention among neural network researchers.

Conclusion

This section has introduced the basic neural network concepts which we will use throughout this book. A number of node and connection types have been presented. We have also taken a brief look at the phenomena of competition and cooperation in small networks of interacting nodes. In the next section these concepts will be put to work in our first example of a complete artificial creature.

3.4 A Complete Artificial Creature and Its World

The principles presented in the previous section can also be incorporated in very simple creatures. In this section, we want to present one such example which will also serve as an introduction to the methodology used in the rest of the book, but first we will describe the world that our creatures will inhabit. Since the creature constructed in this section is mainly intended as an example, we will not put forward much argument in support of the different design choices made here. In fact, some of them are not even very good. In order not to clutter the presentation, such arguments will have to wait until the following chapters.

Figure 3.4.1 An example environment. Four walls, three food objects and a single creature are present.

Elements of the Environment

An animal is nothing without its environment. The environment supplies it with food, shelter and all other things that are essential for a living organism. If we want to study complete autonomous creatures, we must also investigate the world which they inhabit. Since the creatures we will consider in this book are all simulated, their world differs in a number of ways from the 'real' world. It is, thus, of great importance to fully understand the details of this alternative world before we can start the construction of our artificial creatures.

The simulated environment consists of an infinitely large two dimensional plane, though in practice, the environment is constrained by walls. Creatures and objects are placed in this plane and are all simulated as two dimensional shapes such as circles and squares. In this respect, at least, the environment is not entirely unlike Abbott's Flatland fantasy although the behaviors of its inhabitants are entirely different (Abbott 1884/1991).

There are a number of objects in this world such as walls, food and creatures. It has not been the goal to make the environment as realistic or as complex as possible. While very complex environments can give us insights which are not possible in a simple environment, it is also very often important to strip the environment of all but the essential features. It is useful to consider the world described here not as a model of reality but rather as an alternative world which shares some aspects with the real world. If we were to carry out the task described in this book in the real world with robots instead of simulated creatures, many details of the creatures would have to be different. However, the general methodology would be the same.

Walls Walls are represented as lines in the plane. Walls can be either opaque or non-opaque. Opaque walls do not let odors through while non-opaque walls do. The primary function of walls is to structure the environment into something a little more interesting than an empty surface.

Food There are four types of food in the environment: A, B, C and D. Food is depicted as circles with varying radius and contains a varying amount of energy. They also produce a smell. This smell, that may be a complex composition of odors, can be detected by the creatures and it can guide them towards the food. What type of food it is can be detected by the creature once it is in contact with it and when it eats, a certain amount of energy is transferred from the food object to the creature.

This energy can be of four types called e0, e1, e2 and e3. Different activities of the creature use different types of energy which makes it necessary for our creature to eat different types of food containing different energy types.

Aversive Objects An aversive object gives a creature an 'electric shock' if it comes into contact with it. These objects are circles or points, just like food, and also give off smells. When a creature is in contact with an aversive object, its shock sensor reacts.

Other Creatures Apart from the creature, which is in our focus of attention, other creatures may also be present in the environment. Usually, these will be copies of our current creature but at times they will be what we will call irrelevant creatures which simply move around at random. The role of these creatures is to annoy the real creature and make sure the environment is unstable.

This list of objects is not complete and we will introduce various other things as we go along, but it is sufficient for our first example of a neurally controlled artificial creature.

The Body of an Artificial Creature

Our creature has a perfectly circular two dimensional body. This may not be much of a body, but since we will mainly study spatial behavior and not motor control, it will be quite sufficient: it contains the most important features of a body, namely sensory receptors and a motor system as well as a simple metabolism. While the nervous system of the creature will be developed into considerable complexity through the rest of this book, the body will not be changed at all. By committing ourselves to a fixed body, we know that a better performance is the result of the developed nervous system and not of a new and better body. The different sensory and motor systems of the creature are described briefly in figure 3.4.2 and below (see Balkenius 1994a for further details).

Touch The creature has one whisker on each side of the body which is directed forward. The whiskers have their origin at the center of the body and can have a variable length. The angles between the forward direction and the whiskers are always the same for both whiskers. The whiskers have outputs which tell the creature whether or not they have collided with any object such as a wall or another creature. They do not react on contact with food. There is also a sensor in the body which reacts when the creature has collided with an obstacle. This signal can be used as a last resort when the whiskers have failed to react. This is how this sensor is used in the simulations reported below. The neurons which are activated when the whiskers are in contact with an object are called wL and wR. The collision sensor is denoted by c.

Smell A number of smell sensors are located at the end of the whiskers. These are called sL0–sL7 and sR0–sR7 for the sensors on the left and right side respectively. The smell sensors detect the 'concentration' of a number of different 'chemical' substances in the environment around the creature. The primary sensory system of our creature will be concerned with smell and we will develop the analysis of odors to some complexity in the following chapters.

Taste When the creature is in contact with food, one of the four food detectors will react. There is one food detector for each of the four food types A, B, C and D. These detectors are called fA, fB, fC and fD. It is possible for the creature to eat a food object only when the food detector reacts to it. Apart from generating eating behavior, the food detectors can be used as signals which initiate learning. The food detectors can be considered to generate taste signals to the creature.

Figure 3.4.2 The body of the creature with its sensory and motor neurons.

Pain The body is also susceptible to 'electric shock'. Various aversive objects generate these signals if the body is in contact with them. This makes the single pain receptor, p, react. Like the food detectors, the pain sensor can be used to drive learning.

Needs Our creature also has sensors for its bodily needs. To keep things simple, all the artificial creature needs is to eat at regular intervals. There are four internal need sensors which determine the internal level of four types of energy needed by the creature called e0–e3. Based on the signals from these sensors, the creature can make choices about what to eat.

Note that the outputs from these sensors indicate that the creature does not need the corresponding energy. This means that the creature should look for food containing energy 0 when the sensor e0 does not react.

Motor System The body moves through the environment using two motors which drive wheels on each side of the body. By varying the speeds and directions of the motors, the creature can move forwards or backwards, turn while moving or spin. This type of general body architecture has been used in a number of studies both with physical robots and in simulated environments, most notably by Braitenberg (1984). In these studies, the body and motor control are not the primary objects of study. However, the type of output signals necessary to control the movement of the creature can easily be adapted to more developed models of locomotion (for example, Beer 1990, 1992, Ekeberg 1992) or to robots (see, for example, Hirose 1993).

In the simulations shown below, all the movements of the creature are generated by a model of a physical motor system with the only exception that the creature is weightless and the wheels have infinite friction on the ground. The consequence of these simplifications is that the creature has no physical momentum and that the rotations of the wheels are the only thing that moves the creature. If the wheels have stopped, the creature is absolutely still. The simulated motor system is a sufficiently good approximation of a robot moving slowly, but not very appropriate if we want to compare it with the precise movements of a fast running animal.
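A body without momentum, moved only by its two wheels, can be approximated with standard differential-drive kinematics. The following Python sketch is our own illustration and not the motor model used in the simulations; the function name step, the wheel separation wheel_base, and the time step dt are assumed for the example.

```python
import math

def step(x, y, heading, v_left, v_right, wheel_base=1.0, dt=0.1):
    """Advance a weightless two-wheeled body by one time step."""
    v = (v_left + v_right) / 2.0              # forward speed of the body center
    omega = (v_right - v_left) / wheel_base   # turning rate, positive = left turn
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    heading += omega * dt
    return x, y, heading

print(step(0.0, 0.0, 0.0, 1.0, 1.0))    # equal speeds: straight ahead
print(step(0.0, 0.0, 0.0, -1.0, 1.0))   # opposite speeds: spin on the spot
print(step(0.0, 0.0, 0.0, 0.0, 0.0))    # stopped wheels: no motion at all
```

Since the new position depends only on the current wheel speeds, the body stops the moment the wheels stop, which is the property described above.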

All the movements of the creature are controlled by a set of five motor neurons. Two of these, mL and mR, control the speeds of the motors on the left and the right side of the body. Another neuron, called Eat, controls the eating behavior of the creature. When this neuron is activated, the eating behavior starts. Neurons of this type are often referred to as command neurons and can be found in many lower animals (Shepherd 1988). The last two output neurons, wj and wp, control the angle between the two whiskers and their protrusion respectively. By changing the output level of these neurons, the creature is able to move its whiskers (see section 4.4). The different motor neurons are shown in figure 3.4.2.

Metabolism Our creature has a simple metabolism. The energy gained from eating decreases at a rate which depends on the current activity of the creature. The faster a creature moves, the larger the decrease will be. This metabolic model makes it favorable for a creature to rest when it is not in need of food since this will lower its metabolic rate.

If we want to evaluate the performance of a creature, the type of metabolism we select will play an important role. While it is tempting to evaluate a creature based on how good it is at solving a problem, such as finding a hidden food pellet, this can never be our only criterion. Real creatures spend a lot of energy while searching for food and if we do not take this into account, the best creature would be one that generates very strange behavior.

For example, if the creature is rewarded for finding and eating a lot of food but not penalized for moving fast, it need not be very smart. The best strategy will be to run as fast as possible to cover the whole of the environment at a minimum of time. If, on the other hand, it is expensive to run, it is much more beneficial for the creature to use a more clever strategy. This implies that a performance measure must be based both on the observable behavior of the creature and on the amount of energy used to carry out the task.
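The trade-off can be illustrated with a small Python sketch. The linear cost model, the constants, and the function names metabolic_cost and performance are all hypothetical; they only serve to show how a performance measure can combine the energy gained with the energy spent.

```python
def metabolic_cost(speed, resting_rate=0.01, speed_cost=0.05, dt=1.0):
    """Energy used during one time step; faster movement costs more (assumed linear)."""
    return (resting_rate + speed_cost * abs(speed)) * dt

def performance(energy_eaten, speeds):
    """Energy gained from food minus the energy spent while moving."""
    return energy_eaten - sum(metabolic_cost(s) for s in speeds)

# A fast creature that finds slightly more food than a slow one.
fast = performance(energy_eaten=10.0, speeds=[4.0] * 100)
slow = performance(energy_eaten=8.0, speeds=[1.0] * 100)
print(fast < slow)   # True: running costs more than the extra food is worth
```

With these assumed constants, the fast creature ends up with a net loss of energy even though it eats more.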

A Complete Creature

Since we are primarily interested in cognitive abilities, the parts of the creature described above will be held fixed during the development of the various successively more complex nervous systems. In this way, we will know that any altered performance of the creature is a result of its new nervous system and not of some changed property of its body or sensors.

We are now in a position to construct a simple creature which conforms with the principles described in the previous section. The first principle tells us that we should start with a description of the behaviors that we want the creature to use, starting from the most essential and working towards the more complex. The final creature will consist of a set of distinguishable subsystems which we will introduce one at a time. When the creature is complete, we will investigate the various evolutionary sequences which could have constructed that creature.

Let us first construct a complete artificial creature which simply wanders around and explores its environment at random. This example is similar in many respects to the artificial cockroach (Periplaneta computatrix) described by Beer (1990).

Subsystem A: Move A simple explorative behavior consists of two parts, one behavior which moves the creature around and a second which lets it avoid obstacles. Moving around is accomplished by simply giving the motors constant input signals. If the input signals given to the two motors are different, the motor speed will differ and the creature will move in a circle. This is useful since it will make the creature cover a larger portion of the environment than movement in a straight line.

Figure 3.4.3 shows the motor nodes necessary to generate this behavior. The left motor is controlled by the node mL and the right motor by the node mR. The resting levels of the two nodes are shown below each node.

Figure 3.4.3 The 'network' that generates the basic move behavior. Only the two output nodes are needed. Since the resting activity of the left node (1.00) is slightly lower than that of the right (1.01), the creature will move in a large circle.
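How large the circle is can be estimated from the two resting levels. The sketch below assumes that the activity of a motor node is proportional to the speed of its wheel and that the wheels are one length unit apart; both are assumptions made for illustration, and turning_radius is a hypothetical helper.

```python
def turning_radius(v_left, v_right, wheel_base=1.0):
    """Radius of the circle traced by the body center of a two-wheeled creature."""
    if v_left == v_right:
        return float("inf")   # equal speeds give a straight line
    return (wheel_base / 2.0) * (v_right + v_left) / (v_right - v_left)

print(turning_radius(1.00, 1.01))
```

Under these assumptions the radius is roughly one hundred times the distance between the wheels, which is why the path looks almost straight over short distances.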

Figure 3.4.4 shows the path taken by a computer simulated creature with such a behavior. As we can see, it almost immediately gets stuck at a wall. Fortunately, the motors do not run at exactly the same speed and the creature will slowly turn and finally get away. This is possible only since the simulation shown used what we may call 'slippery walls' (see Balkenius 1994a). Such walls let the creature turn even though the force generated by its wheels is in the direction of the wall.

Figure 3.4.4 The route taken by our first computer simulated creature. It will stay in the same path forever.

Subsystem B: Avoid There are two problems with the behavior of our current creature. The first is that it spends most of its time slowly turning at walls, and the second is that it very easily gets stuck in a loop as can be seen in figure 3.4.4. We avoid the first problem by making use of the whiskers. When the left whisker is in contact with a wall, we let the creature increase the speed of the left motor. The right whisker will increase the speed of the right motor in a similar way. The extra signal to the motors will make the creature turn away and successfully avoid obstacles much faster than the previous creature.

This is a simple example of a subsumption hierarchy since the turning behavior overrides the moving behavior. Combining behaviors in this way will be one of the main subjects of chapter 4. There is one more turning behavior to consider, however, and that is the situation when the creature has walked straight into a wall and both whiskers react. Since the speed of both motors will increase, the creature will continue to move straight ahead into the wall.

Figure 3.4.5 The controlling network for Move and Avoid. The input nodes for the whiskers on the left and on the right are called wL and wR. The motor nodes are called mL and mR.

Figure 3.4.6 The path taken by the creature with Move and Avoid behavior.

This situation can be avoided if the connections from the whiskers to the motors have different strengths. In this case, the motors will have different speeds and the creature will slowly turn until only one of the whiskers reacts. When this happens, the creature will turn faster and eventually get away from the wall.

The necessary controller is shown in figure 3.4.5. The touch signals from the two whiskers enter the network at the input nodes wL and wR. When any of these are active, its output signal is sent to the motor nodes mL and mR respectively. The numbers at the connections indicate the connection strengths on the connections from the touch sensors to the motor nodes. The behavior generated by this controller is shown in figure 3.4.6.
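The controller can be written down as two weighted sums. In the Python sketch below, the resting levels are those given for the Move behavior, while the two connection strengths gain_left and gain_right are hypothetical values chosen only so that they differ; the actual strengths are those shown in the figure.

```python
def move_and_avoid(w_left, w_right,
                   rest_left=1.00, rest_right=1.01,
                   gain_left=1.0, gain_right=0.5):
    """Return (mL, mR) given binary whisker signals wL and wR."""
    m_left = rest_left + gain_left * w_left      # left whisker speeds up left motor
    m_right = rest_right + gain_right * w_right  # right whisker speeds up right motor
    return m_left, m_right

print(move_and_avoid(0, 0))   # free space: plain Move behavior
print(move_and_avoid(1, 0))   # wall on the left: left motor faster, turn right
print(move_and_avoid(0, 1))   # wall on the right: right motor faster, turn left
print(move_and_avoid(1, 1))   # head-on: unequal gains still produce a turn
```

Because the two gains differ, the motor speeds remain unequal even when both whiskers react, so the creature keeps turning instead of pushing straight into the wall.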

It is interesting to note that although the control system can be considered behavior-based, there is no strict correspondence between modules in the network and overt behaviors. The same connections in the neural network are used to let the creature turn left or right and to avoid straight-ahead collisions. The combined abilities can be considered a behavior module for different types of avoidance. The network in figure 3.4.5 can be described by the subsumption hierarchy shown in figure 3.4.7.

Figure 3.4.7 The subsumption hierarchy corresponding to the network in figure 3.4.5.

Subsystem C: Explore The path taken by our new creature is much better than the first example since it does not get stuck in loops in the simple example environment. In more complex environments, however, this is still a problem and something more is needed. In Beer (1990) it was suggested that a good exploratory behavior could be constructed by using a very simple strategy. The creature turns slowly to the left or to the right while moving and changes between these two behaviors after random intervals. There are two reasons for using such a behavior. The first is that the randomness of the walk makes sure that the creature will not get stuck and the second is that the large circling movement will eventually cover the whole environment. If we want our creature to explore, this is obviously a good thing.

To make the creature able to avoid walls, we keep the previous architecture and simply add a few neurons that make it change between the turning right and turning left behaviors. The network used here is different from the one described in Beer (1990), but his solution would probably be possible here, too.

Figure 3.4.8 The neural controller for the exploratory behavior Move+Avoid+Explore. The random circular walk is controlled by a stochastic sample-and-hold circuit.

To change between the two behaviors, a random sample-and-hold circuit was constructed (figure 3.4.8). A threshold summing node with feedback, n2, is used to store the current mode of the creature. The state of this node can be either 1 or 0. If this node is active, the creature will turn left and if it is inactive, the creature will turn right. This is our first example of a neural controller with an internal state. However, since the state is set at random, the behavior of the creature is not disrupted if it loses its state for some reason.

To set the turning mode at random, two noisy neurons were used to change between the two states. The first noisy neuron, n0, is active or inactive with equal probability while a second neuron, n1, is allowed to burst with a small probability. Every time n1 generates an output signal, it first resets the summing neuron, n2, and then samples the output of n0 by facilitating it. If n0 is active, the mode neuron, n2, will be set again and otherwise it will stay reset. Since the output of n0 has equal probability for both states, the probability for each of the two modes is the same.
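The logic of the circuit can be followed in a discrete-time Python sketch. This is an abstraction of the three nodes rather than a neural simulation: the function name sample_and_hold, the burst probability of 0.05, and the seeded random generator are our own assumptions.

```python
import random

def sample_and_hold(steps, burst_prob=0.05, seed=0):
    """Simulate the mode node n2 and return its state after every step."""
    rng = random.Random(seed)
    n2 = 0                      # current mode: 1 = turn left, 0 = turn right
    history = []
    for _ in range(steps):
        n0 = 1 if rng.random() < 0.5 else 0         # noisy node, on half the time
        n1 = 1 if rng.random() < burst_prob else 0  # rare burst that triggers sampling
        if n1:
            n2 = 0              # the burst first resets the mode node ...
            if n0:
                n2 = 1          # ... and then lets an active n0 set it again
        history.append(n2)      # without a burst, feedback holds the old state
    return history

modes = sample_and_hold(1000)
switches = sum(1 for a, b in zip(modes, modes[1:]) if a != b)
print(switches)   # number of changes between turning left and turning right
```

A smaller burst_prob gives longer stretches of turning in one direction, since the mode can only change when n1 fires.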

Figure 3.4.9 shows the behavior generated by this controller. As can be seen, it not only covers a large area of the environment and prevents the creature from getting stuck in loops, it also shows some occasional wall following behavior, which is a combination of the turning and obstacle avoidance behaviors. Wall following can readily be considered as an emergent property of the network controller. There is no special wall following module in the nervous system, but the creature will engage in this behavior anyway. Below, we will develop a set of more complex exploratory behaviors.

Figure 3.4.9 The exploratory behavior generated by the nervous system in figure 3.4.8.

Subsystem D: Eat Walking around exploring the environment is of no use in itself, of course. The only reason for our simple creature to do so is that it will be able to find food which is not accessible from its initial location. Eating behavior can easily be added to our creature since it is already equipped with both taste sensors and an eating command node. We let the creature eat as soon as its taste sensors react. We need to add very little to the controller to make this possible. We connect each of the sensors for palatable food with the eating command node as shown in figure 3.4.10. This network works even if it is totally disconnected from the movement controller described above. This would be a simple example of completely distributed control.

Figure 3.4.10 Each of the food detectors for palatable food starts eating behavior.

Subsystem E: Stop at Food A better behavior is generated if the eating system is allowed to inhibit the exploratory behavior, letting our creature slow down when it tastes food. This will make it possible to eat more than if eating has to be done, so to speak, on the fly. This admittedly simple view of eating is perhaps sufficient for a very simple creature, but we will see in chapter 6 that much more complexity is needed to generate a good eating behavior.
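Subsystems D and E together amount to very little computation, as the following Python sketch suggests. The function name eat_and_slow, the multiplicative form of the inhibition, and its strength are assumptions made for the example, and all detectors passed in are taken to signal palatable food.

```python
def eat_and_slow(food_detectors, m_left, m_right, inhibition=0.8):
    """Return (Eat, mL, mR) given the outputs of the food detectors."""
    eat = 1 if any(food_detectors) else 0      # any taste signal starts eating
    # Subsystem E: the eating system inhibits the motors, slowing the creature.
    scale = 1.0 - inhibition * eat
    return eat, m_left * scale, m_right * scale

print(eat_and_slow([0, 0, 0, 0], 1.00, 1.01))   # no food: keeps exploring
print(eat_and_slow([0, 1, 0, 0], 1.00, 1.01))   # food detected: eats and slows down
```

Setting inhibition to 0 recovers Subsystem D alone, where eating and movement are completely independent.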

Figure 3.4.11 A goal directed network. The network guides the creature toward objects that smell of smell 0 or smell 1.

Subsystem F: Goal Direction Our current creature will have great trouble finding food since its movement is not at all goal directed in the sense that it specifically moves the creature towards food. Such a behavior can be constructed by using smell cues in the environment. Since the two smell receptors are placed apart from each other on each side of the body, the differences between the left and right smell can be used to guide the creature towards food. Figure 3.4.11 shows a network based on this idea.

Figure 3.4.12 Goal-directed behavior. The creature turns towards food when it is sufficiently close to smell it.

When the smell of food is more intense on the left than on the right, the right motor will increase its speed more than the left one and the creature will turn left towards the food. This design was suggested by Braitenberg (1984) as an example of the simplest way in which goal direction could be constructed. There exist a number of variations on this simple circuit and we will consider them at length in chapter 4.
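The crossed wiring can be sketched in a few lines of Python. The function name goal_direction and the values of base and gain are hypothetical; only the crossing of the connections is taken from the design described here.

```python
def goal_direction(smell_left, smell_right, base=1.0, gain=1.0):
    """Crossed excitatory connections from the smell sensors to the motors."""
    m_left = base + gain * smell_right   # right sensor drives the left motor
    m_right = base + gain * smell_left   # left sensor drives the right motor
    return m_left, m_right

m_left, m_right = goal_direction(smell_left=0.6, smell_right=0.2)
print(m_right > m_left)   # True: the creature turns left, toward the stronger smell
```

Both motors also speed up as the smell gets stronger, so the creature accelerates while it approaches the source.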

The creature with this goal directing system uses an approach-consummation strategy. When it smells food, it will turn towards it and increase its speed until it reaches the food where it will slow down and start to eat. When all the food is eaten or when it has left the food patch, it will continue to explore the environment (figure 3.4.12).

Figure 3.4.13 A complete creature which uses many of the mechanisms described in the text. Note that the smell sensors are connected to the motors on the opposite side of the creature.

Subsystem G: Central Engagement Selection There is one problem with our current creature. It will never stop eating if there is food around since it has no way of knowing if it needs more food or not. A way to avoid this problem is to let the creature monitor whether it needs more energy or not. Figure 3.4.13 shows an example of such an architecture. Two need sensors, e0 and e1, are added which react when the creature has no need for a certain food type. These sensors inhibit the food detectors as well as their corresponding smell receptors. As a result, the creature is insensitive to the types of food it does not currently need and will consequently ignore them. This is a simple form of motivationally biased perception. In chapter 7, this type of mechanism will be the basis for a discussion of attention.
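The inhibitory influence of a need sensor can be sketched as a simple gate. The multiplicative form and the function name gate_by_need are our own assumptions; the sketch only shows how a signal is suppressed when the corresponding need sensor reacts.

```python
def gate_by_need(signal, satiated):
    """Inhibit a food or smell signal when the need sensor reports no need."""
    return signal * (1 - satiated)

# A need sensor output of 1 means "no need for this energy type".
print(gate_by_need(0.7, satiated=1))   # 0.0: the smell is ignored
print(gate_by_need(0.7, satiated=0))   # 0.7: the smell passes through
```

Applied to both the smell receptors and the food detectors for one food type, this gate makes the creature neither approach nor eat food that it does not currently need.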

The system consisting of the two need detectors and their inhibitory influences on the rest of the controller is a simple system for central behavior selection. We see that this system only shows one of the three aspects of such a system, namely internal drive. The introduction of internal and external incentives will be postponed until chapter 6, where their relation to motivation will be investigated.

Conclusion

This section described a complete creature showing how the design principles introduced in section 3.2 could be incorporated in a very simple creature. Our example creature implemented the principles of behavior-based control, subsumption, parallel engagements, and central engagement selection. Most of all, it is entirely reactive. In the next chapter, we will take a closer look at different types of reactive behavior. In the later chapters of the book, we will take a closer look at learning, motivation and perception in general, and in our artificial creature in particular.



This text is an excerpt from:
Natural Intelligence in Artificial Creatures
© 1995 by Christian Balkenius
Lund University Cognitive Studies 37
ISBN 91-628-1599-7
ISSN 1101-8453
ISRN LUHFDA/HFKO--1004--SE
The printed version of the book can be ordered from:
Lund University Cognitive Science
Kungshuset, Lundagård
S-222 22 LUND
Sweden
or by e-mail to:
sekreteraren@lucs.lu.se

christian.balkenius@lucs.lu.se