Alex's Notes

Task Environments

Source

Presented in Russell Norvig Chapter 2: Intelligent Agents

Definition

Task Environments are essentially the problems to which rational agents are the solutions.

Specification

We can specify a task environment by giving its PEAS description, i.e. the performance measure, the environment, the actuators, and the sensors:

  • First, what is the performance measure to which we would like our agent to aspire?

  • Second, what is the environment the agent will face?

  • Third, what actuators will the agent need in order to act on the environment?

  • Finally, what sensors will the agent need in order to receive input from the environment?
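As a concrete sketch, the four PEAS components can be written down as a simple record. This is only an illustration: the field values below are an abbreviated rendering of RN's automated-taxi example, not a complete specification.

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A task-environment specification:
    Performance measure, Environment, Actuators, Sensors."""
    performance: list[str]
    environment: list[str]
    actuators: list[str]
    sensors: list[str]

# RN's automated-taxi example, abbreviated
taxi = PEAS(
    performance=["safe", "fast", "legal", "comfortable trip"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering", "accelerator", "brake", "signal", "horn"],
    sensors=["cameras", "GPS", "speedometer", "accelerometer"],
)
```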

On p. 61, figure 2.5, RN sketch the task environments for a range of agents. The virtual task environments of software agents can be just as complex as the physical environments of robots. E.g. a bot that trades on an auction platform may be dealing with millions of other users and billions of objects.

Properties of Task Environments

Although the range of task environments in AI is obviously vast, RN present a number of dimensions that enable us to categorize them.

These dimensions determine, to a large extent, the appropriate agent design and the applicability of each of the main families of implementation techniques. This informal introduction is developed more formally in later chapters:

Fully observable vs. partially observable

If an agent’s sensors give it access to the complete state of the environment at each point in time, then we say that the task environment is fully observable. A task environment is effectively fully observable if the sensors detect all aspects that are relevant to the choice of action; relevance, in turn, depends on the performance measure. In a fully observable environment, the agent need not maintain any internal state to keep track of the world.

An environment may be partially observable because of noisy and inaccurate sensors, or because parts of the state are simply missing from the sensor data. If the agent has no sensors at all the environment is unobservable. The agent’s plight is not necessarily hopeless in such cases.
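To make the internal-state point concrete, here is a minimal sketch. The two-square vacuum world is a standard RN example, but this particular agent code is my own illustration: because the sensor reports only the current square, the agent must remember which squares it has already confirmed clean.

```python
class StatefulVacuumAgent:
    """Agent for a partially observable two-square world.

    The sensor reports only (location, dirty) for the current square,
    so the agent keeps internal state built from its percept history."""

    def __init__(self):
        self.known_clean = set()  # squares confirmed (or made) clean

    def act(self, percept):
        location, dirty = percept
        self.known_clean.add(location)  # clean now, or clean after sucking
        if dirty:
            return "Suck"
        # Move toward the square we have not yet confirmed clean
        return "Right" if location == "A" else "Left"

agent = StatefulVacuumAgent()
print(agent.act(("A", True)))   # -> Suck
print(agent.act(("A", False)))  # -> Right
```

In a fully observable version of this world the `known_clean` set would be unnecessary: the percept itself would report the status of both squares.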

Single agent vs. multiagent

This may seem an obvious distinction: e.g. an agent solving a sudoku puzzle by itself is in a single-agent environment, while an agent playing chess is in a two-agent environment.

But when must we view another entity as an agent? E.g. must an automated taxi treat another car as an agent?

The key distinction is whether B’s behaviour is best described as maximizing a performance measure whose value depends on agent A’s behaviour.

For example, in chess the opponent B is trying to maximize its performance measure at agent A’s expense. Chess is therefore a competitive multiagent environment.

By contrast, when driving, avoiding collisions maximizes the performance measure of all agents (typically), so driving is at least partially a cooperative multiagent environment.

Agent-design issues in multiagent environments are often different from those in single-agent environments. Communication often emerges as rational behaviour, for example, while in competitive settings randomized behaviour can be valuable because it avoids predictability.

Deterministic vs. nondeterministic

If the next state of the environment is completely determined by the current state and the action executed by the agent(s), then we say the environment is deterministic; otherwise it is nondeterministic. (p. 63)

An agent need not worry about uncertainty in a fully observable, deterministic environment. However, if the environment is only partially observable, it may still appear to be nondeterministic. Most real situations are so complex that they must be treated as nondeterministic.

The word stochastic is used by some as a synonym for “nondeterministic”, but we make a distinction between the two terms; we say that a model of the environment is stochastic if it explicitly deals with probabilities… and “nondeterministic” if the possibilities are listed without being quantified.
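The difference in representation can be sketched directly; the states, actions, and probabilities below are invented for illustration:

```python
# Deterministic: each (state, action) pair yields exactly one next state.
deterministic = {
    ("dry_road", "brake"): "stopped",
}

# Nondeterministic: the possibilities are listed, but not quantified.
nondeterministic = {
    ("wet_road", "brake"): {"stopped", "skid"},
}

# Stochastic: the same possibilities, with explicit probabilities.
stochastic = {
    ("wet_road", "brake"): {"stopped": 0.75, "skid": 0.25},
}
```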

Episodic vs. sequential

In an episodic task environment, the agent’s experience is divided into atomic episodes. In each episode the agent receives a percept and then performs a single action. Crucially, the next episode does not depend on the actions taken in previous episodes. (p. 63)

Many classification tasks are episodic: the agent receives an input to classify, does so, and then moves on. Its decision on the next item does not (usually) depend on the previous one.

In sequential environments, on the other hand, the current decision could affect all future decisions.

E.g. in chess or driving, a move could affect all future moves. Episodic environments are simpler because the agent doesn’t need to think ahead.
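A sketch of the contrast (both agents are hypothetical): the episodic classifier’s output depends only on the current percept, while the sequential driver’s next decision depends on state accumulated from its earlier actions.

```python
# Episodic: each decision stands alone.
def classify_part(percept):
    # hypothetical defect detector: looks only at the current item
    return "defective" if percept["scratch_length_mm"] > 5 else "ok"

# Sequential: the current decision affects all future ones.
class Driver:
    def __init__(self):
        self.speed = 0  # carried between decisions

    def act(self, percept):
        if percept == "clear" and self.speed < 50:
            self.speed += 10
            return "accelerate"
        if percept == "obstacle":
            self.speed = max(0, self.speed - 30)
            return "brake"
        return "coast"

d = Driver()
print(d.act("clear"))     # -> accelerate (speed is now 10)
print(d.act("obstacle"))  # -> brake (speed back to 0)
```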

Static vs. dynamic

If an environment can change while the agent is deliberating, it is dynamic; otherwise it is static. Static environments are easier to handle because the agent need not keep watching the world while deliberating, nor worry about the passage of time.

Dynamic environments keep demanding action from the agent: if an agent is still deliberating, it is effectively deciding to do nothing. If the environment itself doesn’t change with the passage of time but the agent’s performance score does, we say the environment is semidynamic.

Driving is clearly dynamic, crossword puzzles are static, and chess played with a clock is semidynamic.

Discrete vs. continuous

The discrete/continuous distinction applies to the state of the environment, to the way time is handled, and to the percepts and actions of the agent. (p. 64)

E.g. chess is largely discrete: there is a discrete set of percepts and actions, and a finite number of game states (chess clocks aside). Driving is a continuous-state and continuous-time problem, and driving actions are continuous too.

Known vs. unknown

In a known environment, the outcomes (or outcome probabilities) for all actions are given. If the environment is unknown, the agent will have to learn how it works to make good decisions.

This distinction is not the same as that between fully and partially observable environments. E.g. a game of solitaire may be fully known (the rules are given) but only partially observable. Conversely, a new computer game may be unknown (what do the buttons do?) but fully observable (the whole map is visible).

The performance measure itself may not be known. E.g. a personal assistant may have to learn the preferences of its owner. This makes such cases inherently multiagent.

Summary

The hardest case, exemplified by driving, is a task environment that is partially observable, multiagent, nondeterministic, sequential, dynamic, continuous, and unknown.

Figure 2.6 on p. 65 lists some common tasks and their place on these dimensions. For example:

  • Crossword: fully observable, single-agent, deterministic, sequential, static, discrete

  • Poker: partially observable, multiagent, stochastic, sequential, static, discrete

  • Medical diagnosis: partially observable, single-agent (with caveats), stochastic, sequential, dynamic, continuous

  • English tutor: partially observable, multiagent, stochastic, sequential, dynamic, discrete
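These classifications can be recorded as a small lookup table; a sketch whose entries simply transcribe the list above:

```python
# Figure 2.6 examples, transcribed from the notes above.
ENVIRONMENTS = {
    "crossword": {"observable": "fully", "agents": "single",
                  "uncertainty": "deterministic", "episodes": "sequential",
                  "dynamics": "static", "state": "discrete"},
    "poker": {"observable": "partially", "agents": "multi",
              "uncertainty": "stochastic", "episodes": "sequential",
              "dynamics": "static", "state": "discrete"},
    "medical diagnosis": {"observable": "partially", "agents": "single",
                          "uncertainty": "stochastic", "episodes": "sequential",
                          "dynamics": "dynamic", "state": "continuous"},
    "english tutor": {"observable": "partially", "agents": "multi",
                      "uncertainty": "stochastic", "episodes": "sequential",
                      "dynamics": "dynamic", "state": "discrete"},
}

def properties(task):
    """Look up a task's position on the six dimensions."""
    return ENVIRONMENTS[task]
```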

Finally, RN observe that experiments are often carried out not in a single environment but in an environment class. For driving, for example, we’d want to test the agent on multiple roads and in multiple conditions.