[ /js/ai ]

A few years back, I met a (friend of a) friend's boyfriend who had gone to school for CS and worked mostly in graphics. This was before I had started going to University, and programming was entirely a hobby thing.

We talked about AI, which is still one of my interests. I went home that night with a hardcover textbook: Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig. I quickly found a more up-to-date version in PDF form, and eventually returned the hardcover.

I read through the book in its entirety over a month or so, and as I've become a better programmer, I've gone back and reread specific sections.

The other day, I decided to start rereading it again, this time completely. Now that I have this blog mostly functioning as I want it to, I figured I might as well use it to keep track of whatever information seems relevant to my other programming tasks.

Rational Agents

One of the core ideas presented in the book is that most programs can be thought of in the same terms which we use to depict ourselves. They perceive things, they reflect on them, and they respond in some way.

Rationality should not be confused with omniscience, however. This is something I struggle to explain to clients when they ask for certain features. Computers outperform humans in so many fields, using methods which are indistinguishable from magic, that many people think any task is only a few lines of code away from perfection.

The truth is that the world is a very uncertain place, and even a rational agent must expect that not everything will go as planned. We are called rational if, given the information known to us, we act in such a way as to maximize the expected outcome of our actions. This is in contrast to the actual outcome of our actions.

The following paragraph is excerpted from the text:

We need to be careful to distinguish between rationality and omniscience. An omniscient agent knows the actual outcome of its actions and can act accordingly; but omniscience is impossible in reality. Consider the following example: I am walking along the Champs Elysees one day and I see an old friend across the street. There is no traffic nearby and I'm not otherwise engaged, so, being rational, I start to cross the street. Meanwhile, at 33,000 feet, a cargo door falls off a passing airliner, and before I make it to the other side of the street I am flattened. Was I irrational to cross the street? It is unlikely that my obituary would read "Idiot attempts to cross street."

A Learning Agent

There is an idea, articulated by the philosopher Daniel Dennett, that many people find quite incomprehensible:

Charles Darwin and Alan Turing, in their different ways, both homed in on the same idea: the existence of competence without comprehension.

If you can understand that a (robotic) agent of limited knowledge can still be considered rational, then it shouldn't be a surprise to think the same of its creator. An artificer cannot possibly foresee all the circumstances their creation will encounter. Once they realize this, however, they must take it into account in the design process.

It should not surprise you that a chess-playing machine can easily defeat the individual responsible for building it. If it does, consider the nature of surveillance. We are not surprised when a camera sees further, sees more clearly, or remembers an image far more accurately over time than the person who designed it. Our devices are not crafted in our image, as the saying goes.

A rational agent unable to add to its store of percepts is a fragile kind of intelligence.

PEAS

PEAS is an acronym for one of the most common models used in designing artificial intelligence. An agent can be thought of in these simple terms:

  1. Performance: the definition of a measurement by which an agent can be judged to have succeeded. Generally you want this measure to be continuous, not discrete.
  2. Environment: the circumstances under which the agent will have to perform. An agent may be embodied in some physical space, or its operational environment may be entirely virtual.
  3. Actuators: whatever device(s) with which the agent is equipped to cause change in its environment.
  4. Sensors: whatever device(s) with which the agent is equipped to detect qualities in its environment, or changes to them.
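
As a concrete sketch, here's how a PEAS description for the kind of web server agent built later in this post might be written down. All of the field values here are hypothetical:

// a hypothetical PEAS description for a route-serving web agent
var peas={
  performance:"average response time per request (lower is better)"
  ,environment:"incoming HTTP requests arriving over a socket"
  ,actuators:"responses written back to the client"
  ,sensors:"the url, method, and headers of each request"
};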

While connectionist models can still be evaluated according to this framework, it must be understood that connectomes are not necessarily designed.

I'm going to focus on this model of intelligence in my explorations of AI, since for most problems on the web there exists some comprehensible algorithmic solution. More often than not, these algorithms can be fine-tuned using some domain-specific heuristic which a web programmer should be well prepared to define.

A reflex agent

A reflex agent responds to specific events with specific actions. The response action isn't necessarily deterministic, but the function mapped to the event should be static.

Simple IRC bots are prime examples of reflex agents. They need not have any state beyond their connection info. They listen to their environment (a set of IRC channels), and respond to messages (percepts gathered by their sensors) if the message matches some condition set by their programming. They do not necessarily need to reflect on their past percepts.
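
A minimal sketch of that kind of bot logic might look like the following; the triggers and the say callback are invented for illustration:

var triggers=[ // a static mapping from conditions to actions
  { reg:/^!ping$/i, f:function(say){ say("pong"); } }
  ,{ reg:/^!time$/i, f:function(say){ say(new Date().toString()); } }
];

var onMessage=function(text,say){ // called once per percept
  triggers.forEach(function(t){
    if(text.match(t.reg)) t.f(say); // react, with no memory of past percepts
  });
};

// onMessage("!ping",console.log); // prints "pong"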

I don't mean that an IRC bot can't reflect, just that the really basic ones don't. If you're an IRC user, you know how annoying these bots can be. For example:

HypeIRC is a great place to play around if you're a new-ish programmer. IRC bots tend to be a really exciting thing for new-ish programmers, so channels on HypeIRC occasionally end up with as many bots as humans, or more. Most bots aren't programmed to distinguish between other bots and humans. The consequences of this are sometimes hilarious.

Many bots are programmed to add new triggers to their vocabulary. You can see cjd using this function below to commemorate Rob Ford's reelection (as a councillor, not Mayor).

20:53 <@ircerr> we just reelected a crackhead... for something
21:49 <@derp> ircerr, just because you smoke crack doesnt make you a bad person. 
21:49 <@derp> their are plenty of successful crackheads like lindsey lohan, every rapper, politicians
21:51 <@derp> i never smoked crack, not like you have video evidence to prove it
21:52 <@ircerr> crack in toronto (or any street) is hardly cocaine at all
21:52 <@ircerr> ya dont do a blast of coke then go jonesing again before ya put the pipe down. crackheads here do
02:25 <@cjd> %crack
02:25 < finnbot> cjd: Error: "crack" is not a valid command.
02:25 <@cjd> %learn crack as 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person.
17:22 <@cjd> oh that's right, he's more of a powder guy than a crackhead

Remembering that cSmith had launched forkbot (built on the same codebase, Supybot), I decided to have some fun.

02:26 < ansuz> forkbot: learn crack as %crack
02:27 < ansuz> forkbot: crack
02:27 < forkbot> %crack
02:27 < finnbot> forkbot: crack could be 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person..
02:27 < forkbot> %crack
02:27 < finnbot> forkbot: crack could be 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person..
02:27 < forkbot> %crack
02:27 < finnbot> forkbot: crack could be 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person..
02:27 < forkbot> %crack
02:27 < finnbot> forkbot: crack could be 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person..
02:27 < forkbot> %crack
02:27 < finnbot> forkbot: crack could be 02:46 <@derp> ircerr, just because you smoke crack doesnt make you a bad person..
02:27 < forkbot> %crack
...

They went on like that for a while. Thankfully, Supybot does have some internal state, and it does reflect on it. One of the bots was able to realize that it had received way too many commands from a single user in way too short a time, and it decided to break the loop by ignoring the other bot for a short cooldown time. cSmith promptly set forkbot not to follow commands that I sent.

Of course, I just logged in from a different IP which had not been blocked to make a point, after which he unblocked me.

The moral of the story is that dumb-bots are really easy to abuse. They don't necessarily need to reflect on all their actions in sequence to avoid this kind of loop, though. They can simply be programmed to fail a percentage of the time. If one in five messages is ignored at random, the loop would be expected to break within about five exchanges. If you were still dead set on torturing bots, you could just throw a third bot into the fray.
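
That escape hatch is easy to sketch. If each incoming message is dropped with probability one in five, the expected number of exchanges before the loop breaks is five. The handler below is hypothetical:

var DROP_RATE=0.2; // ignore roughly one in five messages at random

var maybeRespond=function(text,say){
  if(Math.random()<DROP_RATE) return; // stay silent; this is what breaks feedback loops
  if(text.match(/^%\w+/)) say(text); // the kind of naive echo that starts them
};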

Model-based reflex agents

A web server is an excellent opportunity to flex some "AI muscle". I'm not suggesting we over-engineer every piece of web technology. I mean to say that for certain applications where the programmer cannot foresee the exact use cases of an application, it may be worthwhile to design some adaptive behaviour into its code. A server is a perfect example of a model-based reflex agent, which reflects on past percepts and modifies the way it acts in response to perceived events within its environment.

Let's keep things simple and say that we have some server which will receive requests to serve some information. All we want the server to do is serve it quickly and accurately.

A typical approach is for a server to map routes to functions. Each time the server receives a request, it compares the URL against a series of strings or regular expressions.

Most routers are written such that each comparison occupies a fixed position in a stack. The router begins at the top of the stack, and continues toward the bottom until a comparison succeeds. The function mapped to the matching route is then executed, with the request and response objects as arguments. A response is constructed, and sent back to the client.

var http=require("http");

var routes=['one','two','three']
  .map(function(x){ // for each string in the array
    return new RegExp('^\/'+x+'$','i'); // create a case insensitive regular expression
  }); // assign the resulting array to 'routes'

var responses=['pew','pewpew','pewpewpew']
  .map(function(x){
    return function(req,res){
      res.write("<p>"+x+"</p>");
      res.end();
    };
  });

var depth=routes.length;

var router=function(req,res){ // process a request, return a response
  res.still=true; // track whether we are still working on the response
  res.index=0; // track which index in the stack we've reached
  while(res.still&&res.index<depth){ // while there are routes left to check
    if(req.url.match(routes[res.index])){
      responses[res.index](req,res);
      res.still=false; // matched; stop searching the stack
    }
    res.index++;
  }
  if(res.still){ // no route matched; respond with a 404
    res.write("<p>404");
    res.end();
  }
};

http.createServer(router).listen(8080);

There is room for improvement here. If we can safely assume that each of these routes is exclusive, that is, if it can be ensured that any valid url can match (at most) one of the routes, then the order of the routes is subject to optimization. In other words, the way that the comparisons are sequenced initially may result in a large number of unnecessarily failed matches.

What if our server were coded in such a way that it could improve its own behaviour? What if it could reflect on its percepts (information gathered via its sensors from its environment), and modify its actions so as to better satisfy its performance metric?

In this situation, the agent's actions and environment are both virtual. It doesn't have robotic arms with which it can lift heavy weights, but it may have access to a filesystem, or database, or other online services. Accessing each of these resources takes time (a fact obscured by the intangibility of microprocessors) and that affects its performance measure. Remember, its goal is to "serve client requests quickly" without sacrificing the validity or completeness of the information served.

It is up to the engineer to design a heuristic which (more often than not) results in a faster response time than if that heuristic were not implemented (taking into account how long it takes for the heuristic to be applied against the data). In very broad terms, it shouldn't cause any major controversy to say that the route with the most views should be at the top of the stack.

Here is an example of how that could be done:

var http=require("http");

var routes=['one','two','three']
  .map(function(x){ // for each string in the array
    return {
      reg:new RegExp('^\/'+x+'$','i') // create a case insensitive regular expression
      ,rank:0
      ,f:function(req,res){
        res.write("<p>"+x.toUpperCase());
        res.end();
      }
      ,id:x
    };
  }); // assign the resulting array to 'routes'

var depth=routes.length;

var count=0;

var optimize=function(){
  console.log("optimizing the route stack");
  routes.sort(function(a,b){
    return b.rank-a.rank;
  });
  console.log(JSON.stringify(routes));
};

var router=function(req,res){ // process a request, return a response
  res.still=true; // track whether we are still working on the response
  res.index=0; // track which index in the stack we've reached
  if(count++%5===0) // re-rank the stack every fifth request
    optimize();
  while(res.still&&res.index<depth){ // while there are routes left to check
    if(req.url.match(routes[res.index].reg)){
      routes[res.index].rank++; // reflect: record this route's popularity
      routes[res.index].f(req,res);
      res.still=false; // matched; stop searching the stack
    }
    res.index++;
  }
  if(res.still){ // no route matched; respond with a 404
    res.write("<p>404");
    res.end();
  }
};

http.createServer(router).listen(8080);

The key thing to take away here is that while even the simplest webserver can be conceived of as an agent, it can only really be regarded as a rational agent if it can gather and reflect on its past perceptions and use that information to improve its expected future performance.

This is considered a model-based agent, because its internal state tracks something about the external world (or at least, the agent's environment). Specifically, the agent is modeling the popularity of each route. It doesn't need to understand what popularity is, or why a route is popular, though a more advanced agent could infer what popularity depends on from common traits between popular routes. What is important is that its behaviour brings it closer to some desired goal state. If the agent manages to lower the average response time of a transaction, it has responded rationally to its environment.

Goal-based agents

A goal-based agent is generally capable of far more complex tasks than either type of reflex agent, by virtue of its capacity to plan ahead.

Before trying to build such an agent, we must have at least a modest conception of what planning is. We can define it in terms of the agent's requisite abilities:

  1. It must have some knowledge of its own abilities.
  2. It must be able to combine these abilities into sequences, generating a set of possible plans.
  3. It must be able to forecast the outcome of each potential action, or series of actions.
  4. It must have a formalized definition of its desired outcome, or goal.
  5. It must be able to recognize whether any of these candidate plans is guaranteed, or at least likely, to lead to a desired outcome.

Lacking any of these components, we are unable to build a functioning goal-based agent. If these conditions can be met, however, we can then reason about how the agent might achieve its goals in a reasonable amount of time.
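
To make those requirements concrete, here is a minimal sketch of a planner as a breadth-first search: the agent knows its actions (1), chains them into sequences (2), forecasts each action's outcome (3), tests states against a goal (4), and returns the first sequence found to reach it (5). The toy state space of numbers is invented for illustration:

var actions=[ // the agent's knowledge of its own abilities
  { name:"increment", apply:function(n){ return n+1; } }
  ,{ name:"double", apply:function(n){ return n*2; } }
];

var plan=function(start,isGoal){ // breadth-first search for a plan
  var frontier=[{ state:start, steps:[] }];
  var seen={};
  while(frontier.length){ // note: this sketch assumes the goal is reachable
    var node=frontier.shift();
    if(isGoal(node.state)) return node.steps; // a sequence that reaches the goal
    if(seen[node.state]) continue;
    seen[node.state]=true;
    actions.forEach(function(a){ // forecast the outcome of each action
      frontier.push({
        state:a.apply(node.state)
        ,steps:node.steps.concat(a.name)
      });
    });
  }
  return null; // no sequence of actions reaches the goal
};

// plan(3,function(n){ return n===16; }) => ["increment","double","double"]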

Utility-based agents

A goal-based agent is able to determine whether a specific sequence of actions will result in the successful completion of its task. This behaviour is sufficient in circumstances where it is reasonable to accept the first such sequence of actions which will satisfy its requirements.

In most situations, however, there are sequences of actions that may be preferable to the first plan found. This is less common in the realm of virtual agents, as they are concerned primarily with computational time, not physical costs. Once such an agent has found an answer to a question, there is little point in searching for better ways to find the same answer.

Suppose you were creating an agent suited for the task of deciding optimal paths. Maybe you want it for providing directions for humans to take when travelling, like the Google Maps™ engine, or maybe you are trying to route packets through a complex mesh network. In such situations, the agent can weigh its options ahead of time, then decide on a path, then act.

Given that there may be many actions that will satisfy its goals, and some of those actions will result in better performance, the agent needs some means of deciding on the quality of each action. The quality of each action can be determined according to an equation describing the economy of its choices. The outcome of any action has a value, and each step along the path to its completion has a cost.

The agent's program must contain specifications for how these factors balance out, or provide a means of learning how to adapt to a shifting economy. The specifications encoded into the agent will ultimately determine how it behaves. Incomplete or inaccurate specifications are likely to produce a perverse incentive, resulting in an agent that does what you asked it to do, but not what you want.

I'm going to build on the example of a packet-routing agent, since it's a problem I'm quite interested in. What are our requirements?

  1. Our packets must reach the target node intact, assuming a path exists between our node and the target node.
  2. If a path does not exist, our node must be able to determine whether this is the case in a reasonable amount of time.
  3. Our node must be able to decide on the preferable outcome without flooding the network with packets.

These requirements must be better defined. An agent cannot understand terms such as "reasonable" or "preferable". If we were able to better define those terms, we would have a basis for creating a goal-based agent. To create a utility-based agent, we will need a method of ranking paths. What factors might help us decide which path is best?

  1. The total latency of the path.
  2. The number of hops along the path.
  3. The monetary cost of transit along the path.

What is the significance of each of these factors?

Depending on the type of traffic you are sending, the smallest amount of latency might greatly affect the quality of the application sending the packets. For instance, realtime games played over a persistent socket connection may be won or lost on the basis of a split-second decision. Delays mean (simulated) life-or-death. Conversely, if your traffic is something like a routine server backup that runs in the background, you may not be at all concerned about latency.

The number of hops along a path matters because each step introduces computational overhead. If every node in the network chose paths with shorter latency, but more hops, then the computational load on any particular node in the net would, on average, be much higher. If such a situation seems unlikely, consider the coastline paradox. A path with more hops may better approximate the real-world path than a few long-distance hops along a physical wire.

The monetary cost of transit can be difficult to predict. Prices are unlikely to be uniform throughout the net. Each edge can be affected by up to four distinct prices, since each of its two nodes most likely has its own terms for transmitting data, and uploads and downloads may incur differing costs. In addition, traffic flow may not be uniform, depending on the roles of either node. In any case, for a packet to successfully traverse an edge, one node must upload it, and another must download it.
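
To tie these factors together, the ranking method might be sketched as a weighted sum, where the engineer tunes the weights to the priorities of the traffic being routed. The numbers below are arbitrary placeholders:

var weights={ latency:1.0, hops:0.5, cost:2.0 }; // arbitrary; a realtime game would raise 'latency'

var utility=function(path){ // higher is better
  return -( weights.latency*path.latency // total latency along the path
    + weights.hops*path.hops // computational overhead per hop
    + weights.cost*path.cost ); // monetary cost of transit
};

var best=function(paths){ // choose the path with the highest utility
  return paths.slice().sort(function(a,b){
    return utility(b)-utility(a);
  })[0];
};

// best([{latency:40,hops:3,cost:1},{latency:25,hops:9,cost:2}]) => the second path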