In attempting to flesh out the concept of search as it applies to simple agents, I decided at one point that a useful model of behavior was one of “context, action, and effect”: that is, “Where am I now?”, “What can I do from here?”, and “Where will I be once I do that?” From a graph search perspective, the contexts are the nodes of the graph, the actions are the (directed) edges, and the effects are the destination nodes. Physical search would resemble something like a coloring traversal, in that the agent can only occupy one context (place) at a time, but will remember previously visited nodes. Mental search, on the other hand, could involve activating more than one context at once, allowing the equivalent of breadth-first/A* searches. The context nodes, if not also the actions, would no doubt require “fuzzy” matching of some sort: new input combinations would be compared to existing contexts, with the result of either creating a new context or adjusting an existing one. It would also be possible to consider percentage matching of multiple nodes simultaneously, which starts to remind one of a Markov model. Similarly, one could imagine multiple simultaneous searches–orthogonal or otherwise–as well as “templating” behaviors to combine different simultaneous contexts. With templating to allow the inclusion of “variables,” the process starts to resemble Prolog’s model of computation with backtracking search.
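A minimal sketch of the “context, action, and effect” model as a graph, with mental search done breadth-first. The toy graph, its node and action names, and the function `mental_search` are all made up for illustration; exact string matching stands in for the fuzzy matching that real contexts would need.

```python
from collections import deque

# Hypothetical toy world: context -> {action: effect (the resulting context)}.
GRAPH = {
    "nest": {"go_east": "hallway"},
    "hallway": {"go_west": "nest", "go_north": "pantry", "go_east": "dead_end"},
    "dead_end": {"go_west": "hallway"},
    "pantry": {},
}


def mental_search(graph, start, goal):
    """Breadth-first search over contexts.

    Returns the list of actions leading from start to goal, or None if the
    goal cannot be reached. Several contexts are "active" at once in the
    frontier, which is what distinguishes this from physical traversal.
    """
    frontier = deque([(start, [])])
    visited = {start}  # the "coloring": contexts already considered
    while frontier:
        context, actions = frontier.popleft()
        if context == goal:
            return actions
        for action, effect in graph.get(context, {}).items():
            if effect not in visited:
                visited.add(effect)
                frontier.append((effect, actions + [action]))
    return None


plan = mental_search(GRAPH, "nest", "pantry")
print(plan)  # ['go_east', 'go_north']
```

A physical search would instead hold a single current context and move along one edge per step, keeping only the `visited` set as memory.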
The last thing I was thinking about when I left off was the emergence of search algorithms in nature and their relationship to consciousness. The idea of consciousness as a search process is nothing new, and search is one of the most fundamental brain-controlled behaviors: witness the experiments with rats in mazes. The questions, then, are these: How did search evolve or how does it emerge from simpler behaviors, such as reflexive reaction to the presence of food? What is the simplest organism that can search, and what are the simplest other behaviors that require neural control? How did physical search evolve into mental search, allowing organisms to explore routes without physically traversing them? How did concrete mental search evolve into abstract search, where the nodes being traversed no longer represent physical locations and actions, but rather more general concepts? How does concrete or abstract search lead to the synthesis of new concepts, which themselves become part of the search space?
The approach I’m interested in taking is one that focuses on the core, rather than the periphery, of the cognition process. I don’t expect or desire to solve the problem of computer vision or speech recognition, for example; I intend to assume that those problems can be solved and provide input to agents in high level symbolic form. Similarly, I’m not interested in modelling cognition at the neuronal level, but at a level that bridges the gap between the symbolic and neuronal layers. I’m tempted to say that I’m interested in a holistic approach, except that too often people who espouse holism are those who can’t be bothered to understand the parts of a system and thus make claims to mystical knowledge of the whole.
In keeping with the sentiment behind my last post, I left High Fidelity recently in order to take another stab at building something on my own while my mind is still relatively flexible. I’ve decided to abandon the game project and focus exclusively on artificial intelligence, since that’s what I consider “important,” in the sense that if there’s any chance at all of even the slightest bit of success, then I can’t think of anything more worthwhile for me to work on. Unfortunately, unlike the game project, it’s far more difficult simply to begin programming, and I have a difficult time measuring productivity in terms of anything other than code output–not to mention that I enjoy the process of coding much more than I do writing about or talking about code, and that I feel adrift without a stable code base and the daily routine of contributing to it. So, my immediate goal is to reach a point at which I can begin implementation, even if it’s just a framework or test bed for further development.
In the course of implementation, it would benefit me to seek out opportunities to learn new technology, so as to expand my marketable skill set in preparation for the next time I seek employment. I’ve been thinking that when I do that, I’ll broaden my search beyond just game and graphics jobs, which increases the number of possible technologies to consider. Among the possibilities are Unity 3D, which I’ve already played with a little; GPGPU coding using OpenCL; general machine learning and data analysis techniques; mobile development; and web development using more modern techniques and languages than the ones I’ve used in the past.
As a general strategy, I believe in a “bottom up,” biomimetic approach to intelligence: anything that resembles intelligence such as ours must be “grown” like us. To me, that suggests raising agents within 2D or 3D virtual environments, starting with simulations of the most primitive behavior (such as reacting to the presence of food) exhibited by single cell organisms, and attempting to introduce complexity incrementally in an echo of the idea that ontogeny recapitulates phylogeny. Unity seems like a natural fit for such simulations.
A recent experience reminded me that if I want to get anything done in life that requires significant brainpower (at least the kind that allows one to develop new abstractions), I had better do it while I’m still relatively young. It seems like people’s thought patterns follow a predictable trajectory over the course of their lifetimes: first, as infants, they form seemingly random connections; then, through childhood and adolescence, they’re prone to generalization; then, through adulthood and early middle age, abstraction; and finally, in late middle and old age, they become less likely to form new abstractions and instead focus on concrete matters. This reflects the development of the brain, which starts out with a vast number of connections that are progressively winnowed down over the course of a lifetime. My dad compared this to sculpture, where excess material is gradually removed until only the target form remains. This suggests two things. First, if possible, one must direct the course of sculpture towards a goal state that will be of use in later years. One finds the best examples of this by considering careers that lend themselves to working well beyond the normal retirement age–judges and teachers, for example. Second, one should take full advantage of the period in which one’s capacity for abstraction is at its peak. I suspect this is the reason why mathematicians are widely expected to produce their best work before they turn 30; mathematics is abstraction in its purest form.
After a lot of thinking and Wikipedia research, I’ve realized that I was somewhat naive in my choice of test bed. Handling the real-valued inputs and outputs of the agents in the previous post poses a challenge beyond that of handling discrete values–for example, in storage, how does one group similar values in order to avoid creating a new mapping for every unique number, no matter how slight the difference between it and its neighbors? The easiest solution would be to use uniform quantization, though that would require us to know in advance the desired granularity. In researching alternatives that would remove that requirement, I learned about the general problem of clustering, and algorithms such as k-means clustering, which groups similar values together into a predetermined number (k) of clusters using a simple iterative process. An extension of this is fuzzy c-means clustering, which removes the hard boundaries between clusters in favor of granting each point a weighted degree of membership in each cluster. An agent that learns over time would require an online/sequential version of the algorithm. For example, consider a “fuzzy map” data structure with capacity k, requiring a distance function for keys and blend functions for keys and values. The first k values would be inserted unchanged, with retrieved values blended according to key distance. Subsequent insertions would adjust the key/value pairs according to their distance to the new key, the total weight of values already inserted, and perhaps an additional factor to allow more recent mappings to overcome the inertia of historical ones. An advantage of this approach is that it sets a bound on the amount of memory (and lookup time, insertion time, etc.) required for the mapping.
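One way the “fuzzy map” might look, as a rough sketch under several assumptions: keys and values are plain floats, the distance function is absolute difference, the blend is a weighted average, and each insertion past capacity adjusts only the nearest entry (the hard-assignment, online k-means style of update, not the fuzzy c-means variant, which would spread the update across all entries). The class name `FuzzyMap` and the `recency` parameter are hypothetical.

```python
class FuzzyMap:
    """Bounded map from float keys to float values with blended lookups."""

    def __init__(self, capacity, recency=0.0):
        self.capacity = capacity
        # Minimum learning rate; a value above 0 lets recent mappings
        # overcome the inertia of historical ones.
        self.recency = recency
        self.entries = []  # each entry is [key, value, weight]

    def insert(self, key, value):
        # The first `capacity` pairs are stored unchanged.
        if len(self.entries) < self.capacity:
            self.entries.append([key, value, 1.0])
            return
        # Afterwards, nudge the nearest entry towards the new pair. The step
        # shrinks as the entry accumulates weight (a running mean), but never
        # drops below `recency`.
        entry = min(self.entries, key=lambda e: abs(e[0] - key))
        rate = max(1.0 / (entry[2] + 1.0), self.recency)
        entry[0] += rate * (key - entry[0])
        entry[1] += rate * (value - entry[1])
        entry[2] += 1.0

    def get(self, key, eps=1e-9):
        # Blend all stored values, weighting each by inverse key distance.
        if not self.entries:
            raise KeyError(key)
        weights = [1.0 / (abs(k - key) + eps) for k, _, _ in self.entries]
        total = sum(weights)
        return sum(w * e[1] for w, e in zip(weights, self.entries)) / total


m = FuzzyMap(capacity=2)
m.insert(0.0, 0.0)
m.insert(10.0, 100.0)
m.insert(1.0, 10.0)  # map is full: merges into the entry nearest key 1.0
print(m.entries[0])  # [0.5, 5.0, 2.0]
```

Memory stays fixed at `capacity` entries, and both insertion and lookup scan all of them, so their cost is bounded by the capacity as well.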
On considering how the agent must process values, it becomes clear that derived data–values obtained through transformations of the raw stream–are often more significant than the originals. For example, when a mouse sniffs around to locate the source of a scent, the important datum is whether the intensity of the smell increases or decreases with each movement: if the smell decreases when the mouse rotates left, then the source of the smell is to the right. This suggests providing the scent delta instead of or in addition to the absolute intensity. In general, the problem of augmenting a stream of raw values with layers of derived information suitable for input into a higher level cognitive algorithm is likely to require both simple transformations like differences and more advanced analyses such as the identification of repeated patterns, even when they occur at different scales or contain loops or gaps. This pattern recognition requirement is one of the most compelling reasons to think that the animal brain, with its robust pattern matching ability enabled by massive parallelism, may have a distinct advantage over any currently realizable artificial system–at least in terms of interacting with the real world, or any artificial environment that approaches its complexity. For instance, consider that the sparse coding believed to be employed by the visual cortex uses a large number of neurons to analyze small sections of the visual field in a manner that is robust to input noise, scaling, and translation.
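The simplest of these derived layers, the scent delta, can be sketched in a few lines. The function name `with_deltas` and the sample readings are invented for illustration; the first reading is given a delta of zero, which is an arbitrary choice.

```python
def with_deltas(intensities):
    """Pair each raw reading with its change from the previous reading."""
    previous = None
    augmented = []
    for value in intensities:
        delta = 0.0 if previous is None else value - previous
        augmented.append((value, delta))
        previous = value
    return augmented


# Hypothetical smell intensities sampled after three successive movements.
readings = [0.25, 0.5, 0.375]
augmented = with_deltas(readings)
print(augmented)  # [(0.25, 0.0), (0.5, 0.25), (0.375, -0.125)]
```

A negative delta after a left rotation would be the cue that the source lies to the right.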
Apart from simple transformations and pattern identification, it may be necessary to process input further in order to create mental models of context. That is, the brain’s representation of the current environment may not be created–or created entirely–by the principal cognitive process, but rather by a preprocessing phase that specializes in determining spatial relationships. The equivalent in simulation terms would be to provide the locations, shapes, movement vectors, etc., of objects in the environment (including the agent itself) instead of or in addition to the raw sensory inputs. Indeed, assuming that more information is always more useful than less (assuming effective means of determining relevance) and that an emergent process would have to derive such spatial information anyway, it would seem that providing the information explicitly is a no-brainer (so to speak). However, considering this and the many other layers of derived data with which we plan to augment the raw input, we rapidly encounter problems relating to the curse of dimensionality. For example, were we to use the k-nearest neighbors algorithm to find historical contexts similar to the current one in order to find outputs likely to result in reward, the proximity in relevant inputs would likely be drowned out by differences in irrelevant ones.
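A small, contrived illustration of that drowning-out effect, using plain Euclidean distance. The vectors are made up: the first component stands for the one relevant input (say, scent intensity) and the remaining eight stand for irrelevant ones.

```python
import math


def nearest(query, candidates):
    """Return the name of the candidate closest to query (Euclidean distance)."""
    return min(candidates, key=lambda name: math.dist(query, candidates[name]))


query = [1.0] + [0.0] * 8
candidates = {
    # Identical on the relevant input, slightly off on every irrelevant one.
    "same_scent": [1.0] + [1.0] * 8,
    # Far off on the relevant input, identical on the irrelevant ones.
    "different_scent": [3.0] + [0.0] * 8,
}

winner = nearest(query, candidates)
relevant_only = nearest(query[:1], {n: v[:1] for n, v in candidates.items()})
print(winner)         # different_scent
print(relevant_only)  # same_scent
```

With all nine inputs, the context that matches on the input that matters loses (distance about 2.83 against 2.0); restricted to the relevant input, it wins. The difficulty, of course, is knowing in advance which inputs are relevant.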
A 2D environment seems like a reasonable test bed for simulating animal behavior: simple to implement, efficient to execute, and providing ample opportunities to tweak and extend both agents and environment. Following the scientific tradition of running rodents through mazes, I am imagining the agents as “mice” pursuing food “pellets” within a bounded space, receiving input from their virtual senses and producing movement controls as output. At least for the first experiments, the cognitive model (and the simulation) will execute in discrete time steps. At each step, the mice will collect input from their senses and provide it to their cognitive model, which will update its state and provide output. The simulation will enact the chosen output, and report back to the model with reinforcement. For now, input will consist of “smell” derived from the proximity of nearby pellets; output will consist of rotation and forward advancement; and reinforcement will consist of a unit of “pleasure” when the mouse consumes a pellet.
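The step loop described above could be sketched roughly as follows. This is an illustrative stand-in, not the actual test bed: the smell formula, the arena size, the eating radius, and the names `RandomModel`, `smell_at`, and `simulate` are all assumptions. The model plugged in here simply moves at random.

```python
import math
import random


class RandomModel:
    """Ignores smell and pleasure entirely and moves at random."""

    def __init__(self, rng):
        self.rng = rng

    def step(self, smell, pleasure):
        rotation = self.rng.uniform(-math.pi / 4, math.pi / 4)
        advance = self.rng.uniform(0.0, 1.0)
        return rotation, advance


def smell_at(x, y, pellets):
    # Assumed smell function: each pellet contributes more the closer it is.
    return sum(1.0 / (1.0 + math.hypot(px - x, py - y)) for px, py in pellets)


def simulate(model, steps, size=20.0, n_pellets=10, eat_radius=1.0, seed=0):
    """Run one mouse for `steps` discrete time steps; return total pleasure."""
    rng = random.Random(seed)
    pellets = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n_pellets)]
    x, y, heading = size / 2, size / 2, 0.0
    pleasure, total = 0.0, 0.0
    for _ in range(steps):
        # Sense, then let the model update itself and choose an output.
        rotation, advance = model.step(smell_at(x, y, pellets), pleasure)
        # Enact the output, clamping the mouse to the bounded space.
        heading += rotation
        x = min(max(x + advance * math.cos(heading), 0.0), size)
        y = min(max(y + advance * math.sin(heading), 0.0), size)
        # Reinforcement: one unit of pleasure per pellet consumed this step.
        eaten = [p for p in pellets if math.hypot(p[0] - x, p[1] - y) <= eat_radius]
        pellets = [p for p in pellets if p not in eaten]
        pleasure = float(len(eaten))
        total += pleasure
    return total


score = simulate(RandomModel(random.Random(1)), steps=200)
print(score)
```

Any learning model exposing the same `step(smell, pleasure)` method could be dropped in and compared against this score.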
The first goal will simply be to have the mice learn to follow smells in order to consume pellets more efficiently than they would by moving randomly. Assuming that can be accomplished, further experiments would have them learn to navigate both static and moving obstacles, and eventually to traverse complicated courses such as mazes. In the course of experimentation, I expect to expand their sensory repertoire to include sight and touch.
Here’s the control case: entirely random movement.
Recently, I’ve been trying to develop a low-level model of animal behavior in the hopes of expanding it into a general model of feedback-based cognition, since higher cognition presumably evolved from simpler processes. The goal of this exercise is to find a model capable of “learning” at the most basic and natural level: e.g., “If I push this button, I receive immediate pleasure; therefore, I shall push the button as often as possible,” or “when I experience something like this sequence of inputs, and I subsequently produce something like this sequence of outputs, I will receive pleasure in roughly this amount of time.” Note that “pleasure” in these cases is provided externally; it is simply the abstract input whose maximization is the goal of the model, though it can be assumed to represent the satisfaction of physical desires, such as hunger. Also note that the model is probabilistic and “fuzzy”: rather than learning and reproducing exact sequences, it recognizes input with relative certainty and may produce output with stochastic variations.
The essential form of the model requires that it generate output (O) over time (t), based on input (I) and pleasure (P). While P is assumed to be real-valued, I and O are generic, and may assume any type that provides the set of functions, such as random() and distance(A, B), required by the model. To simplify the definition of prospective models, it’s useful to divide them into two separate categories: discrete time models, wherein t progresses in unit steps (0, 1, 2, 3…) and at each step the model accepts values for I and P and generates a value for O; and continuous time models, in which I and P events may be received, and O events generated, at any real-valued t. Discrete models are easier to generate and analyze, but continuous models are better representations of real world behavior.
In both cases, the model must choose O to maximize expected P, E(P). A perfectly rational model would presumably attempt to maximize total E(P) (the sum of all expected pleasure events over the lifetime of the model), but assuming that the model’s lifetime can’t be known in advance, and considering that the animal behavior I’m attempting to model seems generally to prefer immediate over delayed gratification, it seems reasonable instead to have the model attempt to maximize E(P) within a finite span ahead of the current time, weighting more immediate expected pleasure events higher than more distant ones. I’ll refer to this metric as the immediacy-weighted expected pleasure (IWEP). In the discrete case, the goal of the model is to select O at each time step to maximize IWEP, given the historical values for I, O, and P at previous time steps, and the current I.
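A minimal sketch of one possible IWEP calculation for the discrete case. The text above only requires that nearer events count for more within a finite span; the exponential weighting and the `decay` and `horizon` parameters here are assumptions chosen for simplicity.

```python
def iwep(expected_pleasure, horizon, decay=0.5):
    """Immediacy-weighted expected pleasure.

    expected_pleasure[i] is the pleasure expected i steps from now. Only the
    first `horizon` steps count, and each step further out is weighted by an
    additional factor of `decay`.
    """
    return sum(
        (decay ** step) * p
        for step, p in enumerate(expected_pleasure[:horizon])
    )


soon = iwep([1.0, 0.0, 0.0, 0.0], horizon=4)   # pleasure expected right away
later = iwep([0.0, 0.0, 0.0, 1.0], horizon=4)  # same pleasure, three steps on
print(soon, later)  # 1.0 0.125
```

Pleasure expected beyond the horizon contributes nothing at all, which is what frees the model from needing to know its own lifetime.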
As a baseline for comparison, one approach would simply be to choose randomly: O = random(). More usefully, one could assume that the best way to predict the optimal output is to compare the current context (the sequence of preceding I, O, and P, and the current I) to historical contexts, producing a new output by combining previous outputs according to the total immediacy-weighted pleasure that followed them, including random changes to avoid stagnating in local maxima. This approach is analogous to optimization via genetic algorithms: the principal force is the combination of previously successful candidates via a crossover function that merges their attributes, while a mutate function serves to introduce variation. The model would start by generating random output, only attempting to reproduce earlier outputs when they seemed sufficiently likely to result in higher pleasure. In all likelihood, the model would periodically have to return to experimenting with (increasingly) random output in order to expand the search space. This “experimentation” mode or factor would have to take effect or become dominant either when the model can afford to take the risk (because immediate pleasure is guaranteed, perhaps), or when it must (because no suitable existing solution is known).
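The combine-and-mutate idea might be sketched like this, assuming outputs are fixed-length tuples of floats and that the history passed in has already been filtered down to contexts similar to the current one. The function name `choose_output`, the pleasure-weighted average used as the crossover, the Gaussian mutation, and the fixed `explore` probability are all illustrative choices, not a settled design.

```python
import random


def choose_output(history, rng, dims=2, explore=0.1, mutation=0.05):
    """Pick an output given (output, weighted_pleasure) pairs from similar contexts."""
    successes = [(o, p) for o, p in history if p > 0.0]
    # Experiment when nothing has worked yet, or occasionally by chance.
    if not successes or rng.random() < explore:
        return tuple(rng.uniform(-1.0, 1.0) for _ in range(dims))
    # Crossover: blend earlier outputs in proportion to the pleasure that
    # followed them.
    total = sum(p for _, p in successes)
    blended = [sum(p * o[i] for o, p in successes) / total for i in range(dims)]
    # Mutation: small random changes to avoid stagnating in local maxima.
    return tuple(v + rng.gauss(0.0, mutation) for v in blended)


history = [((1.0, 0.0), 3.0), ((0.0, 1.0), 1.0)]
# With exploration and mutation switched off, the result is the pure blend.
pure = choose_output(history, random.Random(0), explore=0.0, mutation=0.0)
print(pure)  # (0.75, 0.25)
```

In a fuller version, `explore` would vary over time, rising when immediate pleasure is secure or when no known solution is good enough.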
One of the most interesting further steps in developing such a model is the incorporation of feedback and the subsequent development of a search process within the optimization framework. The next step towards true cognition is the ability to traverse networks of associations formed by past experience in order to find paths representing better solutions to current problems than those that may be generated through simple combination and mutation of previous solutions.
Although I’ve decided to put Witgap on hold while I attempt to write some actual code in support of my AI investigations, I wanted to document one additional aspect of the game: its theme. I’ve given a fair amount of thought to the technical basis of Witgap, but fairly little to its content. One of the crucial factors determining the success of any game is the combination of imagined settings, characters, plots, and devices that transform a set of data artifacts into a virtual world.
One thing I want to avoid is leaving the theme completely open, like that of Whirled. In the same way that complete freedom of gameplay saps players’ motivation, complete freedom of theming starves their imaginations. People thrive under constraints. However, the themes typical of online games–notably, those derived from fantasy and science fiction–can lead to overconstraint. For instance, in a world of orcs and wizards, a spaceship would seem out of place; likewise, a “hard” science fiction setting has no room for gods and magic. More importantly, the theme of a world tends to limit the stories one can tell or that audiences are willing to accept. This is the classic problem with “genre fiction”: the demands of the genre, be they establishing a unique and consistent fantasy world or following a formula like that of the mystery, precolor the plot and characters of the story. In particular, genre trappings tend to expand the scale of drama at the expense of subtlety: when murder is afoot or the fate of the universe is at stake, the minute drama that we experience every day falls beneath the noise floor. Fiction without genre, on the other hand–in particular, fiction that employs as its setting the world that its readers live in, at least as a baseline–starts with a neutral white, and provides a greater opportunity to explore subtlety of experience. Additionally, if one is to compare actual present reality to any imagined fictional world or historical setting, it becomes clear that the real world has vastly more source material with which to build.
One could argue that science fiction subsumes present reality, particularly when it employs devices like Doctor Who’s time travel, Star Trek’s holodeck excursions, or the simulated world of The Matrix. Similarly, “urban fantasy” settings augment reality with monsters and magic. Unfortunately, all of these devices end up trivializing the real world: again, our day to day experiences become inconsequential in the face of attacks by aliens and robots and vampires. Conversely, one could explore fantastic settings in dreams, hallucinations, or stories embedded within a container of realism–but that would in turn trivialize the fantasy worlds. The “magic realism” approach, in contrast, manages to blend fantastic elements into actual reality without sacrificing subtlety, but it does so at the expense of consistency. Where fantasy and science fiction thrive by constructing and obeying novel sets of rules for their environments, making them well-suited to the algorithmic world of gaming, magic realism relies on the unpredictable whims of the author. As soon as a magic realist world solidifies and becomes consistent in the mind of the reader, it becomes “mere fantasy.”
Still, magic realism–or some internally consistent approximation thereof–is the most promising basis for the theme of Witgap. Starting with a representation of actual, current reality as a baseline will provide the greatest level of freedom to expand into fantastic and speculative branches. The ideal is “reality++”: a versatile version of our present reality that will support excursions into other realities and inclusions of their aspects and elements without trivializing or rendering unreal either the primary or the alternate dimensions, all while maintaining a set of consistent game rules and a continuity of experience. Part of the challenge will be that of “scale reset”: because stakes may vary greatly from one story to the next, there must exist some mechanism to ensure that the epic scope of one story doesn’t overshadow the subtle realism of the next.
The practicality of realism is one of the major factors separating games as a medium from books and movies: realistic novels are just as expensive to produce as fantastic ones, and realistic films are cheaper, but the cost of producing games increases dramatically with increased graphical realism. This is one of the reasons that independent films tend to focus on realistic stories, whereas independent games tend towards abstract representation and cartoonishness, applying their innovation towards mechanics. A game like Witgap, however, that avoids or limits graphical representation, is free to represent both realistic and fantastic settings to an extent that graphical games cannot match.
Recently, I’ve been trying to think of applications to drive my AI research. Firstly, I know from experience that developing technology without an application (or set of applications) in mind leads to a state of aimless, meandering expansion. When I worked on NPSNET-V, for instance, I never managed to come up with a “killer app” to drive development of the framework. While I had the pie-in-the-sky ideal end state in mind (that of the Metaverse, or at least an equivalent in terms of networked military simulations), I didn’t bother to envision the intermediate state: a modestly scoped, easily demonstrable application based on the framework that would clearly show the abilities of the architecture. Without such an application in mind, I simply added features according to what seemed interesting at the time: an HTTP server module exposing application state as serialized XML, a Telnet server module allowing remote manipulation, etc. In contrast, when I created Clyde, I had Spiral Knights in mind at all times. Although I intended the library to be generalizable to other games, having Spiral Knights as a baseline was crucial as a source of inspiration and focus.
Secondly, I’ve been considering how to reach an optimal state in my own life, and it seems like the ideal situation would allow me to develop my AI project in a setting that fulfills my other desire: to work in a close-knit group of smart and motivated people in a sustainable fashion. It seems like one possible way to effect that scenario would be to design a product based on the technology and either form a start-up to develop it or convince an existing organization to take it on.
So far, the most promising application idea I’ve had is a combination virtual pet and recommendation system (and/or other systems providing “useful” functionality on top of the virtual pet experience). The trick with getting users to train an AI system is that they have to put up with and push past the early stages of learning, where the output will consist primarily of useless (but perhaps entertaining) babble. Children seem to manage by being cute and triggering an emotional reaction, and, if I had to guess, the appeal of pets is related to that response. As human beings, we seem to have a preformed slot for endearingly ignorant creatures, and it’s to that slot one must aim if one is to take advantage of the human tendency to educate without external incentive.
Virtual pets have been done before many times, in many formats, but they tend not to use any “real” intelligence, nor to provide any “real” utility; instead, they consist of simple rule-based systems in an entertainment-oriented environment, typically providing a set of games for users to play with their pets (example: NeoPets). My idea is something of an inversion of the process of gamification: rather than shoehorning game-derived techniques into non-game applications, I want to take a game environment and add aspects of practical utility to it. The eventual goal, after users have trained their pets sufficiently, is to encourage a symbiotic relationship between pets and their owners: that is, one in which both parties rely on each other to their mutual benefit.
One of the most fundamental decisions for such a system is the extent to which the “mind” of each symbiont is separate from the others. On the one hand, there’s an undeniable appeal to starting each new pet with a blank mental slate: a completely fresh start for the user and their pet. On the other hand, that approach will lead to a large amount of redundancy, and won’t allow users to benefit directly from the training supplied by other users. It may be that the answer is to allow pets to communicate with one another directly to share information, or it may be that the ideal form of the software is like that of the mythical Hydra: a single entity presenting a separate face to each user.
Over the past several months, I’ve come up with a few rules of thumb to guide my AI research. In no particular order:
Limbs, not wheels. This concept comes from a Straight Dope article: “Why has no animal species ever evolved wheels?” The presumed answer is that evolution requires continuity of function–that is, in order to evolve a new feature over the course of successive generations, each stage of that feature’s development must provide an increased advantage over the one before. Members of a lineage evolving legs can use half-limbs to propel themselves, but a half-wheel is useless. Mutations allow for sudden jumps of form, but the likelihood of forming something as complex as a wheel from a mutation is infinitesimal (unless that wheel was already present as a latent form in the genome). Similarly, our mental apparatus is not likely to contain complex features unless their partial implementation was present in and useful to our ancestors. Non-human animals are the most obvious source of inspiration to consider when attempting to trace human intelligence back to its more primitive forms in order to simplify the process of modeling it. Similarly, it’s useful to consider the development of human society (language, etc.) as a continuous process made possible by progressive mental advancement.
Time and space are fundamental. This is related to the concept of embodied cognition that I mentioned previously. For evolutionary reasons, human-like (or animal-like) thought is inextricably dependent on the perception of our spatial and temporal environments. We understand concepts as basic as containment (as of a member in a set), proximity, sequentiality, and causality in terms of association with our perceptions of and interactions with the physical world in time and space. A model of human consciousness would require equivalents to the mechanisms that allow us to understand time and space in order to relate to us in a meaningful fashion.
Unguided learning is preferable (if possible). This is mostly a practical concern. I don’t have access to a stable of graduate students to act as trainers, so I am particularly interested in learning processes that do not require manual control: for instance, creating associations by freely exploring an environment.
The Internet is the (an) environment. Never before has there existed such a vast body of information accessible freely and instantly to any connected program. It seems clear that any successful attempt to model intelligence would do well to make use of this resource to create associations equivalent to the ones that drive human consciousness. It makes sense to think of the Internet as an environment not unlike a physical one: programs “sense” and “act” through application protocols much as they would interact with objects and agents occupying a real or simulated space.
The crowd are the (some of the) trainers. The Internet makes instantly available not only a huge body of information, but also a huge number of potential human interactors. If they can be motivated to do so, they will act as guides to learning.
The trainers are part of the equation. The assumption that the model to be trained mimics the process of human cognition implies that the behavior of the human trainers can also be understood in terms of the model. This is important when determining how to motivate the trainers and guide them into teaching the models correctly.
The (first) language of the machine may not be English. The idea of a “chat bot” that speaks intelligible English may be somewhat specious. While English synthesis is eventually desirable, there are many other format possibilities for input and output, such as images, sounds, or computer languages like HTML or Scheme. Some of these may prove more fruitful for experimentation, particularly in the early stages, since they have qualities different from English: simplicity, for instance, or amenability to gene-like “crossover” synthesis.
Dreaming, “free” association, and normal thought aren’t separate cases; they are variants of the same process. My suspicion is that normal thought can best be described as “constrained association,” and that dreaming is a more intense, closed-loop form of thought wherein the stream of associations assumes the quality of real, waking experience. I think that I’ve experienced a halfway point to this at the threshold of sleep, where thoughts seem especially vivid and automatic, and the outside world shrinks in significance, but a flicker of waking consciousness remains, along with the understanding that the experiences aren’t real.