After considering various permutations of and puns on the words “balls” and “pucks”–”brass balls,” “balls to the wall,” “all the pucks,” “zero pucks,” “smarmy pucks,” etc.–I decided on “Snarky Pucks” as at least a working title and registered snarkypucks.com. The joke may be a little subtle (Google shows only 600 results for the phrase “snarky fucks”), but it actually gives some idea of what the game is about.

I started out working on the client side in CoffeeScript/jQuery/HTML5, with my very first steps being related to scrolling and dirty region management on an HTML5 canvas. I assumed that the environments would be comprised of a set of 2D vector features (based simply on canvas’s drawing primitives), and that it would be best to avoid redrawing all of the visible features when scrolling. Instead, I copy the moved portion within the frame and dirty the exposed portion, which seems to work well, though I’ve only done the most basic testing on my phone. One of the useful features of HTML5 canvas is that you can stack them on top of one another with transparency, avoiding the need to dirty and redraw everything in a lower layer when you change something in a higher one. I found this especially useful for the editor tools–dragging out a line, for example.

After getting a basic scrolling grid up and running, I started working on a track editor in tandem with the basic functionality that will be shared with the editor and the actual game: feature management, intersection testing, serialization to JSON, etc., etc. It gave me a chance to return to some of the fun 2D geometry of Spiral Knights(‘s gameplay), and I used the same space representation: a spatial hash at the highest level (avoiding the need to specify/limit the track size), with quadtrees as the hash values. The editor is essentially just a vector graphics editor, complete with undo/redo, group selection and editing, and import/export; in fact, I used it to create a basic logo for Facebook:

editor

My last client-side push before delving into the server code was to add a test mode that simulates the process of entering and executing moves that the game will use, as well as the interactions between pucks and their environments. I’m using an extremely simple point model for the physics, with the pucks’ motion governed simply by their initial velocity and constant friction (notably, pucks have no “spin”), which makes it easy and efficient to depict where the pucks will travel if they don’t collide with other pucks:

test

Since then, I’ve been riding Ruby on Rails for the server code: authentication (including Facebook login), player management, track management (saving, revising, publishing), game creation, and soon, game invites. My plan is to stick (for now) with bare bones page-oriented interfaces for the admin interfaces, but create a more AJAX-y, “SPA”-type interface for the actual gameplay. So far, I’m liking the concept of Rails more than I like Ruby as a language, but I figured I’d start with the framework that spawned so many, many others.

 

After several weeks of accomplishing nothing aside from besting my already prodigious records for oversleeping and playing through the entire Dragon Age series, I’m starting to accept that maybe I don’t have it in me to solve the grand challenge of AI. My strengths are really in implementation rather than research, and while it’s fun to think about AI, I already miss working with other people on more clearly defined problems. I plan to start looking for another job soonish–at latest, after going to Poland at the end of June–but in the meantime, I’ve been thinking about how to break my slump with a more tractable project. I thought about returning to my “ASCII sandbox MMO” game, and in particular its Scheme interpreter, but apart from the fact that that game is ambitious in scope and limited in potential appeal, it doesn’t use any new technologies that are likely to be interesting to potential employers. It seems reasonable to attempt to add a few keywords to my resume, and there are a few areas that I’ve been curious about but have never bothered to explore.

I attempted today to come up with a list of technologies to delve into, with the idea that I’d build a project around them. I’ve never worked with mobile apps, for instance, and the likely platform for me to start with would be C++ with Qt on Android (and maybe eventually iOS), since I’ve been using that on desktops for quite a while now. On the other hand, it seems entirely possible to use JS/HTML5 canvas for mobile development, and since another set of technologies that I want to try are modern web development systems (JS with frameworks like Angular, server-side frameworks like RoR), that seems like it might be a better starting point.

In terms of a project, my current idea (as of today) is an asynchronous party game inspired by Whirled’s LOLcaptions. I’m not wedded to the idea of continuing in game development, but it seems slightly easier to come up with an original game than an original non-game app. As a decidedly softcore gamer, the games I enjoy fall into one of two categories: first, extremely content-heavy story-driven mentally-untaxing single player games (Dragon Age, Mass Effect, Elder Scrolls, Fallout, Half-Life, Portal…), and second, extremely casual mentally-untaxing social games (poker, Bang!, Everything, Cards Against Humanity, Mario Kart…). Given that I’m not likely to come up with a massive amount of content, it seems reasonable to try something in the second category.

At present, my vision of the game is one where, in small groups, people log in every day to enter a series of moves, which consist of dragging and launching a ball as one would in a mini golf game (showing my age, I guess I’m thinking specifically of Zany Golf), racing their balls around one of various tracks. This provides ample opportunity for various pinball-like obstacles, widgets, and power-ups, but the primary goal would be to hit targets that activate, for the day’s round, a LOLcaptions-esque challenge: from various web APIs or internal databases, the app selects a random image, news story, Wikipedia article, question, etc. Each player responds to the prompt, and votes on one of the other (anonymized) responses at the next turn, with the winner receiving points or an advancement bonus (depending on how the overall game winner is determined).

Basically, I think that the entire point of multiplayer gaming is social interaction, but mobile devices (and, to a lesser extent, anything other than in-person gameplay) make that awkward–thus, the idea is to force social interaction by making it part of the gameplay. To encourage social interaction, one benefits from a continuous stream of novel prompts. I’m most familiar with this from my countless hours spent lurking on news discussion sites like Slashdot and Fark. The other crucial aspect is encouraging people to log in once a day for rationed novelty, as in Everything–or Facebook, for that matter.

I’m not concerned with monetization at this point, or perhaps ever–the goal is simply to get something running that’s entertaining to play, and to play around with some new technology in the process.

 

My first idea for an incremental improvement after the last post was to add a regularly recurring “hunger” state to address in part the problem of exploration versus exploitation: the agent would explore the environment when not hungry, and when it became hungry, it would seek out its best known path to the reward.  However, in considering how to implement that, I fell into the rabbit hole of considering what it meant to “learn” to associate the presence of hunger and the reward stimulus with actual reward, as well as to predict when hunger will strike again. To do this, a simple model based on perceptrons–the simplest type of neural network–seemed like a reasonable starting point.

I modified the experiment to use a (sparse) perceptron model with inputs of the constant bias, the direction of travel (one of the four cardinal directions), and the position (one of width * height), and the output of the new position, using one versus rest (that is, choosing the output with the highest weight) to ensure that only one output position is chosen. The lines extending from the agent indicate the adjacent locations to which it believes it can travel. Each time the agent attempts to move and either successfully reaches another location or bumps into a wall (indicated by thickening the wall segment), it adjusts its weights according to the perceptron algorithm.

Note that the actual predictions aren’t used for anything yet; I’ve basically just switched from a model that “remembers” locations to one that remembers transitions–that is, their last visited time and reward distances. In order for backtracking to work correctly, I added a special case that updates the last visited time for the reverse of the last transition. Ideally, the agent would instead learn to predict reversible actions (e.g., moving left after moving right returns to previous state). I also left out the “Distance” reinforcement mode, because inverting the perceptron model (finding appropriate sets of inputs corresponding to an output) is somewhat tricky.

Ideally, rather than providing the absolute position as an input/output, it would be a derived, internal state–that is, the agent would be aware of its relative motion and would construct a mental model of its surroundings by bumping around strategically.

 

Here, I’ve extended the simulation by adding a basic reward mechanism. The agent explores its environment as before (preferring less recently visited nodes), but when it encounters the goal location (the plus sign), it notes the distance either along the path it took to reach the location, or from any node already visited. After being respawned at a random location, it considers the reward distance as well as the last visited time when choosing a new node. Any node known to lead to the reward will be chosen over an unknown node, which means that this simulation will retread existing paths rather than finding new, better ones. An improvement would be to add a preference or chance to explore new paths, somewhat analogous to the role of mutation in genetic algorithms (escaping from local maxima).

I should note that both this and the previous simulation assume that the agent can always perfectly identify its location (current node), and from that the neighboring nodes.

 

As a first stab at exploring how search might emerge from simpler behaviors, I figured I’d start with a simple coloring traversal. Here, I generate a fully-connected maze using a simple algorithm: starting with all cells walled off, I merge neighboring groups of cells at random until only one group remains. The agent traverses the maze by “remembering” the last time it visited each cell and always choosing the least recently visited cell as its immediate destination.

The next step will be incorporating “reward” into the simulation and using the expected reward as another weighting factor in the choice of destination.

 

In attempting to flesh out the concept of search as it applies to simple agents, I decided at one point that a useful model of behavior was one of “context, action, and effect”: that is, “Where am I now?”, “What can I do from here?”, and “Where will I be once I do that?”  From a graph search perspective, the contexts are the nodes of the graph, the actions are the (directed) edges, and the effects are the destination nodes.  Physical search would resemble something like a coloring traversal, in that the agent can only occupy one context (place) at a time, but will remember previously visited nodes.  Mental search, on the other hand, could involve activating more than one context at once, allowing the equivalent of breadth-first/A* searches.  The context nodes, if not also the actions, would no doubt require “fuzzy” matching of some sort: new input combinations would be compared to existing contexts, with the result of either creating a new context or adjusting an existing one.  It would also be possible to consider percentage matching of multiple nodes simultaneously, which starts to remind one of a Markov model.  Similarly, one could imagine multiple simultaneous searches–orthogonal or otherwise–as well as “templating” behaviors to combine different simultaneous contexts.  With templating to allow the inclusion of “variables,” the process starts to resemble Prolog‘s model of computation with backtracking search.

 

The last thing I was thinking about when I left off was the emergence of search algorithms in nature and their relationship to consciousness.  The idea of consciousness as a search process is nothing new, and search is one of the most fundamental brain-controlled behaviors: witness the experiments with rats in mazes.  The questions, then, are these: How did search evolve or how does it emerge from simpler behaviors, such as reflexive reaction to the presence of food?  What is the simplest organism that can search, and what are the simplest other behaviors that require neural control?  How did physical search evolve into mental search, allowing organisms to explore routes without physically traversing them?  How did concrete mental search evolve into abstract search, where the nodes being traversed are no longer represent physical locations and actions, but rather more general concepts?  How does concrete or abstract search lead to the synthesis of new concepts, which themselves become part of the search space?

The approach I’m interested in taking is one that focuses on the core, rather than the periphery, of the cognition process.  I don’t expect or desire to solve the problem of computer vision or speech recognition, for example; I intend to assume that those problems can be solved and provide input to agents in high level symbolic form.  Similarly, I’m not interested in modelling cognition at the neuronal level, but at a level that bridges the gap between the symbolic and neuronal layers.  I’m tempted to say that I’m interested in a holistic approach, except that too often people who espouse holism are those who can’t be bothered to understand the parts of a system and thus make claims to mystical knowledge of the whole.

 

In keeping with the sentiment behind my last post, I left High Fidelity recently in order to take another stab at building something on my own while my mind is still relatively flexible.  I’ve decided to abandon the game project and focus exclusively on artificial intelligence, since that’s what I consider “important,” in the sense that if there’s any chance at all of even the slightest bit of success, then I can’t think of anything more worthwhile for me to work on.  Unfortunately, unlike the game project, it’s far more difficult simply to begin programming, and I have a difficult time measuring productivity in terms of anything other than code output–not to mention that I enjoy the process of coding much more than I do writing about or talking about code, and that I feel adrift without a stable code base and the daily routine of contributing to it.  So, my immediate goal is to reach a point at which I can begin implementation, even if it’s just a framework or test bed for further development.

In the course of implementation, it would benefit me to seek out opportunities to learn new technology, so as to expand my marketable skill set in preparation for the next time I seek employment.  I’ve been thinking that when I do that, I’ll broaden my search beyond just game and graphics jobs, which increases the number of possible technologies to consider.  Among the possibilities are Unity 3D, which I’ve already played with a little; GPGPU coding using OpenCL; general machine learning and data analysis techniques; mobile development; and web development using more modern techniques and languages than the ones I’ve used in the past.

As a general strategy, I believe in a “bottom up,” biomimetic approach to intelligence: anything that resembles intelligence such as ours must be “grown” like us.  To me, that suggests raising agents within 2D or 3D virtual environments, starting with simulations of the most primitive behavior (such as reacting to the presence of food) exhibited by single cell organisms, and attempting to introduce complexity incrementally in an echo of the idea that ontogeny recapitulates phylogeny.  Unity seems like a natural fit for such simulations.

 

A recent experience reminded me that if I want to get anything done in life that requires significant brainpower (at least the kind that allows one to develop new abstractions), I had better do it while I’m still relatively young.  It seems like peoples’ thought patterns follow a predictable trajectory over the course of their lifetimes: first, as infants, they form seemingly random connections; then, through childhood and adolescence, they’re prone to generalization; then, through adulthood and early middle age, abstraction; and finally, in late middle and old age, they become less likely to form new abstractions and instead focus on concrete matters.  This reflects the development of the brain, which starts out with a vast number of connections that are progressively winnowed down over the course of a lifetime.  My dad compared this to sculpture, where excess material is gradually removed until only the target form remains.  This suggests two things.  First, if possible, one must direct the course of sculpture towards a goal state that will be of use in later years.  One finds the best examples of this by considering careers that lend themselves towards working well beyond the normal retirement age–judges and teachers, for example.  Second, one should take full advantage of the period in which one’s capacity for abstraction is at its peak.  I suspect this is the reason why mathematicians are widely expected to produce their best work before they turn 30; mathematics is abstraction in its most pure form.

 

After a lot of thinking and Wikipedia research, I’ve realized that I was somewhat naive in my choice of test bed. Handling the real-valued inputs and outputs of the agents in the previous post poses a challenge beyond that of handling discrete values–for example, in storage, how does one group similar values in order to avoid creating a new mapping for every unique number, no matter how slight the difference between it and its neighbors?  The easiest solution would be to use uniform quantization, though that would require us to know in advance the desired granularity.  In researching alternatives that would remove that requirement, I learned about the general problem of clustering, and algorithms such as k-means clustering, which groups similar values together into a predetermined number (k) of clusters using a simple iterative process.  An extension of this is fuzzy c-means clustering, which removes the hard boundaries between clusters in favor of granting each point a weighted degree of membership in each cluster.  An agent that learns over time would require an online/sequential version of the algorithm.  For example, consider a “fuzzy map” data structure with capacity k, requiring a distance function for keys and blend functions for keys and values.  The first k values would be inserted unchanged, with retrieved values blended according to key distance.  Subsequent insertions would adjust the key/value pairs according to their distance to the new key, the total weight of values already inserted, and perhaps an additional factor to allow more recent mappings to overcome the inertia of historical ones.  An advantage of this approach is that it sets a bound on the amount of memory (and lookup time, insertion time, etc.) required for the mapping.

On considering how the agent must process values, it becomes clear that derived data–values obtained through transformations of the raw stream–are often more significant than the originals.  For example, when a mouse sniffs around to locate the source of a scent, the important datum is whether the intensity of the smell increases or decreases with each movement: if the smell decreases when the mouse rotates left, then the source of the smell is to the right.  This suggests providing the scent delta instead of or in addition to the absolute intensity.  In general, the problem of augmenting a stream of raw values with layers of derived information suitable for input into a higher level cognitive algorithm is likely to require both simple transformations like differences and more advanced analyses such as the identification of repeated patterns, even when they occur at different scales or contain loops or gaps.  This pattern recognition requirement is one of the most compelling reasons to think that the animal brain, with its robust pattern matching ability enabled by massive parallelism, may have a distinct advantage over any currently realizable artificial system–at least in terms of interacting with the real world, or any artificial environment that approaches its complexity.  For instance, consider that the sparse coding believed to be employed by the visual cortex uses a large number of neurons to analyze small sections of the visual field in a manner that is robust to input noise, scaling, and translation.

Apart from simple transformations and pattern identification, it may be necessary to process input further in order to create mental models of context.  That is, the brain’s representation of the current environment may not be created–or created entirely–by the principal cognitive process, but rather by a preprocessing phase that specializes in determining spatial relationships.  The equivalent in simulation terms would be to provide the locations, shapes, movement vectors, etc., of objects in the environment (including the agent itself) instead of or in addition to the raw sensory inputs.  Indeed, assuming that more information is always more useful than less (assuming effective means of determining relevance) and that an emergent process would have to derive such spatial information anyway, it would seem that providing the information explicitly is a no-brainer (so to speak).  However, considering this and the many other layers of derived data with which we plan to augment the raw input, we rapidly encounter problems relating to the curse of dimensionality.  For example, were we to use the k-nearest neighbors algorithm to find historical contexts similar to the current one in order to find outputs likely to result in reward, the proximity in relevant inputs would likely be drowned out by differences in irrelevant ones.

© 2012 Andrzej Kapolka Suffusion theme by Sayontan Sinha