Jun 132018

In order to learn more about machine learning, I thought it would be interesting to work through a set of exercises implementing different techniques. This first one is a simple perceptron test that shows what the model can and can’t “learn” by training and testing on the same, maximally simplistic data set: the full domain of two binary inputs. With the Presets dropdown, you can select different training data and see that the model works on most of them, but not on the XOR/XNOR functions (as expected).

Dec 192015

Since my last post, I’ve fleshed out Snarky Pucks, and have conducted a couple week-long games with friends, using Facebook for coordination. I believe the consensus at this point is that, while the snark part is fun, the puck part feels largely redundant. Part of the problem is lack of physical contention, as the pucks tend to spread out over the track and the wacky puck-on-puck mayhem I had envisioned has thus far failed to materialize. I have some ideas as to how to address this–including the nuclear option of simply removing the puck part and perhaps turning it into a separate game–but, for now, it seemed prudent to improve the aspect to which players responded.

Wikipedia’s random article link–which has provided me with hours and hours of entertainment–was my first idea for a source of prompts, and I naively assumed that other sites would recognize the genius of the idea and offer a similar feature. In fact, the only suitable equivalent I could find was Reddit’s random/random path, which serves up a random post from a random Subreddit. For the first two games of Snarky Pucks, those were the only prompt sources. An additional complication that I hadn’t anticipated was simply rendering the prompts within the game. Wikipedia generously allows its articles to be loaded in an iframe on another domain, but this is uncommon: most sites disallow it via the X-FRAME-OPTIONS header. For the Reddit articles, I had to have Snarky Pucks read the RSS feed for the random articles, render the content as HTML, then cache the result to serve to players (with a link to the full article). Also, to avoid delays when processing turns, I had the server run a background job continually filling pools of prefetched prompts.

Midway through the first game, two things became apparent: first, that the Wikipedia prompts were generally more fruitful than the Reddit ones, but second, that there are many, many dull Wikipedia articles. I knew this from casual browsing, of course–typically, I would have to click through several soccer teams, politicians, and Polish voivodeships before hitting an interesting article. Matt Jensen had the bright idea of using stats.grok.se to find articles with a minimum number of recent page views, which helped substantially. As for Reddit, I used a fixed weighting that preferred Wikipedia prompts with a ratio of 7:1.

For the next game, though, I wanted to avoid hard-coding the prompt weights–particularly since I planned to add several new prompt types and didn’t know which ones would prove popular. I added an option for players to rate each prompt (thumbs up or down) at the voting stage, and used the results to adjust the weighting. First, the overall ratings determine a baseline for players who haven’t themselves (yet) rated the prompt types. The lowest-rated type has a weight of 1/N, the highest a weight of N, and thus the ratio between them is N2 (currently, N = 3). As players rate prompts themselves, their contributions to the game’s weightings approach personalization asymptotically: with R ratings, a player’s weighting influence is proportionally 1/R the global ratings, (1 – 1/R) their own. The games’ weights are simply the average of all players’.

In looking for new prompt types to incorporate, there were two categories in particular that I wanted to investigate: news stories and images. Rendering the RSS from Reddit suggested doing the same with more typical RSS feeds, and some cursory Googling of “best RSS feeds” led me to Mashable, BoingBoing, and TechCrunch: all relatively high-volume feeds with images and easily digestible summaries. I modified Snarky Pucks’ background job to collect all stories from these feeds within the last week, render them, and provide them on demand. Images proved slightly more complicated, though not for lack of options. In terms of APIs that provide random images on demand, I briefly considered Splashbase and The Cat API, but found their content limited in terms of its potential for humor. Of the many photo-sharing sites that Wikipedia lists, several offer RSS feeds of recent content, and it was tempting to incorporate these, since I could use the same code as for the news prompts. However, only the feed of 500px provided both full-sized images and a minimal level of curation. For the rest, I used the APIs of Imgur, DeviantArt, and Flickr to incorporate recently added images as prompts. These varied sources are hardly equal in terms of their inclusion criteria; after several days of pulling in new content, I’ve pooled about 4500 images from 500px (the upcoming feed), 1300 from DeviantArt (the undiscovered gallery), 19000 from Flickr (recent photos), 11000 from Imgur (new images); and 160 articles from BoingBoing, 600 from Mashable, and 200 from TechCrunch.

The next prompt type that I want to incorporate would consist of player-submitted open-ended questions, with Mad-Libs-style random substitutions–e.g., “What would the child of {player-1} and {player-4} be like?” In addition to the questions themselves, players could submit entries for the substitutions: customized nouns, verbs, phrases, etc.

May 152015

After considering various permutations of and puns on the words “balls” and “pucks”–”brass balls,” “balls to the wall,” “all the pucks,” “zero pucks,” “smarmy pucks,” etc.–I decided on “Snarky Pucks” as at least a working title and registered snarkypucks.com. The joke may be a little subtle (Google shows only 600 results for the phrase “snarky fucks”), but it actually gives some idea of what the game is about.

I started out working on the client side in CoffeeScript/jQuery/HTML5, with my very first steps being related to scrolling and dirty region management on an HTML5 canvas. I assumed that the environments would be comprised of a set of 2D vector features (based simply on canvas’s drawing primitives), and that it would be best to avoid redrawing all of the visible features when scrolling. Instead, I copy the moved portion within the frame and dirty the exposed portion, which seems to work well, though I’ve only done the most basic testing on my phone. One of the useful features of HTML5 canvas is that you can stack them on top of one another with transparency, avoiding the need to dirty and redraw everything in a lower layer when you change something in a higher one. I found this especially useful for the editor tools–dragging out a line, for example.

After getting a basic scrolling grid up and running, I started working on a track editor in tandem with the basic functionality that will be shared with the editor and the actual game: feature management, intersection testing, serialization to JSON, etc., etc. It gave me a chance to return to some of the fun 2D geometry of Spiral Knights(‘s gameplay), and I used the same space representation: a spatial hash at the highest level (avoiding the need to specify/limit the track size), with quadtrees as the hash values. The editor is essentially just a vector graphics editor, complete with undo/redo, group selection and editing, and import/export; in fact, I used it to create a basic logo for Facebook:


My last client-side push before delving into the server code was to add a test mode that simulates the process of entering and executing moves that the game will use, as well as the interactions between pucks and their environments. I’m using an extremely simple point model for the physics, with the pucks’ motion governed simply by their initial velocity and constant friction (notably, pucks have no “spin”), which makes it easy and efficient to depict where the pucks will travel if they don’t collide with other pucks:


Since then, I’ve been riding Ruby on Rails for the server code: authentication (including Facebook login), player management, track management (saving, revising, publishing), game creation, and soon, game invites. My plan is to stick (for now) with bare bones page-oriented interfaces for the admin interfaces, but create a more AJAX-y, “SPA”-type interface for the actual gameplay. So far, I’m liking the concept of Rails more than I like Ruby as a language, but I figured I’d start with the framework that spawned so many, many others.

Apr 152015

After several weeks of accomplishing nothing aside from besting my already prodigious records for oversleeping and playing through the entire Dragon Age series, I’m starting to accept that maybe I don’t have it in me to solve the grand challenge of AI. My strengths are really in implementation rather than research, and while it’s fun to think about AI, I already miss working with other people on more clearly defined problems. I plan to start looking for another job soonish–at latest, after going to Poland at the end of June–but in the meantime, I’ve been thinking about how to break my slump with a more tractable project. I thought about returning to my “ASCII sandbox MMO” game, and in particular its Scheme interpreter, but apart from the fact that that game is ambitious in scope and limited in potential appeal, it doesn’t use any new technologies that are likely to be interesting to potential employers. It seems reasonable to attempt to add a few keywords to my resume, and there are a few areas that I’ve been curious about but have never bothered to explore.

I attempted today to come up with a list of technologies to delve into, with the idea that I’d build a project around them. I’ve never worked with mobile apps, for instance, and the likely platform for me to start with would be C++ with Qt on Android (and maybe eventually iOS), since I’ve been using that on desktops for quite a while now. On the other hand, it seems entirely possible to use JS/HTML5 canvas for mobile development, and since another set of technologies that I want to try are modern web development systems (JS with frameworks like Angular, server-side frameworks like RoR), that seems like it might be a better starting point.

In terms of a project, my current idea (as of today) is an asynchronous party game inspired by Whirled’s LOLcaptions. I’m not wedded to the idea of continuing in game development, but it seems slightly easier to come up with an original game than an original non-game app. As a decidedly softcore gamer, the games I enjoy fall into one of two categories: first, extremely content-heavy story-driven mentally-untaxing single player games (Dragon Age, Mass Effect, Elder Scrolls, Fallout, Half-Life, Portal…), and second, extremely casual mentally-untaxing social games (poker, Bang!, Everything, Cards Against Humanity, Mario Kart…). Given that I’m not likely to come up with a massive amount of content, it seems reasonable to try something in the second category.

At present, my vision of the game is one where, in small groups, people log in every day to enter a series of moves, which consist of dragging and launching a ball as one would in a mini golf game (showing my age, I guess I’m thinking specifically of Zany Golf), racing their balls around one of various tracks. This provides ample opportunity for various pinball-like obstacles, widgets, and power-ups, but the primary goal would be to hit targets that activate, for the day’s round, a LOLcaptions-esque challenge: from various web APIs or internal databases, the app selects a random image, news story, Wikipedia article, question, etc. Each player responds to the prompt, and votes on one of the other (anonymized) responses at the next turn, with the winner receiving points or an advancement bonus (depending on how the overall game winner is determined).

Basically, I think that the entire point of multiplayer gaming is social interaction, but mobile devices (and, to a lesser extent, anything other than in-person gameplay) make that awkward–thus, the idea is to force social interaction by making it part of the gameplay. To encourage social interaction, one benefits from a continuous stream of novel prompts. I’m most familiar with this from my countless hours spent lurking on news discussion sites like Slashdot and Fark. The other crucial aspect is encouraging people to log in once a day for rationed novelty, as in Everything–or Facebook, for that matter.

I’m not concerned with monetization at this point, or perhaps ever–the goal is simply to get something running that’s entertaining to play, and to play around with some new technology in the process.

Mar 252015

My first idea for an incremental improvement after the last post was to add a regularly recurring “hunger” state to address in part the problem of exploration versus exploitation: the agent would explore the environment when not hungry, and when it became hungry, it would seek out its best known path to the reward.  However, in considering how to implement that, I fell into the rabbit hole of considering what it meant to “learn” to associate the presence of hunger and the reward stimulus with actual reward, as well as to predict when hunger will strike again. To do this, a simple model based on perceptrons–the simplest type of neural network–seemed like a reasonable starting point.

I modified the experiment to use a (sparse) perceptron model with inputs of the constant bias, the direction of travel (one of the four cardinal directions), and the position (one of width * height), and the output of the new position, using one versus rest (that is, choosing the output with the highest weight) to ensure that only one output position is chosen. The lines extending from the agent indicate the adjacent locations to which it believes it can travel. Each time the agent attempts to move and either successfully reaches another location or bumps into a wall (indicated by thickening the wall segment), it adjusts its weights according to the perceptron algorithm.

Note that the actual predictions aren’t used for anything yet; I’ve basically just switched from a model that “remembers” locations to one that remembers transitions–that is, their last visited time and reward distances. In order for backtracking to work correctly, I added a special case that updates the last visited time for the reverse of the last transition. Ideally, the agent would instead learn to predict reversible actions (e.g., moving left after moving right returns to previous state). I also left out the “Distance” reinforcement mode, because inverting the perceptron model (finding appropriate sets of inputs corresponding to an output) is somewhat tricky.

Ideally, rather than providing the absolute position as an input/output, it would be a derived, internal state–that is, the agent would be aware of its relative motion and would construct a mental model of its surroundings by bumping around strategically.

Mar 052015

Here, I’ve extended the simulation by adding a basic reward mechanism. The agent explores its environment as before (preferring less recently visited nodes), but when it encounters the goal location (the plus sign), it notes the distance either along the path it took to reach the location, or from any node already visited. After being respawned at a random location, it considers the reward distance as well as the last visited time when choosing a new node. Any node known to lead to the reward will be chosen over an unknown node, which means that this simulation will retread existing paths rather than finding new, better ones. An improvement would be to add a preference or chance to explore new paths, somewhat analogous to the role of mutation in genetic algorithms (escaping from local maxima).

I should note that both this and the previous simulation assume that the agent can always perfectly identify its location (current node), and from that the neighboring nodes.

Mar 042015

As a first stab at exploring how search might emerge from simpler behaviors, I figured I’d start with a simple coloring traversal. Here, I generate a fully-connected maze using a simple algorithm: starting with all cells walled off, I merge neighboring groups of cells at random until only one group remains. The agent traverses the maze by “remembering” the last time it visited each cell and always choosing the least recently visited cell as its immediate destination.

The next step will be incorporating “reward” into the simulation and using the expected reward as another weighting factor in the choice of destination.

Feb 122015

In attempting to flesh out the concept of search as it applies to simple agents, I decided at one point that a useful model of behavior was one of “context, action, and effect”: that is, “Where am I now?”, “What can I do from here?”, and “Where will I be once I do that?”  From a graph search perspective, the contexts are the nodes of the graph, the actions are the (directed) edges, and the effects are the destination nodes.  Physical search would resemble something like a coloring traversal, in that the agent can only occupy one context (place) at a time, but will remember previously visited nodes.  Mental search, on the other hand, could involve activating more than one context at once, allowing the equivalent of breadth-first/A* searches.  The context nodes, if not also the actions, would no doubt require “fuzzy” matching of some sort: new input combinations would be compared to existing contexts, with the result of either creating a new context or adjusting an existing one.  It would also be possible to consider percentage matching of multiple nodes simultaneously, which starts to remind one of a Markov model.  Similarly, one could imagine multiple simultaneous searches–orthogonal or otherwise–as well as “templating” behaviors to combine different simultaneous contexts.  With templating to allow the inclusion of “variables,” the process starts to resemble Prolog‘s model of computation with backtracking search.

Feb 082015

The last thing I was thinking about when I left off was the emergence of search algorithms in nature and their relationship to consciousness.  The idea of consciousness as a search process is nothing new, and search is one of the most fundamental brain-controlled behaviors: witness the experiments with rats in mazes.  The questions, then, are these: How did search evolve or how does it emerge from simpler behaviors, such as reflexive reaction to the presence of food?  What is the simplest organism that can search, and what are the simplest other behaviors that require neural control?  How did physical search evolve into mental search, allowing organisms to explore routes without physically traversing them?  How did concrete mental search evolve into abstract search, where the nodes being traversed are no longer represent physical locations and actions, but rather more general concepts?  How does concrete or abstract search lead to the synthesis of new concepts, which themselves become part of the search space?

The approach I’m interested in taking is one that focuses on the core, rather than the periphery, of the cognition process.  I don’t expect or desire to solve the problem of computer vision or speech recognition, for example; I intend to assume that those problems can be solved and provide input to agents in high level symbolic form.  Similarly, I’m not interested in modelling cognition at the neuronal level, but at a level that bridges the gap between the symbolic and neuronal layers.  I’m tempted to say that I’m interested in a holistic approach, except that too often people who espouse holism are those who can’t be bothered to understand the parts of a system and thus make claims to mystical knowledge of the whole.