Posts Tagged machine learning

Genetic Algorithm For Hello World

Genetic Hello World

This article works through the creation of a ‘toy’ genetic algorithm which starts with a few hundred random strings and evolves towards the phrase “Hello World!”. It’s a toy example because we know in advance what the optimum solution is – the phrase “Hello World!” – but it provides a nice simple introduction to evolutionary algorithms.

I have written this article primarily for developers who have a casual interest in machine learning. I don’t talk much about the implementation of the code itself because there’s not much of interest there – the beauty of genetic algorithms is their simplicity, so the code isn’t that interesting, other than in as much as it’s not usual to do such things in JavaScript. For ‘real’ applications of genetic algorithms, I’d suggest looking into existing established frameworks for your language.
Read the rest of this entry »

, , , , , , , ,

21 Comments

A quick overview of machine learning techniques

Go straight to Falken's MazeMachine learning is a fascinating discipline. Often inspired by natural processes, it can produce astounding results in a wide range of applications. Modern web search is underpinned by ML techniques such as clustering and statistical text processing. Computer games make use of evolutionary algorithms to produce better artificial enemies. Your camera probably has face detection in it for aiding auto-focus. Machine learning is key to making our technology better and our lives easier.

Today I’m going to give a very brief and incomplete overview of machine learning technologies and applications. There are three broad types of machine learning: Categorisation, Optimisation and Prediction.

Read the rest of this entry »

, , , , , , , ,

No Comments

XKCD Colour Survey – a 3D visualisation

Randall Munroe (of XKCD) has been running a “name the colour” survey for the last few months and today released the data. The results are broken down by gender, and he makes some fascinating obvservations.

I’ve been working on a colour-related project on and off for about a year now, and part of that has involved creating visualisations of the RGB colour space – plotting red on the X axis, blue on Y, green on Z. So I thought it would be interesting to map XKCD’s 954 most common colour names in three dimensions – his charts are nice n’ all, but sexy javascript 3d is, well, sexier:

XKCD Colour Names in 3d

If you click that, you’ll be taken to the demo page, where you can interact with the cube. It’s a bit juddery, working entirely with divs and DOM manipulation, rather than canvas. Here’s a view from the side angle that shows off the “3d-ness” a bit more:

Of course, a canvas is really a better solution, so here’s a version that renders onto canvas, which is much much faster, but currently omits the names:

Read the rest of this entry »

, , , , ,

15 Comments

The K-Means Clustering Machine Learning Algorithm

The k-means clustering algorithm is one of the simplest unsupervised machine learning algorithms, which can be used to automatically recognise groups of similar points in data without any human intervention or training.

The first step is to represent the data you want to group as points in an n-dimensional space, where n is the number of attributes the data have. For simplicity let’s assume we just want to group the ages of visitors to a website – a one-dimensional space. Let’s assume the set of ages is as follows:

{15,15,16,19,19,20,20,21,22,28,35,40,41,42,43,44,60,61,65}

Now there are a number of ways we could separate these points out; one obvious way that springs to mind is simply to iterate over the set and find the largest gap between adjacent ages. For some sets this can work quite nicely, but in our case it would give us quite unbalanced groups; 15-44 and 60-65, the latter containing just 3 points, and the former being far too broadly distributed.

Using k-means clustering we can obtain more tightly defined groups; consider the set {1,2,3,5,6,9} – using the simplistic greatest distance technique we gain two sets {1,2,3,5,6} and {9}, while the k-means algorithm produces a grouping of {1,2,3} and {5,6,9} – more cohesive clusters with less dispersion of points inside the groups – k-means tries to minimise the sum of squares within a cluster:

Notice how in the first set of clusters the outermost points of the first cluster are quite far away from the middle of the cluster, while in the second set, the points in both clusters are closer the center of their groups; making these clusters more well-defined, less sparsely populated.

Now let’s have a look at the algorithm and work through the example age data to see if we can get a tighter grouping than 15-44 and 60-65 using clustering:

Read the rest of this entry »

, , , ,

10 Comments

Adaptive Web Sites

(this is a slightly expanded transcript of a talk I gave at Oxford in June 2009 about my work there)

Hi! I’m Howard Yeend, my supervisor is Vasile Palade, and the title of my project is:

Implementing Adaptive Web Sites using Machine Learning and Ajax“.

But before I talk about what all those buzzwords mean, I’d like to give a little background information about why this is an important research area, and why I feel it’s the right project for me.

When I was trying to think of a project title, I had a question in mind:

How can we improve the web?

And I think that’s a hugely important question for us to ask.

Read the rest of this entry »

, , , , , , , ,

14 Comments