A quick overview of machine learning techniques


Machine learning is a fascinating discipline. Often inspired by natural processes, it can produce astounding results in a wide range of applications. Modern web search is underpinned by ML techniques such as clustering and statistical text processing. Computer games make use of evolutionary algorithms to produce better artificial enemies. Your camera probably has face detection in it for aiding auto-focus. Machine learning is key to making our technology better and our lives easier.

Today I’m going to give a very brief and incomplete overview of machine learning technologies and applications. There are three broad types of machine learning: Categorisation, Optimisation and Prediction.

Categorisation techniques allow you to group things with similar attributes together. For example, if gender, eye colour and bank balance are attributes of people, we can plot each person as a point in a three-dimensional space and automatically identify clusters of people with similar attributes. Perhaps we’d find that blue-eyed people are generally richer. Categorisation can reveal traits and connections in your data that you hadn’t even thought about before. Once we’ve identified clusters, we can also assign new people to them quickly. So when a new person arrives, we can see whether they’re closer to the ‘rich and blue-eyed’ box or the ‘poor and male’ box. Imagine that we could cluster web pages, with topics as the dimensions. Then when we see a new web page we can quickly assign it a likely topic. I’ve already spoken about one categorisation technique called k-means clustering.
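To make the "assign a new person to the nearest cluster" idea concrete, here is a minimal sketch in Python. It isn't my PHP or JavaScript implementation; the data, the two numeric attributes (age and bank balance) and the function names are all made up for illustration, and the starting centroids are picked by hand rather than at random.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def k_means(points, centroids, iterations=10):
    """Refine the starting centroids by repeating assign-then-average."""
    for _ in range(iterations):
        # Step 1: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its members
        # (an empty cluster keeps its old centroid).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids

# Hypothetical people: (age, bank balance in thousands of pounds).
people = [(25, 2), (30, 3), (28, 1), (55, 40), (60, 45), (58, 50)]
centroids = k_means(people, centroids=[(20, 0), (70, 60)])

# A new person arrives: which cluster are they closest to?
new_person = (57, 42)
cluster = nearest_centroid(new_person, centroids)
print(cluster)
```

With this toy data the new person lands in the second cluster (index 1), the older and richer group. Real data would need the attributes scaled to comparable ranges first, otherwise the one with the biggest numbers dominates the distance.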

Optimisation techniques seek to maximise (or minimise) some value. For example, you could write a system which varies the position, size and colour of adverts on a webpage, trying to maximise the click-through rate. Or you could try to work out an optimal poker strategy based on information about the hand and other factors. Genetic algorithms provide one optimisation technique. Others include simulated annealing and hill-climbing, among many more.
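Hill-climbing is probably the simplest of these to sketch: start somewhere, look at the neighbouring options, move to the best one, and stop when nothing nearby is better. The Python below is a rough illustration only. The `pretend_click_rate` function is entirely invented (a real system would have to measure clicks from live traffic), and an advert is reduced to two made-up numbers, a position and a size.

```python
def hill_climb(score, start, step=1, max_steps=1000):
    """Greedy hill-climbing over tuples of numbers, maximising score."""
    current = start
    for _ in range(max_steps):
        # Neighbours: nudge each dimension up or down by one step.
        neighbours = []
        for i in range(len(current)):
            for delta in (-step, step):
                candidate = list(current)
                candidate[i] += delta
                neighbours.append(tuple(candidate))
        best = max(neighbours, key=score)
        if score(best) <= score(current):
            break  # no neighbour improves on where we are: a local maximum
        current = best
    return current

def pretend_click_rate(advert):
    """Hypothetical stand-in for a measured click-through rate.

    It peaks when position is 40 and size is 12; both numbers are invented.
    """
    position, size = advert
    return -((position - 40) ** 2) - (size - 12) ** 2

best_advert = hill_climb(pretend_click_rate, start=(0, 0))
print(best_advert)
```

Because the pretend function has a single peak, the climb finds it. On a bumpier landscape plain hill-climbing can get stuck on a small hill, which is the problem simulated annealing and genetic algorithms try to work around.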

Prediction techniques involve training a system on many examples of data with known outcomes, then presenting it with an unseen example and asking it to predict the outcome. For example, we could give such a system data about the number of people who visit our website, how long they stay for, where they came from, what time of day they visit and so on, along with the revenue for that day. Then on a new day we could try to predict what the revenue would be. The sports industry is also ripe for prediction software: train the system on past football matches to predict the result. Artificial neural networks are one way to predict outcomes given training. For example, the folks at 20q.net have produced a handheld machine which very accurately plays twenty questions using an artificial neural network.
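A neural network is too much to show here, so as a stand-in here is the train-then-predict pattern using something far simpler: a straight line fitted by least squares. This is a swapped-in technique, not a neural network, and the visitor and revenue figures are invented purely for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to the training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training data with known outcomes (made-up numbers):
# daily visitors and that day's revenue in pounds.
visitors = [100, 150, 200, 250, 300]
revenue = [20.0, 31.0, 39.0, 52.0, 58.0]

slope, intercept = fit_line(visitors, revenue)

# An unseen example: a new day with 220 visitors.
predicted_revenue = slope * 220 + intercept
print(round(predicted_revenue, 2))
```

The shape is the same whatever the model: fit on examples with known outcomes, then apply the fitted model to an example it has not seen. A neural network replaces the straight line with something that can capture far more complicated relationships, and can take many inputs at once.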

I’ve already implemented k-means clustering in PHP and JavaScript, and I’ve also written a very basic genetic algorithm in JavaScript which I’ll blog about when I’m a bit happier with the code. One day I hope to move on to neural networks and make a bajillion pounds predicting horse racing results ;-)

