A quick overview of machine learning techniques

Machine learning is a fascinating discipline. Often inspired by natural processes, it can produce astounding results in a wide range of applications. Modern web search is underpinned by ML techniques such as clustering and statistical text processing. Computer games make use of evolutionary algorithms to produce better artificial enemies. Your camera probably has face detection in it for aiding auto-focus. Machine learning is key to making our technology better and our lives easier.

Today I’m going to give a very brief and incomplete overview of machine learning technologies and applications. There are three broad types of machine learning: Categorisation, Optimisation and Prediction.

Read the rest of this entry »


Poking holes in PHP object privacy

PHP provides a decent model of class member visibility, with public, private, and protected members to help you define tight APIs for your objects and show other developers how your object is supposed to be used. But used naively, PHP’s ‘magic methods’ can easily and subtly subvert this system, making everything public.

If you’re still new to object oriented programming in PHP5, think of “public” as roughly analogous to “my function’s arguments” and “private” as “local variables inside the function”. You wouldn’t want someone calling your function and messing with the local vars, and you wouldn’t want someone using your object messing with its private members.

Magic methods provide functionality like catching references to methods and properties which are not visible to the caller, and doing special things with them. Magic methods have always struck me as a bit weird, and whenever you bring them up in discussions online, there are always a few people with reservations about them – efficiency, clarity, use-cases and so on.

I’m still in two minds; they can be useful in some circumstances, but here’s one reason why they could be considered harmful: used carelessly, they can easily enable an OOP antipattern where all class members become public, even those declared as private or protected in the class definition.

Read the rest of this entry »
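
To make the antipattern concrete, here is a minimal sketch of a careless __get/__set pair. The Account class and its $balance member are invented for this illustration and are not taken from the full post; the only point is that forwarding every property name from inside the class sidesteps the private declaration.

```php
<?php
// Hypothetical class, made up for this sketch.
class Account
{
    private $balance = 100;

    // Naive "convenience" accessor: forwards ANY property name,
    // including ones declared private, because it runs inside the class.
    public function __get($name)
    {
        return $this->$name;
    }

    // Same problem in the other direction: outside code can now write
    // to private members.
    public function __set($name, $value)
    {
        $this->$name = $value;
    }
}

$account = new Account();
echo $account->balance, "\n"; // reads the private member via __get
$account->balance = -5000;    // overwrites it via __set
echo $account->balance, "\n";
```

Without the two magic methods, both accesses would be fatal errors. A safer variant would probably check $name against an explicit whitelist before forwarding anything.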


Six years of blogging – lessons learned

I’ve had over 1.5 million visits and over 2 million page views, earned nearly $3000 from AdSense, and been linked to from W3C and xkcd, and I look to the future of puremango with optimism. But it wasn’t always this way. The journey started with all the counters on zero.

Six years ago today I registered my first domain name as a place to dump my code and generally show off my spare-time projects, silly little pieces of code that had no real business traction and so wouldn’t see the light of day at work. I wrestled with a few cringe-worthy domain names (stuff like PHPPro.com? Leetcoder.com?) before I decided that going abstract was the way forward, and PureMango.co.uk was born.

So I thought it would be interesting to review the lessons learned over the last six years as a blogger. If you’re a blogger just starting out, or you’re thinking of starting a blog, or even if you’ve got a blog that you don’t really keep up with, I hope you’ll be able to learn from my mistakes, and perhaps even be inspired by my moderate success.

Read the rest of this entry »


Ten Ideas

Being a hacker is all about the open sharing of ideas. So why do I keep my list of ‘projects in development’ so close to my chest? Inspired by tales of R&D departments with security measures the military would weep at? Enchanted by the notion that my ideas are worth millions, I just need to unleash them, then sit back and watch the cash roll in? Yeah, that’s pretty much it!

Yep, until very recently I was an idea hoarder. But inspired by Jacques Mattheij’s recent outpouring of his ideas, I’ve changed my attitude. I’m in good company – the folks at ycombinator have shared their list of “ideas we want to fund“, the people at halfbakery.com have an entire social ecosystem based around sharing ideas, and the Six Month MBA team have listed a whopping 999 business ideas for anyone to pick up and use.

Why share my ideas? Ideas are often said to be worthless until implemented. I’d objected to that sentiment in the past, being a big ideas person. But now I can see there’s truth in it – a bad idea implemented excellently will trump a good idea implemented poorly, and as Paul Graham says: “imaginative people will take (the ideas) in directions we didn’t anticipate”, and “No matter what your idea, there’s someone else out there working on the same thing”. Sharing something multiplies its value.

I encourage you to share your ideas with the community too, because:

  1. Someone’s probably already thought of it anyway – no need to keep it secret
  2. You haven’t done anything with it yet – so maybe you’re not the right person to bring it forward
  3. Inspiring others benefits everyone – let’s talk about these ideas, and create new ones
  4. You’re not as clever as you’d like to think – others can see problems and opportunities that you can’t
  5. Sharing ideas can kickstart the product – if everyone says “wow I like this”, then you know what to do

So without further ado, ten ideas I’m thinking about:

Read the rest of this entry »


Overloading in PHP

Murray Picton wrote up a blog post today on overloading functions in PHP. Overloading is a useful feature of many languages. Murray gives a nice definition in his post:

Overloading a function is the ability to define a function more than once with a different set of parameters for each one and then when it is called, the version of the function that matches the parameter set will be executed.

So you can define function foo(String $a) and foo(Array $a), and a different version will be called depending on how you call it. But PHP’s own idea of overloading is completely different and not really related.

Hello ladies, look at PHP, now back to Java, now back to PHP. Sadly, PHP’s not Java, but it can look (a bit) like Java if you stop using lady-scented functions and start using method overloading.

I just wanted to get that quote in somehow.

Murray’s solution is to use func_get_args inside your function, and perform some logic to switch execution to a different branch depending on what we find there. Something like:

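function foo() {
    $args = func_get_args();
    // func_get_args() always returns an array, so test the first argument itself
    $first = isset($args[0]) ? $args[0] : null;
    if (is_string($first)) { /* do some stuff */ }
    else if (is_array($first)) { /* do some stuff */ }
}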

This works well enough, but there are a few things I don’t like about it:

  1. You have to put all the code into one function, instead of literally overloading multiple functions
  2. The “which version am I” code and the actual functionality are intermixed inside the function

I thought an OO solution might mitigate these things, allowing us to write completely separate functions which can be called by the same name, with a different one being executed depending on the arguments. I’m not certain my solution is an improvement over Murray’s, but it’s a different approach if nothing else:

Read the rest of this entry »
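
The full solution is in the complete post. Purely as an illustration of the general idea, and not necessarily the approach taken there, a dispatcher built on PHP’s __call magic method might look something like this. The Overloadable and Greeter classes and the foo_string / foo_array naming convention are all assumptions made up for this sketch.

```php
<?php
// Hypothetical base class: routes a call like foo('x') to foo_string('x'),
// foo(array(...)) to foo_array(...), and so on.
abstract class Overloadable
{
    public function __call($name, $args)
    {
        // Build a suffix from the argument types, e.g. "string" or "array_integer".
        $types = array();
        foreach ($args as $arg) {
            $types[] = gettype($arg);
        }
        $target = $name . '_' . implode('_', $types);

        if (!method_exists($this, $target)) {
            throw new BadMethodCallException(
                "No version of $name() accepts (" . implode(', ', $types) . ")"
            );
        }
        return call_user_func_array(array($this, $target), $args);
    }
}

class Greeter extends Overloadable
{
    // Each "overload" is a completely separate method.
    // They are public here only to keep the sketch simple.
    public function foo_string($a)
    {
        return "string version: $a";
    }

    public function foo_array($a)
    {
        return 'array version: ' . count($a) . ' items';
    }
}

$g = new Greeter();
echo $g->foo('hello'), "\n";         // dispatched to foo_string()
echo $g->foo(array(1, 2, 3)), "\n";  // dispatched to foo_array()
```

The trade-off is that the "which version am I" logic lives in one place, at the cost of an extra method lookup on every call and method names that encode their argument types.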


WordPress Performance Tips

WordPress is an excellent piece of software. It helps over 27 million people publish their blogs, and people view pages on WP-hosted blogs over 2.1 billion times a month (source). I use WordPress on all my blogs. But there is growing evidence that page load times are a large contributing factor to bounce rate – those people who close your site before it’s even finished loading. As Google have recently shown, the future is instant, and if your blog is taking 7 seconds to load, that could be 6.5 seconds too long for some visitors.

This blog post will teach you how to optimize WordPress performance and keep your visitors more engaged. I’ll take you through each step as simply as I can, and the tips are largely independent of the theme you’re using.

Setting Up – Profile Your Site.

Read the rest of this entry »


JS1k Winners – Top Ten Entries

So, the JS1k contest is over and the winners are finally in! What a fantastic event this has been. Massive thanks to all the organisers and judges, and to all the entrants for putting on a great show. I can’t think of a single entry I wouldn’t have been proud to have written myself, and some of the entries were simply amazing.

The judges did a brilliant (and difficult!) job of ranking the entries and choosing their top ten. I’ve compiled the entries into a list below, and you can click through to the demos themselves from here. I read that the official site will be updated tonight, so check back at js1k.com for the full scoop :)

If this is the first you’ve heard of the contest, head over to js1k.com and browse through all the entries. I also compiled a list of all the tweet-sized entries, as I’m buying a copy of Douglas Crockford’s JavaScript: The Good Parts as a prize for the best tweetable entry. Have a look at my own tweetable entries too.

Read the rest of this entry »


Why Diaspora won’t beat facebook, but you might.

“Diaspora, the ‘anti-Facebook’, is doomed”, says Milo Yiannopoulos of the Telegraph.

I agree, but not for the reasons Milo gives. He says Diaspora will fail because:

  1. Facebook’s really big.
  2. Zuck’s invested in it.
  3. There’s a lot of competition.
  4. Facebook’s well funded.
  5. Diaspora hasn’t launched yet.

I think those are simply the problems that any startup is bound to face – indeed, the situation was largely the same when Facebook launched against the giant that was MySpace, or when Microsoft decided to enter the same market as IBM.

But the topic itself is an interesting one. An investigation into how a startup might challenge Facebook’s dominance will doubtless reveal some insights into more general approaches that web endeavors can take to sweep away their competitors.

Read the rest of this entry »


Complete list of #JS1k tweet-sized entries

The #JS1k contest ended last night. It challenged web coders to write some interesting JavaScript in 1 kilobyte (1024 bytes) or less. That script is put into a basic html page which includes a canvas element and not much else. It’s a pretty crazy challenge – 1k is not a lot of code – and there are some really clever micro-optimisations going on in some of the entries.

But the homepage also states “Bonus points if your submission fits in one tweet ;)“. Now that’s a whole other level of madness. Useful code in 140 bytes? You’ve barely enough room to find the canvas element, let alone do anything with it!

So I decided to make an (unofficial) list of all the entries that are or claim to be tweet-sized. The current js1k homepage doesn’t offer a way to filter entries by size (hint, hint!), so I think this is pretty useful. I listed my own tweetable entries in an earlier post.

Also, I’m giving a copy of Douglas Crockford’s “JavaScript: The Good Parts” as a prize for the entry that the judges deem the best of the tweetable ones. I think it’s a fitting prize, packing a whole lot of awesome into a thin package ;)

Read the rest of this entry »


Google Instant Launches

Kapow! And the web wakes up and scrambles over itself to play with Google’s latest, pretty bold, innovation in search: search-as-you-type, or in Google-speak “Google Instant”.

The new feature combines autosuggest with realtime search. The video below demos the functionality, and it’s live right now on the search giant’s homepage if you’re signed in to your Google account.

Here are some quick and dirty first impressions:

Read the rest of this entry »
