# Machine Learning for Hackers

**ISBN:** 978-1-449-30371 · **Publisher:** O’Reilly · **Publication Date:** February 2012 · **Pages:** 324 · **Price:** £30.99

Machine Learning for Hackers provides an introduction to, unsurprisingly, machine learning, using the statistical programming language R. For those unfamiliar with machine learning, it is a branch of artificial intelligence in which training data is used to build a model that can categorise future data; a spam filter is a typical example.
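To make the idea of "training data in, categoriser out" concrete, here is a minimal sketch of a spam filter. It is not taken from the book: it is written in Python rather than R, the naive Bayes approach with add-one smoothing is an illustrative choice, and the four training messages and the `train` and `classify` function names are invented for this example.

```python
from collections import Counter
import math


def train(messages):
    """Count word occurrences per label from (text, label) training pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()  # number of training messages seen per label
    for text, label in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals


def classify(text, counts, totals):
    """Return the label with the higher naive Bayes log-score."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    n_messages = sum(totals.values())
    best_label, best_score = None, -math.inf
    for label in counts:
        n_words = sum(counts[label].values())
        # Start from the prior: how common this label is in the training data.
        score = math.log(totals[label] / n_messages)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-made training data (not from the book).
training_data = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
counts, totals = train(training_data)

print(classify("win a prize", counts, totals))
print(classify("monday meeting", counts, totals))
```

The training step only counts words per category; the classification step then scores an unseen message against each category and picks the more likely one.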

The book covers the basics of installing and using R and statistics in general, linear regression and categorisation, non-linear data and regularisation, Principal Component Analysis, multidimensional scaling for clustering, k-nearest neighbour and Support Vector Machines for non-linear classification. Each technique and idea is introduced, applied to simple example data, and then developed into a real-world application with realistic data. The explanations and examples are clear, well written and easy to follow. Several ‘housekeeping’ techniques are also covered: error checking, and how to convert imperfectly formatted real-world data into a machine-readable format.

If you’re unfamiliar with R, the example code provided by Machine Learning for Hackers will be of very little use to you: R’s syntax is unusual, and R (along with a few popular libraries) provides many statistical functions which would need re-implementing in another language. The ‘for Hackers’ moniker denotes that the book is aimed at a certain type of programmer: one who will be happy to know how to use an algorithm, but not necessarily why or how it works. Accordingly, the book explains neither the mathematics nor the inner workings of the underlying algorithms. Whether this is a good or a bad thing depends on the reader.

If you’re looking for a solid, well-written overview of machine learning in R, Machine Learning for Hackers is a great starting point, as long as you’re willing to read around the subject (the book’s list of recommended and cited texts works well for this). Just be aware that the print version of the book is black and white, whilst the graphs were clearly designed for a colour publication. Chapter 12 is almost worthless if you cannot distinguish the points on the graphs, so either follow along by running the example code to generate your own graphs, or make use of Safari Books Online to access a PDF version.

--

Jonathan Hammler