Big Data

Front page of Practical Recommender Systems

For a computer scientist like me, the world of IT is such an exciting place! Since I started at university, I have seen the creation of companies like Amazon and Google, and later Netflix. They were certainly lucky to be in the right place at the right time, but it is ingenuity that has kept them in the market. What they did is a long story, but what I find interesting is that they have taken large quantities of content and made it accessible to the masses.

One of the advantages of being an internet business is that you are not limited by physical walls like a traditional shop, so your list of products can be close to never-ending. If a physical store were truly that vast, customers would struggle to find anything and simply get lost. They would probably go to the shop next door, which has fewer products, and buy things that are not exactly what they wanted but are easy to find.

Offering lots of content does not ensure success, not even if you have precisely what your users want. Often 20% of your content will produce 80% of your business. If you can match the remaining 80% of the content with your users, you will have more happy users and more business. The problem of activating that remaining 80% of the content is called the long tail problem.
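To make the 80/20 split concrete, here is a minimal sketch that measures how much of the business comes from the best-selling fifth of a catalogue. The sales figures and the `head_share` helper are made up for illustration; they are not taken from any real shop.

```python
def head_share(sales, head_fraction=0.2):
    """Return the fraction of total sales produced by the top `head_fraction` of items."""
    if not sales:
        raise ValueError("sales must not be empty")
    ranked = sorted(sales, reverse=True)          # best sellers first
    head_size = max(1, round(len(ranked) * head_fraction))
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:head_size]) / total


# Hypothetical unit sales for a ten-item catalogue (invented numbers).
sales = [500, 300, 60, 40, 30, 25, 20, 15, 7, 3]

share = head_share(sales)
print(f"Top 20% of items produce {share:.0%} of sales")
```

With these invented numbers the two best sellers account for 80% of sales, and the other eight items are the long tail that is waiting to be matched with the right users.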

One way to make the content more accessible to users is to add a recommender system to your site. It attempts to predict what your customers want and serve it to them.

Implementing a recommender system is an intriguing task. The actual algorithms, such as collaborative or content-based filtering, are just a small part of it. If you do not feed the algorithm the right data, it will not produce anything worth looking at. Relying on user ratings will often not produce the results that users want. Context is often worth thinking about too. And when it is all implemented and running, how do you know that it is working? How do you measure improvements?

I never found a book answering these questions. I found lots of good books explaining how to implement the algorithms mentioned above, but never one that also described everything around them. So I started working on one. It just came out in an early release at Manning.

Go and have a look, the first chapter is free!

Manning.com/falk.

Everybody is talking about Big Data, and everybody is saying that they will soon have a version ready that will utilize the heaps of data piling up in the databases around us. But what is actually possible to achieve with it? Some say EVERYTHING, others are a bit more sceptical and think:

it’s being paraded around as a magic bullet, raising unrealistic expectations that will surely be disappointed. – Cathy O’Neil and Rachel Schutt in “Doing Data Science”

In my opinion, Big Data can be used for many things, but as with everything that uses statistics, you should remember that correlation does not imply causation: just because something happens right after something else does not mean that one is a reaction to the other. Manipulated the right way, data can prove almost any thesis, and its opposite. It is exciting to search for patterns or structure in the sea of data, to seek out information which no man has seen before, but always be careful and sceptical, especially when the results look too good.
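A small simulation shows how easily this happens. In the sketch below, everything is invented for illustration: temperature drives both ice cream sales and sunburn cases, and neither of the two causes the other, yet they come out strongly correlated. The variable names, the coefficients and the `pearson` helper are all assumptions of this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hidden common cause: daily temperature over 200 made-up days.
temperature = [rng.uniform(10, 35) for _ in range(200)]

# Both series depend on temperature plus independent noise,
# but not on each other.
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
sunburn_cases = [2 * t + rng.gauss(0, 5) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"Correlation between ice cream sales and sunburn cases: {r:.2f}")
```

The correlation is high even though selling ice cream does not give anyone a sunburn. Looking only at the two series, without knowing about the temperature, it would be tempting to draw exactly that conclusion.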

I think it's interesting that people can analyse data and find that children perform better in school when they eat breakfast every day[1], but personally I am more into prediction. Whether it is recommending good books, predicting earthquakes or finding pregnant women from their shopping habits[2], it is incredibly cool.

I have worked with recommendation systems and studied machine learning at university, and I now work with it. I will always try to collect new ideas and learn more, which I intend to write about here. Many of them can also be found in Danish at QED.dk.

My hope is that this blog can be a place where people come to read interesting new posts on Big Data and Machine Learning. Please also add to the discussion, in the comments or as a guest blogger.

Thank you for reading this, hope to see you again!

Notes from A Curious Mind

Introducing Practical Recommender Systems

Everybody is Talking About Big Data.