Every now and then you hit a wealth of information. It's like striking black gold or a never-ending seam at the face of a coal mine. That seems to be the case here: I came across a tweet about some PDFs of slides for talks given by Alex Lin of Intelligent Mining.
As stated on their site, Intelligent Mining was set up both to “help people develop a clear understanding of the possibilities and challenges of modern predictive analytics techniques in online environments” and to “Create solutions to help our clients leverage their data assets and make their websites more efficient & the visitor experience more relevant. Solutions that add value to your business.” They seem to do just that.
In the Intelligent Mining knowledge base are 3 PDFs:
- Building a Predictive Model – An example of a product recommendation engine.
- Recommendation Engine Demystified – Neighbourhood based collaborative filtering.
- Probabilistic Retrieval – Incorporating Probabilistic Retrieval Knowledge into TFIDF Search Engine
The know-how in these documents won't by itself get you up and running with a recommendation engine of your very own, but it will at least put you on the right track to sniffing out the tools and building what you want.
You will need to be unafraid of maths and equations, because the PDFs are packed with them. The broader topics they cover include:
- Item- and user-orientated collaborative filtering
- Data normalisation
- Neighbourhood formation & recommendation generation
- Challenges of available data
- Best practices and end goals
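To give a flavour of how the first few topics fit together, here is a minimal sketch of user-orientated, neighbourhood-based collaborative filtering in plain Python. It is not taken from the PDFs: the toy ratings, the function names (`mean_centre`, `cosine`, `predict`) and the choice of mean-centring with cosine similarity are all my own assumptions, picked because they are a common textbook combination.

```python
from math import sqrt

# Toy ratings, invented purely for illustration: user -> {item: rating}
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 5, "d": 4},
    "cat": {"a": 1, "b": 5, "d": 2},
}


def mean_centre(user_ratings):
    """Data normalisation: subtract the user's mean rating from each rating."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return mean, {item: r - mean for item, r in user_ratings.items()}


def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict(ratings, user, item, k=2):
    """Predict the rating `user` would give `item` from similar users."""
    means, centred = {}, {}
    for name, user_ratings in ratings.items():
        means[name], centred[name] = mean_centre(user_ratings)

    # Neighbourhood formation: among users who rated the item,
    # keep the k with the highest similarity to the target user.
    candidates = [
        (cosine(centred[user], centred[other]), other)
        for other in ratings
        if other != user and item in ratings[other]
    ]
    neighbours = sorted(candidates, reverse=True)[:k]

    # Recommendation generation: similarity-weighted average of the
    # neighbours' deviations from their own means, added to the user's mean.
    weight = sum(abs(sim) for sim, _ in neighbours)
    if weight == 0:
        return means[user]  # no usable neighbours, fall back to the mean
    offset = sum(sim * centred[other][item] for sim, other in neighbours) / weight
    return means[user] + offset


if __name__ == "__main__":
    print(round(predict(ratings, "ann", "d"), 2))
```

An item-orientated version would follow the same shape with the roles swapped: compare columns of items rather than rows of users. On real data the sparse overlap between users is exactly the kind of "challenge of available data" the slides talk about.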
Anyhow, I won't take any longer by way of introduction to these useful docs. Tuck in, and hopefully you can learn a thing or two about building your own recommendation engine.
Web-head & art collector, living in East London and huffing on the fumes of the planet since '78. Here are my thoughts.
recommendation engineer Mar 5, 2011
Does it seem to anyone else like these “agnostic” recommendation engine tools are really under-developed in a theoretical sense?
The Ars Technica blog of a UNM assistant prof is the only place on the web that I’ve seen any mention of using the topology of the underlying data to improve the recommendation process. Zero academic papers on this as far as I know. And the poset topology has never been applied to rating systems as far as I can tell.