eZFind Solr/Lucene.....Mahout?


Conrad Decker

Monday 31 January 2011 2:10:47 pm

Hey all, I've recently had a request for some information on a site that would be used as a recommendation engine. The recommender that we're currently looking at for providing the underlying engine is Mahout. From what I'm able to gather (and please feel free to speak up if I'm incorrect), eZFind uses the Solr/Lucene search library. The Mahout project used to be a subproject of the Lucene project, but is now a Top Level ASF project. Seeing that it used to be a subproject, I was curious if any of the eZFind functionality taps into the recommendation side of things.

I'm a complete rookie when it comes to this stuff, and after spending a few hours perusing the interwebs and looking through some of the eZPublish documentation, I figured I'd reach out to the community to see if there is any experience with this sort of implementation.

I've spent some time going through the documentation for eZFind, and it all seems to make sense. I also understand that it's possible to communicate with Mahout via HTTP, so perhaps this is easier than I had initially anticipated. I'm just curious whether any of this is already tied into existing eZPublish functionality, or whether I'm looking at writing it from scratch.

Any and all help is greatly appreciated. Thanks!

Ivo Lukac

Tuesday 01 February 2011 3:28:46 am

Hi,

A few months ago I was interested in something like this but haven't had time to look deeper. What I did verify is that the eZFind Solr/Lucene index can easily be used as Mahout input. The question is whether eZFind holds all the information you need...

Anyway, an eZ partner, http://yoochoose.com, gave a presentation about a recommendation extension they are building at the moment (using Mahout as the engine). So you can try contacting them. And give us more feedback here :)

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Conrad Decker

Tuesday 01 February 2011 7:58:53 am

Awesome, at least that's a start in the right direction...thanks for the input Ivo. I'm going to do some more research and will be sure to update this posting with any findings I make.

If anyone else has any further info it would certainly be appreciated. I'm surprised this isn't something that has been tackled before with eZPublish but hopefully we can make some headway with it.

Thanks!

Ivo Lukac

Tuesday 01 February 2011 8:21:50 am

"

I'm surprised this isn't something that has been tackled before with eZPublish but hopefully we can make some headway with it.

"

Well, it is not exactly basic stuff. Mahout is not a very easy tool; you need some math background to use it.


Conrad Decker

Tuesday 01 February 2011 9:07:07 am

Haha...very true! It's far from simplistic...but the eZ Community never ceases to amaze me :)

Ivo Lukac

Wednesday 02 February 2011 3:55:52 am

Found the link regarding Solr -> Mahout integration:

http://www.lucidimagination.com/blog/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/


Conrad Decker

Wednesday 02 February 2011 6:19:35 am

Perfect! I'll take a look, and will let you know how I make out.

Thanks so much for your help Ivo.

Michael Friedmann

Wednesday 02 February 2011 10:31:51 am

Hi all,
I just want to add some comments from my side (YOOCHOOSE) to clarify some of the topics discussed.

First, we are currently developing a module that generates recommendations and integrates them into an eZ installation. This will be a SaaS solution that is integrated via HTTP (RESTful API). You therefore do not have to struggle with a Mahout installation, or even with eZFind at all, to use recommendations, and no math professors are needed either. :-)

Second, as you already discovered, the main problem in getting valuable recommendations is dealing with the underlying math and statistics, combined with the need to do some heavy computation. Our service addresses exactly this problem. We pull the data and usage information out of an eZ installation and calculate the recommendations on our machines. So our solution is not intended to compete with Mahout or to be the next open source project in the area of recommender systems; it is a service.

Third, we did try to use Mahout (and we keep on trying :-)), but we currently do most of the work in our own implementation. We also see that a lot of effort is currently being put into Mahout on the Apache side, but I don't see that there will be a full recommender service at the end of this development. From my point of view you will still need to put in a lot of effort to get it running and integrate it into your environment.

If you have any questions, do not hesitate to contact me directly. You'll find contact details on our web page: www.yoochoose.com

Ted Dunning

Wednesday 02 February 2011 12:31:54 pm

"

Third, we did try to use Mahout (and we keep on trying :-)), but we currently do most of the work in our own implementation. We also see that a lot of effort is currently being put into Mahout on the Apache side, but I don't see that there will be a full recommender service at the end of this development. From my point of view you will still need to put in a lot of effort to get it running and integrate it into your environment.

"

The Mahout community is very responsive to questions. Check us out on the mailing lists. You can find information at http://mahout.apache.org.

It is true that Mahout requires some integration effort, but it does include an end-to-end recommendation system. You have to define what goes in and what goes out and there are a number of parameters to select as well. Once you do that, however, you should have a pretty performant and flexible system. Certainly I would be surprised if you could get something similar working in a short period of time unless you restrict yourself to small data and a very limited set of algorithms.
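To make "what goes in and what goes out" a little more concrete, here is a minimal sketch of a user-based recommender in plain Python. It does not use Mahout or its API; the toy data, the function names, and the two parameters (`neighborhood_size`, `how_many`) are all assumptions made up for illustration. The input is a list of (user, item, preference) triples and the output is a ranked list of items the user has not seen yet.

```python
from collections import defaultdict
from math import sqrt

# Input: (user, item, preference) triples. Hypothetical toy data.
PREFERENCES = [
    ("alice", "article-1", 5.0),
    ("alice", "article-2", 3.0),
    ("bob", "article-1", 4.0),
    ("bob", "article-2", 3.0),
    ("bob", "article-3", 5.0),
    ("carol", "article-2", 1.0),
    ("carol", "article-4", 4.0),
]


def build_profiles(triples):
    """Group the triples into one {item: preference} dict per user."""
    profiles = defaultdict(dict)
    for user, item, value in triples:
        profiles[user][item] = value
    return profiles


def cosine_similarity(a, b):
    """Cosine similarity between two sparse preference dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(profiles, user, neighborhood_size=2, how_many=3):
    """Output: up to `how_many` (item, estimated preference) pairs."""
    target = profiles[user]
    # Keep only the most similar users as the neighborhood.
    neighbors = sorted(
        ((cosine_similarity(target, other), name)
         for name, other in profiles.items() if name != user),
        reverse=True,
    )[:neighborhood_size]

    scores = defaultdict(float)
    weights = defaultdict(float)
    for similarity, name in neighbors:
        if similarity <= 0:
            continue
        for item, value in profiles[name].items():
            if item not in target:  # only recommend unseen items
                scores[item] += similarity * value
                weights[item] += similarity

    # Similarity-weighted average preference, highest first.
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [(item, round(score, 2)) for score, item in ranked[:how_many]]


profiles = build_profiles(PREFERENCES)
print(recommend(profiles, "alice"))
```

Even in a toy like this, the choice of similarity measure and neighborhood size changes the results, which hints at why parameter selection and scaling to real data take real effort.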

Sebastian S

Thursday 03 February 2011 12:47:00 am

You will find a bunch of people on [email protected] that have experience in running the mahout recommender system in production systems. Feel free to turn to us and we will help with advice and patches.

Nicolas Pastorino

Thursday 03 February 2011 12:48:22 am

"

You will find a bunch of people on [email protected] that have experience in running the mahout recommender system in production systems. Feel free to turn to us and we will help with advice and patches.

"

Thanks for stopping by, Sebastian!
Long live the Apache Foundation,

Cheers,

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Conrad Decker

Thursday 03 February 2011 5:50:32 am

Wow...thank you Ted & Sebastian both for reaching out to us here. You can definitely plan on hearing from me in the near future as I continue my trek down the Mahout road :)

Thanks for your help!
