Indexing ALL MS Office files?

Jean-Yves Zinsou

Friday 13 March 2009 1:48:51 am

Hello dear eZ community,
I have been searching around and was not able to find a clear answer to my question.

Can someone tell me whether they have succeeded in indexing all MS Office files, including docx files? Or point me in the right direction?

Note: the eZ Publish installation will be hosted on my client's server, which runs Windows :-(

Do Androids Dream of Electric Sheep?
I dream of eZpubliSheep....
------------------------------------------------------------------------
http://www.alma.fr

Paul Borgermans

Friday 13 March 2009 8:34:32 am

You may have a look at http://projects.ez.no/eztika

There are currently some problems with CJK documents, though.

hth
Paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Jean-Yves Zinsou

Friday 13 March 2009 8:47:13 am

Thanks a lot, Paul.
What does CJK mean?

Paul Borgermans

Friday 13 March 2009 9:53:20 am

CJK stands for Chinese, Japanese, Korean. There are some known issues with those font sets, and probably with all Asian languages (I have only tested CJK so far).

For indexing CJK PDF files, it is best to use xpdf through a wrapper script (or .bat file) that you configure in binaryfile.ini. The wrapper should contain the following:

<path to>pdftotext -enc "UTF-8" $1 -
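
For anyone who wants a slightly more defensive wrapper, here is a minimal sketch of the same one-liner as a POSIX shell script. It is not taken from the eZ Publish or eztika documentation. The PDFTOTEXT variable and the extract_text function are names made up for this example, and the script assumes pdftotext is on the PATH unless PDFTOTEXT points elsewhere. On Windows the wrapper would be a .bat file instead, where the first argument is written %1 rather than $1.

```shell
#!/bin/sh
# Sketch of a pdftotext wrapper script.
# PDFTOTEXT is a hypothetical override variable (an assumption of this
# example); when it is unset, the binary is looked up on PATH.
PDFTOTEXT="${PDFTOTEXT:-pdftotext}"

# Write the text of the PDF given as $1 to standard output in UTF-8.
extract_text() {
    if [ -z "$1" ]; then
        echo "usage: extract_text <file.pdf>" >&2
        return 2
    fi
    # A trailing "-" tells pdftotext to write to stdout instead of a file.
    "$PDFTOTEXT" -enc UTF-8 "$1" -
}

# Run only when the script was called with a file argument.
if [ "$#" -gt 0 ]; then
    extract_text "$1"
fi
```

Keeping the binary path in a variable means it can be changed without touching the command line itself, and quoting "$1" keeps file names with spaces intact.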

hth
Paul

Jean-Yves Zinsou

Friday 13 March 2009 10:25:11 am

Thanks a lot, Paul!

You made my day! ;-)
