Saturday 04 October 2003 8:25:48 am
Hi Marco,

Search engine indexing is an issue that keeps eZ publish from working well as a DMS or for any large web site. The ranking (relevance) in particular is not up to the level actually required, and users DO rely on global search. The eZ crew is certainly aware of this, but I don't know what the future will bring (apparently only a few people are asking for binary file indexing, for instance). I haven't played with OpenFTS yet.

Below are my hints (though I have not done it myself yet; PowerPoint is the most urgent for me).
---------powerpoint and excel--------
For PowerPoint and Excel files, you may try http://chicago.sourceforge.net/xlhtml/ (the PowerPoint conversion is included in the xlhtml archive). You will also need lynx to do the HTML-to-text conversion, and you will have to wrap everything in a shell script to be called by the binary file handler. The same goes for zipped versions.
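To make the "converter, then lynx" chain concrete, here is a minimal sketch. It is written in Python rather than as a shell script, purely so the dispatch logic is easy to read; the handler would still just call it as an external program. Assumptions: the function names, the extension table and the exact lynx options are mine, and I am assuming that xlhtml and ppthtml (both from the xlhtml archive) write HTML to standard output when given a file name.

```python
import subprocess
from pathlib import Path

# Assumption: both tools from the xlhtml archive print HTML on stdout.
HTML_CONVERTERS = {
    ".xls": "xlhtml",
    ".ppt": "ppthtml",
}

# lynx reads HTML from stdin and dumps it as plain text (no link list).
LYNX_TO_TEXT = ["lynx", "-stdin", "-dump", "-force_html", "-nolist"]


def build_pipeline(path):
    """Return the two commands (HTML converter, lynx) for a given file."""
    suffix = Path(path).suffix.lower()
    if suffix not in HTML_CONVERTERS:
        raise ValueError(f"no converter configured for {suffix!r}")
    return [[HTML_CONVERTERS[suffix], str(path)], LYNX_TO_TEXT]


def extract_text(path):
    """Run 'converter | lynx' and return the plain text for indexing."""
    to_html, to_text = build_pipeline(path)
    html = subprocess.run(to_html, check=True, capture_output=True).stdout
    result = subprocess.run(to_text, input=html, check=True, capture_output=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text("slides.ppt")` needs ppthtml and lynx on the PATH; `build_pipeline` can be checked without either installed.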
--------msword--------
I'm surprised wvware does not work for you, since I got it running nicely. Does it work on the command line? I first had to make sure the right XML config was actually in place (on SuSE Linux). Do you have the most recent version?
--------zipped files--------
Add a CPU or two and wrap unzip and gunzip in a shell script.
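As a rough illustration of the unpacking step, the sketch below uses Python's standard zipfile and gzip modules instead of calling unzip and gunzip from a shell script; the idea is the same. The function name `unpack` and the choice of a destination directory are my own assumptions, and the extracted files would still have to be handed to whatever converter fits each of them.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def unpack(archive, dest_dir):
    """Unpack a .zip or .gz file into dest_dir and return the extracted files."""
    archive, dest_dir = Path(archive), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest_dir)
            # Skip directory entries, keep only real files.
            names = [n for n in zf.namelist() if not n.endswith("/")]
        return [dest_dir / n for n in names]

    if archive.suffix.lower() == ".gz":
        target = dest_dir / archive.stem  # report.doc.gz -> report.doc
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return [target]

    raise ValueError(f"not a zip or gzip file: {archive}")
```

Unpacking to a scratch directory keeps the original upload untouched, which matters if the indexer runs on every publish.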
---openoffice-----
There are some XSLT filters included, which should work after unzipping the document.
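If the XSLT route gives trouble, a cruder alternative (not the XSLT filters mentioned above, just a substitute technique) is to pull the text straight out of content.xml, which is where an OpenOffice document keeps its body inside the zip container. The function name is hypothetical, and this sketch ignores styles, tables and metadata entirely; it only collects text nodes for the index.

```python
import zipfile
import xml.etree.ElementTree as ET


def openoffice_text(path):
    """Return the text found in content.xml of an OpenOffice document."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    # itertext() yields every text node in document order; joining on
    # single spaces is crude but adequate for a search index.
    return " ".join(chunk.strip() for chunk in root.itertext() if chunk.strip())
```

For relevance ranking this loses headings and emphasis, so it is only worth using where plain full-text matching is enough.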
---wordperfect---
See the OpenOffice filter, it's standalone! I hope OpenOffice will provide more command-line options, so we can use it as a vehicle for all kinds of office formats.

Have a nice weekend
-paul