Cannot extract text from openoffice documents via apache

Massimo Sanna

Tuesday 11 September 2007 3:28:53 am

Hi there,
I'm implementing a website which should feature around 13 GB of content in PDF, DOC and ODT format.
To index all this content, I'm using pdftotext and OpenOffice from the command line, with eZ Publish 3.10rc1 and eZ Find 1.0beta3. There's a custom parser in place which launches a nifty Python script that writes the full text of every readable OpenOffice document to stdout.

I'm able to run the php-cli scripts to reindex the website as root, but when I upload a new file to the website I get the following error in the index log file:

09/11/2007 [10:26]  filename: var/ezwebin_site/storage/original/application/b9b6c03340525f50463500b08883e1b2.odt
09/11/2007 [10:26]  creation of executable memory area failed: Permission denied
Error (<class uno.com.sun.star.uno.RuntimeException at 0xb7f3e0bc>) :exception type not found: bad_allocpure virtual method called
St9type_infoSt8bad_castSt10bad_typeidN10__cxxabiv117__class_type_infoEN10 __cxxabiv120__si_class_type_infoEN10__cxxabiv121__vmi_class_type_ infoEPKePeePKdPddPKfPffPKyPyyPKxPxxPKmPmmPKlPllPKjPjjPKiPiiPKtPttPKsP ssPKhPhhPKaPaaPKcPccPKwPwwPKbPbbPKvPvvN10__cxxabiv123__fundamental _type_infoEN10__cxxabiv117__array_type_infoEN10__cxxabiv120__function_type_ infoEN10__cxxabiv116__enum_type_infoEN10__cxxabiv1.__pbase_type_info

I tried logging in as apache on the server, and in fact launching the same script from the command line gave me the same error. After giving full ownership and write permissions on /var/www, I was able to run the Python script from the command line, but it still gives me the error when eZ Publish launches it.
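
Since the script works from an apache shell but fails when eZ Publish launches it, one way to narrow things down is to log what the script actually sees in both situations. The following is only a diagnostic sketch, not a fix: the function names `describe_environment` and `format_report` are made up for illustration, and the assumption that the difference lies in the user, HOME or the temp directory is a guess, not something the log confirms. It writes to stderr so that stdout stays clean for the extracted text.

```python
import os
import pwd
import sys
import tempfile


def describe_environment():
    """Collect facts that may differ between a login shell and a web server child process."""
    uid = os.geteuid()
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the number.
        user = str(uid)
    home = os.environ.get("HOME")  # may be unset when launched by the web server
    tmp = tempfile.gettempdir()
    return {
        "user": user,
        "home": home,
        "home_writable": bool(home) and os.access(home, os.W_OK),
        "tmp": tmp,
        "tmp_writable": os.access(tmp, os.W_OK),
        "cwd": os.getcwd(),
    }


def format_report(info):
    """Render the collected values as one 'key: value' line per entry, sorted by key."""
    return "\n".join("%s: %s" % (key, info[key]) for key in sorted(info))


if __name__ == "__main__":
    # stderr only, so the parser reading stdout is not disturbed.
    sys.stderr.write(format_report(describe_environment()) + "\n")
```

Running this once from the apache shell and once through an upload, then comparing the two reports, would show whether the user, HOME or temp directory differs between the working and the failing case.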

Does anybody have any ideas on how to fix the permissions? I don't know where else to look :-(
Max