ez publish going slow and hangs when publishing content

K259

Friday 19 March 2004 4:42:10 am

Our content database is around 800 MB, we have a lot of editors, we are running v3.3, and we use Turck MMCache to speed up PHP and the eZ code.

Our site is awfully slow, and the speed goes up and down. Our hardware is very good, and only eZ Publish is running on the server.

When the editors publish their content, the whole site hangs, and when the publishing process is finished (PHP processing and MySQL manipulation), the site works again.

It's not only the admin-part that hangs, but also the frontend hangs until the publish-process is finished. We've got a lot of feedback from our editors, and also plenty of feedback from our visitors on this issue. So it's a big problem. Last time I had an ez publish lesson to teach the new editors how to publish, we had to drop the live publishing, and use the hand-out I had copied for them instead.

Very soon I'm going to hold another course, and I'm sceptical about what will happen, now that I see the front end hang when just one editor is publishing. So what will happen when 15 publish at the same time? :/ I've had a bad experience with this.

Has anyone experienced the same problems?

Georg Franz

Friday 19 March 2004 5:54:05 am

Hi Zinistry,

I've had the same problem.

The performance breakdown on my site happens because eZ clears and recreates the cache after publishing.

So how many cache files do you have? (On my installation, there are around 100,000 cache files and around 10,000 cache directories in the worst case.)

I've experimented a lot and I've found some solutions:
a) Exclude the var directory from the PHP accelerators
b) Add more indexes to your DB
See: http://ez.no/community/forum/developer/performance_mysql_add_index
(This will NOT slow down the publishing process but speed it up! Reason: during publishing, a lot of "select queries" are made!)
c) If you are running MySQL 4.x, enable the "query_cache"
http://ez.no/community/forum/developer/performance_mysql_4_x

This will help a bit, but mostly speed up the "viewing of content".

But to solve the caching problem I had to go another way: store the cache in the database instead of in files (the cache of "cache-blocks" and "content-view").

Everybody said that this method would slow down the system, but that's not true if you have a large installation. (Instead of deleting 100,000 files, you run one query to delete 100,000 rows.)

Moreover, cache control is improved, because I can specify exactly which cache entries are expired. (The "normal eZ" checks in each directory whether the file exists, etc.)
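
To make the idea concrete, here is a minimal sketch of a database-backed cache with targeted expiry. It is not the add-on described in this post (its code is not shown in the thread); the table name, columns and function names are all hypothetical, and Python with SQLite stands in for PHP with MySQL.

```python
import sqlite3

# Hypothetical schema: one row per cached fragment, tagged with the
# content node it was rendered from.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cache_entry (
        cache_key TEXT PRIMARY KEY,  -- e.g. block name + URI + role
        node_id   INTEGER NOT NULL,  -- node the fragment depends on
        content   TEXT NOT NULL
    )
    """
)
conn.execute("CREATE INDEX cache_entry_node ON cache_entry (node_id)")


def store(cache_key, node_id, content):
    """Insert or overwrite one cached fragment."""
    conn.execute(
        "INSERT OR REPLACE INTO cache_entry (cache_key, node_id, content) "
        "VALUES (?, ?, ?)",
        (cache_key, node_id, content),
    )


def fetch(cache_key):
    """Return the cached fragment, or None on a cache miss."""
    row = conn.execute(
        "SELECT content FROM cache_entry WHERE cache_key = ?", (cache_key,)
    ).fetchone()
    return row[0] if row else None


def expire_nodes(node_ids):
    """Delete only entries that depend on the given nodes; return rows removed."""
    node_ids = list(node_ids)
    if not node_ids:
        return 0
    placeholders = ",".join("?" for _ in node_ids)
    cur = conn.execute(
        f"DELETE FROM cache_entry WHERE node_id IN ({placeholders})", node_ids
    )
    return cur.rowcount


def expire_all():
    """Drop the whole cache with a single statement."""
    conn.execute("DELETE FROM cache_entry")
```

Publishing node 10 would then call `expire_nodes([10])`, which removes every fragment built from that node in one statement and leaves the rest of the cache in place.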

I've sent my "db cache add-on" to the eZ crew, and Bard said that he will have a look at it. (I think he has had no time until now.)

I've written the "db cache-add on" for the 3.4 version, but - I think - it's not difficult to rewrite the thing for 3.3. So - if you want - I can send you the add-on.

Kind regards,
Emil.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

Alex Jones

Friday 19 March 2004 6:23:06 am

Emil, would it be possible for you to post it in the Contributions? If not, please send me a copy at 'alexj' at 'agrussell.com'.

Thanks!

Alex

Alex
[ bald_technologist on the IRC channel (irc.freenode.net): #eZpublish ]

When in doubt, clear the cache.

Georg Franz

Friday 19 March 2004 6:33:08 am

Hi Alex,

No, it's not a good idea to publish the code in the "contribution section" because I have mainly altered "kernel" files. (So it's not an extension but a hack of eZ.)

I will send you the files via email.

Kind regards,
Emil.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

Bård Farstad

Friday 19 March 2004 6:54:50 am

Just a note on cache files and PHP accelerators. From version 3.4, all cache files for content and template blocks will have the .cache extension, which means they are not cached by the PHP accelerators. This should remove the problem of the PHP accelerator getting slower and slower due to the number of cache files in memory.

--bård

Documentation: http://ez.no/doc

Paul Borgermans

Friday 19 March 2004 10:23:28 am

Hi

In our case, the indexing of new content is part of the delay (sometimes really long xml content and files), which is solved today in svn:

I patched one of our 3.3 sites from the trunk with the new code for delayed indexing. That works like a charm and makes the editing more responsive here. Check out revision 5546 and copy the files below. Then create the table for the pending actions, enable delayed indexing and configure cron (and the cron.ini.append) or your preferred scheduler. I set the cron to execute every 10 minutes.

These are the files that have changed (from pubsvn.ez.no)

A /trunk/cronjobs/indexcontent.php
M /trunk/doc/changelogs/3.4/CHANGELOG-3.4.0alpha2
M /trunk/kernel/classes/ezsearch.php
M /trunk/kernel/content/ezcontentoperationcollection.php
M /trunk/kernel/sql/mysql/kernel_schema.sql
M /trunk/kernel/sql/postgresql/kernel_schema.sql
M /trunk/settings/site.ini

No guarantees of course, but it works here on a production server.
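
For readers who want to see the shape of the delayed-indexing flow described above (queue at publish time, index later from cron), here is a rough sketch. It is not the code from revision 5546; the table names, the `index_object` action label and the naive word splitting are assumptions made for illustration, with Python and SQLite standing in for PHP and MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the pending-actions table and the search index.
conn.execute(
    "CREATE TABLE pending_action ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "action TEXT NOT NULL, object_id INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE search_index (object_id INTEGER NOT NULL, word TEXT NOT NULL)"
)


def publish(object_id, text, documents):
    """Publish step: save the content and queue indexing instead of doing it inline."""
    documents[object_id] = text
    conn.execute(
        "INSERT INTO pending_action (action, object_id) VALUES ('index_object', ?)",
        (object_id,),
    )


def run_index_cronjob(documents):
    """Cron step: index every queued object, then remove its queue row."""
    rows = conn.execute(
        "SELECT id, object_id FROM pending_action "
        "WHERE action = 'index_object' ORDER BY id"
    ).fetchall()
    for action_id, object_id in rows:
        # Replace any old index rows for this object with the current words.
        conn.execute("DELETE FROM search_index WHERE object_id = ?", (object_id,))
        words = sorted(set(documents[object_id].lower().split()))
        conn.executemany(
            "INSERT INTO search_index (object_id, word) VALUES (?, ?)",
            [(object_id, word) for word in words],
        )
        conn.execute("DELETE FROM pending_action WHERE id = ?", (action_id,))
    return len(rows)
```

The editor only pays for the cheap insert in `publish`; the expensive indexing happens whenever the scheduler next calls `run_index_cronjob`.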

Regards

-paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Bård Farstad

Friday 19 March 2004 3:21:45 pm

Paul,

in 3.4 we've just added delayed indexing. This means that you don't need to wait for an object to be indexed while publishing. It will be run on a cron job later on.

--bård

Documentation: http://ez.no/doc

Georg Franz

Friday 19 March 2004 5:13:03 pm

Hi Bard,
hi Paul,

-) If the "indexing" of new / altered content is moved to a cron job which runs e.g. every 10 minutes, the "performance problem" of publishing is not solved but just postponed by 10 minutes.

-) The PHP accelerators weren't the main problem, because you were always able to exclude the var directories in the config settings of the accelerators. So it's really nice that the cache-block files now have the extension ".cache", but that won't help at all.

-) If you have optimized the settings of your DB, the publishing process for new content isn't the main "bottleneck" (of course, many queries are made, some of them unnecessary, but that won't harm the system).

Believe me, the only real reason why the performance breaks down if you publish content is the deleting / recreation process of the cache. (content-view-cache and cache-blocks)

If you don't believe me, try the following at ez.no:

-) Clear the cache of ez.no completely.
-) Open 5 instances of WinHTTrack (a website copier).
-) Fetch 5 different "copies" of ez.no at once with WinHTTrack.
-) Wait about 10 minutes until part of the cache has been "recreated".
-) Edit a top-level (or 2nd-level) node of ez.no and publish it (while WinHTTrack is running).
... and so on ...

... and what is happening? I am sure that the ez.no server is having "a really nice day" ...

But maybe the ez.no site isn't a perfect example, because you are not using many cache-blocks. Things are different if you have e.g. a dynamic menu with different content on each page (depending on the user roles and the current URI), or a "user box" (with personal info about the current user), or a different "random quote" on each page, or you show the user's bookmarks on each page too, or you have a large forum with forum threads many levels deep ...

As I said, on my installation (and the size of my DB is "only" 300 MB, far away from 800 MB), I unfortunately HAVE TO use around 100,000 cache files. (I really tried everything to decrease the number of cache files ...)
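
A back-of-the-envelope calculation shows how quickly per-page, per-role cache blocks multiply. The function and all input numbers below are made up for illustration; they are not measurements from this installation.

```python
def estimate_cache_files(pages, blocks_per_page, role_variants, view_cache_variants=1):
    """Rough upper bound on cache files when every block varies per page and per role."""
    block_files = pages * blocks_per_page * role_variants
    view_files = pages * view_cache_variants
    return block_files + view_files


# Hypothetical site: 8,000 pages, 3 page-specific blocks, 4 role combinations.
print(estimate_cache_files(pages=8000, blocks_per_page=3, role_variants=4))  # prints 104000
```

Even with only a handful of blocks and roles, the count lands in the same order of magnitude as the 100,000 files mentioned above.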

And the only way to solve this problem is to move the cache into a "cache table" inside the DB, as I have already suggested. Then, if someone publishes content, only the affected cache entries are deleted (with only one or two queries).

Another note: if you have a very popular site which ranks well on Google for some popular keywords, you can be sure that around 40% of your site's traffic is created by search / email (...) spiders / crawlers.

The main "problem" with the crawlers: they fetch every page. And if they are e.g. email spiders, they want to fetch every page very fast.

And the worst-case scenario for a big eZ 3 installation is:
-) The cache is nearly full
-) 2 editors publish long articles at nearly the same time at top-level nodes.
-) 3 crawlers have opened e.g. 5 connections and are fetching 10 pages of "low-level nodes" (5th or 6th level) at the same time
-) Only 5 "real" visitors want to view the homepage within a minute.
-) Only 1 user tries to publish a forum post
-) Only 1 user tries to publish a guestbook entry

... good night server ...

And this "nightmare" happened nearly every day on my site before I started using the "db-cache" solution. But now the performance problems are - nearly - gone.

Kind regards,
Emil.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

K259

Friday 19 March 2004 6:25:20 pm

Tnx for all the replies, and tnx to Emil for the good testing, solutions and information about this issue. I have to support Emil here, because I've also tested different things on eZ sites (our own sites), checked the logs from when the system goes down etc., also tested with HTTrack, and compared eZ Publish with other CMSs (which I've also tested) regarding stability and processing time.

With HTTrack the eZ sites just crash and have to be "restarted". The same happens with the other tests. I'm not going to explain the other "tests" I've done (not with any programs) in further detail (due to the possibility of abuse), but I can see in our logs that someone already knows about these weaknesses of eZ Publish and is playing with our site :/

So, my question to eZ is: when is it possible to get an upgrade/solution for this, and what will it cost (how many hours will you need to spend on it) if the eZ crew fixes this for us? I hope it's possible before our courses start in April/May, so we don't have to run the course on paper again instead of letting the new editors test-publish in eZ Publish.

Maybe a good idea would be for the eZ crew (or someone else?) to set up some test sites where we could test these things, so we could find all these eZ instability bugs and get eZ really stable in future versions? ;)

Nite

Georg Franz

Saturday 20 March 2004 5:14:18 am

Hi Nite,

I've also had the problem that Apache crashed without an obvious reason. I studied the logs -> nothing.

After some weeks I found the reason but unfortunately not the bug which causes the problem:
http://ez.no/community/bug_reports/strange_apache_crashes_print_pagelayout_tpl

Maybe it's the same at your installation?

Kind regards,
Emil.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

Paul Borgermans

Saturday 20 March 2004 2:43:44 pm

Hello Emil,

Thanks for your insightful post. Yet I should add that delayed indexing does not affect the cache files; that remains inside the publishing part. What it does is remove the indexing delay before the editor gets back to the normal view mode. I don't know where in the normal publish process the indexing happens, whether it delays the creation of the cache files too, or if it matters after all. Take a look at the cronjob script in the trunk and the search class: it is really only the search index tables that get partially cleared and refilled.

But I don't like the brute-force clearing of all cache blocks upon publishing content, and I fully agree that a more intelligent algorithm is needed. The database-based solution looks attractive. It may also be used for efficient proactive creation of caches when creating and updating content. I also use quite a few cache blocks on my sites, which multiply by the number of nodes and roles in the system and sometimes by the user. Only, my server is not under the high load of yours (an intranet with 60 users on average), so I never had a situation where the server ground to a standstill (yet).

Best regards

-paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Olav Bringedal

Monday 12 September 2005 5:37:42 am

I have turned on delayed indexing, but it's still a REAL pain to publish content in a database with about 8000 articles.

Note the SQL query count; it probably does one per publication.

Time accumulators:

Accumulator                              Elapsed      Percent   Count  Average
ini_load
  Load cache                             0.1384 sec   0.2878%   17     0.0081 sec
Mysql Total
  Mysql_queries                          18.8857 sec  39.2722%  7947   0.0024 sec
  Looping result                         2.1296 sec   4.4284%   7946   0.0003 sec
Template Total                           47.5438 sec  98.9%     2      23.7719 sec
  Template load                          0.0572 sec   0.1190%   2      0.0286 sec
  Template processing                    47.4850 sec  98.7433%  2      23.7425 sec
override
  Cache load                             0.0387 sec   0.0804%   3      0.0129 sec
System overhead
  Fetch class attribute name             0.0000 sec   0.0000%   0      0.0000 sec
class_abstraction
  Instantiating content class attribute  0.0178 sec   0.0371%   6      0.0030 sec

Can anyone please explain?

A delayed publishing seems to be what we need.


Senior Consultant
http://Umoe-consulting.no

Gabriel Ambuehl

Monday 12 September 2005 5:44:20 am

You are aware that your stats show 98% of the time spent in template code, aren't you? (IMHO, there's something wrong with its numbers anyhow; they add up to well over 100% ;)
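
One possible reading of the "over 100%" effect, sketched with the figures from the accumulator table above: if each percentage is measured against the same total request time and the timers overlap (for example, SQL queries issued from inside template processing), the rows can legitimately sum to more than 100%. That overlap is an assumption here; the thread does not confirm how the debug output is computed.

```python
# (elapsed seconds, reported percent) copied from the accumulator table.
rows = {
    "Mysql_queries": (18.8857, 39.2722),
    "Looping result": (2.1296, 4.4284),
    "Template processing": (47.4850, 98.7433),
}


def implied_total(elapsed, percent):
    """Total request time implied by one row: elapsed / (percent / 100)."""
    return elapsed / (percent / 100.0)


totals = {name: implied_total(e, p) for name, (e, p) in rows.items()}
percent_sum = sum(p for _, p in rows.values())

for name, total in totals.items():
    print(f"{name}: implied total {total:.2f} sec")
print(f"sum of percentages: {percent_sum:.2f}%")
```

Every row implies the same total of roughly 48 seconds, which fits timers that share one baseline and overlap rather than a broken measurement.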

I'd think about caching more than MySQL performance if this was my site. Also, for reference, ez.no has somewhere around 98000 objects and publishing usually works just fine.

Edit: publishing this post took about 2s.

Visit http://triligon.org

Olav Bringedal

Monday 12 September 2005 11:47:22 pm

I saw that. The problem is that I'm not sure I dare touch the default settings in the admin access.

I'd be happy to have some directions on how to cache/compile the admin interface though.

The only optimization I've made from the defaults is removing the tree menu entirely, as loading 8000 objects naturally took even more time.


Senior Consultant
http://Umoe-consulting.no

Olav Bringedal

Wednesday 28 September 2005 1:56:31 am

Basically solved here: http://ez.no/community/bugs/subtree_expiry


Senior Consultant
http://Umoe-consulting.no