Friday 19 March 2004 5:13:03 pm
Hi Bard, hi Paul,
-) If the "indexing" of new / altered content is moved to a cron job which runs e.g. every 10 minutes, the "performance problem" of publishing is not solved, only postponed by 10 minutes.
-) The PHP accelerators weren't the main problem, because you were always able to exclude the var directories in the config settings of the accelerators. So it's really nice that the cache-block files now have the extension ".cache", but that won't help at all.
-) If you have optimized the settings of your db, the publishing process of new content isn't the main "bottleneck" (of course, many queries are made, some of them unnecessary, but that won't harm the system).
Believe me, the only real reason why the performance breaks down when you publish content is the deletion / recreation of the cache (content-view-cache and cache-blocks). If you don't believe me, try the following at ez.no:
-) Clear the cache of ez.no completely.
-) Open 5 instances of WinHTTrack (a website copier).
-) Fetch 5 different "copies" of ez.no at once with WinHTTrack.
-) Wait about 10 minutes, until a bit of the cache has been "recreated".
-) Edit a top-level (or 2nd-level) node of ez.no and publish it (while WinHTTrack is running).
... and so on ...
... and what happens? I am sure that the ez.no server will be having "a really nice day" ...
But maybe the ez.no site isn't a perfect example, because you are not using many cache-blocks. Things look different if you have e.g. a dynamic menu which has different content on each page (depending on the user roles and the current URI), or a "user-box" (with personal info about the current user), or a different "random quote" on each page ... or you are showing the bookmarks of the user on each page too ... or you have a large forum and forum threads with many levels ...
As I said, on my installation (and the size of my db is "only" 300 MB, far away from 800 MB), unfortunately I HAVE TO use around 100,000 cache files. (I really tried everything to decrease the number of cache files ...)
And the only way to solve this problem is to move the cache into a "cache table" inside the db, as I have already suggested. Then, if someone publishes content, only the affected cache entries are deleted (with only one or two queries).
Another note: If you have a very popular site which ranks well on Google for some popular keywords, you can be sure that around 40% of the traffic of your site is created by search / email (...) spiders / crawlers. The main "problem" with the crawlers: they fetch every page. And if they are e.g. email spiders, they want to get every page very fast.
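To make the "cache table" idea from above a bit more concrete, here is a minimal sketch of how it could look. It is not the code I run on my site and not part of eZ Publish; it uses Python with SQLite only so that it runs anywhere, and the table name `view_cache`, the column names and the function names are all made up for illustration. The point it shows: every cache entry remembers which node it belongs to, so publishing a node removes only the entries of that node with a single DELETE query.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open the db and create the (hypothetical) cache table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS view_cache ("
        " cache_key TEXT PRIMARY KEY,"   # e.g. node + role + uri
        " node_id INTEGER NOT NULL,"     # node this entry was rendered from
        " content TEXT NOT NULL)"        # the rendered output
    )
    # Index so that invalidation by node does not scan the whole table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_view_cache_node"
        " ON view_cache (node_id)"
    )
    return conn


def store(conn, cache_key, node_id, content):
    """Insert or overwrite one cache entry."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO view_cache (cache_key, node_id, content)"
            " VALUES (?, ?, ?)",
            (cache_key, node_id, content),
        )


def fetch(conn, cache_key):
    """Return the cached content, or None on a cache miss."""
    row = conn.execute(
        "SELECT content FROM view_cache WHERE cache_key = ?", (cache_key,)
    ).fetchone()
    return row[0] if row else None


def invalidate_nodes(conn, node_ids):
    """Delete only the entries of the given nodes, using one query.

    Returns the number of deleted cache entries.
    """
    node_ids = list(node_ids)
    if not node_ids:
        return 0
    placeholders = ",".join("?" * len(node_ids))
    with conn:
        cur = conn.execute(
            f"DELETE FROM view_cache WHERE node_id IN ({placeholders})",
            node_ids,
        )
    return cur.rowcount
```

On publish you would call something like `invalidate_nodes(conn, [2])` with the ids of the affected nodes; all other entries stay in the cache, so the crawlers and visitors keep getting cached pages for everything that was not touched.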
And the worst-case scenario of a big ez3 installation is:
-) The cache is nearly full.
-) 2 editors publish long articles at nearly the same time at top-level nodes.
-) 3 crawlers have opened e.g. 5 connections and are fetching 10 pages of "low-level nodes" (5th or 6th level) at the same time.
-) Only 5 "real" visitors want to view the homepage within a minute.
-) Only 1 user tries to publish a forum post.
-) Only 1 user tries to publish a guestbook entry.
... good night server ...
And this "nightmare" happened nearly every day at my site before I used the "db-cache" solution. But now the performance problems are (nearly) gone.
Kind regards, Emil.
Best wishes,
Georg.
--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004