PreviewCache script


Sebastiaan van der Vliet

Tuesday 05 July 2005 3:29:05 pm

Hello,

I want to further improve the response times of an ezp 3.5.1 website. TemplateCache, TemplateCompile and ViewCaching are all enabled.

I can not use static cache, since the pages include dynamic elements. Does it make sense to run a small script which generates 'PreviewCache' for a list of nodes/content types, for example folders?
If it does, would you need to run such a script before or after the template compile script (eztc.php)?

Thanks for your help,
Sebastiaan

Certified eZ publish developer with over 9 years of eZ publish experience. Available for challenging eZ publish projects as a technical consultant, project manager, trouble shooter or strategic advisor.

Sebastiaan van der Vliet

Wednesday 06 July 2005 2:02:16 pm

In order to improve the response times of the site I developed a small 'crawler' script. This script 'crawls' the node tree, in order to generate the cache. The following parameters are available: limit, depth, class and delay (seconds). This script is only useful if you have some type (template/content) of caching enabled.
http://ez.no/community/contribs/hacks/ezcrawler

Comments on this approach, and/or the functionality of the script would be very much appreciated.


Ekkehard Dörre

Friday 08 July 2005 3:21:40 am

Hi,

Are image variations generated too? That would make sense for large galleries or big image sizes.

Greetings, ekke

http://www.coolscreen.de - Over 40 years of certified eZ Publish know-how: http://www.cjw-network.com
CJW Newsletter: http://projects.ez.no/cjw_newsletter - http://cjw-network.com/en/ez-publ...w-newsletter-multi-channel-marketing

Sebastiaan van der Vliet

Tuesday 12 July 2005 5:23:36 am

Hi,

The ezcrawler script does not do any caching itself. It only visits each page, and thus acts as the first visitor, whose request generates the cached version of that page. To generate cached image variations you would need to crawl all pages that contain those variations. The crawler script has been modified so that you can now pass the start node (e.g. media/galleries) as well.
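
The "first visitor" idea can be sketched in a few lines of plain PHP. This is only an illustration, not the ezcrawler code: the function names buildViewUrl and warmUrls and the host www.example.com are hypothetical, the /content/view/full/<node_id> path is the one used by the script posted further down in this thread, and fetching over HTTP assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: build the full-view URL for a node ID.
function buildViewUrl( $siteUrl, $nodeId )
{
    return 'http://' . rtrim( $siteUrl, '/' ) . '/content/view/full/' . (int) $nodeId;
}

// Hypothetical helper: request each URL once, so that the site renders the
// page and can store its cached version. Returns an array of url => bool.
function warmUrls( $urls, $delay = 0 )
{
    $results = array();
    foreach ( $urls as $url )
    {
        // Opening the URL sends the request; the body is not needed here.
        $handle = fopen( $url, 'r' );
        $results[$url] = ( $handle !== false );
        if ( $handle !== false )
        {
            fclose( $handle );
        }
        if ( $delay > 0 )
        {
            // Pause between requests to keep the load on the server low.
            sleep( $delay );
        }
    }
    return $results;
}
```

For example, warmUrls( array( buildViewUrl( 'www.example.com', 2 ) ), 1 ) would request the view of node 2 once and then wait one second.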

Greetings,
Sebastiaan


Leif Arne Storset

Tuesday 06 September 2005 12:06:43 pm

See my comments and code at http://ez.no/community/contribs/hacks/ezcrawler

Leif Arne Storset

Zdenek Ziegler

Saturday 10 September 2005 5:06:40 am

Hi,
does anybody know how to make this script work on eZ publish 3.6?
On 3.5.1, everything is ok, but on 3.6, I get this error:

Fatal error: Undefined class name 'ezcontentobjecttreenode' in /home/mysite/public_html/bin/php/ezcrawler.php on line 87

Fatal error: eZ publish did not finish its request
The execution of eZ publish was abruptly ended, the debug output is present below.

Thanks for your help,
Zdenek

Nicklas Lundgren

Monday 19 September 2005 12:53:50 am

Hi,

I made this script work in ez 3.6 by defining the path to the ezcontentobjecttreenode class like this:
include_once( 'kernel/classes/ezcontentobjecttreenode.php' );

Best regards, and thanks to Sebastiaan!

/Nicklas

Nicklas Lundgren, Managing Director
Novitell AB, Sweden

Lydie Soler

Monday 09 January 2006 9:36:33 am

Hello,

I have tried to run the ezcrawler script. It runs, but it doesn't do much. Here is the output:

 Crawling 0 nodes...

Here is the command I am running:

ezcrawler.php -s site_unite_metarisk --node=163 --depth=3 --limit=100 --class=folder --delay=10

My siteaccess is called site_unite_metarisk
and my root node number is 163. (I use both in site.ini, so I am fairly sure they are correct.)

Any ideas? Thanks for your help!

Lydie Soler

Tuesday 10 January 2006 3:16:53 am

Sorry, I must have been half asleep yesterday!
It is running now (I replaced node 163 with 2),

but I get the following error:
in foreach Failed to crawl node: 163

Every time I run it, I get more and more failed nodes, and I don't understand why.

It is not easy to find out where the problem is.

Please help me! Thanks a lot.

Lydie Soler

Friday 27 January 2006 9:17:25 am

Hi,

Can someone help me? This is very important to me, because my site becomes very slow (5-10 s per page) after the cache is automatically cleared (every 2 hours). As far as I have found, it is not possible to change this 2-hour setting, so I thought of using the crawler to generate the cache again.

However, I get the following errors after running the script:

Content-type: text/html
X-Powered-By: PHP/4.3.9
Set-Cookie: eZSESSIDsite_unite_metarisk=340a7d44bf5f30173f4b174ef60556e8; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache

Crawling 140 nodes...
Failed to crawl node: 163
Failed to crawl node: 88
Failed to crawl node: 100
Failed to crawl node: 172

Each time I run the script, more nodes fail. I would really appreciate some help.

Thanks!

Lydie Soler

Monday 30 January 2006 8:23:12 am

Well, after some searching, and with the help of things I found on the forum, here is the code that works for me. (I thought it might help someone.)

#!/usr/bin/env php
<?php
//
// Created on: <12-Jul-2005 14:08:11>
// Created by: Sebastiaan van der Vliet, sebastiaan@contactivity.com
// Contactivity bv, Leiden the Netherlands
// info@contactivity.com, http://www.contactivity.com
//
//
// This file may be distributed and/or modified under the terms of the
// "GNU General Public License" version 2 as published by the Free
// Software Foundation and appearing in the file LICENSE.GPL included in
// the packaging of this file.
//
// The "GNU General Public License" (GPL) is available at
// http://www.gnu.org/copyleft/gpl.html.
//
//

include_once( 'lib/ezutils/classes/ezcli.php' );
include_once( 'kernel/classes/ezscript.php' );
include_once( 'kernel/content/ezcontentfunctioncollection.php');
include_once( 'kernel/content/ezcontentoperationcollection.php');
include_once( "kernel/classes/datatypes/ezuser/ezuser.php" );
include_once( 'kernel/classes/eznodeviewfunctions.php' );
include_once( 'kernel/classes/ezcontentobjecttreenode.php' );

$cli =& eZCLI::instance();
$script =& eZScript::instance( array( 'description' => ( "eZ publish PreViewCacher\n" .
                                                         "\n" .
                                                         "./bin/php/ezcc.php -s news" ),
                                      'use-session' => false,
                                      'use-modules' => true,
                                      'use-extensions' => true ) );

$options = $script->getOptions( "[node:][depth:][limit:][class:][delay:]",
                                "",
                                array( 'node'  => "Node (default 2)",
                                	   'depth' => "Depth (e.g. --depth=3)",
                                       'limit' => "Limit (e.g. --limit=100)",
                                       'class' => "class filter (e.g --class=folder or --class=2)",
                                       'delay' => "delay in seconds between each page request" ) );
$sys =& eZSys::instance();
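$delay = 0;
$node = 2;
// Defaults for options that may be omitted on the command line, so that the
// subTree() call further down does not read undefined variables.
$depth = null;
$limit = null;
$classFilterType = null;
$classFilterArray = null;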

if ( $options['siteaccess'] )
{
    $sys->clearAccessPath();
    $sys->addAccessPath( $options['siteaccess'] );
}

if ( $options['node'] )
{
    $node = $options['node'] ;
}

if ( $options['depth'] )
{
    $depth = $options['depth'] ;
}

if ( $options['limit'] )
{
    $limit = $options['limit'] ;
}

if ( $options['class'] )
{
   $classFilterType = 'include';
   $classFilterArray = $options['class'] ;
}

if ( $options['delay'] )
{
   $delay = $options['delay'] ;
}

$script->startup();
$script->initialize();

if ( $options['siteaccess'] )
{
	include_once( 'kernel/common/template.php' );
    $tpl =& templateInit();

    // 'ClassFilterArray' must be a quoted array key, not a bare constant.
    $nodelist = eZContentObjectTreeNode::subTree( array( 'Depth' => $depth, 'Limit' => $limit, 'ClassFilterType' => $classFilterType, 'ClassFilterArray' => array( $classFilterArray ) ), $node );
    $script->setShowVerboseOutput( true );
	$cli->output(  "Crawling " . count($nodelist). " nodes..." );
	foreach ($nodelist as $node)
	{

			// Note: microtime( true ) returns a float only on PHP 5.0.0 and later.
			$time_start = microtime(true);
			$sys->setServerVariable('REQUEST_URI','/content/view/full/'.$node->attribute( 'node_id' ));
			
			$ini =& eZINI::instance("site.ini");
			$siteURL = $ini->variable("SiteSettings","SiteURL");
			
			
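			//$status=shell_exec("index.php"); doesn't work at all
			//$status=shell_exec("php index.php"); seems to work but doesn't seem to...
			// Request the page over HTTP (requires allow_url_fopen);
			// fopen() returns false when the request fails.
			$handle = fopen( 'http://' . $siteURL . '/content/view/full/' . $node->attribute( 'node_id' ), "r" );
			$status = ( $handle !== false );
			if ( $handle !== false )
			{
				fclose( $handle );
			}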
			$time_end = microtime(true);

			$text = false;
			if ( $status )
			{
				// Report the URL alias, the node ID and the elapsed time in seconds.
				$text = "Crawled node: " . $cli->stylize( 'node', $node->attribute( 'url_alias' ) ) . "\t" . $node->attribute( 'node_id' ) . "\t" . round( $time_end - $time_start, 3 ) . "s";
			}
			else
			{
				$text = "Failed to crawl node: " . $cli->stylize( 'node', $node->attribute( 'node_id' ) );
			}
			$script->iterate( $cli, $status, $text );

            if ($delay)
            {
            	sleep($delay);
            }
	}

}
else
{
	 $cli->output(  "Please specify a siteaccess name, e.g.: -s plain" );
}

$script->shutdown();

?>
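
When the script only prints "Failed to crawl node", it is hard to tell why the request failed. Below is a minimal sketch, assuming PHP 5 or later (for get_headers()) and allow_url_fopen enabled, that reports the HTTP status of the first response instead. parseStatusCode and describeCrawl are hypothetical names and are not part of ezcrawler.

```php
<?php
// Extract the numeric status code from a status line such as
// "HTTP/1.1 200 OK"; returns 0 when the line cannot be parsed.
function parseStatusCode( $statusLine )
{
    if ( preg_match( '#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $matches ) )
    {
        return (int) $matches[1];
    }
    return 0;
}

// Hypothetical helper: request a URL and describe the outcome.
function describeCrawl( $url )
{
    // get_headers() sends a request and returns the response headers,
    // or false when no response could be obtained.
    $headers = get_headers( $url );
    if ( !is_array( $headers ) || !isset( $headers[0] ) )
    {
        return 'no response';
    }
    // $headers[0] is the status line of the first response; if the
    // server redirects, this is the redirect status, not the final one.
    return 'status ' . parseStatusCode( $headers[0] );
}
```

Printing describeCrawl( $url ) next to each failed node would show whether the server answered with an error status or did not answer at all.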

Sebastiaan van der Vliet

Thursday 25 May 2006 12:06:15 pm

Hi Lydie,

If you crawl the website over HTTP, your website statistics will be incorrect; that is why the script calls index.php instead of http://.... To avoid having to use http://, make sure that you use the CLI version of PHP, for example:

// Single quotes keep the backslashes in the Windows path literal.
$status = shell_exec( 'c:\ezpublish\php\cli\php.exe index.php' );
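
A quick way to check which PHP binary is actually running a script is php_sapi_name(). The sketch below is only an illustration (isCliSapi is a hypothetical helper name). Output that starts with headers such as "Content-type: text/html", as in the run posted earlier in this thread, usually indicates the CGI binary rather than the CLI one.

```php
<?php
// Returns true when the given SAPI name is the command-line interface.
function isCliSapi( $sapiName )
{
    return $sapiName === 'cli';
}

// php_sapi_name() returns 'cli' for the CLI binary and values such as
// 'cgi' or 'cgi-fcgi' for the CGI binary.
if ( !isCliSapi( php_sapi_name() ) )
{
    echo "Warning: this script is running under '" . php_sapi_name() . "', not the PHP CLI.\n";
}
```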

