
Memory problem while running cronjob


Damien MARTIN

Monday 16 August 2010 6:50:00 am

Hi there,

I have a problem with a simple script:

I have to fetch all the nodes in a folder, and there are a lot of them (18,000+).
For each node, I have to add an entry to an indexed array for comparison with a CSV file.

So this is the code:

$nodes = eZContentObjectTreeNode::subTreeByNodeID(null, 2283);
$existing = array();
foreach($nodes as $node){
        
    if($node->ClassIdentifier == "myclass"){

        $dm = $node->DataMap();      
        $existing[$dm["n_siren"]->DataText.$dm["siret"]->DataText] = $node->NodeID;
    
    }
    
    unset($dm);
    unset($site);
    
}

As you can see, I try to unset variables (since they are not used anymore), but the script never gets through this part of the code without a "not enough memory" error.

I'm running PHP 5.2.6 under GNU/Linux Lenny4, and I can't use PHP 5.3 and its garbage collector.

Can someone help me?

Yannick Komotir

Monday 16 August 2010 7:45:58 am

Hi,

You can use offset/limit to achieve this.

<|- Software Engineer @ eZ Publish developpers -|>
@ http://twitter.com/yannixk

Damien MARTIN

Monday 16 August 2010 8:04:12 am

Thanks Yannick,

But I really need to fetch all the nodes in one go.

This is how the script works in its complete form:

  1. Load the elements in the given directory (the part where the problem is)
  2. Load a CSV file containing essentially the same data as the stored elements
  3. Add new elements to eZ
  4. Modify existing elements in eZ
  5. Delete obsolete elements (an element is removed if it exists in eZ but not in the CSV file)

So the first step has to load all the nodes to make the comparison with the CSV file.
If not all elements are loaded, I cannot tell whether an element from the CSV has to be added or modified.

The key used for comparison is composed of two attributes (not the name, because that would be too easy...), which is why I have to load the data map of each node...
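
Roughly, the comparison step I have in mind would look something like the sketch below (the CSV filename, delimiter and column positions are only placeholders, not my real file layout):

$toAdd    = array();
$toUpdate = array();
$seenKeys = array();

// Placeholder file and delimiter; columns 0 and 1 are assumed to hold
// n_siren and siret, matching the keys of the $existing array from my first snippet
$fp = fopen( 'import.csv', 'r' );
while ( ( $row = fgetcsv( $fp, 0, ';' ) ) !== false )
{
    $key = $row[0] . $row[1];
    $seenKeys[$key] = true;

    if ( isset( $existing[$key] ) )
        $toUpdate[$existing[$key]] = $row; // node ID => CSV data to apply
    else
        $toAdd[] = $row;                   // not in eZ yet => create it
}
fclose( $fp );

// Whatever is indexed in eZ but never appears in the CSV has to be removed
$toDelete = array_diff_key( $existing, $seenKeys );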

I hope this makes it clearer.

Yannick Komotir

Monday 16 August 2010 9:28:58 am

Loading all nodes at the same time is not the best way. You can do it in batches instead.

Just an example:

$limit = 50;
$offset = 0;
$continueRun = true;

while ( $continueRun )
{
    $continueRun = dothis( $offset, $limit );
    $offset += $limit;
}

function dothis( $offset, $limit )
{
    // Fetch one batch of nodes below node 2283
    $nodes = eZContentObjectTreeNode::subTreeByNodeID(
        array( 'Limit' => $limit, 'Offset' => $offset ),
        2283
    );

    // Stop the outer loop once there is nothing left to fetch
    if ( $nodes == false || count( $nodes ) == 0 )
        return false;

    foreach ( $nodes as $node )
    {
        //your task here
    }

    unset( $nodes );
    return true;
}

<|- Software Engineer @ eZ Publish developpers -|>
@ http://twitter.com/yannixk

Jérôme Vieilledent

Monday 16 August 2010 2:50:30 pm

Hello

eZ Publish uses an in-memory cache for optimization. If you want to iterate over a long list of nodes/objects, you need to clear this cache as you go:

$nodes = eZContentObjectTreeNode::subTreeByNodeID( null, 2283 );
$existing = array();
foreach ( $nodes as $node )
{
    if ( $node->ClassIdentifier == "myclass" )
    {
        $dm = $node->DataMap();
        $existing[$dm["n_siren"]->DataText . $dm["siret"]->DataText] = $node->NodeID;
    }

    // Free the attribute data map and remove the object from eZ Publish's
    // global in-memory object cache, so memory stays flat during the loop
    $node->object()->resetDataMap();
    eZContentObject::clearCache( array( $node->attribute( 'contentobject_id' ) ) );
}
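
For very large subtrees, this can also be combined with Yannick's offset/limit suggestion so that only one batch of nodes sits in memory at a time. A rough sketch (the batch size of 50 is arbitrary):

$offset = 0;
$limit  = 50;
$existing = array();

do
{
    $nodes = eZContentObjectTreeNode::subTreeByNodeID(
        array( 'Limit' => $limit, 'Offset' => $offset ),
        2283
    );
    if ( $nodes == false || count( $nodes ) == 0 )
        break;

    foreach ( $nodes as $node )
    {
        if ( $node->ClassIdentifier == "myclass" )
        {
            $dm = $node->dataMap();
            $existing[$dm["n_siren"]->DataText . $dm["siret"]->DataText] = $node->NodeID;
        }

        // Clear the per-object caches, as above
        $node->object()->resetDataMap();
        eZContentObject::clearCache( array( $node->attribute( 'contentobject_id' ) ) );
    }

    unset( $nodes );
    $offset += $limit;
} while ( true );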

Damien MARTIN

Tuesday 17 August 2010 12:20:43 am

Hi Jérôme,

Your solution is very interesting, because I can also use it in my other import scripts. And it's wonderful not to have to edit the CSV file to re-run the cronjob from where it crashed!

I didn't think eZ stored so much data in its caches when you use PHP directly.

Thank you, Yannick and Jérôme.

André R.

Tuesday 17 August 2010 1:08:55 am

> I didn't think eZ stored so much data in its caches when you use PHP directly.

It does. For years we have wanted to add a cache handler that manages the in-memory cache and provides a general cache API with handler support, so we can move parts of the cache to, for instance, memcached, and thereby fix these memory issues, simplify the cache code and possibly optimize things while at it.

eZ Online Editor 5: http://projects.ez.no/ezoe || eZJSCore (Ajax): http://projects.ez.no/ezjscore || eZ Publish EE http://ez.no/eZPublish/eZ-Publish-Enterprise-Subscription
@: http://twitter.com/andrerom

Jérôme Vieilledent

Tuesday 17 August 2010 1:10:21 am

My pleasure ;-).

About import: I plan to release a new import extension very soon, SQLIImport. You'll be able to handle any data source (XML, CSV...) by writing only one PHP class, with a really simplified API for creating and retrieving content objects.

Stay tuned! :)