Importing and Exporting Content Objects in eZ

Russell Michell

Sunday 18 April 2010 3:12:43 pm

No worries Nicolas - the 4.0.x days were over long ago for me, but the DB schema/data issue dates from that time, and I have been upgrading and patching ever since 4.0.1-rc2.

FYI - I have created an XSL stylesheet which seems to work OK transforming ezxmlexport's own XML format into data_import's own XML format. I have posted it over on the ezxmlexport forum: http://projects.ez.no/ezxmlexport/forum/general/suggestions.

Hope it may help someone else :-)

Cheers
Russ

PS - does anyone know how to get ezxmlexport to include the parent folder in an export? All I get right now is the contents of a folder, so I have to manually re-create its parent and copy its children across. I will keep playing and try some stuff out though.

Russell Michell, Wellington, New Zealand.
We're building! http://www.theruss.com/blog/
I'm on Twitter: http://twitter.com/therussdotcom

Believe nothing, consider everything.

Nicolas Pastorino

Monday 19 April 2010 12:58:08 am

Thanks Russell, this XSL stylesheet may help a few others! Thanks for sharing.

About including the top node in the export, I'll try to investigate the question with the author of the extension :)
If anyone figures it out before then, please raise a hand here!

Cheers,

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Jérôme Renard

Monday 19 April 2010 1:13:43 am

Hello,

eZXMLExport never exports the parent node; it takes a top-down approach, not a bottom-up one.

For example, with the following directory structure:

 eZPublish
└── folder1
    └── folder2
        └── folder3

If you want to export folder2 and folder3, you have to export from folder1, and if you want to export folder1, 2 and 3 you have to choose to export from "eZPublish".

Cheers :)
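
To make the top-down behaviour concrete, here is a minimal sketch in plain PHP. It does not use ezxmlexport's own code; the nested-array tree and the function names (findNode, collectDescendants, exportFrom) are hypothetical and only mirror the folder example above.

```php
<?php
// Hypothetical in-memory content tree; names mirror the example above.
$tree = array(
    'name' => 'eZPublish',
    'children' => array(
        array(
            'name' => 'folder1',
            'children' => array(
                array(
                    'name' => 'folder2',
                    'children' => array(
                        array('name' => 'folder3', 'children' => array()),
                    ),
                ),
            ),
        ),
    ),
);

// Find a node by name anywhere in the tree (depth-first); null if absent.
function findNode(array $node, $name)
{
    if ($node['name'] === $name) {
        return $node;
    }
    foreach ($node['children'] as $child) {
        $found = findNode($child, $name);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Collect the names of everything below $node, never $node itself.
function collectDescendants(array $node)
{
    $names = array();
    foreach ($node['children'] as $child) {
        $names[] = $child['name'];
        $names = array_merge($names, collectDescendants($child));
    }
    return $names;
}

// Top-down export: the start node is excluded, its subtree is included.
function exportFrom(array $tree, $startName)
{
    $start = findNode($tree, $startName);
    return $start === null ? array() : collectDescendants($start);
}

print_r(exportFrom($tree, 'folder1'));   // folder2, folder3
print_r(exportFrom($tree, 'eZPublish')); // folder1, folder2, folder3
```

Choosing folder1 as the start therefore yields folder2 and folder3 but not folder1 itself, which is the behaviour described above.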

Russell Michell

Monday 19 April 2010 1:18:43 pm

Hi Jérôme and thanks for clearing that up :-)

I have to say, though, that doing it this way doesn't seem intuitive to me. In the ezxmlexport Admin GUI, when you select the "Choose Contents" button, I would expect the content I selected to be exported, not the children of that content.

Perhaps the docs could be more specific about this (possibly they already are; I have read them several times but don't recall - sorry!). It might also help to add an option to the export interface labelled 'Export Behaviour' with two values, 'default' and 'complete', where 'default' is the behaviour as it is now and 'complete' also includes the parent directory the user selects when choosing content. Perhaps even an .ini setting?

I will post this as another suggestion over on the ezxmlexport forum though.

Thanks again though Jérôme for an extremely useful and versatile extension - I do not wish to take this fact away from you :-)

Regards
Russell

Russell Michell

Monday 19 April 2010 4:36:43 pm

"

Well there is an even simpler solution.

eZXMLExport allows you to specify an XSLT file that is applied during the export process, so each XML file created can be run through a small, specific XSLT to generate the final result. If you already have a ready-to-go XML handler in data_import, that would be an interesting solution :)

"
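
As a rough illustration of the XSLT step quoted above, this sketch applies a stylesheet with XSLTProcessor from PHP's xsl extension. The input element names (export, object) and the output element names (row, field) are hypothetical, since neither extension's real format is reproduced in this thread. Only the root element `all` and the attributes `id`, `type` and `name` are taken from the handler posted below.

```php
<?php
// Hypothetical export document; the real ezxmlexport element names may differ.
$exportXml = <<<'XML'
<export>
  <object id="10" class="folder"><name>News</name></object>
  <object id="11" class="article"><name>Hello</name></object>
</export>
XML;

// Hypothetical stylesheet: wraps everything in <all> and emits one <row> per object.
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/export">
    <all>
      <xsl:for-each select="object">
        <row id="{@id}" type="{@class}">
          <field name="name"><xsl:value-of select="name"/></field>
        </row>
      </xsl:for-each>
    </all>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Apply an XSL stylesheet (as a string) to an XML document (as a string).
function transformExport($xmlString, $xslString)
{
    $xml = new DOMDocument();
    $xml->loadXML($xmlString);

    $xsl = new DOMDocument();
    $xsl->loadXML($xslString);

    $processor = new XSLTProcessor();   // requires PHP's xsl extension
    $processor->importStylesheet($xsl);

    return $processor->transformToXml($xml);
}

$result = transformExport($exportXml, $stylesheet);
echo $result;
```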

I forgot to mention, I created a generic XMLHandler for data_import. At the moment it works with folders and articles. Hopefully you can see how to hack/modify it to do what you want. The only outstanding problem is implementing the logger->write() method (To Do!).

<?php
/*
*       @description:   Generic exported eZ content-class import class
*       @author:        R.Michell 2010 r DOT michell AT gns DOT cri DOT nz
*       @package:       data_import
*       @To Do:         Implement logger->write() method using ezcomponents
*/

class XMLHandlerGeneric extends XmlHandlerPHP
{
        var $handlerTitle = 'Generic Handler';          // Default handler name.
        var $current_loc_info = array();                // Not sure if used.
        var $logfile = 'data_import.log';               // Log file-name.
        var $remoteID = '';                             // Not sure if used.
        const REMOTE_IDENTIFIER = 'xmlimport_';         // Default prefix. Is appended-to later.

        var $root_node = 'all';                         // Source XML root node element.
        var $xml_source_path = 'extension/data_import/dataSource/exports'; // Path to parent dir of source XML file(s) for import.
        var $xml_source_file;                           // ezxmlexport uses an export name for the export's parent dir and XML filename.
        var $parent_id_fallback = 2;                    // Fallback to root node ('Main') if a parent_id cannot be found for an imported object.

        /*
        * Constructor
        */
        public function XMLHandlerGeneric()
        {
        }

        function logger($message,$logfile)
        {
                // To Do: not implemented yet. is_writable() needs the path it should test.
                if(is_writable($logfile))
                {
                }
        }

        function writeLog( $message, $newlogfile = '')
        {
                if($newlogfile)
                {
                        $logfile = $newlogfile;
                }
                else
                {
                        $logfile = $this->logfile;
                }
                $this->logger->write(self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id').': '.$message,$logfile);
        }

// Mapping for source XML field name to an eZ attribute name:
        function geteZAttributeIdentifierFromField()
        {
                $field_name = $this->current_field->getAttribute('name');
                if($this->getTargetContentClass() == 'article')
                {
                        switch ($field_name)
                        {
                                case 'name':
                                        return 'title';
                                break;
                                case 'shortname':
                                        return 'short_title';
                                break;
                                case 'description':
                                        return 'body';
                                break;
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
                elseif($this->getTargetContentClass() == 'folder')
                {
                        switch ($field_name)
                        {
                                case 'shortname':
                                        return 'short_name';
                                case 'showsubitems':
                                        return 'show_children';
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
                else
                {
                        switch ($field_name)
                        {
                                case 'shortname':
                                        return 'short_name';
                                case 'showsubitems':
                                        return 'show_children';
                                case 'publishdate':
                                        return 'publish_date';
                                default:
                                        return $field_name;
                        }
                }
        }


        // Handles xml fields before storing them in ez publish
        function getValueFromField()
        {
                switch( $this->current_field->getAttribute('name') )
                {
                        case 'publishdate':
                        {
                                $return_unix_ts = time();
                                $us_formatted_date = $this->current_field->nodeValue;   // Expected as m/d/Y.
                                $parts = explode('/', $us_formatted_date );
                                if( count( $parts ) == 3)
                                {
                                        $return_unix_ts = mktime( 0,0,0, (int)$parts[0], (int)$parts[1] , (int)$parts[2] );
                                }
                                return $return_unix_ts;
                        }
                        break;
                        case 'short_description':
                        case 'description':
                        case 'intro':
                        case 'body_right':
                        {
                                $xml_text_parser = new XmlTextParser();
                                $xmltext = $xml_text_parser->Html2XmlText( $this->current_field->nodeValue );
                                if($xmltext !== false)
                                {
                                        return $xmltext;
                                }
                                else
                                {
                                        $message = 'Failed to parse XML for attribute: '.$this->current_field->getAttribute('name');
                                        //$this->logger->write(self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id').': '.$message,$logfile);
                                        return false;
                                }
                        }
                        break;
                        default:
                        {
                                return $this->current_field->nodeValue;
                        }
                }
        }

// Logic where to place the current content node into the content tree
        function getParentNodeId()
        {
                $parent_id = $this->parent_id_fallback;
                $parent_remote_id = $this->current_row->getAttribute('parent_id');
                if( $parent_remote_id )
                {
                        $eZ_object = eZContentObject::fetchByRemoteID(self::REMOTE_IDENTIFIER.$parent_remote_id );
                        if( $eZ_object )
                        {
                                $parent_id = $eZ_object->attribute('main_node_id');
                        }
                }
                return $parent_id;
        }

        function getDataRowId()
        {
                return self::REMOTE_IDENTIFIER.$this->current_row->getAttribute('id');
        }

        /*
        * - Allow the flexibility to extract data from multiple content-classes in one source XML file.
        * - See comments by Joachim Karl at: http://ez.no/developer/contribs/import_export/data_import
        */
        function getTargetContentClass()
        {
                if($this->current_row->getAttribute('type'))
                {
                        return $this->current_row->getAttribute('type');
                }
                else
                {
                        $message = 'eZ content-class not found. Given class name was: '.$this->current_row->getAttribute('type');
                        $this->writeLog($message);      // $logfile was undefined here; writeLog() falls back to $this->logfile.
                        return false;
                }
        }

        function readData()
        {
                // Both the export directory and the export name must be set before a file path can be built.
                if(isset($this->xml_source_path) && isset($this->xml_source_file))
                {
                        $filename = $this->xml_source_path.'/'.$this->xml_source_file.'/'.$this->xml_source_file.'.transformed.xml';
                        if(is_file($filename))
                        {
                                return $this->parse_xml_document($filename,$this->root_node);
                        }
                        $message = 'Cannot open '.$filename.' for reading. Please check files/dirs exist and permissions are set correctly'."\n";
                }
                else
                {
                        $message = 'Source export file cannot be found or is not set. Please check files/dirs exist and permissions are set correctly'."\n";
                }
                // current_row may not be set yet when readData() runs, so the message is logged without a row id.
                $this->logger->write($message,$this->logfile);
                return false;
        }

        function post_publish_handling( $eZ_object, $force_exit )
        {
                $force_exit = false;
                return true;
        }
}
?>
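
For the logger To Do mentioned above, here is a minimal sketch of an object exposing write( $message, $logfile ), the call shape the handler uses. It is an assumption-laden stand-in: the class name SimpleFileLogger is hypothetical, it uses plain file_put_contents() rather than ezcomponents, and how data_import actually wires up $this->logger is not covered here.

```php
<?php
// Hypothetical stand-in for the logger object the handler calls as
// $this->logger->write( $message, $logfile ).
class SimpleFileLogger
{
    private $directory;

    public function __construct($directory)
    {
        $this->directory = rtrim($directory, '/');
    }

    // Append one timestamped line to $logfile; returns true on success.
    public function write($message, $logfile)
    {
        if (!is_dir($this->directory) || !is_writable($this->directory)) {
            return false;
        }
        $line = date('Y-m-d H:i:s') . ' ' . rtrim($message) . "\n";
        // basename() keeps the log inside the configured directory.
        $path = $this->directory . '/' . basename($logfile);
        return file_put_contents($path, $line, FILE_APPEND | LOCK_EX) !== false;
    }
}

$logger = new SimpleFileLogger(sys_get_temp_dir());
$logger->write('xmlimport_42: Failed to parse XML for attribute: body', 'data_import.log');
```

Returning a boolean instead of throwing keeps a failed log write from aborting an import run; whether that is the right trade-off depends on how strict the import needs to be.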

Russell Michell

Tuesday 20 April 2010 3:00:57 pm

"

Hello,

eZXMLExport never exports the parent node; it takes a top-down approach, not a bottom-up one.

For example, with the following directory structure:

 eZPublish
└── folder1
    └── folder2
        └── folder3

If you want to export folder2 and folder3, you have to export from folder1, and if you want to export folder1, 2 and 3 you have to choose to export from "eZPublish".

Cheers :)

"

What if my tree looks like this and I only want to export folder 3, folder 4 and all their contents?

 

eZPublish
├── folder1
│   └── folder2
│       ├── folder3
│       │   ├── article 1
│       │   └── article 2
│       ├── folder4
│       └── folder5
└── folder6

I'm also having trouble with my exported items maintaining their relations. If I re-import the items, the default parent node (2) is used and if there are items within items (articles in folders) the articles aren't kept within their parent folders and are all placed at the same level in the folder hierarchy.

Granted, this is probably to do with the data_import extension and not yours Jérôme but any tips anyone might have are gratefully received to fix this.

I have tried including the parent_node_id in the exported content by hacking ezxmlexportexporter.php, adding the parent_node_id into the XSL stylesheet and pulling it out in data_import's getParentNodeId() function, but the same thing still happens.
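
The lookup in getParentNodeId() can be sketched in isolation to show when the fallback node (2) gets used. This is only an illustration: the function name resolveParentNodeId and the $imported map are hypothetical, with the map standing in for eZContentObject::fetchByRemoteID() so that the sketch runs without eZ Publish.

```php
<?php
const REMOTE_PREFIX = 'xmlimport_'; // same prefix as REMOTE_IDENTIFIER in the handler
const FALLBACK_NODE_ID = 2;         // same default as $parent_id_fallback

// $imported maps remote ids of already-imported objects to their main node ids;
// it stands in for eZContentObject::fetchByRemoteID() in this sketch.
function resolveParentNodeId($parentRemoteId, array $imported)
{
    if ($parentRemoteId === null || $parentRemoteId === '') {
        return FALLBACK_NODE_ID;          // no parent_id in the source XML
    }
    $key = REMOTE_PREFIX . $parentRemoteId;
    // A parent that has not been imported yet also ends up at the fallback.
    return isset($imported[$key]) ? $imported[$key] : FALLBACK_NODE_ID;
}

$imported = array('xmlimport_100' => 57); // folder 100 already lives at node 57

var_dump(resolveParentNodeId('100', $imported)); // int(57)
var_dump(resolveParentNodeId('', $imported));    // int(2)
var_dump(resolveParentNodeId('200', $imported)); // int(2)
```

Under these assumptions there are two ways an item lands under node 2: the row carries no parent_id at all, or its parent has not been imported yet when the row is processed.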

Help! - I'm fully stuck :-(
Thanks everyone

Russell Michell

Sunday 25 April 2010 4:33:12 pm

Hi all, this will pretty much be the final post in this lengthy topic I reckon.

For those encountering this thread in the future (*wave* - what does the post carbon world look like?) I have solved my problems.

It appears data_import's own getParentNodeId() is lacking on its own, so I've had to further hack ezxmlexport's eZXMLExportExporter XML creation class in ezxmlexportexporter.php to ensure a parent_id is included in the source (exported) XML. With that in place, getParentNodeId() in the import works.

I've also modified the XSL stylesheet to ensure container classes are created first by using <xsl:sort>. The updated stylesheet is posted at the projects.ez.no URL I gave above.
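
The ordering idea can be sketched in plain PHP as well. This is not the XSL stylesheet itself: it swaps in usort() for <xsl:sort>, the row arrays are hypothetical, and treating only 'folder' as a container class is an assumption.

```php
<?php
// Hypothetical rows as they might appear in the transformed XML.
$rows = array(
    array('id' => '3', 'type' => 'article', 'parent_id' => '2'),
    array('id' => '1', 'type' => 'folder',  'parent_id' => ''),
    array('id' => '4', 'type' => 'article', 'parent_id' => '1'),
    array('id' => '2', 'type' => 'folder',  'parent_id' => '1'),
);

// Move container rows to the front, keeping document order within each group.
function sortContainersFirst(array $rows, array $containerTypes = array('folder'))
{
    // Pair each row with its original position so ties stay in document order.
    $indexed = array();
    foreach (array_values($rows) as $position => $row) {
        $indexed[] = array($position, $row);
    }

    usort($indexed, function ($a, $b) use ($containerTypes) {
        $rankA = in_array($a[1]['type'], $containerTypes, true) ? 0 : 1;
        $rankB = in_array($b[1]['type'], $containerTypes, true) ? 0 : 1;
        if ($rankA !== $rankB) {
            return $rankA - $rankB;   // containers before everything else
        }
        return $a[0] - $b[0];         // otherwise keep the original order
    });

    return array_map(function ($pair) {
        return $pair[1];
    }, $indexed);
}

foreach (sortContainersFirst($rows) as $row) {
    echo $row['id'] . ' ' . $row['type'] . "\n";
}
```

Note that this only guarantees folders come before articles. Nested folders still rely on the export listing a parent folder before its child folders.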

Cheers all,
Russ
