Import an eZXMLText with custom tags

David Ogilo

Monday 03 August 2009 3:20:50 am

Hi,

I have been trying to work out how to import HTML data into eZ Publish. Here is a copy of the code:

case 'ezxmltext':
{
     $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0, false );
     $attributeValue = $dataString;
     $attributeValue = str_replace( "\r", '', $attributeValue );
     $attributeValue = str_replace( "\n", '', $attributeValue );
     $attributeValue = str_replace( "\t", ' ', $attributeValue );
     $document = $parser->process( $attributeValue );
     
     if ( !is_object( $document ) )
     {
         $cli->output( 'Error in xml parsing' );
         return;
     }

     $dataString = eZXMLTextType::domString( $document );
}

Somehow it strips out all image and object tags from the HTML data and only imports links, text, and other HTML tags.

I have tried changing all image tags in the HTML data to <custom name="img">, and it still doesn't work.

Does anyone know of a way to resolve this issue?

Thanks,

David

Heath

Monday 03 August 2009 3:37:51 am

Hello David,

In BC ImportCSV we did something similar. Perhaps this example will help.
http://svn.projects.ez.no/bcimportcsv/trunk/extension/bcimportcsv/bin/bccsvjoomlacontenttablehtmlimport.php

            case 'ezxmltext':
            {
                if ( $attribute->ContentClassAttributeIdentifier == 'caption' )
                {
                    $dataString = null;
                    break;
                }
                // Filter for images, process, store and link
                if ( is_numeric( $imageContainerID ) )
                {
                    $matches = array();
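                    // Group 0 is the whole <img> tag, group 2 its src value;
                    // the optional group 1 matches a backslash-escaped quote
                    $pattern = '/<img\b[^>]*\bsrc=(\\\\["\'])?((?(1)(?:(?!\1).)*|[^\s>]*))(?(1)\1)[^>]*>/si';
                    preg_match_all( $pattern, $dataString, $matches );
                    $matches_count = count( $matches[2] );
                    // $cli->output( print_r( $matches, true ) );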

                    if ( $matches_count > 0 )
                    {
                        $toReplace = array();
                        $replacements = array();
                        $objectComplex = true;

                        $cli->output( "Matches Count: " . $matches_count );

                        // $imagenr = 0;
                        foreach ( $matches[2] as $key => $match )
                        {
                                $toReplace[] = $matches[0][$key];
                                // Drop escaped quotes and percent-encode spaces
                                $imageURL = trim( str_replace( ' ', '%20', str_replace( '\"', '', $match ) ) );
                                if ( substr( $imageURL, 0, 1 ) == '/' )
                                {
                                    $imageURL = 'http://www.diariodelhuila.com' . $imageURL;
                                }
                                $cli->output( "Image link: " . $imageURL );

                                // Note: a fixed test image is copied below, not $imageURL
                                $imageTempURL = 'http://optics.kulgun.net/Blue-Sky/red-sunset-casey1.jpg';
                                $imageTempFileName = 'def.jpg';

                                $imageFileName = basename( $imageURL );
                                $cli->output( 'Image File Name: ' . $imageFileName );

                                $imagePath = "/tmp/imgtmp/" . $imageTempFileName;
                                // $imagePath = "/tmp/imgtmp" . $imagenr++ . ".jpg";

                                if ( !copy($imageTempURL, $imagePath ) )
                                {
                                    $cli->output("Error copying image from remote server");
                                    $replacements[] = '';
                                }else {
                                    $imageClass = eZContentClass::fetchByIdentifier( 'image' );
                                    $imageObject = $imageClass->instantiate( $creator );
                                    $imageObject->store();
                                
                                    $imageObjectID = $imageObject->attribute( 'id' );
                            
                                    $imageNodeAssignment = eZNodeAssignment::create( array(
                                                                                 'contentobject_id' => $imageObject->attribute( 'id' ),
                                                                                 'contentobject_version' => $imageObject->attribute( 'current_version' ),
                                                                                 'parent_node' => $imageContainerID,
                                                                                 'is_main' => 1
                                                                                 )
                                                                             );
                                    $imageNodeAssignment->store();

                                    $imageVersion = $imageObject->version( 1 );
                                    $imageVersion->setAttribute( 'modified', $createDate );
                                    $imageVersion->setAttribute( 'status', eZContentObjectVersion::STATUS_DRAFT );
                                    $imageVersion->store();

                                    $imageAttributes = $imageObject->attribute( 'contentobject_attributes' );
                                    $cli->output( "Image attributes: " . print_r( $imageAttributes, true ) );

                                    // Index 0 is assumed to be the image name and index 2 the image file (standard image class)
                                    $imageAttributes[0]->fromString( $imageTempFileName );
                                    $imageAttributes[0]->store();

                                    $imageAttributes[2]->fromString( $imagePath );
                                    $imageAttributes[2]->store();

                                    $operationResult = eZOperationHandler::execute( 'content', 'publish',
                                                       array( 'object_id' => $imageObjectID, 'version' => 1 ) );

                                    $replacements[] = '<embed href="ezobject://' . $imageObject->attribute( 'id' ) . '" size="original" />';
                                }
                            }
                        
                            $dataString = str_replace( $toReplace, $replacements, $dataString );
                            $cli->output( print_r( $toReplace, true ) );
                            // $cli->output( print_r( $replacements, true ) );
                            // $cli->output( $dataString ); echo "\n";
                            unset( $toReplace, $replacements, $imageAttributes, $imageNodeAssignment );
                        }
                    }

                    $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0 );
                    $document = $parser->process( $dataString );
                    // $dataString = eZXMLTextType::domString( $document );
                    // $cli->output( $dataString );

                    // get links
                    $links = $document->getElementsByTagName( 'link' );
                    if ( is_object( $links ) && $links->length > 0 )
                    {
                        // $cli->output( print_r( $links, true ) );
                        // for each link
                        for ( $li = 0; $li < $links->length; $li++ )
                        {
                            $linkNode = $links->item( $li );
                            $url_id = $linkNode->getAttribute( 'url_id' );

                            $cli->output( 'Link Item Count: ' . $li );
                            $cli->output( 'Link Item ID: ' . $url_id );

                            if ( is_numeric( $url_id ) )
                            {
                                // create link between url (link) and object
                                $eZURLObjectLink = eZURLObjectLink::create( $url_id,
                                                   $contentObject->attribute( 'id' ),
                                                   $contentObject->attribute( 'current_version' ) );
                                $cli->output( print_r( $eZURLObjectLink, true ) );
                                // $cli->output( $url_id );

                                $eZURLObjectLink->store();
                            }
                        }
                    }
                }
                break;
                default:
            }
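
The URL clean-up step in the loop above can be tried on its own. This is only a minimal sketch in plain PHP (no eZ Publish classes): the function name normalizeImageUrl and the $siteRoot parameter are made up for illustration, and it trims before encoding spaces, which is a small variation on the original line.

```php
<?php
// Sketch only: clean up an <img> src value before downloading it.
// normalizeImageUrl() and $siteRoot are hypothetical names, not part of bcimportcsv.
function normalizeImageUrl( $src, $siteRoot )
{
    // Remove backslash-escaped and plain quotes left over from the export
    $src = str_replace( array( '\"', "\\'", '"', "'" ), '', $src );
    // Trim first, then percent-encode the spaces that remain inside the URL
    $src = str_replace( ' ', '%20', trim( $src ) );
    // Root-relative path: prefix the site root
    if ( substr( $src, 0, 1 ) == '/' )
    {
        $src = rtrim( $siteRoot, '/' ) . $src;
    }
    return $src;
}

echo normalizeImageUrl( '\"/images/my photo.jpg\"', 'http://www.example.com/' ), "\n";
echo normalizeImageUrl( 'http://cdn.example.com/a.png', 'http://www.example.com/' ), "\n";
```

Absolute URLs pass through unchanged; only paths starting with "/" get the site root prepended.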

Cheers,
Heath

Brookins Consulting | http://brookinsconsulting.com/
Certified | http://auth.ez.no/certification/verify/380350
Solutions | http://projects.ez.no/users/community/brookins_consulting
eZpedia community documentation project | http://ezpedia.org

David Ogilo

Monday 03 August 2009 6:12:02 am

Thanks Heath.

Hmm, do I have to save a copy of the image on the eZ Publish backend? How about object tags?

Is there a way to include a custom tag which won't be stripped out from the HTML?

Rainer Krauss

Monday 03 August 2009 6:27:06 am

Hi David,

It depends on what you actually want to achieve.

Would you like to get something into eZ Publish that contains
- an image or
- a file?

Would you like to have the image / file be shown or should it be available for download?

If you want to include files / images as links, you can add them to the media library and reference them with eznode:// or ezobject:// links followed by the node or object ID.

The tags eZXMLText accepts are documented here:
http://ez.no/doc/ez_publish/technical_manual/4_0/reference/xml_tags
You would have to replace your tags with valid ones from that list.
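
As a rough illustration of replacing tags with valid ones, here is a minimal sketch in plain PHP that swaps <img> tags for the <embed href="ezobject://..."> form used earlier in this thread. The function name imgTagsToEmbeds and the $objectIdsBySrc map are assumptions for the example; the image objects are assumed to exist already, and the pattern only handles quoted src values.

```php
<?php
// Sketch only: swap <img> tags for eZ XML <embed> tags.
// $objectIdsBySrc is a hypothetical map: image URL => content object ID of an
// image object that has already been created and published.
function imgTagsToEmbeds( $html, array $objectIdsBySrc )
{
    // Group 2 captures the quoted src value of each <img> tag
    $pattern = '/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1[^>]*>/si';
    return preg_replace_callback(
        $pattern,
        function ( $m ) use ( $objectIdsBySrc )
        {
            $src = $m[2];
            if ( !isset( $objectIdsBySrc[$src] ) )
            {
                // Unknown image: drop the tag
                return '';
            }
            return '<embed href="ezobject://' . (int) $objectIdsBySrc[$src] . '" size="original" />';
        },
        $html
    );
}

$html = '<p>Before <img src="/img/a.jpg" alt="A" /> after <img src="/img/b.jpg"></p>';
echo imgTagsToEmbeds( $html, array( '/img/a.jpg' => 123 ) ), "\n";
```

The result can then be handed to the parser in place of the original HTML.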

Best wishes,
Rainer

David Ogilo

Friday 07 August 2009 2:34:48 am

Thanks Rainer!

That was exactly what I wanted, worked like a charm!! :)

David

Sébastien Antoniotti

Friday 28 August 2009 9:35:39 am

Hi,

I'm reviving this topic because I'm doing a similar import, but I would like to create objects and nodes like this:

$billet_attributes = array(
    'titre'   => "article 1",
    'contenu' => "<p>some xhtml content</p>"
);
$params = array();
$params['parent_node_id'] = '2';
$params['class_identifier'] = 'wx_billet';
$params['attributes'] = $billet_attributes;

$object = eZContentFunctions::createAndPublishObject( $params );

The problem is that the 'contenu' attribute is not set correctly. I think this is because I need to parse the XHTML content first, but I don't know how, because in my case I don't yet have the $contentObjectID needed in

$parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0, false );

Thanks in advance !

eZ Publish Freelance
web : http://www.webaxis.fr

David Ogilo

Thursday 01 October 2009 7:19:37 am

Hi Sébastien,

You could try this:

'contenu'  => htmlentities("<p>some xhtml content</p>")
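
For reference, a quick standalone check of what that call produces (plain PHP, no eZ Publish classes involved). It only shows the string that would be passed in; whether the attribute then stores it the way you want is not something this sketch can confirm.

```php
<?php
// Sketch only: what htmlentities() does to the sample markup.
$markup  = '<p>some xhtml content</p>';
$escaped = htmlentities( $markup );

// The angle brackets are escaped, so the string no longer contains tags
echo $escaped, "\n";
// Decoding restores the original markup
echo html_entity_decode( $escaped ), "\n";
```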

Sébastien Antoniotti

Thursday 01 October 2009 10:55:05 am

Thanks ! I'll try it !

