Cut'n'paste in Windows creates links in editor

Author Message

Atle Pedersen

Friday 23 January 2009 3:52:39 am

Hello,

Lately a problem has started to surface, and I don't think it is a problem only with eZ Publish.

Some people using Windows who cut'n'paste from, for example, Word end up with the whole content becoming a link to a local file called clip_filelist.xml.

Using Google, I found this:
http://objectmix.com/net-objects-fusion/613884-html-view-shows-blank-page-nof.html

The text inserted when pasting seems to contain this:

<META CONTENT="Word.Document" NAME="ProgId">
<META CONTENT="Microsoft Word 9" NAME="Generator">
<META CONTENT="Microsoft Word 9" NAME="Originator">
<LINK REL="File-List"
HREF="file:///C:/WINDOWS/TEMP/msoclip1/01/clip_filelist.xml">
<!-- [if gte mso 9] -->
<!-- [endif] -->

The editor then identifies the link tag, and puts the whole content in a link.

This seems to happen to only a few people. I had never seen it before last week, but I have seen it twice since then.

Has anyone else seen this problem, or does anyone know when it strikes and what the best way to deal with it is?

Ronny Vedå

Friday 23 January 2009 4:54:39 am

We also have a customer with the same problem. I suspect Microsoft Office has installed some component that messes something up in the editor.

André R.

Friday 23 January 2009 4:57:44 am

This is an Office 2007 issue. It also occurs if you copy text from Word 2007 to Word 2003 and then to the editor.
You can enable the pasteword button in OE5 (see ezoe.ini) and paste the content from Word 2007 into the popup you get when you click the button.

This might only affect Firefox + Word 2007 users, but I'm not sure as I don't have Office 2007 myself.

eZ Online Editor 5: http://projects.ez.no/ezoe || eZJSCore (Ajax): http://projects.ez.no/ezjscore || eZ Publish EE http://ez.no/eZPublish/eZ-Publish-Enterprise-Subscription
@: http://twitter.com/andrerom

Ronny Vedå

Friday 23 January 2009 5:03:29 am

Thanks André. We'll try that.

André R.

Sunday 25 January 2009 6:21:32 am

I know there are efforts in the TinyMCE community to fix the paste-from-Word situation.
In the meantime I can add code to the input parser that removes the problem link when you save.

So the question is, does the href always look like this?

<LINK REL="File-List" HREF="file:///C:/WINDOWS/TEMP/msoclip1/01/clip_filelist.xml">

I can simply add a block list of URLs that will be treated as invalid URLs, so that the <link> tag is not stored (but the content of the tag is).
From some googling it seems that I should look for the REL value and the end of the HREF (msoclip1/01/clip_filelist.xml).
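
The check described above (match on the REL value and on the end of the HREF) could be sketched roughly as follows. This is Python for illustration only, not ezoe code: the helper name looks_like_word_clip_link, the case-insensitive comparison and the backslash handling are all assumptions.

```python
# Hypothetical helper, not part of ezoe. The suffix is the one quoted in the post.
WORD_CLIP_SUFFIX = "msoclip1/01/clip_filelist.xml"


def looks_like_word_clip_link(rel, href):
    """Return True when a <link> tag looks like Word's clipboard file list."""
    if not rel or not href:
        return False
    # Assumption: compare case-insensitively and treat "\" like "/",
    # since Windows paths may show up with either separator.
    normalized = href.replace("\\", "/").lower()
    return rel.lower() == "file-list" and normalized.endswith(WORD_CLIP_SUFFIX)
```

A fixed suffix only matches this exact temp folder layout, so the list would need widening if Word writes the file somewhere else.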

André R.

Monday 26 January 2009 2:38:55 am

Update: I found some other web pages that mention this issue. Here is part of the markup that can come from Word:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="ProgId" content="Word.Document" />
<meta name="Generator" content="Microsoft Word 12" />
<meta name="Originator" content="Microsoft Word 12" />
<link rel="File-List" href="file:///C:\Users\User\AppData\Local\Temp\msohtmlclip1\01\clip_filelist.xml" />
<link rel="themeData" href="file:///C:\Users\User\AppData\Local\Temp\msohtmlclip1\01\clip_themedata.thmx" />
<link rel="colorSchemeMapping" href="file:///C:\Users\User\AppData\Local\Temp\msohtmlclip1\01\clip_colorschememapping.xml" />

The meta tags are no problem, as the parser ignores them. The link tags, on the other hand, are not ignored.
So here is a patch for ezoe that ignores the Word 2007 links. If you have Word 2007, please test it if you can:

Index: trunk/ezoe/ezxmltext/handlers/input/ezoeinputparser.php
===================================================================
--- trunk/ezoe/ezxmltext/handlers/input/ezoeinputparser.php	(revision 3495)
+++ trunk/ezoe/ezxmltext/handlers/input/ezoeinputparser.php	(working copy)
@@ -76,7 +76,7 @@
         'ul'      => array( 'name' => 'ul' ),
         'li'      => array( 'name' => 'li' ),
         'a'       => array( 'nameHandler' => 'tagNameLink' ),
-        'link'    => array( 'name' => 'link' ),
+        'link'    => array( 'nameHandler' => 'tagNameLink' ),
        // Stubs for not supported tags.
         'tbody'   => array( 'name' => '' ),
         'thead'   => array( 'name' => '' ),
@@ -254,8 +254,17 @@
     function tagNameLink( $tagName, &$attributes )
     {
         $name = '';
-        if ( isset( $attributes['href'] ) )
+        if ( isset( $attributes['href'] )
+          && isset( $attributes['rel'] )
+          && ( $attributes['rel'] === 'File-List' || $attributes['rel'] === 'themeData' || $attributes['rel'] === 'colorSchemeMapping' )
+          && ( strpos( $attributes['href'], '.xml' ) !== false || strpos( $attributes['href'], '.thmx' ) !== false) )
         {
+            // empty check to not store buggy links created
+            // by pasting content from ms word 2007
+        }        
+        else if ( isset( $attributes['href'] ) )
+        {
+            // normal link tag
             $name = 'link';
             if ( isset( $attributes['name'] ) && !isset( $attributes['anchor_name'] ) ) $attributes['anchor_name'] = $attributes['name'];
         }
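
To try the condition from the patch without an eZ Publish install, it can be mirrored in a few lines of Python. This is only a sketch of the decision logic shown in the diff, not the ezoe parser: tag_name_link, LinkCollector and classify_links are hypothetical names, and any part of tagNameLink that the diff does not show is left out.

```python
from html.parser import HTMLParser

# rel values that the patch treats as Word 2007 clipboard links
WORD_RELS = {"File-List", "themeData", "colorSchemeMapping"}


def tag_name_link(attributes):
    """Return '' to drop the tag or 'link' to keep it, as in the patched condition."""
    href = attributes.get("href")
    rel = attributes.get("rel")
    if href is not None and rel in WORD_RELS and (".xml" in href or ".thmx" in href):
        return ""  # link created by pasting from Word 2007: do not store it
    if href is not None:
        return "link"  # normal link tag
    return ""


class LinkCollector(HTMLParser):
    """Collect (tag, rel, resulting name) for every <a> and <link> tag."""

    def __init__(self):
        super().__init__()
        self.decisions = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here by default.
        if tag in ("a", "link"):
            attributes = dict(attrs)
            self.decisions.append((tag, attributes.get("rel"), tag_name_link(attributes)))


def classify_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.decisions
```

Feeding it the Word 12 markup above followed by an ordinary <a href="..."> tag should give an empty name for the three Word links and 'link' for the ordinary one.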

André R.

Monday 26 January 2009 2:15:36 pm

The patch is committed in rev. 3528 of ezoe, so it will be part of 5.0rc12.
Tested with Word Viewer 2007: the internal Word links are no longer stored.
