Add XHTML datatype (replace XML datatype?)

Łukasz Serwatka

Friday 11 November 2005 2:01:10 am

I agree with Bruce, good valid XHTML on output is important.

Storing content in XML gives you more freedom to export and present it in other technologies like WML, XHTML, PDF, XML, or whatever you need. Everything depends on the output handlers, and you can write your own handler as well.

  XHTML  WML  XML or other format.
     \    |   /
      +------+
      |output|
      |handl.|
      +------+
      | XML  |
      +------+
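A minimal sketch of the output-handler idea in the diagram above, written in Python for brevity. The element names, the `HANDLERS` table and the `render` function are assumptions made for illustration; this is not the eZ Publish handler API or the real ezxml schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical stored content; the element names are illustrative only.
STORED = "<section><header>News</header><paragraph>Hello world</paragraph></section>"

def to_xhtml(root):
    """Render the stored tree as an XHTML fragment."""
    tag_map = {"header": "h1", "paragraph": "p"}
    parts = []
    for child in root:
        tag = tag_map[child.tag]
        parts.append(f"<{tag}>{escape(child.text or '')}</{tag}>")
    return "<div>" + "".join(parts) + "</div>"

def to_text(root):
    """Render the same tree as plain text, one block per line."""
    return "\n".join(child.text or "" for child in root)

# One handler per output format; adding a format means adding an entry here.
HANDLERS = {"xhtml": to_xhtml, "text": to_text}

def render(stored_xml, fmt):
    """Parse the stored XML once and hand it to the chosen output handler."""
    return HANDLERS[fmt](ET.fromstring(stored_xml))

print(render(STORED, "xhtml"))
print(render(STORED, "text"))
```

The stored XML never changes; only the handler picked at output time decides what the visitor gets.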

"because it is easiest to deal with XHTML in GUI tools."

CMS systems are mostly for editors, who don't know XHTML syntax. They focus mostly on content. I agree that GUI tools for editors are important (they make their life easier), but from an editor's point of view it doesn't matter what the input is (technically). As long as they work with e.g. Online Editor and their content, they don't care about XHTML syntax, simplified XML syntax, or whatever.

Personal website -> http://serwatka.net
Blog (about eZ Publish) -> http://serwatka.net/blog

Bruce Morrison

Friday 11 November 2005 2:22:26 am

Hi Gabriel

"Because the data of an old site likely is in XHTML already"

There was a discussion at a CMS meeting and/or mailing list about automated conversion of existing sites to a CMS site and most people stated that the content almost always needed to be cleaned up and/or rewritten when doing this.

Automated imports often proved expensive and required further editing.

I'm not sure that this is a compelling reason for the internal storage to be XHTML. It *does* point to the need for an XHTML import parser.

"and because it is easiest to deal with XHTML in GUI tools."

OE works, doesn't it? ;)

Cheers
Bruce

My Blog: http://www.stuffandcontent.com/
Follow me on twitter: http://twitter.com/brucemorrison
Consolidated eZ Publish Feed : http://friendfeed.com/rooms/ez-publish

Gabriel Ambuehl

Friday 11 November 2005 2:24:37 am

Obviously the content producers should not have to input XHTML.

But the system had better be able to import it, for a lot of the reasons pointed out above. If the import filters supported XHTML in a sane way (currently that's far from being the case) I couldn't care less what it stores internally...

Visit http://triligon.org

James Robertson

Wednesday 05 April 2006 2:30:54 pm

OK, obviously there is some unsaid business reason [sales of eZ publish Online Editor for ezxml?] why eZ systems are resisting this idea. Fair enough. It's their product and yet at the same time it's Open Source, so we all have the opportunity to customize it. Thanks guys :-)

I can't resist making one [last?] point however ... again:

XHTML *is* an XML schema!

Therefore any arguments made about abstracting the data so that it can be re-presented in WML [yeah right ;-], etc., etc., apply equally well to XHTML as they do to ezxml. Except that there are many, many, many more tools available to help manipulate XHTML and many, many, many more people who are familiar with XHTML; yes ... *even* editors of websites.
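To illustrate the point that XHTML can be handled by ordinary XML tooling, here is a small sketch using Python's standard XML parser. The sample document, its placeholder URL and the `extract_links` helper are made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up XHTML document; the link target is a placeholder.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<h1>Title</h1><p>See <a href="http://example.com/">this page</a>.</p>'
    '</body></html>'
)

def extract_links(xhtml):
    """Return (href, link text) pairs using a generic XML parser, no HTML parser needed."""
    root = ET.fromstring(xhtml)
    return [(a.get("href"), a.text) for a in root.iter(f"{{{XHTML_NS}}}a")]

print(extract_links(DOC))
```

Because well-formed XHTML parses as XML, the same tree could just as well be fed to XSLT or any other XML pipeline.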

liu spider

Tuesday 11 April 2006 9:42:17 pm

there are indeed some open source ezxmltext editors out there:

for example this one:
http://sf.net/projects/sjsd

and of course, you can use xhtml if you want

on the other hand, as already suggested in this thread (IIRC), ezxmltext is only a very small subset of xhtml (with some minor incompatibilities). As it is minimal, its behavior can be well defined and it prevents you from doing something like "inline style" in your tags.

http://liucougar.scim-im.org
SCIM Input Method Platform
http://scim.sf.net
SJSD Online Editor
http://sf.net/projects/sjsd

Xavier Dutoit

Wednesday 12 April 2006 12:18:21 am

Hi,

I haven't yet tried the 3.8 input handler that's supposed to be more "human friendly", but up to 3.7 the approach has been to strictly enforce the validity of the content. This is very fine from an IT point of view, but I've never been able to explain to a user why he/she can't input content because "tag ul can't be a child of strong" (or another equally explicit error message).

Be strict about what you produce and liberal about what you accept. Can you imagine a web where browsers reject any non-valid (X)HTML page? The web would be rather empty...

Anyway, the issue is that the simplified XML format eZ uses isn't the one used to store the content, and it imposes arbitrary limitations (IMO, inline style should be allowed if you really want it). If needed, it can be filtered out, but the restriction shouldn't be enforced by the format.
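As a rough sketch of "filter it out instead of forbidding it in the format": the function below accepts any well-formed fragment and drops `style` attributes afterwards. The `strip_inline_style` name is an assumption for illustration, and Python is used only for brevity; it is not part of eZ Publish.

```python
import xml.etree.ElementTree as ET

def strip_inline_style(fragment):
    """Accept the markup as entered, then remove every inline style attribute."""
    root = ET.fromstring(fragment)
    for element in root.iter():
        # pop() with a default is a no-op when the attribute is absent
        element.attrib.pop("style", None)
    return ET.tostring(root, encoding="unicode")

print(strip_inline_style('<p style="color:red">Hi <b style="x">there</b></p>'))
```

A site that wants inline style simply doesn't run the filter; the stored format stays the same either way.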

X+

http://www.sydesy.com

Kirill Subbotin

Tuesday 25 April 2006 7:49:14 am

Hi, All, again!

Good news is that version 3.8 really does contain a different input parser, the same one that is used in OE. The "simplified xml" and "dhtml" input handlers are derived from the same base class that performs conversion from one markup language to another, and only some tag-specific functions are different. The parser performs validation based on the ezxml schema (see the ezxmlschema.php class). This is not a standards-based schema, just an array with some properties and related functions, but a standard schema can be built from it. Currently there are some problems with creating a standard schema (say, a DTD), mainly because of the dual nature of custom tags. But this schema is still useful, and as far as I know there was no schema at all before. In eZp 4.0 we are going to create a clean standards-based XML schema that doesn't limit users in any way.
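For readers wondering what an "array with some properties" schema can look like, here is a deliberately tiny sketch in Python. The `SCHEMA` table and `validate` function are hypothetical and far simpler than ezxmlschema.php; they only show how nesting rules kept in a plain data structure can drive validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: each tag maps to the set of child tags it may contain.
SCHEMA = {
    "section": {"paragraph", "ul"},
    "paragraph": {"strong", "ul"},
    "strong": set(),
    "ul": {"li"},
    "li": {"strong"},
}

def validate(xml_text):
    """Return a list of nesting errors; an empty list means the input is valid."""
    errors = []

    def walk(element):
        allowed = SCHEMA.get(element.tag)
        if allowed is None:
            errors.append(f"unknown tag {element.tag}")
            return
        for child in element:
            if child.tag not in allowed:
                errors.append(f"tag {child.tag} can't be a child of {element.tag}")
            walk(child)

    walk(ET.fromstring(xml_text))
    return errors

print(validate("<section><paragraph><strong><ul><li>x</li></ul></strong></paragraph></section>"))
```

Since the rules are plain data, a DTD or other standard schema could in principle be generated from such a table, which is the direction described above.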

Also the procedure that generates "simplified XML" for the edit field has been rewritten and it is a separate class now.

2Xavier:
The new parser really can be very lenient, and it can convert any input to valid ezxml. But... after some testing I decided to make it stricter and display more error messages, because in reality this IS more "user-friendly". The computer doesn't know what the user means by incorrect input. Imagine: you enter invalid XML and the computer "eats" it and stores valid XML, but part of the input is lost or the result is not even close to what you expected, and you are not even warned about it. This is confusing.

But if you want, you can easily turn error messages off: just set $validate = false in the parser's constructor. That's all, it will eat any input. But even in "$validate = true" mode some errors of the first pass are ignored and corrected by default (this depends on the "error level"). For example, by default it is not necessary to close all tags. Sometimes this works as you expect, but sometimes you can get unexpected tag nesting or other error messages raised because of incorrect nesting.
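The trade-off described above can be sketched roughly as follows. This is not the eZ Publish parser: the `LenientParser` class, the `repair` helper and the message texts are invented for illustration, and Python's standard `html.parser` stands in for the real tokenizer. Unclosed tags are always closed automatically; the `validate` flag only controls whether the user is told about it.

```python
from html import escape
from html.parser import HTMLParser

class LenientParser(HTMLParser):
    """Closes unclosed tags itself; reports what it fixed only when validate is on."""

    def __init__(self, validate=True):
        super().__init__()
        self.validate = validate
        self.stack = []      # currently open tags
        self.out = []        # repaired output pieces
        self.messages = []   # messages shown to the user

    def _report(self, message):
        if self.validate:
            self.messages.append(message)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.stack:
            self._report(f"stray closing tag </{tag}> ignored")
            return
        # Close everything opened after `tag`, then `tag` itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break
            self._report(f"tag <{open_tag}> was not closed; closed automatically")

    def handle_data(self, data):
        self.out.append(escape(data))

def repair(text, validate=True):
    """Return (well-nested output, messages) for possibly sloppy input."""
    parser = LenientParser(validate)
    parser.feed(text)
    parser.close()
    while parser.stack:  # tags still open at end of input
        tag = parser.stack.pop()
        parser.out.append(f"</{tag}>")
        parser._report(f"tag <{tag}> was not closed; closed automatically")
    return "".join(parser.out), parser.messages

print(repair("<paragraph><strong>bold text</paragraph>"))
print(repair("<paragraph>hi", validate=False))
```

With `validate=False` the output is repaired the same way, but silently, which is exactly the situation where the stored result may differ from what the user meant without any warning.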

Kirill Subbotin

Tuesday 25 April 2006 8:07:03 am

2 James Robertson:
eZ systems is not resisting any suggestions. Moreover, we have agreed that the current format could be much better and could even be an XHTML subset. But this is something to change in eZ publish 4.0, and it makes no sense to do it now in 3.*. Again, of course all suggestions are welcome.

ps. btw, as far as I know, the current 3.* format was developed with XHTML 2.0 in mind (although I think it is not compatible).

James Robertson

Thursday 27 April 2006 3:06:01 pm

Hi Kirill

Thanks for your responses. You guys (eZ systems) are doing a great job. eZ publish is already a much better product than it was 1.5 years ago, when I first started using it.

I hope I didn't offend with any of my comments. I think we are coming at the problem from two different philosophical perspectives. I am encouraged to hear that things will be different in eZ publish 4.

Hi Xavier

Thank you too for your participation in this discussion. Your contributions have been most enlightening.

liu spider

Monday 22 May 2006 11:15:45 am

the ez 3.8 parser is much nicer, and SJSD svn is now based on it as well. The porting was very easy and straightforward, and resulted in much shorter and cleaner code, thanks for the great improvement

I have a suggestion about the ezxml format: could you consider adding support for the tbody and caption tags under table within the ez 3 lifetime?

almost everything else can be transformed into ezxml format without losing anything
