Creating Datatypes in eZ Publish 4

eZ Publish 4, which was released in November 2007, is the first eZ Publish version that runs on PHP 5. In addition, it uses the eZ Components framework. This gives extension developers a broad range of new features and possibilities, including exceptions, visibility modifiers, class constants, static class variables, magic getter and setter methods, the DateTime extension, class autoloading and de-referencing object return values.

In this article, we will create an extension for a new datatype to save PHP DateTime objects. We will outline the general requirements for creating datatypes and also give special attention to exceptions and the UserInput component.

Download the extension

The example extension from this article can be checked out via SVN at http://source.ymc.ch/svn/ymc_datatypes/ymcdatatype/.

One of the most appealing features of eZ Publish is its implementation of content classes and objects. The concept of content classes and objects in eZ Publish is borrowed from object-oriented programming: content classes are blueprints for content objects as are PHP classes for PHP objects. While PHP classes have attributes and methods, there are only attributes for content classes. Content classes can be created and modified at any time – even on production systems – without getting your hands dirty with PHP. This flexibility is possible because content classes are not hard-coded in PHP, but built from smaller pieces: datatypes.

Attributes of PHP classes can be of different types -- either the built-in primitive types like strings, integers, and boolean values or more complex types like classes and arrays.

Attributes of content classes are of datatypes. Consider a product object for a small online store. An attribute of the “Text line” datatype stores the name for your product, an attribute of the “Image” datatype provides a visual representation of the product, and an attribute of the “Price” datatype stores the price. The product ID is stored in an attribute of the “Integer” datatype.

The above datatypes are among the many datatypes included in standard eZ Publish distributions, making it possible to customize many types of content. However, sometimes your site has special needs that cannot be satisfied by one of the existing datatypes.

At Young Media Concepts we have a library of custom datatypes that we re-use for different projects. Some examples from this library are:

  • “ymcdomain” stores a domain name either with or without a subdomain. Domain names can be stored using the “Text line” datatype, but our datatype includes the functionality to validate domain strings.
  • “ymcinstantmessenger” stores users' nicknames for pre-defined IM services.
  • “ymcuniquestring” saves a plain text string, but only accepts the string if it is unique across all objects of the same content class.
  • “ymcenhancedselection” enables a content manager to select pre-defined content nodes under which a secondary location of the object will be automatically stored. As an example, we use this datatype to assign videos to categories, where categories are represented by nodes in the content node tree.

The datatype that we will create in this article will store dates together with their associated timezones. The date is represented in PHP by the DateTime extension, which comes bundled by default with PHP 5. Because PHP 4 uses 32-bit integers to store dates, it was previously not possible to enter dates before 1902 or after 2038. This limitation spurred the creation of the “ezbirthday” datatype, which we used as a model for the new “ymcdatetime” datatype. The PHP 5 DateTime class used by our datatype not only extends the date range to billions of years by using 64-bit integers, but also stores timezone information.
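As a quick illustration of both points, the following sketch uses only the bundled DateTime and DateTimeZone classes. The date and the “Europe/Zurich” identifier are arbitrary example values and are not taken from the extension.

```php
<?php
// A date far outside the 1902-2038 window, with an explicit timezone.
$tz = new DateTimeZone('Europe/Zurich');
$dt = new DateTime('1850-06-15 12:00:00', $tz);

// The object keeps both the point in time and the timezone it belongs to.
$formatted = $dt->format('Y-m-d H:i:s');
$zoneName  = $dt->getTimezone()->getName();

echo $formatted, ' ', $zoneName, "\n";
```

The timezone travels with the object, which is why the datatype can store a single DateTime instead of a timestamp plus a separate timezone field.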

Let's take a look at the Object Edit Interface of an object containing two attributes in the screenshot below, both of our “ymcdatetime” datatype. Suppose this object was of a class for car rental bookings. The two “ymcdatetime” attributes would represent the rental period, while additional attributes represent pickup and return stations, client name and car type.

Object Edit Interface using attributes of our datatype

Both of the attributes in the screenshot have seven input fields, representing the different elements of a point in time. These input fields are not saved separately inside our content class, but represented in attributes as single DateTime objects.

Our custom datatype extends the functionality of eZ Publish and therefore needs to be included in an extension. Refer to Felix Woldt's article “An Introduction to Developing eZ Publish Extensions” to learn more about different types of extensions. Here we will outline the necessary files and configuration for the extension.

At Young Media Concepts we have one extension called “ymcdatatype” that contains all of our general-purpose datatypes. Each datatype can also be created in a separate extension or combined with a related module in an extension.

In any case, the extension containing your datatype needs to be activated in a site.ini.append file, either in settings/override, settings/siteaccess/your_siteaccess or in the settings directory of an already activated extension, like this:

[ExtensionSettings]
ActiveExtensions[]=ymcdatatype

(Note that the setting name is “ActiveAccessExtensions[]” if you are activating the datatype for a specific siteaccess.)

Based on the settings above, eZ Publish knows that there is an extension called “ymcdatatype”, but it does not know about the new datatype yet. To point eZ Publish to the datatype contained in the extension, we place the settings below in a content.ini.append file, which is best placed in the settings directory of the datatype extension:

[DataTypeSettings]
ExtensionDirectories[]=ymcdatatype
AvailableDataTypes[]=ymcdatetime

The datatype also requires some templates (to be described next), so the extension also needs to be specified as a design extension in design.ini.append:

[ExtensionSettings]
DesignExtensions[]=ymcdatatype

Template files

Templates are required for class and object attributes of our datatype. In both cases, we need both view and edit templates, making a total of four required templates:

  • design/standard/templates/class/datatype/edit/ymcdatetime.tpl
  • design/standard/templates/class/datatype/view/ymcdatetime.tpl
  • design/standard/templates/content/datatype/edit/ymcdatetime.tpl
  • design/standard/templates/content/datatype/view/ymcdatetime.tpl

We will explore the code in these templates a bit more later.

PHP files

Required files

The main code of the datatype consists of two PHP classes in the following files:

  • datatypes/ymcdatetime/ymcdatetime.php: ymcDatatypeDatetime
  • datatypes/ymcdatetime/ymcdatetimetype.php: ymcDatatypeDatetimeType

The first file contains the actual content holder. Some simple datatypes like string or number types do not have an extra class to represent their content, but simply return an array or a primitive PHP type as representation of their content. If the content of your datatype has a complex structure or methods associated to the content, you should represent the content by a PHP class of its own.

In our case, the ymcDatatypeDatetime class contains shortcut methods to retrieve the elements of a DateTime object, such as the day, month, second or timezone, as well as methods for serializing and unserializing the information. The constructor of the class also checks the given input for validity. This way we could shorten the ymcDatatypeDatetimeType class and let the content container decide whether or not to handle the user input.

The name and location of the file ymcdatetime.php are not strictly enforced, but it is a general convention in eZ Publish and many extensions to name it this way. This keeps the main code blocks of your datatype in one place, where everybody knows to look for them.

The second file handles the connection between eZ Publish and our new datatype. The name and path of the second file is important because eZ Publish tries to load a file with this exact name when you register a datatype with the identifier “ymcdatetime”.

We also need to handle the HTTP input forms. This is done with two classes, defined in datatypes/ymcdatetime/classform.php and datatypes/ymcdatetime/objectform.php, which both extend ymcDatatypeForm from interfaces/form.php. Since this datatype has been developed for eZ Publish 4, we can make use of the UserInput component from eZ Components to handle form input. This is explained in detail later.

Useful additions

PHP 5, and therefore eZ Publish 4, offers two notable features for writing more stable and robust code. The first is autoloading, which frees you from all those “require” and “include” statements.

The second feature is the introduction of exceptions. In PHP 4, the only way to report errors from a function is for it to return FALSE or some other value, by convention indicating an error. However, this is a sub-optimal solution – how do we know which problem actually occurred and what if FALSE should also be a valid return value?

In every extension, we have an exceptions directory containing an exception class for, hopefully, each type of error that can occur. One such error in our datatype is invalid user input, which is indicated by a ymcDatatypeInvalidParamsException exception, defined in invalid_params.php.
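To make the contrast with FALSE return values concrete, here is a minimal sketch of a dedicated exception class. The names ExampleInvalidParamsException and parseFlag are hypothetical and are not part of the extension. The function may legitimately return FALSE, so only an exception can signal bad input unambiguously.

```php
<?php
// Hypothetical exception class, one per kind of error.
class ExampleInvalidParamsException extends Exception
{
}

// Hypothetical helper: FALSE is a valid result here, so it cannot
// double as an error marker.
function parseFlag($input)
{
    $map = array('yes' => true, 'no' => false);
    if (!is_string($input) || !array_key_exists($input, $map)) {
        throw new ExampleInvalidParamsException('Expected "yes" or "no".');
    }
    return $map[$input];
}

$flag  = parseFlag('no');   // a valid FALSE
$error = null;
try {
    parseFlag('maybe');
} catch (ExampleInvalidParamsException $e) {
    // The exception type tells us what went wrong.
    $error = $e->getMessage();
}
```

Catching the specific class rather than the generic Exception lets unrelated errors propagate to a handler that knows what to do with them.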

Input validation in PHP 5 should be done using the filter extension. This extension comes with many configurable filters to either block invalid data or sanitize it with certain rules. If you need to add extra filters, you can provide a callback function to the filter extension.

In our case, we need special filters for integer numbers with leading zeros. If somebody enters the month as “09” instead of “9”, it should still be valid for September, even if a leading zero in PHP usually indicates an octal number and “09” is not a valid octal number.

We put our filter callbacks in classes inside the filter_callbacks directory. For this datatype, we only need the ymcDatatypeFilterIntLeadingZero class, which is defined in filter_callbacks/int_leading_zero.php.
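The idea behind such a callback can be sketched with the plain filter extension. The function name exampleIntLeadingZero is hypothetical, and the real ymcDatatypeFilterIntLeadingZero class may be implemented differently.

```php
<?php
// Hypothetical callback: accept digit-only strings, including leading zeros.
function exampleIntLeadingZero($value)
{
    if (!is_string($value) || !preg_match('/^\d+\z/', $value)) {
        return false;               // mimic the filter convention for "invalid"
    }
    return (int) $value;            // "09" becomes 9
}

// The built-in integer filter rejects a leading zero.
$builtin = filter_var('09', FILTER_VALIDATE_INT);

// FILTER_CALLBACK hands the raw value to our own function instead.
$month = filter_var('09', FILTER_CALLBACK, array('options' => 'exampleIntLeadingZero'));
```

Returning FALSE for invalid input keeps the callback consistent with the built-in validation filters, so calling code can treat both the same way.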

We start our journey through the PHP code with the ymcDatatypeDatetime class, which holds a DateTime object.

The first method is the constructor. It takes a string representing the date as input. The string can be in any format accepted by the DateTime class, which follows the input syntax of the GNU date command. The second parameter represents the timezone and can be given either as a string or as a DateTimeZone object; if omitted, it defaults to the timezone configured in the PHP date.timezone INI setting.

    public function __construct( $dtString = NULL, $tz = NULL )
    {
        if( is_string( $dtString ) )
        {
            if( is_string( $tz ) )
            {
                try{
                    $tz = new DateTimeZone( $tz );
                } catch( Exception $e ) {
                    throw new ymcDatatypeInvalidParams(
                            $tz.' is no valid timezone identifier.'
                    );
                }
            }
            elseif( NULL === $tz )
            {
                $tz = new DateTimeZone( date_default_timezone_get() );
            }
 
            if( !$tz instanceOf DateTimeZone )
            {
                throw new ymcDatatypeInvalidParams(
                        'The second parameter of the constructor needs to be either a '
                        .'valid Timezone string or an instance of DateTimeZone.' );
            }
 
            try
            {
                $this->dateTime = new DateTime( $dtString, $tz );
            }
            catch( Exception $e )
            {
                throw new ymcDatatypeInvalidParams( $dtString.' is no valid DateTime string.');
            }
        }
    }

What I want to point out here are the exceptions used in the constructor, as shown below:

            try
            {
                $this->dateTime = new DateTime( $dtString, $tz );
            }
            catch( Exception $e )
            {
                throw new ymcDatatypeInvalidParams( $dtString.' is no valid DateTime string.');
            }

If you try to instantiate a ymcDatatypeDateTime object with invalid data, an exception is thrown. This is no problem, since you can catch the exception and produce the appropriate feedback to the user. The advantage in the code is that the input validation can be reduced to just a few lines in the ymcDatatypeDateTimeType class, similar to the following simplified example:

        try{
            $this->content = new ymcDatatypeDateTime( $datestr, $tz );
        }
        catch( ymcDatatypeInvalidParams $e )
        {
            $this->content = NULL;
            return eZInputValidator::STATE_INVALID;
        }
        return eZInputValidator::STATE_ACCEPTED;

Inside the templates, we work with an instance of the ymcDatatypeDateTime class to show the content of the datatype. To get the data out of the object, eZ Publish needs two methods to access the data: attribute and hasAttribute. These two methods do essentially the same thing as the new PHP 5 magic methods __get and __isset. Therefore our methods simply forward the request to those magic methods.
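The forwarding can be sketched as follows. ExampleDateTimeContent and its small property table are hypothetical simplifications; the real ymcDatatypeDateTime class exposes more elements and may be structured differently.

```php
<?php
// Hypothetical, reduced content holder around a DateTime object.
class ExampleDateTimeContent
{
    private $dateTime;

    // Property name => DateTime::format() character (assumed mapping).
    private static $formats = array(
        'year'     => 'Y',
        'month'    => 'n',
        'day'      => 'j',
        'timezone' => 'e',
    );

    public function __construct(DateTime $dateTime)
    {
        $this->dateTime = $dateTime;
    }

    public function __get($name)
    {
        if (!isset(self::$formats[$name])) {
            throw new InvalidArgumentException("Unknown property: $name");
        }
        return $this->dateTime->format(self::$formats[$name]);
    }

    public function __isset($name)
    {
        return isset(self::$formats[$name]);
    }

    // The two methods the template engine calls simply forward.
    public function attribute($name)
    {
        return $this->__get($name);
    }

    public function hasAttribute($name)
    {
        return $this->__isset($name);
    }
}

$content = new ExampleDateTimeContent(
    new DateTime('2008-02-29 10:00:00', new DateTimeZone('UTC'))
);
echo $content->attribute('day'), ' ', $content->timezone, "\n";
```

With this arrangement, PHP code can use the natural `$content->day` syntax, while templates go through attribute() and hasAttribute().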

We will discuss the templates and how to retrieve the content later.

The remaining __toString and createFromString methods are needed to serialize and unserialize the object to and from a string that can be saved in the database:

    public function __toString()
    {
        if( NULL === $this->dateTime )
        {
            return '';
        }
        return $this->dateTime->format( self::FORMAT_MYSQL_FULL )
              .$this->dateTime->format( self::FORMAT_TIMEZONE_IDENTIFIER );
    }
 
    public static function createFromString( $string )
    {
        if( '' === $string )
        {
            return new self;
        }
        return new self(
            substr( $string, 0, 19 ),
            substr( $string, 19 )
        );
    }
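The round trip can be tried out without the class. The two class constants are not shown in this article, so the sketch below assumes that FORMAT_MYSQL_FULL is 'Y-m-d H:i:s' (always 19 characters for four-digit years) and that FORMAT_TIMEZONE_IDENTIFIER is 'e'. Both values are assumptions that happen to match the substr() offsets used above.

```php
<?php
// Assumed format strings; the real constants live in the extension source.
$formatFull     = 'Y-m-d H:i:s';
$formatTimezone = 'e';

$original = new DateTime('2008-03-01 14:30:00', new DateTimeZone('Europe/Oslo'));

// Serialize: fixed-width date part followed directly by the timezone identifier.
$serialized = $original->format($formatFull) . $original->format($formatTimezone);

// Unserialize: split at character 19, as createFromString() does.
$restored = new DateTime(
    substr($serialized, 0, 19),
    new DateTimeZone(substr($serialized, 19))
);

echo $serialized, "\n";
```

The fixed split position only works while the date part has a constant width, which is worth keeping in mind for years outside the four-digit range.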

As previously mentioned, the connection between eZ Publish and our new datatype is made with the ymcDatatypeDateTimeType class, which extends eZDataType. Configuration of the datatype is then made by overriding certain methods of the parent class.

While eZDataType contains dozens of methods, the methods can be grouped into three main categories for the purpose of our custom datatype: methods dealing with user input for the content class attribute; those dealing with user input for the content object attribute; and other methods. These are outlined below. For more detailed information, be sure to study the code and accompanying comments in our example extension.

Content class attribute methods

Recall the previous screenshot showing the Object Edit Interface for two attributes using the “ymcdatetime” datatype. The screenshot below shows the Class Edit Interface for the content class with the same two attributes.

Class Edit Interface using attributes of our datatype

Most of the form fields are common to all datatypes and are the same for every class attribute. Only the Default value field is controlled by our custom datatype, which allows content managers to select whether a new instance of the datatype should be pre-filled with the current date or left empty.

When “Current date” is selected, the input fields of this attribute will be pre-filled with the current date for new objects. When “Empty” is selected, the input fields of this attribute will be blank.

The three methods listed below from the ymcDatatypeDateTimeType class are important for the Class Edit Interface we just described. They are described in the order in which they are called from the eZ Publish core:

  • initializeClassAttribute
    This method can be used to set default values for the configuration of a new class attribute of this datatype. For our datatype we set the value of the dropdown list for the default value to “Empty”.
  • validateClassAttributeHTTPInput
    Here, we check whether the submitted input is valid. The return value should be either eZInputValidator::STATE_INVALID or eZInputValidator::STATE_ACCEPTED.
  • fetchClassAttributeHTTPInput
    Here, we take the submitted input and save it in the provided eZContentClassAttribute parameter so that eZ Publish can store it in the database. Note that this method is also called when we return STATE_INVALID. This might appear unnecessary at first, but makes sense when we have multiple input fields to take care of. If only one input field has invalid data, we still want to keep the input of the valid fields so that content managers do not have to re-enter it.

Content object attribute methods

There are three methods for object editing that mirror the functionality described above for class editing. These are initializeObjectAttribute, validateObjectAttributeHTTPInput and fetchObjectAttributeHTTPInput.

The only difference lies in the initializeObjectAttribute method. It is called not only when a content object is created for the first time, but also whenever a new version is created. Since every edit operation on a content object creates a new draft version, this method is called on every edit.

If we create a new version of an object and with it also a new version of our attribute, we need to copy the content of the old version to the new one. This may seem bothersome, but is a necessary step to allow datatypes the option to do more than simply copy their content for new versions. Some content might need to be changed in some way between versions, for example to increase a counter or set a date.

The hasObjectAttributeContent method checks whether the attribute holds any data. If the attribute has the “required” flag set, eZ Publish will refuse to save the object when this method reports that there is no content.

The objectAttributeContent method returns the content of the object. This can be a primitive PHP type like an integer or string, or an object for more complex content structures. In our case, we return an instance of ymcDatatypeDatetime.

Other methods

Of the remaining methods, the most important one is the constructor. It announces an identification string and a human-readable string to the parent class eZDataType. The identification string is used by eZ Publish to identify datatypes throughout the system and should therefore be chosen with care. It is a good idea to have a unique prefix for all your datatypes. (Please do not use “ymc” as this is used by us!) The second string is used in the interfaces and it is common to use the internationalization features of eZ Publish to present the user with a localized name of the datatype:

    const DATATYPE_STRING = 'ymcdatetime';

    public function __construct()
    {
        parent::__construct( self::DATATYPE_STRING, 'ymcDateTime' );
    }

The last five methods return information necessary to sort and search the datatype and a string representation of the datatype that can be used as part of the content object's title. Please refer to the source code of our example datatype for more information.

If you have looked at the source code, you may have seen a rather uncommon line of code at the end of the file:

eZDataType::register( ymcDatatypeDateTimeType::DATATYPE_STRING, "ymcDatatypeDateTimeType" );

This coding style dates from before PHP 5, when classes could not be loaded automatically. The line must still be included at the end of every datatype file.

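The first argument is the same datatype identifier as in the constructor and the second argument is the name of the PHP class implementing the datatype. Although PHP itself resolves class names case-insensitively, lookups keyed by the class name (such as an autoload map) may not, so spell the class name exactly as it is declared.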

The filter extension gives you a powerful API to block or sanitize submitted input. The UserInput component builds an object-oriented API on top of the extension. This helps us to put the form parsing code in a separate class and thereby makes the rest of the code much more readable.

Forms are objects too!

Do you also dislike PHP code where form parsing and business logic are mixed together? Why not give forms the appreciation they deserve and put them in their own PHP classes? All code that is necessary to answer the question “Can we process this data?” should be put in a form class. Code that answers the question “What went wrong with the input?” should be put in the form class too, resulting in elegant controllers in your MVC.

The UserInput component helps by providing a predefined structure to declare the input you are expecting. Consider the following screenshot of a more complicated datatype:

Enhanced edit form for a datatype

The indicated input fields were used as an example to write a valid form definition for the UserInput component:

$formDef = array(
  'isnodeplacement_value' => new ezcInputFormDefinitionElement(
      ezcInputFormDefinitionElement::OPTIONAL, 'boolean' ),
  'option_name_array' => new ezcInputFormDefinitionElement(
      ezcInputFormDefinitionElement::OPTIONAL, 'string', NULL, FILTER_REQUIRE_ARRAY ),
  'move_option_up' => new ezcInputFormDefinitionElement(
      ezcInputFormDefinitionElement::OPTIONAL, 'string', NULL, FILTER_REQUIRE_ARRAY ),
  'newoption_button' => new ezcInputFormDefinitionElement(
      ezcInputFormDefinitionElement::OPTIONAL, 'string' ));

Such an array can then be placed in a class representing the form object. One major benefit of this approach is that all possible input variables are defined in one place.
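The UserInput component is not bundled with PHP, so as a rough, self-contained analogue the sketch below declares similar expectations with the plain filter extension and filter_var_array(). This is a swapped-in technique and not the ezcInputForm API; the sample values and the “unexpected_field” key are made up.

```php
<?php
// Input as it might arrive in $_POST (hypothetical values).
$post = array(
    'isnodeplacement_value' => '1',
    'option_name_array'     => array('Sports', 'News'),
    'unexpected_field'      => 'ignored',
);

// One central declaration of every field we are willing to read.
$definition = array(
    'isnodeplacement_value' => FILTER_VALIDATE_BOOLEAN,
    'option_name_array'     => array(
        'filter' => FILTER_UNSAFE_RAW,
        'flags'  => FILTER_REQUIRE_ARRAY,
    ),
    'newoption_button'      => FILTER_UNSAFE_RAW,
);

// Only declared keys appear in the result; absent ones become NULL.
$clean = filter_var_array($post, $definition);
```

As with the form definition above, anything that is not declared never reaches the rest of the code.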

Another method I put in my form objects is isValid(). Since you have all the validation logic in the form class, you could even check the validity of a form in one line:

if ( MyFormClass::getInstance()->isValid() ) ...

Now let's see how this approach can be applied for our datatype.

UserInput and the ymcDateTime datatype

If you have a simple website with one form containing fields like “firstname”, “surname” and “email”, you can easily use these field names as the array keys of the form definition. The form declaration method described in the previous section is a bit more difficult to deal with when it comes to datatypes.

Remember that a content class in eZ Publish is not represented by a simple form hard-coded in your PHP script. In fact, the form of a content class (or object) is dynamically created based on the included datatypes. Recall that our example content class contains two attributes of the same datatype. If you were to hard-code the field names, there would be no way to determine whether an input field named “day” belonged to the “from” or the “to” attribute.

You need to add another piece of information to the input field names to indicate the attribute to which an input field belongs. This information is provided by a suffix consisting of the field name and the ID of the attribute to which it belongs. To be completely safe, another string is prefixed to the field name, indicating that it belongs to an attribute of a content class and further specifying the datatype. Therefore, your final field name could be “ContentClass_mydatatype_myfield_123”.

Normally, for every field name you want to access, you would need to write code similar to this:

$dataName = $http->postVariable($base . "_mydatatype_myfield_" . $contentObjectAttribute->attribute("id"));
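The naming scheme itself can be captured in a small helper. The function exampleFieldName is hypothetical and only illustrates how the parts are joined; it is not part of eZ Publish or of the extension.

```php
<?php
// Hypothetical helper: namespace + datatype + field + attribute ID.
function exampleFieldName($base, $datatype, $field, $attributeId)
{
    return $base . '_' . $datatype . '_' . $field . '_' . $attributeId;
}

// Two attributes of the same datatype get distinct field names.
$from = exampleFieldName('ContentClass', 'mydatatype', 'myfield', 123);
$to   = exampleFieldName('ContentClass', 'mydatatype', 'myfield', 124);

echo $from, "\n", $to, "\n";
```

Because the attribute ID is part of the name, the “from” and “to” attributes of the booking example can never collide.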

The abstract ymcDatatypeForm class makes this a lot more readable by building the real field name in the background. Let's look at the fetchClassAttributeHTTPInput method in ymcDatatypeDateTimeType for a better understanding. Remember that this method should read the input from the Default Value dropdown list from the Class Edit Interface and store it in the attribute:

    public function fetchClassAttributeHTTPInput( $http, $base, $attribute )
    {
        $form = ymcDatatypeForm::getInstance(
            'ymcDatatypeDateTimeClassForm',
            $attribute->attribute( 'id' )
        );

This gives us a singleton instance of the class ymcDatatypeDateTimeClassForm for the specified ID. Next is the input validation code, which consists of one simple statement:

        if( !$form->isValid() )
        {
            return;
        }

If the input is valid, we can continue to the last line of code:

        $attribute->setAttribute(
            self::CLASSATTRIBUTE_DEFAULT,
            $form->default
        );
    }

Note that I did not have to write “ContentClass_ymcDateTime_default_123” to access the field “default”. The mapping to the concrete field name has been done in the background by the ymcDatatypeForm class.
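How such a base class might work can be sketched in a few lines. ExampleForm and ExampleDateTimeClassForm are hypothetical stand-ins; the real ymcDatatypeForm builds on the UserInput component and certainly differs in detail. The sketch reads from an array passed in by the caller rather than from the actual request.

```php
<?php
// Hypothetical base class: one instance per form class and attribute ID.
abstract class ExampleForm
{
    private static $instances = array();
    protected $attributeId;
    protected $input;

    protected function __construct($attributeId, array $input)
    {
        $this->attributeId = $attributeId;
        $this->input       = $input;
    }

    // Namespace and datatype part of the field names, e.g. "ContentClass_ymcDateTime".
    abstract protected function prefix();

    public static function getInstance($className, $attributeId, array $input = array())
    {
        $key = $className . ':' . $attributeId;
        if (!isset(self::$instances[$key])) {
            self::$instances[$key] = new $className($attributeId, $input);
        }
        return self::$instances[$key];
    }

    // $form->default is translated to the full field name behind the scenes.
    public function __get($field)
    {
        $name = $this->prefix() . '_' . $field . '_' . $this->attributeId;
        return isset($this->input[$name]) ? $this->input[$name] : null;
    }
}

class ExampleDateTimeClassForm extends ExampleForm
{
    protected function prefix()
    {
        return 'ContentClass_ymcDateTime';
    }
}

$post = array('ContentClass_ymcDateTime_default_123' => '1');
$form = ExampleForm::getInstance('ExampleDateTimeClassForm', 123, $post);
echo $form->default, "\n";
```

Keying the instances by attribute ID means the validate and fetch methods of the datatype receive the same form object for the same attribute, so the input only has to be parsed once.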

For more detailed information, please refer to the source code of the ymcDatatypeForm class, to the documentation of the UserInput component and to the documentation of the PHP filter extension.

As was previously mentioned, we have templates for the editing and viewing of both the content class and the content object. This chapter explains how you can access attribute data inside the templates and describes the interaction between the PHP code and the templates. For more information on templates in general, please refer to the eZ Publish user manual.

Viewing the class attribute

Let's start with the most basic example, for viewing the class attribute in design/standard/templates/class/datatype/view/ymcdatetime.tpl. In this template we use the variable $class_attribute, which is automatically available for every class attribute template. The $class_attribute variable is an instance of eZContentClassAttribute.

For our simple case we only need the value of the “data_int1” attribute of our eZContentClassAttribute instance. This is the attribute that we used in the fetchClassAttributeHTTPInput method to store whether or not the input fields should be pre-filled with the current date. We access it simply with $class_attribute.data_int1.

Editing the class attribute

The only additional concept in the template for editing the class attribute in design/standard/templates/class/datatype/edit/ymcdatetime.tpl is how to build the name of our input field. It is made up as previously described:

ContentClass_ymcDateTime_default_{$class_attribute.id}

First, there is a namespace “ContentClass”, then the name of the datatype “ymcDateTime”, the name of the input field “default” and at the end an ID for the attribute.

Viewing and editing the object attribute

I will skip discussing the object editing template (design/standard/templates/content/datatype/edit/ymcdatetime.tpl), since there is nothing new in it compared to the two templates already mentioned.

The template for viewing object attributes of our datatype is located at design/standard/templates/content/datatype/view/ymcdatetime.tpl. One thing to note is how we access each of the values that make up the DateTime object. Inside this template you see the construct {$attribute.content.day}, and equivalent constructs for the other values. Let's look at each of the elements in the construct for the “day” value.

First, eZ Publish has predefined an instance of eZContentObjectAttribute in the variable $attribute. However, we cannot simply access the value for the day in the next level since it is stored in an object. Therefore, we must access “content”, whereby eZ Publish calls the objectAttributeContent method of our datatype. The return value of this method is directly given back as the content attribute itself. Looking in our datatype source code, you see that we return an instance of ymcDatatypeDateTime as the return value of this method. This means that the last word “day” accesses the attribute method of ymcDatatypeDateTime with the parameter “day”. The return value of this attribute method is the value that we are looking for.
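The lookup chain can be modelled in plain PHP. This is a deliberately simplified model of how a dotted template path could be resolved through hasAttribute and attribute calls, not the actual eZ Publish template engine; ExampleNode and exampleResolvePath are hypothetical names.

```php
<?php
// Hypothetical object exposing the two methods the templates rely on.
class ExampleNode
{
    private $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function hasAttribute($name)
    {
        return array_key_exists($name, $this->values);
    }

    public function attribute($name)
    {
        return $this->values[$name];
    }
}

// Walk a path such as "content.day" one element at a time.
function exampleResolvePath($object, array $path)
{
    foreach ($path as $name) {
        if (!is_object($object) || !$object->hasAttribute($name)) {
            return null;
        }
        $object = $object->attribute($name);
    }
    return $object;
}

// $attribute.content.day: the attribute yields the content object,
// which in turn yields the day.
$attribute = new ExampleNode(array(
    'content' => new ExampleNode(array('day' => '29')),
));
$day = exampleResolvePath($attribute, array('content', 'day'));
```

Each dot in the template expression corresponds to one attribute() call on whatever the previous step returned.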

The validation and storage of content in eZ Publish is based on datatypes, and this article described the necessary elements to create a custom datatype. Also, I introduced the concepts of exceptions and form handling with the UserInput component.

Another good source of datatype programming information is the source code of existing datatypes.

If you create a datatype that can also be of use for others, consider publishing it as Open Source on http://projects.ez.no/.

Finally, a big kudos to Derick Rethans! He is not only the author of the PHP DateTime class used in this tutorial, but also of the filter extension and the associated UserInput component.