Clustering in eZ publish 3.8

The most CPU-intensive aspect of eZ publish is the PHP processing. This processing transforms the content stored in the eZ publish object database to XHTML, which is then displayed by a web browser. Adding servers to a site improves its performance. A site can handle more concurrent users and provide faster response times. Additional servers also limit the impact on site visitors when people are using the eZ publish Administration Interface. In fact, eZ publish administration can be configured to run on an entirely separate server from the ones that handle normal site traffic.

The following diagram shows the number of requests per second that can be handled with one to four web servers (and one database server). Our tests suggest that, assuming identical servers, throughput scales roughly linearly: each additional web server adds about as much capacity as the original server. An eZ publish installation spread across four web servers therefore provides roughly four times the throughput of a single server.

Comparative performance of one to four servers.

eZ publish versions prior to 3.8 could also be run in a clustered environment. However, these configurations were subject to occasional race conditions when files were updated or removed. Since all cache files and images were stored locally on the hard disk of each web server, files needed to be synchronized between each server. This was the source of the problems in prior eZ publish versions.

Cache file coherence

The most common issue was that eZ publish cache files were not coherent. One scenario that illustrates this is shown in the diagram below. In this two-server configuration, each server has stored its own eZ publish cache locally on the hard drive. The cache is stored "on-demand", with cache files created as pages are requested, so in this scenario Server 1 has three cache files and Server 2 has two cache files. The different files can contain different view modes or permissions for the same piece of content.

When a user updates some content, the cache for that content is cleared, and it must be cleared on both servers. In this scenario, the same files (File A and File B) are cleared on both servers, but Server 1 also holds cache File C, which is not cleared. Server 1's cache is now stale, and pages served from it may contain outdated content.

Failed synchronization of cache File C.
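The scenario above can be reproduced with a minimal sketch. The `WebServer` class, the `clear_cache` helper and the version numbers are hypothetical names introduced for illustration only; they do not correspond to eZ publish code.

```python
class WebServer:
    """A web server with its own local, on-demand cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # cache file name -> rendered content

    def render(self, cache_file, content_version):
        # Generate the cache file on demand if it is missing locally.
        if cache_file not in self.cache:
            self.cache[cache_file] = f"{cache_file} rendered from v{content_version}"
        return self.cache[cache_file]


def clear_cache(servers, cache_files):
    """Remove the listed cache files from every server."""
    for server in servers:
        for cache_file in cache_files:
            server.cache.pop(cache_file, None)


server1, server2 = WebServer("Server 1"), WebServer("Server 2")

# Content is at version 1; requests populate the two caches unevenly.
for cache_file in ("File A", "File B", "File C"):
    server1.render(cache_file, content_version=1)
for cache_file in ("File A", "File B"):
    server2.render(cache_file, content_version=1)

# The content is updated to version 2, but only Files A and B are cleared.
clear_cache([server1, server2], ["File A", "File B"])

fresh = server1.render("File A", content_version=2)
stale = server1.render("File C", content_version=2)
print(fresh)  # File A rendered from v2
print(stale)  # File C rendered from v1
```

File C survives the clear on Server 1, so requests that reach that server keep receiving the old rendering.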

Cache hit ratio

The "hit ratio" is the percentage of all access requests that are satisfied by the data in the cache. Since cache files were generated on-demand on each server, the hit ratio was lower than it could be, meaning that the cache was not fully effective. Each server in the cluster had to generate its own cache file for any given page. Therefore, the more servers that were added (thus distributing page requests), the less frequently each server was serving cached content (because the cache files were distributed across multiple servers).
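The effect on the hit ratio can be illustrated with a small simulation. This is a sketch under stated assumptions: the request pattern (each page requested once per server in turn), the page count and the request count are all invented for the example and are not measurements.

```python
def hit_ratio(num_servers, num_pages, num_requests, shared_cache):
    """Return the fraction of requests served from cache.

    Requests are distributed round-robin across the servers. Each page is
    requested once per server before the next page comes up, so every
    server is eventually asked for every page.
    """
    shared = set()
    local = [set() for _ in range(num_servers)]
    hits = 0
    for i in range(num_requests):
        server = i % num_servers
        page = (i // num_servers) % num_pages
        cache = shared if shared_cache else local[server]
        if page in cache:
            hits += 1
        else:
            cache.add(page)  # cache miss: the page is generated and stored
    return hits / num_requests


for servers in (1, 2, 4):
    local_ratio = hit_ratio(servers, num_pages=25, num_requests=1000, shared_cache=False)
    shared_ratio = hit_ratio(servers, num_pages=25, num_requests=1000, shared_cache=True)
    print(servers, local_ratio, shared_ratio)
```

With per-server caches the number of misses grows with the number of servers, because each server generates its own copy of every page. With a single shared cache the number of misses stays constant.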

Image synchronization lag

Another problem occurred when images (or any kind of binary file) were uploaded to eZ publish. Images were stored locally on the server to which they were uploaded. However, other page content was stored in the database. Since multiple servers use the same database, the database content was instantly available to each server but the image file was not.

The time to transfer the image from one server to another varied according to the configuration, but, with rsync and cron jobs, the synchronization could be triggered at most once per minute, since that is the finest interval cron supports. This caused a potential lag. During this lag, if a request was served that included the updated content, eZ publish looked for the image on the local hard drive. Because the image couldn't be found, eZ publish was unable to create a variation or generate the image path. (This could also result in a cache file with a broken image reference.)

As a result, sometimes images were not displayed. The diagram below illustrates this scenario.

Delayed synchronization of images.
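A minimal sketch of this lag follows. The dictionaries and function names are hypothetical stand-ins for the shared database, the local disks and the periodic synchronization job; none of them come from eZ publish.

```python
database = {}                                        # shared: article -> image name
local_disks = {"server1": set(), "server2": set()}   # per-server image files


def upload(server, article, image):
    """Store the image on the receiving server and the reference in the database."""
    local_disks[server].add(image)
    database[article] = image


def synchronize():
    """Copy every image to every server, as a periodic sync job would."""
    all_images = set().union(*local_disks.values())
    for disk in local_disks.values():
        disk.update(all_images)


def image_path(server, article):
    """Return the image name, or None if the file is not on the local disk."""
    image = database[article]
    return image if image in local_disks[server] else None


upload("server1", "news_item", "photo.jpg")
before_sync = image_path("server2", "news_item")   # requested during the lag
synchronize()
after_sync = image_path("server2", "news_item")
print(before_sync, after_sync)  # None photo.jpg
```

The database reference is visible to Server 2 immediately, but the file itself only appears after the next synchronization run.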

In eZ Publish 3.8 we have solved these caching and synchronization issues by simply storing all cache files, images and binary files in the database. Database transactions are used to ensure that all the servers in a clustered environment use the same cache files and have access to the same image and binary files. Database transactions are supported by the InnoDB storage engine under MySQL.
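The idea of keeping files in a transactional database can be sketched as follows. This is an illustration only: an in-memory SQLite database stands in for the shared MySQL/InnoDB database, and the `cluster_file` table and the helper functions are hypothetical, not the schema or API that eZ publish uses.

```python
import sqlite3

# Assumption: SQLite replaces the shared MySQL/InnoDB server for this sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cluster_file (name TEXT PRIMARY KEY, data BLOB NOT NULL)")


def store_file(conn, name, data):
    """Insert or replace a file inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO cluster_file (name, data) VALUES (?, ?)",
            (name, data),
        )


def fetch_file(conn, name):
    """Return the stored bytes, or None if the file does not exist."""
    row = conn.execute(
        "SELECT data FROM cluster_file WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def delete_files(conn, names):
    """Remove several files atomically: either all are deleted or none."""
    with conn:
        conn.executemany(
            "DELETE FROM cluster_file WHERE name = ?", [(n,) for n in names]
        )


store_file(db, "cache/article_full", b"<div>rendered page</div>")
store_file(db, "images/logo.png", b"placeholder image bytes")
delete_files(db, ["cache/article_full"])

print(fetch_file(db, "cache/article_full"))  # None
print(fetch_file(db, "images/logo.png"))     # b'placeholder image bytes'
```

Because every web server reads from and writes to the same store, a cache file that one server removes is gone for all of them, and an uploaded image is visible to all of them as soon as the transaction commits.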

Storing everything in the database provides some additional benefits:

  • Canonical storage: everything is stored in the database
  • Simple backups: complete data backups require simply dumping the database
  • Cross-platform: simpler to migrate to other platforms

The clustering functionality is described in the eZ Publish 3.8 documentation (which will soon be available online). You can also find it in the documentation directory in the eZ Publish installation.

The database solution has one drawback. While storing cache files in the database does not seriously affect your site's performance, storing image and binary files does. It is significantly slower to serve these files from a database than from a filesystem.

The solution is to use a Squid reverse proxy. This ensures that images and files are cached in Squid the first time they are served. This also makes serving images, files, CSS and JavaScript files much faster than with Apache. It also reduces the load on the Apache servers, allowing them to focus on serving dynamic pages.
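The principle behind the reverse proxy can be sketched in a few lines. This is a conceptual model only: the `Origin` and `CachingProxy` classes are hypothetical, and a real Squid deployment also handles expiry, cache-control headers and invalidation, which are omitted here.

```python
class Origin:
    """Stands in for a web server that fetches files from the database."""

    def __init__(self, files):
        self.files = files
        self.requests_served = 0

    def get(self, path):
        self.requests_served += 1  # each call represents a slow database read
        return self.files[path]


class CachingProxy:
    """Keeps a copy of each response the first time it is served."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:
            self.cache[path] = self.origin.get(path)
        return self.cache[path]


origin = Origin({"/images/logo.png": b"image bytes", "/css/site.css": b"body {}"})
proxy = CachingProxy(origin)

for _ in range(100):
    proxy.get("/images/logo.png")
proxy.get("/css/site.css")

print(origin.requests_served)  # 2
```

Out of 101 requests, only the first request for each file reaches the origin. The rest are answered from the proxy's cache.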

This article has shown how a robust clustered environment can be configured with eZ Publish 3.8. The scalability gained by adding multiple web servers is significant. More information on installing and tuning the various components in a clustered eZ Publish setup will be available soon on www.ez.no. We would appreciate hearing your comments and experiences regarding clustering and eZ Publish.
