The main goal here is to replace the standard content list/tree template fetch functions (which use SQL queries) with the eZ Find search fetch function (which uses the Solr indexing engine). The reason is speed: Solr is much faster on large sites with tens of thousands of objects or more. An additional benefit is Solr's text search capability, which can be used to enrich the functionality of the website. There are some drawbacks, and some situations where the eZ Find search fetch function cannot replace the standard eZ Publish fetch functions; these will be covered as well.
This tutorial may become even more useful in the future because of the direction in which eZ Find development is headed, as it will be possible to store entire objects in the Solr index. In that case the database will not be needed at all to fetch content nodes.
Today we cannot avoid SQL queries entirely (we can get the list of nodes from Solr, but the node content is still fetched from the database). Even so, the performance gains from using Solr instead of the standard eZ Publish fetch functions can be huge.
This tutorial is written for experienced eZ developers who already use eZ Find, as well as for intermediate eZ developers who have not yet used eZ Find but are planning to do so. As we will not cover the installation of eZ Find, the main requirement is to have eZ Find installed and working. More information on eZ Find can be found here: http://ez.no/ezfind.
Before digging in, it is important to know a few things.
The first thing you need to be aware of is that eZ Find uses a database of its own, based on the well-known Solr/Lucene search engine. More info on Solr can be found here: http://lucene.apache.org/solr/. A quote from that web page:
“blazing fast open source enterprise search platform”
So the content needs to be indexed (transferred to the Solr database) every time it is created or changed. If for some reason the indexing process fails, you will not have up-to-date data in the index.
The indexing task is carried out by the eZ Find search plugin. There are three ways to configure it:
There is no silver bullet in choosing among these three options, as the choice depends on how you use the search functions. If results need to be up to date, the DelayedIndexing option in site.ini should be disabled. In that case, DisableDirectCommits and CommitWithin in ezfind.ini can be used to make publishing faster.
It is good practice to run a complete reindexing every week just to be sure, because various things can go wrong here: external tools for binary files can break, etc.
The second important thing is the ‘optimize’ function. Optimize does exactly what the name suggests: it merges multiple Solr segments (created by updates, deletes, etc.) into one, which makes searching faster. The recommendation is to schedule it with a cronjob, as it does not need to run right after content is changed. Therefore the OptimizeOnCommit option in ezfind.ini should be disabled.
The third thing is to enable AllowEmptySearch in site.ini. This allows the eZ Find search function to list nodes without any search text. With the standard eZSearch engine it is generally better to leave this switch disabled (to prevent exhaustive SQL queries), but with eZ Find it is not an issue, as Solr handles this much better.
Fourth: indexed fields are only a subset of all fields in the database. Which fields you can use depends on which meta data eZ Find maps, which class attributes are searchable and which object attributes are marked searchable.
Fifth: with eZ Find version 2.2 you have the option to segment the index into several chunks, called shards, which can be configured independently. This can be useful for many things, among which are:
More on shards here: http://ez.no/doc/extensions/ez_find/2_2/advanced_configuration/using_multi_core_features
Sixth, and last: using eZ Find will give you a performance boost, but this should not prevent you from using all the other caching possibilities of eZ Publish. A page with a minimum of requests to the SQL database or the Solr engine is still a must.
To conclude these basic options, here are the recommended settings in settings/override/site.ini.append.php:
[SearchSettings]
DelayedIndexing=disabled
AllowEmptySearch=enabled
In extension/ezfind/settings/ezfind.ini:
[IndexOptions]
OptimizeOnCommit=disabled
DisableDirectCommits=true
CommitWithin=2
In fact, the eZ Find search function still uses the database, but only to fetch node data after the search result list is returned by Solr. The SQL queries for getting node data are rather fast and do not present a real problem. Results depend on user rights, so we do not need to worry about access privileges either.
Main characteristics of the search template function:
There is no parameter for fetching a list (instead of a tree), but this can easily be achieved with a filter. It is possible to search within more than one parent node. Results can be sorted by relevance, by meta data or by attributes. A filter can be built with a mix of nested “AND” and “OR” conditions.
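To make the idea of nested “AND” and “OR” conditions more concrete, here is a minimal Python sketch of what such a condition amounts to as a Solr-style boolean expression. It is only an illustration, not eZ Find code: the function name `render_filter`, the tuple layout and the `article/author` field are hypothetical, and eZ Find builds the real Solr filter query itself.

```python
def render_filter(node):
    """Render a nested condition as a Solr-style boolean expression.

    A node is either a plain 'field:value' string, or a tuple whose first
    item is 'and' / 'or' followed by one or more child nodes.
    (Hypothetical layout, chosen for this illustration only.)
    """
    if isinstance(node, str):
        return node
    operator, *children = node
    if operator not in ("and", "or"):
        raise ValueError(f"unknown operator: {operator!r}")
    if not children:
        raise ValueError("a boolean node needs at least one child")
    joiner = f" {operator.upper()} "
    # Parentheses keep the nesting explicit in the rendered expression.
    return "(" + joiner.join(render_filter(child) for child in children) + ")"


# Illustrative field names; 'article/author' is an assumption.
condition = (
    "and",
    "review/rating:[0 TO 2]",
    ("or", "article/author:smith", "article/author:jones"),
)
print(render_filter(condition))
```

The point of the sketch is that each level of nesting becomes one parenthesised group, which is something a single flat AND-only or OR-only condition list cannot express.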
The most important gain from using the eZ Find search function is the possibility to combine filtering with powerful text search. There are also a lot of bonus features that can be used: highlighting, spellchecking, etc.
| content list/tree function parameters | eZ Find search function equivalent |
|---|---|
| Tree or list | |
| parent_node_id | subtree_array |
| sort_by | sort_by. More or less the same except: |
| offset & limit | offset & limit |
| attribute_filter | filter. In the standard content list/tree function only one type of condition can be used (AND or OR). eZ Find supports more conditions, and these conditions can be nested. Keep in mind that matching is different, e.g. ‘in’ can be replaced with (term1 OR term2). Possibilities to use: |
| extended_attribute_filter | filter, or the special eZ Find rawSolrRequest function. Solr query possibilities are different from SQL query possibilities, so a direct comparison does not make much sense. The eZ Find rawSolrRequest function can use only data stored in the index, so it is less capable than extended_attribute_filter, which is written in PHP and can use all data from the database. |
| class_filter_type & class_filter_array | class_id, for including classes. Excluding can be done through ezfind.ini ([IndexExclude] section). |
| only_translated & language | To narrow results down to a specific language, they can be filtered with e.g. language_code:ger-DE. There is also a SearchMainLanguageOnly switch in the [LanguageSearch] section of ezfind.ini for using only the prime language. Otherwise the SiteLanguageList[] setting in site.ini is used. To get even more out of Solr, shards can be used as a specific index for every language. In that case a specific Solr configuration can be applied per language, e.g. collation, stemming, etc. |
| main_node_only | No data in the index. Generally always returns the main node, but also finds the object in other locations. |
| as_object | Not implemented yet. |
| depth | No data in the index. |
| limitation | limitation |
| ignore_visibility | ignore_visibility |
A few simple examples of how to replace fetch content calls with eZ Find calls.
{fetch( 'content', 'list', hash( 'parent_node_id', 100, 'class_filter_type', 'include', 'class_filter_array', array( 'article' ), 'sort_by', array( array( 'modified', false() ), array( 'attribute', true(), 'article/title' ) ) ) )}
{fetch( 'ezfind', 'search', hash( 'filter', 'main_parent_node_id:100', 'class_id', array('article'), 'sort_by', hash( 'modified', 'desc', 'article/title', 'asc' ) ) )}
Listing only child nodes is solved with a special filter. Including classes is simpler. Sorting is a bit different: it needs only one hash, with no nested arrays.
{fetch( 'content', 'tree', hash( 'parent_node_id', 100, 'ignore_visibility', true(), 'limit', 20, 'offset', 0, 'attribute_filter', array( array( 'review/rating', 'between', array( 0, 2 ) ) ) ) )}
{fetch( 'ezfind', 'search', hash( 'subtree_array', array( 100 ), 'ignore_visibility', true(), 'limit', 20, 'offset', 0, 'filter', array( 'review/rating:[0 TO 2]' ) ) )}
This example shows even more similarity between the standard fetch function and the eZ Find search fetch function. The only difference is the way the filter condition is constructed.
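The translation from an attribute_filter condition to a filter string, as done by hand in the example above, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of eZ Find: the helper name `to_solr_filter` is hypothetical, only a handful of operators are covered, and values containing spaces or special characters would need quoting or escaping that the sketch leaves out.

```python
def to_solr_filter(field, operator, value):
    """Translate one attribute_filter-style condition into a Solr-style
    filter string. Hypothetical helper covering a few operators only."""
    if operator == "=":
        return f"{field}:{value}"
    if operator == "in":
        # 'in' becomes a group of OR-ed terms, e.g. field:(a OR b).
        terms = " OR ".join(str(item) for item in value)
        return f"{field}:({terms})"
    if operator == "between":
        # Square brackets denote an inclusive range in Solr syntax.
        low, high = value
        return f"{field}:[{low} TO {high}]"
    if operator == ">=":
        return f"{field}:[{value} TO *]"
    if operator == "<=":
        return f"{field}:[* TO {value}]"
    raise ValueError(f"unsupported operator: {operator!r}")


print(to_solr_filter("review/rating", "between", (0, 2)))
print(to_solr_filter("review/rating", "in", [1, 3, 5]))
```

The first call reproduces the filter string used in the example above; the second shows the (term1 OR term2) form of ‘in’.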
A few features that the standard content fetch functions lack and eZ Find offers (and that are rather useful):
Text search is the most powerful feature you can use. Searching for several words can be configured with “AND” or “OR” logic. The special signs “+” and “-” mark required and excluded search terms when there are several of them. Quoting several words searches for the exact phrase.
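As a small illustration of the syntax just described, the following Python sketch assembles a search text from required terms, excluded terms and exact phrases. The helper `build_query` and its parameter names are hypothetical; the sketch only builds the string and does not talk to eZ Find or Solr.

```python
def build_query(required=(), excluded=(), optional=(), phrases=()):
    """Assemble a search text using '+', '-' and quoted phrases.
    Hypothetical helper, for illustration only."""
    parts = []
    parts += [f"+{term}" for term in required]      # must be present
    parts += [f"-{term}" for term in excluded]      # must not be present
    parts += list(optional)                         # plain terms
    parts += [f'"{phrase}"' for phrase in phrases]  # exact phrases
    return " ".join(parts)


print(build_query(required=["solr"], excluded=["sql"], phrases=["search engine"]))
```

The resulting string is the kind of text a user could also type directly into a search box.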
Facets are tools for drilling down within fetch results. They are extremely useful for giving the user more information about the data presented, by showing the number of nodes per facet and by making it possible to refine results with a click. All information about facets is returned within the same result set, so only one fetch is needed.
More info :
http://ez.no/doc/extensions/ez_find/2_2/customization/customizing_facets_and_drill_down_navigation
Boosting is usable only when sorting by relevance. It can be:
More on boosting :
Highlighting is usable only when searching for text. It can emphasize search terms in the context where they appear.
Spell check is also usable only when searching for text. It can show suggestions for corrected terms based on indexed values.
More info :
http://ez.no/doc/extensions/ez_find/2_2/use/advanced_search/spellchecking
If you have a large site with a lot of objects and a lot of page views, performance is often a very important issue. View caching, cache blocks, static cache and reverse proxies are, of course, important ways to gain performance. But there are situations where raw fetch speed is also very important. If you can manage to implement those fetches with eZ Find, you will see immediate positive effects:
The additional functionality that eZ Find provides is the icing on the cake.
Happy coding!
This tutorial is available in PDF format for offline reading :
Need for speed - How to use eZ Find search fetch instead of standard content list/tree fetch - PDF Version
Working at Netgen, Zagreb, Croatia
Tech: CMS, eZ Publish, eZ Find, PHP, Linux, MySQL, Apache, Varnish
Services: system architecture, consulting, development, support, maintenance, upgrades
This work is licensed under the GNU Free Documentation License (GFDL) :
http://www.gnu.org/copyleft/fdl.html