Search: Entries could not be found - EZ Version 3.5.1

Stefanie Salbaum

Thursday 23 August 2007 11:56:22 pm

Hi.

I first tried to find a similar problem via ez-search, but I think mine is too specific. :o)

Here is the problem:
One of our customers runs a content management system built on EZ Version 3.5.1.
For about half a year, the search has been failing to find various articles.

E.g.:

1. Entries like "acs" cannot be found. (There are more than 50 entries containing this term in the CMS.)
2. Entries like "spannung" or "spannung*" (this is one of the most used words in the CMS) cannot be found either.

3. However, phrases with special characters, like "-XHV", "SRM-3000" or "high-", get a lot of hits.

The search index does not show any errors. It seems OK...
I don't know where to look. To me, this makes no sense.

Any help greatly appreciated!

Thanks!

Steffi

Bruce Morrison

Friday 24 August 2007 1:00:45 am

Hi Stefanie

There is actually some code in the search that will exclude frequent words.

From 3.5.2

    // Loop every word and insert result in temporary table
    include_once( 'kernel/classes/ezcontentobjecttreenode.php' );
    $showInvisibleNodesCond =& eZContentObjectTreeNode::createShowInvisibleSQLString( true );

    foreach ( $searchPartsArray as $searchPart )
    {
        $stopWordThresholdValue = 100;
        if ( $ini->hasVariable( 'SearchSettings', 'StopWordThresholdValue' ) )
            $stopWordThresholdValue = $ini->variable( 'SearchSettings', 'StopWordThresholdValue' );

        $stopWordThresholdPercent = 60;
        if ( $ini->hasVariable( 'SearchSettings', 'StopWordThresholdPercent' ) )
            $stopWordThresholdPercent = $ini->variable( 'SearchSettings', 'StopWordThresholdPercent' );

        $searchThresholdValue = $totalObjectCount;
        if ( $totalObjectCount > $stopWordThresholdValue )
        {
            $searchThresholdValue = (int)( $totalObjectCount * ( $stopWordThresholdPercent / 100 ) );
        }

        // do not search words that are too frequent
        if ( $searchPart['object_count'] < $searchThresholdValue )

You need to tweak the StopWord settings in site.ini.
See: http://ez.no/doc/ez_publish/technical_manual/3_9/reference/configuration_files/site_ini/searchsettings
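
To make the threshold arithmetic in the excerpt easier to follow, here is a minimal standalone sketch. The function name `isWordSearched` and the object counts are made up for illustration and are not part of eZ Publish. Only the two default values (100 and 60) and the comparison are taken from the excerpt above.

```php
<?php
// Hypothetical helper, NOT part of eZ Publish: it only mirrors the
// threshold arithmetic shown in the 3.5.2 excerpt.
function isWordSearched( $objectCount, $totalObjectCount,
                         $stopWordThresholdValue = 100,
                         $stopWordThresholdPercent = 60 )
{
    // Up to $stopWordThresholdValue objects: the threshold is the total count.
    $searchThresholdValue = $totalObjectCount;
    if ( $totalObjectCount > $stopWordThresholdValue )
    {
        // Above that: the threshold is a percentage of all objects.
        $searchThresholdValue = (int)( $totalObjectCount * ( $stopWordThresholdPercent / 100 ) );
    }

    // A word is only searched if it occurs in fewer objects than the threshold.
    return $objectCount < $searchThresholdValue;
}

// Assumed numbers for illustration: a site with 1000 objects.
var_dump( isWordSearched( 50, 1000 ) );            // word in 50 objects, 60% threshold
var_dump( isWordSearched( 700, 1000 ) );           // word in 700 objects, 60% threshold
var_dump( isWordSearched( 700, 1000, 100, 80 ) );  // same word, 80% threshold
```

Under these assumed numbers, the first and third calls report the word as searched and the second reports it as excluded. Because the comparison is a strict `<`, a word that occurs in every object would still be skipped in this sketch, even with the percentage set to 100.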
Cheers
Bruce

My Blog: http://www.stuffandcontent.com/
Follow me on twitter: http://twitter.com/brucemorrison
Consolidated eZ Publish Feed : http://friendfeed.com/rooms/ez-publish

Stefanie Salbaum

Friday 24 August 2007 2:54:55 am

Hi Bruce!

I modified it (site.ini), but the problem still exists. :o(

Bruce Morrison

Friday 24 August 2007 4:38:12 am

Hi Stefanie

Did you clear the cache?

What values do you have 'StopWordThresholdValue' and 'StopWordThresholdPercent' set to?

I believe that if you set 'StopWordThresholdValue' to a large value (say 1000000) or 'StopWordThresholdPercent' to 100, the exclusion will not happen.
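
As a sketch, the override could look like the fragment below. It assumes the settings sit in the [SearchSettings] block of a site.ini override; the exact file (for example settings/override/site.ini.append.php) is an assumption and depends on the installation. Either of the two lines should be enough on its own.

```ini
[SearchSettings]
# Option 1: raise the object-count limit so the percentage rule never applies
StopWordThresholdValue=1000000
# Option 2: raise the percentage to the maximum
StopWordThresholdPercent=100
```

Clear the cache after changing the file so the new values are picked up.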

Cheers
Bruce


Stefanie Salbaum

Monday 27 August 2007 6:07:57 am

Hello Bruce,

thanks for your reply.

Yes, I cleared the cache. And here are the SearchSettings:

StopWordThresholdPercent = 80
StopWordThresholdValue = 100

Cheers,
Steffi
