Site not being indexed by google?: Solution

Bruce Morrison

Thursday 27 November 2003 1:59:43 pm

Hi all

I have worked on a number of sites over the last 12 months and was becoming increasingly frustrated because they were not being spidered beyond the home page by Google. I found the reason this week!

Have you noticed that on some eZ Publish sites, the links on the first page visited have something like "?PHPSESSID=b0da36931dc38bd1f04e9a7af8c5b165" appended?

Well this is the issue!

From another CMS mailing list I'm on:

"We were having a problem getting our action app content indexed (by google search, not news), so i asked my brother who had just started working at Google. He said:

1. yes, they do index the query string (stuff after the ?).
2. in order to do so, they pay attention to the problem of session variables in the query string by assuming that anything that looks like a session variable is one.
3. the long item ids are thus assumed to be session variables, and aren't getting spidered (i don't know the exact rule, but probably any string longer than 16 chars is going to be assumed to be a session variable).
4. they were trying to improve their algorithm for figuring out what's a session variable and what isn't."

This issue is not specific to eZ Publish; it comes from the fact that it uses sessions together with a default PHP configuration.

The PHP configuration setting is "session.use_trans_sid".

Turn this off and the session ID will disappear from the links. With cookies enabled (the PHP default) the session is carried by the cookie instead, so the site will work fine and Google will get beyond your home page.
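
For anyone who wants to see what that looks like, here is a minimal sketch of the php.ini change. It assumes you can edit php.ini on your server and that session cookies are left enabled, so sessions carry on working without the ID in the URL. Restart the web server afterwards so the change is picked up.

```ini
; php.ini
; Stop PHP from rewriting links to include ?PHPSESSID=...
session.use_trans_sid = 0

; Assumption: cookies stay enabled so the session ID travels in a cookie instead.
session.use_cookies = 1
```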

See http://martin.f2o.org/php/session for details.

Cheers
Bruce
http://www.designit.com.au/

My Blog: http://www.stuffandcontent.com/
Follow me on twitter: http://twitter.com/brucemorrison
Consolidated eZ Publish Feed : http://friendfeed.com/rooms/ez-publish

Tristan Koen

Friday 28 November 2003 12:30:55 am

Brilliant Bruce!

We used to have exactly that problem too.... Google only indexed the landing page.
Our host recently upgraded to PHP 4.2.2 and suddenly Google indexed over 150 pages.

Never managed to figure out why until now.

bisk

Friday 28 November 2003 2:38:16 am

I'm having the same problems with sessid's on the first page and google as well.

I guess not anymore, thanks Bruce.

The .htaccess fix works nicely.
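
For reference, an .htaccess version of the fix would presumably look something like the sketch below. The exact lines used are not shown in this thread, so treat it as an assumption: it needs PHP running as an Apache module and a host that allows PHP settings to be overridden from .htaccess.

```apache
# .htaccess in the site root
# Turn off transparent session IDs so links no longer get ?PHPSESSID=... appended
php_flag session.use_trans_sid off
```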

-------------------------------
http://www.kookfijn.nl & http://www.magento.be

Simion Ward

Wednesday 17 December 2003 3:43:12 pm

Add the following meta tag settings to your site.ini.append file:

[SiteSettings]
MetaDataArray[robots]=index,follow
MetaDataArray[revisit-after]=5 days

Should help with indexing.

Simon
http://www.webrak.co.uk

Simion Ward

Thursday 18 December 2003 2:11:49 am

Just a quick note: Google indexed 25 MB of my site last night after I made this change.