Tuesday 13 June 2006 8:14:53 am
OK, so after a lot of struggle I managed to mix host- and URI-based matching. The result is not perfect and has some very annoying shortcomings, but it works; it's not a myth. First, I'd like to explain what I want from eZ Publish. I want it to let me accomplish the following 3 goals <b>at the same time</b>:
1. Use the same install (same DocumentRoot, same database, etc.) for 2 or more completely different websites. <i>This means 2 or more hosts.</i>
2. Allow me to make any of those websites multilingual. <i>This means that some form of language detection and propagation is necessary.</i>
3. Allow me to use visitor authentication (login) on any of the hosted sites, and keep the visitor session working even when they change language.
Because eZ Publish was designed to detect language by every possible method <b>except cookies</b>, we have a very nasty conflict between multilanguage sites on one hand and multiple sites, or sites using visitor login, on the other. That's because visitor login is done with cookies, which don't propagate across different domains. In effect, this blocks me from using host matching to get multilanguage and login at the same time. If a visitor logs in on one language, the cookie gets set for that language's domain. When he changes language he changes domain and poof, he's not logged in anymore. I assume most of us would find having to log in once for every language somewhat annoying.
So, you can't use host matching for doing multilanguage. For multilanguage you effectively need to keep the same domain. This means using one of the alternative methods:
<b>a) Port matching.</b> It's somewhat peculiar, it may be blocked by some visitors' firewalls or routers, and it exhibits the siteaccess traversal problem (I'll go into detail on this below).
<b>b) Server variables.</b> This is OK in concept, because you can use Apache virtual hosts to avoid the siteaccess traversal problem. But in order to set the variables you need to check some kind of condition. It can't be the domain, for the reason given above, so it has to be some form of URL matching. And if you have to use URL matching anyway, you're much better off with eZ Publish's native one. Which brings us to...
<b>c) URL matching.</b> This is nice and it works, except that it also exhibits siteaccess traversal, it makes developers (especially newbies) go gray figuring it out, and it requires mod_rewrite magic to get nice URLs. You also need to figure out how to mix it with host matching, because I don't think site owners would appreciate having two different sites served from the same domain, as domain.com/site1 and domain.com/site2.
<b>What's the siteaccess traversal problem:</b> once you get host AND uri matching going, you discover that it works, except for one minor detail: you can access any siteaccess from any host if you play with the URL a little. Say I have www.site1.com and www.site2.com, each has English and French versions, and there's a common admin siteaccess. People use www.site1.com/en1/, www.site1.com/fr1/, www.site2.com/en2/ and www.site2.com/fr2/, plus some other URL for the admin site. <b>But there's nothing stopping someone from combining site1 with en2 or fr2.</b> And you get a potentially embarrassing or even harmful situation. The same goes for any host+siteaccess combination you make available in such a setup, which is the only working setup that can accomplish the 3 goals I laid out above. Solution: you can block "bad" combinations using mod_rewrite, which means very ugly and complex rules.
<b>How to mix host and uri matching:</b> The trick is simple: specify several matching rules in MatchOrder, not just one. When one type of matching doesn't produce a match, eZ goes on to try the next. When none of them match, it uses DefaultAccess.
[SiteAccessSettings]
AvailableSiteAccessList[]=admin
AvailableSiteAccessList[]=site1
AvailableSiteAccessList[]=site2_en
AvailableSiteAccessList[]=site2_fr
MatchOrder=host;uri
HostMatchType=map
HostMatchMapItems[]=admin.site1.com;admin
HostMatchMapItems[]=www.site1.com;site1
URIMatchType=element
URIMatchElement=1
[SiteSettings]
DefaultAccess=site2_en
Yes, that's right: we didn't define a host match for any of the site2 siteaccesses. The above setup works like this:
* We first insert all the siteaccesses we have: admin, site1, site2_en, site2_fr.
* We use MatchOrder to say we'd like to try host matching first, then fall back to uri matching.
* The Host* directives will try to match sites by domain name. With properly configured Apache virtual hosts, the admin site and site1 will match here, and we get the admin and site1 siteaccess, respectively.
* When a user goes to www.site2.com/index.php/site2_fr/, the matching doesn't find a host match, so it goes on to uri matching. It extracts "site2_fr" from the URI, as instructed by the URI* directives, and uses that as the siteaccess name. Since we really have one by that name, you get the French version of site2. If you use /index.php/site2_en/ you get the English version.
* If nothing matches, and this includes using a bogus siteaccess name, it falls back to the DefaultAccess, which is site2_en.
What's the problem?
* One problem is that I can use the "site1" siteaccess with www.site2.com, because it's a valid siteaccess and there's nothing telling eZ Publish that it's not a proper combination. The same goes for every other site you're hosting on that eZ install which doesn't use host matching; once you give up precise hostname matching, you're in no man's land, where every site can use any siteaccess, because URIs can be freely modified by visitors.
* Another problem is getting rid of /index.php/. ezroot strips BOTH index.php and the siteaccess name. ezurl doesn't strip anything. Some kind of in-between URL operator is needed, one that strips only index.php.
There you have it. If I may ask, why didn't the eZ developers implement language detection by cookie, and leave all the fancy URL matchings exclusively for siteaccess separation? Why are siteaccesses used as a hack for multilanguage, instead of being actually different sites?
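P.S. For anyone who wants to poke at the fallthrough and the traversal problem without touching a live install, here is a toy model in Python. It is only a sketch of the behaviour described in this post, not eZ Publish's actual matching code. The names resolve, is_allowed and the ALLOWED whitelist are made up for illustration; the whitelist stands in for whatever you would really use to block "bad" combinations (mod_rewrite rules, for example).

```python
# Toy model of MatchOrder=host;uri as described in this post.
# NOT eZ Publish's real code; all names here are hypothetical.

AVAILABLE = ["admin", "site1", "site2_en", "site2_fr"]
HOST_MAP = {"admin.site1.com": "admin", "www.site1.com": "site1"}
DEFAULT_ACCESS = "site2_en"

# Assumed whitelist of which siteaccesses each host may serve.
ALLOWED = {
    "admin.site1.com": {"admin"},
    "www.site1.com": {"site1"},
    "www.site2.com": {"site2_en", "site2_fr"},
}


def resolve(host, path):
    """Return the siteaccess this toy matcher picks for host + path."""
    # Step 1: host matching (HostMatchType=map).
    if host in HOST_MAP:
        return HOST_MAP[host]
    # Step 2: uri matching (URIMatchType=element, URIMatchElement=1):
    # look at the first path element after index.php.
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] == "index.php":
        parts = parts[1:]
    if parts and parts[0] in AVAILABLE:
        return parts[0]
    # Step 3: nothing matched, so use DefaultAccess.
    return DEFAULT_ACCESS


def is_allowed(host, path):
    """True if the resolved siteaccess is whitelisted for this host."""
    return resolve(host, path) in ALLOWED.get(host, set())


print(resolve("www.site2.com", "/index.php/site2_fr/"))    # site2_fr
print(resolve("www.site2.com", "/index.php/bogus/"))       # site2_en
print(resolve("www.site2.com", "/index.php/site1/"))       # site1 (traversal!)
print(is_allowed("www.site2.com", "/index.php/site1/"))    # False
```

The third call is the traversal problem: site2's host happily serves the site1 siteaccess. The whitelist check in the last call is the kind of guard that would have to sit in front of the matching to reject it.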