Large customer base - New project


Lars Eirik R

Tuesday 15 February 2011 3:20:13 am

Hi.

We are currently looking at undertaking a project where a large number of customers need to be imported into eZ Publish from a CRM. Today there are approximately 500,000 customers, with up to approximately 10 different potential roles in the system.

Does anyone have any advice on how to scale for such a large customer base in terms of the database? We are looking at using the clustered solution, but we are not sure whether it is required.

The sites (multi-language) will be hosted in the cloud and may of course be scaled to our requirements.

Any thoughts on this matter?

Thiago Campos Viana

Tuesday 15 February 2011 4:03:00 am

Don't know if it helps, but there's a clustering tutorial.

There's also a server architecture tutorial.

Anyway, here's a list that could help:

http://share.ez.no/learn/ez-publish/using-the-squid-reverse-proxy-to-improve-ez-publish-performance

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-1-of-3-introduction-and-benchmarking

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-2-of-3-identifying-trouble-spots-by-debugging

http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-3-of-3-practical-cache-and-template-solutions

eZ Publish Certified Developer: http://auth.ez.no/certification/verify/376924

Twitter: http://twitter.com/tcv_br

Marko Žmak

Tuesday 15 February 2011 4:24:47 am

Lars, there are no general recipes; it all depends on many factors. The complexity of the site, its refresh rate and the expected site traffic are among the most important.

For good site optimization it's very important that you understand your site and know how eZP works internally.

The key issues are:

  • doing eZ fetches smartly
  • good cache configuration

For example, I have a news site with more than 500,000 objects (which produces eZ tables with >500,000 and >2,000,000 rows) running on two servers (one for web and one for DB) without DB clustering and without a web accelerator. The site has a very high refresh rate and still runs perfectly.
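
To get a feel for how object counts turn into table sizes like the ones above, here is a rough back-of-the-envelope sketch. It assumes the attribute table holds one row per attribute, per version, per translation, which is an assumption about the schema that should be checked against your own installation. The function name and the figure of 4 attributes per customer are hypothetical, chosen only for illustration.

```python
def estimate_attribute_rows(objects, attributes_per_object,
                            versions_per_object=1, languages=1):
    """Rough row estimate for a table that stores one row per
    attribute, per version, per translation (assumed schema)."""
    return objects * attributes_per_object * versions_per_object * languages


# Hypothetical figures: 500,000 customers, 4 attributes each,
# one stored version and one language.
customers = 500_000
rows = estimate_attribute_rows(customers, attributes_per_object=4)
print(rows)  # 2000000
```

Extra translations or kept versions multiply the result, so on a multi-language site it is worth running the estimate with your real numbers before deciding on hardware.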

Here are some resources I have dealt with lately that might help:

  • http://projects.ez.no/ezsi
  • http://share.ez.no/blogs/marko-zmak/when-ezsi-doesn-t-do-it
  • http://projects.ez.no/saarchive (my archiving extension)

and some good tutorials for optimizing performance:

  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-1-of-3-introduction-and-benchmarking
  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-2-of-3-identifying-trouble-spots-by-debugging
  • http://share.ez.no/learn/ez-publish/ez-publish-performance-optimization-part-3-of-3-practical-cache-and-template-solutions/(language)/eng-GB

P.S. I could also give you a few good pointers for importing all this data.

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Lars Eirik R

Wednesday 16 February 2011 2:15:20 am

Thanks for all the pointers.

We have been looking at the infrastructure options and are considering running without the clustered solution, that is, using the standard handlers for file and database access in eZ Publish.

We are considering having several eZ Publish installations access the same database. Our hope is that we can use a dedicated GPFS file system to "fool" eZ Publish into using a var folder that looks local to it, whereas our GPFS configuration actually maps the folder to a dedicated server that is mounted on each of the eZ Publish instances.

Is this feasible, or do we have to use the clustered solution (using the mount point in the config option) to get this working successfully?
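
One quick way to sanity-check a setup like this is to confirm that a file written into the shared var folder on one web node is visible on another. The sketch below is only an illustration under that assumption: the helper names are hypothetical, it is not part of eZ Publish, and it says nothing about locking or cache consistency under load.

```python
import os
import uuid


def write_marker(var_dir):
    """Write a uniquely named marker file into var_dir.

    Returns (file_name, token) so another node can look it up.
    """
    token = uuid.uuid4().hex
    name = ".mount-check-" + token
    with open(os.path.join(var_dir, name), "w") as fh:
        fh.write(token)
    return name, token


def read_marker(var_dir, name):
    """Return the marker's content, or None if the file is not visible."""
    try:
        with open(os.path.join(var_dir, name)) as fh:
            return fh.read()
    except FileNotFoundError:
        return None


def remove_marker(var_dir, name):
    """Clean up the marker file if it exists."""
    try:
        os.remove(os.path.join(var_dir, name))
    except FileNotFoundError:
        pass
```

Run `write_marker` against the mounted var folder on the first node, then call `read_marker` with the returned name on every other node. If it returns the same token everywhere, the mount is at least shared as intended.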

Marko Žmak

Wednesday 16 February 2011 3:15:40 am

Lars, this sounds like a good idea. I have been thinking about it too, but never had the time or the need to do it.

You should bear in mind that the main thing to be concerned about is the bottleneck. This is the part where most optimization is needed.

For example, if your bottleneck is database query execution, then having several servers serving data from the same DB server won't help much. In such a case, you should consider a database cluster.

But if your bottleneck is PHP execution, reading cache files and serving pages to users, then your idea should be a good solution.

As a general architecture guideline, I suggest having a separate DB server for the database and a separate web server for storing files and serving requests (this includes storing DB data and file data on separate disks). The DB server should generally have more RAM and fast disks, and the web server should have more CPU and storage space.
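
The decision described above can be sketched as a small helper that looks at how much of a request's time is spent in the database. This is only an illustration: the function name is hypothetical, the timings are assumed to come from your own measurements (for example debug output or profiling), and the 50% threshold is an arbitrary starting point, not a recommendation.

```python
def classify_bottleneck(db_seconds, total_seconds, threshold=0.5):
    """Return "database" if the DB share of request time reaches the
    threshold, otherwise "web" (PHP execution, cache reads, serving)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= db_seconds <= total_seconds:
        raise ValueError("db_seconds must be between 0 and total_seconds")
    db_share = db_seconds / total_seconds
    return "database" if db_share >= threshold else "web"


# Hypothetical measurements for two kinds of pages.
print(classify_bottleneck(db_seconds=0.8, total_seconds=1.0))  # database
print(classify_bottleneck(db_seconds=0.1, total_seconds=1.0))  # web
```

A "database" result points toward a database cluster or query optimization, while a "web" result suggests that adding web nodes on a shared var folder is more likely to help.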
