
How to handle 400 000 users ?


Matthieu Sévère

Tuesday 28 June 2011 12:35:30 am

Hello,

I want to handle authentication for 400 000 users in eZ Publish. Could that many users be a problem for the eZ Publish content engine?

I have a web service that handles authentication, so plugging it into a login handler is no problem. But can I tweak the process so that I don't actually need to store the users in eZ Publish?

I was thinking of creating a single user for everyone and telling eZ that this user is logged in, but I'm not sure it will work: what happens if one of them logs out? Can eZ handle multiple simultaneous sessions for the same user?

 

Thanks for your help !

--
eZ certified developer: http://ez.no/certification/verify/346216

Jean-Luc Nguyen

Tuesday 28 June 2011 1:19:09 am

Hi Matthieu,

We have more than 500 000 rows in the ezuser table for one project, where each user can log in to their own account.
Best regards,

Jean-Luc.

http://www.acidre.com

Matthieu Sévère

Tuesday 28 June 2011 1:22:55 am

Great, no performance problems ?

--
eZ certified developer: http://ez.no/certification/verify/346216

Jean-Luc Nguyen

Tuesday 28 June 2011 1:28:56 am

Well, we use no cache-blocks, as all account data can be updated by the user. No performance issues, but we have never tried to connect all 500 000 users at the same time. On average, only 500 to 1000 users are logged in to their accounts at the same time.

http://www.acidre.com

Ivo Lukac

Tuesday 28 June 2011 1:31:20 am

There is an anonymous user shared by all anonymous sessions. Maybe you can do something similar with one special user if you don't want to have 400k users in the database. If you do add 400k users to the db, try not to put them all under the same group...

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Matthieu Sévère

Tuesday 28 June 2011 1:42:26 am

"

There is an anonymous user for all the anonymous sessions. Maybe you can do something similar with one special user if you don't want to have 400k users in the database. If you would add 400k users to the db try not to add them all under the same group...

"

Yes, I was thinking of that, but I'm a bit afraid of possible side effects. Any feedback on this?

How many users per group do you think is reasonable before performance issues start to show?

 

Thanks both of you for your feedback !

--
eZ certified developer: http://ez.no/certification/verify/346216

Gaetano Giunta

Tuesday 28 June 2011 1:47:00 am

500k users is not really a pain point per se. But it can be if

  • you have a very big user profile class
  • you have very complex role and permission assignments

As Ivo says, you could "trick" eZ by storing only one user per group in the eZ content db, storing the "actual user id from the external system" in the user session data at runtime, and then using that info to apply your customizations in template / PHP code.

I would not go down that path lightly though, as I've seen a few implementations where this had been done and

  • a lot of time was spent on it before getting it to work
  • there were a lot of security problems in the custom code (but hey, that's what audits are for ;-) )
  • performance was not good in the end anyway (e.g. custom session code making web service calls to the back-office authentication server one or more times per page, the authentication server slowing down the whole site under load...)
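
To illustrate the last point, here is a minimal sketch (in Python rather than eZ Publish PHP, and not taken from any of the implementations mentioned) of keeping the external user id in per-session data and re-checking it against the authentication web service only after a time-to-live has passed, instead of on every page. All names (`SessionAuthCache`, `verify`, `ttl`) are hypothetical; `verify` stands in for whatever call the real web service exposes.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SessionEntry:
    external_user_id: str  # id from the external system, kept only in session data
    group: str             # shared eZ-side user/group this session is mapped to
    verified_at: float     # clock value when the web service last confirmed the user


class SessionAuthCache:
    """Calls the (hypothetical) auth web service at most once per `ttl` seconds per session."""

    def __init__(self,
                 verify: Callable[[str], Optional[str]],
                 ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._verify = verify  # assumed contract: user id -> group name, or None if rejected
        self._ttl = ttl
        self._clock = clock
        self._sessions: Dict[str, SessionEntry] = {}

    def login(self, session_id: str, external_user_id: str) -> bool:
        group = self._verify(external_user_id)
        if group is None:
            return False
        self._sessions[session_id] = SessionEntry(external_user_id, group, self._clock())
        return True

    def current_user(self, session_id: str) -> Optional[SessionEntry]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        # Only go back to the web service once the cached check is older than ttl.
        if self._clock() - entry.verified_at > self._ttl:
            group = self._verify(entry.external_user_id)
            if group is None:
                del self._sessions[session_id]
                return None
            entry.group = group
            entry.verified_at = self._clock()
        return entry

    def logout(self, session_id: str) -> None:
        # Ends this session only; other sessions mapped to the same group are untouched.
        self._sessions.pop(session_id, None)
```

Because the state is keyed by session rather than by the shared eZ user, one visitor logging out does not affect the others. The trade-off is that a revoked account can stay logged in for up to `ttl` seconds.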

Principal Consultant International Business
Member of the Community Project Board

Ivo Lukac

Tuesday 28 June 2011 1:50:37 am

"
"

There is an anonymous user for all the anonymous sessions. Maybe you can do something similar with one special user if you don't want to have 400k users in the database. If you would add 400k users to the db try not to add them all under the same group...

"

Yes I was thinking of that but I'm a bit afraid of possible side effects, any feedback on this ?

How many users in one group do you think is reasonable not to feel performance issue ?

"

No experience with the anonymous-style approach :( it was just an idea.

Regarding the number of users per group, I would say a few thousand. That is a very rough estimate and it depends on a lot of things. The main problem with a large number of children under the same node shows up when a new object is created under that node (e.g. a user registers): the publishing process tends to get longer and longer... and you don't want a new user to wait 30 seconds to register ;)
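
As a rough sketch of what "a few thousand per group" means for 400 000 users, the Python snippet below computes the number of groups needed and assigns each login to a group with a stable hash. The 2 000 per-group figure and the function names are assumptions for illustration only; this is not an eZ Publish API, and hashing only keeps groups at about the target size on average, so individual groups can end up somewhat above it.

```python
import math
import zlib


def group_count(total_users: int, target_per_group: int) -> int:
    """Number of groups needed so the average group size stays at or below the target."""
    return math.ceil(total_users / target_per_group)


def group_for(login: str, n_groups: int) -> int:
    """Stable group index for a login.

    zlib.crc32 gives the same value on every run, unlike the built-in
    hash() for strings, which is randomized per process by default.
    """
    return zlib.crc32(login.encode("utf-8")) % n_groups


n_groups = group_count(400_000, 2_000)
print(n_groups)  # 200
print(0 <= group_for("some.login", n_groups) < n_groups)  # True
```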

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Matthieu Sévère

Tuesday 28 June 2011 2:03:04 am

"

I would not go down that path with a light heart though, as I've seen a few implementations where this had been done and

  • a lot of time was spent on it before getting it to work
  • lot of security problems were in the custom code (but hey, that's what audits are for ;-) )
  • performances were not good anyway in the end (eg. custom session code doing ws calls to backoffice authentication server one or more times per page, authentication server slowing down whole site under load...)
"

OK, that's VERY interesting feedback!

--
eZ certified developer: http://ez.no/certification/verify/346216

Matthieu Sévère

Tuesday 28 June 2011 2:08:47 am

To sum up all of your interesting answers:

500k users are OK in eZ, but:

  • At most a few thousand nodes per group, so that the publishing process does not take too long
  • A user class that is not too complex
  • Simple role and permission assignments

 

Thanks all !

--
eZ certified developer: http://ez.no/certification/verify/346216

Marko Žmak

Tuesday 28 June 2011 5:09:11 am

Another tip...

Don't use per-user view cache. It is actually disabled by default, but just so you know.

For the parts where you need to display content specific to a user, use AJAX calls, as is done here on share.ez.no, for example.

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Matthieu Sévère

Tuesday 28 June 2011 5:19:26 am

"

Another tip...

Don't use per user view cache. This is actually already disabled by default but just to let you know.

For the parts where you need to display some content specific for a user, use ajax calls like for example here on share.ez.no.

"

I don't plan to use it in this case, but why the "bolded don't" :) ?

Of course it generates a lot of cache files, but are there other problems with it that I don't know about?

--
eZ certified developer: http://ez.no/certification/verify/346216

Ivo Lukac

Tuesday 28 June 2011 5:31:02 am

"
"

Another tip...

Don't use per user view cache. This is actually already disabled by default but just to let you know.

For the parts where you need to display some content specific for a user, use ajax calls like for example here on share.ez.no.

"

This is not planned to be used in this case but why a "bolded don't" :) ? 

Of course it generates a lot of cache files but is there other problems with this that I don't know ?

"

Let's say you have 500 pages * 400 000 users = 200 000 000 files on disk; that is why :) In that case it would be even better to disable view cache and use cache-blocks where possible.

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Marko Žmak

Tuesday 28 June 2011 5:36:45 am

"
"

Another tip...

Don't use per user view cache. This is actually already disabled by default but just to let you know.

For the parts where you need to display some content specific for a user, use ajax calls like for example here on share.ez.no.

"

This is not planned to be used in this case but why a "bolded don't" :) ? 

Of course it generates a lot of cache files but is there other problems with this that I don't know ?

"

A lot of cache files means a lot of work for eZ when it has to regenerate the cache. For example, if you have per-user cache enabled and an article gets re-edited, then the cache for all the users has to be regenerated.

So if you have a lot of users online, you could get a lot of simultaneous cache clearings while they are browsing the newly published or re-edited articles.

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Gaetano Giunta

Tuesday 28 June 2011 6:42:02 am

To clarify a bit about cache:

1. View cache generates one file on disk per object, per user group, per set of view parameters. Not one per user.

So if you have 500k users in 50 groups, it's not so bad.

2. Cache blocks: here you decide the keys (one file is stored per combination of key values used). Putting node_id + the current user's user_id in the keys is a horrible idea even with a few hundred users...
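
The arithmetic behind both points can be sketched in a few lines of Python. The figure of 500 objects is borrowed from the "500 pages" example earlier in the thread, the function names are made up for this illustration, and the results are upper bounds (a file only exists once the corresponding page has actually been viewed).

```python
def view_cache_files(objects: int, user_groups: int, view_variants: int = 1) -> int:
    """Upper bound: one file per object, per user group, per set of view parameters."""
    return objects * user_groups * view_variants


def cache_block_files(*key_cardinalities: int) -> int:
    """Upper bound: one file per combination of key values used in the cache-block keys."""
    total = 1
    for n in key_cardinalities:
        total *= n
    return total


# 500 objects, 500k users spread over 50 groups, one view variant:
print(view_cache_files(500, 50))          # 25000

# cache-block keyed on node_id (500 values) + user_id (400 000 values):
print(cache_block_files(500, 400_000))    # 200000000
```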

Principal Consultant International Business
Member of the Community Project Board

Ivo Lukac

Tuesday 28 June 2011 6:46:59 am

"

To clarify a bit about cache:

1. view cache generates one file on disk per object per user-group per-view-parameters. Not one per-user

so if you have 500k users in 50 groups, it's not so bad

2. cache blocks: here you decide the keys (one file stored per combination of keys used). Putting in the keys the node_id + current user user_id is a horrible idea even with a few hundred users...

"

Just to make things clear: I would suggest using cache-blocks for the non-user-specific parts, of course :)

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Carlos Revillo

Tuesday 28 June 2011 7:10:36 am

"

Another tip...

Don't use per user view cache. This is actually already disabled by default but just to let you know.

For the parts where you need to display some content specific for a user, use ajax calls like for example here on share.ez.no.

"

This is an option, but I don't feel comfortable depending on AJAX to show this or that content to a logged-in user. It's better than using per-user cache, sure, but I still don't like it.

I have sometimes told eZ crew members how good it would be to have a template tag like

{do-not-cache-this}
{* code *}
{/do-not-cache-this}

especially with voting actions or per-user generated content in mind, without the need for JavaScript.

So, even with view cache enabled, the cache file could still contain some PHP code to do user-specific work. Unfortunately that's not possible (afaik), given how eZ Publish works.

Fortunately this seems to be possible with zeta components though :)

Cheers. 

Marko Žmak

Tuesday 28 June 2011 7:25:58 am

"

To clarify a bit about cache:

1. view cache generates one file on disk per object per user-group per-view-parameters. Not one per-user

so if you have 500k users in 50 groups, it's not so bad

"

Isn't this actually "per set of users with the same roles and policies" rather than "per group"? So if you have several groups that share the same set of policies, there will be even fewer cache files.

But if you enable per-user view cache (the pr_user view cache tweak), then it will be 500k cache files for each object, right?

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Marko Žmak

Tuesday 28 June 2011 7:40:17 am

"

This is an option, but i don't feel comfortable depending on AJAX for showing this or that content to a logged user.

"

What's the problem with using AJAX calls for these parts? It's a very common technique on all big sites.

"

better than use cache per user, sure, but still i don't like it. 

sometimes i said to some ez crew members how good it could be a template tag like

{do-not-cache-this}
{* code *}
{/do-not-cache-this}

specially thinking about the inclusion of voting actions or per-user generated content without the need of javascript.

So, even having view caches enabled, that cache file could still have some php code to do some specific stuff. but unfortunately it's not possible (afaik) thinking in how ez publish works with this. 

Fortunately this seems to be possible with zeta components though :)

"

First, I would suggest you take a look at Varnish, ezsi, and ESI or SSI.

Second, on one site I have played with something that could be useful to you: a way to include, execute and cache just small pieces of your code, one at a time, and include them via ESI or SSI. This approach would better suit your "per user" needs, since it lets you pass custom parameters to the code, which can also be used to generate different output for different users. I haven't yet used it the way you would need, but I suppose it could be used for your purpose.

If you're interested, the approach is described here:

--
Nothing is impossible. Not if you can imagine it!

Hubert Farnsworth

Gaetano Giunta

Tuesday 28 June 2011 8:39:24 am

"
"

To clarify a bit about cache:

1. view cache generates one file on disk per object per user-group per-view-parameters. Not one per-user

so if you have 500k users in 50 groups, it's not so bad

"

Isn't this actually "per set of users with the same roles and policies" instead of "per group"? So if you have several groups that have the same set of policies than it will be even less cache files.

"

Correct. I was oversimplifying.
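
A small Python sketch of that idea, purely illustrative: the role ids and group names are invented, and this is not how eZ Publish actually builds its cache keys. It only shows why groups that share the same set of roles end up sharing one cache variant.

```python
import hashlib
from typing import Dict, Iterable, List


def role_set_key(role_ids: Iterable[int]) -> str:
    """Key that depends only on the set of roles, not on the group or the user."""
    canonical = ",".join(str(r) for r in sorted(set(role_ids)))
    return hashlib.md5(canonical.encode("ascii")).hexdigest()


def distinct_cache_variants(groups: Dict[str, List[int]]) -> int:
    """Number of different cache variants needed for the given groups."""
    return len({role_set_key(roles) for roles in groups.values()})


groups = {
    "members_a": [3, 7],
    "members_b": [7, 3],      # same roles as members_a, so same cache variant
    "editors": [3, 7, 12],
    "partners": [3],
}
print(distinct_cache_variants(groups))  # 3
```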

Principal Consultant International Business
Member of the Community Project Board