Massive users import

laurent le cadet

Tuesday 31 July 2007 6:33:03 am

Hi,

I need to import 13,000 users into an eZ site.

The users already have a six-digit number, which could be used as the password.

I know there are some extensions that could more or less deal with this, but given the very large number of users I would like some feedback from the community.

Has anyone already faced this problem?
Any hints?

Regards.

Laurent

Felipe Jaramillo

Wednesday 01 August 2007 6:27:18 am

Hi Laurent,

We have handled several data import jobs successfully. The last one involved importing about 150,000 content elements into eZ. It took a few days but worked without many problems.

The process is quite straightforward using the API classes.

Some of my notes on this:

- The import must be done through the CLI, as any web-based extension will time out.
- The existing user import classes will come in handy for analyzing how attributes are mapped.
- Take a look at bin/php/ezcsvimport.php for CSV import.
- We have done most of our imports using XML as a source.
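
As an illustration of the attribute-mapping step in the notes above, here is a minimal sketch of a command-line script that reads a CSV file and turns each row into a user record. It is not eZ code: it is plain Python, the column names (login, email, member_number) are assumptions about the source file, and the actual object creation would still go through the eZ API classes or ezcsvimport.php.

```python
import csv
import io

# Hypothetical column names; the real source file may use different ones.
REQUIRED = ("login", "email", "member_number")

def read_users(source):
    """Yield one dict per valid CSV row, skipping rows with missing fields."""
    reader = csv.DictReader(source)
    # Line 1 is the header, so data rows start at line 2
    # (assuming no field spans several lines).
    for line_no, row in enumerate(reader, start=2):
        if any(not (row.get(col) or "").strip() for col in REQUIRED):
            print(f"skipping line {line_no}: missing field")
            continue
        yield {
            "login": row["login"].strip(),
            "email": row["email"].strip(),
            # The existing six-digit number is used as the initial password.
            "password": row["member_number"].strip(),
        }

if __name__ == "__main__":
    sample = io.StringIO(
        "login,email,member_number\n"
        "jdoe,jdoe@example.com,123456\n"
        "broken,,654321\n"
    )
    for user in read_users(sample):
        print(user)
```

Validating rows before touching the database keeps bad source data from stopping a long-running import halfway through.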

Let me know if you need any more help. I could send you the basic script we use, for your reference, if you give me an email address.

Regards,

Felipe

Felipe Jaramillo
eZ Certified Extension Developer
http://www.aplyca.com | Bogotá, Colombia

laurent le cadet

Wednesday 01 August 2007 6:34:35 am

Hi Felipe,

Thanks for your valuable hints.
I'll start from there and may ask you for more in the near future.

Regards.

Laurent

Lazaro Ferreira

Wednesday 01 August 2007 7:44:33 am

Hi Laurent,

We recently imported a similar number of users and related objects. Before importing, we looked at the available import extensions and the eZ Publish import/export scripts. None of them suited our import jobs, because we needed to import the users, then the related objects, and link them during the import (using ezrelationlist attributes). Tailoring the data source wasn't an option.

We developed some scripts to do the job, but we think this is something eZ Publish should have: some kind of import framework for more advanced tasks during the import, such as linking related objects, creating objects conditionally, re-using external password hashes, etc.

On the other hand, if you can tailor your data source to the eZ Publish import script (ezcsvimport.php), importing user objects alone from the command line shouldn't be a problem.

Lazaro
http://www.mzbusiness.com

laurent le cadet

Wednesday 01 August 2007 7:51:14 am

Thanks Lazaro.

Maybe I'll also ask you for a little more help after digging.

Regards.

Laurent

Felipe Jaramillo

Wednesday 01 August 2007 10:21:03 am

I agree with Lazaro about the need for a more robust, fast, and foolproof way of importing large amounts of data.

We talked about this with other members of the community at the eZ Conference 2007, and some ideas came up. There is also an OpenFunding suggestion for Vamosa integration; Vamosa is a company that handles content migration for large CMS solutions. See http://ez.no/community/open_funding/suggestions_for_new_functionality/vamosa_integration

Regards,

Felipe

Felipe Jaramillo
eZ Certified Extension Developer
http://www.aplyca.com | Bogotá, Colombia

Lazaro Ferreira

Wednesday 01 August 2007 11:32:19 am

Hi Felipe,

Does Vamosa support importing content from non-CMS sources, such as legacy database applications?

Regards

Lazaro
http://www.mzbusiness.com

Betsy Gamrat

Thursday 02 August 2007 8:11:52 pm

Assuming the user input is just user data, without additional information, it is really similar to a flat file and the import is fairly straightforward.

I have run many imports on eZ, with many types of content. Some of my ideas are:

Back up the database.

Break the import into chunks, so that recovery is easier if something goes wrong.

Remember that an import tends to be a one-shot event: the code usually only has to run once, so it can be simple, brute-force code.

Converting the data to a convenient format is often as much work as the import itself, so custom code may actually be more cost-effective than pursuing a more flexible tool.

I agree that a standardized import would be nice, but in this case I think a simple utility is fine.
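
The chunking advice above can be sketched roughly as follows. This is a generic Python illustration, not eZ code: `import_one` is a hypothetical callback standing in for whatever creates a single user, and the chunk size of 500 is an arbitrary assumption.

```python
from itertools import islice

def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def run_import(records, import_one, size=500, start_chunk=0):
    """Import records chunk by chunk.

    Returns the index of the first chunk that failed, or None if
    every chunk was imported.
    """
    for index, batch in enumerate(chunks(records, size)):
        if index < start_chunk:
            continue  # already imported in an earlier run
        try:
            for record in batch:
                import_one(record)  # hypothetical: creates one user
        except Exception as exc:
            print(f"chunk {index} failed: {exc}")
            return index
        print(f"chunk {index} done ({len(batch)} records)")
    return None
```

If a run stops at chunk N, it can be restarted with `start_chunk=N` once the problem is fixed. Note that a failed chunk may have been partly imported, so either clean up that chunk first or make `import_one` skip users that already exist.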