For a minimal backup of a website, are the var directory and the database enough?

Romeo Antony

Monday 29 November 2010 11:37:45 pm

Hi,

Instead of backing up the entire eZ root directory, is it best to back up only the var directory and the database, so that the backup stays as small as possible for a daily backup?

Is that the right way, or should I back up the entire eZ installation directory?

I would like to keep the backup data as small as possible for a huge website.

Damien Pobel

Tuesday 30 November 2010 12:16:41 am

Hi,

In the var directory you can also exclude the log and cache directories. If you're using eZ Publish <= 4.3 you can also exclude the ezsession table content when dumping your database.
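A minimal Python sketch of those two exclusions, not an official eZ Publish tool. It assumes the log and cache directories sit directly under var/ or one level down (var/log, var/<vardir>/cache, and so on); the function names, the "var" archive prefix and the database name are hypothetical. The mysqldump calls are only built as argument lists here (credentials left out) and are not executed.

```python
import tarfile
from pathlib import PurePosixPath

# Assumption: directory names to leave out, at var/<name> or var/<vardir>/<name>.
EXCLUDED_DIR_NAMES = {"log", "cache"}


def skip_logs_and_caches(tarinfo):
    """tarfile filter: returning None leaves the member out of the archive."""
    parts = PurePosixPath(tarinfo.name).parts
    if tarinfo.isdir() and 2 <= len(parts) <= 3 and parts[-1] in EXCLUDED_DIR_NAMES:
        return None  # an excluded directory is not descended into
    return tarinfo


def backup_var(var_dir, archive_path):
    """Write a gzip-compressed tar of var_dir without its log and cache directories."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(var_dir, arcname="var", filter=skip_logs_and_caches)


def dump_commands(database):
    """Two mysqldump invocations: everything except ezsession, then
    the structure of ezsession without its rows."""
    return [
        ["mysqldump", f"--ignore-table={database}.ezsession", database],
        ["mysqldump", "--no-data", database, "ezsession"],
    ]
```

Restoring both dumps gives back an empty ezsession table next to the full content tables.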

Cheers

Damien
Planet eZ Publish.fr : http://www.planet-ezpublish.fr
Certification : http://auth.ez.no/certification/verify/372448
Publications about eZ Publish : http://pwet.fr/tags/keywords/weblog/ez_publish

Romeo Antony

Tuesday 30 November 2010 12:31:23 am

Thanks Damien. Thank you for the important tip.

Steven E. Bailey

Tuesday 30 November 2010 12:48:09 am

You had better back up your design folders too... and while you're at it, your settings folders and any custom extensions that you have. At least if you actually want a backup that can restore a site.

But then the question becomes: your ezpublish root directory will be 200-300MB uncompressed, while your var/storage and db files can be in the multi-GBs depending on your site, so really, why not back up the rest as well? Compressing a mysqldump really makes a big difference, BTW.

What we do is keep an almost exact copy of all sites on our dev machines. Then we create a nightly tar file of all files (excluding cache), dbdumps and system files, and store it on the machine, so last night's backup is always there (useful to restore a database or a file that accidentally gets deleted)... Those files are also downloaded nightly to a backup machine that stores about two weeks' worth of those backup tar files. These are written to tapes, which are rotated, and once a month one is held back... so potentially there are backups on tape going back six months.
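The "about two weeks' worth" retention on the backup machine could be sketched in Python roughly as follows. This is only an illustration, not the script described above: the archive naming scheme (site-YYYYMMDD.tar.gz) and the function name are assumptions.

```python
from datetime import date, datetime, timedelta


def expired_archives(names, today, keep_days=14):
    """Return the archive names whose date stamp is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        try:
            # Hypothetical naming scheme: site-YYYYMMDD.tar.gz
            stamp = datetime.strptime(name, "site-%Y%m%d.tar.gz").date()
        except ValueError:
            continue  # not one of our archives; leave it alone
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)
```

The function only reports what is old enough to delete; actually removing the files (and deciding which tape copy to hold back) is left to the caller.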

Mostly, the only thing I've ever had to use is the nightly backup on the machine, to restore a file that got deleted.

But there have been a couple of times that disks have gone up in flames and the only thing that was left was the remote copy.

So, if you are bothering to back up, make sure you are actually backing up enough to restore the machine to the state that it is in.

Certified eZPublish developer
http://ez.no/certification/verify/396111

Available for ezpublish troubleshooting, hosting and custom extension development: http://www.leidentech.com

Romeo Antony

Tuesday 30 November 2010 1:47:39 am

Hi Steven, thanks. Really interesting discussion.

"your ezpublish root directory will be 200-300MB uncompressed while your var/storage and db files can be in the multi-GBs depending on your site,"

So it is always better to back up the complete eZ directory.

"Create a nightly tar file of all files (excluding cache), dbdumps and system files, and store them on the machine so there is always last night's backup" (to use if a file gets damaged).

"Compressing a mysqldump really makes a big difference BTW. "

Can it lower the memory usage if the dbdump file is quite huge?

"Make sure you actually are backing up enough to restore the machine to the state that it is in."

Good suggestion. Thank you very much.

Gaetano Giunta

Tuesday 30 November 2010 2:06:35 am

If the site is huge in content then, as Steven said, adding the complete eZP installation to the backup set will not increase the size of the backup file by a significant percentage, so it might be easier to back up everything (except the logs and caches) in a single pass.

OTOH, if you really are keen on fast, small backups, you could set up

- a daily backup of contents (db + var/storage + var/<vardir>/storage)

- a backup of ezp+extensions+settings that is done maybe weekly or only on demand before site upgrades*

*= on demand is only good if you have your ezp ini settings set as read-only on the filesystem, so that admins cannot change them via the admin interface. Otherwise a daily/weekly backup of the settings becomes important.

Last note: for backup purposes the var/autoload dir should be considered to be part of settings, not of storage

Principal Consultant International Business
Member of the Community Project Board

Gaetano Giunta

Tuesday 30 November 2010 2:08:15 am

Most important advice I forgot: to be able to rebuild the machine you will need the full set of confs for

1. apache

2. php

3. mysql

Gaetano Giunta

Tuesday 30 November 2010 2:13:01 am

One more tip: if you have plenty of space on the server's hdd, having the last backup available as a complete copy of the eZP install instead of a zipped file can be a great way to speed up "restore" operations to previous versions (this mostly happens when deploying new versions, not because of data corruption):

- use as root dir for eZ Publish a symlink that points to the currently installed version (e.g. /var/www/ezp => /var/ezpublish/20101129)

- when deploying a new version, create the new static files in /var/ezpublish/20101130, then copy over the storage (or just use a 2nd symlink to the storage dir) and change the symlink (and then run some scripts to clean then warm up the caches)

- rollback is as simple as a symlink change
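The symlink change in the steps above can be sketched in Python like this, assuming a POSIX filesystem where renaming over an existing symlink replaces it in one step. The function name and the ".tmp" suffix are hypothetical; the paths in the usage note are the ones from the example above.

```python
import os


def point_symlink(link_path, target_dir):
    """Make link_path a symlink to target_dir, replacing any existing link."""
    tmp = link_path + ".tmp"  # hypothetical temporary name next to the link
    if os.path.lexists(tmp):
        os.remove(tmp)  # clear a leftover from an interrupted run
    os.symlink(target_dir, tmp)
    # Renaming over the old link swaps it in a single step on POSIX,
    # so the web server never sees a missing root dir.
    os.replace(tmp, link_path)
```

Deploying would then be point_symlink("/var/www/ezp", "/var/ezpublish/20101130"), and rolling back is the same call with /var/ezpublish/20101129. Clearing and warming up the caches is not covered by this sketch.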

Romeo Antony

Tuesday 30 November 2010 3:09:26 am

Thank you for the tips, Gaetano.

I would like to summarize some important tips from the replies.

* If using eZ Publish <= 4.3, you can also exclude the ezsession table content from the dbdump.

* If the site is huge, it is good to back up the whole eZ root directory (including all designs), excluding the cache and log directories from var.

* If you really are keen on fast, small backups, you could set up

- a daily backup of contents (db + var/storage + var/<vardir>/storage)

- a backup of ezp+extensions+settings that is done maybe weekly or only on demand before site upgrades

* To rebuild the machine, you need to back up the full config files of

1. apache

2. php

3. mysql (only once needed)

* Keeping the last backup as an exact copy instead of a zipped version is a great way to speed up restores.

Thanks a lot for these important ideas.

Doug Brethower

Tuesday 30 November 2010 6:15:10 am

MySQL replication

http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html

And rsync

http://samba.anu.edu.au/rsync/

The combination is useful in remote mirrored development setups when bandwidth is at a premium (small backup sizes).

The concept is to start with snapshot mirror images. Then transfer only changed data asynchronously on a time schedule defined by the developer.

Thanks Romeo for starting an important discussion!

Doug Brethower
Apple Certified Technical Consultant, Southwest, MO USA
http://share.ez.no/directory/companies/lakedata.net

Romeo Antony

Wednesday 01 December 2010 2:36:16 am

Hi Doug,

I have read through the pages you gave. rsync is really impressive. I haven't tried it yet, but of course I will.

Thanks for your suggestions.

Andrew Wigglesworth

Wednesday 26 January 2011 2:51:22 pm

Yes, rsync is an impressive bit of software, but you need to take care when using it for backups.

rsync synchronises the contents of two directories... this is mirroring, not in itself a backup. If you have problems in the first directory then they will simply be mirrored in the second, obliterating any previously synchronised files.

You therefore need a strategy for creating and archiving snapshots of your synchronised files if you intend to use straight rsync.
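One possible shape for such a snapshot step, as a rough Python sketch rather than a recommendation: after each sync, make a dated hard-link copy of the mirror. The function name and directory layout are hypothetical. It assumes the snapshots live on the same filesystem as the mirror, and that the sync tool replaces changed files instead of rewriting them in place (rsync's default behaviour without --inplace), since otherwise the hard-linked snapshots would change too.

```python
import os
import shutil


def snapshot_mirror(mirror_dir, snapshots_dir, label):
    """Create snapshots_dir/label as a hard-link copy of mirror_dir."""
    dest = os.path.join(snapshots_dir, label)
    # os.link as copy function: unchanged files cost no extra disk space,
    # only directory entries are created.
    shutil.copytree(mirror_dir, dest, copy_function=os.link)
    return dest
```

A cron job could call this with the current date as label right after the sync finishes, and prune old snapshot directories separately.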

Personally I use rdiff-backup. rdiff-backup incorporates rsync for file transfers/synchronisation but does a reverse differential backup of your files. This means that you will always have a mirror of your last backup (like rsync will do), but also a set of differentials which will take you back to earlier backup versions. You can decide how many reverse differentials to keep.

I find this a very efficient system as it keeps the bandwidth economy of rsync and the reverse differentials take up far less space than would multiple archives of the website.

Like rsync it can easily be incorporated into scripts and cronjobs.

Doug Brethower

Wednesday 26 January 2011 8:36:19 pm

Yes Andrew, thank you for the save and clarification. Rsync is mirroring. It is incumbent on the user to handle it from there.

More discussion (rsnapshot, duplicity, ...):

http://www.saltycrane.com/blog/2008/02/backup-on-linux-rsnapshot-vs-rdiff/

Apple Time Machine on the target side has turned me plumb lazy, as scheduled differential backups are then fully automated.
