Discussion:
rsync or git backups?
Luca Ferrari
2016-06-01 08:35:06 UTC
Hi all,
so far I'm using rsync to keep a couple of removable media (well, up to
four) in sync, where one is the "master" and the others are cascaded
backups (meaning they are synced at different times).
So far so good.
One problem is that I tend to change things on the master, e.g., bulk
file renaming or moving, so when I replicate it to the backups I have
to force the deletion of content that no longer exists.
This approach, however, relies on the master being good. My fear is
that if the master corrupts some files, I could lose them: if they have
also been moved, I will no longer be able to recognize them on the
slaves.

So I would like some feature like git's (or fossil's) content hashing,
but since I'm talking about 290+ GB of binaries I'm not sure that
approach would scale.

Any suggestion?
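For what it's worth, a git-style check can be approximated without git at all: build a manifest of content hashes on each medium and compare the hash columns, so files are recognized by content rather than by path and renames don't matter. A minimal sketch, with made-up mount points (/mnt/master, /mnt/backup):

```shell
# Made-up mount points; adjust to your media.
SRC=/mnt/master
DST=/mnt/backup

# Hash every file and keep only the sorted hash column, so the
# comparison is by content, not by path -- renames and moves vanish.
find "$SRC" -type f -exec sha256sum {} + | awk '{print $1}' | sort > master.hashes
find "$DST" -type f -exec sha256sum {} + | awk '{print $1}' | sort > backup.hashes

# Hashes present on the backup but absent from the master correspond to
# content that changed (or silently corrupted) on the master, no matter
# where the file has been renamed or moved to.
comm -13 master.hashes backup.hashes
```

Hashing 290+ GB takes a while, but each file is read only once and the manifests themselves are tiny.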
Steve O'Hara-Smith
2016-06-01 10:33:32 UTC
Permalink
On Wed, 1 Jun 2016 10:35:06 +0200
Post by Luca Ferrari
My fear is that if the master corrupts some file, I could lose them if
they have also been moved, since I will no longer be able to recognize
them on the slaves.
Any suggestion?
Use ZFS with snapshots (the zfs-periodic package is good for this)
and replace the rsync with zfs send/receive. ZFS will protect you from
silent hardware corruption (provided you allow some redundancy; use the
copies property on pools with no redundancy), while the snapshots will
protect you from mistakes.
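For concreteness, a sketch of that setup with made-up pool and dataset names (master/data on the master disk, backup/data on a backup disk):

```shell
# Made-up pool/dataset names; adjust to your disks.
# On a pool with no redundancy, keep two copies of each block so ZFS
# can self-heal the silent corruption its checksums detect.
zfs set copies=2 master/data

# Snapshot the master, then replicate the snapshot to the backup pool.
zfs snapshot master/data@2016-06-01
zfs send master/data@2016-06-01 | zfs receive backup/data

# Subsequent runs send only the delta between two snapshots.
zfs snapshot master/data@2016-06-02
zfs send -i master/data@2016-06-01 master/data@2016-06-02 | zfs receive backup/data
```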
--
Steve O'Hara-Smith <***@sohara.org>
N.J. Thomas
2016-06-01 17:39:09 UTC
My fear is that if the master corrupts some file, I could lose them if
they have also been moved, since I will no longer be able to recognize
them on the slaves.
As the other poster mentioned, ZFS snapshots are the way to go.

If you don't have these stored on ZFS filesystems, then check out
rsnapshot as an alternative:

http://rsnapshot.org/

It's available in ports. You can do ZFS-style
hourly/daily/weekly/monthly snapshots. It uses hard links to save space,
so the overhead is not too high.

Thomas
Brandon J. Wandersee
2016-06-01 18:18:43 UTC
Post by Steve O'Hara-Smith
Use ZFS with snapshots (the zfs-periodic package is good for this)
and replace the rsync with zfs send/receive. ZFS will protect you from
silent hardware corruption (provided you allow some redundancy; use the
copies property on pools with no redundancy), while the snapshots will
protect you from mistakes.
If ZFS seems like overkill or too much hassle at the moment, you could
instead use sysutils/rsnapshot. It uses rsync to create snapshot-style,
rotating, de-duplicating, incremental backups. Verbose logging will
show you what files have changed since the last backup, so if you see a
file in the logs that you know you haven't changed in some time, it's
probably corrupt or has otherwise been compromised. Meanwhile, the
previous (good) versions will remain intact.
--
:: Brandon J. Wandersee
:: ***@gmail.com
:: --------------------------------------------------
:: 'The best design is as little design as possible.'
:: --- Dieter Rams ----------------------------------
Phil Eaton
2016-06-01 20:04:43 UTC
Is there a good article comparing rsnapshot and zfs-snapshots?

On Wed, Jun 1, 2016 at 2:18 PM, Brandon J. Wandersee wrote:
Post by Brandon J. Wandersee
If ZFS seems like overkill or too much hassle at the moment, you could
instead use sysutils/rsnapshot. It uses rsync to create snapshot-style,
rotating, de-duplicating, incremental backups.
--
Phil Eaton
Brandon J. Wandersee
2016-06-01 20:37:57 UTC
Post by Phil Eaton
Is there a good article comparing rsnapshot and zfs-snapshots?
I don't know of any such comparison, but they are not comparable in any
technical sense. Here is the gist of how rsnapshot works, supposing you
configure it to make daily backups:

1) On the first run, a complete backup is created in a folder named
"daily.0" at the destination.
2) On the second run, the original backup is rotated to "daily.1". All
changed files are then copied to a new "daily.0" folder, and hard
links are created to all other files in the previous backup, so
only the new/modified files take up additional space while each
backup folder appears to contain a complete, restorable backup.
3) This repeats for every backup, to a maximum number set in the
configuration file. You can configure multiple backup intervals and
run cron/anacron/periodic jobs for each one.

So when using an external or remote destination, the practical effect
is much the same as ZFS snapshots. Storing the backups on the same disk
as the originals will, however, at least double the space consumed by
the backed-up data.
--
:: Brandon J. Wandersee
:: ***@gmail.com
:: --------------------------------------------------
:: 'The best design is as little design as possible.'
:: --- Dieter Rams ----------------------------------
jungle Boogie
2016-06-02 14:49:26 UTC
Post by Luca Ferrari
Any suggestion?
Obviously the first choice is zfs snapshots but if you're like me,
that's not a choice.

I'd recommend something like borg backup:
https://github.com/borgbackup/borg/
http://borgbackup.readthedocs.io/en/stable/

It's a Python app that deduplicates your data, so there's a good chance
it can compress and dedup your large dataset very nicely.

The maintainer is very active in keeping it current, and 1.0.3 is
available in ports/pkg:
http://www.freshports.org/archivers/py-borgbackup/

As you will see in the documentation, you can put it in a script, call
it from cron as often or as irregularly as you like, and have it prune
old archives as well.

I think it's worth trying out for your use case.
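A minimal cron-able sketch, with a made-up repository path and source directory (the flags are from the borg 1.0 documentation):

```shell
# Made-up repository location and source path; adjust to your disks.
REPO=/mnt/backup/borg-repo

# One-time setup: initialise the deduplicating repository.
borg init --encryption=repokey "$REPO"

# Each run creates a compressed, deduplicated archive named by date.
borg create --compression lz4 --stats "$REPO::$(date +%Y-%m-%d)" /path/to/data

# Prune old archives, e.g. keep 7 daily, 4 weekly and 6 monthly ones.
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 "$REPO"
```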
--
-------
inum: 883510009027723
sip: ***@sip2sip.info
xmpp: jungle-***@jit.si
Luca Ferrari
2016-06-03 07:19:09 UTC
Post by jungle Boogie
Post by Luca Ferrari
Any suggestion?
Obviously the first choice is zfs snapshots but if you're like me,
that's not a choice.
https://github.com/borgbackup/borg/
http://borgbackup.readthedocs.io/en/stable/
Thanks to all.
Honestly, ZFS is a bit too much for my poor USB disks, also because I
would like them to be a little more portable than ZFS would allow (I'm
not aware of any commercial media center able to read ZFS storage).
I will give rsnapshot and borg a try.

Luca
Niklaas Baudet von Gersdorff
2016-06-03 08:09:37 UTC
Post by jungle Boogie
https://github.com/borgbackup/borg/
http://borgbackup.readthedocs.io/en/stable/
It's a Python app that deduplicates your data, so there's a good chance
it can compress and dedup your large dataset very nicely.
This looks impressive. Have you ever compared it to duplicity (which is
my current choice for backups)?

Niklaas
Steve O'Hara-Smith
2016-06-03 08:51:14 UTC
On Fri, 3 Jun 2016 09:19:09 +0200
Post by Luca Ferrari
ZFS is a bit too much for my poor USB disks, also because I would like
them to be a little more portable than ZFS would allow.
I will give rsnapshot and borg a try.
One thought for you: perhaps keep the master copy under ZFS for
snapshots and corruption protection, and use rsnapshot or borg to make
the more portable copies for day-to-day use.
--
Steve O'Hara-Smith <***@sohara.org>
jungle Boogie
2016-06-03 15:50:23 UTC
Hi Niklaas,
Post by Niklaas Baudet von Gersdorff
Post by jungle Boogie
https://github.com/borgbackup/borg/
http://borgbackup.readthedocs.io/en/stable/
It's a Python app that deduplicates your data, so there's a good chance
it can compress and dedup your large dataset very nicely.
This look impressive. Have you ever compared it to duplicity (which is
my current choice for backups)?
I haven't, but I recommend trying borg on your data to see how it works
for you. Here's a very simple backup script that gives you an idea of
what it does:
http://borgbackup.readthedocs.io/en/stable/quickstart.html#automating-backups
--
-------
inum: 883510009027723
sip: ***@sip2sip.info
xmpp: jungle-***@jit.si
N.J. Thomas
2016-06-03 19:06:17 UTC
Post by N.J. Thomas
If you don't have these stored on ZFS filesystems, then check out
http://rsnapshot.org/
There was a story on HackerNews about rsnapshot yesterday; the
discussion was informative:

https://news.ycombinator.com/item?id=11817701

Several rsnapshot alternatives came up during that discussion:

Attic - https://attic-backup.org/
Borg - https://borgbackup.readthedocs.io/en/stable/
Obnam - http://obnam.org/

Just fyi.

Thomas
Oliver Briscbois
2016-06-04 05:53:30 UTC
Post by N.J. Thomas
There was a story on HackerNews about rsnapshot yesterday, the
https://news.ycombinator.com/item?id=11817701
Several rsnapshot alternatives came up during that discussion
Attic - https://attic-backup.org/
Borg - https://borgbackup.readthedocs.io/en/stable/
Obnam - http://obnam.org/
Thanks for taking the time to post about that discussion on HackerNews.
I found it very informative. I'm reading the documentation for
borgbackup now.

I've been using flexbackup with afio for several years, although I'm
not recommending this to anyone. I chose afio because it deals somewhat
gracefully with input data corruption, supports multi-volume archives,
and its compressed archives are much safer than compressed tar or cpio
archives. I've previously had to discard entire compressed backup
archives because one single bit went bad. With afio, and possibly other
compressed backup formats, only the affected file within the archive is
lost. Some loss is better than losing everything when storage media
fail. I'll need to finish reading the borg docs to see how it handles
errors on backup media before making a switch.

http://www.edwinh.org/flexbackup/
http://members.chello.nl/~k.holtman/afio.html

At first glance it seems that Attic/Borg/Obnam offer deduplication,
which is far superior to my present flexbackup/afio solution.

Due to my location, all backups need to be made to local storage media,
usually USB first, and then to a physical HD which is removed and kept
in a safe place. When time ($) permits I'll set up a local NAS server
for backup storage, and it'll definitely use ZFS at that point.

Thanks again,

Oliver
Niklaas Baudet von Gersdorff
2016-06-04 11:09:01 UTC
Post by N.J. Thomas
There was a story on HackerNews about rsnapshot yesterday, the
https://news.ycombinator.com/item?id=11817701
Several rsnapshot alternatives came up during that discussion
Attic - https://attic-backup.org/
Borg - https://borgbackup.readthedocs.io/en/stable/
Obnam - http://obnam.org/
Plus "bup" -- for the sake of completeness.

I quickly read through the story; as I understand it, borg and attic
are related, with borg fixing a security issue that attic has.

Obnam takes a different approach with a "well documented filesystem"
but can't do rolling checksums (whatever those are; there's a reference
about them, though).

And bup is based on git.

P.S.: Thanks for sharing the link!

Niklaas
Brandon J. Wandersee
2016-06-16 22:44:16 UTC
Post by jungle Boogie
Post by Luca Ferrari
Any suggestion?
Obviously the first choice is zfs snapshots but if you're like me,
that's not a choice.
https://github.com/borgbackup/borg/
http://borgbackup.readthedocs.io/en/stable/
I think it's worth trying out for your use case.
I'd like to add my thanks to this thread. I didn't know about Borg, but
after a week of using it I have to say I'm very impressed.

It should be noted that Borg currently only builds from ports on systems
that have Python 3 set as the default (which is not the default on a
vanilla FreeBSD install), so there is no package available for it (the
official Poudriere build server just ignores it). If you're running a
64-bit system and don't want to mess around with potentially rebuilding
a bunch of ports with non-default settings, you can download a pre-built
binary for it from the project's GitHub page:

https://github.com/borgbackup/borg/releases
--
:: Brandon J. Wandersee
:: ***@gmail.com
:: --------------------------------------------------
:: 'The best design is as little design as possible.'
:: --- Dieter Rams ----------------------------------