Call to the Lazyweb: Backup

I have a problem I've been beating my head against for a while now, and I've finally given up and decided to put this out there to the hive-mind of the Internet.

I have a laptop I want to keep regularly backed up. I have external hard drives that I use to do this, one that I carry with me and one that stays in my office in Portland. I use cloning software to duplicate the contents of the laptop onto them.

But I also want to do incremental backups, Dropbox-style, to a server I own.

I do have a paid Dropbox account and I do use it. (I also have a paid Microsoft OneDrive account.) But I'd really prefer to keep my files on my own server. What I want is very simple: the file and directory structure on the laptop to be mirrored automatically on my server.

This should not be difficult. There is software that should be able to do this.

What I have tried:

ownCloud. They no longer support Mac OS X. Apparently they ran into problems supporting Unicode filenames and never solved them, so their solution was to drop OS X support.

BitTorrent Sync. This program is laughably bad. It works fine, if you're only syncing a handful of files. I want to protect about 216,000 files, totaling a bit over 23 GB in size. BT Sync is strictly amateur-hour; it chokes at about 100,000 files and sits there indexing forever. I've looked at the BT Sync forums; they're filled with people who have the same complaint. It's not ready for prime time.

CrashPlan. CrashPlan encrypts all files and stores them in a proprietary format; it does not replicate the file and folder structure of the client on the server. I'm using it now, but I don't like that.

rsync. It's slow and has a lot of problems with hundreds of thousands of files. The server is also on a dynamic IP address, and rsync has no way to resolve the address of the server when it changes.

Time Machine Server. Like CrashPlan, it keeps data in a proprietary format; it doesn't simply replicate the existing file/folder structure, which is all I want. Like rsync, it has no way to cope with changes to the server's IP address.

So you tell me, O Internets. What am I missing? What exists out there that will do what I want?


Comments

( 16 comments — Leave a comment )
sweh
Feb. 18th, 2016 02:15 am (UTC)
Does MacOS have the equivalent of "dump"?

What I do, on my Linux machines, is take a regular incremental dump (Sun=full backup, Mon=level 1, Tue=level 2 etc) which is gzip'd to my NFS server. That NFS server is rsync'd offsite to my rented server in Canada over an ssh pubkey login (so don't care about source IP). It only needs to deal with a few new files for rsync so it's pretty good.
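
For anyone who wants to script the weekly rotation described above, here is a minimal sketch in Python. It only maps a date to a dump level; it does not call dump itself. The function name `dump_level` is made up for illustration, and the assumption that the "etc" runs up to level 6 on Saturday is mine.

```python
from datetime import date


def dump_level(day: date) -> int:
    """Return the dump level for a day: 0 (full) on Sunday, 1 on Monday, up to 6 on Saturday.

    Assumption: the schedule keeps counting up through Saturday.
    """
    # date.weekday() numbers Monday as 0 and Sunday as 6, so shift by one and wrap.
    return (day.weekday() + 1) % 7


if __name__ == "__main__":
    today = date.today()
    print(f"{today:%a %Y-%m-%d}: run a level {dump_level(today)} dump")
```

A cron job or launchd task could call something like this once a day and pass the result to whatever backup tool is in use.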

If I want to back up my laptop while travelling (I don't 'cos I use a chromebook for that) then I'd make a VPN connection back home (via a dynamic DNS entry for my home machine in case the ISP changes the IP address) so it can see the server.

But I've no idea if MacOS has a "dump" equivalent...
tacit
Feb. 19th, 2016 01:05 am (UTC)
dump() isn't present in OS X. It's a bit dangerous to use on a live filesystem in any event. And it still leaves the problem that a dump file is a single image of the disk's content, and using rsync to copy it to the server can't be automated if the server's IP address changes.
sweh
Feb. 19th, 2016 01:14 am (UTC)
If the server IP address changes, use a dynamic DNS system. I built my own 'cos I'm a cheap-skate. home.spuddy.org always points to my home machine. I even have a letsencrypt ssl cert (https://home.spuddy.org). I can change ISP or even move my home to another country and everything will just work.

And for when my chromebook is traveling I have it set up a VPN (openVPN) so it can see my home network. But if you don't want to do that then ssh to the dynamic DNS name. So rsync over ssh is perfectly viable and simple.

"Dynamic IP servers" are a mostly solved problem. The solution may not be "five nines" reliable, but it's sure good enough for a personal setup, IMHO.

(Not dummy friendly, of course, but you've demonstrated you're not a dummy :-))
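
As a rough illustration of "rsync over ssh to the dynamic DNS name", the sketch below builds the rsync command line in Python. The hostname `backup.example.org`, the paths, and the helper name `build_rsync_command` are all placeholders, not anything from this thread. The flags are standard rsync options: `-a` (archive), `-z` (compress), `--delete` (remove files on the server that no longer exist on the laptop), and `-e ssh`.

```python
import shlex


def build_rsync_command(source_dir, host, dest_dir, user=None):
    """Build an rsync-over-ssh argument list that addresses the server by name, not by IP."""
    target = f"{user}@{host}" if user else host
    # A trailing slash on the source copies the directory's contents,
    # so the folder layout is mirrored directly under dest_dir.
    src = source_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", "-e", "ssh", src, f"{target}:{dest_dir}"]


if __name__ == "__main__":
    # Hypothetical values; substitute your own dynamic DNS name and paths.
    cmd = build_rsync_command(
        "/Users/me/Documents", "backup.example.org", "/srv/mirror/Documents", user="me"
    )
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Because the server is addressed by name, ssh looks the name up each time the command runs, so a changed IP address only matters until the DNS record is updated. Note that `--delete` makes the server a true mirror, which also means deletions on the laptop propagate.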
edm
Feb. 18th, 2016 08:40 am (UTC)
git-annex
git-annex, possibly with the auto-sync assistant? It's basically intended to be an Open Source Dropbox-a-like. I use it to keep a whole bunch of things in sync between multiple machines (and backup drives), but haven't used the assistant myself (just the command line version with a bunch of wrapper scripts).

There's a whole bunch of back ends ("special remotes"), but mostly I just use ssh for transport.

Ewen
tacit
Feb. 19th, 2016 03:34 am (UTC)
Re: git-annex
This looks like it might fit the bill. I've downloaded it and I'm in the process of seeing how well it will work. Thanks!
tacit
Feb. 20th, 2016 01:00 am (UTC)
Re: git-annex
Okay, so I installed git-annex, configured it (not easy to do; it turns out most home routers filter SYN DNS requests, so I was not able to log on to Jabber until I did some faffing and set both computers to use Google's public DNS servers), created a Box account, created repositories, hit sync, and...

On the laptop, it added 52,000 files to the repository and then threw up an alert in the Web GUI to the effect of "Warning, thread crashed adding files to the repository, restart thread?" And then the UI locked up.

Every. Damn. Time.

It appears that git-annex, like BT Sync, is fine for amateur use with a small number of files but is not robust enough for my needs.
edm
Feb. 20th, 2016 01:18 am (UTC)
Re: git-annex
:-(

That's definitely an order of magnitude or two more files than the repositories I've used myself (most of my repositories have fewer, large files); and as I said, I've not used the assistant (and hence not the Jabber, etc, coordination bits).

There's a Bug Page if you want to try reporting it; Joey Hess (the author) is usually pretty responsive. One that seems plausibly related at present is one about non-ascii filenames.

Sorry you had a less than ideal experience.

Ewen
edm
Feb. 20th, 2016 01:25 am (UTC)
Re: git-annex
Also someone else had issues with files with accents in them, recently, on OS X, so that part may be less tested. (Joey develops mostly on Debian Linux, so the Linux version presumably gets more testing.)

Ewen
edm
Feb. 23rd, 2016 11:55 pm (UTC)
RE: Re: git-annex
FYI, "Also, fixed problems with the Android, Windows, and OSX builds today. Made a point release of the OSX dmg, because the last several releases of it will SIGILL on some hardware." So maybe your crash issue was one of those fixed bugs.

Just in case you'd like to try again (I'm still running an old build that I haven't upgraded because it just worked and I didn't need the newer features).

Ewen
robbat2
Feb. 20th, 2016 10:58 pm (UTC)
You did read the scalability page, right?
https://git-annex.branchable.com/scalability/

Where he notes that the root problem is 'git add' being slow on hundreds of thousands of files, and gives a tuning param to work around it.
TheRevAlokSingh
Feb. 18th, 2016 09:04 am (UTC)
unison
https://www.cis.upenn.edu/~bcpierce/unison/
tacit
Feb. 19th, 2016 01:06 am (UTC)
Re: unison
I looked into Unison. It has the same difficulty as rsync--no auto-discovery of a server with a dynamic IP address.
edm
Feb. 19th, 2016 02:28 am (UTC)
Re: unison
For which the usual solution is (a) a dynamic DNS updater, or (b) a VPN, or (c) both. If you host your own DNS services, or your DNS provider allows dynamic updates directly, then you won't even need a third party service -- just a short TTL. Otherwise a CNAME from myserver.mydomain.example to the dynamic DNS ugly name is a common workaround.
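
To make the point concrete, here is a small Python sketch that looks up whatever address a name currently points at, using the standard library's `socket.getaddrinfo`. The helper name `resolve_addresses` is invented for this example, and `localhost` stands in for a real dynamic DNS name. A CNAME is followed by the resolver automatically, so the caller never sees the "ugly" name.

```python
import socket


def resolve_addresses(hostname, port=22):
    """Return the sorted set of IP addresses the name resolves to right now."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is a stand-in; a real setup would use the dynamic DNS hostname.
    print(resolve_addresses("localhost"))
```

Tools like ssh, rsync, and Unison do this lookup themselves on every connection, which is why pointing them at a hostname with a short TTL is usually enough.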

If dynamic DNS/VPNs are not possible options you're probably going to want some well known stable central point. That is, after all, how all the commercial services work.

Ewen

sylphon
Feb. 19th, 2016 12:14 am (UTC)
I just paid for a CrashPlan subscription too, but I really hate proprietary formats for backups. I'll be watching this thread with much interest :-)
robbat2
Feb. 19th, 2016 08:37 pm (UTC)
ArqBackup

The file format is fully documented, git-based, AND there is an open-source restore tool for it, officially supported by the company.
As destinations, it supports many cloud targets AND SFTP to your own servers.

You'll want to use some manner of dynamic DNS to give your server a consistent hostname too.
haightch
Feb. 20th, 2016 03:54 am (UTC)

If you were using Windows, there is an amazing free tool from Microsoft (really), called SyncToy. Does exactly what you want. Super easy to use. Incremental synchronization/mirroring of folders, and works over a network shared drive, or a plugged in usb drive.


If people find this through web search, maybe it will help them.
