There are a lot of different methods you could try. I think the easiest is to connect both systems with a (mesh) VPN like WireGuard, ZeroTier or Tailscale. Then you can simply copy stuff over using rsync -a (archive mode) from a cron job, or use dedicated backup tools like Borg Backup, Kopia, etc.
I second this. But keep in mind the difference between a sync tool like rsync, Syncthing, etc. and a dedicated backup tool like Borg.
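To make the rsync-plus-cron idea concrete, here is a minimal Python sketch that only builds the command and a crontab line; it does not run anything. The paths, the user name, the VPN address 100.64.0.2 and the 03:00 schedule are all made-up placeholders, not anything from this thread.

```python
import shlex


def build_rsync_command(source, destination, delete=False):
    """Return the argv list for an archive-mode (-a) rsync run."""
    cmd = ["rsync", "-a"]
    if delete:
        # --delete removes files on the destination that no longer exist
        # on the source; leave it off if you are unsure.
        cmd.append("--delete")
    cmd += [source, destination]
    return cmd


def cron_line(cmd, minute=0, hour=3):
    """Format a crontab entry that runs cmd once a day at hour:minute."""
    return f"{minute} {hour} * * * {shlex.join(cmd)}"


# Hypothetical source directory and remote target reachable over the VPN.
example = build_rsync_command("/srv/data/", "backup@100.64.0.2:/srv/backup/data/")
print(cron_line(example))
```

The printed line could be pasted into crontab -e on the source machine, assuming key-based SSH login to the other side already works.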
A sync tool is basically a fancy copy. It copies what is there now. It's a bit smarter than a copy in that it can avoid copying unmodified files, can optionally delete files that are no longer there, and has include/exclude patterns.
But a sync tool doesn't, by default, keep a history of older versions, and it doesn't deduplicate, compress or encrypt the stored copy. A backup tool does.
Both can be useful, as long as you use them properly. For example I gave my dad a Syncthing dir on his laptop that syncs whatever happens in that dir, over Tailscale, to my NAS. But the dir on the NAS gets backed up with Borg once a day.
Syncthing protects against problems like the laptop dying, being dropped, or getting stolen. The Borg backup protects against deleted and modified files. Neither of them is of any use if the user didn't put something in the dir to begin with.
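The sync-versus-backup difference can be shown with a toy model. This is only an illustration using plain Python dicts; it is not how Syncthing or Borg work internally, and the file name and contents are invented.

```python
import copy


def sync(source, replica):
    """Mirror source into replica, like a sync tool with deletion enabled."""
    replica.clear()
    replica.update(source)


def backup(source, snapshots):
    """Append a point-in-time copy of source, like a backup archive."""
    snapshots.append(copy.deepcopy(source))


laptop = {"report.txt": "draft 1"}
replica = {}      # stands in for the synced dir on the NAS
snapshots = []    # stands in for the backup repository

sync(laptop, replica)
backup(replica, snapshots)        # nightly backup of the NAS copy

laptop["report.txt"] = "oops, overwritten"
sync(laptop, replica)             # the mistake is mirrored faithfully

print(replica["report.txt"])       # the sync target only has the bad version
print(snapshots[0]["report.txt"])  # the snapshot still has the old one
```

The replica follows every change, including the unwanted one, while the snapshot taken earlier is untouched. That is why the two are worth combining.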
Borg is great.
Think ABC
- One backup on site
- One backup on site on a different medium
- One backup offsite, preferably set up in a way that doesn't allow modification by ransomware
> One backup on site on a different medium
One offline backup.
Backup on a different medium is archaic advice unless you're willing to fork out $$$ for a tape drive system. DVDs don't cut it in the era of 20 TB HDDs. I'd argue that HDD is the only practical medium currently for more than 4 TB at less than enterprise scale. Backblaze might be considered a different medium, I guess.
I have an HDD with... I think 4 TB lying around. What would be the best option? To just plug it into the server and leave it there?
Sure, but the point of an offline backup is to disconnect it when not in use, rendering it immune to ransomware, accidental deletions, lightning strikes, etc. Plug it in every week or whatever, do your backup, disconnect, sleep easy. I use an external USB HDD caddy (note that one needs a firmware update to work with bigger disks).
I personally do a restic backup on my server (I have a dedicated Hetzner server), keep one backup on the server itself, and send another to a Backblaze bucket as the offsite copy.
I also have a restic check job that runs every week and reads a random 10% of the data to make sure everything is intact and working correctly.
For my local machine I do the same, but additionally back up to a USB drive for a 3-2-1 backup solution.
As others have mentioned, it's important to highlight the difference between a sync (basically a replica of the source) and a true backup, which keeps historical data.
As far as tools go, if the device is running OMV you might want to start by looking at the options within OMV itself to achieve this. A quick Google search hinted at a backup plugin that some people seem to be using.
If you're going to be replicating to a remote NAS over the Internet, try to use a site-to-site VPN for this and do not expose file sharing services to the internet (for example by port forwarding). It's not safe to do so these days.
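For anyone wanting to script a weekly partial check like the one described above, here is a rough Python wrapper. It assumes a restic version recent enough to accept a percentage for --read-data-subset, and that the repository password comes from the environment (for example RESTIC_PASSWORD_FILE). The repository path is a placeholder, and this is a sketch rather than the poster's actual job.

```python
import subprocess


def restic_check_command(repo, subset="10%"):
    """Build a restic invocation that verifies a random subset of pack data."""
    return ["restic", "--repo", repo, "check", f"--read-data-subset={subset}"]


def run_check(repo, subset="10%"):
    """Run the check; True means restic exited with status 0."""
    result = subprocess.run(restic_check_command(repo, subset))
    return result.returncode == 0


# Only print the command here; call run_check() from a weekly cron job
# or systemd timer once the repository path is real.
print(" ".join(restic_check_command("/srv/restic-repo")))
```

Because each run reads a different random subset, repeated weekly runs cover more and more of the repository over time without reading all of it every week.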
The questions you need to ask first are:
- What exactly needs to be backed up? Some of it? All of it?
- How much space does the data I need backed up consume? Do I have enough to fit this plus some headroom for retention?
- How many backups do I want to retain? And for how long? (For example you might keep 2 weeks of daily backups, 3 months of weekly backups, 1 year of monthly backups)
- How feasible is it to run a test restore? How often am I going to do so? (I can't emphasise test restores enough - your backups are useless if they aren't restorable)
- Do you need/want to encrypt the data at rest?
- Does the internet bandwidth between the two locations allow you to send all the data for a full backup in a reasonable amount of time, or are you better off manually seeding the data some other way?
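The retention example in the list above (2 weeks of dailies, 3 months of weeklies, 1 year of monthlies) can be sketched as a small selection function. The mapping to 14 days, 13 ISO weeks and 12 months is my assumption, and the logic is a simplified take on what prune commands in backup tools do, not a copy of any tool's exact rules.

```python
from datetime import date, timedelta


def select_backups_to_keep(backup_dates, daily=14, weekly=13, monthly=12):
    """Return the set of backup dates to keep under a tiered policy.

    For each rule, walk the backups newest first and keep the newest
    backup in each period until the rule's limit of periods is reached.
    """
    newest_first = sorted(set(backup_dates), reverse=True)
    rules = [
        (daily, lambda d: d),                       # one bucket per day
        (weekly, lambda d: d.isocalendar()[:2]),    # (ISO year, ISO week)
        (monthly, lambda d: (d.year, d.month)),     # (year, month)
    ]
    keep = set()
    for limit, bucket_of in rules:
        seen = set()
        for d in newest_first:
            bucket = bucket_of(d)
            if bucket in seen:
                continue          # a newer backup already covers this period
            if len(seen) == limit:
                break             # this rule has kept enough periods
            seen.add(bucket)
            keep.add(d)
    return keep


# Hypothetical history: one backup per day for 400 days up to 6 Mar 2024.
today = date(2024, 3, 6)
history = [today - timedelta(days=n) for n in range(400)]
kept = select_backups_to_keep(history)
print(f"{len(kept)} of {len(history)} backups kept")
```

Everything not in the returned set would be a candidate for pruning. Running the numbers like this also helps answer the space question, since it tells you how many restore points you will actually be storing.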
Once you know that you will be able to determine:
- What tool suits your needs
- How you will configure the tool
- How to set up the interconnects between sites
- How to set up the destination NAS
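On the bandwidth question, a back-of-the-envelope estimate is often enough to decide between sending the first full backup over the wire and seeding it by hand. The sketch below uses decimal units and an assumed 80% link efficiency; both the efficiency figure and the example numbers are illustrative, not measurements.

```python
def full_backup_hours(data_gb, uplink_mbps, efficiency=0.8):
    """Estimate hours needed to upload data_gb over an uplink_mbps link.

    efficiency is the assumed fraction of the nominal link speed that is
    actually usable for the transfer.
    """
    bits = data_gb * 8 * 1000**3
    usable_bits_per_second = uplink_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 3600


# Hypothetical case: 1 TB of data over a 20 Mbit/s upload.
hours = full_backup_hours(1000, 20)
print(f"about {hours:.0f} hours ({hours / 24:.1f} days)")
```

If the answer comes out in weeks rather than hours, carrying a disk to the other site for the initial copy and only sending incrementals afterwards is probably the saner route.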
I hope I haven't overwhelmed, discouraged or confused you more, and feel free to ask as many questions as you need. Protecting your data isn't fun but it is important, and it's a good choice you're making to look into it.
Second to this - for what it's worth (and I may be tarred and feathered for saying this here), I prefer commercial software for my backups.
I've used many, including:
- Acronis
- Arcserve UDP
- Datto
- StorageCraft ShadowProtect
- Unitrends Enterprise Backup (pre-Kaseya, RIP)
- Veeam B&R
- Veritas Backup Exec
What was important to me was:
- Global (not inline) deduplication to disk storage
- Agent-less backup for VMware/Hyper-V
- Tape support with direct granular restore
- Ability to have multiple destinations on a backup job (e.g. disk to disk to tape)
- Encryption
- Easy to set up
- Easy to make changes (GUI)
- Easy to diagnose
- Not having to faff about with it and have it be the one thing in my lab that just works
Believe it or not, I landed on Backup Exec. Veeam was the only other one to even get close. I've been using BE for years now and it has never skipped a beat.
This most likely isn't the solution for you, but I'm mentioning it just so you can get a feel for the sort of considerations I made when deciding how my setup would work.
After reading this thread and a few other similar ones, I tried out BorgBackup and have been massively impressed with its efficiency.
Data that hasn't changed, that is stored in a different location, or that is otherwise identical to something already in the backup repository (in either the backup currently being created or any historical backup) isn't stored again. Only the information required to link that existing data to its doppelgangers is stored.
The original set of data I've got being backed up is around 270 GB, and I currently have 13 backups of it. Raw, that's 3.78 TB of data. After just compression using zlib, that's down to 1.56 TB. But the incredible bit is after deduplication (the part described in the paragraph above): the raw data stored on disk for all 13 of those backups is 67.9 GB.
I can mount any one of those 13 backups to the filesystem, or just extract any of the 3.78 TB of files directly from that backup repository of just 67.9 GB of data.
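To show why deduplication shrinks things so dramatically, here is a toy content-addressed chunk store in Python. It is a deliberate simplification: it uses tiny fixed-size chunks and no compression or encryption, whereas Borg uses content-defined chunking. The class, archive names and data are all invented for the example.

```python
import hashlib


class ChunkStore:
    """Toy deduplicating repository: each unique chunk is stored once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}     # sha256 hex digest -> chunk bytes
        self.archives = {}   # archive name -> ordered list of digests

    def backup(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.archives[name] = digests

    def restore(self, name):
        """Rebuild an archive's data from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.archives[name])

    def stored_bytes(self):
        """Total bytes of unique chunk data held in the repository."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
store.backup("monday", b"AAAABBBBCCCC")
store.backup("tuesday", b"AAAABBBBDDDD")   # only the last chunk is new

print(store.stored_bytes())      # 16 bytes stored for 24 bytes backed up
print(store.restore("monday"))   # b'AAAABBBBCCCC'
```

Every archive remains fully restorable even though shared chunks are stored once. For scale, the figures in the post (3.78 TB raw versus 67.9 GB on disk) work out to roughly 55 times less data stored than backed up.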
I tried to compare some backup solutions a while back: https://hedgedoc.ptman.name/kket4uo9RLiJRnOhkCzvWw#
Thank you, I've downloaded the .md for my Obsidian notes :-) Great starting point!
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:
Fewer Letters | More Letters |
---|---|
NAS | Network-Attached Storage |
UDP | Unified Data Protection (as in Arcserve UDP above); elsewhere, User Datagram Protocol |
VPN | Virtual Private Network |
3 acronyms in this thread; the most compressed thread commented on today has 7 acronyms.
[Thread #578 for this sub, first seen 6th Mar 2024, 12:55]