DIY Computer backup

From DIYWiki
Revision as of 21:58, 4 October 2023 by John Rumm (talk | contribs)

**** CAUTION - INCOMPLETE ARTICLE - WORK IN PROGRESS ****

As a DIYer, the chances are you have accumulated a vast collection of files over the years: plans, designs, drawings, photos and loads of other stuff, along with the normal pile of documents, videos, scans and recordings. It is said that you can divide computer users into two groups: those who have lost important information or documents on their computer, and those who are going to!

With that in mind, this article covers some of the ways to make such a loss easier to recover from, by having a good working backup solution to safeguard your information.

In theory...

In theory this is not a difficult problem - just keep another copy of the data somewhere, so that if your computer dies, gets stolen, or goes up in smoke, you can get all your stuff back from your copy. However, the reality is a bit more complex. What happens if you have not actually lost your data, but realise that some of it was corrupted some time ago? The file is still there, but when you try to open it you just get an error message. Having a faithful copy of a corrupted file does not help much. And how do you make sure that your backup itself does not get destroyed or stolen?

Computer backup is a "deep" subject with lots of options, and one size most definitely does not fit all.

Requirements

One of the hardest bits to get right is actually working out what your requirements are. Your requirements will need to factor in how much information you need to store, what you are prepared to pay to do it, how long you are prepared to wait to store and recover it, and what kind of incidents you need to protect against.
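
As a rough illustration of the "how much" and "how long" questions above, the Python sketch below adds up the size of a folder and estimates how long it would take to copy at a given sustained speed. The function names and any speeds you feed in are assumptions for illustration only; real drives and network links will vary.

```python
import os


def folder_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def transfer_hours(size_bytes, megabytes_per_second):
    """Estimate the hours needed to move size_bytes at a sustained speed.

    Uses decimal megabytes (1 MB = 1,000,000 bytes), as drive makers do.
    """
    seconds = size_bytes / (megabytes_per_second * 1_000_000)
    return seconds / 3600
```

For example, transfer_hours(2_000_000_000_000, 100) estimates the time to copy 2 TB at a sustained 100 MB/s, which works out to a little over five and a half hours. The same calculation applies to restoring, which is often the number that matters more.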

So first some terminology

Type | What is it | Pros | Cons
Disaster Recovery / Bare metal backup
  What is it: A backup that makes a complete copy of everything on your computer: all the data and files, all the applications, the operating system and all the configuration. The idea is that if something goes wrong, such as your hard drive failing without warning, you replace the drive, boot from your backup recovery CD or USB thumb drive, and restore the last backup. Once that is done you can carry on exactly where you left off.
  Pros: Typically very fast, and can fix many types of failure. Saves lots of time if you need to do a full recovery and don't want to have to reinstall your operating system and all your applications.
  Cons: Often it is not very granular. If you just accidentally deleted a single file, having to restore the entire computer back to where it was at the last backup might be overkill, and might actually lose newer data that has not yet been backed up. A bare metal backup may also only be easy to restore to exactly the same or very similar hardware. It is also unable to deal with a file that was corrupted some time ago and has been backed up many times since then (see Generational backup below).

Full backup
  What is it: The process of making a full copy of all of the files that you want to back up.
  Pros: The backup is complete and does not depend on any other backup.
  Cons: It might take a long time, and might require lots of storage space.
Incremental backup
  What is it: A backup that follows on from another backup and captures only the changes made since that last backup.
  Pros: Quick to do, and often requires little storage space.
  Cons: Restoration is more complex, and may require restoring a number of backups in sequence to get back to the most recent state.
Generational backup
  What is it: A backup that keeps not just the current version of each file, but also a number of (possibly all) the previous versions as well.
  Pros: This lets you not only recover from total loss of your file, but also step back through time to find the desired version of a file - even if it is not the latest.
  Cons: Takes more storage space and is often more difficult to administer. May make recovery more complicated.
Online backup
  What is it: Not to be confused with a backup to somewhere on the internet. Simply a backup that is stored somewhere that is always accessible to the computer. This could be a USB thumb drive or external hard drive plugged into your computer, or another computer or network attached storage device on your network.
  Pros: Easy and rapid access, no manual intervention required.
  Cons: The backup itself is vulnerable - it could be overwritten or destroyed by the computer it is attached to.
Offline backup
  What is it: One stored somewhere that is not immediately accessible, say on an external hard drive that is kept disconnected.
  Pros: While the backup medium is offline, it is immutable.
  Cons: Becomes vulnerable when re-attached.
Offsite backup
  What is it: An offline backup that is stored in a physically different place to the original data. This might be cloud hosted storage, or it might be a hard drive that you have stored at a friend's house, at the office, or even in your bank safe deposit box!
  Pros: Offsite backups are essential to mitigate some kinds of disaster. They are safe from fire or theft of the original equipment. Since they are not accessible (i.e. online) to the computer they protect, they can't easily be overwritten, corrupted or deleted, even by malicious software running on your computer or by the actions of a malicious individual.
  Cons: Takes more time and effort to maintain and administer. The off site copy now needs to be secured to prevent it becoming a data leak problem!
Immutable backup
  What is it: One that can't be changed: it can't be overwritten, corrupted, or deleted. Can include backups written to write-once, read-many media such as a DVD-R.
  Pros: The ultimate in rock solid protection.
  Cons: Can be massively expensive to maintain since it can swallow an almost infinite amount of storage capacity. Can be very slow to access.
Cloud / Internet backup
  What is it: Storing backup data on someone else's server.
  Pros: Offsite, and can be arranged to be immutable (at least from the end that is being backed up).
  Cons: Often incurs monthly costs - the amount depending on the type of storage and how "accessible" it needs to be.
Fault tolerant backup
  What is it: Backups need to be fault tolerant. That typically means having multiple copies of data, accessible on hardware that is itself fault tolerant or can be easily swapped out for something compatible.
  Pros: Can protect against hardware failure stopping you from backing up or restoring data.
  Cons: More expensive, can require ongoing maintenance.
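
To make the difference between a full and an incremental backup more concrete, here is a minimal Python sketch of the incremental idea: copy only the files that have been modified since the previous backup. The function name incremental_backup and the use of file modification times are assumptions for illustration; real backup software usually tracks changes in more robust ways, and this sketch does not record files that have been deleted.

```python
import os
import shutil


def incremental_backup(source, destination, last_backup_time):
    """Copy files under source modified after last_backup_time.

    last_backup_time is in seconds since the epoch (as from time.time()).
    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.path.getmtime(src) <= last_backup_time:
                continue  # unchanged since the previous backup
            rel = os.path.relpath(src, source)
            dst = os.path.join(destination, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
            copied.append(rel)
    return sorted(copied)
```

Passing a last_backup_time of 0 copies everything, which is in effect a full backup. Restoring then means laying down the full backup first and each incremental on top in order, which is the extra complexity noted in the table.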

Backup laws

There are some "laws" of backup - good practice that it is vital that you take note of...

The Laws of Backup | Why?
Test your backup
  Why: It is very easy to think you are fully protected because you have a backup system in place. However, finding out that it does not work as you expected at the very moment you need it is not a good feeling!


Have you actually backed up what you thought you had?

  • Did you include all the right folders, including those which are normally hidden by the operating system but include important application or configuration data?
  • Were you able to back up those files that were actually open at the time of your backup? You know, things like that critical database or email folder, or perhaps even a virtual machine?


Have you actually backed up enough?

It might be tempting to only back up the stuff you "need", and not worry about the OS or your applications - after all, the operating system and applications can be reinstalled from their original sources, in theory at least. The practicalities can be very different:

  • Do you still have those install CDs?
  • What about the activation keys required to install the software that came with them?
  • Did the application require online activation? Does that still work, and will it work on new hardware?
  • How many hours did it actually take to install all your applications? Then all the updates? Then all the extra addons and downloads that you added in the many months after?
  • Did you originally install from a download? Do you still have a backup of that? Could you find it again?
  • If you download again, can you get the same version you had installed? Do you remember which version you had? If you use the latest one, is it compatible with all your files?


Are you able to actually recover all the files?

  • Are you able to recover the information you want to a new location, and not just overwrite the working copy?
  • If you have a bare metal disaster recovery backup, can you restore data to an incompatible hardware platform?
  • Will it let you also recover individual files and not just the whole lot?


Can you do a restore in a realistic timeframe?

  • Cloud backups can be nifty, but have you got the available bandwidth to download several TB of data when you next need it?
  • Can you physically get to your off site copy when you need to?
  • Even a USB thumb drive in your pocket may have relatively low read speeds, and take many hours to copy from.


Can you recover the actual version of the files you need, and not just the latest copy?

  • Being able to get back a specific earlier version of a file is often more important than having just the most recent one.
Don't destroy your only working backup
  Why: It might seem like a good idea to send your next backup to the same device as your last one. But that means you are probably destroying, or at least corrupting, your one and only working backup before you have completed a new one. Traditional backup solutions use multiple sets of backup media for several reasons, and this is one.
Do it often enough
  Why: Doing a backup can be a tedious process, so the temptation is to only do it when "you have enough changes to make it worthwhile". The problem here is that you tend to forget or underestimate the number of changes, or how much work it will be to redo them. It is bad enough finding out that you have lost the last few days of work, but worse when you realise it was actually several weeks, and now you can't even remember exactly what you have lost.
Automate
  Why: See "Do it often enough". The way to make sure that it happens often enough is to automate the process. If you don't have to think about doing it, it won't get forgotten. However, check that the automation is actually working when it should.
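
One simple way to act on "Test your backup" is to compare the files in a backup against the originals. The Python sketch below is an illustration only: the function names are made up for this example, and it assumes the backup is a plain copy of the folder tree rather than a proprietary archive. It reports files that are missing from the backup and files whose contents differ.

```python
import hashlib
import os


def file_hash(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup):
    """Compare every file under source with its copy under backup.

    Returns (missing, different): relative paths absent from the backup,
    and relative paths whose contents do not match the original.
    """
    missing, different = [], []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            copy = os.path.join(backup, rel)
            if not os.path.isfile(copy):
                missing.append(rel)
            elif file_hash(src) != file_hash(copy):
                different.append(rel)
    return sorted(missing), sorted(different)
```

A file listed as different is not necessarily a bad backup; it may simply have been edited since the backup was taken. A check like this is also no substitute for actually trying a restore to a new location.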

Backup Destinations

There are a vast number of options for protecting your data. Which will work for you will depend on your requirements. It is common to need several different options to fully achieve your requirements.

Destination | Good for | Limitations | Cost
Another partition on your hard drive or SSD
  Good for: Operating systems allow storage devices to be partitioned into multiple logical volumes. That can make a lot of sense with the massive storage capacity of modern disks. It can allow you to keep your OS and applications separate from your data, and your data more logically organised. A backup on another partition can be a handy convenience, allowing very quick and easy restoration of files.
  Limitations: Backups on alternate partitions are not really proper backups; they are highly vulnerable:
    • If the drive fails, you lose your main copy and backup.
    • If the backup is online all the time, it is easy to corrupt - ransomware can encrypt it at the same time as your main copy.
    • You can accidentally delete it just as easily.
    • So can a careless or malicious family member, or member of staff.
    • A stolen, failed, or burnt computer will lose your backup.
  Cost: Low
A different hard drive or SSD in the same computer
  Good for: Same advantages as above.
  Limitations: Mostly the same disadvantages, except that you will still be protected should only the first drive fail. You have some limited "fault tolerance".
  Cost: Low
An external hard drive
  Good for: Also very quick to access, and has the advantage that if disconnected from the computer after the backup, it is now "offline" and less easy to corrupt. Being offline, it can also be moved off site for extra security.
  Limitations:
    • Easy to leave online when not intended, reducing protection to just that of a different drive in the same computer.
    • Being off site can be good, although that might make your data less secure if others can access it.
    • HDDs might not be good long term storage options - ones unused for a long time may fail to start up reliably.
  Cost: Low
A USB thumb drive
  Good for: Small, cheap and convenient.
  Limitations:
    • Typically slow to read and write.
    • Reliability not always good.
    • Can accidentally be left "online" when not intended.
  Cost: Very Low
Fault tolerant disk systems
  Good for: Using multiple physical drives in a fault tolerant arrangement (such as RAID or redundant storage pools; note that plain JBOD on its own provides no redundancy) can mitigate the risk of losing data as a result of a device failure. Access speed can be very good (and in some cases faster than "ordinary" storage).
  Limitations: Used alone, not a complete backup option, but can often form the basis of a first line of defence.
  Cost: Medium
Network attached storage
  Good for: Smart network storage devices (i.e. anything with storage and a computer, such as an off-the-shelf NAS, a home-built one, or a "real" server) can add resilience to your backup:
    • Often have fault tolerant storage systems.
    • May take care of "snapshots", making sure you have generational backups.
    • Can be fast enough to use as primary storage, i.e. mapping your normal working folders onto the NAS/server so that it is the primary storage location, making your data easy to access from multiple computers and devices, and letting you play media to multiple destinations.
  Limitations:
    • Slightly slower than internal native storage (typically limited by local network (LAN) performance).
    • If in the same building, still vulnerable to disaster damage.
    • Typically still online and so vulnerable to accidental deletion, corruption or ransomware. However, with a bit of planning they can also host storage used for backup that is not directly accessible to the systems using them. They can "pull" information from the protected system, rather than permit it to push data to them.
  Cost: Medium to High
Online internet / Cloud storage
  Good for: Can be highly convenient, allowing real time backup of your data - keeping every change as soon as it happens. Sharing and restoring to other systems is often easy. There are a number of distinct services available in this space. See Cloud Storage Options below.
  Limitations: Ranges from free to quite pricey - much depends on how much space you need and how long you can wait for recovery.
  Cost: Varies
Tape
  Good for: In many cases the "go to" solution for larger business users. Masses of storage at relatively low cost per TB. Relatively fast. Good for keeping offline backups and also offsite backups.
  Limitations: Needs some manual intervention, and some maintenance. Initial equipment costs can be very high. Tape is also a serial access medium, so if you want just a few files recovered from a backup, it might take longer to spool through the tape to the right place to start recovery.
  Cost: High initial cost
Optical disks (CD/DVD)
  Good for: Cheap, ideal for offline and offsite storage, can be immutable.
  Limitations: Less well suited to modern quantities of data. Some question over the long term lifespan of media in storage. May require significant amounts of disk handling with large amounts of data. Relatively slow performance compared to other local options. With time, access to suitable mechanisms may become more limited, since many systems no longer include a DVD re-writer as standard.
  Cost: Very Low
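
The generational "snapshots" mentioned for network attached storage can be approximated on any of the destinations above by keeping several dated copies and pruning the oldest. The Python sketch below is a deliberately naive illustration: the function name, the timestamp folder naming and the keep limit are all assumptions, it makes a full copy every time, and it assumes the snapshot folder contains nothing but snapshots. Real NAS snapshot features usually work quite differently and far more efficiently.

```python
import os
import shutil
import time


def take_snapshot(source, snapshot_root, keep=5, stamp=None):
    """Copy source into a new dated folder and prune old snapshots.

    stamp can be supplied for testing; by default the current time is
    used. Returns the path of the snapshot just created.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    os.makedirs(snapshot_root, exist_ok=True)
    if stamp is None:
        stamp = time.strftime("%Y%m%d-%H%M%S")
    target = os.path.join(snapshot_root, stamp)
    shutil.copytree(source, target)  # fails if this snapshot already exists
    # Names sort chronologically because of the timestamp format.
    snapshots = sorted(os.listdir(snapshot_root))
    for old in snapshots[:-keep]:
        shutil.rmtree(os.path.join(snapshot_root, old))
    return target
```

Because the new snapshot is written before any old one is removed, there is always at least one complete earlier copy on the destination while the new one is being made, which is in the spirit of "Don't destroy your only working backup".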


Backup Strategies


Disaster recovery / bare metal software

Cloud Storage Options

Amazon AWS

Avast Cloud Backup

Backblaze

Dropbox

Google Drive

OneDrive