A Poor Man's Cloud Storage: Bidirectional File Sync with Unison
Since the cloud emerged many years ago, a host of cloud storage solutions has become available to us: Google One, Dropbox, pCloud, Microsoft OneDrive, Apple iCloud, ownCloud, Nextcloud, Seafile, and more.
However, they all seem to have some caveats.
- Syncing to Linux with Google One (Drive client) is not possible, as there is no official client for it (third-party only).
- Dropbox, the former king of cloud storage, limits the number of linked devices in the free plan. The cheapest paid plan starts at €9,99 and offers 2 TB of storage. Now, 2 TB is a lot, but very likely we'll never need 2 TB of cloud storage.
- Nextcloud can be set up easily on a VPS. However, it's actually a full suite that comes with calendar integration, to-do lists, mail, contacts, etc. So it might not be the ideal choice either if all we want is storage.
I think one can see a pattern here. If we are on the hunt for a simple, lean, sync-only storage solution, the solution space narrows very quickly.
But don't fret, there is a solution: Unison.
Unison is a cross-platform bidirectional file synchronization tool. So we can get ourselves a cheap server, with as much storage capacity as we'd like, set up Unison on both server and client and enjoy hassle-free, pure cloud storage sync on our own virtual private server.
Installing Unison
In order to get up and running, we first need to install the Unison package from our distribution's repository. On Debian-based systems, this is as easy as executing the following snippet.
For other operating systems/package managers, please check out the official docs for installing unison.
sudo apt install -y unison
This has to be done on the server as well as on all the intended clients.
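Before moving on, a quick sanity check on each machine can save some head-scratching. The following is a small sketch that reports whether the unison binary is on the PATH and which version it is; the variable name unison_installed is only illustrative. Comparing the version output of client and server is worthwhile, since Unison has historically been strict about both ends running compatible versions.

```shell
# Check whether unison is available and print its version.
if command -v unison >/dev/null 2>&1; then
    unison -version
    unison_installed=yes
else
    echo "unison not found on PATH; install it first" >&2
    unison_installed=no
fi
```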
Setting up the SSH connection
In order to use Unison in the most unobtrusive way, it is recommended to set up a passwordless SSH connection between the server and the clients we'd like to connect.
This is actually very easy and can be done in 3 simple steps.
First, if we don't have one already, we have to generate an SSH key for (all) our client device(s).
ssh-keygen -t rsa
Executing the above creates an SSH key pair for us and puts it in the .ssh directory inside our home directory.
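Because the sync is later meant to run unattended from cron, the key has to be usable without anyone typing a passphrase. Here is a minimal sketch of one way to get there, assuming a key with an empty passphrase is acceptable for our setup. The helper name make_sync_key and the comment text are made up for illustration, and Ed25519 is swapped in for the RSA key above as a commonly recommended alternative.

```shell
# Hypothetical helper: create a key pair that works without a passphrase prompt.
# -q          quiet output
# -t ed25519  key type (an alternative to the RSA key used above)
# -N ""       empty passphrase, so unattended runs are not blocked by a prompt
# -C TEXT     free-form comment stored in the public key
# -f FILE     where to write the private key; the public key goes to FILE.pub
make_sync_key() {
    ssh-keygen -q -t ed25519 -N "" -C "unison-sync" -f "$1"
}

# Example call, using the default location that ssh checks on its own:
# make_sync_key "$HOME/.ssh/id_ed25519"
```

If a key already exists at that path, ssh-keygen asks before overwriting it, so an existing key is not silently replaced.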
Second, we send the public key of our pair to the server we'd like to sync to and from.
ssh-copy-id my-server-account@123.45.67.89
Most likely, we are going to be asked for the server account's password at this point so that the public key can be copied over.
And as a third step, we can then test if our connection can be established without the need for a password.
ssh my-server-account@123.45.67.89
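An interactive login can hide a problem, because ssh quietly falls back to asking for a password. For a stricter check, here is a sketch that uses the OpenSSH option BatchMode=yes, which disables password prompts so that a missing or rejected key shows up as a failure. The function name check_passwordless_ssh and the 10-second timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: succeed only if key-based login works without any prompt.
check_passwordless_ssh() {
    # BatchMode=yes makes ssh fail instead of asking for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true; then
        echo "key-based login to $1 works"
    else
        echo "key-based login to $1 failed" >&2
        return 1
    fi
}

# Example call with the placeholder address used in this post:
# check_passwordless_ssh my-server-account@123.45.67.89
```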
Setting up the Synchronization Profile
In order to have the most frictionless experience, we are going to create a Unison profile. This allows us to put all the configuration settings that define how exactly the synchronization should be executed in one file.
nano ~/.unison/cloud-storage.prf
Inside this file we are going to put the following.
ignore = Name {.DS_Store}
ignore = Name {.ipynb_checkpoints}
root = /home/my-client-account/cloud-storage
root = ssh://my-server-account@123.45.67.89/cloud-storage
batch = true
auto = true
The content of our profile can obviously be changed as we see fit.
If we never deal with files from Apple devices, we can of course remove the ignore line for “.DS_Store” files at the top; if we need to ignore more files, we can add further ignore lines. In general, Unison is able to ignore arbitrary combinations of files and paths.
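As a sketch of what additional rules could look like, the profile fragment below ignores temporary and swap files anywhere in the tree as well as one top-level directory. The patterns *.tmp and *.swp and the directory name scratch are made-up examples. Name matches the last component of a path, while Path matches the whole path relative to the root.

```
# Ignore temporary and swap files wherever they appear
ignore = Name {*.tmp,*.swp}
# Ignore the (hypothetical) directory "scratch" directly below the root
ignore = Path scratch
```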
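We define two root entries for our sync. The first one points to the directory on our client system that we'd like to keep in sync. The latter one defines the path on the server we'd like to sync from and to. The single slash after the host makes that path relative to the home directory of the server account.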
As the last entries, we define batch = true and auto = true. auto defines that Unison should sync new and changed files without always asking for permission. batch defines that Unison is not going to ask back about files that are in conflict (that is, files that have been changed on both systems without a sync in between). Unison will just skip those files and sync the rest. This might sound scary, but in reality it's the best option. In case we notice that files are getting out of sync, we can resolve those conflicts easily via the GUI.
In order to check if we set up everything correctly, we can execute the sync command and check the folder contents afterwards on both systems.
unison cloud-storage
In the snippet above, cloud-storage refers to the name of the profile we created, that is, its filename without the .prf extension.
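Checking the folder contents on both systems can also be scripted. Here is a rough sketch, assuming the passwordless SSH login from the previous section works: it drops a marker file into the local root, runs one sync, and then asks the server whether the file arrived. The function name verify_sync and the marker file name are invented for this example.

```shell
# Hypothetical helper: push a marker file through one sync and look for it on the server.
# Arguments: profile name, local root, SSH target, remote root (relative to the remote home).
verify_sync() {
    profile=$1
    local_root=$2
    remote=$3
    remote_root=$4
    marker="sync-test-$$.txt"

    # Create the marker locally, then run one sync with the given profile.
    echo "sync test" > "$local_root/$marker" || return 1
    unison "$profile"

    # The function's exit status tells us whether the marker exists remotely.
    ssh "$remote" "test -f '$remote_root/$marker'"
}

# Example call with the placeholder values used in this post:
# verify_sync cloud-storage "$HOME/cloud-storage" my-server-account@123.45.67.89 cloud-storage
```

The marker file can be deleted on either side afterwards; the next sync propagates the deletion.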
Automating the Synchronization
Having verified that our setup works as anticipated, it's time to automate it. To make Unison repeatedly check whether the state of our client and server systems has diverged, we create a plain old cron job for it. We edit the crontab on our client.
crontab -e
To schedule a sync every 10 minutes, we add the following line.
*/10 * * * * test -e /var/lock/unison-sync-lock && exit 0 || (touch /var/lock/unison-sync-lock;unison cloud-storage;rm /var/lock/unison-sync-lock)
What exactly is happening here? This looks a bit more complicated than the command we executed for our test scenario.
First, we check whether a file called “unison-sync-lock” exists in /var/lock; if it does, the command exits right away.
If no such file exists, that exact file is created and a sync is started. As soon as the sync has finished, the file is removed.
This is basically just a safety measure. A sync of big files could take more than 10 minutes, and since a new sync is started every 10 minutes, the two syncs would interfere with each other.
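One weakness of a hand-rolled lock file is that it stays behind if the machine reboots or the sync is killed mid-run, which would block every later sync until the file is removed by hand. As an alternative sketch (not the setup used above), the flock utility from util-linux can take over the locking. The helper name run_locked is made up for illustration.

```shell
# Hypothetical helper: run a command only if no other run holds the lock.
# flock -n gives up immediately (non-zero exit) when the lock is already taken,
# and the lock is released automatically when the command ends, even after a crash.
run_locked() {
    lockfile=$1
    shift
    flock -n "$lockfile" "$@"
}

# In the crontab the same idea fits on one line:
# */10 * * * * flock -n /var/lock/unison-sync-lock unison cloud-storage
```

With this variant, checking and taking the lock happen in one step, so two cron runs starting at nearly the same moment cannot both slip through.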
(Fixing file conflicts)
In general, it should be stated that the higher the frequency of our sync calls is, the less likely it is that we are going to get into the situation of file conflicts, but nevertheless they can happen.
So what to do then? The easiest and most straightforward way to fix file conflicts seems to be via the GUI application unison-gtk.
We can install it just like we did the unison package.
sudo apt install -y unison-gtk
Having followed the previous steps of creating a profile and setting up the SSH connection, we can then just start the Unison GUI, select the profile we have created and be presented with a list of files that are in conflict.
The “Action” column in the UI will show a red question-mark symbol for all files that are in a conflicting state.
All there is to do is to select each file and decide in the toolbar whether to sync from local to remote or from remote to local (Left to Right, or Right to Left). When we are done, we click “Go” in the toolbar and the conflicts will be resolved the way we chose.
Congratulations!
You just set up your very own Unison sync client with your private server, giving you full control over how and what exactly is being synced.