Where to Save Data

Faculty of Engineering
University of Waterloo

Many of us create data files in the course of our jobs. This data is critical to our units, and often to the University, in both the short and the long term. We must all make sure that the data is

  • backed up regularly and safely to multiple locations with multiple save dates

  • available to others who need it (i.e. shared)

  • available after we leave, retire or are sick, or after a compromise

Every year we see numerous groups lose days' or months' worth of work because they chose the wrong storage option, and the problem is usually discovered too late to help.

Summary

Worst Options                                     Second Worst Options     Best Options

hard disk 🙁                                      Microsoft OneDrive 😐    Shared Drives (e.g. R: drive) 😃
memory sticks 🙁                                  Cloud Providers 😐       NextCloud (ECResearch or EngAdmin) 😃
removable media 🙁                                local NAS box 😐         Teams 😃
(like CDs, DVDs, portable drives)
                                                                           NFS storage (Unix)

Worst Options

The worst options are:

  • local hard disk

  • removable media (flash drives, CDs, DVDs)

These media are not backed up, and memory sticks are often lost or erased. It’s hard to find the file you want in a pile of sticks. Also, they degrade over time and lose data.

In the event of a drive crash (they happen eventually), user error, software failures, theft, flood, fire, ransomware, etc., the data is lost forever, including any backups in the same room.

Also, when you upgrade to a new computer every few years, that data will be lost unless you remember to copy it to the new hardware.

Second Worst Options

  • Microsoft OneDrive

  • cloud providers like Dropbox

  • local NAS box

OneDrive is not backed up by the institution, and the storage is erased upon your graduation or other departure from the organization. And if you change jobs, you might decide to erase old files on which others had grown reliant.

Departments which rely on OneDrive will find important data is gone when the user moves on for whatever reason.

Sharing OneDrive files with read-only access is complicated, so other people often end up able to change your files, now or in the future.

Third-party providers are expensive, do not guarantee backups, and may close your account at any time, resulting in data loss over the longer term.

NAS boxes are only a single backup site, often in the same room (think fire, theft, flood). They are also networked computers, vulnerable to network attacks and to software and hardware failures.

Best Options

  • network home drive for personal files

  • shared drives (SMB file sharing - a drive letter) for sharing

  • NextCloud (ECResearch or EngAdmin) for both personal and shared files

  • Teams for shared files

  • NFS stored backups on ECResearch for Unix systems

These systems all keep data safe and are great for sharing among your peers.

They are backed up daily and we have backups in multiple locations, with backups going back for some time. We can recover files and provide continuity when others need access.

NextCloud has features that allow you to share with anyone on campus (Nexus integration), or to email a link with READONLY access to anyone in the world, protected by a password and an expiry date.

NextCloud (like OneDrive) uses a subdirectory on your hard drive to temporarily work on files, so it is generally faster and more compatible than shared drives.

NextCloud is particularly great if you will have many files and share a lot of data frequently. It takes a few minutes to set up, and is best for in-faculty sharing, but you can work on files with any application.

Teams is often a good choice if you are sharing only Microsoft Office files (like documents and spreadsheets), there aren’t too many documents (it is more awkward to navigate), or the group is large and distributed. Its greatest strength is simultaneous shared editing of files (e.g. shared spreadsheets).

SMB/CIFS and NFS sharing are particularly good for automated backup systems.
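
As an illustration of what such an automated backup might look like, here is a minimal Python sketch that copies a working folder into a dated subfolder on a mounted share and keeps only the most recent few copies. Everything named here is an assumption made for the example: the function name snapshot, the keep parameter, and the idea that the share is already mounted as an ordinary folder. It is not the Faculty's backup system and does not replace it.

```python
import shutil
from datetime import date
from pathlib import Path


def _is_snapshot(path: Path) -> bool:
    """True only for directories whose name is an ISO date, e.g. 2024-01-31."""
    try:
        date.fromisoformat(path.name)
    except ValueError:
        return False
    return path.is_dir()


def snapshot(source: Path, share_root: Path, today: date, keep: int = 7) -> Path:
    """Copy `source` into share_root/YYYY-MM-DD and prune the oldest snapshots.

    `share_root` is assumed to be a mounted SMB or NFS share (hypothetical path).
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    target = share_root / today.isoformat()
    # dirs_exist_ok lets a second run on the same day refresh that day's copy
    shutil.copytree(source, target, dirs_exist_ok=True)
    # ISO dates sort chronologically as plain strings, so sorting by path works
    snapshots = sorted(p for p in share_root.iterdir() if _is_snapshot(p))
    for old in snapshots[:-keep]:
        shutil.rmtree(old)  # only date-named folders are ever removed
    return target
```

A call such as snapshot(Path("C:/work/project"), Path("R:/backups/project"), date.today()) would be run from a scheduled task; both paths are placeholders. Keeping several dated copies matches the goal of having multiple save dates, so that an older version can be recovered if a newer one is damaged.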

Best Practices

Store data (or copies of it) on one of the good options above.

Store data in folders or subdirectories by function and/or by year. A folder with 1,000 files is close to useless, but when you give your data descriptive titles and group it in subdirectories by year, you and others will have an easier time retrieving it.
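
For a folder that has already grown too large, the grouping by function and year can be done with a short script. The following Python sketch is one possible approach, not a prescribed tool: the function name organize_by_year and the layout destination/function/year are assumptions for illustration, and it uses each file's last-modified time as a stand-in for the year the file belongs to, which may not always be right.

```python
import shutil
from datetime import datetime
from pathlib import Path


def organize_by_year(source: Path, dest_root: Path, function_name: str) -> list[Path]:
    """Copy each file in `source` into dest_root/function_name/YYYY/.

    The year is taken from the file's last-modified time (an assumption).
    Originals are left in place so nothing is lost if the result looks wrong.
    """
    copied = []
    for item in sorted(source.iterdir()):
        if not item.is_file():
            continue  # this sketch handles a single flat folder only
        year = datetime.fromtimestamp(item.stat().st_mtime).year
        target_dir = dest_root / function_name / str(year)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / item.name
        shutil.copy2(item, target)  # copy2 also preserves the timestamps
        copied.append(target)
    return copied
```

For example, organize_by_year(Path("old_files"), Path("sorted"), "purchasing") would place a file last modified in 2021 under sorted/purchasing/2021/. The folder names are placeholders. The script copies rather than moves, so the result can be checked before the original pile is cleaned up.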