Practical tips on how to manage and care for your data

November 5, 2020, 2:57pm

This piece was written by Matthew Yang, Archive Officer at the Asian Film Archive, for World Digital Preservation Day


Managing digital data has become increasingly challenging, and the sheer volume that can build up in a short period can be daunting. Digital data is vulnerable and can easily be lost if not properly cared for. Hard drives fail, files suffer bit rot, and data gets deleted by accident. Data loss is painful, and recovery (if even possible) is an expensive and stressful process. With the advancement of born-digital filmmaking technologies, we are generating more data than ever before, and this puts our films at serious risk of loss if we do not start taking steps to care for the physical longevity of a work as it is being created.

In the film archive, many of the titles we receive in born-digital formats require hours of staff time to sift through and sort out the clutter. This is usually because the data is not properly organised or lacks sufficient documentation (for instance, poor file-naming conventions), making it difficult to know what is being filed.

Since many people are working from home as a result of COVID-19, it is a great opportunity to take stock, review the backup of your files and check that you are still able to access your digital films and related materials. No one knows your work better than you, so ensuring the integrity of your materials is a task best done by you.

We have compiled some practical tips and basic data management concepts that are easy to apply for anyone planning to organise and back up their personal files at home. There is no single right way to do this, and you may find other solutions that better suit your needs.

For the long-term preservation of your film works, consider sending them to the Asian Film Archive. We will assess whether they fall within our acquisition policy. Write to us at info@asianfilmarchive.org!

Let’s start!

  • Consolidate your files

Tracking and managing your files gets challenging when they are kept across multiple devices. It’s easy to lose track of where they are if some sit on hard drives and others on online platforms such as Dropbox, Google Drive and OneDrive. Identifying where the files are is the first step in taking stock of what you have.

Once you have located your files, you could centralise them on your computer or an external hard drive. Remember to choose a storage medium large enough to hold everything.

At this stage, everything will look like a huge mess. Take the time to survey what you have, because this will inform how you organise and name your files later.

By the end of this process, you will have a good sense of how much data you have, and knowing this will help you allocate the necessary storage capacity for your backup strategy.
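If you are comfortable running a small script, the survey can be partly automated. The following is a minimal Python sketch, not a definitive tool: the function names (folder_size_bytes, survey, human_readable) are made up for illustration, and it assumes your consolidated files sit under a single folder that you pass in.

```python
import os
from pathlib import Path


def folder_size_bytes(root: Path) -> int:
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symbolic links so linked files are not counted twice.
            if path.is_file() and not path.is_symlink():
                total += path.stat().st_size
    return total


def survey(root: Path) -> dict:
    """Map each immediate child of root to its total size in bytes."""
    report = {}
    for child in sorted(root.iterdir()):
        if child.is_symlink():
            continue
        if child.is_dir():
            report[child.name] = folder_size_bytes(child)
        elif child.is_file():
            report[child.name] = child.stat().st_size
    return report


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    size = float(num_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.1f} {units[index]}"
```

Calling survey() on your consolidated folder and passing each value through human_readable() gives a per-folder breakdown, which can help when deciding how large a backup drive to buy.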

  • Keep only what you need

Select what you need and delete whatever is unimportant. For example, you might encounter multiple copies of a file, and it is worth considering how many are sufficient to keep; for instance, you might keep only the latest version.

This quality check process will free space, keeping your data volume to a minimum. Working with a smaller volume will keep costs low and allow an easier migration and backup process down the road.
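Spotting exact duplicates by eye is tedious, so here is a hedged Python sketch of one possible approach: group files by size, then compare SHA-256 checksums within each group. The function names are hypothetical, and the script only reports duplicates; it deliberately deletes nothing, so the decision of what to keep stays with you.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large video files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list:
    """Return groups (lists) of paths whose contents are byte-identical."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and not path.is_symlink():
                by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Note that this only finds identical copies. Different versions of the same file (for example, an earlier edit) have different contents and still need to be reviewed by hand.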

  • Organise your files

Organise the files that you have selected by creating a file directory structure and a file naming convention that make sense to you and to others accessing the files. Whatever system you decide upon should allow easy access and quick identification.

File Directory Structure

This is an example of how files can be organised and structured:

I find it useful to first determine the top-level folder. I used [YEAR] in this example, and branched into sub-directories by [CATEGORY]. In the screenshot above, we see Year 2020 and the categories are AFA Work, External Projects and Personal Documents. The lower-level directories might be project- or event-based, but try to keep them as consistent as possible.

There is no ‘perfect’ format; this example gives you an idea of how you can start. Your directory should be intuitive and logical, and should be guided by how you work.

Here are some tips on designing your own directory.

  1. Draw it out!
    • Draw your directory out on a piece of paper before implementation
  2. Keep it simple
    • Avoid complex and deep-layered designs
  3. Consistency
    • Keep a consistent structure across folders
  4. Be precise
    • Use plain language and keep it short
  5. Avoid spaces, punctuation and symbols
    • Some operating systems and software handle spaces and special characters in names differently, so avoid them in a mixed operating system environment
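Once the design is settled on paper, the folders themselves can be created in one go. Below is a minimal Python sketch of the [YEAR]/[CATEGORY] layout described above. The helper names safe_name and create_structure are hypothetical, and replacing spaces with underscores is just one way of following tip 5; the sketch also assumes plain ASCII folder names.

```python
import re
from pathlib import Path


def safe_name(label: str) -> str:
    """Replace spaces with underscores and drop punctuation and symbols.

    Assumption: names use plain ASCII letters and digits; any other
    character is removed.
    """
    underscored = re.sub(r"\s+", "_", label.strip())
    return re.sub(r"[^A-Za-z0-9_-]", "", underscored)


def create_structure(root: Path, year: int, categories: list) -> list:
    """Create root/YEAR/CATEGORY folders and return the paths created."""
    created = []
    for category in categories:
        folder = root / str(year) / safe_name(category)
        # exist_ok=True makes the function safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For instance, create_structure(Path("Archive"), 2020, ["AFA Work", "External Projects", "Personal Documents"]) would produce Archive/2020/AFA_Work, Archive/2020/External_Projects and Archive/2020/Personal_Documents, where "Archive" is a placeholder for wherever you keep your files.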

File Naming Convention

Descriptive file names are imperative for quick identification and retrieval of files. Poor naming conventions are frustrating and waste a lot of time since they do not give useful information (“Best practices for file naming”, 2020).

A guiding principle for filenames is to include basic information such as object type, dates, and important remarks. These indicators can be crucial in distinguishing one file from another in the event that there are multiple variations of a given file. In essence, an effective file name should tell you what the file is without you having to open it (Antin, 2020).

In the screenshot above, assets of a project are organised into folders: Logos, Mock_Up_Thumbnail and Watermarked_Stills. The names of the three files clearly indicate that they are image stills (Still) from the film Sunshine Singapore (SS), watermarked (WM), in .png format.

Here are some tips on File Naming:

  1. Make it human-readable
    • Use concise and plain language
    • Avoid abbreviations that are not common knowledge
  2. Keep it short
    • Avoid long words and sentences
  3. Use indicative descriptions
    • Is it a report or a still image? Be clear
    • Is it the latest version? Put an indicative date
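A naming convention is easiest to keep consistent when the names are generated rather than typed. The sketch below is one possible way to do that in Python. The pattern it uses (title code, object type, remark, date, sequence number) is an assumption chosen for illustration, loosely modelled on the SS, Still and WM indicators mentioned above; it is not the exact convention used in the screenshot.

```python
import re
from datetime import date


def build_filename(title_code: str, object_type: str, remark: str,
                   created: date, sequence: int, extension: str) -> str:
    """Join the descriptive parts with underscores into one file name.

    Hypothetical pattern: CODE_Type_Remark_YYYYMMDD_NNN.ext
    """
    parts = [
        title_code,
        object_type,
        remark,
        created.strftime("%Y%m%d"),  # year-first dates sort chronologically
        f"{sequence:03d}",           # zero-padded so 002 sorts before 010
    ]
    # Strip spaces, punctuation and symbols from each part; skip empty parts.
    cleaned = [re.sub(r"[^A-Za-z0-9-]", "", part) for part in parts if part]
    return "_".join(cleaned) + "." + extension.lower().lstrip(".")
```

With these assumptions, build_filename("SS", "Still", "WM", date(2020, 11, 5), 1, "png") returns SS_Still_WM_20201105_001.png. Putting the date in year-month-day order means that sorting by name also sorts by date.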

More on File Naming:

1. Best Practices for File Naming

2. Bulk Rename Utility Tool

  • Create Backups

Now that your files are all in place and organised, it is time to back them up! You can consider the classic 3-2-1 data protection strategy, a model widely adopted by professionals in content and media production.

3-2-1 strategy

Keep 3 copies of your data

The 3-2-1 strategy encourages keeping three copies of your data because one copy is simply not enough. Having a single copy is dangerous, and the more copies you have, the lower the chance of complete data loss.

Store 2 copies on 2 different storage devices

Drives will eventually fail because of mechanical failure or wear and tear. Hence, the 3-2-1 strategy recommends keeping your first two backups on two separate storage devices at your primary location. Storing the two copies on different devices provides added insurance for data restoration in the event that one fails.

There are various backup storage solutions available, but which do you choose? Here are two solutions you can consider.

A) External hard drive (SATA & SSD)

This is the most common solution because it is affordable, easy to use and widely available. An external hard drive requires little set-up: just plug it into your computer via USB and it is ready to use. External drives usually come in two variants, SATA and SSD, where SATA here refers to a traditional hard disk drive with spinning platters. SATA drives are much cheaper, but SSDs are faster and less prone to mechanical failure since they have no moving parts. However, it is important to note that SSDs have limited write cycles, even though they are less susceptible to physical wear (“SATA vs SSD vs NVMe: Types of Hard Drives”, 2020).

Pros: Easy to use, portable, affordable, widely available

Cons: Can’t easily share files

You can consider deploying multiple hard drives and relying on tools or software to duplicate data from one drive to another.

Here are some suggested tools:

1. FreeFileSync

2. GoodSync

3. Bvckup 2
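If you would rather script the duplication yourself than use one of the tools above, the idea can be sketched in a few lines of Python. This is a simplified illustration, not a replacement for dedicated backup software: the names copy_and_verify and mirror_tree are hypothetical, the copy is one-way, existing files at the destination are overwritten, and files deleted from the source are not removed from the backup. It also assumes the backup folder is not inside the source folder.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy one file, then confirm the copy matches by comparing checksums."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies the contents and tries to preserve timestamps as well.
    shutil.copy2(source, destination)
    return sha256_of(source) == sha256_of(destination)


def mirror_tree(source_root: Path, backup_root: Path) -> list:
    """Copy every file under source_root; return files that failed the check."""
    failures = []
    for source in sorted(source_root.rglob("*")):
        if source.is_file():
            target = backup_root / source.relative_to(source_root)
            if not copy_and_verify(source, target):
                failures.append(source)
    return failures
```

An empty list returned by mirror_tree() means every copied file matched its source at the time of copying. Comparing checksums after each copy is slower than copying alone, but it catches a silently corrupted transfer before you come to rely on it.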

B) Network Attached Storage (NAS)

A NAS device is connected over a computer network and acts as a central location for multiple users to write and access data (“What is NAS (Network Attached Storage) and Why is NAS Important for Small Businesses? | Seagate UK”, 2020). Depending on the model, a NAS houses multiple hard drives, and its capacity can be scaled up according to the number of available bays. You can think of a NAS as an array of hard drives put together to form a larger storage unit.

It can be set up in a RAID configuration, which combines the drives within the NAS in different ways while still presenting them as one cohesive storage unit. This allows you to balance storage redundancy and performance according to your needs. There are many NAS solutions available, but Synology and QNAP are two popular brands.
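To make the trade-off between redundancy and capacity more concrete, here is a small Python sketch that estimates usable capacity for the common standard RAID levels. It is a rough illustration under stated assumptions: all drives are the same size, file system overhead is ignored, and vendor-specific schemes offered by some NAS brands are not covered. The function name usable_capacity_tb is made up for this example.

```python
def usable_capacity_tb(level: str, drive_count: int, drive_size_tb: float) -> float:
    """Estimate usable capacity in TB for equal-sized drives.

    RAID0: striping, no redundancy, all capacity is usable.
    RAID1: mirroring, usable capacity equals a single drive.
    RAID5: one drive's worth of capacity is used for parity.
    RAID6: two drives' worth of capacity are used for parity.
    """
    minimum_drives = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4}
    if level not in minimum_drives:
        raise ValueError(f"Unsupported RAID level: {level}")
    if drive_count < minimum_drives[level]:
        raise ValueError(f"{level} needs at least {minimum_drives[level]} drives")

    if level == "RAID0":
        return drive_count * drive_size_tb
    if level == "RAID1":
        return drive_size_tb
    if level == "RAID5":
        return (drive_count - 1) * drive_size_tb
    return (drive_count - 2) * drive_size_tb  # RAID6
```

For example, under these assumptions four 4 TB drives give 16 TB in RAID0 with no protection against a drive failure, 12 TB in RAID5 with tolerance for one failed drive, and 8 TB in RAID6 with tolerance for two. Keep in mind that RAID protects against drive failure only; it is not a substitute for the separate copies described in the 3-2-1 strategy.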

Pros: Allows multi-user access, scalable, customisable, status monitoring

Cons: High upfront cost, requires basic technical knowledge


More on NAS storage:

1. What is a NAS Device?

Keep 1 copy off-site

The last component of the 3-2-1 strategy is to have 1 copy of your data stored off-site, away from your primary location. This is a key component in designing a robust backup strategy, as on-site storage can be compromised by hardware failure, theft or fire. You can consider cloud storage as your off-site solution, where your files are hosted on a cloud service for a fee (“Backup Strategies: Why the 3-2-1 Backup Strategy is the Best”, 2020).

There are many cloud storage plans you can consider, some costing as little as USD 6 per month for unlimited file storage. Most of these solutions keep data restoration straightforward as well. For example, they can return your data in different ways: by direct download, or on a USB flash drive or external hard drive. Whichever method you pick will depend on the volume of data you are retrieving.
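Whichever route your data takes back to you, it can be reassuring to confirm that the restored files are identical to the ones you backed up. One common way to do this is to record a checksum for every file before backing up and to compare against that record later. The Python sketch below illustrates the idea; the function names and the JSON manifest format are assumptions made for this example and are not a feature of any particular cloud service.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def save_manifest(manifest: dict, manifest_file: Path) -> None:
    """Write the manifest as JSON so it can be kept alongside the backup."""
    manifest_file.write_text(json.dumps(manifest, indent=2, sort_keys=True),
                             encoding="utf-8")


def load_manifest(manifest_file: Path) -> dict:
    """Read a manifest previously written by save_manifest."""
    return json.loads(manifest_file.read_text(encoding="utf-8"))


def verify_against(manifest: dict, root: Path) -> dict:
    """List files that are missing or whose contents have changed."""
    report = {"missing": [], "changed": []}
    for relative, expected in manifest.items():
        candidate = root / relative
        if not candidate.is_file():
            report["missing"].append(relative)
        elif sha256_of(candidate) != expected:
            report["changed"].append(relative)
    return report
```

The same check can be run periodically on any of your copies, which is one way to notice bit rot or accidental changes while another good copy still exists.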

  • Conclusion

There is no one way to manage and back up your data, since it will depend on your data volume and, importantly, your budget. When it comes to designing your backup system, a general rule of thumb is to keep multiple copies and spread them across different storage solutions. If one source fails, you can rely on the others for data restoration.

Managing your data requires continuous effort, and it is easy to overlook amid competing responsibilities. Hence, you can consider a regular backup routine or simply back up as you go. By doing so, you not only reduce the risk of data loss but also gain peace of mind that your files are safe and readily accessible when needed.

Reference List

Antin, K. (2020). File naming conventions: why you want them and how to create them. HURIDOCS. Retrieved 11 October 2020, from https://www.huridocs.org/2016/07/file-naming-conventions-why-you-want-them-and-how-to-create-them/.

Backup Strategies: Why the 3-2-1 Backup Strategy is the Best. Backblaze Blog | Cloud Storage & Cloud Backup. (2020). Retrieved 11 October 2020, from https://www.backblaze.com/blog/the-3-2-1-backup-strategy/.

Best practices for file naming. Stanford Libraries. (2020). Retrieved 10 October 2020, from https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-naming.

SATA vs SSD vs NVMe: Types of Hard Drives. Pluralsight.com. (2020). Retrieved 11 October 2020, from https://www.pluralsight.com/blog/it-ops/types-of-hard-drives-sata-ssd-nvme.

What is NAS (Network Attached Storage) and Why is NAS Important for Small Businesses? | Seagate UK. Seagate.com. (2020). Retrieved 10 October 2020, from https://www.seagate.com/sg/en/tech-insights/what-is-nas-master-ti/.