# Using DNG for archival purposes



## Sashi (Aug 26, 2016)

Hi,

I like the idea of being able to 'validate' my photos to avoid problems like bit rot when using the DNG format. I'm considering converting my library and would like to hear your opinions, especially with regard to existing TIFFs.

Are there any reasons I may not want to do this to my whole library, including TIFFs?
How does the conversion handle layers? Should I flatten the TIFFs that contain them?

Thanks


----------



## AaronT (Aug 26, 2016)

Hi Sashi. There is no such thing as bit rot for photo files; what you are thinking of is file degradation or corruption. If you are a Windows user you can run weekly chkdsk or ScanDisk checks of your file system. Corruption will affect DNG files as much as any other file type. Of course, the best insurance is a good backup of all your data files, of any type.


----------



## pwp (Aug 26, 2016)

I use a 100% DNG workflow, following the recommendations of Peter Krogh:
http://www.dpbestflow.org/DNG
http://thedambook.com/about/

Here's the Lightroom verification info:
http://thedambook.com/dng-verification-in-lightroom-5/

-pw


----------



## Sashi (Aug 26, 2016)

Thanks!

pwp: the links you provided pretty much answered it all!


----------



## wockawocka (Aug 29, 2016)

Once edited, everything goes to DNG. Keepers remain uncompressed; non-keepers are saved at smaller dimensions.

I keep everything.


----------



## LDS (Aug 29, 2016)

Sashi said:


> I like the idea of being able to 'validate' my photos to avoid problems like bit rot when using the DNG format. I'm considering converting my library and would like to hear your opinions, especially with regard to existing TIFFs.



For long-term storage, you need something better than a simple checksum. It will tell you if a file is corrupt, but won't help you recover it except from another pristine copy - and you have to hope that copy is not corrupted as well.
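As a concrete illustration of what a detection-only checksum buys you, here is a minimal sketch of a fixity manifest in Python (the file names and helper names are made up for the example; this only detects corruption, it cannot repair anything):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large raw files don't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(folder: Path) -> dict:
    """Record a checksum for every file under `folder` at archive time."""
    return {str(p): sha256_of(p) for p in sorted(folder.rglob("*")) if p.is_file()}

def changed_files(manifest: dict) -> list:
    """Re-hash and report files whose checksum no longer matches (detection only)."""
    return [name for name, digest in manifest.items()
            if sha256_of(Path(name)) != digest]
```

If `changed_files` reports a file, all you can do is fetch a clean copy from another backup - which is exactly the limitation described above.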

There are some storage solutions that not only compute a checksum but do it in (more complex) ways that allow the corrupted data to be recomputed if needed (error correction codes, e.g. Reed-Solomon). They require some extra space for every file, but the advantage is that this space is less than a full separate copy. Of course, multiple copies combined with error correction codes increase reliability and availability further.
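To see why an error-correcting scheme costs less space than a full copy yet can actually rebuild data, here is a toy sketch using a single XOR parity block (the idea behind RAID 5 style parity; real archival tools use stronger codes such as Reed-Solomon, which can repair multiple errors):

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length blocks together: one extra block, not a full copy."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks))

def recover(blocks, parity):
    """Rebuild the single missing block (None) from the survivors plus parity."""
    survivors = [b for b in blocks if b is not None]
    return xor_parity(survivors + [parity])
```

Three 4-byte blocks need only one 4-byte parity block (25% overhead here, versus 100% for a mirror), and any one lost block can be recomputed from the rest.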

That's why, for example, good backup software doesn't just copy data - it also computes and stores the information needed to restore corrupted files (as long as the corruption is not so large that it cannot be fixed).

Just remember that "bit rot" may happen at different layers - software, CPU, memory, controllers, transmission links, storage - so to ensure a "reliable" workflow each and every component needs to be reliable.

This works for any file - and if the file format has some internal check as well, that adds another layer.


----------



## Zeidora (Aug 29, 2016)

LDS said:


> There are some storage solutions that not only compute a checksum but do it in (more complex) ways that allow the corrupted data to be recomputed if needed (error correction codes, e.g. Reed-Solomon). They require some extra space for every file, but the advantage is that this space is less than a full separate copy. Of course, multiple copies combined with error correction codes increase reliability and availability further.
> 
> That's why, for example, good backup software doesn't just copy data - it also computes and stores the information needed to restore corrupted files (as long as the corruption is not so large that it cannot be fixed).



Can you give examples of such error-correction backup solutions? I currently do brute-force RAID 1 in two locations. I once tried Retrospect, but it had such a clunky and incomprehensible interface that I abandoned it again.


----------



## Zeidora (Aug 29, 2016)

dilbert said:


> Zeidora said:
> 
> 
> > ...
> ...



I used to run NAS with various TeraStations, but data transfer even over gigabit Ethernet is too slow. It seems the checksum verifications mentioned earlier allude to RAID 5 (or flavors thereof). I moved away from RAID 5 because if one drive fails, it puts more strain on the other drives (especially during a rebuild), which makes failure of a second drive much more likely. On a 4-drive RAID 5, two drives down is fatal. Given how cheap drives are these days, RAID 1 (and flavors thereof, such as RAID 10) is much safer. Plain hard drives are also simpler than NAS boxes; I had a NAS board fail on me, so I am cured of that mistake.


----------



## LDS (Aug 31, 2016)

Zeidora said:


> Can you give examples of such error-correction backup solutions? I currently do brute-force RAID 1 in two locations. I once tried Retrospect, but it had such a clunky and incomprehensible interface that I abandoned it again.



Unfortunately, most current desktop-oriented (and affordable) backup solutions rely heavily on storage reliability and may do little more than copy files, although some can at least validate backups (usually they can do no more than tell you there are errors). That's because they are designed more for relatively short-term storage of changing data, mostly for "disaster recovery" situations (failed disks, deleted files, etc.), not for long-term storage and archival. Reliability is usually increased by keeping multiple backups in a rotation scheme.

I'm currently trying to evaluate how well today's backup software (e.g. Acronis, Veeam, Bacula, and others) works for long-term storage (not cloud based). Some high-end backup software can do more, but it is expensive and complex to use.

In the past, when media were far less reliable even for short-term storage (think floppies and tapes!), the software itself needed to be more reliable. For that matter, CDs, DVDs and their formats store a lot of error correction info alongside the data. But to increase reliability further, correction data should be stored separately. There is a tool called dvddisaster which can be used for such a task. I'm also evaluating M-Disc for long-term storage (of important data), coupled with software to keep track of where files are, plus ECC data.

RAID 1 is good for availability (and read speed), but not for (long-term) reliability - if the data on the two disks differs, there's no way to know which copy is good - so you need multiple copies.
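That "which mirror is right?" tie can be broken cheaply if a checksum is stored independently of both mirrors. A hypothetical sketch (the function name is made up; real file systems like ZFS do this per block):

```python
import hashlib

def pick_good_copy(copies, known_digest):
    """RAID 1 alone can't say which of two differing mirrors is right;
    an independently stored checksum identifies the uncorrupted copy."""
    for data in copies:
        if hashlib.sha256(data).hexdigest() == known_digest:
            return data
    return None  # every copy is corrupted - only more copies or ECC would help
```

Note the caveat in the last line: the checksum arbitrates between copies, but if all copies are bad you are back to needing error correction data or another replica.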

RAID 5 is somewhat better, and RAID 6 a little more, because they hold more redundant data - but still not perfect. There are other solutions that may offer a higher level of protection because they work at the file-system level, so they know and can do more about the data being written. ZFS is one of these - though it has the disadvantage of requiring RAM and CPU power to work well. ZFS can also automatically store multiple copies of each file.

On top of this (a layered solution is always better), you should add a backup/archival solution able to calculate error correction data at the file level. For archival there are some free solutions, like Fixity and FileFixity, which are still a bit raw and not exactly user-friendly. Some of these are developed by companies and institutes worried about the long-term availability of digital archives, which is still not properly addressed - until not long ago "digital data" was mostly ephemeral, but that changed quickly, and not only for images, while the archival needs have been overlooked.

There are some commercial solutions for data archival, but they are usually aimed at companies, not single users, and are big and expensive.

There are also some cloud solutions, e.g. Amazon Glacier, designed for such needs using a mix of technologies. They may be cheaper - if you use them properly - and they take care of all the storage maintenance.

Just be aware that people who tried to use Glacier as a disaster-recovery backup (restoring many or all files at once), rather than as an archive, found it could be pretty expensive used that way: its pricing structure is designed for data that is infrequently accessed and long lived. But if, for example, each year you archive the previous year's images, while keeping local (or proper cloud) backups of the data you need readily available, it could be a good solution.


----------



## Mt Spokane Photography (Sep 1, 2016)

DNG uses an already obsolete TIFF file format, and, as others have said, it's not immune to file errors.

Longevity is one of the big concerns with digital images. I sent out about 30 DVDs with my family photos and those of my ancestors. Hopefully some will survive. It's certain that anything I have on my computers or NAS will be lost when I'm gone.


----------

