Poster: Coderjo Date: Oct 8, 2012 1:49pm
Forum: petabox Subject: Re: How long does the data last?

The data is stored on two separate hardware nodes as soon as it is uploaded to archive.org. As far as I know, the system does not add any ECC beyond what the hard drives do internally. However, one of each item's XML files lists the item's files along with their checksums, which can be used to verify the copies on each node.
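To make the verification idea concrete, here is a minimal sketch of checking one node's copy of an item against such a manifest. The XML layout (a `<files>` root with `<file name="...">` entries that each contain a `<sha1>` element) and the names `verify_item` and `sha1_of` are assumptions for illustration, not a description of the archive's own tooling.

```python
import hashlib
import os
import xml.etree.ElementTree as ET


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_item(manifest_xml, item_dir):
    """Map each listed file name to 'ok', 'missing', or 'mismatch'.

    Assumed (hypothetical) manifest layout:
    <files><file name="..."><sha1>...</sha1></file></files>
    """
    results = {}
    for entry in ET.fromstring(manifest_xml).iter("file"):
        name = entry.get("name")
        expected = (entry.findtext("sha1") or "").strip().lower()
        path = os.path.join(item_dir, name)
        if not os.path.isfile(path):
            results[name] = "missing"
        elif sha1_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results
```

Running the same check on both nodes and comparing the two result maps would show which copy, if either, has drifted from the recorded checksums.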

Poster: Seaware Date: Oct 9, 2012 12:47am
Forum: petabox Subject: Re: How long does the data last?

Thanks. So if the half-life of the data on a disk is 100 years (for example), would the drive be powered on and its data checked at least once during that period, with the first failing checksum triggering replication to a fresh drive? Also, I hope you are using a CRC rather than a simple checksum, since a CRC is more likely to catch multi-bit errors.
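As a small illustration of the CRC point, the sketch below compares a deliberately naive additive checksum (sum of bytes modulo 256) with CRC-32 on data whose first two bytes have been transposed. The function names and sample bytes are made up for the example, and nothing here reflects what the archive actually runs.

```python
import zlib


def additive_checksum(data):
    """Sum of all byte values modulo 256: a deliberately naive checksum."""
    return sum(data) % 256


def compare(a, b):
    """Return (additive checksums match, CRC-32 values match) for two byte strings."""
    return (
        additive_checksum(a) == additive_checksum(b),
        zlib.crc32(a) == zlib.crc32(b),
    )


original = b"petabox data"
swapped = b"eptabox data"  # first two bytes transposed, a multi-bit error

additive_same, crc_same = compare(original, swapped)
print("additive checksum misses the error:", additive_same)
print("CRC-32 misses the error:", crc_same)
```

The additive sum is unchanged by reordering bytes, so it cannot see the transposition, while CRC-32 is guaranteed to detect any error burst of 32 bits or fewer.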

Poster: Coderjo Date: Oct 10, 2012 11:04pm
Forum: petabox Subject: Re: How long does the data last?

I don't know the low-level details, so I can't say whether the data is scrubbed regularly. I also don't know what procedure is followed when a drive fails and needs to be replaced.

Looking at the files.xml file for a random item, the system currently records SHA-1, MD5, and CRC32 values for each file. It also stores the file size and mtime.
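For anyone who wants to compute the same five values for a local file, here is a rough sketch. The function name `describe_file`, the dictionary keys, and the formatting choices (lowercase hex digests, CRC32 as eight hex digits, mtime as whole seconds) are my assumptions and may not match how files.xml encodes them.

```python
import hashlib
import os
import zlib


def describe_file(path, chunk_size=1 << 20):
    """Compute SHA-1, MD5, CRC32, size, and mtime for one file in a single pass."""
    sha1 = hashlib.sha1()
    md5 = hashlib.md5()
    crc = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha1.update(chunk)
            md5.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC over all chunks
    stat = os.stat(path)
    return {
        "sha1": sha1.hexdigest(),
        "md5": md5.hexdigest(),
        "crc32": format(crc & 0xFFFFFFFF, "08x"),  # assumed 8-digit hex form
        "size": stat.st_size,
        "mtime": int(stat.st_mtime),  # assumed whole seconds since the epoch
    }
```

Reading the file once and feeding each chunk to all three digests keeps the cost close to a single pass over the disk, which matters when the files are large.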