Via the Digital-Preservation mailing list, I found out about this article on the decay rate of online citations.

I can’t read the actual article because we don’t subscribe, but a quick scroll through reports about it suggests that a study of five journals (Human Communication Research, the Journal of Broadcasting & Electronic Media, the Journal of Communication, Journalism & Mass Communication Quarterly, and New Media & Society) and 1126 articles in the years 2000 to 2003 found that:

373 of the citations now did not work at all, a decay rate of 33 percent; of those that worked, only 424 took users to information relevant to the citation. In one of the journals in the study, 167 of 265 citations did not work.
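To make the quoted figures concrete, here is a small worked calculation. It rests on an assumption: the summary above speaks of 1126 articles, but the 33 percent figure only works out if 1126 is the total number of citations checked, so that is how it is treated here. The function name `share` is purely illustrative.

```python
def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: 1126 is the number of citations checked, not of articles.
total_checked = 1126
dead = 373
working = total_checked - dead  # 753 citations still resolved
relevant = 424                  # of those, still pointing at relevant content

print(share(dead, total_checked))  # overall decay rate: 33.1
print(share(relevant, working))    # working links still relevant: 56.3
print(share(167, 265))             # the worst journal in the study: 63.0
```

On that reading, only 424 of the 1126 citations (under 40 percent) still led to relevant information, which is arguably the more worrying number.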

Amusingly, the article describing this problem appears in a journal (Scholarly Communications Report) which does not appear to use the DOI system. (SCR does carry an ISSN, but I have yet to find anything useful one can do with an ISSN.) I wonder whether SCR will move or reorganise their website in the future? In case you wonder whether they might, the link to that paper was:

www.extenza-eps.com/extenza/loadHTML?objectIDValue=64138&type=abstract

Even the pdf file itself had this link:

http://www.extenza-eps.com/extenza/loadPDFInit?objectIDValue=64138

which at least implies they ought to be able to reorganise without breaking the link.

On a more local scale, I wonder how well blog “permalinks” will survive. I realise that much of the blogosphere is truly ephemeral, but there is much good stuff out there too which ought to be preserved. In my case, I’m starting to treat this blog like a list of notes of “public record”, and I imagine I’ll want them for decades to come.

I’m pretty confident that many of the links will decay pretty quickly. But I feel obliged to ensure that mine won’t (after all, I run a centre dedicated to preserving data).
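One modest way to keep an eye on this is to re-check a list of links from time to time. The following is only a minimal sketch using Python's standard `urllib`; the function names `classify` and `check_link` are made up for illustration. It also has clear limits: it only tells you whether a URL still answers, not whether it still leads to information relevant to the citation (the study's second, harder problem), and some servers refuse HEAD requests, which would be misreported as broken.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def classify(status):
    """Map an HTTP status code (or None for no answer) to a coarse verdict."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    return "broken"


def check_link(url, timeout=10.0):
    """Issue a HEAD request and report whether the URL still resolves.

    urlopen follows ordinary redirects itself, so a link that has moved
    but is still redirected correctly is reported as "ok".
    """
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return classify(response.status)
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return classify(err.code)
    except (OSError, ValueError):
        # DNS failure, timeout, refused connection, or a URL with no scheme.
        return classify(None)
```

Note that a URL needs its scheme (`http://...`) to be checked at all; a bare `www...` address is reported as unreachable by this sketch.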

How will we deal with permanence and migration? Firstly, for this to survive, the URL home.badc.rl.ac.uk/lawrence has to survive and point to the right stuff. Secondly, either leonardo has to survive, or the xhtml content has to survive. Obviously, taking the long-term view, it’s the xhtml that might survive (even if it’s in a new format in the future).

So to make this happen we need to

  1. guarantee to preserve home.badc.rl.ac.uk/lawrence in perpetuity, and
  2. ensure that leonardo can export easily the entire contents as static pages.

The latter is a preservation issue by which all content management systems need to be judged. The former is ok for the small numbers of NCAS/BADC staff, but if we roll leonardo out more widely for NCAS we might want to think about a more suitable URL for long-term preservation.

(Of course, none of this precludes the writer of the blog moving on to a new URL and new software, but the content needs to survive.)
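As a rough illustration of the export requirement, the sketch below writes each entry out as a standalone xhtml file whose path mirrors its permalink, so the same URLs could later be served by any plain web server. Everything about the data layout is an assumption: the `entries` mapping, the `export_static` name and the page template are hypothetical and say nothing about how leonardo actually stores or exports its content.

```python
import html
from pathlib import Path

# Minimal XHTML 1.0 wrapper; the entry body is assumed to be valid xhtml already.
PAGE = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
{body}
</body>
</html>
"""


def export_static(entries, out_dir):
    """Write one .html file per entry and return the list of paths written.

    `entries` maps a permalink path (e.g. "2006/10/persistence") to a
    (title, xhtml_body) pair. This structure is hypothetical.
    """
    out = Path(out_dir)
    written = []
    for slug, (title, body) in sorted(entries.items()):
        path = out / f"{slug}.html"
        # Mirror the permalink's directory structure on disk.
        path.parent.mkdir(parents=True, exist_ok=True)
        page = PAGE.format(title=html.escape(title), body=body)
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written
```

The point of keeping the file path identical to the permalink path is that the content and its address can then outlive the software that first produced them.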

trackbacks (1)

Persistence (from “Bryan’s Blog” on Monday 23 October, 2006): Just after I wrote my last post on data citation, I found Joseph Reagle’s blog entry on bibliography and citation. He’s making a number of points, one of which was about transience. In the comments to his post …