The preservation of online news corrections, updates and post-publication edits

Day 2: Errors

This is part two of a white paper, “Memory Holes and Permanent Errors,” which examines whether and how online news archives should preserve corrections, updates and other post-publication changes. Parts three and four will be published May 18 and May 25, respectively, along with a PDF of the entire white paper.


The media’s relationship with their own mistakes has been flawed from the start. Even into the 1970s, there was little rhyme or reason to the placement of corrections, and little consistency. Some newspapers would constantly change the name of their corrections columns (Silverman, 2009, pp. 228-229). In fact, printing corrections was often regarded as a sign of weakness (Shepard, 1998). Eventually, it became common practice for newspapers to select a space where they’d place corrections each day, usually page A2 (Silverman, 2009, p. 229). This at least ensured that readers who wanted to read corrections would know where to look. And by the turn of the past century, newspapers came to see corrections as a way to highlight their fairness and gain reader trust (Shepard, 1998).

But this was far from an effective way to correct mistakes. Someone who read an errant story on Monday was unlikely to read the correction printed on Tuesday or Wednesday. Even if she did read both the original and the correction, it was often unclear just what information was being corrected and how, especially when the old paper had already served double-duty as kindling, fish wrapping or bird-cage liners.

The move to online news offered a chance to rectify some of these difficulties. News organizations were slow on the uptake, though they realized that online stories had the fundamental distinction of being “continuously published,” which allowed corrections to be made at any time (Ang, 1999). Before the end of the 20th century, editors had also crystallized a dilemma: whether archives should be “error-free” or stories should be left as they were on the day of publication. One commentator complained, “Errors large and small can be corrected at any time, erased into the ether as though they never happened” (Joe Salkowski, cited in Ang, 1999). Some prominent news outlets clung to outmoded ways of thinking, however, maintaining that there was no need to add corrective information to errant stories. They essentially argued, “If we didn’t do it for print, why do it for digital?”

Problems with corrections today

Slowly, major news outlets began to abandon this outmoded attitude, for several good reasons. First, online news has a “long tail.” Whereas month-old or year-old stories were once unlikely to reach anyone beyond researchers, now our Google searches constantly call up articles that old and older. Social media also drive readers to old stories, as do hyperlinks: from other news stories, from blogs, from Wikipedia. Every time someone finds that old, mistaken article, the misinformation has a chance to spread anew.

Second, we’ve learned much since the print days about how the human brain processes information. We’ve learned that even when people read a correction and agree that the initial information was false, the misinformation might continue to affect their attitudes (Nyhan & Reifler, 2012). This gives news outlets all the more reason to correct mistakes quickly and to get them in front of the original story’s readers. The diversity of news sources today also makes it less likely than ever that the reader of a story will just happen across a correction to that story (Cornish, 2010).

Still, archives, including vendor databases, often allow errant information to pickle away, unnoticed (Silverman, 2009). This problem was recognized, and solutions urged, 18 years ago: “If a correction notice is run and the error is left to stand, then there should be a means of permanently linking the correction notice to the article in question” (Ang, 1999). But the issue persists. And it’s much worse than misspelled names: even made-up quotes by the disgraced New York Times reporter Jayson Blair linger in the ProQuest database without any corrections or editors’ notes attached (Blair, 2000; Blair, 2003; Corrections, 2003; Zwerling, 2007).

But even with these problems, the New York Times is comparatively an “A” student in the corrections world. At least its corrections are retrievable from a newspaper database. More worrying are the corrections that disappear into the morass of online content, because they aren’t preserved externally or catalogued by the news outlet in any consistent way. A study of 15 major newspapers found that eight didn’t have a corrections section on their website. Seven failed to link corrections to the original articles (Weiss, 2011). Because so few news outlets maintain centralized online corrections pages, it is difficult to assess whether they’re making the necessary fixes or addenda to their archived articles.

And the “D” students are those that rarely or never post corrections at all, a phenomenon that by its nature is difficult to study. The fluid nature of online publishing seems to have encouraged such behavior. “Unfortunately, too many papers merely ‘scrub’ the text of the article to eliminate the incorrect information, never advising the reader of the error or the correction,” Craig Silverman (2009) writes in Regret the Error, a critical appraisal of corrections policies. “Scrubbing is, in effect, a cover-up. It’s unprincipled and disingenuous” (p. 234). He also reports that many newspapers publish corrections in their print editions but not on their websites (Silverman, 2009).

These conditions suggest three key questions to consider for the preservation of online news:

  1. How can we make sure that current and future readers see all the corrections that pertain to the articles they read?
  2. How should archives display correction notices?
  3. Should the original text be changed in archives?


When we ask how corrections should be displayed to readers, current and future, it’s important to keep in mind the varying incentives at play. News outlets tend to value the archived stories on their own websites not just because there are established mechanisms for these to bring in revenue but because this — rather than an external archive — is how the outlets reach most readers. By far, the majority of people reading old New York Times stories reach this content either through exploring or, more likely, through Google, standards editor Philip B. Corbett says. “For the vast majority of people out there, that is really the archive of The New York Times these days,” he says.

Henry Fuhrmann, who until December 2015 was the Los Angeles Times’ assistant managing editor and head of the newsroom’s Standards and Practices Committee, defines the paper’s corrections priorities as three-fold: “There’s the reader, first and foremost. That’s where a forthright approach is fundamental. We’re establishing our credibility by saying, ‘Yes, we make errors, we fess up to them, and we tell you as early as we can what happened’ … There’s the source audience as well: ‘We covered you, we made an error, we owe it to you, the source, to correct the record.’ That’s pretty basic, too … The last audience to serve is the current and future staff. Every time we publish, we are adding to the collected record of the L.A. Times going back to December 4, 1881 … So when we are aware that we’ve made an error, it’s incumbent on us to correct because each article is a clipping online or in print that will be used by a future member of the staff or maybe a researcher.”

When it comes to archive accuracy, publishers’ priority is therefore to make sure that errant stories on their own websites somehow alert readers to the corrections that were made. Even in this department, standards vary widely. The New York Times and Los Angeles Times are top performers: when an error (other than a typo) is found in an online article, these outlets add a correction notice to the article and link to the article from their corrections webpages. Other publishers don’t perform so well: A Columbia Journalism Review survey of 665 magazines found that 45 percent correct online factual errors, or errors more substantive than mere typos, without alerting readers (Navasky & Lerner, 2010). As of 2010, a study of 15 major newspapers documented that seven, The Washington Post, Chicago Tribune, USA Today, San Jose Mercury News, Philadelphia Inquirer/Daily News, Denver Post and the (Minneapolis-St. Paul) Star Tribune, failed to link corrections to the original articles. Major born-digital outlets were even less likely to link corrections to articles (Weiss, 2011). Born-digital publications were also less likely to have designated corrections pages on their website and less likely to explicitly tell readers how to report an error (Weiss, 2011).

It is no surprise, then, that news outlets are generally ill-prepared to ensure the accuracy of outsourced archives like LexisNexis. The New York Times pushes a new version of each corrected online story to database vendors, replacing errant versions. The Los Angeles Times does the same, though only Factiva carries its online stories. But a systematic quantitative study would be required to determine more precisely how often each outlet’s corrections are reflected in website and database archives. This lack of information is itself instructive. It shows little change from 2010, when one researcher wrote, “It is unclear whether an online story that contains misinformation will have information corrected in all archived versions or only in later updated versions or both. This lack of a clear policy is troubling when we consider the function of archiving — to capture the present so that future generations might get a clearer picture of the past” (Cornish, 2010).

External archivists like the Internet Archive and the Library of Congress bring additional confusion to this picture. Any such efforts that capture snapshots of website front-ends, rather than receiving feeds from the publishers’ content management systems (CMSs) or back-ends, cannot make claims of completeness. For example, if the Los Angeles Times posts a correction or update on an online story, the Internet Archive will display an outdated version of the article until it happens to crawl that article page again, or until a user manually forces such a crawl.
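The snapshot problem can be made concrete with a minimal sketch. It assumes an archive's capture times and a story's correction time are already known as timestamps; the function name and the sample dates are hypothetical illustrations, not part of any archive's actual tooling.

```python
from datetime import datetime
from typing import Sequence


def archive_is_stale(snapshots: Sequence[datetime], corrected_at: datetime) -> bool:
    """Return True when no snapshot was captured at or after the correction.

    Hypothetical helper: a front-end crawler only sees a correction if it
    happens to capture the page again once the correction has been posted.
    """
    return not any(taken >= corrected_at for taken in snapshots)


# Hypothetical capture dates for one article page.
snapshots = [datetime(2016, 1, 1), datetime(2016, 3, 1)]

# A correction posted after the last capture is invisible in the archive.
print(archive_is_stale(snapshots, datetime(2016, 4, 1)))

# A correction posted before a later capture has been picked up.
print(archive_is_stale(snapshots, datetime(2016, 2, 1)))
```

A feed from the publisher's CMS would not need this check at all, because each correction would arrive as its own event rather than depending on crawl timing.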

We must ask, then, how news organizations can be encouraged or enabled to improve corrections practices, for both articles in their internal archives and those pushed to database vendors. Establishing principles and brainstorming solutions now could also help ensure that memory institutions and technologists consider the role of corrections in their future initiatives.


One of the hurdles to clear display of corrections in archives is the lack of a standardized, structured format across the industry. Los Angeles Times data editor Ben Welsh compares this to the structured data that enables the archiving of photographs. Industry standard metadata allows every digital photograph to carry information about time and date, location, the type of camera used and so on. Standardized fields make it easier for photography archives to catalogue pictures from a variety of sources and for users to specify search parameters. Of online news, Welsh says, “We just totally failed to follow that example.” The myriad CMSs in use by publishers, not to mention the various news databases, structure the same types of data in many different ways.

One way to solve this would be to leverage commercial pressures. Welsh notes that publishers do employ standardized metadata on their stories to enable Google to more easily crawl and index those articles. This system gives publishers access to millions of pageviews, via Google search and Google News results. So Google or even Facebook could similarly create a structured standard for how publishers should represent corrections in their HTML coding. This same data structure could be used in publishers’ CMSs and also extended to external databases. And it could even be used to more easily get corrections in front of the original articles’ readers, if Google or a social media company developed that functionality. For example, Facebook could track if a user clicked on an article link, and then alert that reader if the article is corrected.
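To illustrate what such a structured standard might look like, here is a minimal sketch. Every field name, the example URL and the reader identifiers are assumptions made for illustration; no existing industry standard or platform feature is being described. It shows one record shape that a CMS, a database vendor and a platform could all share, plus the kind of alert logic a platform might build on top of it.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Correction:
    published_at: str   # ISO 8601 timestamp of the correction notice
    notice: str         # text of the notice shown to readers
    text_changed: bool  # True if the article body was also edited


@dataclass
class Article:
    url: str
    headline: str
    first_published_at: str
    corrections: List[Correction] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the article and its corrections in one shared shape."""
        return json.dumps(asdict(self), indent=2)


def readers_to_alert(article: Article, clicks: Dict[str, str]) -> List[str]:
    """Return readers whose click predates the most recent correction."""
    if not article.corrections:
        return []
    latest = max(datetime.fromisoformat(c.published_at) for c in article.corrections)
    return sorted(
        reader
        for reader, clicked_at in clicks.items()
        if datetime.fromisoformat(clicked_at) < latest
    )


# Hypothetical article with one correction attached.
article = Article(
    url="https://example.com/news/story",
    headline="Example headline",
    first_published_at="2016-05-02T09:00:00+00:00",
    corrections=[
        Correction(
            published_at="2016-05-03T14:30:00+00:00",
            notice="An earlier version of this article misspelled a name.",
            text_changed=True,
        )
    ],
)

# Hypothetical click log: reader id mapped to the time of the click.
clicks = {
    "reader-a": "2016-05-02T10:00:00+00:00",
    "reader-b": "2016-05-04T08:00:00+00:00",
}

print(article.to_json())
print(readers_to_alert(article, clicks))
```

The `text_changed` flag is included because publishers differ on whether to edit the original text; a shared format would need to record which approach was taken rather than assume one.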

Accuracy vs. purity

One more question remains for the display of corrections in archives, one on which publishers might never agree. The New York Times and Los Angeles Times illustrate this key difference. The New York Times changes the original text so it is no longer in error. But the Los Angeles Times chooses to leave the original text intact, and rely on the correction notice alone to steer readers away from misperceptions.

When this debate began, one key argument against text changes was the idea that the text as published was “sacrosanct” (Thompson, 2004). As databases became more popular, librarians began to realize the drawbacks of this approach. For example, Los Angeles Times archivists found that searches wouldn’t return results with misspelled names. But librarians feared that if they started fixing even small errors, lines would blur and staff would be faced with making ever more substantive changes (Thompson, 2004).

Proponents for changing errant text countered that a prime purpose of electronic archives is to disseminate the truth. Every time someone views a mistake in the archive, they run the risk of being misinformed. And if the correction is not sufficiently prominent, readers may miss it. Such was the case with a 2004 Washington Post article, which incorrectly reported that then-U.S. senator Mark Dayton influenced coverage at the St. Paul Pioneer Press. At that time the Post added corrections in a separate box “below the fold,” on the right side of the screen, where the notice could easily be overlooked. And indeed it was: the story was repeated without correction in publications including the Omaha World-Herald and the Drudge Report (Thompson, 2004).

This issue becomes even more complex when it comes to articles that are riddled with errors or wholly fabricated. Back in 2007, a ProQuest search turned up stories by the infamous New York Times plagiarist and fantasist Jayson Blair without any of the editors’ notes that called out the falsehoods (Zwerling, 2007). I was similarly able to find un-appended Blair stories on ProQuest in December 2016 (Blair, 2000; Blair, 2003; Corrections, 2003). In one, Blair misidentified the Air Force base where a police chief served. In another, he attributed a quote to a Queens College professor, who later told the Times he did not utter those words. For such cases, simply appending the correction would be a much-needed start. Some would argue that archives should go further and fix errors in the text. But if the lesson to be drawn from the Blair episode and similar scandals is actually the mistakes themselves, does that mean that the errors should be saved for posterity?

It’s possible this debate will never be resolved. So archives must continue to respect publications’ difference of opinion while still making sure that readers see the corrections they need.


All references cited will be available in the final PDF download.

About the author

Tamar Wilner is a freelance journalist, researcher and master’s student in the journalism program at the University of Missouri. She specializes in writing about the evolving news media, online misinformation and fact-checking for outlets including the Columbia Journalism Review, and acts as an occasional consultant for media-focused organizations including the American Press Institute and the Columbia Journalism Review. Her academic research interests include science and health journalism, misinformation effects and audience notions of trust and authority. You can find her on Twitter at @tamarwilner.

Wilner completed this white paper in fulfillment of a travel scholarship to attend the 2016 Dodging the Memory Hole Summit, a conference on the preservation of online news, held Oct. 13–14, 2016 at the University of California Los Angeles and arranged by the Donald W. Reynolds Journalism Institute at the University of Missouri. The scholarship was funded by a Laura Bush 21st Century Librarian Program grant (no. RE-33-16-0107-16) from the Institute of Museum and Library Services.

