Too much has been lost already. The glue that holds humanity’s knowledge together is coming undone.
Sixty years ago, the futurist Arthur C. Clarke observed that any sufficiently advanced technology is indistinguishable from magic. The internet—how we both communicate with one another and together preserve the intellectual products of human civilization—fits Clarke’s observation well. In Steve Jobs’s words, “it just works,” as readily as clicking, tapping, or speaking. And every bit as much aligned with the vicissitudes of magic, when the internet doesn’t work, the reasons are typically so arcane that explanations for it are about as useful as trying to pick apart a failed spell.
Underpinning our vast and simple-seeming digital networks are technologies that, if they hadn’t already been invented, probably wouldn’t unfold the same way again. They are artifacts of a very particular circumstance, and it’s unlikely that in an alternate timeline they would have been designed the same way.
The internet’s distinct architecture arose from a distinct constraint and distinct freedom: First, its academically minded designers didn’t have or expect to raise massive amounts of capital to build the network; and second, they didn’t want or expect to make money from their invention.
The internet’s framers thus had no money to simply roll out a uniform centralized network the way that, for example, FedEx metabolized a capital outlay of tens of millions of dollars to deploy liveried planes, trucks, people, and drop-off boxes, creating a single point-to-point delivery system. Instead, they settled on the equivalent of rules for how to bolt existing networks together.
Rather than a single centralized network modeled after the legacy telephone system, operated by a government or a few massive utilities, the internet was designed to let any device anywhere interoperate with any other device, so that any provider could bring whatever networking capacity it had to the growing party. And because the network’s creators did not mean to monetize, much less monopolize, any of it, the key was for desirable content to be provided naturally by the network’s users, some of whom would act as content producers or hosts, setting up watering holes for others to frequent.
Unlike the briefly ascendant proprietary networks such as CompuServe, AOL, and Prodigy, content and network would be separated. Indeed, the internet had and has no main menu, no CEO, no public stock offering, no formal organization at all. There are only engineers who meet every so often to refine its suggested communications protocols that hardware and software makers, and network builders, are then free to take up as they please.
So, the internet was a recipe for mortar, with an invitation for anyone, and everyone, to bring their own bricks. Tim Berners-Lee took up the invite and invented the protocols for the World Wide Web, an application to run on the internet. If your computer spoke “web” by running a browser, then it could speak with servers that also spoke web, naturally enough known as websites. Pages on sites could contain links to all sorts of things that would, by definition, be but a click away, and might in practice be found at servers anywhere else in the world, hosted by people or organizations not only not affiliated with the linking webpage, but entirely unaware of its existence. And webpages themselves might be assembled from multiple sources before they displayed as a single unit, facilitating the rise of ad networks that could be called on by websites to insert surveillance beacons and ads on the fly, as pages were pulled together at the moment someone sought to view them.
And like the internet’s own designers, Berners-Lee gave away his protocols to the world for free—enabling a design that omitted any form of centralized management or control, since there was no usage to track by a World Wide Web, Inc., for the purposes of billing. The web, like the internet, is a collective hallucination, a set of independent efforts united by common technological protocols to appear as a seamless, magical whole.
This absence of central control, or even easy central monitoring, has long been celebrated as an instrument of grassroots democracy and freedom. It’s not trivial to censor a network as organic and decentralized as the internet. But more recently, these features have been understood to facilitate vectors for individual harassment and societal destabilization, with no easy gating points through which to remove or label malicious work not under the umbrellas of the major social media platforms or to quickly identify their sources. While both assessments have power to them, they each gloss over a key feature of the distributed web and internet: Their designs naturally create gaps of responsibility for maintaining valuable content that others rely on. Links work seamlessly until they don’t. And as tangible counterparts to online work fade, these gaps represent actual holes in humanity’s knowledge.
Before today’s internet, the primary way to preserve something for the ages was to consign it to writing (first on stone, then parchment, then papyrus, then 20-pound acid-free paper, then a tape drive, floppy disk, or hard-drive platter) and store the result in a temple or library: a building designed to guard it against rot, theft, war, and natural disaster. This approach has facilitated the preservation of some material for thousands of years. Ideally, there would be multiple identical copies stored in multiple libraries, so the failure of one storehouse wouldn’t extinguish the knowledge within. And in rare instances in which a document was surreptitiously altered, it could be compared against copies elsewhere to detect and correct the change.
These buildings didn’t run themselves, and they weren’t mere warehouses. They were staffed with clergy and then librarians, who fostered a culture of preservation and its many elaborate practices, so precious documents would be both safeguarded and made accessible at scale—certainly physically, and, as important, through careful indexing, so an inquiring mind could be paired with whatever a library had that might slake that thirst. (As Jorge Luis Borges pointed out, a library without an index becomes paradoxically less informative as it grows.)
At the dawn of the internet age, 25 years ago, it seemed the internet would make for immense improvements to, and perhaps some relief from, these stewards’ long work. The quirkiness of the internet and web’s design was the apotheosis of ensuring that the perfect would not be the enemy of the good. Instead of a careful system of designation of “important” knowledge distinct from day-to-day mush, and importation of that knowledge into the institutions and cultures of permanent preservation and access (libraries), there was just the infinitely variegated web, with canonical reference websites like those for academic papers and newspaper articles juxtaposed with PDFs, blogs, and social-media posts hosted here and there.
Enterprising students designed web crawlers to automatically follow and record every single link they could find, and then follow every link at the end of that link, and then build a concordance that would allow people to search across a seamless whole, creating search engines returning the top 10 hits for a word or phrase among, today, more than 100 trillion possible pages. As Google puts it, “The web is like an ever-growing library with billions of books and no central filing system.”
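To make the crawling idea concrete, here is a minimal sketch, not how any real search engine works. It assumes a hypothetical toy "web" held in memory (the `TOY_WEB` dictionary and its example URLs are invented for illustration), follows every link it finds, and builds a concordance that maps each word to the pages containing it.

```python
from collections import defaultdict, deque

# Hypothetical toy "web": URL -> (page text, outgoing links). No network access.
TOY_WEB = {
    "a.example/home": ("library of books", ["b.example/post", "c.example/paper"]),
    "b.example/post": ("books and links", ["a.example/home"]),
    "c.example/paper": ("links decay", ["d.example/missing"]),
}

def crawl(start, web):
    """Follow every link reachable from start, breadth-first, indexing each word."""
    index = defaultdict(set)  # the concordance: word -> set of URLs containing it
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url not in web:  # a dead link: there is nothing left to record
            continue
        text, links = web[url]
        for word in text.split():
            index[word].add(url)
        for link in links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example/home", TOY_WEB)
print(sorted(index["books"]))
```

Note that the crawler silently skips `d.example/missing`: a page that has vanished leaves no trace in the index, only in the pages that still point to it.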
Now, I just quoted from Google’s corporate website, and I used a hyperlink so you can see my source. Sourcing is the glue that holds humanity’s knowledge together. It’s what allows you to learn more about what’s only briefly mentioned in an article like this one, and for others to double-check the facts as I represent them to be. The link I used points to https://www.google.com/search/howsearchworks/crawling-indexing/. Suppose Google were to change what’s on that page, or to reorganize its website and eliminate the page entirely, anytime between when I’m writing this article and when you’re reading it. Changing what’s there would be an example of content drift; eliminating it entirely is known as link rot.
It turns out that link rot and content drift are endemic to the web, which is both unsurprising and shockingly risky for a library that has “billions of books and no central filing system.” Imagine if libraries didn’t exist and there was only a “sharing economy” for physical books: People could register what books they happened to have at home, and then others who wanted them could visit and peruse them. It’s no surprise that such a system could fall out of date, with books no longer where they were advertised to be—especially if someone reported a book being in someone else’s home in 2015, and then an interested reader saw that 2015 report in 2021 and tried to visit the original home mentioned as holding it. That’s what we have right now on the web.
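The difference between the two failure modes can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `fingerprint` and `classify_link` are hypothetical, the page content is passed in as a string rather than fetched, and a real checker would have to cope with redirects, error pages, and trivial changes such as updated timestamps.

```python
import hashlib

def fingerprint(content: str) -> str:
    """Return a SHA-256 digest of the page content as it stood when cited."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def classify_link(cited_fingerprint, current_content):
    """Compare what a link returns now with what it returned when it was cited.

    current_content is None when the page can no longer be retrieved at all.
    """
    if current_content is None:
        return "link rot"        # the page is gone
    if fingerprint(current_content) != cited_fingerprint:
        return "content drift"   # the page exists but says something else
    return "intact"

# Hypothetical example: a page captured at the moment of citation.
cited = fingerprint("The web is like an ever-growing library.")
print(classify_link(cited, "The web is like an ever-growing library."))
print(classify_link(cited, "Welcome to our redesigned site."))
print(classify_link(cited, None))
```

The sketch only works if someone recorded a fingerprint, or better a full copy, when the link was first made, which is precisely the responsibility the web leaves unassigned.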
Whether humble home or massive government edifice, hosts of content can and do fail. For example, President Barack Obama signed the Affordable Care Act in the spring of 2010. In the fall of 2013, congressional Republicans shut down day-to-day government funding in an attempt to kill Obamacare. Federal agencies, obliged to cease all but essential activities, pulled the plug on websites across the U.S. government, including access to thousands, perhaps millions, of official government documents, both current and archived, and of course very few having anything to do with Obamacare. As night follows day, every single link pointing to the affected documents and sites no longer worked.
In 2010, Justice Samuel Alito wrote a concurring opinion in a case before the Supreme Court, and his opinion linked to a website as part of the explanation of his reasoning. Shortly after the opinion was released, anyone following the link wouldn’t see whatever it was Alito had in mind when writing the opinion. Instead, they would find this message: “Aren’t you glad you didn’t cite to this webpage … If you had like Justice Alito did, the original content would have long since disappeared and someone else might have come along and purchased the domain in order to make a comment about the transience of linked information in the internet age.”
Inspired by cases like these, some colleagues and I joined those investigating the extent of link rot in 2014 and again this past spring.
The first study, with Kendra Albert and Larry Lessig, focused on documents meant to endure indefinitely: links within scholarly papers, as found in the Harvard Law Review, and judicial opinions of the Supreme Court. We found that 50 percent of the links embedded in Court opinions since 1996, when the first hyperlink was used, no longer worked. And 75 percent of the links in the Harvard Law Review no longer worked.
People tend to overlook the decay of the modern web, when in fact these numbers are extraordinary—they represent a comprehensive breakdown in the chain of custody for facts. Libraries exist, and they still have books in them, but they aren’t stewarding a huge percentage of the information that people are linking to, including within formal, legal documents. No one is. The flexibility of the web—the very feature that makes it work, that had it eclipse CompuServe and other centrally organized networks—diffuses responsibility for this core societal function.
Society can’t understand itself if it can’t be honest with itself, and it can’t be honest with itself if it can only live in the present moment. It’s long overdue to affirm and enact the policies and technologies that will let us see where we’ve been, including and especially where we’ve erred, so we might have a coherent sense of where we are and where we want to go.