Internet Archive: Smart 404 Handler

Continuing our series on the Internet Archive: you may recall the last part, on the Wayback Machine. Today we’re going to go over one more extension of its power, one big enough to get its own part: the smart 404 handler!

If you have your own site, you have probably deleted a page or post by now. Whether you did it on purpose or by accident, you can control what’s on your site, but you can’t control the links to it that already exist elsewhere. When those links point to your deleted content, the visitor sees a boring 404 Not Found error. But what if I told you that you could use the Wayback Machine to offer those disappointed visitors a glimpse of the content they missed?

A few years back, the folks behind the Internet Archive debuted their smart 404 handler for free use, and it still works great! Simply add the code from the earlier link to your custom 404 page, and it will work like magic. If you have a WordPress site, your theme more than likely has a 404.php file. If you have not already, now is a great time to start a child theme, so your changes aren’t lost if the theme is updated. Now, simply find where the content of your 404 page ends in the 404.php file (the content you visibly see on the page, not the entirety of the code), and add the code right below it. Here’s how the relevant section looks in the Sorbet theme’s 404.php file:


You won’t see anything if the former page or post wasn’t archived by the Wayback Machine. If you did it right though, and you land on one that has been archived, you’ll find a welcoming message with a link to the most recently archived version. It will look something like this:


Thanks to the smart 404 handler and the more than 462 billion pages archived in the Wayback Machine, the experience of missing out on lost content could be a thing of the past.
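For the curious, the question the handler has to answer ("is there an archived copy of this URL?") can be sketched in a few lines of Python using the Wayback Machine's public availability endpoint. This is only an illustration, not the handler's actual code: the function names are made up for this example, the sample response is hand-written in the shape that endpoint returns, and nothing here touches the network.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url):
    """Build the request URL that asks whether page_url has been archived."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(response_text):
    """Return the archived URL from a JSON response, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hand-written sample response (hypothetical page and timestamp).
sample_response = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150101000000/http://example.com/old-post",
            "timestamp": "20150101000000",
            "status": "200",
        }
    }
})

print(availability_query("example.com/old-post"))
print(closest_snapshot(sample_response))
```

When the page was never archived, the response has no closest snapshot and the function returns None, which matches the handler showing nothing at all.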

This brings us to the end of our series on the Internet Archive, for now. If you enjoyed your brief tour, don’t forget that they need donations to be able to provide all this for free. Until next time, enjoy everything the Internet Archive has to offer!

Internet Archive: Wayback Machine

Continuing in our series on the Internet Archive, we have the one thing it may be best known for, the Wayback Machine! There are over 462 billion web pages saved in the Wayback Machine, which leads to some powerful options.

The Wayback Machine is named for the WABAC time machine from the Peabody’s Improbable History segment of The Rocky and Bullwinkle Show, and almost everyone has played around with its most basic use as a time machine for the web. Want to know what a favorite site looked like in 2003? No problem, the Wayback Machine has it. How about another site in 1997, or a third in 1998? The Wayback Machine will be hours of fun if that’s what you’re looking for, but what else does it offer?
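If you would rather jump straight to a particular year than click through the calendar, archived pages live at addresses of the form web.archive.org/web/TIMESTAMP/URL, and the Wayback Machine redirects to the capture closest to the timestamp you give. Here is a small sketch that builds such an address; the helper name and the example.com address are placeholders for illustration.

```python
def wayback_url(page_url, timestamp):
    """Build a Wayback Machine address for page_url near the given timestamp.

    timestamp is a string of 4 to 14 digits (YYYY up to YYYYMMDDhhmmss).
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. '2003'")
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


# Placeholder site; swap in whichever address you are curious about.
print(wayback_url("http://example.com/", "2003"))
print(wayback_url("http://example.com/", "19971231"))
```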

The power of the Wayback Machine is in what it stores: everything. The entire source of the page, along with any available media, is stored. Your first thought might be, “I’d better block that immediately!” Don’t. No one is going to purposefully visit your site through the Wayback Machine instead of just visiting it normally; that would be silly. Allow your site to be archived for history. There’s no reason not to.

So, what does this “everything” get you? Quite a bit, actually. Ever wonder what would happen to your site if you found out your backups were bad? The Wayback Machine is here for you to copy and paste whatever text you need, and to re-upload any media it was able to archive. Does something seem odd on your site lately, something you can’t quite identify? Instead of fully restoring an old backup, compare your site to last month’s archive on the Wayback Machine. If you can identify what’s different, you can even view the source like you would on any normal web page to dig into the deep details.
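If eyeballing two copies of a page sounds tedious, the comparison can be sketched with Python's standard difflib module. This assumes you have already saved the archived source and the current source as text; the two tiny HTML strings below are invented stand-ins, and the function name is just for this example.

```python
import difflib


def source_diff(archived_html, current_html):
    """Return unified-diff lines between an archived page and the current one."""
    return list(difflib.unified_diff(
        archived_html.splitlines(),
        current_html.splitlines(),
        fromfile="archived",
        tofile="current",
        lineterm="",
    ))


# Invented stand-ins for last month's archived source and today's source.
archived = "<h1>Hi</h1>\n<p>old</p>"
current = "<h1>Hi</h1>\n<p>new</p>"

for line in source_diff(archived, current):
    print(line)
```

Lines starting with a minus sign exist only in the archived copy, and lines starting with a plus sign exist only in the current one. Keep in mind that archived pages are served with the Wayback Machine's own toolbar and rewritten links, so expect some noise in a real comparison.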

As a true story of its power, we use the Wayback Machine almost every day in Jetpack support. When you connect Jetpack with your blog, it ties everything to your blog’s URL, and assigns that URL a unique blog ID. If you’re running the Stats module, you can find that ID in the source output towards the bottom. Just look in the source for “blog:'number'” and that number is the blog ID. Sometimes people move their blog to a new domain, and Jetpack will get confused and think it’s a new site (we’re working on ways to improve that). If we can find the old site in the Wayback Machine, we can find the old blog ID in the source, and then we can fix everything.
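Hunting for that ID by hand works, but the search is easy to sketch with a regular expression. The pattern below assumes the ID appears exactly as described above, as blog:'number' with the number in single quotes; the sample snippet and its ID are made up, and real stats output may be formatted differently.

```python
import re

# Matches blog:'12345' (optional spaces around the colon) and captures the digits.
BLOG_ID_PATTERN = re.compile(r"\bblog\s*:\s*'(\d+)'")


def find_blog_id(page_source):
    """Return the first blog ID found in page_source, or None if there is none."""
    match = BLOG_ID_PATTERN.search(page_source)
    return match.group(1) if match else None


# Made-up snippet standing in for the bottom of an archived page's source.
sample_source = "<script>stats = {blog:'12345678',post:'0'};</script>"

print(find_blog_id(sample_source))
print(find_blog_id("<p>No stats here</p>"))
```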

The Wayback Machine has a lot to offer, and you only need to start digging to get a good grasp of just how much there is. Storing so much data isn’t cheap though, and the Internet Archive needs your donations to keep it running. Dive into history with the Wayback Machine and see what you can uncover! Next time? Smart 404 handler!

Internet Archive: Software

Continuing in our series on the Internet Archive, we have one of their newest sections, Software. This is definitely something your local library doesn’t have … unless it has an arcade, I suppose.

The Internet Archive has over 100,000 software titles, including older abandoned applications and educational titles, but the real stars are the Internet Arcade and Console Living Room. Between the Internet Arcade and Console Living Room, you’ll find over 2,000 older arcade and console titles that you can play right in your browser!

From Defender to an Asteroids clone, you’ll be able to experience your favorite arcade classics without spending a single quarter. Looking for more nostalgic hits from your DOS or console gaming days? From Atari’s Star Wars to Rebel Assault, from Oregon Trail to Dune 2, from Super Street Fighter 2 to Ultimate Mortal Kombat 3, you’ll find it on the Internet Archive, ready for you to relive your past in your browser for free.

Take some time to explore what’s available there. You’re sure to bring back some great memories, and don’t forget, they take donations to keep everything freely available. Next time? The Wayback Machine!

Internet Archive: Free Media

Continuing our series on the Internet Archive, I figured we’d start with the obvious bits. At the Internet Archive, you have access to a wide variety of public domain or owner-donated texts, audio, videos, and photos. That’s right, it’s just like a library online, because that’s exactly what it is!

There are over 8 million texts available to browse or download in eBook formats on the Internet Archive, anything from textbooks for higher education to US government studies into UFO sightings. If what you’re looking for isn’t freely available for download at the Internet Archive, stop by its side project, the Open Library, for even more titles available to check out in eBook formats.

There are almost 3 million audio files available to stream or download, including voice recordings, radio shows, music, whole albums, audio books, and almost 200,000 full live concerts. You’ll never need to buy an album or pay for a streaming music service again, unless you want to hear recently released music, of course.

There are just over 2 million videos available to stream or download, including movies and television. If you’re feeling nostalgic, stop by the Prelinger Archives for over 6,000 public service announcements and educational films, or perhaps browse almost 1 million TV news clips.

There are over 1 million images available to browse and download. From NASA Images to 16th-century artwork, it’s your history stored digitally in so many ways.

Take some time to tour the Internet Archive and see what you can find, and don’t forget, they take donations to keep everything freely available. Next time? Software!

The Internet Archive

The Internet Archive has been one of my favorite sites for quite a few years, and many of its hidden powers are not that obvious at first glance, so I figured I might as well write up a few posts. With that said, this is part 1 of a 5-part series.

What is the Internet Archive? According to their about page:

The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library. Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format. … Now the Internet Archive includes texts, audio, moving images, and software, as well as archived web pages in our collections.

That sure is a lot to take in, and from a non-profit organization too! In short, the Internet Archive is dedicated to digitally preserving both our physical and digital history, and making it all freely available to the world, in many ways more than any other library or museum out there. Besides its core offerings, the Internet Archive has a large number of fascinating connected projects, including Open Library and, most recently, the Political TV Ad Archive, which collects this year’s political ads and reports some fascinating data.

There sure is a lot going on at the Internet Archive, and that’s why they’re always open to donations. There are quite a few things I couldn’t do without the Internet Archive, and over the next few days (weeks?) I hope to share some of the far less obvious ones with you. For now, browse around it and see what you can discover!

The Revisionaries of Wikipedia

Wikipedia has been with us for fourteen years, and I’m willing to bet that everyone has made use of it at least once. Perhaps some of you have even contributed content or editorial help to Wikipedia. It is, after all, the encyclopedia editable by everyone, right?

Over the past eleven years, a group of core editors has been working behind the scenes, choosing which edits live or die while handing out lifetime bans for edits they consider to be not factual (regardless of evidence). Recently, they chose to ban editors defending articles from vandalism, rather than ban the vandals themselves. Is this a reaction to the impossibility of policing an encyclopedia which is editable by the entire world? Of course, but then why advertise it as such?

The whole situation reminds me of The Revisionaries, a documentary detailing how a small group with clear biases has commanding control over exactly what and how history is portrayed in our textbooks. If you have not seen it yet, I recommend it, as it may also speak towards a grim future for Wikipedia, given recent events.

An encyclopedia editable by the entire world needs to be policed somehow, but when deciding the knowledge which is passed down, how do we trust the right choices are being made? After all, they even discard corrections from scholars accompanied by published evidence. Does this mean that established yet incorrect “facts” may never be corrected in the eyes of Wikipedia? Perhaps the problem is that too many people think of facts as a matter of opinion, not as a result of evidence.

Where do we go from here? I believe The Internet Archive is the answer. The Internet Archive does not strive to make history editable by the masses, nor does it make rulings on fact vs. fiction. The Internet Archive simply exists to preserve as much history as it can for as long as it can. If you see a book written by one man preserved in The Internet Archive, you can trust that it is the opinion of that one man. If the evidence in that book holds up, you know that opinion is indeed a fact. How much of a Wikipedia article is fact? We may never know. Everyone has their hands in Wikipedia, with one possibly biased group holding the power of final judgement.

I guess I see it as a choice between what’s more important for the future of history itself. Should it be the ability to preserve history forever, or the ability to edit history whenever we want to?