Recovering Broken Web Links and Resources with Archive.org

Zachary Fruhling May 11, 2018

Using external web resources, such as online articles, files, and websites, in your online courses is a powerful way to provide a world of information and resources to your students at minimal cost. External web resources, however, come with certain user-experience risks: web resources are sometimes removed or relocated, resulting in broken web links and missing required materials for students.

Sometimes it is easy to find the new location and URL (web address) of a resource that has merely been relocated rather than removed from the internet. Other times, however, externally hosted articles or files are removed entirely, with no obvious replacement resource or web location. For instructional designers who are responsible for the maintenance of online courses, this presents a practical problem: keeping online courses up to date and functional as external web resources are removed or relocated over time.

Thankfully, there is a way to recover many seemingly lost external web resources with the help of the Internet Archive (archive.org). The Internet Archive has been actively archiving websites and externally hosted file-based resources (along with its many other archive projects) since 1996. The Internet Archive uses a web crawler bot that scans and archives websites and web resources, often resulting in many different archived versions of the same website over time.

If you have the URL for a broken web link, article, or file (such as an externally hosted PDF file in an online course), you can often recover the resource using the Internet Archive’s Wayback Machine (the title of which is a clever nod to the WABAC machine time-travelling device from The Rocky and Bullwinkle Show). Simply navigate to the Wayback Machine under the “Web” section of the Internet Archive’s website here: https://archive.org/web/. Once there, enter the URL for the resource you are trying to recover into the search field. If the Internet Archive has one or more archived scans of that URL available for you, you will be presented with a calendar showing the dates of available archived scans of that URL that you can browse and view as you see fit.

(For example, see an archive of the White House website from March 1, 2000 here: The White House Website, March 1, 2000.)
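If you have many broken links to check, the same lookup can be scripted. The sketch below assumes the Wayback Machine's public availability endpoint at https://archive.org/wayback/available, which returns JSON describing the closest archived snapshot of a URL. The function names are my own and purely illustrative, and the response layout handled here (an "archived_snapshots" object containing a "closest" entry) should be checked against the current API documentation before you rely on it.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_query(url, timestamp=None):
    """Build the request URL that asks whether `url` has been archived.

    `timestamp` is optional and uses the form YYYYMMDD (or longer);
    when given, the API looks for the snapshot closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return the archived URL from a decoded API response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(url, timestamp=None):
    """Query the API over the network and return the snapshot URL, if any."""
    query = build_availability_query(url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling find_snapshot with a broken link from your course returns either the address of an archived copy or None if nothing usable was found, so a short loop over a list of course links can flag which ones are recoverable.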

Once you have located the archived version of the resource you are trying to recover, it is easy to add the resource into your online course to replace broken web links. One option is to replace a broken web link with a link to the URL for the archived version of the resource hosted on the Internet Archive website. To do this, simply view one of the archived versions of the resource, copy the URL for the archived version from your internet browser’s address bar, and paste the URL for the archived version into your online course as a web link.
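The archived URLs you copy from the address bar follow a recognizable pattern: the Wayback Machine prefix, a timestamp, and then the original URL. As a minimal sketch, assuming that pattern and assuming that a date-only timestamp is resolved by the Wayback Machine to the nearest available capture, a replacement link can be assembled like this (wayback_link is a hypothetical helper name, not part of any Internet Archive tool):

```python
from datetime import date

# Assumed prefix shared by archived-page URLs on the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_link(original_url, snapshot_date):
    """Compose an archived-version link for `original_url` near `snapshot_date`."""
    stamp = snapshot_date.strftime("%Y%m%d")  # e.g. 20000301
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Example: a link to a capture of a course reading near May 11, 2018.
# The reading's address here is a placeholder, not a real resource.
link = wayback_link("http://example.com/reading.html", date(2018, 5, 11))
```

It is still worth opening each generated link once before pasting it into a course, since the nearest capture may be from a different date than the one requested.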

Another option for recovering file-based resources (such as an externally hosted PDF, DOC, or PPT file) is to follow the steps above to locate and call up an archived version of the file, but instead of linking to the URL for the archived version of the file, download the file, then upload it directly to the Learning Management System (LMS) hosting your course. This has the advantage of making your online course even less dependent on external web resources, although it should be noted that the URLs for archived versions of web resources at the Internet Archive are generally more reliable, and less likely to change or to be taken down, than the URLs for the non-archived versions of many web resources and files.
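For courses with several lost files, the download step can also be scripted before the manual upload to the LMS. The following is only a sketch: it assumes the archived URL points directly at the file (for example, a PDF), and the helper names and the fallback filename are my own choices for illustration.

```python
import os
import posixpath
import shutil
import urllib.parse
import urllib.request


def local_filename(archived_url, default="download.bin"):
    """Derive a filename from the last path segment of the archived URL."""
    path = urllib.parse.urlparse(archived_url).path
    name = posixpath.basename(path)
    return name or default  # fall back when the URL ends in a slash


def download_archived_file(archived_url, dest_dir="."):
    """Save the archived file into `dest_dir` and return the local path."""
    target = os.path.join(dest_dir, local_filename(archived_url))
    with urllib.request.urlopen(archived_url, timeout=60) as resp, \
            open(target, "wb") as out:
        shutil.copyfileobj(resp, out)  # stream to disk without loading it all
    return target
```

The saved file can then be uploaded to the LMS in the usual way. Before reusing a recovered file, check that you are permitted to host your own copy of it.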

It is worth noting that you can also submit a URL to the Internet Archive for archiving, or request a new scan of a web resource that has been previously archived. For any externally hosted resources that you are worried about losing public access to, it is wise to submit the URL for the resource to the Wayback Machine for safekeeping. This can likewise be done under the “Web” section of the Internet Archive here: https://archive.org/web/.
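Submitting URLs can be scripted as well. This sketch assumes the Wayback Machine's "Save Page Now" address form, in which the URL to be archived is appended to https://web.archive.org/save/. The function names are hypothetical, and automated requests may be rate-limited or may require a logged-in account, so treat this as a starting point rather than a guaranteed workflow.

```python
import urllib.request

# Assumed prefix for the Wayback Machine's "Save Page Now" feature.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_request_url(url):
    """Return the address that asks the Wayback Machine to archive `url`."""
    return SAVE_PREFIX + url


def request_archive(url):
    """Send the save request over the network and return the HTTP status code."""
    with urllib.request.urlopen(save_request_url(url), timeout=120) as resp:
        return resp.status
```

Running request_archive over the list of external links in a course, with a pause between requests, is one way to make sure each resource has at least one recent archived copy.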

While the Internet Archive’s Wayback Machine serves a useful historical and sociological function, it also serves a practical role in maintaining and repairing online courses that depend on external web resources, thus making the Internet Archive’s Wayback Machine an essential tool in my day-to-day instructional design toolbox.


Zachary Fruhling is an instructional designer, online educational content author and developer, educational technologist, philosophy instructor, poet, and podcaster with nearly 20 years of experience in higher education and educational content development. See Zachary's website at www.zacharyfruhling.com.
