Investigators often ask: what if I could go back in time and see what a website looked like a week ago, a year ago, or even three years ago? The Internet Archive provides just such a capability. Inspired by the fictional WABAC machine from The Rocky and Bullwinkle Show, which Mr. Peabody used to travel to past eras, the Internet Archive's “Wayback Machine” goes back in time on the Internet. Although it can be used in that capacity, it is critical to be aware of its limitations as well. In fact, there is active legal discussion about its admissibility as evidence in court. See Onward Multi-Corp. Inc. v. Empire Comfort Systems, Inc., 2010 TMOB 29 (CanLII) http://canlii.ca/t/291b4 or this discussion for US-based case law or this paper.
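The "a week ago, a year ago, three years ago" question can also be asked programmatically. The sketch below assumes the Wayback Machine's public availability endpoint at archive.org/wayback/available, which takes a `url` and a `timestamp` and returns the closest snapshot as JSON. The function names are illustrative, not part of any official client, and `example.com` is a placeholder address.

```python
import json
from datetime import date
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, when):
    """Build the query URL asking for the snapshot closest to the date `when`."""
    params = {"url": page_url, "timestamp": when.strftime("%Y%m%d")}
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (timestamp, archived_url) from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None  # nothing archived for this address
    return closest["timestamp"], closest["url"]


def lookup(page_url, when):
    """Run the live request; needs network access."""
    with urlopen(availability_query(page_url, when), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `lookup("example.com", date(2012, 1, 1))` would return the snapshot nearest that date, which may be weeks or months away from it. Record the returned timestamp rather than the date you asked for.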
Over 240 billion web pages have been archived, from 1996 to the present. Popular websites are archived more frequently than others, and a website needs to be at least 6 months old before the Wayback Machine picks it up. The version on record is updated at least once every 6 months. Below is a survey of how useful it is for investigations across popular websites, social media, and e-commerce:
The most popular social networking site out there shuts the door on the Internet Archive and many other search crawlers through its robots.txt file. Google provides a limited view of Facebook users, but it can’t penetrate into the network because Facebook imposes similar restrictions on Google's web crawler.
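To see how a robots.txt file can shut out one crawler while admitting another, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content is a made-up sample, not Facebook's actual file, and `ExampleSearchBot` is a hypothetical crawler name. `ia_archiver` is assumed here to be the user agent the Internet Archive's crawler has historically answered to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler gets partial access,
# every other crawler is refused entirely.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleSearchBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def crawler_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The archive crawler falls under the catch-all rule and is refused.
print(crawler_allowed(SAMPLE_ROBOTS_TXT, "ia_archiver", "https://example.com/profile/123"))
# The named crawler may fetch public pages but not /private/.
print(crawler_allowed(SAMPLE_ROBOTS_TXT, "ExampleSearchBot", "https://example.com/public/page"))
```

Checking a site's robots.txt this way before searching the archive gives a quick indication of whether any snapshots are likely to exist.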
Twitter is not supported on the Internet Archive, and no further details are available beyond the screenshot below.
The Internet Archive does a little better with YouTube. You can enter the address of a specific video and track whether the video has changed, or whether comments made in the past have since been deleted.
Above is the archived page for a popular YouTube video from Jan. 1, 2012. Surprisingly, the video works, and it is streamed through JWPlayer, which stands in for the original YouTube video player. You can move forwards or backwards in time by selecting the controls along the top of the page. If you want to fine-tune, you can use the left or right arrows to move to the next archived version. Sometimes the Wayback Machine has several versions of a page from the same day.
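Stepping through versions with the arrows can also be done in bulk. The sketch below assumes the Wayback Machine's CDX search endpoint at web.archive.org/cdx/search/cdx, which with `output=json` returns a list of rows whose first row is a header. It groups captures by day, so that days with several versions stand out, and builds the replay address for each one. The function names are illustrative, and `VIDEO_ID` is a placeholder.

```python
import json
from collections import defaultdict
from urllib.parse import urlencode

# Assumed endpoint: the Wayback Machine CDX search API.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query(page_url, year):
    """Build a CDX query listing every capture of `page_url` in one year."""
    params = {
        "url": page_url,
        "from": str(year),
        "to": str(year),
        "output": "json",
        "fl": "timestamp,original",  # only the two fields used below
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_by_day(response_text):
    """Map each day (YYYYMMDD) to the replay URLs captured on that day."""
    rows = json.loads(response_text) if response_text.strip() else []
    by_day = defaultdict(list)
    for timestamp, original in rows[1:]:  # rows[0] is the header row
        replay_url = f"https://web.archive.org/web/{timestamp}/{original}"
        by_day[timestamp[:8]].append(replay_url)
    return dict(by_day)
```

Fetching the address from `cdx_query("youtube.com/watch?v=VIDEO_ID", 2012)` and passing the response body to `captures_by_day` gives a day-by-day index of the same versions the toolbar arrows walk through one at a time.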
Here is the same video archived on Nov. 27, 2012. You can see that the number of hits has gone up and that there are more comments to browse through.
Below is another limitation of the Internet Archive: search capability is restricted. If you try searching on an archived website, you’ll be directed to the results on the live website instead. That is because the Internet Archive cannot possibly archive the results for every query you could enter.
The Internet Archive has some limited visibility into old Craigslist ads, as shown in the screenshot below. It only shows a listing of ads rather than the ads themselves.
As you can see below, when you click on an individual ad, an error message is displayed.
So if you are looking for archived Craigslist ads and you work in law enforcement, SIU, fraud detection, or private investigation, you can use Craigslist Power Search to access archived ads.