We’re Archiving Yahoo Answers So You’ll Always Know How Babby Is Formed

Photo: Postmodern Studio, Shutterstock

When the internet was mourning the impending shutdown of Yahoo Answers, announced earlier this week, there was a sentiment that you saw over and over again: that shutting the site down was akin to the library of Alexandria burning down.

Honestly, that’s not too much of an exaggeration. When the service rolled out in the halcyon days of 2005, it became pretty pivotal in establishing some of the earliest memes to ever grace The Online. It taught us how Babby was formed. It taught us how to use Weggy Boards. It dared to ask whether spider has pusspuss (the answer to that is “NO!!!!!!!!!!!!!!!!!!,” by the way). The weird, surreal humour beloved by folks that were raised by the internet wouldn’t be the same without this dumb, dumb site.

What adds insult to injury is that Yahoo is giving us a mere month’s notice to say goodbye. Once that month is over, this pivotal piece of internet memorabilia will be gone forever, and because Yahoo isn’t offering any sort of data dump for folks looking to archive their favourite questions, that responsibility ultimately falls on the shoulders of the people who loved it. That includes us.

With the help of the Internet Archive — and a little bit of code — we set up a script to auto-archive as many of the roughly 84 million submitted questions that we were able to find using the “sitemap” file for the Yahoo Answers site. These sorts of files are typically included as a way to help search engines index different pages so that people looking for answers will have a particular Yahoo Answers page crop up.

That said, we have no way of knowing whether posts in the sitemap represent the complete universe of questions asked of Yahoo. But still: 84 million isn’t anything to sneeze at!
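For the curious, the sitemap-driven approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script we actually ran: the sample sitemap and its question IDs are made-up placeholders, the function names are hypothetical, and we are assuming the Wayback Machine's public "Save Page Now" address (`https://web.archive.org/save/` followed by the page URL) is how pages get submitted.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Standard XML namespace used by sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: pages are submitted through the Wayback Machine's
# "Save Page Now" address, which takes the target URL appended to it.
SAVE_ENDPOINT = "https://web.archive.org/save/"

# Placeholder sitemap for illustration only; the question IDs are made up.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://answers.yahoo.com/question/index?qid=EXAMPLE_QID_1</loc></url>
  <url><loc>https://answers.yahoo.com/question/index?qid=EXAMPLE_QID_2</loc></url>
</urlset>
"""


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def save_request_url(page_url: str) -> str:
    """Build the address that asks the Wayback Machine to capture page_url."""
    return SAVE_ENDPOINT + page_url


def submit_to_archive(page_url: str, timeout: float = 60.0) -> int:
    """Request a capture and return the HTTP status code (makes a network call)."""
    with urllib.request.urlopen(save_request_url(page_url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Offline demo: list what would be submitted without touching the network.
    for url in extract_urls(SAMPLE_SITEMAP):
        print(save_request_url(url))
```

The demo only prints the addresses it would request. Calling `submit_to_archive` in a loop, with a polite pause between requests, would be the part that actually does the archiving.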

What our script looks like on the backend. (Screenshot: Gizmodo)

At the same time, these scripts take time to work. Each of these questions can take over a second to log, which means that at the current rate, it would take us at least two and a half years to fully archive every question in our set. For that reason, we’re submitting posts to the Internet Archive at random, to at least try to preserve some sort of representative sample of (and we say this lovingly) perhaps the stupidest place on the internet.
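The back-of-the-envelope maths, and the random-sampling workaround, can be sketched like so. Again, this is a rough illustration rather than our real code: the function names are hypothetical, and we assume exactly one second per question (the lower bound mentioned above) and a 365-day year.

```python
import random

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def years_to_archive(total_questions: int, seconds_per_question: float) -> float:
    """Estimate how long a one-at-a-time archive run would take, in years."""
    return total_questions * seconds_per_question / SECONDS_PER_YEAR


def pick_random_sample(urls: list, sample_size: int, seed=None) -> list:
    """Pick a uniform random subset of URLs, without repeats."""
    rng = random.Random(seed)
    return rng.sample(urls, min(sample_size, len(urls)))


if __name__ == "__main__":
    # Roughly 84 million questions at one second each.
    print(f"{years_to_archive(84_000_000, 1.0):.2f} years")  # prints "2.66 years"
```

Since each question takes over a second in practice, 2.66 years is a floor, which squares with the "at least two and a half years" figure. Sampling uniformly at random means whatever gets saved before the shutdown is spread across the whole set rather than clustered at the start of the sitemap.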

Thankfully, there appears to be an effort already underway to save the entirety of Yahoo Answers, led by a volunteer collective of archivists known as Archive Team. In fact, this is the second archive of Yahoo Answers they’ve kicked off, after first attempting to save these pages back in 2017.

At the time of publication, we have saved about 5,500 questions, which you’re free to browse through here if you’re curious:

If you’re feeling generous, you can also donate to the Internet Archive for making this silly (yet deeply significant) project possible.