When Megaupload got taken down two years ago, it took a whole hell of a lot of data with it, and eventually that data got obliterated. Some of it was pirate data, sure, but some was legit too. And new research shows that, at the very least, 10 million innocent files got the axe.
Researchers at Boston's Northeastern University, together with colleagues from France and Australia, ran a study to check the copyright-infringement status of a ton of files that had been uploaded to Megaupload shortly before the takedown. Examining metadata from links to content that had been hosted on Megaupload, the researchers took representative samples of 1000 files at a time and manually classified each file as infringing, non-infringing, or undecided.
In the end, the researchers found that a whopping 31 per cent of Megaupload's content was clearly infringing, but at least four per cent of the 250 million uploads (which translates to roughly 10 million files) was clearly not. On top of that, there was a majority, 65 per cent, where the researchers couldn't tell one way or the other.
Four per cent doesn't sound like that much (and it isn't!), but there are a couple of important details to consider. Copyright-infringing files are duplicates by their very nature, but non-infringing files are far more likely to have been unique, meaning their deletion was a real, actual loss. And any percentage of that undecided 65 per cent could have been legit too, although it probably wasn't that big a slice.
Of course, if you were relying on Megaupload and Megaupload alone to store your library of personal family photos, you were kind of asking for trouble. But man, 10 million files is a lot of collateral damage. So remember, folks: multiple backups. [TorrentFreak]