
Anna’s Archive, an open source search engine and shadow library, announced over the weekend that it has scraped metadata from over 250 million Spotify tracks, plus audio files for 86 million tracks. The group claims that it’s been scraping this data in an effort to build a “preservation archive” for music, and it’s planning to release the data, including music files, in different stages.
“A while ago, we discovered a way to scrape Spotify at scale,” Anna’s Archive wrote. “This Spotify scrape is our humble attempt to start such a ‘preservation archive’ for music. Of course Spotify doesn’t have all the music in the world, but it’s a great start.”
Anna’s Archive claimed that it scraped metadata for an estimated 99.9% of Spotify tracks, as well as music files accounting for about 99.6% of listens on the platform. In total, the scrape amounts to close to 300TB of data. The activist group said that this is “by far the largest music metadata database that is publicly available,” and the metadata is already available to download from its Torrents page. The music files will follow, released in order of popularity.
In a statement shared with Billboard, a Spotify representative said that the company is “actively investigating the incident,” the consequences of which could be unprecedented. “An investigation into unauthorized access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM to access some of the platform’s audio files,” the representative said.
Anyone who gets their hands on the data could build their own music library of close to 90 million tracks. Anna’s Archive said that its scraped audio files are in the original OGG Vorbis format at 160kbit/s. While these files will be distributed via torrents, the activist group said that “if there is enough interest, we could add downloading of individual files to Anna’s Archive.”