# SpotiFile
## A simple, open-source Spotify scraper.

*Python 3.8+*

---
## Quick Start
Make sure you have Python 3.8 or above.

    $ git clone https://github.com/Michael-K-Stein/SpotiFile.git
    $ cd SpotiFile

Now open `config.py` and set your `SP_KEY` (Spotify has renamed this to `sp_adid`) and `SP_DC` tokens ([see below](https://github.com/Michael-K-Stein/SpotiFile#sp_key--sp_dc-tokens)).

    $ python main.py

---
## What?
SpotiFile is a script that lets users scrape Spotify playlists, albums, artists, and more through a simple web GUI.

More advanced usage is possible by importing the relevant classes (e.g. `from spotify_scraper import SpotifyScraper`) and then using IPython to access specific Spotify API features.
### Advantages
The main advantage of SpotiFile is that it circumvents Spotify's API call limits and restrictions. SpotiFile offers an API that communicates with Spotify's API as if it were a real user.
This allows SpotiFile to download information en masse quickly.

---
## Why?
Downloading large numbers of songs and their metadata can help if you prefer listening to music offline, or if you are designing a music server that runs on an air-gapped network.

*We neither encourage music piracy nor condone any illegal activity. SpotiFile is a useful research tool. Use of SpotiFile for other purposes is at the user's own risk. Be warned: we will not bear any responsibility for improper use of this educational software!*
### Proper and legitimate uses of SpotiFile:
+ Scraping tracks to create datasets for machine learning models.
+ Creating remixes (for personal use only!)
+ Downloading music that is no longer under copyright ([generally, content whose original artist passed away over 70 years ago](https://www.copyright.gov/help/faq/faq-duration.html)).
### Please note Spotify's User Guidelines, and make sure you understand them. See section 5:
*The following is not permitted for any reason whatsoever in relation to the Services and the material or content made available through the Services, or any part thereof:
5. "crawling" or "scraping", whether manually or by automated means, or otherwise using any automated means (including bots, scrapers, and spiders), to view, access or collect information;*

Usage of this "scraper" is in violation of Spotify's User Guidelines. By using this code, you assume responsibility, as *you* are the one "scraping" Spotify using automated means.
### Please note Deezer's Terms of Use, and make sure you understand them. See article 8, Intellectual property:
*The Recordings on the Deezer Free Service are protected digital files by national and international copyright and neighboring rights. They may only therefore be listened to within a private or family setting. Any use for a non-private purpose will expose the Deezer Free User to civil and/or criminal proceedings. Any other use of the Recordings is strictly forbidden and more particularly any download or attempt to download, any transfer or attempt to transfer permanently or temporarily on the hard drive of a computer or any other device (notably music players), any burn or attempt to burn a CD or any other support are expressly forbidden. Any resale, exchange or renting of these files is strictly prohibited.*

Storing, or attempting to store, files from Deezer is strictly prohibited. Use this software only to create a custom streaming app for personal use. Note that you may only use this streaming app in a private or family setting. By using this code, you assume responsibility to perform only legal actions, such as *streaming* music from Deezer for personal use.
### Do adhere to your local laws regarding intellectual property!
#### Notice: Local law (where this was written) explicitly permits reverse engineering for non-commercial purposes.

---
## How?
SpotiFile starts by authenticating as a normal Spotify user, and then performs a wide range of conventional and unconventional API calls to Spotify to retrieve the relevant information.

SpotiFile does not actually download audio from Spotify, since Spotify uses proper DRM encryption to protect against piracy. Instead, SpotiFile finds the corresponding audio file on Deezer using the copyright id (ironically), and then downloads the "encrypted" audio file from Deezer, which failed to implement DRM properly.

Credit for reversing Deezer's encryption goes to the following (the original reversing algorithm has been taken down):
+ https://git.fuwafuwa.moe/toad/ayeBot/src/branch/master/bot.py
+ https://notabug.org/deezpy-dev/Deezpy/src/master/deezpy.py
+ https://www.reddit.com/r/deemix/
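The step of matching a Spotify track to its Deezer counterpart can be sketched roughly as follows. This is only an illustration, not SpotiFile's actual code: it assumes the "copyright id" is the track's ISRC, and that the lookup goes through Deezer's public `track/isrc:` API route. The helper names `normalize_isrc` and `deezer_lookup_url` are hypothetical.

```python
import re

# An ISRC is 12 characters: 2-letter country code, 3-character registrant
# code, then 7 digits (2-digit year + 5-digit designation).
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}\d{7}")


def normalize_isrc(raw: str) -> str:
    """Strip hyphens/whitespace, upper-case, and validate an ISRC (hypothetical helper)."""
    isrc = raw.replace("-", "").strip().upper()
    if not ISRC_PATTERN.fullmatch(isrc):
        raise ValueError(f"Not a valid ISRC: {raw!r}")
    return isrc


def deezer_lookup_url(raw_isrc: str) -> str:
    """Build the Deezer API URL for a track lookup by ISRC (assumed route)."""
    return f"https://api.deezer.com/track/isrc:{normalize_isrc(raw_isrc)}"


if __name__ == "__main__":
    # Only builds the URL; no network request is made here.
    print(deezer_lookup_url("us-um7-17-03861"))
```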
---
## Features
+ Authenticating as a legitimate Spotify user.
+ Scraping tracks from a playlist.
+ Scraping tracks from an album.
+ Scraping tracks from an artist.
+ Scraping playlists from a user.
+ Scraping playlists from a category.
+ Scraping a track from a track url.
+ Scraping artist images.
+ Scraping popular playlists' metadata and tracks.
+ Premium user token snatching (experimental).
+ Scraping song lyrics (time synced when possible).
+ Scraping track metadata.
+ Scraping category metadata.
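Several of the features above start from a Spotify link (track, album, artist, playlist, or user). A minimal sketch of how such a link could be split into its type and ID is shown below. It is an assumption about how one might dispatch on URLs, not SpotiFile's own parser; `parse_spotify_url` and `SUPPORTED_TYPES` are hypothetical names.

```python
from typing import Tuple
from urllib.parse import urlparse

# Entity types this sketch recognizes (an assumption, not SpotiFile's list).
SUPPORTED_TYPES = {"track", "album", "artist", "playlist", "user"}


def parse_spotify_url(url: str) -> Tuple[str, str]:
    """Return (entity_type, entity_id) for an open.spotify.com link."""
    parsed = urlparse(url)
    if parsed.netloc != "open.spotify.com":
        raise ValueError(f"Not an open.spotify.com URL: {url!r}")

    # The query string (e.g. ?si=...) is already separated out by urlparse.
    parts = [part for part in parsed.path.split("/") if part]

    # Localized links may carry a leading segment such as "intl-de"; skip it.
    if parts and parts[0].startswith("intl-"):
        parts = parts[1:]

    if len(parts) < 2 or parts[0] not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported Spotify URL: {url!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    print(parse_spotify_url("https://open.spotify.com/track/4uLU6hMCjMI75M1A2tKUQC?si=abc"))
```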
---
## SP_KEY & SP_DC tokens
SpotiFile uses two cookies, `sp_dc` and `sp_key` (Spotify now calls the latter `sp_adid`), to authenticate against Spotify and gain access to the required services.

*Shoutout to @fondberg for the explanation: https://github.com/fondberg/spotcast*

To obtain the cookies, use one of the following methods:
### Chrome-based browser
+ Open a new Incognito window at https://open.spotify.com and log in to Spotify.
+ Press Command+Option+I (Mac), or Control+Shift+I or F12. This should open your browser's developer tools.
+ Go to the Application section.
+ In the menu on the left, go to Storage/Cookies/open.spotify.com.
+ Find `sp_dc` and `sp_key` (or `sp_adid`) and copy their values.
+ Close the window without logging out (otherwise the cookies are invalidated).
### Firefox-based browser
+ Open a new Private window at https://open.spotify.com and log in to Spotify.
+ Press Command+Option+I (Mac), or Control+Shift+I or F12. This should open your browser's developer tools.
+ Go to the Storage section. (You might have to click the right arrows to reveal it.)
+ Select the Cookies sub-menu and then https://open.spotify.com.
+ Find `sp_dc` and `sp_key` (or `sp_adid`) and copy their values.
+ Close the window without logging out (otherwise the cookies are invalidated).
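Once copied, the two values are presumably sent back to Spotify as an ordinary `Cookie` request header. The sketch below shows how that header could be assembled and sanity-checked. It is an illustration only: `build_cookie_header` is a hypothetical helper, not a SpotiFile function, and the `key_name` parameter exists only to cover the `sp_key` / `sp_adid` rename mentioned above.

```python
def build_cookie_header(sp_dc: str, sp_key: str, key_name: str = "sp_key") -> str:
    """Return a Cookie header value carrying both tokens (hypothetical helper)."""
    # Strip stray whitespace that often sneaks in when copying from dev tools.
    cookies = {"sp_dc": sp_dc.strip(), key_name: sp_key.strip()}

    missing = [name for name, value in cookies.items() if not value]
    if missing:
        raise ValueError(f"Missing cookie value(s): {', '.join(missing)}")

    # Cookie pairs are joined with "; " in a single header value.
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


if __name__ == "__main__":
    # Placeholder values; substitute the cookies copied from your browser.
    print(build_cookie_header("your-sp_dc-value", "your-sp_key-value"))
```

Passing `key_name="sp_adid"` produces the same header under the newer cookie name.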