I host a couple of podcasts on my site, and over the last couple of months I've noticed a disturbing trend: my site's bandwidth usage has gone up by 10x, but most of the increase appears to come from a series of Google App Server instances, not from an incredible increase in listeners. Digging through the logs, most of these requests identify themselves as "Spotify/1.0", which is hitting my server several times every minute and downloading different old podcast episodes.
I'm wondering if this has something to do with my recently putting NGINX in front of Apache as a caching server, because the spike roughly coincided with that change. Is there some issue with how I'm answering Spotify's bot? Or is this a known issue with their indexer that I need to deal with somehow?
For example:
35.240.121.201 - - [16/May/2021:12:16:14 -0500] "GET /littlehillschurch/resources/podcasts/20201214.mp3 HTTP/1.1" 200 15004716 "-" "Spotify/1.0" "-"
35.195.247.67 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20200803.mp3 HTTP/1.1" 200 14141931 "-" "Spotify/1.0" "-"
146.148.19.22 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20210503.mp3 HTTP/1.1" 200 14243142 "-" "Spotify/1.0" "-"
35.240.121.201 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20210125.mp3 HTTP/1.1" 200 15050067 "-" "Spotify/1.0" "-"
35.195.91.128 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20200817.mp3 HTTP/1.1" 200 15266593 "-" "Spotify/1.0" "-"
35.187.181.74 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20200921.mp3 HTTP/1.1" 200 14607340 "-" "Spotify/1.0" "-"
35.195.247.67 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20210222.mp3 HTTP/1.1" 200 15279536 "-" "Spotify/1.0" "-"
35.195.91.128 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20210208.mp3 HTTP/1.1" 200 15480738 "-" "Spotify/1.0" "-"
35.189.225.190 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20210510.mp3 HTTP/1.1" 200 16093457 "-" "Spotify/1.0" "-"
35.205.135.106 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20200907.mp3 HTTP/1.1" 200 14715203 "-" "Spotify/1.0" "-"
35.205.135.106 - - [16/May/2021:12:16:16 -0500] "GET /littlehillschurch/resources/podcasts/20200720.mp3 HTTP/1.1" 200 14100420 "-" "Spotify/1.0" "-"
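To check how much of the bandwidth each client accounts for, the log can be tallied by user agent. This is only a rough sketch: it assumes every line follows the same layout as the excerpt above (client IP, timestamp, request, status, bytes sent, referer, user agent), and the names `LOG_LINE` and `bytes_by_agent` are made up for illustration.

```python
import re
from collections import Counter

# Matches the layout of the log excerpt above; anything after the
# user-agent field (e.g. the trailing "-") is ignored.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def bytes_by_agent(lines):
    """Sum the bytes-sent field per user agent, skipping unparseable lines."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        sent = match.group("bytes")
        totals[match.group("agent")] += 0 if sent == "-" else int(sent)
    return totals


# Two lines taken from the excerpt above, used as sample input.
SAMPLE = [
    '35.240.121.201 - - [16/May/2021:12:16:14 -0500] "GET /littlehillschurch/resources/podcasts/20201214.mp3 HTTP/1.1" 200 15004716 "-" "Spotify/1.0" "-"',
    '35.195.247.67 - - [16/May/2021:12:16:15 -0500] "GET /littlehillschurch/resources/podcasts/20200803.mp3 HTTP/1.1" 200 14141931 "-" "Spotify/1.0" "-"',
]

for agent, total in bytes_by_agent(SAMPLE).most_common():
    print(f"{agent}: {total} bytes")
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log's location depends on the server configuration.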
The files being requested seem to report correct modified dates, etc. Here's the part of the curl -v output that seems relevant when I request one of the files:
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200
< server: nginx/1.20.0
< date: Sun, 16 May 2021 17:38:40 GMT
< content-type: audio/mpeg
< content-length: 14100420
< strict-transport-security: max-age=16070400; includeSubDomains
< last-modified: Tue, 21 Jul 2020 00:52:50 GMT
< accept-ranges: bytes
< access-control-allow-origin: *