Download file from Blob URL with Python


I would like a Python script that downloads the Master data (Download, XLSX) Excel file from the Frankfurt Stock Exchange webpage.

When I try to retrieve it with urllib or wget, it turns out that the URL leads to a blob, and the downloaded file is only 289 bytes and unreadable.

http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/all-tradable-etfs-etcs-and-etns.xlsx

I'm entirely unfamiliar with blobs and have these questions:

  • Can the file "behind the blob" be retrieved using Python?

  • If so, is it necessary to uncover the "true" URL behind the blob (if there is such a thing), and how? My concern is that the link above won't be static and will change often.

That 289-byte response might be the HTML of a 403 Forbidden page. This can happen because the server is smart and rejects the request if your code does not specify a user agent.
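One way to test that guess is to look at the first bytes of whatever was downloaded. The sketch below is only illustrative: `describe_payload` and `XLSX_MAGIC` are hypothetical names, not part of any library. It relies on the fact that an .xlsx file is a ZIP archive and therefore starts with the bytes `PK\x03\x04`, while an error page usually starts with an HTML tag.

```python
# .xlsx files are ZIP archives, which start with this signature.
XLSX_MAGIC = b"PK\x03\x04"


def describe_payload(data: bytes) -> str:
    """Roughly classify downloaded bytes as 'xlsx', 'html' or 'unknown'.

    Hypothetical helper for debugging; it only inspects the first bytes.
    """
    if data.startswith(XLSX_MAGIC):
        return "xlsx"
    # Ignore leading whitespace and letter case when looking for HTML.
    head = data[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

If a call such as `describe_payload(open("download.xlsx", "rb").read())` returns "html" (the file name here is a placeholder), the server most likely sent an error page instead of the spreadsheet.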

Python 3

    # python3
    from urllib import request

    url = 'http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/all-tradable-etfs-etcs-and-etns.xlsx'
    # Fake user agent of Safari
    fake_useragent = 'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'
    # Attach the User-Agent header to the request before opening it
    r = request.Request(url, headers={'User-Agent': fake_useragent})
    f = request.urlopen(r)

    # Print or write the raw bytes
    print(f.read())

Python 2

    # python2
    import urllib2

    url = 'http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/all-tradable-etfs-etcs-and-etns.xlsx'
    # Fake user agent of Safari
    fake_useragent = 'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'

    # Attach the User-Agent header to the request before opening it
    r = urllib2.Request(url, headers={'User-Agent': fake_useragent})
    f = urllib2.urlopen(r)

    print(f.read())
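Printing the bytes is handy for a quick check, but for an Excel file you will probably want to write them to disk in binary mode. Here is a minimal Python 3 sketch of that, using the same user-agent approach. The function name `download` and the constant `USER_AGENT` are made up for illustration, and whether the server accepts this particular user agent is an assumption.

```python
import shutil
from urllib import request

# Same fake Safari user agent as above; any browser-like string may work.
USER_AGENT = (
    "Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 "
    "(KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25"
)


def download(url, dest, user_agent=USER_AGENT):
    """Fetch url with a User-Agent header and save the body to dest.

    Hypothetical helper; returns dest so the call can be chained.
    """
    req = request.Request(url, headers={"User-Agent": user_agent})
    # Open the output in binary mode so the bytes are stored unchanged.
    with request.urlopen(req) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

A call would look like `download(url, "all-tradable-etfs-etcs-and-etns.xlsx")`, with `url` set as in the snippets above. Streaming with `shutil.copyfileobj` avoids holding the whole file in memory, which matters little for a small spreadsheet but costs nothing.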
