Download file from Blob URL with Python
I want a Python script to download the master data (Download, XLSX) Excel file from the Frankfurt Stock Exchange webpage.
When I try to retrieve it with urllib or wget, it turns out that the URL leads to a blob, and the downloaded file is only 289 bytes and unreadable.
I'm entirely unfamiliar with blobs and have these questions:
Can the file "behind the blob" be retrieved using Python?
If so, is it necessary to uncover the "true" URL behind the blob (if there is such a thing), and how? My concern here is that the link above won't be static and will change often.
That 289-byte response is probably the HTML of a 403 Forbidden page. This can happen because the server rejects requests that do not specify a user agent.
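One way to check this guess is to look at the first bytes of whatever was downloaded. An .xlsx file is a ZIP container, so it starts with the ZIP local file header signature PK\x03\x04, while an error page usually starts with HTML markup. The following is only a rough sketch; the function name describe_payload is made up for illustration and the HTML detection is a simple heuristic, not a full content sniffer.

```python
def describe_payload(data: bytes) -> str:
    """Guess what a downloaded payload is from its first bytes.

    Hypothetical helper for illustration; the checks are heuristics.
    """
    # .xlsx files are ZIP archives, which begin with this signature.
    if data.startswith(b"PK\x03\x04"):
        return "zip container (plausibly a real .xlsx)"

    # Look for HTML markup near the start, ignoring case and leading whitespace.
    head = data[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or b"<html" in head:
        return "html page (likely an error response)"

    return "unknown"
```

Calling describe_payload(open("downloaded.xlsx", "rb").read()) on the 289-byte file (the filename here is a placeholder) should show whether it is an HTML error page rather than a spreadsheet.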
Python 3
# python3
from urllib import request

url = 'http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/all-tradable-etfs-etcs-and-etns.xlsx'
# fake user agent of Safari
fake_useragent = 'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'
# build a request that carries the User-Agent header
r = request.Request(url, headers={'User-Agent': fake_useragent})
f = request.urlopen(r)
# print or write the raw bytes
print(f.read())
Python 2
# python2
import urllib2

url = 'http://www.xetra.com/blob/1193366/b2f210876702b8e08e40b8ecb769a02e/data/all-tradable-etfs-etcs-and-etns.xlsx'
# fake user agent of Safari
fake_useragent = 'Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25'
# build a request that carries the User-Agent header
r = urllib2.Request(url, headers={'User-Agent': fake_useragent})
f = urllib2.urlopen(r)
print(f.read())
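The snippets above print the raw bytes. To end up with a usable Excel file, the response has to be written to disk in binary mode. Here is a minimal Python 3 sketch of that step, assuming the server accepts the request once a user agent is set. The names build_request and download, the USER_AGENT value, and the 30-second timeout are illustrative choices, not anything the site requires.

```python
from urllib import request

# Hypothetical user agent string; any browser-like value may work.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) example-downloader"


def build_request(url, user_agent=USER_AGENT):
    """Create a Request object that carries a User-Agent header."""
    return request.Request(url, headers={"User-Agent": user_agent})


def download(url, target_path, user_agent=USER_AGENT, timeout=30):
    """Fetch url and write the response body to target_path.

    Returns the number of bytes written. urlopen raises
    urllib.error.HTTPError if the server answers with e.g. 403.
    """
    req = build_request(url, user_agent)
    with request.urlopen(req, timeout=timeout) as response:
        data = response.read()
    # Binary mode matters: an .xlsx file is not text.
    with open(target_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Example call (not executed here):
# size = download(url, "all-tradable-etfs-etcs-and-etns.xlsx")
```

If the returned size is again only a few hundred bytes, the server most likely still answered with an error page rather than the spreadsheet.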