When using wget in a script to download some files from Google Docs, the name of the file is not preserved. For example:
wget 'http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls'
saves the file as pub?key=pyj6tScZqmEfbZyl0qjbiRQ instead of indicatorhivestimatedprevalence15-49.xls, which is what I get if I click on the link in a browser. Is there any way to enforce this "browser-like" behaviour in wget?
wget's --content-disposition option will do the trick for you.
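A minimal sketch of that approach, assuming the option meant here is wget's --content-disposition, which asks wget to name the saved file from the server's Content-Disposition header when one is sent. The URL is the one from the question; the helper name download_with_wget is made up for illustration.

```shell
# URL from the question, single-quoted so the shell leaves ? and & alone.
url='http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls'

# Hypothetical helper: download one URL and let the server suggest the name.
# --content-disposition is marked experimental in the wget manual, and it
# only helps if the server actually sends a Content-Disposition header.
download_with_wget() {
    wget --content-disposition "$1"
}

# Example call (needs network access):
# download_with_wget "$url"
```

If the server sends no such header, wget should fall back to naming the file from the URL, as in the question.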
It's still not fully implemented and seems to bug out a bit sometimes, so it's not the default option in wget; use it at your own risk.

You can try to use curl to download and keep the original filename:
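A rough sketch of what the curl call might look like, assuming the -O and -J options are the ones intended. The -L flag and the helper name download_with_curl are additions for illustration, not something stated in the answer.

```shell
# URL from the question, single-quoted so the shell leaves ? and & alone.
url='http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls'

# Hypothetical helper wrapping the curl call.
#   -O  write the response to a local file instead of stdout
#   -J  use the file name from the Content-Disposition header, if present
#   -L  follow redirects (an assumption; the link may or may not redirect)
download_with_curl() {
    curl -O -J -L "$1"
}

# Example call (needs network access):
# download_with_curl "$url"
```

Without a Content-Disposition header, -J has nothing to work with and curl names the file from the URL, as -O alone would.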
See the curl command-line options for details.

The Google Docs link is really telling a script on the server to run and turn the spreadsheet into the file you want. To the best of my knowledge, the file never exists on the server in xls form; it is generated at runtime when you ask for it. Thus, there isn't anything for wget to get.
In order to download the file, you would need to use the Google API: http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#DownloadingDocs/