I have a link to an HTTP page that has a structure like this:
Parent Directory -
[DIR] _OLD/ 01-Feb-2012 06:05 -
[DIR] _Jan/ 01-Feb-2012 06:05 -
[DIR] _Dec/ 01-Jan-2012 06:05 -
......
[DIR] _Apr/ 01-May-2011 06:05 -
[DIR] _Mar/ 01-Apr-2011 06:05 -
[DIR] _Feb/ 01-Mar-2011 06:05 -
[DIR] WEB-INF/ 21-Aug-2009 13:44 -
[ ] nohup_XXX_XXX21.out 14-Feb-2012 09:05 1.6M
[ ] XXX_XXX21.log 14-Feb-2012 09:04 64K
[ ] XXX_XXX21_access.log 14-Feb-2012 08:31 8.0K
[ ] XXX_XXX21_access.log00013 14-Feb-2012 00:01 585K
I would like to download only the files present in the root directory, i.e. the XXX files.
I have a solution using
curl -A Mozilla http://yourpage.com/bla.html > page
grep -o 'http://[^[:space:]]*log[^[:space:]]*' page > links
wget -i links
but I wonder: isn't it possible to do that using only wget?
All files from the root directory matching the pattern *.log*:
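A minimal sketch of the wget-only approach (http://yourpage.com/ is a placeholder carried over from the question; point it at the actual directory listing):

wget --recursive --level=1 --no-parent --no-directories --accept '*.log*' http://yourpage.com/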
You avoid grepping the HTML links out yourself (which can be error-prone) at the cost of a few more requests to the server.
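Here --recursive with --level=1 restricts the crawl to links found on the index page itself, --no-parent stops wget from ascending above the starting directory, --no-directories saves everything flat into the current directory, and --accept '*.log*' keeps only the matching files (HTML pages fetched just for link discovery are deleted afterwards). The extra requests are wget fetching the index page to discover the links.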