I was mirroring a website with the following command:
wget -m -nc -p -E -k -np -e robots=off https://www.somesite.com/ & disown
Everything was going fine until I saw that it was stuck at
Reusing existing connection to www.somesite.com:443.
and I closed that tty.
What should I do to make it continue?
Here is part of the wget output:
www.somesite.com/.../sport.html [ <=> ] 833.32K 1.53MB/s in 0.5s
Last-modified header missing -- time-stamps turned off.
2018-02-10 16:34:23 (1.53 MB/s) - ‘www.somesite.com/.../sport.html’ saved [853319]
--2018-02-10 16:34:23-- http://www.somesite.com/.../social
Reusing existing connection to www.somesite.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘www.somesite.com/.../social.html’
www.somesite.com/.../social.html [ <=> ] 141.35K 816KB/s in 0.2s
Last-modified header missing -- time-stamps turned off.
2018-02-10 16:34:24 (816 KB/s) - ‘www.somesite.com/.../social.html’ saved [144747]
--2018-02-10 16:34:24-- http://www.somesite.com/.../parliament
Reusing existing connection to www.somesite.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘www.somesite.com/.../parliament.html’
The command I used is:
wget -m -c -p -E -k -np -e robots=off https://www.somesite.com
Is there a way to instruct wget not to re-download URLs it has already downloaded?
Just run the command again. wget is clever enough to continue the download, but you must specify the correct options.

For example, remove the -nc option if you want to re-download files that have changed on the server (see also "Skip download if files exist in wget?"). If the download was interrupted in the middle of a large file, you might want to add the -c option, which tells wget to continue getting a partially-downloaded file. Both options are described in man wget.

You should also consider using screen or tmux instead of disown, so that you can check the status and output of your background processes.
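A minimal sketch of the whole workflow with screen (the session name "mirror" is an arbitrary choice, and the URL is the placeholder from the question):

```shell
# Start a named screen session so the mirror survives closing the tty
screen -S mirror

# Inside the session, re-run the mirror.
# -c resumes partially-downloaded files; -nc is dropped so that
# wget can refresh files that have changed on the server.
wget -m -c -p -E -k -np -e robots=off https://www.somesite.com/

# Detach with Ctrl-a d; the download keeps running.
# Later, reattach to check status and output:
screen -r mirror
```

Unlike disown, this keeps the process's output attached to a terminal you can return to at any time.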