for example:
wget -m https://www.kali.org
No warnings, no errors; what could be wrong?
To complicate things further, I used the recommended command (see below), and the output is still not satisfactory:
wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains=kali.org www.kali.org
Both --no-clobber and --convert-links were specified, only --convert-links will be used.
URL transformed to HTTPS due to an HSTS policy
--2019-07-04 14:13:38-- https://www.kali.org/
Resolving www.kali.org (www.kali.org)... 192.124.249.10
Connecting to www.kali.org (www.kali.org)|192.124.249.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18714 (18K) [text/html]
Saving to: ‘www.kali.org/index.html.gz’
www.kali.org/index.html.gz 100%[=======================================================>] 18.28K --.-KB/s in 0.01s
2019-07-04 14:13:38 (1.84 MB/s) - ‘www.kali.org/index.html.gz’ saved [18714/18714]
FINISHED --2019-07-04 14:13:38--
Total wall clock time: 0.3s
Downloaded: 1 files, 18K in 0.01s (1.84 MB/s)
Converting links in www.kali.org/index.html.gz... nothing to do.
Converted links in 1 files in 0 seconds.
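The saved index.html.gz is an ordinary gzip file, so it can be inspected and decompressed by hand. A minimal sketch, using a locally created stand-in file (www.example.com and its contents are placeholders; in the real case the file is the one wget saved):

```shell
# Stand-in for wget's output: a gzip-compressed index.html.
mkdir -p www.example.com
printf '<!doctype html><html><body>hello</body></html>\n' > www.example.com/index.html
gzip -f www.example.com/index.html        # leaves only index.html.gz

file www.example.com/index.html.gz        # reports "gzip compressed data"
gunzip -kf www.example.com/index.html.gz  # -k keeps the .gz, restores index.html
cat www.example.com/index.html
```

Recent wget releases can do this decompression automatically via --compression=auto.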
But it did mirror https://www.cnn.com, for instance.
Ubuntu 19.04 (codename: disco)
Some pages load as raw "view page source" text:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="viewport" content="initial-scale=1.0, maximum-scale=1.0" />
<link href='./index.css' rel='stylesheet' type='text/css'>
<title>crontab.guru - the cron schedule expression editor</title>
<meta name="description" content="An easy to use editor for crontab schedules.">
<meta name="google-site-verification" content="QPa8OWuMuIsXgvuvPdfSCxA4ewd2Gs5tTUh0k2crBPE" />
</head>
<body>
<a href="/"><h1>crontab guru</h1></a>
<div class="blurb">
<div>The quick and simple editor for cron schedule expressions by <a href="https://cronitor.io?utm_source=crontabguru&utm_campaign=cronitor_top" title="Cron job monitoring and observability" rel="nofollow">Cronitor</a></div>
</div>
<div id="content">loading...</div>
and again the directory tree was not downloaded.
This will work; it will copy the website locally. If that is what you want, use the command as follows (change domain.com to your desired domain):
wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains=domain.com domain.com
--recursive means: download the whole site.
--no-clobber means: do not overwrite existing files.
--page-requisites means: download all the components of the page, including images.
--html-extension means: save the pages as .html files.
--convert-links means: convert all the links to work locally, i.e. offline.
--domains=domain.com means: do not follow links outside this domain.
Notice: some web servers compress the pages they serve, and wget will then download a compressed file such as index.html.gz, as shown in the output above. In this case wget needs an extra option, --compression=auto or --compression=gzip, to correctly handle and decompress the pages locally. You can use the command with this option like so (change domain.com to your desired domain):
wget --recursive --no-clobber --page-requisites --html-extension --convert-links --compression=auto --domains=domain.com domain.com
For further reading, please refer to the wget man page (Wget - The non-interactive network downloader).
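After a mirror run finishes, a quick check helps confirm that no compressed pages were left behind and that the expected pages were saved. A sketch using a synthetic directory tree in place of a real mirror (mirror-check and its file names are placeholders, not part of wget's output):

```shell
# Build a tiny stand-in mirror tree (a real one would come from wget).
mkdir -p mirror-check/www.example.com/docs
printf '<html></html>\n' > mirror-check/www.example.com/index.html
printf '<html></html>\n' > mirror-check/www.example.com/docs/page.html

# Pages fetched without --compression=auto may remain as *.gz;
# an empty listing here means every page was decompressed.
find mirror-check -name '*.gz'

# Count the .html pages actually saved.
find mirror-check -name '*.html' | wc -l
```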
I have the same issue.
Try this command: