Many server administrators want their server to be used only by humans and not by retrieval programs like wget. One way to block such programs is to use log analysis. Log analysis identifies retrieval programs by looking for statistically significant similarities among the requests, often through timing.
Whenever I try to use wget to download packages through a shell script (similar to those created by synaptic; most of them are in fact created by synaptic), only a few packages are downloaded, and most fail to download because the connection is refused.
So I strongly suspect that the connection is refused because Ubuntu servers use log analysis to block such programs.
Do Ubuntu servers use log analysis to block (package retrieval) programs?
EDIT:
I ran some scripts that fetched only small packages (i.e., packages that download quickly). Those scripts work as expected. The error comes up with large packages (which consequently take more time to download).
wget has an option, --random-wait, that is designed to avert log analysis blocking. According to the wget manual, it varies the time between requests to between 0.5 and 1.5 times the value given with --wait, in order to mask wget's presence from such analysis. So chances are, if the server accepts you with the --random-wait option turned on but not without it, it is using log analysis.
Most of the mirrors aren't controlled by Ubuntu, and their configuration is completely up to their sysadmins. By extension, there may be some blocking on some mirrors. I personally don't see why they would, but given the defaults, wget is pretty simple to fingerprint through its user-agent string even before you start considering behavioural tracking.
You can make wget look like the current apt quite simply by passing apt's user-agent string to wget's --user-agent option.
And as another user pointed out, if your current mirror is controlled by somebody who doesn't want you using wget, you could just use another mirror. There are loads of them.
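The two wget suggestions above (--random-wait and a changed user-agent) can be combined in a small wrapper. This is only a sketch under assumptions: the function name polite_fetch, the file name urls.txt, and the user-agent value are placeholders, not anything taken from a real mirror or from synaptic's generated scripts. Check the string your own apt actually sends before relying on it.

```shell
#!/bin/sh
# Sketch: fetch a list of URLs with randomized pauses and a custom user-agent.
# WGET can be overridden (e.g. WGET=echo) to preview the arguments without
# touching the network.
WGET=${WGET:-wget}

# Placeholder user-agent; apt identifies itself in roughly this form,
# but the version part here is an assumption, not a real value.
UA='Debian APT-HTTP/1.3 (example-version)'

# polite_fetch WAIT_SECONDS USER_AGENT URL_LIST_FILE
#   --wait         base pause between requests, in seconds
#   --random-wait  vary each pause between 0.5x and 1.5x of --wait
#   --user-agent   replace wget's default identification string
#   --input-file   read the URLs to download from a file, one per line
polite_fetch() {
    "$WGET" --wait="$1" --random-wait --user-agent="$2" --input-file="$3"
}

# Example call (commented out so the file can be sourced safely):
# polite_fetch 2 "$UA" urls.txt
```

If downloads succeed with this wrapper but fail with plain wget, that points towards the mirror filtering on timing or user-agent; if they still fail, the cause is more likely something else, and switching mirrors is the simpler test.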