Running grep in a Cygwin terminal opened over Microsoft Remote Desktop to Windows Server 2012 R2 (same as running natively?):
Administrator@MYSERV /cygdrive/d/bin/beta
$ time grep -inowf matchfile_431184247462809.temp infile_431184247462809.temp > delme
real 1m40.568s
user 1m40.405s
sys 0m0.140s
The exact same command, on the same files, executed when connected via Cygwin SSH:
Administrator@MYSERV /cygdrive/d/bin/beta
$ time grep -inowf matchfile_431184247462809.temp infile_431184247462809.temp > delmessh
real 0m0.148s
user 0m0.140s
sys 0m0.000s
The grep.exe executable is the same and the output file is the same, but the run time is a split second vs. almost 2 minutes.
Given that Cygwin SSH runs under a special user setup, I tried to ssh localhost from within the Remote Desktop session; the runtime was 1 minute and 40 seconds.
Is there some logical or illogical explanation for this? Any settings that I can check on Windows Server 2012 that artificially suppress remote desktop processes?
Update:
Running C:\cygwin\bin\grep.exe from the Windows command line (cmd) is also instant. So the issue is with the Cygwin terminal.
Update 2: I read that having dead file shares in PATH can slow the Bash terminal down. Contrary to my initial hope, erasing the $PATH variable did not change anything. I also do not have any dead links in PATH.
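Since the same binary behaves differently in the two sessions, one way to narrow it down is to compare the environment each session runs with. The following is only a sketch: the helper name snapshot_env and all file names are made up for illustration, and the two "sessions" are simulated with subshells so the snippet runs on its own.

```shell
# snapshot_env FILE: write the current environment, sorted, to FILE.
# (snapshot_env and the file names here are hypothetical.)
# sort runs under LC_ALL=C so both snapshots use the same ordering.
snapshot_env() {
    env | LC_ALL=C sort > "$1"
}

# Intended use: run once in each session, then diff the two files, e.g.
#   snapshot_env /tmp/env_rdp.txt    (in the Remote Desktop terminal)
#   snapshot_env /tmp/env_ssh.txt    (in the SSH session)
#   diff /tmp/env_rdp.txt /tmp/env_ssh.txt

# Self-contained demonstration: two simulated sessions that differ in LANG.
demo_dir=$(mktemp -d)
( export LANG=en_US.UTF-8; snapshot_env "$demo_dir/env_a.txt" )
( export LANG=C;           snapshot_env "$demo_dir/env_b.txt" )

# diff exits non-zero when the files differ, which is expected here.
diff "$demo_dir/env_a.txt" "$demo_dir/env_b.txt" > "$demo_dir/env.diff" || true
cat "$demo_dir/env.diff"
```

Any variable that shows up in the diff (PATH, LANG, LC_*, TERM and so on) is a candidate to override one at a time while re-running the slow command.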
Solution, kudos to @Paul Haldane:
Grep seems to be thrown off by the $LANG value of en_US.UTF-8, which is the default in Cygwin. This hits regex performance especially hard. Running grep -F was also slower, but only by a factor of 4.
Here is a verification on a separate server:
$ echo $LANG
en_US.UTF-8
$ time grep -inowf matchfile_431184247462809.temp infile_431184247462809.temp > delme
real 1m56.425s
user 1m56.218s
sys 0m0.171s
$ LANG=''
$ time grep -inowf matchfile_431184247462809.temp infile_431184247462809.temp > delme2
real 0m0.286s
user 0m0.265s
sys 0m0.015s
$ diff delme delme2
** no difference **
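If changing LANG for the whole shell is not desirable, the locale can be overridden for a single command instead. The sketch below uses LC_ALL=C, which takes precedence over LANG and the other LC_* variables, as a prefix on the grep invocation. The two small sample files are made up for illustration and stand in for the real match and input files.

```shell
# Hypothetical sample data standing in for the real match/input files.
workdir=$(mktemp -d)
printf 'foo\nbar\n' > "$workdir/matchfile.temp"
printf 'Foo baz\nqux\nBAR foo\n' > "$workdir/infile.temp"

# Prefixing the command sets LC_ALL for this one grep process only;
# the locale of the surrounding shell is left untouched.
LC_ALL=C grep -inowf "$workdir/matchfile.temp" "$workdir/infile.temp" \
    > "$workdir/out_c"

cat "$workdir/out_c"
# prints:
#   1:Foo
#   3:BAR
#   3:foo
```

Note that the C locale treats input as single bytes, so case-insensitive matching (-i) of non-ASCII characters may behave differently than under a UTF-8 locale. For ASCII-only patterns and input, as in the diff check above, the results should match.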
For one, SSH adds overhead because of the encryption, but that doesn't explain the jump from seconds to minutes. What does explain it is the fact that Cygwin emulates a Unix terminal, and emulation is slow. You can find more details on Wikipedia: https://en.wikipedia.org/wiki/Cygwin
This part explains it pretty well:
The fork system call for duplicating a process is fully implemented, but it does not map well to the Windows API. For example, the copy-on-write optimization strategy could not be used. As a result, Cygwin's fork is rather slow compared with Linux and others. (That overhead can often be avoided by replacing uses of the fork/exec technique with calls to the spawn functions declared in the Windows-specific process.h header.)