Why does `$ ls > ls.out` cause 'ls.out' to be included in the list of names of files in the current directory?
Why was it designed this way? Why not otherwise?
When evaluating the command, the `>` redirection is resolved first, so by the time `ls` runs the output file has already been created.
This is also the reason why reading from and writing to the same file using a `>` redirection within the same command truncates the file: by the time the command runs, the file has already been truncated.
Tricks to avoid this:
`<<<"$(ls)" > ls.out` (works for any command that needs to run before the redirection is resolved). The command substitution is run before the outer command is evaluated, so `ls` is run before `ls.out` is created. As written, this relies on the shell supplying a default command for a line that consists only of redirections, as zsh does; in bash, put `cat` in front.
`ls | sponge ls.out` (works for any command that needs to run before the redirection is resolved). `sponge` writes to the file only when the rest of the pipe has finished executing, so `ls` is run before `ls.out` is created (`sponge` is provided with the `moreutils` package).
`ls * > ls.out` (works for the specific case of `ls > ls.out`). The filename expansion is performed before the redirection is resolved, so `ls` will run on its arguments, which won't contain `ls.out`.
On why redirections are resolved before the program / script / whatever is run: I don't see a specific reason why it's mandatory to do so, but I see two reasons why it's better to do so:
not redirecting STDIN beforehand would make the program / script / whatever wait until STDIN is redirected;
not redirecting STDOUT beforehand would force the shell to buffer the program's / script's / whatever's output until STDOUT is redirected.
So a waste of time in the first case and a waste of time and memory in the second case.
This is just what occurs to me, I'm not claiming these are the actual reasons; but I guess that all in all, if one had a choice, they would go with redirecting before anyway for the abovementioned reasons.
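The ordering described in this answer can be checked in a scratch directory. The sketch below is only an illustration: the file names `a.txt`, `b.txt` and `data.txt` are made up, and the `printf` line is a portable stand-in for the here-string trick (it is an assumption of this sketch, not something taken from the answer).

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt            # hypothetical sample files

# The shell creates ls.out before starting ls, so ls lists it.
ls > ls.out
plain=$(cat ls.out)
rm ls.out

# The glob is expanded before ls.out is created, so ls never sees it.
ls * > ls.out
globbed=$(cat ls.out)
rm ls.out

# Command substitution also runs before the redirection is set up.
printf '%s\n' "$(ls)" > ls.out
substituted=$(cat ls.out)
rm ls.out

# Reading and writing the same file: data.txt is truncated
# by the shell before sort gets a chance to read it.
printf 'two\none\n' > data.txt
sort data.txt > data.txt
```

After running it, `$plain` contains an `ls.out` line, `$globbed` and `$substituted` do not, and `data.txt` is empty.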
From `man bash`, the REDIRECTION section opens with: "Before a command is executed, its input and output may be redirected using a special notation interpreted by the shell."
This first sentence suggests that, with redirection, output is made to go somewhere other than the terminal right before the command is executed. Thus, in order to be redirected to a file, the file must first be created by the shell itself.
To avoid having a file, I suggest you redirect output to a named pipe first, and then to a file. Note the use of `&` to return control over the terminal to the user.
But why?
Think about this: where will the output go? A program has functions like `printf` and `puts`, which by default write to `stdout`, but can their output go to a file if the file doesn't exist in the first place? It's like water. Can you get a glass of water without putting a glass underneath the faucet first?
I don't disagree with the current answers. The output file has to be opened before the command runs or the command won't have anywhere to write its output.
This is because "everything is a file" in our world. Output to the screen is STDOUT (aka file descriptor 1). For an application to write to the terminal, it writes to fd 1, which it inherits already open, just as it would write to a file.
When you redirect an application's output in a shell, you're altering fd 1 so it's actually pointing at the file. When you pipe, you alter one application's STDOUT to become another's STDIN (fd 0).
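A small sketch of the descriptor idea, with made-up file names (`fd3.out`, `fd1.out`): a descriptor is just a number the shell can point at a file, and a command writes to that number without knowing where it leads.

```shell
#!/bin/sh
# Work in a throwaway directory; all names here are hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Open fd 3 on a file, write to it by number, then close it.
exec 3> fd3.out
echo "written via fd 3" >&3
exec 3>&-

# In a subshell, point fd 1 (STDOUT) at a file. echo is unchanged,
# but its output now lands in the file instead of the terminal.
( exec 1> fd1.out; echo "written via fd 1" )

# A pipe connects the left command's fd 1 to the right command's fd 0.
piped=$(echo "through the pipe" | cat)
```

Nothing is printed to the terminal; the two files and `$piped` hold the three messages.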
It's all very well saying that, but you can quite easily look at how this works with `strace`. It's pretty heavy stuff but this example is quite short.
Within `strace.out` we can see the following highlights:
First, the shell opens `ls.out` as fd 3. Write only. Truncates (overwrites) if it exists, otherwise creates.
Next comes a bit of juggling. We shunt STDOUT (fd 1) off to fd 10 and close it off. This is because we're not outputting anything to the real STDOUT with this command. It finishes by duplicating the write handle to `ls.out` and closing the original one.
Then it searches for the executable. A lesson perhaps to not have a long path ;)
Then the command runs and the parent waits. During this operation any STDOUT will actually be mapped to the open file handle on `ls.out`. When the child exits, the parent receives `SIGCHLD`, which tells it the child has finished and that it can resume. It finishes off with a little more juggling and a close of `ls.out`.
Why is there so much juggling? No, I'm not entirely sure either.
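The descriptor juggling described above can be reproduced by hand in the shell. This is a sketch of the same sequence, not the shell's actual source: fd 9 stands in for the fd 10 seen in the trace, because POSIX only guarantees single-digit descriptors in redirections. One plausible reading, not confirmed by this answer, is that the shell parks its own STDOUT so it can put it back once the command has finished.

```shell
#!/bin/sh
# Work in an empty throwaway directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

exec 3> ls.out    # open ls.out write-only (truncate or create) as fd 3
exec 9>&1         # park the real STDOUT on a spare descriptor
exec 1>&3         # duplicate the ls.out handle onto fd 1
exec 3>&-         # close the original handle

ls                # the child inherits fd 1, which now leads to ls.out

exec 1>&9         # put the real STDOUT back
exec 9>&-         # drop the parked copy
listing=$(cat ls.out)
```

Because `ls.out` is opened before `ls` starts, `$listing` contains `ls.out` even though the directory was empty to begin with.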
Of course you can change this behaviour. You could buffer to memory with something like `sponge`, and that will be invisible to the preceding command. We're still affecting the file descriptors, but not in a file-system-visible way.
There is also a nice article about Implementation of redirection and pipe operators in shell, which shows how redirection could be implemented for a command such as `$ ls > ls.out`.
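A rough shell-level picture of those steps, assuming the usual fork, open, dup2 and exec sequence. This is not the article's code; the subshell and the two `exec` lines are stand-ins chosen for this sketch.

```shell
#!/bin/sh
# Work in an empty throwaway directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

(                   # the subshell stands in for fork(): a child process
    exec > ls.out   # the child opens ls.out and makes it fd 1 (open + dup2)
    exec ls         # the child replaces itself with ls, which just writes to fd 1
)                   # the parent waits for the child here
status=$?
listing=$(cat ls.out)
```

The point of the ordering is visible in the result: the file is set up in the child before `ls` is started, so `ls` never needs to know that its output is going to a file.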