The simplest way to display file contents is using the cat
command:
cat file.txt
I can get the same result using input redirection:
cat < file.txt
Then, what is the difference between them?
With `cat file.txt`, the `cat` program will open, read, and close the file.
With `cat < file.txt`, your shell will open the file and connect its contents to `cat`'s stdin. `cat` recognizes that it has no file arguments and reads from stdin.
There is no difference from a user's point of view. These commands do the same thing.
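That the redirected form passes no file argument to the program can be checked with a small sketch. Here `sh -c` with a one-line script stands in for `cat`, and the scratch file is a hypothetical one created just for the demonstration; the stand-in only reports how many arguments it received.

```shell
# Hypothetical scratch file, created only for this demonstration.
demo_file=$(mktemp)
printf 'hello\n' > "$demo_file"

# A stand-in "program" that prints its argument count ($#).
with_arg=$(sh -c 'echo "$#"' prog "$demo_file")      # file name passed as an argument
with_redir=$(sh -c 'echo "$#"' prog < "$demo_file")  # file attached to stdin by the shell

echo "argument form:    $with_arg argument(s)"
echo "redirection form: $with_redir argument(s)"

# Both forms still yield the same data when the program is cat.
same_output=no
[ "$(cat "$demo_file")" = "$(cat < "$demo_file")" ] && same_output=yes
echo "same output: $same_output"

rm -f "$demo_file"
```

The redirection never reaches the program's argument list; the shell consumes it before the program starts.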
Technically, the difference is in which program opens the file: the `cat` program or the shell that runs it. Redirections are set up by the shell before it runs a command.
(So with some other commands, that is, not the commands shown in the question, there may be a difference. In particular, if you can't access `file.txt` but the root user can, then `sudo cat file.txt` works but `sudo cat < file.txt` does not.)
You can use whichever one is convenient in your case.
There are almost always many ways to get the same result.
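Who opens the file also shows up in who reports a failure. The following is a minimal sketch, assuming a path that does not exist (`/nonexistent-dir/file.txt` and the marker file name are hypothetical): in the argument form the error comes from `cat`, while in the redirection form it comes from the shell, and the command is never started.

```shell
missing=/nonexistent-dir/file.txt                 # hypothetical path assumed not to exist
marker=${TMPDIR:-/tmp}/redir-demo-marker.$$       # hypothetical marker file
rm -f "$marker"

# Argument form: cat itself calls open(2) and reports the failure.
err_arg=$(cat "$missing" 2>&1) || true

# Redirection form: the shell fails to open the file, so the command
# (here: touch "$1") is never run. An inner sh -c is used so that the
# shell's own error message can be captured.
err_redir=$(sh -c 'touch "$1" < "$2"' sh "$marker" "$missing" 2>&1) || true

echo "argument form:    $err_arg"
echo "redirection form: $err_redir"
[ -e "$marker" ] || echo "touch never ran"
```

The exact wording of the shell's message depends on which shell `sh` is, so it is not shown here.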
`cat` accepts files as arguments, or reads `stdin` if there are no arguments. See `man cat`.
One Big Difference
One big difference is with the `*`, `?`, or `[` globbing characters (wildcards), or anything else the shell may expand into multiple filenames. Anything the shell expands into two or more items, rather than treating it as a single filename, cannot be opened for redirection.
Without redirection (i.e. no `<`), the shell passes multiple filenames to `cat`, which outputs the files' contents one after another. For example, a command such as `cat *.txt` works when several files match.
But with redirection (`<`), as in `cat < *.txt`, an error message occurs; bash, for instance, reports an ambiguous redirect.
One Tiny Difference
I thought with redirection it would be slower but there is no perceivable time difference:
Notes:
The major difference is who opens the file: the shell or `cat`. They may be operating under different permission regimes, so `sudo cat file.txt` may work while `sudo cat < file.txt` will fail. This kind of permission regime can be a bit tricky to work around when you just want to use `echo` for easy scripting, so there is the expedient of misusing `tee`, typically by piping the output of `echo` into `sudo tee`, which doesn't really work using redirection instead because of the permission problem.
TL;DR Version of the answer:
With `cat file.txt`, the application (in this case `cat`) receives one positional parameter and executes the open(2) syscall on it, and permission checks happen within the application.
With `cat < file.txt`, the shell performs the `dup2()` syscall to make stdin a copy of the file descriptor (typically the next available one, e.g. 3) corresponding to `file.txt`, and then closes that file descriptor (e.g. 3). The application does not perform open(2) on the file and is unaware of the file's existence; it operates strictly on its stdin file descriptor. The permission check rests with the shell. The open file description remains the same as when the shell opened the file.
Introduction
On the surface, `cat file.txt` and `cat < file.txt` behave the same, but there's a lot more going on behind the scenes with that single-character difference. That one `<` character changes how the shell understands `file.txt`, who opens the file, and how the file is passed between the shell and the command. Of course, in order to explain all these details we also need to understand how opening files and running commands works in the shell, and this is what my answer aims to achieve: to educate the reader, in the simplest possible terms, on what really goes on in these seemingly simple commands. In this answer you'll find multiple examples, including those that use the strace command to back up the explanations of what actually happens behind the scenes.
Because the inner workings of shells and commands are based on standard syscalls, viewing `cat` as just one command among many others is important. If you are a beginner reading this answer, please keep an open mind and be aware that `prog file.txt` will not always be the same as `prog < file.txt`. A different command may behave entirely differently when the two forms are applied to it, depending on permissions or on how the program is written. I also ask you to suspend judgement and look at this from the perspective of different users: for a casual shell user, the needs may be entirely different from those of a sysadmin or developer.
execve() Syscall and Positional Parameters the Executable Sees
Shells run commands by creating a child process with the fork(2) syscall and calling the execve(2) syscall, which executes the command with the specified arguments and environment variables. The command called inside `execve()` takes over and replaces the process; for instance, when the shell calls `cat`, it first creates a child process with PID 12345, and after `execve()` happens, PID 12345 becomes `cat`.
This brings us to the difference between `cat file.txt` and `cat < file.txt`. In the first case, `cat file.txt` is a command called with one positional parameter, and the shell puts together the `execve()` call appropriately:
In the second case, the `<` part is a shell operator, and `< testfile.txt` tells the shell to open `testfile.txt` and make stdin (file descriptor 0) a copy of the file descriptor that corresponds to `testfile.txt`. This means `< testfile.txt` is not going to be passed to the command itself as a positional argument:
This can be significant if the program requires a positional parameter to function properly. In this case, `cat` defaults to accepting input from stdin if no positional parameters corresponding to files were supplied. This also brings us to the next topic: stdin and file descriptors.
STDIN and file descriptors
Who opens the file: `cat` or the shell? How do they open it? Do they even have permission to open it? These are the questions that can be asked, but first we need to understand how opening a file works.
When a process performs `open()` or `openat()` on a file, those functions provide the process with an integer corresponding to the open file, and the program can then call `read()`, `lseek()`, `write()`, and a myriad of other syscalls by referring to that integer. Of course, the system (aka the kernel) keeps in memory how a particular file was opened, with what sort of permissions, with what sort of mode (read-only, write-only, or read/write), and where in the file we currently are (at byte 0 or byte 1024), which is called the offset. This is called the open file description.
On the very basic level, `cat testfile.txt` is where `cat` opens the file, and it will be referenced by the next available file descriptor, which is 3 (notice the 3 in read(2)).
By contrast, `cat < testfile.txt` will use file descriptor 0 (aka stdin):
Remember when we learned earlier that shells run commands via a `fork()` first, then `exec()` type of process? Well, it turns out that how a file is opened carries over to the child processes created with the `fork()/exec()` pattern. To quote the open(2) manual:
What does this mean for `cat file.txt` vs `cat < file.txt`? A lot, actually. In `cat file.txt`, `cat` opens the file, which means it is in control of how the file is opened. In the second case, the shell opens `file.txt`, and how it was opened remains unchanged for child processes, compound commands, and pipelines. Where we currently are in the file also remains the same.
Let's use this file as an example:
Look at the example below. Why didn't the word `line` change in the first line?
The answer lies in the quote from the open(2) manual above: the file opened by the shell is duplicated onto stdin of the compound command, and each command/process that runs shares the offset of the open file description.
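The shared offset can be tried out with a small experiment. This is a minimal sketch, assuming a hypothetical three-line scratch file whose lines start with the word `line`. The shell's `read` builtin stands in for `head -n1` because it consumes exactly one line, and `dd` is used to consume exactly 5 bytes.

```shell
# Hypothetical scratch file; the contents are an assumption for this sketch.
tmp=$(mktemp)
printf 'line 1\nline 2\nline 3\n' > "$tmp"

# One open file description, shared by every command in the group.
out=$(
  {
    IFS= read -r first          # consumes "line 1\n", moving the shared offset
    printf '%s\n' "$first"
    sed 's/line/potato/'        # starts reading where read stopped
  } < "$tmp"
)
printf '%s\n' "$out"

# Same idea, but additionally skip 5 bytes ("line ") before printing the rest.
out2=$(
  {
    IFS= read -r first
    printf '%s\n' "$first"
    dd bs=5 count=1 of=/dev/null 2>/dev/null   # advance the offset by 5 bytes
    cat
  } < "$tmp"
)
printf '%s\n' "$out2"

rm -f "$tmp"
```

The first group leaves the first line untouched because `sed` never sees it; the second drops the word `line` from the second line because those bytes were already consumed.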
`head` simply advanced the file offset by one line, and `sed` dealt with the rest. More specifically, we'd see 2 sequences of `dup2()`/`fork()`/`execve()` syscalls, and in each case we get a copy of the file descriptor which references the same open file description for `testfile.txt`. Confused? Let's take a slightly crazier example:
Here we printed the first line, then advanced the offset of the open file description by 5 bytes (which eliminated the word `line`), and then just printed the rest. And how did we manage to do it? The open file description for `testfile.txt` remains the same, with a shared offset into the file.
Now, why is this useful to understand, aside from writing crazy compound commands like the above? As a developer you might want to take advantage of, or beware of, such behavior. Let's say that instead of `cat` you wrote a C program that needs a configuration either passed as a file or passed from stdin, and you run it like `myprog myconfig.json`. What will happen if you instead run `{ head -n1; myprog;} < myconfig.json`? At best your program will get incomplete config data, and at worst it will break. We can also use this to our advantage: spawn a child process and let the parent advance to the data that the child process should take care of.
Permissions and Privileges
Let's start with an example, this time on a file with no read or write permissions for other users:
What happened here? Why can we read the file in the first example as the `potato` user but not in the second? This goes back to the same quote from the open(2) man page mentioned earlier. With `< file.txt`, the shell opens the file, hence permission checks happen at the time of the `open()`/`openat()` performed by the shell. The shell at that time runs with the privileges of the file owner, who does have read permission on the file. By virtue of the open file description being shared across `dup2()` calls and inherited across `fork()`/`execve()`, the shell passes a copy of the open file descriptor to `sudo`, which passes a copy of the file descriptor to `cat`, and `cat`, being unaware of anything else, happily reads the contents of the file. In the last command, `cat` running as the potato user performs `open()` on the file, and of course that user has no permission to read the file.
More practically and more commonly, this is why users are baffled as to why something like this doesn't work (running a privileged command to write to a file which they cannot open):
But something like this works (using a privileged command to write to a file that does require privileges):
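The two cases commonly look like the following. This is a sketch rather than the original listing: the `PRIV` variable (defaulting to `sudo`) and both function names are hypothetical, introduced only so that the pattern can be tried on any target file.

```shell
# PRIV is the privilege-raising prefix; "sudo" is assumed unless PRIV is already set.
PRIV=${PRIV-sudo}

# Redirection form: the *calling shell* opens "$1" for writing before
# $PRIV even starts, so the elevated privileges come too late.
# usage: write_via_redirect TARGET TEXT
write_via_redirect() {
  $PRIV printf '%s\n' "$2" > "$1"
}

# tee form: tee runs under $PRIV and performs the open itself,
# so the permission check happens with the elevated privileges.
# usage: write_via_tee TARGET TEXT
write_via_tee() {
  printf '%s\n' "$2" | $PRIV tee "$1" > /dev/null
}
```

For a target that only root may write, `write_via_redirect` would be expected to fail with the shell's permission error, while `write_via_tee` would be expected to succeed.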
A theoretical example of the opposite situation from the one I showed earlier (where `privileged_prog < file.txt` fails but `privileged_prog file.txt` does work) would be with SUID programs. SUID programs, such as `passwd`, allow performing actions with the permissions of the executable's owner. This is why the `passwd` command allows you to change your password and then write that change to /etc/shadow even though the file is owned by the root user.
And for the sake of example and fun, I actually wrote a quick demo `cat`-like application in C (source code here) with the SUID bit set, but if you get the point, feel free to skip to the next section of this answer and ignore this part. Side note: the OS ignores the SUID bit on interpreted executables with `#!`, so a Python version of this same thing would fail.
Let's check the permissions on the program and on `testfile.txt`:
Looks good: only the file owner and those who belong to the `administrator` group can read this file. Now let's log in as the potato user and try to read the file:
Looks OK: neither the shell nor `cat`, both running with the potato user's permissions, can read the file they're not allowed to read. Notice also who reports the error: `cat` vs `bash`. Let's test our SUID program:
Works as intended! Again, the point made by this little demo is that `prog file.txt` and `prog < file.txt` differ in who opens the file and differ in open file permissions.
How Programs React to STDIN
We already know that `< testfile.txt` rewires stdin in such a way that data comes from the specified file instead of the keyboard. In theory, and based on the Unix philosophy of "doing one thing and doing it well", programs reading from stdin (aka file descriptor 0) should behave consistently, and as such `prog1 | prog2` should be similar to `prog2 file.txt`. But what if `prog2` wants to reposition with the lseek syscall, for example in order to skip to a certain byte or to seek to the end in order to find out how much data there is?
Certain programs disallow reading data from a pipe, since pipes cannot be repositioned with the lseek(2) syscall and their data cannot be mapped into memory with mmap(2) for faster processing. This has been covered by an excellent answer from Stephane Chazelas to this question: What is the difference between “cat file | ./binary” and “./binary < file”? I highly recommend reading that.
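Whether stdin is a pipe or a regular file is something a program can check before deciding whether seeking is possible. The following is a minimal sketch in shell, assuming `/dev/stdin` is available (as on Linux); `stdin_kind` is a hypothetical helper name, and the `-p` and `-f` tests play the role that an fstat(2) check would play in a C program.

```shell
# Report what kind of object is attached to stdin (file descriptor 0).
# Assumes /dev/stdin is available, as on Linux.
stdin_kind() {
  if   [ -t 0 ];          then echo terminal   # isatty(3)-style check
  elif [ -p /dev/stdin ]; then echo pipe       # FIFO: cannot be repositioned
  elif [ -f /dev/stdin ]; then echo file       # regular file: seekable
  else                         echo other
  fi
}

# Hypothetical scratch file for the demonstration.
sample=$(mktemp)
printf 'some data\n' > "$sample"

from_pipe=$(cat "$sample" | stdin_kind)   # prog1 | prog2
from_file=$(stdin_kind < "$sample")       # prog2 < file

echo "cat file | prog  -> $from_pipe"
echo "prog < file      -> $from_file"
rm -f "$sample"
```

A program that needs to seek could refuse to run, or fall back to buffering its input, when it detects the pipe case.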
Luckily, `cat < file.txt` and `cat file.txt` behave consistently, and `cat` is not against pipes in any way, although we know it reads entirely different file descriptors. How does this apply to `prog file.txt` vs `prog < file.txt` in general? If a program really doesn't want to do anything with pipes, the lack of the positional parameter `file.txt` will be enough to exit with an error, but the application can still use `lseek()` on stdin to check whether it is a pipe or not (although isatty(3) or detecting S_ISFIFO mode in fstat(2) are more likely to be used for detecting pipe input), in which case doing something like `./binary <(grep pattern file.txt)` or `./binary < <(grep pattern file.txt)` may not work.
Filetype influence
The file type may influence `prog file` vs `prog < file` behavior, which to some extent implies that as a user of a program you are choosing the syscalls even if you are not aware of doing so. For instance, suppose we have a Unix domain socket and we run an `nc` server to listen on it; maybe we even prepared some data to be served:
In this case, `/tmp/mysocket.sock` will be opened via different syscalls:
Now, let's try to read data from that socket in a different terminal:
Both the shell and `cat` are performing the `open(2)` syscall on what requires an entirely different pair of syscalls: socket(2) and connect(2). Even this doesn't work:
But if we are conscious of the file type and how we can invoke the proper syscall, we can get the desired behavior:
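One way to act on the file type is to test for a socket before choosing how to read. This is a minimal sketch, assuming an `nc` build that supports the `-U` option for Unix domain sockets (as OpenBSD netcat does); `read_path` is a hypothetical helper name.

```shell
# Read from a path using a method that matches its file type.
# Assumes an nc that supports -U (Unix domain sockets), e.g. OpenBSD netcat.
read_path() {
  if [ -S "$1" ]; then
    nc -U "$1"      # socket: needs socket(2) + connect(2), which nc performs
  else
    cat "$1"        # regular file, FIFO, etc.: open(2) is enough
  fi
}
```

With the listener from the example running, `read_path /tmp/mysocket.sock` would take the `nc -U` branch, while `read_path file.txt` would simply use `cat`.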
Notes and other suggested readings:
The quote from the open(2) manual states that permissions on a file descriptor get inherited. In theory, there is a way to change read/write permissions on a file descriptor, but that has to be done at the level of source code.
What is an open file description? See also the POSIX definition.
How does Linux check permission for file descriptor?
Why is the behavior of `command 1>file.txt 2>file.txt` different from `command 1>file.txt 2>&1`?