Given a file, I need to create a copy that is padded with zeros to a specific size.
Create a test file as follows.
echo test >testfile
The output of the following command is inconsistent.
cat testfile /dev/zero | dd bs=256k count=1 status=none | od -c
This is the output that I would expect.
0000000 t e s t \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000020 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
1000000
But you also randomly get either of the following.
0000000 t e s t \n
0000005
0000000 t e s t \n \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000020 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0400000 \0 \0 \0 \0 \0
0400005
Why does this command have inconsistent behavior?
Even if dd is cutting the pipe off at the end of the first file, the 128k result is strange: 0400005 octal is 131077 bytes, which is 128 KiB plus the 5 bytes of testfile. I get the same inconsistent results on Ubuntu 16.04, 18.04 and 19.04 systems.
You need to specify full blocks. Try adding iflag=fullblock to the dd command.
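Applied to the command from the question, a minimal sketch could look like the following. It assumes GNU dd (iflag=fullblock is a GNU extension) and the testfile created in the question.

```shell
# Recreate the 5-byte test file from the question.
echo test >testfile

# iflag=fullblock makes dd keep reading until it has accumulated a full
# 256 KiB input block, instead of treating a short read as the whole block.
cat testfile /dev/zero | dd bs=256k count=1 iflag=fullblock status=none | od -c
```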
Documentation
From man dd: fullblock is an input flag (iflag only) that accumulates full blocks of input.
Example
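One way to see the difference is to repeat the pipeline a few times and count the bytes that come out. This is only a sketch: it assumes GNU dd, the testfile from the question, and an arbitrary choice of 5 repetitions. The counts without fullblock depend on timing and may differ on your system.

```shell
echo test >testfile

# Without fullblock: a single read() on the pipe may return early,
# so the byte count can vary between runs.
for i in 1 2 3 4 5; do
    cat testfile /dev/zero | dd bs=256k count=1 status=none | wc -c
done

# With fullblock: dd accumulates a full 262144-byte block every time.
for i in 1 2 3 4 5; do
    cat testfile /dev/zero | dd bs=256k count=1 iflag=fullblock status=none | wc -c
done
```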
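Observe that, without fullblock, the byte counts are inconsistent from run to run. With iflag=fullblock, I see consistent full byte counts.
The core of the issue is two-fold. One part of the problem is a short (partial) read(). Per the POSIX specification, read() may return fewer bytes than requested, for example when the file is a pipe or FIFO and fewer bytes are immediately available. This is typical with pipes, and that's exactly what's happening in the question. One solution is to use the GNU extension iflag=fullblock, which works here because GNU dd is the version Ubuntu uses. The GNU dd manual explains that with this flag dd keeps calling read() until the block is filled, so that count= is interpreted as a block count rather than a count of read operations.
POSIX dd, MirOS dd, and FreeBSD dd did not have such an option at the time of writing (although there were requests to add it to the POSIX spec). So how do we write portable scripts with dd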
that you may want to port from Ubuntu to, say, FreeBSD? Well, part of the issue is the count=1 flag. It tells dd how many input blocks to copy, and without fullblock a block is whatever a single read() call returns, however short. Try running several traces of
dd if=/dev/urandom | strace -e read dd of=/dev/null bs=256k count=1
and you will see there's always only one read(), which is often partial. (Note also, don't be surprised if you see 262144 bytes read instead of 256,000, because 256k is 256*1024=262144.)
The solution is to flip the parameters, that is, make the block size bs=1 and the count count=256k. That way we ensure there are no partial reads: we always read 1 byte, but we do that 256k times. And yes, this is a lot slower and will take a lot longer with data in the range of gigabytes or terabytes. In my tests, iflag=fullblock was about 100 times faster (the difference between 5 milliseconds and 700 milliseconds on the 256k bytes). However, the advantage is that this is portable and doesn't rely on a GNU dd extension, which matters especially since you cannot always install GNU dd.
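A rough sketch of the portable variant, applied to the testfile from the question. The output name padded is a hypothetical placeholder. The count is spelled out as 262144 (256*1024) rather than 256k in case a dd implementation does not accept a multiplier suffix on count, and dd's transfer statistics are discarded via stderr instead of status=none, which POSIX does not define.

```shell
echo test >testfile

# bs=1 means every block is exactly one byte, so a short read cannot
# truncate a block; count=262144 copies that many 1-byte blocks.
# This is slow, but it does not need iflag=fullblock.
cat testfile /dev/zero | dd bs=1 count=262144 2>/dev/null >padded

# Check the result: the size should be 262144 bytes.
wc -c <padded
```

For small targets like this the extra time is tolerable; for much larger sizes the per-byte read and write calls dominate, which is the slowdown described above.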