
Question

Arronical

Asked: 2016-02-05 06:40:30 +0800 CST

Bash - Check directory for files against list of partial file names

772

I have a server which receives a file per client each day into a directory. The filenames are constructed as follows:

uuid_datestring_other-data

For example:

d6f60016-0011-49c4-8fca-e2b3496ad5a7_20160204_023-ERROR

uuid is a standard format uuid.
datestring is the output from date +%Y%m%d.
other-data is variable in length but will never contain an underscore.

I have a file of the format:

#
d6f60016-0011-49c4-8fca-e2b3496ad5a7    client1
d5873483-5b98-4895-ab09-9891d80a13da    client2
be0ed6a6-e73a-4f33-b755-47226ff22401    another_client
...

I need to check that every uuid listed in the file has a corresponding file in the directory, using bash.

I've got this far, but feel like I'm coming from the wrong direction by using an if statement, and that I need to loop through the files in the source directory.

The source_directory and uuid_list variables have been assigned earlier in the script:

# Check the entries in the file list

while read -r uuid name; do
# Ignore comment lines
   [[ $uuid = \#* ]] && continue
   if [[ -f "${source_directory}/${uuid}*" ]]
   then
      echo "File for ${name} has arrived"
   else
      echo "PANIC! - No File for ${name}"
   fi
done < "${uuid_list}"

How should I check that the files in my list exist in the directory? I'd like to use bash functionality as far as possible, but am not against using commands if need be.

5 Answers


choroba · Answer 1 · 2016-02-05T06:56:10+08:00

Best Answer


Walk over the files and build an associative array keyed on the uuids contained in their names (I used parameter expansion to extract the uuid). Then read the list, check the associative array for each uuid, and report whether the file was recorded or not.

#!/bin/bash
uuid_list=...

declare -A file_for
for file in *_*_* ; do
    uuid=${file%%_*}
    file_for[$uuid]=1
done

while read -r uuid name ; do
    [[ $uuid = \#* ]] && continue
    if [[ ${file_for[$uuid]} ]] ; then
        echo "File for $name has arrived."
    else
        echo "File for $name missing!"
    fi
done < "$uuid_list"
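The script above expects to be run from inside the directory that holds the files. As a minimal sketch of the same idea that takes the directory and the list as arguments (the function name check_arrivals and its parameters are illustrative, not part of the answer), something like this should work in bash 4 or later:

```shell
#!/bin/bash
# Sketch: index the UUID prefixes of the files once, then look each listed UUID up.
# check_arrivals and its two arguments are hypothetical names used for illustration.
check_arrivals() {
    local dir=$1 list=$2 path file uuid name
    local -A file_for=()

    for path in "$dir"/*_*_*; do
        [[ -e $path ]] || continue      # the glob matched nothing
        file=${path##*/}                # strip the directory part
        file_for[${file%%_*}]=1         # key: everything before the first underscore
    done

    while read -r uuid name; do
        [[ -z $uuid || $uuid = \#* ]] && continue   # skip blank and comment lines
        if [[ ${file_for[$uuid]:-} ]]; then
            echo "File for $name has arrived."
        else
            echo "File for $name missing!"
        fi
    done < "$list"
}
```

It could be called as check_arrivals "$source_directory" "$uuid_list". The directory is scanned only once, so the cost of each lookup does not grow with the number of files.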

5

terdon · Answer 2 · 2016-02-05T08:39:25+08:00

Here's a more "bashy" and concise approach:

#!/bin/bash

## Read the UUIDs into the array 'uuids'. Using awk
## lets us both skip comments and only keep the UUID
mapfile -t uuids < <(awk '!/^\s*#/{print $1}' uuids.txt)

## Iterate over each UUID
for uuid in "${uuids[@]}"; do
        ## Set the positional parameters ($1, $2 etc., i.e. "$@")
        ## to the glob matching the UUID. This will be all file/directory
        ## names that start with this UUID.
        set -- "${source_directory}"/"${uuid}"*
        ## If no files matched the glob, $1 holds the unexpanded pattern,
        ## and no file with that name will exist
        [[ -e "$1" ]] && echo "YES : $1" || echo "PANIC $uuid"
done

Note that while the above is pretty and will work fine for a few files, its speed depends on the number of UUIDs and will be very slow if you need to process many. If that is the case, either use @choroba's solution or, for something truly fast, avoid the shell and call perl:

#!/bin/bash

source_directory="."
perl -lne 'BEGIN{
            opendir(D,"'"$source_directory"'"); 
            foreach(readdir(D)){ /((.+?)_.*)/; $f{$2}=$1; }
           } 
           s/\s.*//; $f{$_} ? print "YES: $f{$_}" : print "PANIC: $_"' uuids.txt

Just to illustrate the time differences, I tested my bash approach, choroba's and my perl on a file with 20000 UUIDs of which 18001 had a corresponding file name. Note that each test was run by redirecting the script's output to /dev/null.

My bash (~3.5 min)

real   3m39.775s
user   1m26.083s
sys    2m13.400s

Choroba's (bash, ~0.7 sec)

real   0m0.732s
user   0m0.697s
sys    0m0.037s

My perl (~0.1 sec):

real   0m0.100s
user   0m0.093s
sys    0m0.013s

kos · Answer 3 · 2016-02-05T07:13:58+08:00

This is pure Bash (i.e. no external commands), and it's the most concise approach that I can think of.

Performance-wise, though, it is really not much better than what you currently have.

It reads each line from path/to/file; for each line, it stores the first field in $uuid and prints a message if no file matching the pattern path/to/directory/$uuid* is found:

#! /bin/bash
[ -z "$2" ] && printf 'Not enough arguments.\n' >&2 && exit 1

# Keep only the first field of each line in $uuid; the rest goes to $_
while read -r uuid _; do
    [ ! -f "$2/$uuid"* ] && printf '%s missing in %s\n' "$uuid" "$2"
done <"$1"

Call it with path/to/script path/to/file path/to/directory.

Sample output using the sample input file in the question on a test directory hierarchy containing the sample file in the question:

% tree
.
├── path
│   └── to
│       ├── directory
│       │   └── d6f60016-0011-49c4-8fca-e2b3496ad5a7_20160204_023-ERROR
│       └── file
└── script.sh

3 directories, 3 files
% ./script.sh path/to/file path/to/directory
d5873483-5b98-4895-ab09-9891d80a13da* missing in path/to/directory
be0ed6a6-e73a-4f33-b755-47226ff22401* missing in path/to/directory
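One caveat with testing a glob inside [ ... ]: if more than one file starts with the same uuid, the pattern expands to several words and the test fails with an error. A hedged sketch that sidesteps this with bash's compgen -G builtin, which succeeds when a pattern matches at least one path (report_missing is a hypothetical name, and skipping comment lines is an addition), might look like:

```shell
#!/bin/bash
# Sketch: report UUIDs from a list that have no matching file in a directory.
# report_missing is an illustrative name; arguments are the list file and the directory.
report_missing() {
    local list=$1 dir=$2 uuid rest
    while read -r uuid rest; do
        [[ -z $uuid || $uuid = \#* ]] && continue   # skip blank and comment lines
        # compgen -G returns success if the glob matches one or more paths
        compgen -G "$dir/$uuid*" > /dev/null ||
            printf '%s missing in %s\n' "$uuid" "$dir"
    done < "$list"
}
```

Note that compgen treats the whole argument as a pattern, so this assumes the directory path itself contains no glob characters.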

mikeserv · Answer 4 · 2016-02-06T05:47:43+08:00

unset IFS
set -f
set +f -- $(<uuid_file)
while  [ "${1+:}" ]
do     : < "$source_directory/$1"*  &&
       printf 'File for %s has arrived.\n' "$2"
       shift 2
done

The idea here is not to worry about reporting errors the shell will report for you. If you try to < open a file which doesn't exist your shell will complain. In fact, it will prepend your script's $0 and the line number on which the error occurred to the error output when it does... This is good information that is provided by default already - so don't bother.

You also don't need to take the file in line-by-line like that - it can be awfully slow. This expands the whole thing in a single shot out to a white-space delimited array of arguments and it handles two at a time. If your data is consistent with your example, then $1 will always be your uuid and $2 will be your $name. If bash can open a match to your uuid - and only one such match exists - then printf happens. Otherwise it doesn't and the shell writes diagnostics to stderr about why.
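To make the single-shot expansion easier to see in isolation, here is a small sketch that only loads the list into the positional parameters and walks it two words at a time. The function name print_pairs and the output format are illustrative, and it assumes every line holds exactly two whitespace-separated words; a lone # comment line would shift the pairing.

```shell
#!/bin/bash
# Sketch: load a two-column list into the positional parameters and consume it in pairs.
# print_pairs is a hypothetical name; it assumes exactly two words per line.
print_pairs() {
    local list=$1
    local IFS=$' \t\n'      # default field splitting on blanks and newlines
    set -f                  # disable globbing while the file contents are split
    set -- $(<"$list")      # every whitespace-separated word becomes $1, $2, ...
    set +f                  # re-enable globbing
    while [ "$#" -ge 2 ]; do
        printf 'uuid=%s name=%s\n' "$1" "$2"
        shift 2
    done
}
```

Inside a function, set -- replaces only the function's own positional parameters, so the caller's arguments are left alone.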

Sergiy Kolodyazhnyy · Answer 5 · 2016-02-05T08:04:23+08:00

The way I'd approach it is to get the uuids from the file first, then use find:

awk '{print $1}' listfile.txt  | while read fileName;do find /etc -name "$fileName*" -printf "%p FOUND\n" 2> /dev/null;done

For readability:

awk '{print $1}' listfile.txt  | \
    while read fileName;do \
    find /etc -name "$fileName*" -printf "%p FOUND\n" 2> /dev/null;
    done

Example with a list of files in /etc/, looking for the passwd, group, fstab, and THISDONTEXIST filenames:

$ awk '{print $1}' listfile.txt  | while read fileName;do find /etc -name "$fileName*" -printf "%p FOUND\n" 2> /dev/null; done
/etc/pam.d/passwd FOUND
/etc/cron.daily/passwd FOUND
/etc/passwd FOUND
/etc/group FOUND
/etc/iproute2/group FOUND
/etc/fstab FOUND

Since you've mentioned the directory is flat, you could use the -printf "%f\n" option to print just the filename itself.

What this doesn't do is list missing files. find's small disadvantage is that it only reports matches; it doesn't tell you when it finds nothing. What one could do, however, is check the output: if the output is empty, then the file is missing.

awk '{print $1}' listfile.txt  | while read fileName;do RESULT="$(find /etc -name "$fileName*" -printf "%p\n" 2> /dev/null )"; [ -z "$RESULT"  ] && echo "$fileName not found" || echo "$fileName found"  ;done

More readable:

awk '{print $1}' listfile.txt  | \
   while read fileName;do \
   RESULT="$(find /etc -name "$fileName*" -printf "%p\n" 2> /dev/null )"; \
   [ -z "$RESULT"  ] && echo "$fileName not found" || \
   echo "$fileName found"  
   done

And here's how it performs as a small script:

skolodya@ubuntu:$ ./listfiles.sh                                               
passwd found
group found
fstab found
THISDONTEXIST not found

skolodya@ubuntu:$ cat listfiles.sh                                             
#!/bin/bash
awk '{print $1}' listfile.txt  | \
   while read fileName;do \
   RESULT="$(find /etc -name "$fileName*" -printf "%p\n" 2> /dev/null )"; \
   [ -z "$RESULT"  ] && echo "$fileName not found" || \
   echo "$fileName found"  
   done

One could use stat as an alternative, since it's a flat directory, but the code below won't work recursively for subdirectories if you ever decide to add those:

$ awk '{print $1}' listfile.txt  | while read fileName;do  stat /etc/"$fileName"* 1> /dev/null ;done        
stat: cannot stat ‘/etc/THISDONTEXIST*’: No such file or directory

If we take the stat idea and run with it, we can use stat's exit code as an indication of whether a file exists or not. Effectively, we want to do this:

$ awk '{print $1}' listfile.txt  | while read fileName;do  if stat /etc/"$fileName"* &> /dev/null;then echo "$fileName found"; else echo "$fileName NOT found"; fi ;done

Sample run:

skolodya@ubuntu:$ awk '{print $1}' listfile.txt  | \
> while read FILE; do
> if stat /etc/"$FILE" &> /dev/null  ;then
> echo "$FILE found"
> else echo "$FILE NOT found"
> fi
> done
passwd found
group found
fstab found
THISDONTEXIST NOT found
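The examples above search /etc for whole filenames. Adapted to the question's layout, a rough sketch could match on the uuid prefix and stop at the first hit. The function name check_with_find and its arguments are illustrative, and it assumes a find that supports -maxdepth and -quit (GNU find does):

```shell
#!/bin/bash
# Sketch: apply the find-based check to files named uuid_datestring_other-data.
# check_with_find is a hypothetical name; arguments are the directory and the list file.
check_with_find() {
    local dir=$1 list=$2 uuid rest hit
    while read -r uuid rest; do
        [[ -z $uuid || $uuid = \#* ]] && continue   # skip blank and comment lines
        # -maxdepth 1 keeps the search flat; -quit stops after the first match
        hit=$(find "$dir" -maxdepth 1 -type f -name "${uuid}_*" -print -quit 2>/dev/null)
        if [[ -n $hit ]]; then
            echo "$uuid found"
        else
            echo "$uuid NOT found"
        fi
    done < "$list"
}
```

This still starts one find process per uuid, so it is likely to be slow for very long lists.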
