I have a directory full of files.
I want to initially pass each of those files through a command, and send the output to another file in a different directory as follows:
cat dir1/sourcefile | process.py > dir2/destfile
The name of "destfile" is unimportant; it can be any filename.
Easy enough. However, new files are being added to dir1 all the time, and existing files are occasionally modified.
How can I write a bash script (or another type of script) that will keep an eye on dir1 and, whenever a file is added or modified, process or re-process it into dir2?
with a little Google-magic, found this
You don't specify which OS or distro you're using, but under Ubuntu, the inotify-tools package provides inotifywait and inotifywatch.
so, for your use, you'd want something more like:
(sorry, my bash fu is weak tonight)
if you're not creating files rapidly, you could probably trim out the inner loop...
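For what it's worth, here is a rough sketch of the same idea driven from Python instead of a bash loop, so treat it as a swapped-in variant. It assumes inotifywait (from inotify-tools) and process.py are both on your PATH. The helper names process_file and watch, and the choice to reuse the source file name in dir2, are my own. It listens for close_write and moved_to rather than create/modify, so a file is only picked up once it has been fully written.

```python
import subprocess
import sys
from pathlib import Path


def process_file(src, dst, command=("process.py",)):
    """Run `command` with src on stdin and write its stdout to dst."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        subprocess.run(list(command), stdin=fin, stdout=fout, check=True)


def watch(src_dir="dir1", dst_dir="dir2", command=("process.py",)):
    """Re-process each file that inotifywait reports in src_dir."""
    # -m: keep monitoring forever; -q: suppress the startup banner;
    # --format %f: print only the name of the affected file, one per line.
    watcher = subprocess.Popen(
        ["inotifywait", "-m", "-q",
         "-e", "close_write", "-e", "moved_to",
         "--format", "%f", src_dir],
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in watcher.stdout:
        name = line.rstrip("\n")
        if not name:
            continue
        try:
            # Assumption: the output keeps the source file's name.
            process_file(Path(src_dir) / name, Path(dst_dir) / name, command)
        except (OSError, subprocess.CalledProcessError) as exc:
            # Keep watching even if one file fails to process.
            print(f"failed on {name}: {exc}", file=sys.stderr)
```

Calling watch("dir1", "dir2") blocks and re-processes files as events arrive. Files already sitting in dir1 when it starts produce no event, so they still need one initial pass.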
In Linux you can use inotify to get events when a directory or a file changes. Unfortunately, there is no command-line utility that supports this for bash scripts... at least none that I'm aware of.
However, there is a Python binding for the inotify API, PyInotify. Since you're already using Python for your processing utility, perhaps it would suit you.
As KFro suggested, the most elegant way would be with PyInotify.
But a brute-force way to do it would be to write a Python script that uses os.walk to visit all the files and keeps track, in a dictionary, of which files it has already seen. Then sleep for a while using time.sleep() and run os.walk again, picking out the files that are not yet in the dictionary and updating it accordingly.
To keep track of files that have been updated, you can use os.stat() to get each file's last-modified timestamp (st_mtime) and store that in the dictionary. In fact, that's all you really need in the dictionary: the full filename (including path) as the key, and the timestamp as the value.
Not nearly as elegant as PyInotify, but it should work anywhere Python works.
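A minimal sketch of that polling loop might look like the following. It assumes process.py is executable and on your PATH. The function names scan_once and watch, the 5-second interval, and the trick of flattening sub-directory paths into the output file name are all just illustrative choices (the question says the destination name doesn't matter).

```python
import os
import subprocess
import time


def scan_once(src_dir, dst_dir, seen, command=("process.py",)):
    """Process files under src_dir that are new or whose mtime has changed.

    `seen` maps full source path -> modification time at last processing.
    Returns the list of source paths processed during this pass.
    """
    processed = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            try:
                mtime = os.stat(src).st_mtime
            except OSError:
                continue  # file disappeared between the walk and the stat
            if seen.get(src) == mtime:
                continue  # unchanged since we last processed it
            # Assumption: flatten any sub-directory path into one file name.
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dst_dir, rel.replace(os.sep, "_"))
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                subprocess.run(list(command), stdin=fin, stdout=fout, check=True)
            seen[src] = mtime
            processed.append(src)
    return processed


def watch(src_dir="dir1", dst_dir="dir2", interval=5.0):
    """Poll src_dir forever, sleeping `interval` seconds between passes."""
    seen = {}
    while True:
        scan_once(src_dir, dst_dir, seen)
        time.sleep(interval)
```

Calling watch("dir1", "dir2") runs until interrupted. A file caught halfway through being written will be processed in its partial state, then processed again on a later pass once its timestamp changes.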