I'm looking to make a "Drop Folder" on a Windows shared drive that is accessible to everyone. I'd like files to be deleted automagically if they sit in the folder for more than X days.
However, it seems that all the methods I've found to do this use the last modified date, last access time, or creation date of a file.
I'm trying to make this a folder that a user can drop files into to share with somebody. If someone copies or moves files in here, I'd like the clock to start ticking at that point. However, the last modified date and creation date of a file are not updated unless someone actually modifies the file, and the last access time is updated too frequently... it seems that just opening a directory in Windows Explorer will update the last access time.
Anyone know of a solution to this? I'm thinking that cataloging the hashes of files on a daily basis and then expiring files whose hashes were first seen before a certain date might be a solution... but taking hashes of files can be time-consuming.
Any ideas would be greatly appreciated!
Note:
I've already looked at quite a lot of answers on here... looked into File Server Resource Manager, PowerShell scripts, batch scripts, etc. They still use the last access time, last modified time, or creation time... which, as described, do not fit the above needs.
We used a combination of a PowerShell script and a policy. The policy specifies that the user must create a folder inside the Drop_Zone share and then copy whatever files they want into that folder. When the folder gets to be 7 days old (using CreationTime), the PowerShell script deletes it.
I also added some logging to the PowerShell script so we could verify its operation, and turned on shadow copies just to save the completely inept from themselves.
Here is the script without all the logging stuff.
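A minimal sketch of that script (the share path is an assumption, and the real version also logs each deletion):

```powershell
# Delete any user-created folder in the drop zone whose CreationTime is 7+ days old.
$DropZone = '\\server\Drop_Zone'   # assumed share path
$Limit    = (Get-Date).AddDays(-7)

Get-ChildItem -Path $DropZone -Directory |
    Where-Object { $_.CreationTime -lt $Limit } |
    Remove-Item -Recurse -Force
```

Because the policy makes users copy files into a fresh folder, the folder's CreationTime really does reflect the drop date, sidestepping the timestamp-preservation problem for the files themselves.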
If you can assume NTFS, you could write a key (a GUID) into an alternate data stream of the file, plus the date - so you could basically store the database in the files themselves.
More information can be found at
http://blogs.technet.com/b/askcore/archive/2013/03/24/alternate-data-streams-in-ntfs.aspx
Basically, you can store additional content in a separate stream that is addressed by a special name appended to the file name.
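For example (the stream name, file path, and date format here are arbitrary choices), PowerShell 3.0+ can read and write alternate data streams directly with the -Stream parameter:

```powershell
# Stamp the arrival date into an alternate data stream named 'DropDate'.
$file = 'D:\Drop_Zone\report.docx'   # hypothetical file
Set-Content -Path $file -Stream DropDate -Value (Get-Date -Format 'yyyy-MM-dd')

# Later, read the stamp back and decide whether the file has expired.
$stamp = [datetime](Get-Content -Path $file -Stream DropDate)
if ($stamp -lt (Get-Date).AddDays(-7)) {
    Remove-Item -Path $file
}
```

A nightly job would stamp any file that lacks the stream and delete any file whose stamp is older than X days; the stream travels with the file on NTFS, so no external database is needed.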
You could use IO.FileSystemWatcher, which allows you to "watch" a folder for new files created. Here are the pieces you'd need to make this work.
These variables configure the path to watch and a filter to fine-tune which files to track:
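Something along these lines (the path and filter values are placeholders):

```powershell
$folder = 'D:\Drop_Zone'   # folder to watch (assumed path)
$filter = '*.*'            # wildcard filter for which files to track

$fsw = New-Object IO.FileSystemWatcher $folder, $filter -Property @{
    IncludeSubdirectories = $false
    NotifyFilter          = [IO.NotifyFilters]'FileName, LastWrite'
    EnableRaisingEvents   = $true
}
```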
This sets up the parameters for the folder to watch and the actions to perform when the event occurs. Basically this resets the LastWriteTime on each file as it's written:
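A sketch of that registration (the source identifier FileCreated is an arbitrary name):

```powershell
# Assumes $fsw is the IO.FileSystemWatcher configured for the drop folder.
Register-ObjectEvent -InputObject $fsw -EventName Created -SourceIdentifier FileCreated -Action {
    # Re-stamp the file the moment it lands, so the retention clock starts now.
    (Get-Item $Event.SourceEventArgs.FullPath).LastWriteTime = Get-Date
}
```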
The event can be unregistered if needed using this:
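Assuming the subscription was registered under the source identifier FileCreated (an arbitrary name chosen at registration time):

```powershell
# Stop watching the folder and remove the event subscription.
Unregister-Event -SourceIdentifier FileCreated
```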
Finally, you can run this once a day in order to clean up the old files:
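A sketch of that daily cleanup (path and retention period are assumptions):

```powershell
# Remove files that haven't been dropped or modified in the last 7 days.
Get-ChildItem 'D:\Drop_Zone' -Recurse |
    Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime -lt (Get-Date).AddDays(-7) } |
    Remove-Item -Force
```

Because the watcher resets LastWriteTime whenever a file arrives, the modified date now genuinely means "dropped or last edited", so it is safe to expire on.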
That should be everything you need...
It's been a while, but I set up a relatively straightforward method for addressing this.
I would touch any file added to the drop directory (detected via a resource monitoring utility), setting its last modified date to the date it was added to the folder.
I could then use the last modified date to purge any files that need to be aged off. This also has the advantage that if someone really does update the file it'll reset the countdown.
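The "touch" itself is a one-liner in PowerShell (the path is hypothetical):

```powershell
# Reset the modified date of a newly dropped file to its arrival time.
(Get-Item 'D:\Drop_Zone\somefile.txt').LastWriteTime = Get-Date
```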
There is no way to rely on the dates for when a file was copied or moved into a folder - Windows manages to preserve the original timestamps across filesystems, drives, network shares, etc. You might be able to work something out with a Linux file server, or prevent people from directly copying files by using FTP or a web-based upload system.
If you are ok with people not being able to modify the files after they upload, you could have separate upload and access folders, and a script that moves files between them and re-dates them. But it sounds like you want people to be able to modify the files directly.
So a simple, if somewhat hacky, solution would be to mess with the dates. I would write two scripts:
Hourly Date Changer script
Have a script run once an hour or so, in your preferred language, that pushes the modified date of every file newer than a threshold exactly 20 years into the past.
In powershell, it would look something like this:
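A sketch along those lines (the folder path is an assumption):

```powershell
$dropFolder = 'D:\Drop_Zone'               # assumed path
$before     = (Get-Date).AddDays(-365*19)  # files "newer" than this haven't been shifted yet

Get-ChildItem $dropFolder -Recurse |
    Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime -gt $before } |
    ForEach-Object { $_.LastWriteTime = (Get-Date).AddDays(-365*20) }
```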
Running this script today (May 27) sets the modified date of all files to June 1st, 1994 - exactly 365*20 days ago. Because it changes only files newer than the $before value, it won't touch files it has already dated into the past.
Cleanup Script
The cleanup script would run every night and delete anything whose modified date is too far in the past. I won't write the script for this part - there are plenty of utilities that can handle deleting files older than a specified date; choose whichever you like. The important part is to look for files that are 7300+X days old, where X is the number of days you want to keep a file after it was last modified.
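That said, with X = 7 and the same assumed drop folder as above, the nightly job could be as simple as:

```powershell
# 7300 days accounts for the 20-year shift; 7 days is the real retention window.
$cutoff = (Get-Date).AddDays(-(7300 + 7))

Get-ChildItem 'D:\Drop_Zone' -Recurse |
    Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime -lt $cutoff } |
    Remove-Item -Force
```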
Advantages
This approach has a few advantages over the other answers here.
The only problem I can see is if people copy a file that was last modified 20 years ago to the drop folder. I think in most scenarios, that's unlikely to be much of an issue, but it could come up.
You could formalise the adding of files to the drop box through a web page with an "upload" IFRAME. The user would then "post" the file, which invokes a PHP/ASP job on the server that takes the file and places it in the proper location. The PHP/ASP code could do any number of indexing/analysis operations along the way.
I would create a script that runs as a scheduled task every five minutes and does two things.
There's an existing mechanism to mark files, the Archive bit. It's been there since the early days of DOS, and is present on both FAT and NTFS.
Basically, every file will have its archive bit set by default. If you see a file with the archive bit in your drop folder, (1) clear that bit and (2) set its date to today. If you see a file without that bit and with a date more than 7 days in the past, delete it.
If a user writes to the file while it's in the drop folder, its archive bit is set again so its lifetime is also reset to 7 days. It's in effect a new file, after all.
You can now safely use FileSystemWatcher. Any issues it has (such as duplicate events, or a buffer overflow losing notifications) no longer matter, as the relevant information is all in the file metadata.
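The five-minute task described above might be sketched like this (the folder path is an assumption; the 7-day retention is from the answer):

```powershell
$drop = 'D:\Drop_Zone'   # assumed path

Get-ChildItem $drop -Recurse | Where-Object { -not $_.PSIsContainer } | ForEach-Object {
    if ($_.Attributes -band [IO.FileAttributes]::Archive) {
        # New or rewritten file: clear the archive bit and start the 7-day clock.
        $_.Attributes = $_.Attributes -bxor [IO.FileAttributes]::Archive
        $_.LastWriteTime = Get-Date
    }
    elseif ($_.LastWriteTime -lt (Get-Date).AddDays(-7)) {
        # Already marked, and its clock has run out.
        Remove-Item $_.FullName -Force
    }
}
```

The -bxor is safe here because the branch only runs when the Archive bit is known to be set, so the xor always clears it rather than toggling it on.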