We have a bucket with more than 500,000 objects in it.
I'm assigned a job where I have to delete files that have a specific prefix. There are around 300,000 files with the given prefix in the bucket.
For example, if there are 3 files:
abc_1file.txt
abc_2file.txt
abc_1newfile.txt
I have to delete only the files with the abc_1 prefix. I didn't find much in the AWS documentation about this.
Any suggestions on how I can automate this?
You can use the aws s3 rm command with the --include and --exclude parameters to specify a pattern for the files you'd like to delete. So in your case, the command would be:
aws s3 rm s3://bucket/ --recursive --exclude "*" --include "abc_1*"
which will delete all files that match the "abc_1*" pattern in the bucket.
The behavior of these parameters is documented here.
These instructions assume you have downloaded, installed, and configured the AWS CLI tools.
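If you'd rather script this than use the CLI (e.g. from a Python program), the same idea can be sketched with boto3. The bucket name and the boto3 calls below are assumptions shown in comments only; the filtering and batching helpers are plain Python and reflect the fact that delete_objects accepts at most 1000 keys per request.

```python
import fnmatch

def matching_keys(keys, pattern):
    """Return the keys that match a shell-style pattern, e.g. 'abc_1*'."""
    return [k for k in keys if fnmatch.fnmatch(k, pattern)]

def batches(keys, size=1000):
    """Split keys into chunks of at most `size` (the delete_objects limit)."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]

# Demo on the three files from the question:
keys = ["abc_1file.txt", "abc_2file.txt", "abc_1newfile.txt"]
print(matching_keys(keys, "abc_1*"))  # ['abc_1file.txt', 'abc_1newfile.txt']

# With boto3 it might look like this (hypothetical bucket name, not run here):
# s3 = boto3.client("s3")
# paginator = s3.get_paginator("list_objects_v2")
# all_keys = []
# for page in paginator.paginate(Bucket="bucket", Prefix="abc_1"):
#     all_keys += [obj["Key"] for obj in page.get("Contents", [])]
# for batch in batches(all_keys):
#     s3.delete_objects(Bucket="bucket",
#                       Delete={"Objects": [{"Key": k} for k in batch]})
```

Listing with Prefix="abc_1" already narrows the server-side scan, so fnmatch is only needed if your pattern is more complex than a plain prefix.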
As a complement to @sippybear's excellent answer, I would recommend the following if somebody has a bucket with a trillion objects and the pattern of the files one wants to delete includes "parent directories", e.g. 'my/path/to/topdir/abc_1*':
aws s3 rm s3://bucket/my/path/to/topdir/ --recursive --dryrun --exclude "*" --include "abc_1*"
Why?
Putting the path in the S3 URI means the command only has to list the objects under that prefix, rather than iterate over every object in the bucket.
--dryrun shows what would be deleted without actually deleting anything. Always run the command with --dryrun first, even if you promptly interrupt it (ctrl-C); typos and other accidents happen, and errors when deleting large numbers of files in a bucket can be very regrettable (even if you have proper backups)... Once you're happy with what you see is about to be deleted, then remove the --dryrun.