I want to use the AWS S3 cli to copy a full directory structure to an S3 bucket.
So far, everything I've tried copies the files to the bucket, but the directory structure is collapsed (in other words, each file ends up in the root directory of the bucket).
The command I use is:
aws s3 cp --recursive ./logdata/ s3://bucketname/
I've also tried leaving off the trailing slash on my source designation (i.e., the copy-from argument), and I've used a wildcard to designate all files ... everything I try simply copies the log files into the root directory of the bucket.
I believe sync is the method you want. Try this instead:
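The exact command from this answer isn't shown here; a sketch of a sync invocation, using the same source directory and bucket as in the question, would be:

aws s3 sync ./logdata s3://bucketname/logdata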
The following worked for me:
aws s3 cp ~/this_directory s3://bucketname/this_directory --recursive
AWS will then "make" this_directory and copy all of the local contents into it.

I had faced this error while using either of these commands.
I even thought of mounting the S3 bucket locally and then running rsync; even that failed (or hung for a few hours), as I have thousands of files.
Finally, s3cmd worked like a charm.
It not only does the job well and shows quite verbose output on the console, but it also uploads big files in parts.
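A typical s3cmd invocation for this use case would look something like the following (the exact command isn't shown above, so the paths and bucket name are placeholders):

s3cmd sync ./logdata/ s3://bucketname/logdata/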
(Improving the solution of Shishir)

Save the following script in a file, e.g. s3Copy.sh:
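The script body isn't reproduced above; a sketch of what it could look like, given the usage described below (first argument: source directory, second argument: S3 bucket path), is:

#!/bin/bash
sourceDir="${1%/}"   # local source directory, trailing slash stripped
s3Dir="${2%/}"       # s3://bucket[/prefix] destination, trailing slash stripped

for entry in "$sourceDir"/*; do
  name="${entry##*/}"
  if [[ -d "$entry" ]]; then
    # sub-directory: copy recursively under its own name so structure is kept
    aws s3 cp --recursive "$entry" "$s3Dir/$name/"
  elif [[ -f "$entry" ]]; then
    # top-level file: copy straight into the bucket path
    aws s3 cp "$entry" "$s3Dir/"
  fi
done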
Then run it like this:

/PATH/TO/s3Copy.sh /PATH/TO/ROOT/DIR/OF/SOURCE/FILESandDIRS /PATH/OF/S3/BUCKET
For example, if s3Copy.sh is stored in the home directory and I want to copy all the files and directories located in the current directory, then I run this:

~/s3Copy.sh . s3://XXX/myBucket

You can easily modify the script to allow for other arguments of s3 cp, such as --include, --exclude, ...

Use the following script for copying folder structure:
I couldn't get s3 sync or s3 cp to work on a 55 GB folder with thousands of files and over 2 dozen subdirectories inside. Trying to sync the whole folder would just cause awscli to fail silently without uploading anything to the bucket.

Ended up doing this to first sync all subdirectories and their contents (folder structure is preserved):
Then I did this to get the 30,000 files in the top level:
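Again a sketch with a placeholder bucket name, this time copying only the top-level files:

find . -mindepth 1 -maxdepth 1 -type f | cut -c 3- | while read file; do aws s3 cp "$file" "s3://bucketname/"; done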
Make sure to watch the load on the server (protip: you can use w to just show the load) and use ctrl-z to suspend the command if the load gets too high (fg to continue it again).

Putting this here in case it helps anyone in a similar situation.
Notes:

-mindepth 1 excludes . (the source directory itself).
-maxdepth 1 prevents find from listing contents of sub-directories, since s3 sync handles those successfully.
cut -c 3- removes the "./" from the beginning of each result from find.

Alternatively, you could also try the MinIO client, aka mc.
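For example, assuming an mc alias named s3 has already been configured for your S3 account (the paths here are placeholders):

mc cp --recursive ./logdata/ s3/bucketname/logdata/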
Hope it helps.
PS: I am one of the contributors to the project.
This works for me:

aws s3 sync mydir s3://rahuls-bucket/mydir