I'm using Amazon S3 as storage for users' profile pictures. I see that many websites generate long random filenames and put them all into the same root directory, like:
http://xxx.us-east-1.amazonaws.com/aHR0cHM6Ly9mYmNkbi1wcm9maWxlLWEuYWthbWFpaGQubmV0L2hwcm9maWxlLWFrLWFzaDIvMjczMzkxXzEwMDAwMDMxMjAxMzg5OV81NTk3MjM4Mzdfbi5qcGc.jpg
My question is: what are the pros and cons of that approach?
If I place them into different directories instead, what problems will I have in the future?
http://xxx.us-east-1.amazonaws.com/users/id/username.jpg
or
http://xxx.us-east-1.amazonaws.com/users/id/random_number.jpg
Thanks!
Since you are using S3, the number of files should not be an issue. However, consider what happens when you need to look up a single file manually. Listing a gazillion files in your browser won't be fun.
So for this case, you should have some kind of "human-browsable" tree structure whose final subdirectories each contain a reasonable number of files.
I'd recommend either expanding and splitting the id (assuming it is numeric) or prefix-splitting the username.
ID example:
Username example:
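A minimal sketch of both splitting strategies in Python. The function names, the `users/` prefix, the nine-digit zero padding, the group size of three, and the two-character username prefix are all illustrative assumptions, not anything S3 requires:

```python
def id_key(user_id: int, filename: str = "profile.jpg",
           width: int = 9, group: int = 3) -> str:
    """Zero-pad a numeric id and split it into fixed-size groups."""
    padded = str(user_id).zfill(width)
    parts = [padded[i:i + group] for i in range(0, len(padded), group)]
    return "users/" + "/".join(parts) + "/" + filename


def username_key(username: str, depth: int = 2, ext: str = "jpg") -> str:
    """Use the first `depth` characters of the username as nested prefixes."""
    name = username.lower()
    prefixes = list(name[:depth])
    return "users/" + "/".join(prefixes + [f"{name}.{ext}"])


print(id_key(12345))          # users/000/012/345/profile.jpg
print(username_key("john"))   # users/j/o/john.jpg
```

With these assumed defaults, each final "directory" in the id scheme corresponds to exactly one user, and each intermediate level holds at most 1000 entries, which keeps manual browsing manageable.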
In any case, most strategies invented for storage structure design try to tackle issues that you simply don't have in S3: number of files per directory, sharding across storage servers, and so on.
Edit: The long file names you described are often chosen for "security" reasons: as long as you don't use an algorithm that derives the name from username + id or similar, any relation between the file and a specific user is concealed (given only the file name). Again: use some kind of subdirectory strategy (for the reason given above).
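As a rough sketch of combining both ideas, the following generates a name that is not derived from the user's data and still places it under short prefixes. The function name `random_key`, the `users/` prefix, and the defaults are hypothetical; it also assumes your application stores the resulting key alongside the user record, since the key cannot be recomputed from the username or id:

```python
import secrets


def random_key(ext: str = "jpg", nbytes: int = 16, depth: int = 2) -> str:
    """Build an unguessable key, nested under its own leading characters."""
    token = secrets.token_hex(nbytes)        # 2 * nbytes lowercase hex characters
    prefixes = [token[i] for i in range(depth)]
    return "/".join(["users", *prefixes, f"{token}.{ext}"])


key = random_key()
print(key)  # e.g. users/<c1>/<c2>/<32 hex chars>.jpg, different on every call
```

Using the token's own leading characters as prefixes means the path reveals nothing beyond what the file name already does.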
It depends on how many images you are going to store. If your application uses millions of images, you are better off moving them to a separate server to balance the load. You can also divide the images by type of user profile, placing the profile pictures into categories. At the end of the day, all you need to know is how well your server is going to load-balance the requests. This is just a theoretical assumption; knowing the hardware specification and the number of pictures would allow a more concrete answer.