The scenario:
- I have mounted a gcsfuse drive in my Docker container so I can store and read data from Google Cloud Storage
- The Docker container only has a few gigabytes of storage, but I might have terabytes of data in Google Cloud Storage
The question:
Does gcsfuse download all data from Google Cloud Storage, or does it only retrieve files when I try to read them? Conversely, once I write data to the mounted drive, does that data stay stored locally, or does it get sent to GCS and removed from local storage?
The overall concern, in case I'm asking the wrong question
I'm worried that the gcsfuse-mounted drive might cause all of the container's storage to be used up, even though the data is actually stored in Google Cloud Storage. I'm trying to evaluate whether this is a legitimate concern, or whether gcsfuse is built to handle situations like this.
For those of you who are going to tell me to "just read the docs"
Yes, I have tried. If this information is in the documentation, it is buried deep enough or ambiguous enough that it is likely worth raising as a question here.
gcsfuse doesn't download all the data in the bucket. What it does keep on local disk is the entire contents of every file that has been written to but not yet closed. For files that haven't been dirtied, reads are served directly from GCS and don't use local storage.
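
Because files being written are held locally in full until they are closed, the practical risk is a single large write (or many concurrent ones) exceeding the container's free disk. Here is a minimal Python sketch of a pre-write check. It assumes the staging area is the system temp directory (I believe gcsfuse's `--temp-dir` flag can point it elsewhere, in which case pass that path instead); `has_room_for` and `safety_margin` are names made up for this example, not part of gcsfuse.

```python
import shutil
import tempfile


def has_room_for(size_bytes, staging_dir=None, safety_margin=0.1):
    """Return True if staging_dir has enough free space for a write.

    size_bytes    -- expected size of the file about to be written
    staging_dir   -- assumed local staging directory; defaults to the
                     system temp directory (an assumption, adjust as needed)
    safety_margin -- extra headroom as a fraction of size_bytes
    """
    staging_dir = staging_dir or tempfile.gettempdir()
    free = shutil.disk_usage(staging_dir).free
    return size_bytes * (1 + safety_margin) <= free
```

You could call `has_room_for(expected_size)` before opening a file on the mount for writing, and close each file as soon as you are done with it so it stops counting as dirty.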