I read the following in the docs for App Engine for Java:
standard: Most cost-effective for applications that have significant periods where they are not serving traffic.
flexible: No free tier. Application always has a minimum number of running instances. Most cost-effective for applications that serve traffic continuously.
What does "serve traffic continuously" mean? I'm asking because my WAR will not serve traffic continuously, but it will need to "stay up" all the time because of caching.
Also, for flexible, I saw the following pricing:
vCPU per core hour $0.0526
Memory per GB hour $0.0071
Persistent disk per GB per month $0.0400
Does the above mean I should multiply each hourly rate by 24 * 31 (a full month of uptime) to work out how much I'm going to pay?
For example:
2 vCPUs = 0.0526 * 2 (CPUs) * 24 (hours) * 31 (days in a month) ≈ $78.27
2 GB memory = 0.0071 * 2 (GB) * 24 (hours) * 31 (days in a month) ≈ $10.56
1 GB disk = 0.04 * 1 (GB) * 1 (month) = $0.04
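The same arithmetic can be written as a small Python sketch, which makes it easier to try other instance sizes. It assumes the three rates quoted above, a 31-day month, and a single instance that runs the whole time; the function name `monthly_cost` is only illustrative, and real bills may include other charges (network traffic, for example) that are not modelled here.

```python
# Rates as quoted in the question (USD). Treat them as assumptions:
# actual pricing can differ by region and may change over time.
VCPU_PER_CORE_HOUR = 0.0526
MEMORY_PER_GB_HOUR = 0.0071
DISK_PER_GB_MONTH = 0.0400


def monthly_cost(vcpus, memory_gb, disk_gb, days=31):
    """Rough monthly cost of one always-on instance (illustrative helper)."""
    hours = 24 * days
    cpu = VCPU_PER_CORE_HOUR * vcpus * hours
    memory = MEMORY_PER_GB_HOUR * memory_gb * hours
    disk = DISK_PER_GB_MONTH * disk_gb  # already a per-month rate
    return {"cpu": cpu, "memory": memory, "disk": disk,
            "total": cpu + memory + disk}


estimate = monthly_cost(vcpus=2, memory_gb=2, disk_gb=1)
for item, usd in estimate.items():
    print(f"{item}: ${usd:.2f}")
```

For 2 vCPUs, 2 GB of memory and 1 GB of disk this gives a total of roughly $88.87 per month under those assumptions.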
A general question: does Google Cloud also offer a VPS-like machine where I can install whatever I want? If so, where can I read about it?
Only you can do your capacity planning. Make a rough, order-of-magnitude estimate of the expected resource consumption, based on observations of this app in use. Set a spending limit on the project so that going over budget does not turn into a surprisingly large bill.
For the standard environment the scaling unit is a container; for the flexible environment the scaling unit is a VM instance. As the documentation says, flexible has at least one (billable) instance running all the time, because it takes too long to autoscale from zero. Your capacity planning process helps determine which of these is appropriate and how many you need.
GCP App Engine is a managed service that assumes you do not want a shell. While you can get a shell on a flexible environment instance, you can also abandon App Engine and run your own environment on unmanaged Google Compute Engine instances.