I have a stream of events in the new CloudWatch EventBridge, and I need to compute some statistics on them from a REST API endpoint backed by a Lambda function (Python + pandas).
What is the best AWS service to store this data so that it's quick to load in a Lambda function?
Here are the biggest cons of the things I've considered:
- CloudWatch Logs is too slow
- CloudWatch Logs Insights only gives you the last 1000 results, without pagination (and is also too slow)
- extended retention only covers the last 7 days
- S3 does not have an "append" operation
- DynamoDB will (probably) consume too much write capacity
- Elasticsearch seems a bit overkill (and requires a running cluster, so it is not serverless)
Kinesis Data Streams looks like a good match for this use case. If you only want to extract statistics from your event streams, and you can use either SQL or Java, you can also try Kinesis Data Analytics.
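As a rough illustration of the Kinesis Data Streams route, here is a minimal sketch of how a Lambda function might pull events back out of one shard and count them. It assumes the stream is fed by an EventBridge rule, so each record's `Data` is the JSON event and carries a `detail-type` field. The function names, the `max_batches` cap, and the choice to start from `TRIM_HORIZON` are all illustrative assumptions, not something from your setup.

```python
import json
from collections import Counter


def read_shard_events(kinesis, stream_name, shard_id, max_batches=10):
    """Read and JSON-decode records from one shard, oldest first.

    `kinesis` is expected to behave like boto3.client("kinesis").
    `max_batches` is a hypothetical safety cap so the Lambda cannot loop forever.
    """
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
    )["ShardIterator"]

    events = []
    for _ in range(max_batches):
        if iterator is None:  # the shard has been closed and fully read
            break
        response = kinesis.get_records(ShardIterator=iterator, Limit=1000)
        # Assumption: every record payload is a JSON-encoded event.
        events.extend(json.loads(record["Data"]) for record in response["Records"])
        if response.get("MillisBehindLatest", 0) == 0:  # caught up with the tip
            break
        iterator = response.get("NextShardIterator")
    return events


def count_by_detail_type(events):
    """Count events per `detail-type` (assumed to be present on each event)."""
    return Counter(event.get("detail-type", "unknown") for event in events)
```

In the Lambda you would pass `boto3.client("kinesis")` as `kinesis`, and since `events` is a plain list of dicts it can be handed to `pandas.DataFrame(events)` if you would rather compute the statistics there. A stream with several shards would need one such read per shard (see `list_shards`).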