I have data in Amazon S3 as Parquet files, and it is also registered in the AWS Glue Data Catalog. I want to build an AppSync API on top of this data. The two options I am considering are:
1. Load the data into Amazon DynamoDB and build my API on top of that table.
2. Add a Lambda function that resolves AppSync requests by running Amazon Athena queries.
Which of the two approaches will be more cost-effective?
I would really appreciate some back of the envelope estimates too.
Note: I only expect to make read queries. Thanks.
Overall, I would think that if the data fits in Amazon DynamoDB and your access patterns can be served with Query operations (not Scan), DynamoDB would be somewhat more cost-effective. But it all depends on the size of the data and how often it changes.
On relatively static data, Athena could be cheaper under heavy load when the data is processed via Glue, and the Lambda costs are quite small. DynamoDB could become expensive under heavy read and/or write load.
It really depends on the data size and the number of requests. If latency isn't an issue for this use case, Amazon Athena might be a good solution, provided you partition the data well enough that each query scans only what it needs. DynamoDB is a key-value database, so it really depends on the use case: you might not be able to retrieve the relevant data with the access patterns it supports.
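Since the question asks for back-of-the-envelope numbers, here is a rough sketch of how one might compare the two options for a read-only workload. Everything in it is an assumption: the function names are made up for illustration, and the price constants are placeholders in the ballpark of published us-east-1 list prices that I have not checked against the current pricing pages, so substitute the real figures for your region. It also ignores AppSync charges (the same in both options), the one-time cost of loading data into DynamoDB, S3 storage and request costs, and free tiers.

```python
import math

# ASSUMED placeholder prices (USD); verify against the AWS pricing pages.
ATHENA_USD_PER_TB_SCANNED = 5.00
ATHENA_MIN_BYTES_PER_QUERY = 10 * 1024**2   # assumed 10 MB billing minimum per query
DDB_USD_PER_MILLION_RRU = 0.25              # on-demand read request units
DDB_USD_PER_GB_MONTH = 0.25                 # table storage
LAMBDA_USD_PER_MILLION_REQUESTS = 0.20
LAMBDA_USD_PER_GB_SECOND = 0.0000166667

BYTES_PER_TB = 1024**4                      # assumption: binary terabytes


def athena_monthly_cost(queries, bytes_scanned_per_query,
                        lambda_seconds=2.0, lambda_memory_gb=0.128):
    """Athena scan cost plus the Lambda resolver that waits for the query."""
    billed_bytes = max(bytes_scanned_per_query, ATHENA_MIN_BYTES_PER_QUERY)
    scan_cost = queries * billed_bytes / BYTES_PER_TB * ATHENA_USD_PER_TB_SCANNED
    lambda_cost = (
        queries / 1e6 * LAMBDA_USD_PER_MILLION_REQUESTS
        + queries * lambda_seconds * lambda_memory_gb * LAMBDA_USD_PER_GB_SECOND
    )
    return scan_cost + lambda_cost


def dynamodb_monthly_cost(queries, bytes_read_per_query, table_gb,
                          eventually_consistent=True):
    """On-demand read cost for Query calls plus table storage."""
    # A Query is billed on the total size of items read, rounded up to 4 KB;
    # eventually consistent reads cost half a read request unit per 4 KB.
    units = math.ceil(bytes_read_per_query / 4096)
    if eventually_consistent:
        units *= 0.5
    read_cost = queries * units / 1e6 * DDB_USD_PER_MILLION_RRU
    storage_cost = table_gb * DDB_USD_PER_GB_MONTH
    return read_cost + storage_cost


if __name__ == "__main__":
    # Hypothetical workload: 1M reads per month against a 50 GB dataset.
    queries = 1_000_000
    athena = athena_monthly_cost(queries, bytes_scanned_per_query=100 * 1024**2)
    ddb = dynamodb_monthly_cost(queries, bytes_read_per_query=8192, table_gb=50)
    print(f"Athena + Lambda: ${athena:,.2f} per month")
    print(f"DynamoDB:        ${ddb:,.2f} per month")
```

The inputs that dominate are the bytes Athena scans per query (which is where partitioning and Parquet column pruning pay off) and whether DynamoDB can serve each request with a small Query rather than a Scan. With many small lookups DynamoDB tends to come out ahead in this model; with a handful of large analytical queries per month the picture can flip, because you pay for DynamoDB storage whether or not you query it.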