Amazon Athena vs Amazon DynamoDB

Overview

Amazon DynamoDB

Stacks: 4.0K

Followers: 3.2K

Votes: 195

Amazon Athena

Stacks: 521

Followers: 840

Votes: 49

Amazon Athena vs Amazon DynamoDB: What are the differences?

Athena is an interactive query service that allows you to analyze data stored in Amazon S3 using standard SQL, while DynamoDB is a fully managed NoSQL database service designed for high-performance, scalable, and low-latency applications with flexible data models. Let's explore the key differences between them:

Serverless Query Service vs. Managed Database: Both services are fully managed, so Amazon operates the underlying infrastructure in each case. Amazon Athena has no tables to load or capacity to size: it scales automatically, and you pay only for the queries you run. Amazon DynamoDB is a fully managed NoSQL database service: there are no servers to run, but you do create tables, design their keys, and choose a capacity mode (provisioned or on-demand).
Query vs. Key-Value Store: Amazon Athena is designed for ad-hoc querying of data stored in Amazon S3 using standard SQL queries. It provides the ability to analyze large datasets without the need to set up and manage a database. In contrast, Amazon DynamoDB is a key-value store that is optimized for fast and predictable performance with low-latency access to small, frequently accessed data items.
Schema-on-Read vs. Schemaless Items: With Amazon Athena, data stays in Amazon S3 in its original format (e.g., JSON, CSV, Parquet), and a table definition in the data catalog is applied to it at query time (schema-on-read). The files are never rewritten to fit the schema, so the same data can be re-described without reloading it. Amazon DynamoDB requires only the primary key attributes to be defined when a table is created. Beyond the key, items in the same table can carry different sets of attributes.
SQL Support vs. NoSQL API: Amazon Athena supports standard SQL queries, making it easy for users who are familiar with SQL to interact with their data. Amazon DynamoDB offers a NoSQL API, which lets users perform CRUD operations (create, read, update, and delete) through the API methods provided by DynamoDB. DynamoDB also supports PartiQL, a SQL-compatible query language, but statements written in it follow the same key-based access patterns as the native API.
Pay-per-Query vs. Capacity-Based Pricing: With Amazon Athena, you pay for the amount of data each query scans (the underlying S3 storage is billed separately), which makes it cost-effective for occasional analysis of large datasets. Amazon DynamoDB bills for table storage plus read and write throughput. In provisioned mode you reserve read and write capacity units and pay for them regardless of how much you actually use; in on-demand mode you pay per request instead, which suits unpredictable or sporadic workloads.
Workload Types vs. Data Access Patterns: Amazon Athena is well suited for ad-hoc and interactive querying of data, making it ideal for exploratory analysis and data discovery. Amazon DynamoDB is designed for applications that require low-latency access to small data items with simple data access patterns, such as key-value lookups or queries on a known partition key. Full-table scans are supported, but they read every item and so are slow and costly on large tables.
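
To make the difference in access patterns concrete, here is a minimal Python sketch of the request shapes the two services expect. It assumes the boto3 SDK for the actual calls. The table names (web_logs, user_sessions), the database name, the bucket path, and the key attribute session_id are hypothetical placeholders. The two builder functions only assemble request parameters, so they can be inspected without AWS credentials.

```python
"""Request shapes for Athena (SQL over S3) and DynamoDB (key lookup).

All resource names below are hypothetical placeholders.
"""


def athena_query_request(database: str, output_location: str) -> dict:
    """Build keyword arguments for the Athena StartQueryExecution API."""
    sql = (
        "SELECT status, COUNT(*) AS hits "
        "FROM web_logs "  # hypothetical table defined over files in S3
        "GROUP BY status "
        "ORDER BY hits DESC"
    )
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def dynamodb_get_item_request(table: str, session_id: str) -> dict:
    """Build keyword arguments for the DynamoDB GetItem API."""
    return {
        "TableName": table,
        # Low-level API: every value is tagged with its type ("S" = string).
        "Key": {"session_id": {"S": session_id}},
    }


def run_against_aws() -> None:
    """Send both requests. Needs boto3, credentials, and real resources."""
    import boto3  # imported lazily so the builders work without the SDK

    athena = boto3.client("athena")
    started = athena.start_query_execution(
        **athena_query_request("analytics_db", "s3://example-bucket/athena-results/")
    )
    print("Athena query id:", started["QueryExecutionId"])

    dynamodb = boto3.client("dynamodb")
    found = dynamodb.get_item(
        **dynamodb_get_item_request("user_sessions", "abc123")
    )
    print("DynamoDB item:", found.get("Item"))
```

The Athena call only starts the query and returns an ID: the query runs asynchronously and its results are written to the S3 output location. The DynamoDB call returns the requested item directly in the response.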

In summary, Amazon Athena is a serverless service for ad-hoc querying of data stored in Amazon S3 using standard SQL, while Amazon DynamoDB is a fully managed NoSQL database optimized for fast and predictable performance with low-latency access to small data items through key-based lookups and queries.
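
The two pricing models can be compared with a rough back-of-the-envelope calculation. The sketch below is illustrative only: the per-terabyte price, the per-query minimum, the binary definition of a terabyte, and the hourly capacity-unit rates passed in the usage example are assumptions, not current list prices, and should be replaced with figures from the AWS pricing pages for your region.

```python
"""Rough cost model: Athena pay-per-scan vs. DynamoDB provisioned capacity.

Every price here is an assumed, illustrative value.
"""

MB = 1024 ** 2
TB = 1024 ** 4  # assumption: binary terabyte


def athena_query_cost(bytes_scanned: int,
                      usd_per_tb: float = 5.0,      # assumed price
                      min_bytes: int = 10 * MB) -> float:  # assumed minimum
    """Cost of one query: billed on bytes scanned, with a per-query floor."""
    billable = max(bytes_scanned, min_bytes)
    return billable / TB * usd_per_tb


def provisioned_monthly_cost(rcu: int, wcu: int,
                             usd_per_rcu_hour: float,
                             usd_per_wcu_hour: float,
                             hours: float = 730) -> float:
    """Monthly cost of reserved throughput.

    Note that the number of requests actually served does not appear:
    provisioned capacity is paid for whether or not it is used.
    """
    return (rcu * usd_per_rcu_hour + wcu * usd_per_wcu_hour) * hours


if __name__ == "__main__":
    # One query that scans 200 GiB of data.
    print(athena_query_cost(200 * 1024 ** 3))
    # 10 read units and 5 write units held for a month (assumed rates).
    print(provisioned_monthly_cost(10, 5, 0.00013, 0.00065))
```

The model leaves out S3 and DynamoDB storage charges, on-demand request pricing, and any free-tier allowance, so it is only a starting point for comparing the shape of the two bills.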

Advice on Amazon DynamoDB, Amazon Athena

Doru

Solution Architect

Jun 9, 2019

Review on

Amazon DynamoDB

I use Amazon DynamoDB because it integrates seamlessly with other AWS services. If cost is the primary concern early on, it is a better choice than Amazon RDS or any other solution that requires a highly available cluster of IaaS components, since such a cluster costs money just for being there, largely regardless of usage.

1.38k views

akash

Aug 27, 2020

Needs advice on

Cloud Firestore

Firebase Realtime Database

Amazon DynamoDB

We are building a social media app where users will post images, like posts, and make friends based on their interests. We are currently using Cloud Firestore and Firebase Realtime Database, and we are considering another database such as Amazon DynamoDB. How efficient would this move be in terms of pricing and overhead?

199k views

Pavithra

Mar 12, 2020

Needs advice on

Amazon S3

Amazon Athena

Amazon Redshift

Hi all,

Currently, we need to ingest data from Amazon S3 into either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is more than 200 GB in size. Query performance in Athena/Redshift is not up to the mark: queries time out or run too slowly compared to Google BigQuery. How can I optimize performance and query result time? Can anyone please help me out?

522k views
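
One commonly suggested approach for slow Athena queries over large delimited text files is to declare the raw files as an external table, then rewrite them once into partitioned Parquet with a CREATE TABLE AS SELECT (CTAS) statement so that later queries scan less data. The Python sketch below only builds the two SQL statements as strings. The table names, column list, bucket paths, and the choice of event_date as the partition column are hypothetical, and whether this helps depends on the actual data and queries.

```python
"""Build Athena DDL for pipe-separated files and a Parquet CTAS rewrite.

Table names, columns, and S3 paths are hypothetical placeholders.
"""


def external_table_ddl(table, columns, location, delimiter="|"):
    """DDL that exposes delimited text files in S3 as an Athena table."""
    cols = ",\n  ".join(f"{name} {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


def parquet_ctas(source, target, location, data_columns, partition_column):
    """CTAS that rewrites the source table as partitioned Parquet.

    Athena expects partition columns to come last in the SELECT list.
    """
    select_list = ", ".join([*data_columns, partition_column])
    return (
        f"CREATE TABLE {target}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        f"  external_location = '{location}',\n"
        f"  partitioned_by = ARRAY['{partition_column}']\n"
        ") AS\n"
        f"SELECT {select_list}\nFROM {source}"
    )


if __name__ == "__main__":
    print(external_table_ddl(
        "raw_events",
        [("event_id", "string"), ("amount", "double"), ("event_date", "string")],
        "s3://example-bucket/raw/",
    ))
    print(parquet_ctas(
        "raw_events", "events_parquet", "s3://example-bucket/parquet/",
        ["event_id", "amount"], "event_date",
    ))
```

Queries that filter on the partition column and select only a few columns then read a fraction of the original bytes. Athena documents a limit on how many partitions a single CTAS statement can write, so a high-cardinality partition column may need to be loaded in batches.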

Detailed Comparison

Amazon DynamoDB	Amazon Athena
With DynamoDB, you can offload the administrative burden of operating and scaling a highly available distributed database cluster, while paying a low price for only what you use.	Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
Automated Storage Scaling: There is no limit to the amount of data you can store in a DynamoDB table, and the service automatically allocates more storage as you store more data using the DynamoDB write APIs; Provisioned Throughput: When creating a table, simply specify how much request capacity you require. DynamoDB allocates dedicated resources to your table to meet your performance requirements, and automatically partitions data over a sufficient number of servers to meet your request capacity; Fully Distributed, Shared Nothing Architecture	-
Statistics
Stacks 4.0K	Stacks 521
Followers 3.2K	Followers 840
Votes 195	Votes 49
Pros & Cons
Pros: Predictable performance and cost (62); Scalable (56); Native JSON support (35); AWS Free Tier (21); Fast (7). Cons: Only sequential access for paginated data (4); Scaling (1); Document size limit (1)	Pros: Use SQL to analyze CSV files (16); Glue crawlers give an easy data catalog (8); Cheap (7); Query all my data without running servers 24x7 (6); No database servers, yay (4)
Integrations
Amazon RDS for PostgreSQL, PostgreSQL, MySQL, SQLite, Azure Database for MySQL	Amazon S3, Presto
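
The provisioned throughput feature listed in the table above asks you to state request capacity up front, which usually means estimating capacity units first. The Python sketch below applies the standard sizing rules (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads; one write capacity unit covers one write per second of an item up to 1 KB) and then builds parameters for a CreateTable request. The table name and the session_id key are hypothetical, and treating 1 KB as 1024 bytes is an assumption.

```python
"""Estimate DynamoDB capacity units and build a CreateTable request.

The table name and key attribute are hypothetical placeholders.
"""
import math


def read_capacity_units(item_size_bytes, reads_per_second,
                        strongly_consistent=True):
    """RCUs needed: reads are metered in 4 KB steps (assumed 4096 bytes)."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: writes are metered in 1 KB steps (assumed 1024 bytes)."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second


def create_table_request(table, rcu, wcu):
    """Keyword arguments for the DynamoDB CreateTable API.

    Only the key attribute is declared; other attributes are not part
    of the table definition.
    """
    return {
        "TableName": table,
        "AttributeDefinitions": [
            {"AttributeName": "session_id", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "session_id", "KeyType": "HASH"}],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }


if __name__ == "__main__":
    rcu = read_capacity_units(6000, 100)   # 6000-byte items, 100 reads/s
    wcu = write_capacity_units(1500, 50)   # 1500-byte items, 50 writes/s
    print(create_table_request("user_sessions", rcu, wcu))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as keyword arguments to the DynamoDB client's create_table method.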

What are some alternatives to Amazon DynamoDB, Amazon Athena?

Apache Spark

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Azure Cosmos DB

Azure Cosmos DB (formerly Azure DocumentDB) is a fully managed NoSQL database service built for fast and predictable performance, high availability, elastic scaling, global distribution, and ease of development.

Cloud Firestore

Cloud Firestore is a NoSQL document database that lets you easily store, sync, and query data for your mobile and web apps - at global scale.

Presto

Presto is a distributed SQL query engine for big data.

Apache Flink

Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.

lakeFS

lakeFS is an open-source data version control system for data lakes. It provides a “Git for data” platform enabling you to implement best practices from software engineering on your data lake, including branching and merging, CI/CD, and production-like dev/test environments.

Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Cloudant

Cloudant’s distributed database as a service (DBaaS) allows developers of fast-growing web and mobile apps to focus on building and improving their products, instead of worrying about scaling and managing databases on their own.

Google Cloud Bigtable

Google Cloud Bigtable offers you a fast, fully managed, massively scalable NoSQL database service that's ideal for web, mobile, and Internet of Things applications requiring terabytes to petabytes of data. Unlike comparable market offerings, Cloud Bigtable doesn't require you to sacrifice speed, scale, or cost efficiency when your applications grow. Cloud Bigtable has been battle-tested at Google for more than 10 years—it's the database driving major applications such as Google Analytics and Gmail.

Apache Kylin

Apache Kylin™ is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop/Spark supporting extremely large datasets, originally contributed from eBay Inc.
