What is Amazon Athena?

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
Amazon Athena is a tool in the Big Data Tools category of a tech stack.

Who uses Amazon Athena?

Companies
168 companies reportedly use Amazon Athena in their tech stacks, including CRED, Tech Stack, and Payhere.

Developers
300 developers on StackShare have stated that they use Amazon Athena.

Amazon Athena Integrations

Amazon S3, Presto, AWS Glue, Redash, and Cube are some of the popular tools that integrate with Amazon Athena. In total, 24 tools integrate with Amazon Athena.

Pros of Amazon Athena

16  Use SQL to analyze CSV files
 8  Glue crawlers give an easy data catalog
 7  Cheap
 6  Query all my data without running servers 24x7
 4  No database servers, yay
 3  Easy integration with QuickSight
 2  Query and analyze CSV, Parquet, and JSON files in SQL
 2  Glue and Athena also use the same data catalog
 1  No configuration required
 0  Ad hoc checks on data made easy

Decisions about Amazon Athena

Here are some stack decisions, common use cases and reviews by companies and developers who chose Amazon Athena in their tech stack.

Needs advice on Ali9t apache aws and Trino

Could you please suggest the best database engine for on-premises use? We have used Amazon Athena in the cloud and are looking for a similar product to run on-premises. It should support Node.js.

Needs advice on Amazon Athena and Amazon DynamoDB

So, I have data in Amazon S3 as Parquet files, and it is available in the Glue data catalog too. I want to build an AppSync API on top of this data. The two options that I am considering are:

  1. Bring the data into Amazon DynamoDB and then build my API on top of this database.

  2. Add a Lambda function that resolves Amazon Athena queries made by AppSync.

Which of the two approaches will be more cost-effective?

I would really appreciate some back-of-the-envelope estimates too.

Note: I only expect to make read queries. Thanks.
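
As a rough starting point for the back-of-the-envelope estimate requested above, here is a minimal Python sketch that compares the two options on read cost alone. Every price and constant in it is an assumption for illustration, not a quoted AWS rate, and the function names are hypothetical. It ignores Lambda and AppSync charges, the one-time cost of loading data into DynamoDB, and query latency, so treat it as a template to fill in with current pricing for your region.

```python
# Illustrative assumptions only; check the current AWS pricing pages.
ATHENA_USD_PER_TB_SCANNED = 5.00            # assumed price per TB scanned
ATHENA_MIN_BYTES_PER_QUERY = 10 * 1024**2   # assumed 10 MB minimum billed per query
DYNAMODB_USD_PER_MILLION_READS = 0.25       # assumed on-demand read price
DYNAMODB_USD_PER_GB_MONTH = 0.25            # assumed storage price per GB-month


def athena_monthly_cost(queries_per_month, bytes_scanned_per_query):
    """Estimate monthly Athena cost from the data scanned by each query."""
    billed_bytes = max(bytes_scanned_per_query, ATHENA_MIN_BYTES_PER_QUERY)
    terabytes = billed_bytes * queries_per_month / 1024**4
    return terabytes * ATHENA_USD_PER_TB_SCANNED


def dynamodb_monthly_cost(reads_per_month, table_size_gb, read_units_per_request=1.0):
    """Estimate monthly DynamoDB cost from on-demand reads plus storage."""
    read_cost = (reads_per_month * read_units_per_request / 1e6
                 * DYNAMODB_USD_PER_MILLION_READS)
    storage_cost = table_size_gb * DYNAMODB_USD_PER_GB_MONTH
    return read_cost + storage_cost


if __name__ == "__main__":
    # Hypothetical workload: 1M API calls per month, 100 MB scanned per Athena query,
    # versus a 50 GB DynamoDB table serving one read unit per call.
    print(athena_monthly_cost(1_000_000, 100 * 1024**2))
    print(dynamodb_monthly_cost(1_000_000, 50))
```

The point of the sketch is that Athena cost scales with bytes scanned per query, while DynamoDB cost scales with request count and table size, so the answer depends heavily on how much data each API call has to touch.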

Punith Ganadinni
Senior Product Engineer · 2 upvotes · 60.2K views
Needs advice on AWS Data Pipeline and AWS Glue

Hey all, I need some suggestions for creating a replica of our RDS DB for reporting and analytical purposes. Cost is a major factor. I was thinking of using AWS Glue to move data from Amazon RDS to Amazon S3 and then using Amazon Athena to run queries on it. Any other suggestions would be appreciated.

Hi all,

Currently, we need to ingest data from Amazon S3 into a database, either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is over 200 GB in size. Query performance in Athena/Redshift is not up to the mark: queries hit timeouts and are too slow compared with Google BigQuery. How can I optimize the performance and query response time? Can anyone please help me out?
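
One commonly suggested direction for this kind of problem is to lay the files out in Hive-style partition folders (for example `dt=2024-01-01/`), so that a query filtering on the partition column reads only the matching folders; converting to a columnar format such as Parquet is a usual follow-up step. The sketch below illustrates only the partition layout, using Python's standard `csv` module. The function name, the `dt` column, and the sample rows are hypothetical, and the code holds everything in memory, so it shows the idea rather than a way to process 200 GB.

```python
import csv
import io


def partition_psv(lines, partition_column):
    """Group pipe-separated rows by a Hive-style partition key.

    `lines` is any iterable of text lines whose first line is the header.
    Returns a dict mapping "column=value" folder names to PSV text.
    The partition column is dropped from the rows because, in a Hive-style
    layout, its value is carried by the folder name instead.
    """
    reader = csv.DictReader(lines, delimiter="|")
    data_fields = [f for f in reader.fieldnames if f != partition_column]
    buffers = {}
    writers = {}
    for row in reader:
        key = f"{partition_column}={row[partition_column]}"
        if key not in buffers:
            buffers[key] = io.StringIO()
            writers[key] = csv.writer(buffers[key], delimiter="|", lineterminator="\n")
        writers[key].writerow([row[f] for f in data_fields])
    return {key: buf.getvalue() for key, buf in buffers.items()}


if __name__ == "__main__":
    # Hypothetical sample data with a date column named "dt".
    sample = [
        "id|dt|amount",
        "1|2024-01-01|10",
        "2|2024-01-02|20",
        "3|2024-01-01|30",
    ]
    for folder, text in partition_psv(sample, "dt").items():
        print(folder)
        print(text, end="")
```

Each returned key would become a folder under the table's S3 location, with the text written as a data file inside it. Which column to partition on depends on the filters your queries actually use.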

Blog Posts

Aug 28 2019 at 3:10AM · Segment · Python, Java, Amazon S3, +16
Jul 2 2019 at 9:34PM · Segment · Google Analytics, Amazon S3, New Relic, +25

Amazon Athena Alternatives & Comparisons

What are some alternatives to Amazon Athena?
Presto
Distributed SQL Query Engine for Big Data
Amazon Redshift Spectrum
With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond data stored on local disks in your data warehouse to query vast amounts of unstructured data in your Amazon S3 “data lake” -- without having to load or transform any data.
Amazon Redshift
It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
Cassandra
Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
Spectrum
The community platform for the future.

Amazon Athena's Followers
829 developers follow Amazon Athena to keep up with related blogs and decisions.