Amazon Athena vs Microsoft SQL Server

Amazon Athena vs Microsoft SQL Server: What are the differences?

Introduction

This article compares Amazon Athena and Microsoft SQL Server and highlights the key differences between them. Both are widely used for data analytics and database management, but they differ in how they are deployed, priced, and operated.

Scalability and Infrastructure Management:
- Amazon Athena is a serverless data querying service provided by Amazon Web Services (AWS). It allows users to run queries on data stored in Amazon S3 without the need for provisioning and managing infrastructure. In contrast, Microsoft SQL Server requires the setup and management of servers, databases, and network infrastructure. Users need to allocate resources to handle their workloads and ensure scalability.
Data Source and Integration:
- Amazon Athena specializes in querying data stored in Amazon S3, making it ideal for analyzing large datasets from various sources like logs, CSV files, JSON objects, etc. It supports popular file formats like Parquet and ORC. On the other hand, Microsoft SQL Server can integrate with various databases and file systems, making it suitable for analyzing structured data from diverse sources.
Cost Model:
- Amazon Athena follows a pay-per-query pricing model. Users pay only for the amount of data scanned by each query, which makes it cost-effective for ad-hoc analysis. Microsoft SQL Server typically involves licensing costs on top of the cost of the servers it runs on, so it requires more upfront investment and tends to suit long-running or enterprise-level workloads.
Performance and Query Optimization:
- Amazon Athena is optimized for running distributed queries on large datasets on AWS infrastructure. It parallelizes and scales queries automatically, so ad-hoc analysis can return results quickly without manual tuning. Microsoft SQL Server, on the other hand, relies on query optimization and indexing to achieve good performance, and it provides tools and techniques to tune queries for specific workloads.
SQL Dialect and Compatibility:
- Amazon Athena uses a modified version of Presto SQL, which is ANSI SQL compliant but differs in syntax from T-SQL, the dialect used in Microsoft SQL Server. Existing queries may need adjustment, which affects the portability of existing code and the ability to use dialect-specific language features.
Availability and Reliability:
- Amazon Athena benefits from the robust infrastructure provided by AWS, ensuring high availability and fault tolerance. It is built to handle data failures and automatically recover from errors. Microsoft SQL Server's availability depends on the infrastructure setup and configuration. Users need to implement redundancy, clustering, and backups to ensure availability and reliability.
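
To make the dialect point concrete: T-SQL limits rows with `SELECT TOP n`, while Presto-style SQL (and so Athena) uses a trailing `LIMIT n`. The sketch below rewrites only that one pattern. The function name `top_to_limit` is hypothetical, the regular expression handles only a leading `SELECT TOP <integer>`, and real migrations involve many more differences than this, so treat it as an illustration rather than a conversion tool.

```python
import re

# Matches a leading "SELECT TOP <n> " (case-insensitive).
# Assumption: the simple unparenthesized form only; "TOP (10)",
# "TOP 10 PERCENT", and subqueries are deliberately not handled.
_TOP_PATTERN = re.compile(r"^\s*SELECT\s+TOP\s+(\d+)\s+", re.IGNORECASE)


def top_to_limit(tsql: str) -> str:
    """Rewrite 'SELECT TOP n ...' into 'SELECT ... LIMIT n'.

    Queries that do not start with SELECT TOP are returned unchanged.
    """
    match = _TOP_PATTERN.match(tsql)
    if match is None:
        return tsql
    # Drop the trailing semicolon so LIMIT can be appended at the end.
    body = tsql[match.end():].rstrip().rstrip(";")
    return f"SELECT {body} LIMIT {match.group(1)}"


if __name__ == "__main__":
    print(top_to_limit("SELECT TOP 10 name FROM users ORDER BY name;"))
```

A query that already uses `LIMIT`, or that does not begin with `SELECT TOP`, passes through unchanged.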

In summary, Amazon Athena is a serverless query service optimized for ad-hoc analysis of large datasets stored in Amazon S3, providing scalable infrastructure and cost-effective pricing. Microsoft SQL Server, on the other hand, requires dedicated infrastructure management, supports various data sources, and offers more control over performance optimization and query tuning.
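
Because Athena bills by data scanned, a rough cost estimate is simple arithmetic. The sketch below assumes a list price of USD 5 per TB scanned, a 10 MB minimum charge per query, and binary units (1 TB = 1024^4 bytes). None of these figures come from this article, and pricing varies by region and over time, so check current AWS pricing before relying on them. The function name `estimate_query_cost` is hypothetical.

```python
# Assumed pricing inputs (not from this article; verify against AWS pricing).
PRICE_PER_TB_USD = 5.0
MIN_BYTES_PER_QUERY = 10 * 1024**2  # assumed 10 MB minimum per query
BYTES_PER_TB = 1024**4              # assumed binary terabyte


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of one query from the bytes it scans."""
    # Queries scanning less than the minimum are billed at the minimum.
    billable_bytes = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable_bytes / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    full_scan = 200 * 1024**3  # a query that scans 200 GB
    print(f"Estimated cost: ${estimate_query_cost(full_scan):.4f}")
```

Under these assumptions, the cost falls in direct proportion to the bytes scanned, which is why columnar formats such as Parquet and ORC, where a query reads only the columns it needs, tend to lower the bill.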

Advice on Microsoft SQL Server, Amazon Athena

Erin

IT Specialist

Mar 10, 2020

Needs advice on: Microsoft SQL Server, MySQL, PostgreSQL

I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:

I need to use either MySQL or PostgreSQL on a Linux based OS. Which would be better for this application?
I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.

668k views

Pavithra

Mar 12, 2020

Needs advice on: Amazon S3, Amazon Athena, Amazon Redshift

Hi all,

Currently, we need to ingest data from Amazon S3 into either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is more than 200 GB in size. Query performance in Athena/Redshift is not up to the mark: queries time out or run too slowly compared to Google BigQuery. How can I optimize performance and query result time? Can anyone please help me out?

522k views

Detailed Comparison

Microsoft SQL Server	Amazon Athena
Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.	Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
Statistics
Stacks 21.3K	Stacks 521
Followers 15.5K	Followers 840
Votes 540	Votes 49
Pros & Cons
Microsoft SQL Server pros:
- 139 Reliable and easy to use
- 101 High performance
- 95 Great with .net
- 65 Works well with .net
- 56 Easy to maintain
Microsoft SQL Server cons:
- 4 Expensive Licensing
- 2 Microsoft
- 1 Replication can lose the data
- 1 Data pages are only 8k
- 1 The maximum number of connections is only 14000 connect
Amazon Athena pros:
- 16 Use SQL to analyze CSV files
- 8 Glue crawlers gives easy Data catalogue
- 7 Cheap
- 6 Query all my data without running servers 24x7
- 4 No data base servers yay
Integrations
No integrations available	Amazon S3, Presto

What are some alternatives to Microsoft SQL Server, Amazon Athena?

MongoDB

MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

MySQL

The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

PostgreSQL

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

SQLite

SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.

Cassandra

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

MariaDB

Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry. MariaDB is designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.

RethinkDB

RethinkDB is built to store JSON documents, and scale to multiple machines with very little effort. It has a pleasant query language that supports really useful queries like table joins and group by, and is easy to set up and learn.

ArangoDB

A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

InfluxDB

InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
