Amazon Redshift vs Microsoft SQL Server

Overview

Amazon Redshift
Stacks: 1.5K
Followers: 1.4K
Votes: 108

Microsoft SQL Server
Stacks: 21.3K
Followers: 15.5K
Votes: 540

Amazon Redshift vs Microsoft SQL Server: What are the differences?

Introduction

This article explores the key differences between Amazon Redshift and Microsoft SQL Server. Both are widely used for data warehousing, although Redshift is a dedicated cloud data warehouse while SQL Server is a general-purpose relational database that also handles transactional workloads. The differences below follow from that distinction.

Architecture and Scalability: Amazon Redshift is built on a massively parallel processing (MPP) architecture, which distributes data and query work across the nodes of a cluster. It scales horizontally by adding more nodes, providing high performance for data warehousing workloads. In contrast, a Microsoft SQL Server instance uses a traditional client-server architecture and scales mainly vertically, by adding more powerful hardware. It can parallelize a query across the cores of one server, but it does not offer the same level of cluster-wide parallel processing as Redshift.
Data Storage: Redshift stores data in a columnar fashion, which makes it highly optimized for analytical queries. This columnar storage allows for efficient compression, reducing the amount of storage required. SQL Server, on the other hand, uses a row-based storage model by default, which is better suited for transactional workloads. While SQL Server does offer columnstore indexes for columnar storage, it may not perform as well as Redshift for analytical workloads.
Data Loading: Redshift supports bulk data loading using its COPY command, which can efficiently load large amounts of data in parallel from sources such as Amazon S3, Amazon EMR, Amazon DynamoDB, or remote hosts over SSH. SQL Server also provides options for bulk data loading, such as BULK INSERT and the bcp utility, but the process may be more complex and require additional configuration.
Query Optimization: Redshift's query optimizer is designed specifically for analytical workloads and can efficiently handle complex queries involving large datasets. Query execution benefits from columnar storage, query compilation, and parallel execution across nodes. SQL Server's query optimizer, on the other hand, is geared primarily toward transactional workloads and may not perform as well as Redshift for complex analytical queries.
Pricing Model: Redshift offers a pay-as-you-go pricing model, allowing users to scale their resources up or down based on their needs. It also provides options for reserved instances to reduce costs for long-term usage. SQL Server, on the other hand, follows a traditional licensing model, which may require upfront investments for hardware and software licenses.
Integration with Ecosystem: Amazon Redshift integrates well with other AWS services, such as Amazon S3 for data storage, Amazon EMR for big data processing, and Amazon QuickSight for data visualization. It also supports various third-party tools and technologies. SQL Server, on the other hand, is tightly integrated with the Microsoft ecosystem, providing seamless integration with other Microsoft products such as Azure, Power BI, and Visual Studio.
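
To make the Architecture and Data Storage points more concrete, here is a minimal Python sketch. It is a toy illustration, not Redshift or SQL Server code: the six-row orders table and the helper names (to_columns, run_length_encode, distribute, parallel_sum) are assumptions invented for this example. It shows why a column of repeated values compresses well and how an aggregate can be computed per node and then combined.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical sample rows: (order_id, region, amount).
ROWS = [
    (1, "EU", 10.0),
    (2, "EU", 12.5),
    (3, "EU", 7.5),
    (4, "US", 20.0),
    (5, "US", 5.0),
    (6, "US", 15.0),
]


def to_columns(rows):
    """Pivot row tuples into one list per column (a toy columnar layout)."""
    order_ids, regions, amounts = zip(*rows)
    return {
        "order_id": list(order_ids),
        "region": list(regions),
        "amount": list(amounts),
    }


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def distribute(rows, node_count):
    """Assign each row to a node by order_id modulo node_count."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[0] % node_count].append(row)
    return nodes


def parallel_sum(nodes):
    """Each node sums its own rows; the partial sums are then combined."""
    partials = [sum(amount for _, _, amount in rows) for rows in nodes.values()]
    return sum(partials)


columns = to_columns(ROWS)
encoded_region = run_length_encode(columns["region"])
total = parallel_sum(distribute(ROWS, node_count=2))

print(encoded_region)  # six region values stored as two (value, count) pairs
print(total)
```

In the columnar layout, a query that only sums amount reads one list instead of every full row, and the region column shrinks from six values to two pairs. The per-node partial sums mirror, in a very simplified way, how an MPP system splits an aggregate across nodes.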

In summary, Amazon Redshift and Microsoft SQL Server differ in their architecture, scalability, data storage model, data loading mechanisms, query optimization capabilities, pricing models, and ecosystem integration. The choice between the two depends on the specific requirements and use cases of the organization.
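
As an illustration of the Data Loading point, the following Python sketch assembles a Redshift COPY statement for CSV files in Amazon S3. The table name, bucket path, and IAM role ARN are placeholders assumed for this example, and build_copy_statement is a hypothetical helper, not part of any AWS library. Running the statement would still require a database connection, which is left out here.

```python
def build_copy_statement(table, s3_uri, iam_role_arn):
    """Return a Redshift COPY statement that loads CSV files from Amazon S3.

    Values are interpolated without escaping, so this is only suitable
    for trusted, hard-coded inputs in an example like this one.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# All three arguments are placeholder values, not real resources.
statement = build_copy_statement(
    table="sales",
    s3_uri="s3://example-bucket/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

Pointing COPY at an S3 prefix rather than a single file lets Redshift load the files under that prefix in parallel, which is where most of the loading speed comes from.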

Advice on Amazon Redshift, Microsoft SQL Server

Erin

IT Specialist

Mar 10, 2020

Needs advice on: Microsoft SQL Server, MySQL, PostgreSQL

I am a Microsoft SQL Server programmer who is a bit out of practice. I have been asked to assist on a new project. The overall purpose is to organize a large number of recordings so that they can be searched. I have an enormous music library but my songs are several hours long. I need to include things like time, date and location of the recording. I don't have a problem with the general database design. I have two primary questions:

I need to use either MySQL or PostgreSQL on a Linux-based OS. Which would be better for this application?
I have not dealt with a sound based data type before. How do I store that and put it in a table? Thank you.

668k views

datocrats-org

Jul 29, 2020

Needs advice on: Amazon EC2, Tableau, PowerBI

We need to perform ETL from several databases into a data warehouse or data lake. We want to:

keep raw and transformed data available to users so they can draft their own queries efficiently
let users grant custom permissions and use SSO
move between open-source on-premises development and cloud-based production environments

We want to use only inexpensive Amazon EC2 instances, on a medium-sized data set (16 GB to 32 GB), feeding into Tableau Server or Power BI for reporting and data analysis.

319k views

Julien

CTO at Hawk

Sep 19, 2020

Decided

The cloud data warehouse is the centerpiece of a modern data platform. Choosing the most suitable solution is therefore fundamental.

Our benchmark covered BigQuery and Snowflake. Both solutions seem to match our goals, but they take very different approaches.

BigQuery is notably the only 100% serverless cloud data warehouse, which requires absolutely NO maintenance: no re-clustering, no compression, no index optimization, no storage management, no performance management. Snowflake requires setting up (paid) reclustering processes, managing the performance allocated to each profile, and so on. We can also mention Redshift, which we eliminated because it requires even more operational work.

BigQuery can therefore be set up with almost no human resource cost. Its on-demand pricing is particularly well suited to small workloads: there is no cost when the solution is not used, and you pay only for the queries you run. As usage grows, switching to slots (with monthly or per-minute commitment) drastically reduces the cost. We cut the cost of our nightly batches by a factor of 10 by using flex slots.

Finally, a major advantage of BigQuery is its almost perfect integration with Google Cloud Platform services: Cloud functions, Dataflow, Data Studio, etc.

BigQuery is still evolving very quickly. The next milestone, BigQuery Omni, will allow queries to run over data stored on an external cloud platform (Amazon S3, for example). It will be a major breakthrough in the history of cloud data warehouses. Omni will compensate for a weakness of BigQuery: transferring data in near real time from S3 to BQ is not easy today. It was even simpler to implement via Snowflake's Snowpipe solution.

We also plan to use the machine learning features built into BigQuery to accelerate our deployment of data-science-based projects. This is an opportunity offered only by BigQuery.

193k views

Detailed Comparison

Description
Amazon Redshift: It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
Microsoft SQL Server: Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.

Features
Amazon Redshift:
Optimized for Data Warehousing: It uses columnar storage, data compression, and zone maps to reduce the amount of IO needed to perform queries. Redshift has a massively parallel processing (MPP) architecture, parallelizing and distributing SQL operations to take advantage of all available resources.
Scalable: With a few clicks of the AWS Management Console or a simple API call, you can easily scale the number of nodes in your data warehouse up or down as your performance or capacity needs change.
No Up-Front Costs: You pay only for the resources you provision. You can choose On-Demand pricing with no up-front costs or long-term commitments, or obtain significantly discounted rates with Reserved Instance pricing.
Fault Tolerant: Amazon Redshift has multiple features that enhance the reliability of your data warehouse cluster. All data written to a node in your cluster is automatically replicated to other nodes within the cluster and all data is continuously backed up to Amazon S3.
SQL: Amazon Redshift is a SQL data warehouse and uses industry standard ODBC and JDBC connections and Postgres drivers.
Isolation: Amazon Redshift enables you to configure firewall rules to control network access to your data warehouse cluster.
Encryption: With just a couple of parameter settings, you can set up Amazon Redshift to use SSL to secure data in transit and hardware-accelerated AES-256 encryption for data at rest.
Microsoft SQL Server: none listed

Statistics
Amazon Redshift: Stacks 1.5K, Followers 1.4K, Votes 108
Microsoft SQL Server: Stacks 21.3K, Followers 15.5K, Votes 540

Pros & Cons
Amazon Redshift pros: 41 Data Warehousing; 27 Scalable; 17 SQL; 14 Backed by Amazon; 5 Encryption
Microsoft SQL Server pros: 139 Reliable and easy to use; 101 High performance; 95 Great with .net; 65 Works well with .net; 56 Easy to maintain
Microsoft SQL Server cons: 4 Expensive Licensing; 2 Microsoft; 1 The maximum number of connections is only 14000 connect; 1 Replication can lose the data; 1 AlwaysOn can lose data in asynchronous mode

Integrations
Amazon Redshift: SQLite, MySQL, Oracle PL/SQL
Microsoft SQL Server: No integrations available
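
The per-terabyte figure quoted in the Redshift description can be turned into a rough budget ceiling. The short Python sketch below assumes the quoted "less than $1,000 per terabyte per year" as an upper bound and reads "a tenth the cost" as a 10x multiplier for a traditional warehouse. The function name and the 5 TB input are hypothetical, and real pricing depends on node type, region, and commitment.

```python
# Upper bound quoted in the Redshift description (USD per terabyte per year).
REDSHIFT_MAX_USD_PER_TB_YEAR = 1_000

# "A tenth the cost" implies a traditional warehouse costs roughly 10x as much.
TRADITIONAL_MULTIPLIER = 10


def annual_cost_ceiling(terabytes):
    """Return (redshift_ceiling, traditional_estimate) in USD per year."""
    redshift = terabytes * REDSHIFT_MAX_USD_PER_TB_YEAR
    traditional = redshift * TRADITIONAL_MULTIPLIER
    return redshift, traditional


# Hypothetical 5 TB warehouse.
redshift_usd, traditional_usd = annual_cost_ceiling(5)
print(redshift_usd, traditional_usd)
```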

What are some alternatives to Amazon Redshift, Microsoft SQL Server?

MongoDB

MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

MySQL

The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

PostgreSQL

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

SQLite

SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.

Cassandra

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

MariaDB

Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry. MariaDB is designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.

RethinkDB

RethinkDB is built to store JSON documents and to scale to multiple machines with very little effort. It has a pleasant query language that supports useful queries such as table joins and group by, and it is easy to set up and learn.

ArangoDB

A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

InfluxDB

InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.
