Amazon S3 vs Cassandra: What are the differences?

Introduction

Amazon S3 and Cassandra are both popular data storage solutions, but they have significant differences in their architecture and use cases. This document aims to provide a concise overview of the key differences between Amazon S3 and Cassandra.

  1. Data Structure:

    • Amazon S3 is an object storage service that stores data in a flat structure, treating each file as an object with a unique key. It is not optimized for complex queries or real-time data processing.
    • Cassandra is a distributed NoSQL database that organizes data into tables using a wide-column (column-family) model. Rows are grouped into partitions by a partition key, and queries are efficient when they target that key. It offers high scalability and write performance.
  2. Data Distribution and Replication:

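    • Amazon S3 stores each object redundantly across multiple Availability Zones within a single AWS Region (at least three for most storage classes), providing high durability and availability. Copying data to other Regions requires configuring replication explicitly.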
    • Cassandra is designed for distributed environments and replicates data across multiple nodes for fault tolerance and scalability. It uses a peer-to-peer model for data distribution.
  3. Data Consistency:

    • Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests on objects, so a successful write is immediately visible to subsequent reads and list operations. (Before December 2020, overwrites and deletes were only eventually consistent.)
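    • Cassandra offers tunable consistency, allowing developers to choose the consistency level (for example ONE, QUORUM, or ALL) for each read or write operation. Reads are strongly consistent when the number of replicas read plus the number of replicas written exceeds the replication factor, as with QUORUM reads and writes.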
  4. Querying and Indexing:

    • Amazon S3 is not a query engine: you retrieve an object by its exact key or list keys by prefix. Querying the contents of objects requires additional tools such as S3 Select or Athena, which offer limited SQL-style querying over the stored data.
    • Cassandra supports querying with its query language (CQL), which resembles SQL but has no joins and generally expects queries to restrict the partition key. You can create secondary indexes on specific columns, and within a partition you can fetch individual rows or ranges of rows ordered by clustering columns.
  5. Scalability:

    • Amazon S3 automatically scales to accommodate large amounts of data and high request rates. It can store an unlimited number of objects, and the performance remains consistent as you add more data.
    • Cassandra's distributed architecture allows it to scale horizontally by adding more nodes to the cluster. It can handle massive amounts of data and high workloads while maintaining low latency.
  6. Use Cases:

    • Amazon S3 is commonly used for backup and archiving, content distribution, and static website hosting. It is well-suited for storing and retrieving large amounts of unstructured data.
    • Cassandra is often used for real-time applications, such as messaging platforms, sensor data management, and recommendation systems. It excels in handling write-heavy workloads and provides low-latency access to data.
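
The tunable-consistency point above can be made concrete with a little arithmetic. The following is a minimal sketch, not Cassandra code: the names `ConsistencyChoice`, `quorum`, and `reads_see_latest_write` are hypothetical, and the model assumes a single datacenter with a fixed replication factor. It checks the usual overlap rule that a read is guaranteed to see the latest acknowledged write when R + W > RF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyChoice:
    """Hypothetical description of one read/write consistency pairing."""
    replication_factor: int  # RF: copies of each row
    write_replicas: int      # W: replicas that must acknowledge a write
    read_replicas: int       # R: replicas that must answer a read


def quorum(replication_factor: int) -> int:
    """Replicas needed for a QUORUM operation: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(choice: ConsistencyChoice) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return choice.read_replicas + choice.write_replicas > choice.replication_factor


rf = 3
quorum_both = ConsistencyChoice(rf, quorum(rf), quorum(rf))
one_and_one = ConsistencyChoice(rf, write_replicas=1, read_replicas=1)

print(reads_see_latest_write(quorum_both))  # True: 2 + 2 > 3
print(reads_see_latest_write(one_and_one))  # False: 1 + 1 <= 3
```

Lower levels such as ONE trade this guarantee for lower latency and higher availability, which is why the level is chosen per operation.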

In summary, Amazon S3 is a highly durable and scalable object storage service optimized for storing large amounts of unstructured data, while Cassandra is a distributed NoSQL database designed for real-time applications with rich querying capabilities, tunable consistency, and high scalability.

Advice on Amazon S3 and Cassandra

Hello! I have a mobile app with nearly 100k MAU, and I want to add a cloud file storage service to my app.

My app will allow users to store their image, video, and audio files and retrieve them to their device when necessary.

I have already decided to use PHP & Laravel as my backend, and I use Contabo VPS. Now, I need an object storage service for my app, and my options are:

  • Amazon S3: It sounds to me like the best option but the most expensive. It is the closest to my users (MENA region); for the other services, I will have to go to Europe. Not sure how important this is?

  • DigitalOcean Spaces : Seems like my best option for price/service, but I am still not sure

  • Wasabi: the best price (6 USD/MONTH/TB) and free bandwidth, but I am not sure if it fits my needs as I want to allow my users to preview audio and video files. They don't recommend their service for streaming videos.

  • Backblaze B2 Cloud Storage: Good price but not sure about them.

  • There is also the self-hosted s3 compatible option, but I am not sure about that.

Any thoughts will be helpful. Also, if you think I should post in a different sub, please tell me.

Replies (2)
Michira Griffins
Software Developer at Codeshares Ltd · 1 upvote · 119.8K views

If pricing is the issue, I'd suggest you use DigitalOcean; if it's not, use Amazon S3, as DigitalOcean's API is S3-compatible.

Recommends Cloudways

Hello Mohammad, I have been using Cloudways >> AWS >> Bahrain for the last 2 years. I consider this the best option from my 10 years of research on Laravel hosting.

Vinay Mehta
Needs advice on Cassandra and ScyllaDB

The problem I have is this: we need to process and change (update/insert) 55M data every 2 minutes, and the updated data must be available to a REST API for filtering and selection. The REST API response time should be less than 1 second.

The most important factor for me is the 2-minute window for processing and storing. There need to be two views of the data: (1) one for selection and (2) one for the changed data.

Replies (4)
Recommends ScyllaDB

Scylla can handle 1M events/s with a simple data model quite easily. The query API is CQL; we have a REST API, but that is for control and monitoring.

Alex Peake
Recommends Cassandra

Cassandra is quite capable of the task, in a highly available way, given appropriate scaling of the system. Remember that updates are only inserts, and that efficient retrieval is only by key (which can be a complex key). Talking of keys, make sure that the keys are well distributed.
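
The remark that updates are only inserts can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Cassandra is implemented: the `upsert` helper and the in-memory `table` are hypothetical, and the real system reconciles versions at read and compaction time rather than at write time. It only shows the last-write-wins idea, where the version with the newest timestamp is the one readers see.

```python
from typing import Dict, Tuple

Cell = Tuple[int, str]  # (write timestamp, value)


def upsert(table: Dict[str, Cell], key: str, value: str, timestamp: int) -> None:
    """Write a cell; keep it only if it is newer than what is already stored."""
    current = table.get(key)
    if current is None or timestamp > current[0]:
        table[key] = (timestamp, value)


table: Dict[str, Cell] = {}
upsert(table, "user:1", "alice", timestamp=100)   # behaves like an INSERT
upsert(table, "user:1", "alicia", timestamp=200)  # behaves like an UPDATE
upsert(table, "user:1", "stale", timestamp=150)   # older write loses

print(table["user:1"])  # (200, 'alicia')
```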

Pankaj Soni
Chief Technical Officer at Software Joint · 2 upvotes · 158.1K views
Recommends Cassandra

I love Scylla for pet projects; however, its license, which is based on a server model, is an issue. Thus I recommend Cassandra.

Recommends ScyllaDB

By 55M, do you mean 55 million entity changes every 2 minutes? That is relatively high: almost 460k per second. If I had to choose between Scylla and Cassandra, I would opt for Scylla, as it promises better performance for simple operations. However, it may be worth considering yet another technology. Take into account the required consistency, reliability, and high availability, and you may find that there are more suitable ones. The REST API should not be the main driver, because you can always develop the API yourself if the chosen technology does not support it.
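
The rate quoted in this reply can be checked with a few lines of arithmetic. The sketch below assumes, as the reply does, that "55M" means 55 million changes spread evenly over each 2-minute window; the variable names are illustrative only.

```python
changes_per_batch = 55_000_000   # assumed: 55 million changes per window
batch_window_seconds = 2 * 60    # the 2-minute processing window

required_writes_per_second = changes_per_batch / batch_window_seconds
print(round(required_writes_per_second))  # 458333, i.e. almost 460k per second
```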

Decisions about Amazon S3 and Cassandra

Minio is a free and open source object storage system. It can be self-hosted and is S3 compatible. During the early stage it would save cost and allow us to move to a different object storage when we scale up. It is also fast and easy to set up. This is very useful during development since it can be run on localhost.

Gabriel Pa

We offer our customers HIPAA-compliant storage. After analyzing the market, we decided to go with Google Storage. The Node.js API is OK, but it is still not ES6 and can be very confusing to use. For each new customer, we created a separate bucket so that their data stays isolated and they do not have to worry about data loss. After 1000+ customers, we started seeing many problems with creating new buckets and with saving or retrieving files. There were many false positives: the Promise resolved OK, but in reality the operation had failed.

That's why we switched to S3, which just works.

Pros of Amazon S3
  • Reliable (590)
  • Scalable (492)
  • Cheap (456)
  • Simple & easy (329)
  • Many SDKs (83)
  • Logical (30)
  • Easy Setup (13)
  • REST API (11)
  • 1000+ POPs (11)
  • Secure (6)
  • Plug and play (4)
  • Easy (4)
  • Web UI for uploading files (3)
  • Faster on response (2)
  • Flexible (2)
  • GDPR ready (2)
  • Easy to use (1)
  • Pluggable (1)
  • Easy integration with CloudFront (1)

Pros of Cassandra
  • Distributed (119)
  • High performance (98)
  • High availability (81)
  • Easy scalability (74)
  • Replication (53)
  • Reliable (26)
  • Multi datacenter deployments (26)
  • Schema optional (10)
  • OLTP (9)
  • Open source (8)
  • Workload separation (via MDC) (2)
  • Fast (1)


Cons of Amazon S3
  • Permissions take some time to get right (7)
  • Requires a credit card (6)
  • Takes time/work to organize buckets & folders properly (6)
  • Complex to set up (3)

Cons of Cassandra
  • Reliability of replication (3)
  • Size (1)
  • Updates (1)


What is Amazon S3?

Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web.
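
Because S3 keeps objects in a flat namespace, what look like folders are simply shared key prefixes. The sketch below is a toy in-memory stand-in, not the S3 API: the `bucket` dictionary and the `get_object` and `list_keys` helpers are hypothetical names chosen for illustration. It shows the two access patterns the service is built around, fetching by exact key and listing by prefix.

```python
from typing import Dict, List

# Hypothetical stand-in for a bucket: one flat mapping of key -> object bytes.
bucket: Dict[str, bytes] = {
    "users/42/photo.jpg": b"jpeg bytes",
    "users/42/clip.mp4": b"mp4 bytes",
    "users/7/voice.ogg": b"ogg bytes",
    "readme.txt": b"hello",
}


def get_object(store: Dict[str, bytes], key: str) -> bytes:
    """Fetch one object by its exact key; there is no directory traversal."""
    return store[key]


def list_keys(store: Dict[str, bytes], prefix: str = "") -> List[str]:
    """Return the keys that start with the prefix, in sorted order."""
    return sorted(k for k in store if k.startswith(prefix))


print(get_object(bucket, "readme.txt"))  # b'hello'
print(list_keys(bucket, "users/42/"))    # ['users/42/clip.mp4', 'users/42/photo.jpg']
```

Anything beyond these two patterns, such as filtering on object contents, has to be handled by a separate index or query service.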

What is Cassandra?

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
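
The automatic repartitioning described above is commonly explained with a consistent-hashing ring. What follows is a simplified sketch, not Cassandra's partitioner: the `Ring` class and `token` function are hypothetical, SHA-256 stands in for the real hash function, and each node gets a single token rather than many virtual nodes. It shows that adding a machine moves only the keys the new machine takes over.

```python
import bisect
import hashlib
from typing import Dict, List


def token(value: str) -> int:
    """Map a string to a ring position (SHA-256 is an illustrative choice)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy hash ring with one token per node."""

    def __init__(self, nodes: List[str]) -> None:
        self._tokens: List[int] = []
        self._owner: Dict[int, str] = {}
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place a node on the ring at the position given by its token."""
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owner[t] = node

    def owner(self, partition_key: str) -> str:
        """Return the first node clockwise from the key's token, wrapping around."""
        i = bisect.bisect_left(self._tokens, token(partition_key))
        return self._owner[self._tokens[i % len(self._tokens)]]


ring = Ring(["node-a", "node-b", "node-c"])
keys = [f"sensor-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.add("node-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

# Every key that changed owner moved to the new node; all others stayed put.
print(all(after[k] == "node-d" for k in moved))  # True
```

The application keeps asking for data by key and never needs to know which machine holds it, which is what "application-transparent" refers to.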

What are some alternatives to Amazon S3 and Cassandra?
Amazon Glacier
In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.
Amazon EBS
Amazon EBS volumes are network-attached, and persist independently from the life of an instance. Amazon EBS provides highly available, highly reliable, predictable storage volumes that can be attached to a running Amazon EC2 instance and exposed as a device within the instance. Amazon EBS is particularly suited for applications that require a database, file system, or access to raw block level storage.
Amazon EC2
It is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
Google Drive
Keep photos, stories, designs, drawings, recordings, videos, and more. Your first 15 GB of storage are free with a Google Account. Your files in Drive can be reached from any smartphone, tablet, or computer.
Microsoft Azure
Azure is an open and flexible cloud platform that enables you to quickly build, deploy and manage applications across a global network of Microsoft-managed datacenters. You can build applications using any language, tool or framework. And you can integrate your public cloud applications with your existing IT environment.