Why developers like Amazon S3

What is Amazon S3?

Amazon Simple Storage Service provides a fully redundant data storage infrastructure for storing and retrieving any amount of data, at any time, from anywhere on the web.

Amazon S3 is a tool in the Cloud Storage category of a tech stack.

Who uses Amazon S3?

Companies

7320 companies reportedly use Amazon S3 in their tech stacks, including Airbnb, Pinterest, and Netflix.

Airbnb

Netflix

Spotify

Amazon

Udemy

Instacart

Developers

44265 developers on StackShare have stated that they use Amazon S3.

Atlassian

SantéVet

Evandro Magalhães

Clan Of The Cloud

SaaS

Lubert

Marin Software

Magazine du Webdesign

Personal

Amazon S3 Integrations

Travis CI, Gatsby, Auth0, Fastly, and Drone.io are some of the popular tools that integrate with Amazon S3. In total, 239 tools integrate with Amazon S3.

Travis CI

Gatsby

Auth0

Fastly

Drone.io

AWS CodePipeline

Amazon Athena

Liquibase

Minio

Pros of Amazon S3

Reliable (590)

Scalable (492)

Cheap (456)

Simple & easy (329)

Many SDKs

Logical

Easy Setup

REST API

1000+ POPs

Secure

Plug and play

Easy

Web UI for uploading files

Faster on response

Flexible

GDPR ready

Easy to use

Pluggable

Easy integration with CloudFront

Decisions about Amazon S3

Here are some stack decisions, common use cases, and reviews by companies and developers who chose Amazon S3 in their tech stack.

kew44

Nov 10, 2022 | 6 upvotes · 84.1K views

Needs advice on Amazon S3, Dremio, and Snowflake

Trying to establish a data lake (or maybe puddle) for my org's Data Sharing project. The idea is that outside partners would send cuts of their PHI data, regardless of format/variables/systems, to our Data Team, who would then harmonize the data, create data marts, and eventually use it for something. End-to-end, I'm envisioning:

1. Ingestion -> Secure, role-based, self-service portal for users to upload data (1a. bonus points if it can perform basic validations/masking)
2. Storage -> Amazon S3 seems like the cheapest. We probably won't need much space, even at full capacity. Our current storage is a secure Box folder that has ~4GB with several batches of test data, code, presentations, and planning docs.
3. Data Catalog -> AWS Glue? Azure Data Factory? Snowplow? Is the main difference basically the vendor? We also will have Data Dictionaries/Codebooks from submitters. Where would they fit in?
4. Partitions -> I've seen Cassandra and YARN mentioned, but have no experience with either.
5. Processing -> We want to use SAS if at all possible. What will work with SAS code?
6. Pipeline/Automation -> The check-in and verification processes that have been outlined are rather involved. Some sort of automated messaging or approval workflow would be nice.

I have very little guidance on what a "Data Mart" should look like, so I'm going with the idea that it would be another "experimental" partition. Unless there's an actual mart-building paradigm I've missed?

An end user might use the catalog to pull certain de-identified data sets from the marts. Again, role-based access and a self-service GUI would be preferable.

I'm the only full-time tech person on this project, but I'm mostly an OOP, HTML, JavaScript, and some SQL programmer. Most of this is out of my repertoire. I've done a lot of research, but I can't be an effective evangelist without hands-on experience. Since we're starting a new year of our grant, they've finally decided to let me try some stuff out. Any pointers would be appreciated!

Sindhumathi Parameswaran

Aug 4, 2022 | 3 upvotes · 28K views

Needs advice on Amazon S3 and MySQL

Hi, I'm working on a project to integrate data from Shopify (e-commerce platform) into Amazon QuickSight. I'm trying to decide which data store to use: Amazon S3 or MySQL.

Urban Leung

Apr 30, 2022 | 9 upvotes · 24.4K views

Needs advice on Heroku Postgres and MongoDB

Hi team, we are building an e-commerce marketplace with two sides (sellers and buyers). We have finished the frontend development, but we are now facing the choice of database. We will use Amazon S3 for cloud storage, but we are finding it hard to choose between a SQL database (Heroku Postgres) and a NoSQL database (MongoDB). For the tech stack, we are using Next.js and Node.js for the frontend and backend. Please share your professional input. Thanks in advance.

Raunak Dave

Apr 15, 2022 | 5 upvotes · 29.3K views

Needs advice on Amazon Athena and Amazon DynamoDB

So, I have data in Amazon S3 as Parquet files, and I have it available in the Glue Data Catalog too. I want to build an AppSync API on top of this data. The two options I am considering are:

Bring the data to Amazon DynamoDB and then build my API on top of this Database.
Add a Lambda function that resolves Amazon Athena queries made by AppSync.

Which of the two approaches will be more cost-effective?

I would really appreciate some back of the envelope estimates too.

Note: I only expect to make read queries. Thanks.

Arnaud Amzallag

Mar 25, 2022 | 6 upvotes · 42.7K views

Needs advice on Gatsby, Hexo, and WordPress

I have been building a website with Gatsby (for a small group of volunteers). I track it in GitHub and push it to Amazon S3.

I am satisfied with it as a single user; however, I would like non-technical teammates to be able to post Markdown blog posts. I tried to teach them to add MDX files, git push, gatsby build, and publish with gatsby-plugin-s3, but I am getting a fair amount of resistance :).

So I wonder if there are tools, preferably using Node.js, that support multiple blog authors à la WordPress, i.e. with an interface for non-technical bloggers, but producing static/pre-rendered web pages.

(PS: I am considering having a Node.js/Express server where they could upload their MDX file and the server would rebuild, push, and publish for them, without having them install anything, but I'd like to know if something already exists before jumping into this endeavor.)

Miroslav Petrovic

Senior Software Engineer at Incode technologies · Feb 7, 2022 | 3 upvotes · 14.8K views

Needs advice on Ceph and Minio

I need a private storage replacement for Amazon S3. Which one would you choose?

Blog Posts

Optimizing Pinterest’s Data Ingestion Stack: Findings and Lear...

Jun 29 2022 at 4:48AM

1346

3 Innovations While Unifying Pinterest’s Key-Value Storage

Mar 9 2022 at 6:41AM

981

MemQ: An Efficient, Scalable Cloud Native PubSub System

Nov 24 2021 at 8:14AM

1590

Efficient Resource Management at Pinterest’s Batch Processing ...

Oct 27 2021 at 4:26PM

1459

Faster Flink Adoption with Self-Service Diagnosis Tool at Pint...

Oct 6 2021 at 8:21AM

620

Improving Efficiency and Reducing Runtime Using S3 Read Optimi...

Sep 1 2021 at 5:34PM

1214

Unified Flink Source at Pinterest: Streaming Data Processing

Jul 29 2021 at 7:12PM

1237

Open Sourcing Querybook, Pinterest’s Collaborative Big Data Hu...

Jun 23 2021 at 5:13PM

10054

Amazon S3's Features

Write, read, and delete objects containing from 1 byte to 5 terabytes of data each. The number of objects you can store is unlimited.
Each object is stored in a bucket and retrieved via a unique, developer-assigned key.
A bucket can be stored in one of several Regions. You can choose a Region to optimize for latency, minimize costs, or address regulatory requirements. Amazon S3 is currently available in the US Standard, US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), South America (Sao Paulo), and GovCloud (US) Regions. The US Standard Region automatically routes requests to facilities in Northern Virginia or the Pacific Northwest using network maps.
Objects stored in a Region never leave the Region unless you transfer them out. For example, objects stored in the EU (Ireland) Region never leave the EU.
Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.
Options for secure data upload/download and encryption of data at rest are provided for additional data protection.
Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
Built to be flexible so that protocol or functional layers can easily be added. The default download protocol is HTTP. A BitTorrent protocol interface is provided to lower costs for high-scale distribution.
Provides functionality to simplify managing data throughout its lifetime. Includes options for segregating data by buckets, monitoring and controlling spend, and automatically archiving data to even lower-cost storage options. These options can be easily administered from the Amazon S3 Management Console.
Reliability is backed by the Amazon S3 Service Level Agreement.
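
The bucket, Region, and key addressing described in the feature list above can be illustrated with a minimal Python sketch that builds a virtual-hosted-style HTTPS URL for an object. The helper name `object_url`, the bucket name `example-bucket`, the Region `eu-west-1`, and the sample key are hypothetical and chosen only for illustration; this is a sketch of the addressing scheme, not an official client.

```python
from urllib.parse import quote


def object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style HTTPS URL for an S3 object.

    The bucket, region, and key values are supplied by the caller;
    nothing here contacts AWS or checks that the object exists.
    """
    # Percent-encode the key, but keep "/" so key prefixes stay readable.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"


# Hypothetical bucket, Region, and developer-assigned key.
url = object_url("example-bucket", "eu-west-1", "reports/2022/q1 summary.csv")
print(url)
```

A plain request to such a URL would only succeed for an object that has been made public; private objects need an authenticated or signed request, which this sketch does not attempt.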

Amazon S3 Alternatives & Comparisons

What are some alternatives to Amazon S3?

Amazon Glacier

In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.
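
As a rough worked example of the arithmetic behind the $0.01 per gigabyte per month figure quoted above, the Python sketch below multiplies a stored size by that rate. The function name `monthly_storage_cost` and the sample sizes are illustrative assumptions, and the estimate ignores retrieval, request, and data transfer charges; actual pricing may differ.

```python
def monthly_storage_cost(size_gb: float, price_per_gb_month: float = 0.01) -> float:
    """Estimate the monthly storage cost in US dollars.

    The default rate is the "as little as $0.01 per gigabyte per month"
    figure quoted in the text; it is an assumption, not current pricing.
    """
    return size_gb * price_per_gb_month


# Hypothetical archive sizes, in gigabytes.
for size_gb in (4, 500, 10_000):
    print(f"{size_gb:>6} GB -> ${monthly_storage_cost(size_gb):.2f} per month")
```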

Amazon EBS

Amazon EBS volumes are network-attached, and persist independently from the life of an instance. Amazon EBS provides highly available, highly reliable, predictable storage volumes that can be attached to a running Amazon EC2 instance and exposed as a device within the instance. Amazon EBS is particularly suited for applications that require a database, file system, or access to raw block level storage.

Amazon EC2

Amazon EC2 is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.

Google Drive

Keep photos, stories, designs, drawings, recordings, videos, and more. Your first 15 GB of storage are free with a Google Account. Your files in Drive can be reached from any smartphone, tablet, or computer.

Microsoft Azure

Azure is an open and flexible cloud platform that enables you to quickly build, deploy and manage applications across a global network of Microsoft-managed datacenters. You can build applications using any language, tool or framework. And you can integrate your public cloud applications with your existing IT environment.