AWS Data Pipeline is a web service that provides a simple management system for data-driven workflows. Using AWS Data Pipeline, you define a pipeline composed of the “data sources” that contain your data, the “activities” or business logic such as EMR jobs or SQL queries, and the “schedule” on which your business logic executes. For example, you could define a job that, every hour, runs an Amazon Elastic MapReduce (Amazon EMR)–based analysis on that hour’s Amazon Simple Storage Service (Amazon S3) log data, loads the results into a relational database for future lookup, and then automatically sends you a daily summary email. | Google Cloud Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Cloud Dataflow frees you from operational tasks like resource management and performance optimization. |
You can find (and use) a variety of popular AWS Data Pipeline tasks in the AWS Management Console’s template section: hourly analysis of Amazon S3-based log data; daily replication of Amazon DynamoDB data to Amazon S3; periodic replication of on-premises JDBC database tables into RDS | Fully managed; combines batch and streaming with a single API; high performance with automatic workload rebalancing; open source SDK |
Statistics | |
Stacks 94 | Stacks 219 |
Followers 398 | Followers 497 |
Votes 1 | Votes 19 |
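The AWS Data Pipeline description above names three parts of a pipeline: data sources, activities, and a schedule. The following is a minimal Python sketch of that model only. It does not use the AWS SDK or the service's real definition format; every class name, field, and location string here (`DataSource`, `Activity`, `Pipeline`, `run_order`, the `example-bucket` paths) is a hypothetical chosen for illustration, loosely following the hourly log-analysis example from the description.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class DataSource:
    name: str
    location: str  # illustrative only, e.g. an S3 prefix or a JDBC URL


@dataclass
class Activity:
    name: str
    kind: str    # free-form label such as "emr" or "sql" (not a real API type)
    reads: str   # name of the data source this activity consumes
    writes: str  # name of the data source this activity produces


@dataclass
class Pipeline:
    name: str
    period: timedelta  # the schedule: how often the activities run
    sources: dict = field(default_factory=dict)
    activities: list = field(default_factory=list)

    def add_source(self, source: DataSource) -> None:
        self.sources[source.name] = source

    def add_activity(self, activity: Activity) -> None:
        # Reject activities that point at data sources nobody declared.
        for ref in (activity.reads, activity.writes):
            if ref not in self.sources:
                raise ValueError(f"unknown data source: {ref}")
        self.activities.append(activity)

    def run_order(self) -> list:
        """Return activity names so that producers come before consumers.

        Simplification: assumes at most one producer per data source and
        does not detect dependency cycles.
        """
        by_name = {a.name: a for a in self.activities}
        produced_by = {a.writes: a.name for a in self.activities}
        ordered, seen = [], set()

        def visit(activity: Activity) -> None:
            if activity.name in seen:
                return
            seen.add(activity.name)
            upstream = produced_by.get(activity.reads)
            if upstream is not None and upstream != activity.name:
                visit(by_name[upstream])
            ordered.append(activity.name)

        for activity in self.activities:
            visit(activity)
        return ordered


# Hourly log analysis, loosely following the example in the description.
pipeline = Pipeline(name="hourly-log-analysis", period=timedelta(hours=1))
pipeline.add_source(DataSource("s3_logs", "s3://example-bucket/logs/"))
pipeline.add_source(DataSource("analysis_results", "s3://example-bucket/results/"))
pipeline.add_source(DataSource("reporting_db", "jdbc:example://reporting"))

# Activities are added out of order on purpose; run_order() sorts them.
pipeline.add_activity(Activity("load_results", "sql", "analysis_results", "reporting_db"))
pipeline.add_activity(Activity("analyze_logs", "emr", "s3_logs", "analysis_results"))

print(pipeline.run_order())
```

The sketch keeps the schedule as a single period on the pipeline and derives execution order from which activity writes the data another one reads. A real service would additionally handle retries, failure notifications, and resource provisioning, none of which is modeled here.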

Amazon Kinesis can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real-time, from sources such as web site click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.

AWS Snowball Edge is a 100TB data transfer device with on-board storage and compute capabilities. You can use Snowball Edge to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.

It is an elegant and simple HTTP library for Python, built for human beings. It allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your POST data.

Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture and automatically load streaming data into Amazon S3 and Amazon Redshift, enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today.

It is a .NET library that can read/write Office formats without Microsoft Office installed. No COM+, no interop.

Its focus is on performance; specifically, end-user perceived latency and network and server resource usage.

It is an open-source bulk data loader that helps data transfer between various databases, storages, file formats, and cloud services.

BigQuery Data Transfer Service lets you focus your efforts on analyzing your data. You can set up a data transfer with a few clicks. Your analytics team can lay the foundation for a data warehouse without writing a single line of code.

A cloud-based solution engineered to fill the gaps between cloud applications. The software utilizes Intelligent 2-way Contact Sync technology to sync contacts in real-time between your favorite CRM and marketing apps.

It offers an industry-leading data synchronization tool, trusted by millions of users and thousands of companies across the globe: resilient, fast, and scalable P2P file sync software for enterprises and individuals.