Azure Data Factory

Hybrid data integration service that simplifies ETL at scale

What is Azure Data Factory?

Azure Data Factory is a service that lets developers integrate disparate data sources. It resembles SSIS in the cloud and manages data held both on-premises and in the cloud.
Azure Data Factory is a tool in the Big Data Tools category of a tech stack.
Azure Data Factory's public GitHub repository has 349 stars and 430 forks.

Who uses Azure Data Factory?

27 companies reportedly use Azure Data Factory in their tech stacks, including ViaVarejo, Runtastic, and Mews.

160 developers on StackShare have stated that they use Azure Data Factory.

Azure Data Factory Integrations

Five tools integrate with Azure Data Factory: Java, .NET, Azure HDInsight, Octotree, and Octopai.

Decisions about Azure Data Factory

Here are some stack decisions, common use cases and reviews by companies and developers who chose Azure Data Factory in their tech stack.

Vamshi Krishna
Data Engineer at Tata Consultancy Services · 4 upvotes · 146.9K views

I have to collect different data from multiple sources and store it in a single cloud location. I then need to clean and transform it using PySpark and push the end results to other applications, such as reporting tools. What would be the best solution? I can only think of Azure Data Factory + Databricks. Are there any alternatives to #AWS services + Databricks?

Azure Data Factory's Features

  • Real-Time Integration
  • Parallel Processing
  • Data Chunker
  • Data Masking
  • Proactive Monitoring
  • Big Data Processing

Azure Data Factory Alternatives & Comparisons

What are some alternatives to Azure Data Factory?
Azure Databricks
Accelerate big data analytics and artificial intelligence (AI) solutions with Azure Databricks, a fast, easy and collaborative Apache Spark–based analytics service.
It is an open source software integration platform that helps you turn data into business insights effortlessly. It uses native code generation that lets you run your data pipelines seamlessly across all cloud providers and get optimized performance on all platforms.
AWS Data Pipeline
AWS Data Pipeline is a web service that provides a simple management system for data-driven workflows. Using AWS Data Pipeline, you define a pipeline composed of the “data sources” that contain your data, the “activities” or business logic such as EMR jobs or SQL queries, and the “schedule” on which your business logic executes. For example, you could define a job that, every hour, runs an Amazon Elastic MapReduce (Amazon EMR)–based analysis on that hour’s Amazon Simple Storage Service (Amazon S3) log data, loads the results into a relational database for future lookup, and then automatically sends you a daily summary email.
AWS Glue
A fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.
Apache NiFi
An easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
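
The AWS Data Pipeline description above names three parts of a pipeline: data sources, activities, and a schedule. The sketch below models that structure in plain Python to show how the parts relate. It does not use the AWS API; the names `Activity`, `Pipeline`, `run_once`, `next_runs`, and `count_errors` are hypothetical and exist only for this illustration, as is the sample log data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass
class Activity:
    """One unit of business logic (hypothetical stand-in for an EMR job or SQL query)."""
    name: str
    run: Callable[[Dict[str, list]], Dict[str, list]]


@dataclass
class Pipeline:
    """A pipeline in the sense described above: data sources, activities, a schedule."""
    data_sources: Dict[str, list]   # source name -> records
    activities: List[Activity]      # executed in list order
    period: timedelta               # how often the business logic runs

    def run_once(self) -> Dict[str, list]:
        # Copy the sources so a run never mutates the originals.
        data = {name: list(rows) for name, rows in self.data_sources.items()}
        for activity in self.activities:
            data = activity.run(data)
        return data

    def next_runs(self, start: datetime, count: int) -> List[datetime]:
        # The schedule: the first `count` run times, one per period.
        return [start + i * self.period for i in range(count)]


def count_errors(data: Dict[str, list]) -> Dict[str, list]:
    # Toy "analysis": count log lines that contain the word ERROR.
    errors = [line for line in data["logs"] if "ERROR" in line]
    return {**data, "summary": [len(errors)]}


# Made-up sample data standing in for an hour of log lines.
pipeline = Pipeline(
    data_sources={"logs": ["INFO start", "ERROR disk full", "ERROR timeout"]},
    activities=[Activity("count_errors", count_errors)],
    period=timedelta(hours=1),
)

result = pipeline.run_once()
runs = pipeline.next_runs(datetime(2024, 1, 1), 3)
```

Here `result["summary"]` holds the error count for the sample log lines, and `runs` lists three hourly run times. A real service would also handle retries, dependencies between activities, and delivery of results, none of which this sketch attempts.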

Azure Data Factory's Followers
389 developers follow Azure Data Factory to keep up with related blogs and decisions.