Alternatives to AWS Step Functions

AWS Lambda, Airflow, AWS Batch, AWS Data Pipeline, and Batch are the most popular alternatives and competitors to AWS Step Functions.
200
333
23

What is AWS Step Functions and what are its top alternatives?

AWS Step Functions makes it easy to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function lets you scale and change applications quickly.
AWS Step Functions is a tool in the Cloud Task Management category of a tech stack.
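
As a rough illustration of what such a workflow can look like, the sketch below builds a small state machine definition in the Amazon States Language as a Python dictionary and serializes it to JSON. The state names, the Lambda ARN, and the retry numbers are hypothetical placeholders chosen for this example, not values from any real deployment.

```python
import json

# Hypothetical ARN used only for illustration; substitute a real function ARN.
PROCESS_FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-order"


def build_definition(task_arn: str) -> dict:
    """Return a two-state workflow: one retried task, then success."""
    return {
        "Comment": "Illustrative workflow with a single retried task",
        "StartAt": "ProcessOrder",
        "States": {
            "ProcessOrder": {
                "Type": "Task",
                "Resource": task_arn,
                # Retry failed task attempts with exponential backoff.
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 5,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "Next": "Done",
            },
            "Done": {"Type": "Succeed"},
        },
    }


definition = build_definition(PROCESS_FN_ARN)
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Each component stays a separate function, and the definition only describes the order of steps and what to do on failure.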

Top Alternatives to AWS Step Functions

  • AWS Lambda

    AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security. ...

  • Airflow

    Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command-line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress and troubleshoot issues when needed. ...

  • AWS Batch

    It enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. ...

  • AWS Data Pipeline

    AWS Data Pipeline is a web service that provides a simple management system for data-driven workflows. Using AWS Data Pipeline, you define a pipeline composed of the “data sources” that contain your data, the “activities” or business logic such as EMR jobs or SQL queries, and the “schedule” on which your business logic executes. For example, you could define a job that, every hour, runs an Amazon Elastic MapReduce (Amazon EMR)–based analysis on that hour’s Amazon Simple Storage Service (Amazon S3) log data, loads the results into a relational database for future lookup, and then automatically sends you a daily summary email. ...

  • Batch

    Yes, we’re really free. So, how do we keep the lights on? Instead of charging you a monthly fee, we sell ads on your behalf to the top 500 mobile advertisers in the world. With Batch, you earn money each month while accessing great engagement tools for free. ...

  • Camunda

    It is an open source platform for workflow and decision automation that brings business users and software developers together. ...

  • Google Keep

    It is a note-taking service developed by Google. It is available on the web, and has mobile apps for the Android and iOS mobile operating systems. Keep offers a variety of tools for taking notes, including text, lists, images, and audio. ...

  • Amazon SWF

    Amazon Simple Workflow allows you to structure the various processing steps in an application that runs across one or more machines as a set of “tasks.” Amazon SWF manages dependencies between the tasks, schedules the tasks for execution, and runs any logic that needs to be executed in parallel. The service also stores the tasks, reliably dispatches them to application components, tracks their progress, and keeps their latest state. ...
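
Several of the tools above (Airflow in particular) model a workflow as a DAG of tasks and run each task only after its dependencies have finished. The sketch below shows that idea with Python's standard-library graphlib rather than Airflow itself; the task names and dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
    "report": {"load"},
}


def execution_order(deps: dict) -> list:
    """Return one valid order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Tasks with no dependency between them ("transform" and "validate" here) may appear in either order, which is what lets a scheduler run them in parallel on separate workers.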

AWS Step Functions alternatives & related posts

AWS Lambda

19.3K
14.4K
425
Automatically run code in response to modifications to objects in Amazon S3 buckets, messages in Kinesis streams, or...
PROS OF AWS LAMBDA
  • 128
    No infrastructure
  • 82
    Cheap
  • 69
    Quick
  • 58
    Stateless
  • 47
    No deploy, no server, great sleep
  • 9
    AWS Lambda went down taking many sites with it
  • 6
    Easy to deploy
  • 6
    Extensive API
  • 6
    Auto scale and cost effective
  • 6
    Event Driven Governance
  • 5
    VPC Support
  • 3
    Integrated with various AWS services
CONS OF AWS LAMBDA
  • 5
    Can't execute Ruby or Go
  • 1
    Compute time limited
  • 0
    Can't execute PHP w/o significant effort

related AWS Lambda posts

Jeyabalaji Subramanian

Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

We set ourselves the following criteria for the optimal tool that would do this job:

  • The data replication must be near real-time, yet it should NOT impact the production database.
  • The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient.

Based on the above criteria, we selected the following tools to perform the end to end data replication:

We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pick up the data payload & mirror the DB changes onto the target data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy. We deployed this micro-service as an AWS Lambda function with Zappa. With Zappa, deploying your services as an event-driven & horizontally scalable Lambda service is dumb-easy.
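
A minimal sketch of what such a listener might look like is shown below. It is not the author's code: the message body layout (the "operation", "id" and "document" keys) is an assumption made for this example, and an in-memory dictionary stands in for the SQLAlchemy-backed warehouse table. Only the outer event shape (a "Records" list whose items carry a "body" string) follows the format Lambda uses for SQS events.

```python
import json

# In-memory stand-in for the target warehouse table, keyed by document id.
# A real service would write to PostgreSQL instead.
TARGET_TABLE: dict = {}


def apply_change(change: dict, table: dict) -> None:
    """Mirror one change event (hypothetical format) onto the target table."""
    op = change["operation"]
    doc_id = change["id"]
    if op in ("insert", "replace"):
        table[doc_id] = dict(change["document"])
    elif op == "update":
        table.setdefault(doc_id, {}).update(change["document"])
    elif op == "delete":
        table.pop(doc_id, None)
    else:
        raise ValueError(f"unsupported operation: {op}")


def handler(event, context=None):
    """Lambda-style entry point: apply every SQS record in the batch."""
    records = event.get("Records", [])
    for record in records:
        apply_change(json.loads(record["body"]), TARGET_TABLE)
    return {"processed": len(records)}
```

Keeping the translation in a small pure function such as apply_change makes it easy to test without AWS or a database.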

In the end, we got to implement a highly scalable, near real-time Change Data Replication service that "works" and deployed it to production in a matter of a few days!

Tim Nolet

Heroku, Docker, GitHub, Node.js, hapi, Vue.js, AWS Lambda, Amazon S3, PostgreSQL, Knex.js

Checkly is a fairly young company and we're still working hard to find the correct mix of product features, price and audience.

We are focussed on tech B2B, but I always wanted to serve solo developers too. So I decided to make a $7 plan.

Why $7? Simply put, it seems to be a sweet spot for tech companies: Heroku, Docker, GitHub, Appoptics (Librato) all offer $7 plans. They must have done a ton of research into this, so why not piggyback on that and try it out?

Enough biz talk, onto tech. The challenges were:

  • Slice off a portion of the functionality so a $7 plan is still profitable. We call this the "plan limits".
  • Update API and back end services to handle and enforce plan limits.
  • Update the UI to kindly state plan limits are in effect on some part of the UI.
  • Update the pricing page to reflect all changes.
  • Keep the actual processing backend, storage and APIs as untouched as possible.

In essence, we went from strictly volume based pricing to value based pricing. Here come the technical steps & decisions we made to get there.

  1. We updated our PostgreSQL schema so plans now have an array of "features". These are string constants that represent feature toggles.
  2. The Vue.js frontend reads these from the vuex store on login.
  3. Based on these values, the UI has simple v-if statements to either just show the feature or show a friendly "please upgrade" button.
  4. The hapi API has a hook on each relevant API endpoint that checks whether a user's plan has the feature enabled, or not.
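
The steps above can be sketched roughly as follows. The original stack is PostgreSQL, Vue.js and hapi; this Python version only illustrates the pattern, and the plan names, feature constants, and function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plan:
    """A pricing plan carrying a set of feature-toggle string constants."""
    name: str
    features: frozenset = field(default_factory=frozenset)


# Hypothetical plans and feature constants, for illustration only.
PLANS = {
    "free": Plan("free", frozenset({"API_CHECKS"})),
    "developer": Plan("developer", frozenset({"API_CHECKS", "SMS_ALERTS"})),
}


class UpgradeRequired(Exception):
    """Raised by the API-side check when a plan lacks a feature."""


def can_show(plan: Plan, feature: str) -> bool:
    """UI-side check: show the feature, or a "please upgrade" button."""
    return feature in plan.features


def require_feature(plan: Plan, feature: str) -> None:
    """API-side hook: reject the request if the plan lacks the feature."""
    if feature not in plan.features:
        raise UpgradeRequired(f"{plan.name} plan does not include {feature}")
```

Both checks read the same feature list, so adding a feature to a plan is a data change rather than a code change.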

Side note: We offer 10 SMS messages per month on the developer plan. However, we were not actually counting how many people were sending. We had to update our alerting daemon (that runs on Heroku and triggers SMS messages via AWS SNS) to actually bump a counter.

What we built is basically feature-toggling based on plan features. It is very extensible for future additions. Our scheduling and storage backend that actually runs users' monitoring requests (AWS Lambda) and stores the results (S3 and Postgres) has no knowledge of all of this and remained unchanged.

Hope this helps anyone who is building out their SaaS and is in a similar situation.


Airflow

1.4K
2.3K
123
A platform to programmatically author, schedule and monitor data pipelines, by Airbnb
PROS OF AIRFLOW
  • 49
    Features
  • 14
    Task Dependency Management
  • 12
    Beautiful UI
  • 12
    Cluster of workers
  • 10
    Extensibility
  • 5
    Open source
  • 5
    Python
  • 4
    Complex workflows
  • 3
    K
  • 3
    Good api
  • 2
    Custom operators
  • 2
    Apache project
  • 2
    Dashboard
CONS OF AIRFLOW
  • 2
    Running it on a Kubernetes cluster is relatively complex
  • 2
    Open source - provides minimum or no support
  • 1
    Logical separation of DAGs is not straightforward
  • 1
    Observability is not great when the DAGs exceed 250

related Airflow posts

Shared insights on Jenkins, Airflow

I am looking for an open-source scheduler tool with cross-functional application dependencies. Some of the tasks I am looking to schedule are as follows:

  1. Trigger Matillion ETL loads
  2. Trigger Attunity Replication tasks that have downstream ETL loads
  3. Trigger GoldenGate replication tasks
  4. Shell scripts, wrappers, file watchers
  5. Event-driven schedules

I have used Airflow in the past, and I know we need to create DAGs for each pipeline. I am not familiar with Jenkins, but I know it works with configuration without much underlying code. I want to evaluate both and would appreciate any advice.

Shared insights on AWS Step Functions, Airflow

I am working on a project that grabs a set of input data from AWS S3, pre-processes and divvies it up, spins up 10K batch containers to process the divvied data in parallel on AWS Batch, post-aggregates the data, and pushes it to S3.

I already have software patterns from other projects for Airflow + Batch but have not dealt with the scaling factors of 10k parallel tasks. Airflow is nice since I can look at which tasks failed and retry a task after debugging. But dealing with that many tasks on one Airflow EC2 instance seems like a barrier. Another option would be to have one task that kicks off the 10k containers and monitors it from there.

I have no experience with AWS Step Functions but have heard it's AWS's Airflow. There look to be plenty of patterns online for Step Functions + Batch. Does Step Functions seem like a good path to check out for my use case? Do you get the same insights on failing jobs and the same ability to retry tasks as you do with Airflow?


AWS Batch

79
213
6
Fully Managed Batch Processing at Any Scale
PROS OF AWS BATCH
  • 3
    Containerized
  • 3
    Scalable
CONS OF AWS BATCH
  • 2
    More overhead than Lambda
  • 1
    Image management

related AWS Batch posts

Sumit Singh Chauhan
Data Scientist at Entropik · 6 upvotes · 22.6K views

I have started using AWS Batch for some long ML inference jobs. So far it's working well and giving decent performance. Since it is fully managed, it saves a lot of extra work as well. But Batch takes a good amount of time to create a new cluster and then load the job based on the priority of the queue. Going forward, I would love to put effort into something that is fast to start and gives more flexibility as well. What other tools would you suggest for long-running backend jobs that can scale well? I am not looking for something fully managed, so ignore the options similar to Batch in Google Cloud Platform or Microsoft Azure; I am looking for open-source alternatives here. Do you think Kubernetes or RabbitMQ/Kafka would be a good fit, or just overkill for my problem? Usually we get 1000s of requests in parallel, and each job might take 20-30 mins on a 2 vCPU system.


AWS Data Pipeline

91
362
1
Process and move data between different AWS compute and storage services
PROS OF AWS DATA PIPELINE
  • 1
    Easy to create DAG and execute it
CONS OF AWS DATA PIPELINE
    Be the first to leave a con

related AWS Data Pipeline posts

Batch

37
34
1
Free retention toolkit for indie developers & startups - push notifications, user analytics, reward engine, and native ads
PROS OF BATCH
  • 1
    Revenuecat
CONS OF BATCH
    Be the first to leave a con

related Batch posts

Camunda

140
157
0
A Workflow and Decision Automation Platform
PROS OF CAMUNDA
    Be the first to leave a pro
CONS OF CAMUNDA
    Be the first to leave a con

related Camunda posts

Google Keep

56
50
0
Capture what’s important and get more done
PROS OF GOOGLE KEEP
    Be the first to leave a pro
CONS OF GOOGLE KEEP
    Be the first to leave a con

related Google Keep posts

Amazon SWF

35
72
0
Automate the coordination, auditing, and scaling of applications across multiple machines
PROS OF AMAZON SWF
    Be the first to leave a pro
CONS OF AMAZON SWF
    Be the first to leave a con

related Amazon SWF posts