Alternatives to Confluent

Databricks, Kafka, RabbitMQ, Amazon SQS, and Celery are the most popular alternatives and competitors to Confluent.
141
174
13

What is Confluent and what are its top alternatives?

Confluent is a data streaming platform based on Apache Kafka: a full-scale streaming platform capable not only of publish-and-subscribe messaging, but also of storing and processing data within the stream.
Confluent is a tool in the Message Queue category of a tech stack.

Top Alternatives to Confluent

  • Databricks

    Databricks Unified Analytics Platform, from the original creators of Apache Spark™, unifies data science and engineering across the Machine Learning lifecycle from data preparation to experimentation and deployment of ML applications. ...

  • Kafka

    Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. ...

  • RabbitMQ

    RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received. ...

  • Amazon SQS

    Transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available. With SQS, you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price for only what you use. ...

  • Celery

    Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. ...

  • ActiveMQ

    Apache ActiveMQ is fast, supports many cross-language clients and protocols, comes with easy-to-use Enterprise Integration Patterns and many advanced features, and fully supports JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License. ...

  • MQTT

    It was designed as an extremely lightweight publish/subscribe messaging transport. It is useful for connections with remote locations where a small code footprint is required and/or network bandwidth is at a premium. ...

  • Kafka Streams

    It is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology. ...
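
The Kafka entry in the list above describes a "partitioned, replicated commit log". As a rough illustration of what the partitioned, append-only part means, here is a minimal in-memory sketch. It is not Kafka's implementation: the class name `PartitionedLog`, the use of `zlib.crc32` for choosing a partition, and the default of three partitions are all assumptions made for this example, and replication is not modeled at all.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PartitionedLog:
    """Toy in-memory stand-in for a partitioned, append-only log (hypothetical)."""

    num_partitions: int = 3
    partitions: List[List[Tuple[str, str]]] = field(init=False)

    def __post_init__(self) -> None:
        # One independent, ordered list of records per partition.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumption: a stable hash of the key picks the partition, so records
        # with the same key always land in the same partition, in order.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def append(self, key: str, value: str) -> Tuple[int, int]:
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int = 0) -> List[Tuple[str, str]]:
        """Replay records from an offset; reading never removes anything."""
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=4)
first = log.append("user-1", "signed_up")
second = log.append("user-1", "logged_in")
replayed = log.read(first[0], offset=0)
```

Because reads do not delete records, a consumer can re-read from any earlier offset, which is the "replayable" property several of the Kafka pros further down refer to.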

Confluent alternatives & related posts

Databricks

334
538
8
A unified analytics platform, powered by Apache Spark
PROS OF DATABRICKS
  • 1
    Best performance on large datasets
  • 1
    True lakehouse architecture
  • 1
    Scalability
  • 1
    Databricks doesn't get access to your data
  • 1
    Usage Based Billing
  • 1
    Security
  • 1
    Data stays in your cloud account
  • 1
    Multicloud
CONS OF DATABRICKS
    None listed

    Kafka

    17.7K
    16.8K
    587
    Distributed, fault tolerant, high throughput pub-sub messaging system
    PROS OF KAFKA
    • 125
      High-throughput
    • 118
      Distributed
    • 88
      Scalable
    • 82
      High-Performance
    • 65
      Durable
    • 37
      Publish-Subscribe
    • 19
      Simple-to-use
    • 16
      Open source
    • 11
      Written in Scala and Java. Runs on the JVM
    • 7
      Message broker + Streaming system
    • 4
      Avro schema integration
    • 4
      KSQL
    • 3
      Robust
    • 2
      Supports multiple clients
    • 2
      Partitioned, replayable log
    • 1
      Flexible
    • 1
      Extremely good parallelism constructs
    • 1
      Simple publisher / multi-subscriber model
    • 1
      Fun
    CONS OF KAFKA
    • 29
      Non-Java clients are second-class citizens
    • 27
      Needs Zookeeper
    • 7
      Operational difficulties
    • 2
      Terrible Packaging

    related Kafka posts

    Eric Colson
    Chief Algorithms Officer at Stitch Fix · | 21 upvotes · 2.3M views

    The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3-based data warehouse. Apache Spark on YARN is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling YARN clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad hoc queries and dashboards.

    Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in-house and open sourced (see https://github.com/stitchfix/flotilla-os).

    At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into our systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize the models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This gives our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.

    #DataScience #DataStack #Data

    John Kodumal

    As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.

    We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.

    RabbitMQ

    16.4K
    14.4K
    520
    Open source multiprotocol messaging broker
    PROS OF RABBITMQ
    • 231
      It's fast and it works with good metrics/monitoring
    • 79
      Ease of configuration
    • 58
      I like the admin interface
    • 50
      Easy to set-up and start with
    • 20
      Durable
    • 18
      Intuitive to work with through Python
    • 18
      Standard protocols
    • 10
      Written primarily in Erlang
    • 8
      Simply superb
    • 6
      Completeness of messaging patterns
    • 3
      Scales to 1 million messages per second
    • 3
      Reliable
    • 2
      Better than most traditional queue-based message brokers
    • 2
      Distributed
    • 2
      Supports AMQP
    • 1
      Inubit Integration
    • 1
      Delayed messages
    • 1
      Supports MQTT
    • 1
      Runs on Open Telecom Platform
    • 1
      High performance
    • 1
      Reliability
    • 1
      Clusterable
    • 1
      Clear documentation covering different scripting languages
    • 1
      Great UI
    • 1
      Better routing system
    CONS OF RABBITMQ
    • 9
      Too complicated cluster/HA config and management
    • 6
      Needs the Erlang runtime, and ops staff who are comfortable with it
    • 5
      Configuration must be done first, not by your code
    • 4
      Slow

    related RabbitMQ posts

    James Cunningham
    Operations Engineer at Sentry · | 18 upvotes · 1.4M views
    Shared insights on Celery and RabbitMQ

    As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.

    Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.

    #MessageQueue

    Yogesh Bhondekar
    Co-Founder at weconnect.chat · | 15 upvotes · 169.9K views

    Hi, I am building an enhanced web-conferencing app that will have voice/video calls, live chat, live notifications, live discussions, screen sharing, and similar features. Ref: Zoom.

    I need advice on finalizing the tech stack for this app. I am considering the tech stack below:

    • Frontend: React
    • Backend: Node.js
    • Database: MongoDB
    • IAAS: #AWS
    • Containers & Orchestration: Docker / Kubernetes
    • DevOps: GitLab, Terraform
    • Brokers: Redis / RabbitMQ

    I need advice at the platform level as to what could be considered to support concurrent video streaming seamlessly.

    Also, please suggest what could be a better tech stack for my app?

    #SAAS #VideoConferencing #WebAndVideoConferencing #zoom #stack

    Amazon SQS

    2K
    1.7K
    166
    Fully managed message queuing service
    PROS OF AMAZON SQS
    • 60
      Easy to use, reliable
    • 39
      Low cost
    • 27
      Simple
    • 13
      No need to maintain it
    • 8
      It is Serverless
    • 4
      Has a max message size (currently 256K)
    • 3
      Easy to configure with Terraform
    • 3
      Triggers Lambda
    • 3
      Delayed delivery up to 15 minutes only
    • 3
      Delayed delivery up to 12 hours
    • 1
      JMS compliant
    • 1
      Support for retry and dead letter queue
    CONS OF AMAZON SQS
    • 2
      Has a max message size (currently 256K)
    • 2
      Proprietary
    • 2
      Difficult to configure
    • 1
      Has a maximum 15 minutes of delayed messages only

    related Amazon SQS posts

    Praveen Mooli
    Engineering Manager at Taylor and Francis · | 17 upvotes · 2.3M views

    We are in the process of building a modern content platform to deliver our content through various channels. We decided to go with Microservices architecture as we wanted scale. Microservice architecture style is an approach to developing an application as a suite of small independently deployable services built around specific business capabilities. You can gain modularity, extensive parallelism and cost-effective scaling by deploying services across many distributed servers. Microservices modularity facilitates independent updates/deployments, and helps to avoid single point of failure, which can help prevent large-scale outages. We also decided to use Event Driven Architecture pattern which is a popular distributed asynchronous architecture pattern used to produce highly scalable applications. The event-driven architecture is made up of highly decoupled, single-purpose event processing components that asynchronously receive and process events.

    To build our #Backend capabilities we decided to use the following:

    1. #Microservices - Java with Spring Boot, Node.js with ExpressJS, and Python with Flask
    2. #Eventsourcingframework - Amazon Kinesis, Amazon Kinesis Firehose, Amazon SNS, Amazon SQS, AWS Lambda
    3. #Data - Amazon RDS, Amazon DynamoDB, Amazon S3, MongoDB Atlas

    To build #Webapps we decided to use Angular 2 with RxJS

    #Devops - GitHub, Travis CI, Terraform, Docker, Serverless

    Tim Specht
    Co-Founder and CTO at Dubsmash · | 14 upvotes · 651K views

    In order to accurately measure & track user behaviour on our platform we moved over quickly from the initial solution using Google Analytics to a custom-built one due to resource & pricing concerns we had.

    While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.

    In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.
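
The client-side batching described in this post (aggregate events locally, then send them as one JSON blob once a threshold is reached or the app goes to the background) can be sketched roughly as follows. This is an illustration only, not Dubsmash's code: `EventBuffer`, its method names, and the threshold value are hypothetical, and `send` stands in for whatever call actually writes to the stream.

```python
import json
from typing import Callable, Dict, List


class EventBuffer:
    """Hypothetical client-side buffer that batches events before sending."""

    def __init__(self, send: Callable[[str], None], threshold: int = 50) -> None:
        self._send = send            # assumed transport, e.g. a stream writer
        self._threshold = threshold  # flush once this many events are queued
        self._events: List[Dict] = []

    def track(self, event: Dict) -> None:
        """Queue one event and flush if the threshold has been reached."""
        self._events.append(event)
        if len(self._events) >= self._threshold:
            self.flush()

    def on_background(self) -> None:
        """Flush whatever is pending when the app goes to the background."""
        self.flush()

    def flush(self) -> None:
        if not self._events:
            return
        blob = json.dumps(self._events)
        self._events = []
        self._send(blob)


# Usage: collect the outgoing blobs in a list instead of a real stream.
sent: List[str] = []
buffer = EventBuffer(send=sent.append, threshold=2)
buffer.track({"name": "open"})
buffer.track({"name": "play"})   # threshold reached, first blob is sent
buffer.track({"name": "share"})
buffer.on_background()           # remaining event is sent as a second blob
```

Batching on the client trades a little latency for far fewer requests, which matters on mobile connections; the threshold and the background hook are the two knobs the post mentions.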

    #ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService

    Celery

    1.4K
    1.4K
    268
    Distributed task queue
    PROS OF CELERY
    • 96
      Task queue
    • 62
      Python integration
    • 37
      Django integration
    • 29
      Scheduled Task
    • 18
      Publish/subscribe
    • 6
      Various backend brokers
    • 6
      Easy to use
    • 5
      Great community
    • 4
      Free
    • 4
      Workflow
    • 1
      Dynamic
    CONS OF CELERY
    • 4
      Sometimes loses tasks
    • 1
      Depends on broker

    related Celery posts

    Pulkit Sapra

    Hi! I am creating a scraping system in Django, which involves long-running tasks lasting between 1 minute and 1 day. As I am new to message brokers and task queues, I need advice on which architecture to use for my system (Amazon SQS, RabbitMQ, or Celery). The system should be autoscalable using Kubernetes (K8s) based on the number of pending tasks in the queue.
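
Whichever broker is chosen, scaling on the number of pending tasks usually comes down to turning a queue length into a worker count. A minimal sketch of that calculation is shown below, assuming a fixed target of tasks per worker and hard lower and upper bounds; the function name, parameters, and defaults are hypothetical, and the code does not talk to Kubernetes or to any broker.

```python
import math


def desired_workers(
    pending_tasks: int,
    tasks_per_worker: int = 10,
    min_workers: int = 1,
    max_workers: int = 20,
) -> int:
    """Return how many workers to run for a given queue backlog (illustrative)."""
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    # Round up so a partial batch of tasks still gets a worker.
    wanted = math.ceil(pending_tasks / tasks_per_worker)
    # Clamp to the configured bounds.
    return max(min_workers, min(max_workers, wanted))


idle = desired_workers(0)
moderate = desired_workers(25)
overloaded = desired_workers(1000)
```

With tasks that can run for up to a day, scaling down needs care: a worker should finish or hand back its current task before it is removed, which this sketch does not address.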

    ActiveMQ

    475
    1.1K
    76
    A message broker written in Java together with a full JMS client
    PROS OF ACTIVEMQ
    • 18
      Easy to use
    • 14
      Open source
    • 13
      Efficient
    • 10
      JMS compliant
    • 6
      High Availability
    • 5
      Scalable
    • 3
      Support XA (distributed transactions)
    • 3
      Persistence
    • 2
      Distributed Network of brokers
    • 1
      Highly configurable
    • 1
      Docker delivery
    • 0
      RabbitMQ
    CONS OF ACTIVEMQ
    • 1
      Support
    • 1
      Low resilience to exceptions and interruptions
    • 1
      Difficult to scale

    related ActiveMQ posts

    I want to choose a message queue with the following features: highly available, distributed, scalable, and with monitoring. I have RabbitMQ, ActiveMQ, Kafka and Apache RocketMQ in mind, but I am confused about which one to choose.

    Naushad Warsi
    software developer at klingelnberg · | 1 upvote · 648.3K views
    Shared insights on ActiveMQ and RabbitMQ

    I use ActiveMQ because RabbitMQ has stopped supporting AMQP 1.0 and above, and the earlier version of AMQP does not provide the functionality needed to support OAuth.

    If OAuth is not required and we can go with AMQP 0.9, then I still recommend RabbitMQ.

    MQTT

    386
    447
    5
    A machine-to-machine Internet of Things connectivity protocol
    PROS OF MQTT
    • 3
      Varying levels of Quality of Service to fit a range of
    • 1
      Very easy to configure and use with open source tools
    • 1
      Lightweight with a relatively small data footprint
    CONS OF MQTT
    • 1
      Easy to configure in an insecure manner

    Kafka Streams

    316
    387
    0
    A client library for building applications and microservices
    PROS OF KAFKA STREAMS
      None listed
    CONS OF KAFKA STREAMS
      None listed

        related Kafka Streams posts

        I have recently started using Confluent/Kafka cloud. We want to do some stream processing. As I was going through Kafka I came across Kafka Streams and KSQL. Both seem to be a good fit for stream processing, but I could not work out which one should be used and whether one has any advantage over the other. We will be using a Confluent/Kafka managed cloud instance. In the near future, our producers and consumers will be running on-premises and interacting with Confluent Cloud.

        Also, Confluent Cloud Kafka has a primitive interface; is there a better UI for managing a Kafka Cloud cluster?

        Shared insights on Apache Flink and Kafka Streams

        We currently have 2 Kafka Streams topics that have records coming in continuously. We're looking into joining the 2 streams based on a key with a window of 5 minutes based on their timestamp.

        Should I consider a KStream-KStream join or Apache Flink window joins? Or is there a better way to achieve this?
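
To make the question concrete, the semantics being asked for (an inner join on key, where the two records' timestamps are at most 5 minutes apart) can be written as a small batch function. This is only a sketch of the join condition, not of how Kafka Streams or Apache Flink implement it: real engines process records incrementally, keep windowed state, and deal with late or out-of-order data. The record layout, the function name, and the symmetric window are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed record layout: (key, timestamp in seconds, value).
Record = Tuple[str, float, str]


def windowed_join(
    left: Iterable[Record],
    right: Iterable[Record],
    window_seconds: float = 300.0,
) -> List[Tuple[str, str, str]]:
    """Inner-join two record sets on key when timestamps are within the window."""
    # Index the right side by key so each left record only scans candidates.
    right_by_key: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for key, ts, value in right:
        right_by_key[key].append((ts, value))

    joined: List[Tuple[str, str, str]] = []
    for key, left_ts, left_value in left:
        for right_ts, right_value in right_by_key.get(key, []):
            # Symmetric window: the right record may be earlier or later.
            if abs(left_ts - right_ts) <= window_seconds:
                joined.append((key, left_value, right_value))
    return joined


orders = [("a", 0.0, "order-1"), ("b", 10.0, "order-2")]
payments = [("a", 120.0, "pay-1"), ("a", 400.0, "pay-2"), ("b", 500.0, "pay-3")]
matches = windowed_join(orders, payments)
```

Here only `order-1` and `pay-1` are paired, since they share a key and are 120 seconds apart; the other payments fall outside the 5-minute window.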
