Alternatives to Zookeeper

Consul, etcd, Yarn, Eureka, and Ambari are the most popular alternatives and competitors to Zookeeper.
808
1K
43

What is Zookeeper and what are its top alternatives?

Zookeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and group services. It is designed to be highly available, fault-tolerant, and scalable for distributed systems. Zookeeper allows developers to coordinate and manage distributed applications efficiently. However, Zookeeper can be complex to set up and manage, and may not be suitable for all use cases due to its steep learning curve and potential performance limitations.

  1. etcd: etcd is a distributed reliable key-value store for the most critical data of a distributed system. Key features include simple HTTP API, watch feature for monitoring, and strong consistency guarantees. Pros include simplicity in usage, strong consistency, and support for Kubernetes. Cons may include limited scalability compared to Zookeeper.
  2. Consul: Consul is a service networking solution to connect and secure services across any runtime platform and public or private cloud. Key features include service discovery, health checking, and centralized key-value store. Pros include multi-datacenter support, service mesh capabilities, and robust security features. Cons may include complexity in setup and configuration.
  3. Eureka: Eureka is a REST-based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing. Key features include service registration, health checking, and load balancing. Pros include easy setup, integration with Netflix OSS, and lightweight footprint. Cons may include lack of advanced features compared to Zookeeper.
  4. Doozerd: Doozerd is a highly-available, completely consistent store for small amounts of extremely important data. Key features include coordination service, consistent replication, and fault tolerance. Pros include strong consistency guarantees, ease of use, and lightweight design. Cons may include limited scalability for large-scale systems.
  5. ZooKeeper-watcher: ZooKeeper-watcher is a lightweight wrapper around the ZooKeeper Python client that simplifies its usage. Key features include an easy-to-use interface, automatic reconnection, and high-level abstractions. Pros include simplicity for Python developers, reduced complexity in code, and quick integration with Zookeeper. Cons may include limited functionality compared to other alternatives.
  6. ZooInspector: ZooInspector is a tool to work with ZooKeeper in real-time. Key features include an interactive UI, real-time updates, and node management. Pros include visual representation of Zookeeper data, ease of monitoring, and efficient troubleshooting. Cons may include limited functionality for advanced Zookeeper operations.
  7. Exhibitor: Exhibitor is a supervisor system for Apache ZooKeeper that simplifies the task of managing ZooKeeper. Key features include automated backups, monitoring, and control of Zookeeper. Pros include simple setup, automated maintenance tasks, and integration with AWS. Cons may include limited flexibility compared to directly working with Zookeeper.
  8. Curator: Curator is a set of Java libraries that make using Apache ZooKeeper easier. Key features include recipes for common use cases, utilities, and abstractions. Pros include enhanced functionality for Zookeeper operations, clean API design, and active community support. Cons may include additional complexity in dependency management.
  9. Nacos: Nacos is a platform designed to manage, monitor, and maintain microservices. Key features include service discovery, dynamic configuration, and service governance. Pros include support for multiple runtime environments, comprehensive functionality, and easy integration with cloud services. Cons may include potential learning curve for new users.
  10. ZooF: ZooF is a ZooKeeper CLI with additional commands for convenience in managing Zookeeper. Key features include extended functionality for Zookeeper operations, interactive shell, and script support. Pros include enhanced usability for Zookeeper administrators, quick access to common tasks, and script automation capabilities. Cons may include limited adoption and community support compared to other alternatives.
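
Several entries above list a watch feature, where clients are notified when a key changes instead of polling for it. The following is a minimal in-memory sketch of that idea. `WatchableStore` and `demo_watch` are hypothetical names for illustration only; this is not the etcd or ZooKeeper client API, and a real store would deliver notifications over the network.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Callback signature: (key, old_value, new_value)
WatchCallback = Callable[[str, Optional[str], str], None]


class WatchableStore:
    """Hypothetical in-memory key-value store with per-key watches."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[WatchCallback]] = defaultdict(list)

    def watch(self, key: str, callback: WatchCallback) -> None:
        # Register a callback that fires on every later write to `key`.
        self._watchers[key].append(callback)

    def put(self, key: str, value: str) -> None:
        old = self._data.get(key)
        self._data[key] = value
        # Notify only the watchers registered for this exact key.
        for callback in list(self._watchers[key]):
            callback(key, old, value)

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def demo_watch() -> List[Tuple[str, Optional[str], str]]:
    store = WatchableStore()
    seen: List[Tuple[str, Optional[str], str]] = []
    store.watch("config/db_url", lambda k, old, new: seen.append((k, old, new)))
    store.put("config/db_url", "postgres://a")
    store.put("config/db_url", "postgres://b")
    store.put("config/other", "ignored")  # no watcher on this key
    return seen
```

Calling `demo_watch()` records two change events for `config/db_url` and none for the unrelated key.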

Top Alternatives to Zookeeper

  • Consul

    Consul is a tool for service discovery and configuration. Consul is distributed, highly available, and extremely scalable. ...

  • etcd

    etcd is a distributed key value store that provides a reliable way to store data across a cluster of machines. It’s open-source and available on GitHub. etcd gracefully handles master elections during network partitions and will tolerate machine failure, including the master. ...

  • Yarn

    Yarn caches every package it downloads so it never needs to again. It also parallelizes operations to maximize resource utilization so install times are faster than ever. ...

  • Eureka

    Eureka is a REST (Representational State Transfer) based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers. ...

  • Ambari

    This project is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. It provides an intuitive, easy-to-use Hadoop management web UI backed by its RESTful APIs. ...

  • Kafka

    Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. ...

  • Redis

    Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. ...

  • Kubernetes

    Kubernetes is an open source orchestration system for Docker containers. It handles scheduling onto nodes in a compute cluster and actively manages workloads to ensure that their state matches the user's declared intentions. ...

Zookeeper alternatives & related posts

Consul

1.1K
1.5K
213
A tool for service discovery, monitoring and configuration
PROS OF CONSUL
  • 61
    Great service discovery infrastructure
  • 35
    Health checking
  • 29
    Distributed key-value store
  • 26
    Monitoring
  • 23
    High-availability
  • 12
    Web-UI
  • 10
    Token-based acls
  • 6
    Gossip clustering
  • 5
    Dns server
  • 4
    Not Java
  • 1
    Docker integration
  • 1
    Javascript

    related Consul posts

    John Kodumal

    As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.

    We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.

    Shared insights on Consul, Elixir, and Erlang

    Postmates built a tool called Bazaar that helps onboard new partners and handles several routine tasks, like nightly emails to merchants alerting them about items that are out of stock.

    Since they ran Bazaar across multiple instances, the team needed to avoid sending duplicate emails to their partners by obtaining a lock across multiple hosts. To solve this, they created and open sourced ConsulMutEx, an Elixir module for acquiring and releasing locks with Consul and other backends.

    It works with Consul’s KV store, as well as other backends, including ets, Erlang’s in-memory database.
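
The locking pattern described above can be sketched as follows. This is a toy in-memory model, not the ConsulMutEx API and not a Consul client: `ToyLockStore`, the `locks/nightly-email` key, and the host names are all hypothetical. A real backend would perform the check-and-set atomically on the server so that it holds across machines; here a `threading.Lock` stands in for that guarantee within one process.

```python
import threading
from typing import Dict, List


class ToyLockStore:
    """Hypothetical store where a key can be held by at most one owner."""

    def __init__(self) -> None:
        self._holders: Dict[str, str] = {}
        self._guard = threading.Lock()  # stands in for server-side atomicity

    def acquire(self, key: str, owner: str) -> bool:
        # Succeeds only if nobody currently holds the key.
        with self._guard:
            if key in self._holders:
                return False
            self._holders[key] = owner
            return True

    def release(self, key: str, owner: str) -> bool:
        # Only the current holder may release the key.
        with self._guard:
            if self._holders.get(key) != owner:
                return False
            del self._holders[key]
            return True


def demo_lock() -> List[bool]:
    store = ToyLockStore()
    key = "locks/nightly-email"  # hypothetical lock name
    return [
        store.acquire(key, "host-a"),  # first host wins the lock
        store.acquire(key, "host-b"),  # second host is refused
        store.release(key, "host-b"),  # non-holder cannot release
        store.release(key, "host-a"),  # holder releases
        store.acquire(key, "host-b"),  # lock is free again
    ]
```

Only the host whose `acquire` call returns `True` would go on to send the email; the others skip the job.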


    etcd

    305
    414
    24
    A distributed consistent key-value store for shared configuration and service discovery
    PROS OF ETCD
    • 11
      Service discovery
    • 6
      Fault tolerant key value store
    • 2
      Secure
    • 2
      Bundled with coreos
    • 1
      Consul integration
    • 1
      Privilege Access Management
    • 1
      Open Source

      Yarn

      24.3K
      13.2K
      151
      A new package manager for JavaScript
      PROS OF YARN
      • 85
        Incredibly fast
      • 22
        Easy to use
      • 13
        Open Source
      • 11
        Can install any npm package
      • 8
        Works where npm fails
      • 7
        Workspaces
      • 3
        Incomplete to run tasks
      • 2
        Fast
      CONS OF YARN
      • 16
        Facebook
      • 7
        Sends data to facebook
      • 4
        Should be installed separately
      • 3
        Cannot publish to registry other than npm

      related Yarn posts

      Nick Parsons
      Building cool things on the internet 🛠️ at Stream · | 35 upvotes · 4M views

      Winds 2.0 is an open source Podcast/RSS reader developed by Stream with a core goal to enable a wide range of developers to contribute.

      We chose JavaScript because nearly every developer knows or can, at the very least, read JavaScript. With ES6 and Node.js v10.x.x, it’s become a very capable language. Async/Await is powerful and easy to use (Async/Await vs Promises). Babel allows us to experiment with next-generation JavaScript (features that are not in the official JavaScript spec yet). Yarn allows us to consistently install packages quickly (and is filled with tons of new tricks)

      We’re using JavaScript for everything – both front and backend. Most of our team is experienced with Go and Python, so Node was not an obvious choice for this app.

      Sure... there will be haters who refuse to acknowledge that there is anything remotely positive about JavaScript (there are even rants on Hacker News about Node.js); however, without writing completely in JavaScript, we would not have seen the results we did.

      #FrameworksFullStack #Languages

      Simon Reymann
      Senior Fullstack Developer at QUANTUSflow Software GmbH · | 27 upvotes · 4.8M views

      Our whole Node.js backend stack consists of the following tools:

      • Lerna as a tool for multi package and multi repository management
      • npm as package manager
      • NestJS as Node.js framework
      • TypeScript as programming language
      • ExpressJS as web server
      • Swagger UI for visualizing and interacting with the API’s resources
      • Postman as a tool for API development
      • TypeORM as object relational mapping layer
      • JSON Web Token for access token management

      The main reason we have chosen Node.js over PHP is related to the following artifacts:

      • Made for the web and widely in use: Node.js is a software platform for developing server-side network services. Well-known projects that rely on Node.js include the blogging software Ghost, the project management tool Trello and the operating system WebOS. Node.js requires the JavaScript runtime environment V8, which was specially developed by Google for the popular Chrome browser. This guarantees a very resource-saving architecture, which qualifies Node.js especially for the operation of a web server. Ryan Dahl, the developer of Node.js, released the first stable version on May 27, 2009. He developed Node.js out of dissatisfaction with the possibilities that JavaScript offered at the time. The basic functionality of Node.js has been mapped with JavaScript since the first version, which can be expanded with a large number of different modules. The current package managers (npm or Yarn) for Node.js know more than 1,000,000 of these modules.
      • Fast server-side solutions: Node.js adopts the JavaScript "event-loop" to create non-blocking I/O applications that conveniently serve simultaneous events. With the standard available asynchronous processing within JavaScript/TypeScript, highly scalable, server-side solutions can be realized. The efficient use of the CPU and the RAM is maximized and more simultaneous requests can be processed than with conventional multi-thread servers.
      • A language along the entire stack: Widely used frameworks such as React or AngularJS or Vue.js, which we prefer, are written in JavaScript/TypeScript. If Node.js is now used on the server side, you can use all the advantages of a uniform script language throughout the entire application development. The same language in the back- and frontend simplifies the maintenance of the application and also the coordination within the development team.
      • Flexibility: Node.js sets very few strict dependencies, rules and guidelines and thus grants a high degree of flexibility in application development. There are no strict conventions so that the appropriate architecture, design structures, modules and features can be freely selected for the development.
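
The non-blocking, event-loop model described in the list above can be illustrated with a small sketch. It uses Python's `asyncio` as an analogue of the Node.js event loop rather than Node.js itself, and `handle_request`, `serve_all`, and the delays are hypothetical stand-ins for real I/O.

```python
import asyncio
from typing import List


async def handle_request(name: str, io_delay: float, finished: List[str]) -> None:
    # `await` hands control back to the event loop while the simulated
    # I/O is pending, so other requests can make progress meanwhile.
    await asyncio.sleep(io_delay)
    finished.append(name)


async def serve_all() -> List[str]:
    finished: List[str] = []
    # All three requests are started together on a single thread.
    await asyncio.gather(
        handle_request("slow", 0.06, finished),
        handle_request("fast", 0.01, finished),
        handle_request("medium", 0.03, finished),
    )
    return finished


def run_demo() -> List[str]:
    return asyncio.run(serve_all())
```

`run_demo()` returns the request names in completion order (`fast`, `medium`, `slow`) rather than start order, because no request blocks the others while it waits.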

      Eureka

      288
      779
      69
      AWS Service registry for resilient mid-tier load balancing and failover.
      PROS OF EUREKA
      • 21
        Easy setup and integration with spring-cloud
      • 9
        Web ui
      • 8
        Monitoring
      • 8
        Health checking
      • 7
        Circuit breaker
      • 6
        Netflix battle tested components
      • 6
        Service discovery
      • 4
        Open Source

        Ambari

        43
        74
        2
        A software for provisioning, managing, and monitoring Apache Hadoop clusters
        PROS OF AMBARI
        • 2
          Ease of use

          Kafka

          23.3K
          21.8K
          607
          Distributed, fault tolerant, high throughput pub-sub messaging system
          PROS OF KAFKA
          • 126
            High-throughput
          • 119
            Distributed
          • 92
            Scalable
          • 86
            High-Performance
          • 66
            Durable
          • 38
            Publish-Subscribe
          • 19
            Simple-to-use
          • 18
            Open source
          • 12
            Written in Scala and java. Runs on JVM
          • 9
            Message broker + Streaming system
          • 4
            KSQL
          • 4
            Avro schema integration
          • 4
            Robust
          • 3
            Supports multiple clients
          • 2
            Extremely good parallelism constructs
          • 2
            Partitioned, replayable log
          • 1
            Simple publisher / multi-subscriber model
          • 1
            Fun
          • 1
            Flexible
          CONS OF KAFKA
          • 32
            Non-Java clients are second-class citizens
          • 29
            Needs Zookeeper
          • 9
            Operational difficulties
          • 5
            Terrible Packaging

          related Kafka posts

          Nick Rockwell
          SVP, Engineering at Fastly · | 46 upvotes · 3.6M views

          When I joined NYT there was already broad dissatisfaction with the LAMP (Linux Apache HTTP Server MySQL PHP) Stack and the front end framework, in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember from being outside the company when that was called MIT FIVE when it had launched. And been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?

          So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.

          React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.

          Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.

          Ashish Singh
          Tech Lead, Big Data Platform at Pinterest · | 38 upvotes · 3M views

          To provide employees with the critical need of interactive querying, we’ve worked with Presto, an open-source distributed SQL query engine, over the years. Operating Presto at Pinterest’s scale has involved resolving quite a few challenges, such as supporting deeply nested and huge Thrift schemas, slow/bad worker detection and remediation, cluster auto-scaling, graceful cluster shutdown, and impersonation support for the LDAP authenticator.

          Our infrastructure is built on top of Amazon EC2 and we leverage Amazon S3 for storing our data. This separates compute and storage layers, and allows multiple compute clusters to share the S3 data.

          We have hundreds of petabytes of data and tens of thousands of Apache Hive tables. Our Presto clusters are comprised of a fleet of 450 r4.8xl EC2 instances. Presto clusters together have over 100 TBs of memory and 14K vcpu cores. Within Pinterest, we have close to more than 1,000 monthly active users (out of total 1,600+ Pinterest employees) using Presto, who run about 400K queries on these clusters per month.

          Each query submitted to Presto cluster is logged to a Kafka topic via Singer. Singer is a logging agent built at Pinterest and we talked about it in a previous post. Each query is logged when it is submitted and when it finishes. When a Presto cluster crashes, we will have query submitted events without corresponding query finished events. These events enable us to capture the effect of cluster crashes over time.
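
The crash signal described above (submitted events with no matching finished event) can be sketched as a simple pairing pass over the event log. This is an illustrative reconstruction, not Pinterest's implementation: the `(cluster, query_id, kind)` tuple shape and the `unfinished_queries` name are assumptions, and a real pipeline would read these events from a Kafka topic.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed event shape: (cluster, query_id, kind), kind is "submitted" or "finished".
Event = Tuple[str, str, str]


def unfinished_queries(events: Iterable[Event]) -> Dict[str, List[str]]:
    """Return, per cluster, the query IDs that were submitted but never finished."""
    open_queries: Dict[Tuple[str, str], None] = {}  # dict keeps insertion order
    for cluster, query_id, kind in events:
        if kind == "submitted":
            open_queries[(cluster, query_id)] = None
        elif kind == "finished":
            # A finished event closes the matching submitted event, if any.
            open_queries.pop((cluster, query_id), None)

    by_cluster: Dict[str, List[str]] = {}
    for cluster, query_id in open_queries:
        by_cluster.setdefault(cluster, []).append(query_id)
    return by_cluster


sample_events: List[Event] = [
    ("cluster-1", "q1", "submitted"),
    ("cluster-1", "q1", "finished"),
    ("cluster-1", "q2", "submitted"),  # never finishes: possible crash
    ("cluster-2", "q3", "submitted"),
    ("cluster-2", "q3", "finished"),
]
```

For `sample_events`, only `q2` on `cluster-1` is left open, which is the kind of orphaned query a crash would leave behind.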

          Each Presto cluster at Pinterest has workers on a mix of dedicated AWS EC2 instances and Kubernetes pods. The Kubernetes platform lets us add and remove workers from a Presto cluster very quickly. The best-case latency for bringing up a new worker on Kubernetes is less than a minute. However, when the Kubernetes cluster itself is out of resources and needs to scale up, it can take up to ten minutes. Another advantage of deploying on the Kubernetes platform is that our Presto deployment becomes agnostic of cloud vendor, instance types, OS, etc.

          #BigData #AWS #DataScience #DataEngineering


          Redis

          58.8K
          45.2K
          3.9K
          Open source (BSD licensed), in-memory data structure store
          PROS OF REDIS
          • 886
            Performance
          • 542
            Super fast
          • 513
            Ease of use
          • 444
            In-memory cache
          • 324
            Advanced key-value cache
          • 194
            Open source
          • 182
            Easy to deploy
          • 164
            Stable
          • 155
            Free
          • 121
            Fast
          • 42
            High-Performance
          • 40
            High Availability
          • 35
            Data Structures
          • 32
            Very Scalable
          • 24
            Replication
          • 22
            Great community
          • 22
            Pub/Sub
          • 19
            "NoSQL" key-value data store
          • 16
            Hashes
          • 13
            Sets
          • 11
            Sorted Sets
          • 10
            NoSQL
          • 10
            Lists
          • 9
            Async replication
          • 9
            BSD licensed
          • 8
            Bitmaps
          • 8
            Integrates super easy with Sidekiq for Rails background
          • 7
            Keys with a limited time-to-live
          • 7
            Open Source
          • 6
            Lua scripting
          • 6
            Strings
          • 5
            Awesomeness for Free
          • 5
            Hyperloglogs
          • 4
            Transactions
          • 4
            Outstanding performance
          • 4
            Runs server side LUA
          • 4
            LRU eviction of keys
          • 4
            Feature Rich
          • 4
            Written in ANSI C
          • 4
            Networked
          • 3
            Data structure server
          • 3
            Performance & ease of use
          • 2
            Dont save data if no subscribers are found
          • 2
            Automatic failover
          • 2
            Easy to use
          • 2
            Temporarily kept on disk
          • 2
            Scalable
          • 2
            Existing Laravel Integration
          • 2
            Channels concept
          • 2
            Object [key/value] size each 500 MB
          • 2
            Simple
          CONS OF REDIS
          • 15
            Cannot query objects directly
          • 3
            No secondary indexes for non-numeric data types
          • 1
            No WAL

          related Redis posts

          Russel Werner
          Lead Engineer at StackShare · | 32 upvotes · 2.6M views

          StackShare Feed is built entirely with React, Glamorous, and Apollo. One of our objectives with the public launch of the Feed was to enable a Server-side rendered (SSR) experience for our organic search traffic. When you visit the StackShare Feed, and you aren't logged in, you are delivered the Trending feed experience. We use an in-house Node.js rendering microservice to generate this HTML. This microservice needs to run and serve requests independent of our Rails web app. Up until recently, we had a mono-repo with our Rails and React code living happily together and all served from the same web process. In order to deploy our SSR app into a Heroku environment, we needed to split out our front-end application into a separate repo in GitHub. The driving factor in this decision was mostly due to limitations imposed by Heroku specifically with how processes can't communicate with each other. A new SSR app was created in Heroku and linked directly to the frontend repo so it stays in-sync with changes.

          Related to this, we needed a way to "deploy" our frontend changes to various server environments without building and releasing the entire Ruby application. We built a hybrid Amazon S3 and Amazon CloudFront solution to host our Webpack bundles. A new CircleCI script builds the bundles and uploads them to S3. The final step in our rollout is to update some keys in Redis so our Rails app knows which bundles to serve. The results of these efforts were significant. Our frontend team now moves independently of our backend team, our build and release process takes only a few minutes, we are now using an edge CDN to serve JS assets, and we have pre-rendered React pages!

          #StackDecisionsLaunch #SSR #Microservices #FrontEndRepoSplit

          Simon Reymann
          Senior Fullstack Developer at QUANTUSflow Software GmbH · | 30 upvotes · 10M views

          Our whole DevOps stack consists of the following tools:

          • GitHub (incl. GitHub Pages/Markdown for Documentation, GettingStarted and HowTo's) as collaborative review and code management tool
          • Git as revision control system
          • SourceTree as Git GUI
          • Visual Studio Code as IDE
          • CircleCI for continuous integration (automating the development process)
          • Prettier / TSLint / ESLint as code linter
          • SonarQube as quality gate
          • Docker as container management (incl. Docker Compose for multi-container application management)
          • VirtualBox for operating system simulation tests
          • Kubernetes as cluster management for docker containers
          • Heroku for deploying in test environments
          • nginx as web server (preferably used as facade server in production environment)
          • SSLMate (using OpenSSL) for certificate management
          • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
          • PostgreSQL as preferred database system
          • Redis as preferred in-memory database/store (great for caching)

          The main reason we have chosen Kubernetes over Docker Swarm is related to the following artifacts:

          • Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
          • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
          • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
          • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
          • Scalability: All-in-one framework for distributed systems.
          • Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), huge community among container orchestration tools, it is an open source and modular tool that works with any OS.

          Kubernetes

          59.2K
          51.2K
          678
          Manage a cluster of Linux containers as a single system to accelerate Dev and simplify Ops
          PROS OF KUBERNETES
          • 164
            Leading docker container management solution
          • 128
            Simple and powerful
          • 107
            Open source
          • 76
            Backed by google
          • 58
            The right abstractions
          • 25
            Scale services
          • 20
            Replication controller
          • 11
            Permission management
          • 9
            Supports autoscaling
          • 8
            Cheap
          • 8
            Simple
          • 6
            Self-healing
          • 5
            Promotes modern/good infrastructure practice
          • 5
            Open, powerful, stable
          • 5
            Reliable
          • 5
            No cloud platform lock-in
          • 4
            Scalable
          • 4
            Quick cloud setup
          • 3
            Custom and extensibility
          • 3
            A self healing environment with rich metadata
          • 3
            Cloud Agnostic
          • 3
            Backed by Red Hat
          • 3
            Runs on azure
          • 3
            Captain of Container Ship
          • 2
            Expandable
          • 2
            Sfg
          • 2
            Everything of CaaS
          • 2
            Golang
          • 2
            Easy setup
          • 2
            Gke
          CONS OF KUBERNETES
          • 16
            Steep learning curve
          • 15
            Poor workflow for development
          • 8
            Orchestrates only infrastructure
          • 4
            High resource requirements for on-prem clusters
          • 2
            Too heavy for simple systems
          • 1
            Additional vendor lock-in (Docker)
          • 1
            More moving parts to secure
          • 1
            Additional Technology Overhead

          related Kubernetes posts

          Conor Myhrvold
          Tech Brand Mgr, Office of CTO at Uber · | 44 upvotes · 11.3M views

          How Uber developed the open source, end-to-end distributed tracing system Jaeger, now a CNCF project:

          Distributed tracing is quickly becoming a must-have component in the tools that organizations use to monitor their complex, microservice-based architectures. At Uber, our open source distributed tracing system Jaeger saw large-scale internal adoption throughout 2016, integrated into hundreds of microservices and now recording thousands of traces every second.

          Here is the story of how we got here, from investigating off-the-shelf solutions like Zipkin, to why we switched from pull to push architecture, and how distributed tracing will continue to evolve:

          https://eng.uber.com/distributed-tracing/

          (GitHub Pages : https://www.jaegertracing.io/, GitHub: https://github.com/jaegertracing/jaeger)

          Bindings/Operator: Python Java Node.js Go C++ Kubernetes JavaScript OpenShift C# Apache Spark
