Alternatives to MemSQL

VoltDB, Redis, MongoDB, Cassandra, and MySQL are the most popular alternatives and competitors to MemSQL.

What is MemSQL and what are its top alternatives?

MemSQL is a distributed, in-memory database that combines high performance, scalability, and simplicity. It provides real-time analytics capabilities to help organizations make faster and data-driven decisions. MemSQL's key features include SQL support, distributed architecture, high availability, and real-time analytics. However, some limitations of MemSQL include high cost of ownership, limited support for complex queries, and potential challenges with scalability.

  1. ClickHouse: ClickHouse is an open-source, column-oriented database management system that offers high performance for analytical queries. Key features include real-time data processing, efficient handling of large data sets, and support for SQL queries. Pros of ClickHouse compared to MemSQL include cost-effectiveness and strong performance, while cons include a steeper learning curve for beginners.

  2. CockroachDB: CockroachDB is a distributed SQL database that provides horizontal scalability and strong consistency. Key features include distributed transactions, fault tolerance, and ACID compliance. Pros of CockroachDB compared to MemSQL include easy scalability and deployment, while cons include slower performance for some workloads.

  3. InfluxDB: InfluxDB is a time series database designed for handling high write and query loads. Key features include schema-less design, high availability, and SQL-like querying language. Pros of InfluxDB compared to MemSQL include specialization in time-series data and scalability for IoT applications, while cons include limited support for complex data structures.

  4. TiDB: TiDB is a distributed SQL database that combines the advantages of traditional databases and distributed systems. Key features include horizontal scalability, strong consistency, and compatibility with MySQL protocol. Pros of TiDB compared to MemSQL include compatibility with existing MySQL applications and ease of use, while cons include potential performance limitations for some workloads.

  5. MongoDB: MongoDB is a popular document-oriented NoSQL database that offers flexibility and scalability. Key features include JSON-like document storage, automatic sharding, and high availability. Pros of MongoDB compared to MemSQL include flexibility in data modeling and ease of scaling, while cons include potential challenges with complex queries and analytics.

  6. Amazon Redshift: Amazon Redshift is a fully managed cloud data warehouse that offers fast query performance and scalability. Key features include columnar storage, parallel processing, and integration with other AWS services. Pros of Amazon Redshift compared to MemSQL include seamless integration with AWS ecosystem and cost-effectiveness for large-scale data warehousing, while cons include potential limitations in real-time analytics capabilities.

  7. PostgreSQL: PostgreSQL is a powerful open-source relational database management system known for its reliability and extensibility. Key features include ACID compliance, support for complex queries, and a wide range of data types. Pros of PostgreSQL compared to MemSQL include maturity and stability, while cons include potential challenges with scalability for large datasets.

  8. Vertica: Vertica is a columnar store analytical database that offers high performance for data warehousing and analytics workloads. Key features include compression, workload management, and integration with BI tools. Pros of Vertica compared to MemSQL include scalability for large datasets and optimized query performance, while cons include potentially higher costs and complexity in management.

  9. MariaDB: MariaDB is a popular open-source relational database management system that is designed for high performance and scalability. Key features include ACID compliance, data encryption, and compatibility with MySQL. Pros of MariaDB compared to MemSQL include cost-effectiveness and community support, while cons include potential challenges with enterprise-level features and scalability.

  10. SingleStore: SingleStore is the current name of MemSQL, which was rebranded in 2020, so it is the same product rather than an alternative. It is a distributed SQL database that offers real-time analytics, scalability, and high performance. Key features include in-memory processing, distributed architecture, and compatibility with MySQL. Pricing remains a consideration for larger deployments.

Top Alternatives to MemSQL

  • VoltDB

    VoltDB is a fundamental redesign of the RDBMS that provides unparalleled performance and scalability on bare-metal, virtualized and cloud infrastructures. VoltDB is a modern in-memory architecture that supports both SQL + Java with data durability and fault tolerance. ...

  • Redis

    Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. ...

  • MongoDB

    MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding. ...

  • Cassandra

    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added to and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL. ...

  • MySQL

    The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software. ...

  • Apache Ignite

    It is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads delivering in-memory speeds at petabyte scale ...

  • CockroachDB

    CockroachDB is a distributed SQL database that can be deployed in serverless, dedicated, or on-prem environments. It offers elastic scale, multi-active availability for resilience, and low-latency performance. ...

  • Elasticsearch

    Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack). ...

MemSQL alternatives & related posts

VoltDB

18
18
In-memory relational DBMS capable of supporting millions of database operations per second
PROS OF VOLTDB
  • 5
    SQL + Java
  • 4
    In-memory database
  • 4
    A brainchild of Michael Stonebraker
  • 3
    Very Fast
  • 2
    NewSQL

    Redis

    59.9K
    3.9K
    Open source (BSD licensed), in-memory data structure store
    PROS OF REDIS
    • 887
      Performance
    • 542
      Super fast
    • 514
      Ease of use
    • 444
      In-memory cache
    • 324
      Advanced key-value cache
    • 194
      Open source
    • 182
      Easy to deploy
    • 165
      Stable
    • 156
      Free
    • 121
      Fast
    • 42
      High-Performance
    • 40
      High Availability
    • 35
      Data Structures
    • 32
      Very Scalable
    • 24
      Replication
    • 23
      Pub/Sub
    • 22
      Great community
    • 19
      "NoSQL" key-value data store
    • 16
      Hashes
    • 13
      Sets
    • 11
      Sorted Sets
    • 10
      Lists
    • 10
      NoSQL
    • 9
      Async replication
    • 9
      BSD licensed
    • 8
      Integrates super easy with Sidekiq for Rails background
    • 8
      Bitmaps
    • 7
      Open Source
    • 7
      Keys with a limited time-to-live
    • 6
      Lua scripting
    • 6
      Strings
    • 5
      Awesomeness for Free
    • 5
      Hyperloglogs
    • 4
      Runs server-side Lua
    • 4
      Transactions
    • 4
      Networked
    • 4
      Outstanding performance
    • 4
      Feature Rich
    • 4
      Written in ANSI C
    • 4
      LRU eviction of keys
    • 3
      Data structure server
    • 3
      Performance & ease of use
    • 2
      Temporarily kept on disk
    • 2
      Don't save data if no subscribers are found
    • 2
      Automatic failover
    • 2
      Easy to use
    • 2
      Scalable
    • 2
      Channels concept
    • 2
      Object [key/value] size each 500 MB
    • 2
      Existing Laravel Integration
    • 2
      Simple
    CONS OF REDIS
    • 15
      Cannot query objects directly
    • 3
      No secondary indexes for non-numeric data types
    • 1
      No WAL

    related Redis posts

    Russel Werner
    Lead Engineer at StackShare · 32 upvotes · 2.9M views

    StackShare Feed is built entirely with React, Glamorous, and Apollo. One of our objectives with the public launch of the Feed was to enable a server-side rendered (SSR) experience for our organic search traffic. When you visit the StackShare Feed and you aren't logged in, you are delivered the Trending feed experience. We use an in-house Node.js rendering microservice to generate this HTML. This microservice needs to run and serve requests independently of our Rails web app. Up until recently, we had a mono-repo with our Rails and React code living happily together and all served from the same web process. In order to deploy our SSR app into a Heroku environment, we needed to split out our front-end application into a separate repo in GitHub. This decision was driven mostly by limitations imposed by Heroku, specifically that processes can't communicate with each other. A new SSR app was created in Heroku and linked directly to the frontend repo so it stays in sync with changes.

    Related to this, we needed a way to "deploy" our frontend changes to various server environments without building and releasing the entire Ruby application. We built a hybrid Amazon S3 and Amazon CloudFront solution to host our Webpack bundles. A new CircleCI script builds the bundles and uploads them to S3. The final step in our rollout is to update some keys in Redis so our Rails app knows which bundles to serve. The results of these efforts were significant: our frontend team now moves independently of our backend team, our build and release process takes only a few minutes, we now use an edge CDN to serve JS assets, and we have pre-rendered React pages!
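
The last rollout step described above, updating keys so the web app knows which bundle to serve, might be sketched as follows. This is only an illustration under stated assumptions: an in-memory class stands in for Redis, and the key format, function names, and URLs are hypothetical rather than taken from the post.

```python
class FakeKeyValueStore:
    """In-memory stand-in for a key-value store with GET/SET semantics."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def release_bundle(store, environment, bundle_url):
    """Final rollout step: point an environment at a freshly uploaded bundle."""
    # Hypothetical key format; the post does not say how its keys are named.
    store.set(f"frontend:bundle:{environment}", bundle_url)


def bundle_for(store, environment, fallback_url):
    """What the web app could call at render time to pick the bundle URL."""
    return store.get(f"frontend:bundle:{environment}") or fallback_url


store = FakeKeyValueStore()
release_bundle(store, "staging", "https://cdn.example.com/app.3f9a1c.js")

staging_url = bundle_for(store, "staging", "/assets/app.js")
production_url = bundle_for(store, "production", "/assets/app.js")
print(staging_url)
print(production_url)
```

Because the web app only reads a key, a frontend release under this scheme would not require redeploying the backend, which matches the independence the post describes.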

    #StackDecisionsLaunch #SSR #Microservices #FrontEndRepoSplit

    Simon Reymann
    Senior Fullstack Developer at QUANTUSflow Software GmbH · 30 upvotes · 12.1M views

    Our whole DevOps stack consists of the following tools:

    • GitHub (incl. GitHub Pages/Markdown for documentation, getting-started guides, and how-tos) as collaborative review and code management tool
    • Git as revision control system
    • SourceTree as Git GUI
    • Visual Studio Code as IDE
    • CircleCI for continuous integration (automating the development process)
    • Prettier / TSLint / ESLint as code linters
    • SonarQube as quality gate
    • Docker as container management (incl. Docker Compose for multi-container application management)
    • VirtualBox for operating system simulation tests
    • Kubernetes as cluster management for docker containers
    • Heroku for deploying in test environments
    • nginx as web server (preferably used as facade server in production environment)
    • SSLMate (using OpenSSL) for certificate management
    • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
    • PostgreSQL as preferred database system
    • Redis as preferred in-memory database/store (great for caching)

    The main reasons we chose Kubernetes over Docker Swarm are the following:

    • Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
    • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
    • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
    • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
    • Scalability: All-in-one framework for distributed systems.
    • Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), has a huge community among container orchestration tools, and is an open source, modular tool that works with any OS.

    MongoDB

    94.2K
    4.1K
    The database for giant ideas
    PROS OF MONGODB
    • 829
      Document-oriented storage
    • 594
      NoSQL
    • 554
      Ease of use
    • 465
      Fast
    • 410
      High performance
    • 255
      Free
    • 218
      Open source
    • 180
      Flexible
    • 145
      Replication & high availability
    • 112
      Easy to maintain
    • 42
      Querying
    • 39
      Easy scalability
    • 38
      Auto-sharding
    • 37
      High availability
    • 31
      Map/reduce
    • 27
      Document database
    • 25
      Easy setup
    • 25
      Full index support
    • 16
      Reliable
    • 15
      Fast in-place updates
    • 14
      Agile programming, flexible, fast
    • 12
      No database migrations
    • 8
      Easy integration with Node.js
    • 8
      Enterprise
    • 6
      Enterprise Support
    • 5
      Great NoSQL DB
    • 4
      Support for many languages through different drivers
    • 3
      Schemaless
    • 3
      Aggregation Framework
    • 3
      Drivers support is good
    • 2
      Fast
    • 2
      Managed service
    • 2
      Easy to Scale
    • 2
      Awesome
    • 2
      Consistent
    • 1
      Good GUI
    • 1
      ACID compliant
    CONS OF MONGODB
    • 6
      Very slow for connected models that require joins
    • 3
      Not ACID compliant
    • 2
      Proprietary query language

    related MongoDB posts

    Jeyabalaji Subramanian

    Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.

    We set ourselves the following criteria for the optimal tool that would do this job:

    • The data replication must be near real-time, yet it should NOT impact the production database
    • The data replication must be horizontally scalable (based on the load), asynchronous, and crash-resilient

    Based on the above criteria, we selected the following tools to perform the end to end data replication:

    We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using Stitch Triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.

    We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB Stitch offers integration with AWS services.

    In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.

    Next we wrote a minimal microservice in Python to listen to the message events on SQS, pick up the data payload, and mirror the DB changes onto the target data warehouse. We implemented source-to-target data translation by modelling the target table structures through SQLAlchemy. We deployed this microservice on AWS Lambda with Zappa. With Zappa, deploying your services as event-driven, horizontally scalable Lambda services is dumb-easy.

    In the end, we implemented a highly scalable, near real-time Change Data Replication service that "works" and deployed it to production in a matter of a few days!
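
The mirroring step in the pipeline above (apply each insert / update / delete / replace event to the target store) can be illustrated with a small sketch. It is not the author's service: the standard-library sqlite3 module stands in for the PostgreSQL warehouse and SQLAlchemy models, a plain list stands in for the SQS queue, and the event shape, table name, and function name are hypothetical. It also assumes each message carries the full document.

```python
import json
import sqlite3


def apply_change(conn, event):
    """Mirror one change event onto the target table."""
    operation = event["operation"]
    doc_id = event["id"]
    if operation in ("insert", "update", "replace"):
        # Writing the whole row keeps the handler idempotent if a
        # message is delivered more than once.
        conn.execute(
            "INSERT OR REPLACE INTO customers (id, payload) VALUES (?, ?)",
            (doc_id, json.dumps(event["document"])),
        )
    elif operation == "delete":
        conn.execute("DELETE FROM customers WHERE id = ?", (doc_id,))
    else:
        raise ValueError(f"unknown operation: {operation}")


# In-memory database as a stand-in for the target data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, payload TEXT)")

# A list as a stand-in for messages read from the queue, in order.
messages = [
    {"operation": "insert", "id": "a1", "document": {"name": "Ada"}},
    {"operation": "update", "id": "a1", "document": {"name": "Ada L."}},
    {"operation": "insert", "id": "b2", "document": {"name": "Bob"}},
    {"operation": "delete", "id": "b2"},
]
for message in messages:
    apply_change(conn, message)
conn.commit()

rows = conn.execute("SELECT id, payload FROM customers ORDER BY id").fetchall()
print(rows)
```

Treating insert, update, and replace as the same upsert is a simplification that only holds when the full document is available; partial updates would need a merge step.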

    Robert Zuber

    We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.

    As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).
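
Rate-limiting outbound requests with a shared counter, as mentioned above, is often done with a fixed-window scheme. The sketch below is one possible shape and not the author's implementation: a dictionary stands in for Redis, and the class name, key, and limits are hypothetical. With Redis itself, the counter would typically be an INCR on a per-window key combined with EXPIRE.

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._counters = {}  # stand-in for per-window keys that expire

    def allow(self, key):
        window = int(self.clock() // self.window_seconds)
        # Drop counters from earlier windows, mimicking key expiry.
        self._counters = {
            bucket: count
            for bucket, count in self._counters.items()
            if bucket[1] == window
        }
        bucket = (key, window)
        count = self._counters.get(bucket, 0) + 1
        self._counters[bucket] = count
        return count <= self.limit


# A controllable clock keeps the example deterministic.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(
    limit=2, window_seconds=60, clock=lambda: fake_now[0]
)

results = [limiter.allow("partner-api") for _ in range(3)]
fake_now[0] = 61.0  # move into the next window
results.append(limiter.allow("partner-api"))
print(results)  # [True, True, False, True]
```

A fixed window is simple but can let through up to twice the limit across a window boundary; sliding-window or token-bucket variants trade extra bookkeeping for smoother behaviour.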

    When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.

    Cassandra

    3.6K
    507
    A partitioned row store. Rows are organized into tables with a required primary key.
    PROS OF CASSANDRA
    • 119
      Distributed
    • 98
      High performance
    • 81
      High availability
    • 74
      Easy scalability
    • 53
      Replication
    • 26
      Reliable
    • 26
      Multi datacenter deployments
    • 10
      Schema optional
    • 9
      OLTP
    • 8
      Open source
    • 2
      Workload separation (via MDC)
    • 1
      Fast
    CONS OF CASSANDRA
    • 3
      Reliability of replication
    • 1
      Size
    • 1
      Updates

    related Cassandra posts

    Thierry Schellenbach
    Shared insights on Redis, Cassandra, and RocksDB

    Stream 1.0 leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

    Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

    RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSD. Nowadays RocksDB is a project on its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology, it’s much simpler than Cassandra.

    This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

    #InMemoryDatabases #DataStores #Databases

    I'm trying to establish a data lake (or maybe puddle) for my org's Data Sharing project. The idea is that outside partners would send cuts of their PHI data, regardless of format/variables/systems, to our Data Team, who would then harmonize the data, create data marts, and eventually use it for something. End-to-end, I'm envisioning:

    1. Ingestion->Secure, role-based, self-service portal for users to upload data (1a. bonus points if it can perform basic validations/masking)
    2. Storage->Amazon S3 seems like the cheapest. We probably won't need much capacity, even at full scale. Our current storage is a secure Box folder that has ~4GB with several batches of test data, code, presentations, and planning docs.
    3. Data Catalog->AWS Glue? Azure Data Factory? Snowplow? Is the main difference basically the vendor? We will also have Data Dictionaries/Codebooks from submitters. Where would they fit in?
    4. Partitions-> I've seen Cassandra and YARN mentioned, but have no experience with either
    5. Processing-> We want to use SAS if at all possible. What will work with SAS code?
    6. Pipeline/Automation->The check-in and verification processes that have been outlined are rather involved. Some sort of automated messaging or approval workflow would be nice
    7. I have very little guidance on what a "Data Mart" should look like, so I'm going with the idea that it would be another "experimental" partition. Unless there's an actual mart-building paradigm I've missed?
    8. An end user might use the catalog to pull certain de-identified data sets from the marts. Again, role-based access and a self-service GUI would be preferable.

    I'm the only full-time tech person on this project, but I'm mostly an OOP, HTML, JavaScript, and some SQL programmer. Most of this is outside my repertoire. I've done a lot of research, but I can't be an effective evangelist without hands-on experience. Since we're starting a new year of our grant, they've finally decided to let me try some stuff out. Any pointers would be appreciated!

    MySQL

    126.4K
    3.8K
    The world's most popular open source database
    PROS OF MYSQL
    • 800
      SQL
    • 679
      Free
    • 562
      Easy
    • 528
      Widely used
    • 490
      Open source
    • 180
      High availability
    • 160
      Cross-platform support
    • 104
      Great community
    • 79
      Secure
    • 75
      Full-text indexing and searching
    • 26
      Fast, open, available
    • 16
      Reliable
    • 16
      SSL support
    • 15
      Robust
    • 9
      Enterprise Version
    • 7
      Easy to set up on all platforms
    • 3
      NoSQL access to JSON data type
    • 1
      Relational database
    • 1
      Easy, light, scalable
    • 1
      Sequel Pro (best SQL GUI)
    • 1
      Replica Support
    CONS OF MYSQL
    • 16
      Owned by a company with their own agenda
    • 3
      Can't roll back schema changes

    related MySQL posts

    Nick Rockwell
    SVP, Engineering at Fastly · 46 upvotes · 4.3M views

    When I joined NYT there was already broad dissatisfaction with the LAMP (Linux, Apache HTTP Server, MySQL, PHP) stack and the front-end framework in particular. So, I wasn't passing judgment on it. I mean, LAMP's fine, you can do good work in LAMP. It's a little dated at this point, but it's not ... I didn't want to rip it out for its own sake, but everyone else was like, "We don't like this, it's really inflexible." And I remember, from being outside the company, when that was called NYT5 when it had launched. I had been observing it from the outside, and I was like, you guys took so long to do that and you did it so carefully, and yet you're not happy with your decisions. Why is that? That was more the impetus. If we're going to do this again, how are we going to do it in a way that we're gonna get a better result?

    So we're moving quickly away from LAMP, I would say. So, right now, the new front end is React based and using Apollo. And we've been in a long, protracted, gradual rollout of the core experiences.

    React is now talking to GraphQL as a primary API. There's a Node.js back end, to the front end, which is mainly for server-side rendering, as well.

    Behind there, the main repository for the GraphQL server is a big table repository, that we call Bodega because it's a convenience store. And that reads off of a Kafka pipeline.

    Tim Abbott

    We've been using PostgreSQL since the very early days of Zulip, but we actually didn't use it from the beginning. Zulip started out as a MySQL project back in 2012, because we'd heard it was a good choice for a startup with a wide community. However, we found that even though we were using the Django ORM for most of our database access, we spent a lot of time fighting with MySQL. Issues ranged from bad collation defaults, to bad query plans which required a lot of manual query tweaks.

    We ended up getting so frustrated that we tried out PostgreSQL, and the results were fantastic. We didn't have to do any real customization (just some tuning settings for how big a server we had), and all of our most important queries were faster out of the box. As a result, we were able to delete a bunch of custom queries escaping the ORM that we'd written to make the MySQL query planner happy (because Postgres just did the right thing automatically).

    And then after that, we've just gotten a ton of value out of postgres. We use its excellent built-in full-text search, which has helped us avoid needing to bring in a tool like Elasticsearch, and we've really enjoyed features like its partial indexes, which saved us a lot of work adding unnecessary extra tables to get good performance for things like our "unread messages" and "starred messages" indexes.

    I can't recommend it highly enough.
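
The partial indexes mentioned in this post index only the rows that match a condition, which keeps an index such as "unread messages" small. The sketch below illustrates the idea with the standard-library sqlite3 module as a stand-in, since SQLite accepts the same `CREATE INDEX ... WHERE` form as PostgreSQL. The table, column, and index names are hypothetical and are not Zulip's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_messages ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL, "
    "is_read INTEGER NOT NULL)"
)

# Partial index: only rows satisfying the WHERE clause are stored in it,
# so read messages add nothing to the index size.
conn.execute(
    "CREATE INDEX unread_by_user ON user_messages (user_id) WHERE is_read = 0"
)

conn.executemany(
    "INSERT INTO user_messages (user_id, is_read) VALUES (?, ?)",
    [(1, 1), (1, 0), (2, 1), (1, 0), (2, 0)],
)

# The query repeats the index condition so the planner can consider the index.
unread_count = conn.execute(
    "SELECT COUNT(*) FROM user_messages WHERE user_id = ? AND is_read = 0",
    (1,),
).fetchone()[0]
print(unread_count)  # 2

# Inspect how the database chose to run the query (output varies by version).
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM user_messages WHERE user_id = 1 AND is_read = 0"
).fetchall()
print(plan)
```

The alternative the post alludes to, a separate table holding only unread rows, would need application code to keep it in sync; the partial index leaves that bookkeeping to the database.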

    Apache Ignite

    98
    41
    An open-source distributed database, caching and processing platform
    PROS OF APACHE IGNITE
    • 5
      Written in Java; runs on the JVM
    • 5
      Multiple client language support
    • 5
      Free
    • 5
      High Availability
    • 4
      REST interface
    • 4
      Cluster-wide SQL query support
    • 4
      Load balancing
    • 3
      Distributed compute
    • 3
      Better Documentation
    • 2
      Easy to use
    • 1
      Distributed Locking

      CockroachDB

      214
      0
      A distributed SQL database that scales fast, survives disaster, and thrives everywhere

          Elasticsearch

          34.7K
          1.6K
          Open Source, Distributed, RESTful Search Engine
          PROS OF ELASTICSEARCH
          • 329
            Powerful API
          • 315
            Great search engine
          • 231
            Open source
          • 214
            RESTful
          • 200
            Near real-time search
          • 98
            Free
          • 85
            Search everything
          • 54
            Easy to get started
          • 45
            Analytics
          • 26
            Distributed
          • 6
            Fast search
          • 5
            More than a search engine
          • 4
            Awesome, great tool
          • 4
            Great docs
          • 3
            Highly Available
          • 3
            Easy to scale
          • 2
            NoSQL DB
          • 2
            Document Store
          • 2
            Great customer support
          • 2
            Intuitive API
          • 2
            Reliable
          • 2
            Potato
          • 2
            Fast
          • 2
            Easy setup
          • 2
            Great piece of software
          • 1
            Open
          • 1
            Scalability
          • 1
            Not stable
          • 1
            Easy to get hot data
          • 1
            GitHub
          • 1
            Elasticsearch
          • 1
            Actively developing
          • 1
            Responsive maintainers on GitHub
          • 1
            Ecosystem
          • 0
            Community
          CONS OF ELASTICSEARCH
          • 7
            Resource hungry
          • 6
            Difficult to get started
          • 5
            Expensive
          • 4
            Hard to keep stable at large scale

          related Elasticsearch posts

          Tymoteusz Paul
          Devops guy at X20X Development LTD · 23 upvotes · 10.2M views

          Often enough I have to explain my way of going about setting up a CI/CD pipeline with multiple deployment platforms. Since I am a bit tired of yapping the same thing every single time, I've decided to write it up and share it with the world this way, and send people to read it instead ;). I will explain it with a "live example" of how Rome got built, assuming that the current methodology consists only of a readme.md and wishes of good luck (as it usually does ;)).

          It always starts with an app, whatever it may be, and reading the readmes available while Vagrant and VirtualBox are installing and updating. Following that is the first hurdle to get over: convert all the instructions/scripts into Ansible playbook(s), stopping only when a clean vagrant up or vagrant reload gives us a fully working environment. As our Vagrant environment is now functional, it's time to break it! This is the moment to look for how things can be done better (too rigid/too loose versioning? Sloppy environment setup?) and replace them with the right way to do stuff, one that won't bite us in the backside. This is the point, and the best opportunity, to upcycle the existing way of doing the dev environment to produce a proper, production-grade product.

          I should probably digress here for a moment and explain why. I firmly believe that the way you deploy production is the same way you should deploy development, short of a few debugging-friendly settings. This way you avoid discrepancies between how production works and how development works, which almost always cause major pains in the back of the neck, and with the use of proper tools this should mean no more work for the developers. That's why we start with Vagrant, as developer boxes should be as easy as vagrant up, but the meat of our product lies in Ansible, which will do the bulk of the work and can be applied to almost anything: AWS, bare metal, Docker, LXC, on the open net, behind a VPN - you name it.

          We must also give proper consideration to monitoring and log hoovering at this point. My generic answer here is to grab Elasticsearch, Kibana, and Logstash. While there may be better solutions for particular use cases, this one is well battle-tested, performs reasonably, and is very easy to scale both vertically (within some limits) and horizontally. Logstash rules are easy to write and easy to maintain through Ansible, which, as I've mentioned earlier, is at the very core of things, and creating triggers, reports, and alerts based on Elasticsearch and Kibana is generally a breeze, including some quite complex aggregations.
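
As a rough idea of what such an aggregation looks like, here is a sketch that builds an Elasticsearch search body counting errors per hour, broken down by host. The field names (`service.keyword`, `level.keyword`, `host.keyword`) and the service name are assumptions about how the logs are indexed, and `fixed_interval` assumes Elasticsearch 7.2 or newer.

```python
import json


def errors_per_hour_query(service, hours=24):
    """Build a search body: hourly error counts for one service, split by host."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service.keyword": service}},
                    {"term": {"level.keyword": "ERROR"}},
                    {"range": {"@timestamp": {"gte": f"now-{hours}h"}}},
                ]
            }
        },
        "aggs": {
            "per_hour": {
                "date_histogram": {"field": "@timestamp", "fixed_interval": "1h"},
                # Nested aggregation: the noisiest hosts within each hour.
                "aggs": {
                    "by_host": {"terms": {"field": "host.keyword", "size": 5}}
                },
            }
        },
    }


body = errors_per_hour_query("checkout")  # "checkout" is a made-up service name
print(json.dumps(body, indent=2))
```

The resulting JSON can be sent to an index's `_search` endpoint or used as the basis for an alert condition.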

          If we are happy with the state of the Ansible code, it's time to move on and put all those roles and playbooks to work. Namely, we need something to manage our CI/CD pipelines. For me, the choice is obvious: TeamCity. It's modern, robust, and, unlike most of the lightweight alternatives, transparent. What I mean by that is that it doesn't tell you how to do things and doesn't limit the ways you deploy, test, or package, for that matter. Instead, it provides a developer-friendly and rich playground for your pipelines. You can do much the same with Jenkins, but it has a rather dated look and feel, and it is also missing some key functionality that must be brought in via plugins (like a quality REST API, which comes built in with TeamCity). TeamCity also comes with all the commonly handy plugins, such as Slack or Apache Maven integration.

          The exact flow between CI and CD varies too greatly from one application to another to describe, so I will outline a few rules that guide me in it:

          1. Make build steps as small as possible. This way when something breaks, we know exactly where, without needing to dig and root around.

          2. All security credentials, except those for the development environment, must be sourced from individual Vault instances. Keys to those vaults should exist only on the CI/CD box and be accessible to a few people (the fewer the better). This is pretty self-explanatory, as anything besides dev may contain sensitive data and, at times, be public-facing, so appropriate security must be in place. TeamCity shines in this department with excellent secrets management.

          3. Every part of the build chain shall consume and produce artifacts. If it creates nothing, it likely shouldn't be its own build. This way, if an issue shows up with any environment or version, all a developer has to do is grab the appropriate artifacts to reproduce the issue locally.

          4. Deployment builds should be directly tied to specific Git branches/tags. This makes it much easier to track down what caused an issue, including automatically identifying and tagging the author (nothing like automated regression testing!).
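
Rules 3 and 4 can be expressed as simple checks. The following is a minimal sketch, not a TeamCity feature: `BuildStep`, `validate_chain`, `deploy_target`, the artifact names, and the branch-to-environment mapping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BuildStep:
    name: str
    consumes: list = field(default_factory=list)
    produces: list = field(default_factory=list)


def validate_chain(steps):
    """Return a list of problems; an empty list means the chain follows rule 3."""
    problems = []
    available = set()  # artifacts produced by earlier steps
    for step in steps:
        missing = [a for a in step.consumes if a not in available]
        if missing:
            problems.append(f"{step.name}: consumes unknown artifacts {missing}")
        if not step.produces:
            problems.append(f"{step.name}: produces no artifact")
        available.update(step.produces)
    return problems


# Rule 4 (assumed mapping): each deployment is tied to a specific branch or tag.
DEPLOY_TARGETS = {"develop": "staging", "main": "production"}


def deploy_target(ref):
    """Map a Git ref to an environment; None means this ref is never deployed."""
    if ref.startswith("refs/tags/v"):
        return "production"
    prefix = "refs/heads/"
    branch = ref[len(prefix):] if ref.startswith(prefix) else ref
    return DEPLOY_TARGETS.get(branch)


chain = [
    # The first step works from the source checkout, so it consumes no artifact.
    BuildStep("compile", produces=["app.tar.gz"]),
    BuildStep("unit-tests", consumes=["app.tar.gz"], produces=["junit.xml"]),
    BuildStep("notify", consumes=["junit.xml"]),
]
problems = validate_chain(chain)
print(problems)
print(deploy_target("refs/heads/develop"), deploy_target("refs/tags/v1.4.0"))
```

Here the `notify` step is flagged because it produces nothing, which under rule 3 suggests folding it into another build.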

          Speaking of deployments, I generally try to keep them simple but also keep a close eye on the wallet. Because of that, I am more than happy with AWS or another cloud provider, but I am also constantly peeking at the loads and at whether we get value for what we are paying. Often enough the usage pattern is not erratic all the time, but has a firm baseline which could be migrated away from the cloud and onto bare-metal boxes. That is another area where this approach strongly triumphs over the common Docker and CircleCI setup, where you are very much tied to cloud providers and getting out is expensive. Here, to embrace bare-metal hosting, all you need is the help of some container-based self-hosting software; my personal preference is Proxmox with LXC. After that, all you must write are Ansible scripts to manage the Proxmox hardware, much as you would for Amazon EC2 (Ansible supports both well), and you are good to go. One does not exclude the other, quite the opposite, as they can live in great synergy and cut your costs dramatically (the heavier your base load, the bigger the savings) while providing production-grade resiliency.
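
To make the baseline argument concrete, here is a back-of-the-envelope sketch. Every number in it is invented for illustration: the hourly instance counts, the cloud price per instance-hour, and the bare-metal price per instance-equivalent per month are assumptions, as is the idea that one bare-metal slot replaces one cloud instance.

```python
def baseline(samples, percentile=0.1):
    """Pick a low percentile of the samples as the steady base load."""
    ordered = sorted(samples)
    return ordered[int(percentile * (len(ordered) - 1))]


def monthly_costs(samples, cloud_per_instance_hour, metal_per_instance_month,
                  hours_in_month=730):
    """Compare an all-cloud setup with bare metal for the baseline plus cloud bursts."""
    base = baseline(samples)
    average = sum(samples) / len(samples)
    all_cloud = average * cloud_per_instance_hour * hours_in_month
    # Average number of instances needed above the baseline.
    burst_average = sum(max(s - base, 0) for s in samples) / len(samples)
    hybrid = (base * metal_per_instance_month
              + burst_average * cloud_per_instance_hour * hours_in_month)
    return base, all_cloud, hybrid


# Hypothetical instance counts sampled over a day.
usage = [8, 8, 9, 10, 14, 20, 12, 9, 8, 8]
base, all_cloud, hybrid = monthly_costs(
    usage,
    cloud_per_instance_hour=0.10,   # assumed price
    metal_per_instance_month=40.0,  # assumed price
)
print(f"baseline={base} all_cloud={all_cloud:.2f} hybrid={hybrid:.2f}")
```

With these made-up prices the hybrid setup is cheaper, and the gap grows with the baseline, in line with "the heavier your base load, the bigger the savings".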
