Alternatives to Apache Parquet

Avro, Apache Kudu, JSON, Cassandra, and HBase are the most popular alternatives and competitors to Apache Parquet.
92
185
0

What is Apache Parquet and what are its top alternatives?

It is a columnar storage format available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model or programming language.
Apache Parquet is a tool in the Databases category of a tech stack.
Apache Parquet is an open source tool with 2.4K GitHub stars and 1.4K GitHub forks.
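
To make the term "columnar" concrete, here is a minimal Python sketch of the difference between a row-oriented and a column-oriented layout. It is not Parquet's actual on-disk encoding and uses no Parquet library; the sample records, field names, and the `to_columns` helper are hypothetical and exist only for illustration.

```python
# Hypothetical sample records; the field names are illustrative only.
rows = [
    {"id": 1, "city": "Oslo", "temp_c": 4.5},
    {"id": 2, "city": "Lima", "temp_c": 19.0},
    {"id": 3, "city": "Pune", "temp_c": 27.5},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per field."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# A query that needs a single field reads only that field's values,
# instead of walking every complete row.
mean_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
print(columns["city"], mean_temp)
```

Keeping all values of one field together is also what lets columnar formats compress and scan a single column without touching the others.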

Top Alternatives to Apache Parquet

  • Avro

    It is a row-oriented remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. ...

  • Apache Kudu

    A new addition to the open source Apache Hadoop ecosystem, Kudu completes Hadoop's storage layer to enable fast analytics on fast data. ...

  • JSON

    JavaScript Object Notation is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language. ...

  • Cassandra

    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that, like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL. ...

  • HBase

    Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop. ...

  • JavaScript

    JavaScript is best known as the scripting language for Web pages, but it is also used in many non-browser environments such as node.js or Apache CouchDB. It is a prototype-based, multi-paradigm scripting language that is dynamic, and supports object-oriented, imperative, and functional programming styles. ...

  • Git

    Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. ...

  • GitHub

    GitHub is the best place to share code with friends, co-workers, classmates, and complete strangers. Over three million people use GitHub to build amazing things together. ...

Apache Parquet alternatives & related posts

Avro

260
176
0
A data serialization framework

      Apache Kudu

      72
      258
      10
      Fast Analytics on Fast Data. A columnar storage manager developed for the Hadoop platform
      PROS OF APACHE KUDU
      • 10
        Realtime Analytics
      CONS OF APACHE KUDU
      • 1
        Restart time

      related Apache Kudu posts

      I have been working on a Java application to demonstrate the latency of select/insert/update operations on Kudu storage using the Java-based Apache Kudu client API. I have a few questions about using the Apache Kudu API:

      1. Is there a JDBC wrapper for the Apache Kudu API that provides connections to the Kudu masters with a connection-pool mechanism and supports all DB operations?

      2. Does the Apache Kudu API support order by, group by, and aggregate functions? If yes, how can these functions be implemented using the Kudu APIs?

      3. Can we add Kudu predicates to a Kudu update operation? If yes, how?

      4. Does the Apache Kudu API support batch insertion (executing the Kudu insert for multiple rows in one go instead of row by row), like Kudusession.apply(List);?

      5. Does the Apache Kudu API support joins on tables?

      6. Which tool is preferred (Apache Impala or the Kudu API) for read and update/insert DB operations?

      JSON

      1.9K
      1.6K
      9
      A lightweight data-interchange format
      PROS OF JSON
      • 5
        Simple
      • 4
        Widely supported

        related JSON posts

        Ali Soueidan
        Creative Web Developer at Ali Soueidan · 18 upvotes · 1.2M views

        Application and Data: Since my personal website ( https://alisoueidan.com ) is a SPA, I chose Vue.js as the framework to create it. After a short skeptical phase I immediately fell in love with the single-file component concept! I also used vuex for state management, which makes working with several components that communicate with each other even more fun and convenient. Of course, using Vue requires using JavaScript as well, since it is the basis of it.

        For markup and style, I used Pug and Sass, since they’re the perfect match for me. I love the clean and strict syntax of both of them, and even more that their structure is very similar. Also, both of them come with expanded functionality such as mixins, loops and so on compared to their “siblings” (HTML and CSS). Both of them require nesting and prevent untidy code, which can be a huge advantage when working in teams. I used JSON to store data (since the data quantity on my website is moderate); JSON also works well in combination with Pug, for example by using for loops over the JSON objects.

        To send my contact form I used PHP, since sending emails using PHP is still relatively convenient, simple and easily done.

        DevOps: Of course, I used Git for version management (which I do even in smaller projects like my website, just to have an additional backup of my code). On top of that I used GitHub, since it now supports private repositories for free accounts (which I am using for my own). I use Babel to get ES6 functionality such as arrow functions and so on without losing cross-browser compatibility.

        Side note: I used npm for package management. 🎉

        Business Tools: I use Asana to organize my project. This is a big advantage to me, even if I work alone, since “private” projects can get interrupted for some time. By using Asana I still know (even after months of not touching a project) what I’ve done, which task I was last working on and what is still to do. When working in teams, Asana is of course a tool I really love to use as well (for enterprise I’d take on Jira instead). All the graphics on my website are SVG, which I created with Adobe Illustrator and adjusted within the SVG code or by using JavaScript or CSS (SASS).

        I use Visual Studio Code because it is by now mature software and I can do practically everything with it.

        • It's free and open source: The project is hosted on GitHub and it’s free to download, fork, modify and contribute to the project.

        • Multi-platform: You can download binaries for different platforms, including Windows (x64), MacOS and Linux (.rpm and .deb packages).

        • Lightweight: It runs smoothly on different devices. It has average memory and CPU usage. It starts almost immediately and is very stable.

        • Extended language support: Supports by default the majority of the most used languages and syntaxes, like JavaScript, HTML, C#, Swift, Java, PHP, Python and others. Also, VS Code supports different file types associated with projects, like .ini, .properties, XML and JSON files.

        • Integrated tools: Includes an integrated terminal, debugger, problem list and console output inspector. The project navigator sidebar is simple and powerful: you can manage your files and folders with ease. The command palette helps you find commands by text. The search widget has a powerful auto-complete feature to search and find your files.

        • Extensible and configurable: There are many extensions available for every language supported, including syntax highlighters, IntelliSense and code completion, and debuggers. There are also extensions to manage application configuration and architecture like Docker and Jenkins.

        • Integrated with Git: You can visually manage your project repositories, pull, commit and push your changes, and easily resolve conflicts. (There is support for SVN (Subversion) users via plugin.)

        Cassandra

        3.5K
        3.5K
        507
        A partitioned row store. Rows are organized into tables with a required primary key.
        PROS OF CASSANDRA
        • 119
          Distributed
        • 98
          High performance
        • 81
          High availability
        • 74
          Easy scalability
        • 53
          Replication
        • 26
          Reliable
        • 26
          Multi datacenter deployments
        • 10
          Schema optional
        • 9
          OLTP
        • 8
          Open source
        • 2
          Workload separation (via MDC)
        • 1
          Fast
        CONS OF CASSANDRA
        • 3
          Reliability of replication
        • 1
          Size
        • 1
          Updates

        related Cassandra posts

        Thierry Schellenbach
        Shared insights on Golang, Python, and Cassandra

        After years of optimizing our existing feed technology, we decided to make a larger leap with 2.0 of Stream. While the first iteration of Stream was powered by Python and Cassandra, for Stream 2.0 of our infrastructure we switched to Go.

        The main reason why we switched from Python to Go is performance. Certain features of Stream such as aggregation, ranking and serialization were very difficult to speed up using Python.

        We’ve been using Go since March 2017 and it’s been a great experience so far. Go has greatly increased the productivity of our development team. Not only has it improved the speed at which we develop, it’s also 30x faster for many components of Stream. Initially we struggled a bit with package management for Go. However, using Dep together with the VG package contributed to creating a great workflow.

        Go as a language is heavily focused on performance. The built-in PPROF tool is amazing for finding performance issues. Uber’s Go-Torch library is great for visualizing data from PPROF and will be bundled in PPROF in Go 1.10.

        The performance of Go greatly influenced our architecture in a positive way. With Python we often found ourselves delegating logic to the database layer purely for performance reasons. The high performance of Go gave us more flexibility in terms of architecture. This led to a huge simplification of our infrastructure and a dramatic improvement of latency. For instance, we saw a 10 to 1 reduction in web-server count thanks to the lower memory and CPU usage for the same number of requests.

        #DataStores #Databases

        Thierry Schellenbach
        Shared insights on Redis, Cassandra, and RocksDB

        1.0 of Stream leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

        Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

        RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSD. Nowadays RocksDB is a project on its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology it’s much simpler than Cassandra.

        This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

        #InMemoryDatabases #DataStores #Databases

        HBase

        454
        492
        15
        The Hadoop database, a distributed, scalable, big data store
        PROS OF HBASE
        • 9
          Performance
        • 5
          OLTP
        • 1
          Fast Point Queries

          related HBase posts

          I am researching different querying solutions to handle ~1 trillion records of data (in the realm of a petabyte). The data is mostly textual. I have identified a few options: Milvus, HBase, RocksDB, and Elasticsearch. I was wondering if there is a good way to compare the performance of these options (or if anyone has already done something like this). I want to be able to compare the speed of ingesting and querying textual data from these tools. Does anyone have information on this or know where I can find some? Thanks in advance!

          Hi, I'm building a machine learning pipeline that stores image bytes and image vectors in the backend.

          So, when users query for the random access image data (key), we return the image bytes and perform machine learning model operations on it.

          I'm currently considering going with Amazon S3 (in the future, maybe add Redis caching layer) as the backend system to store the information (s3 buckets with sharded prefixes).

          The latency of S3 is 100-200 ms (get/put), and it has a high throughput of 3500 puts/sec and 5500 gets/sec for a given bucket/prefix. If I need to reduce the latency in the future, I can add a Redis cache.

          Also, S3 costs are far lower than HBase costs (on Amazon EC2 instances with a 3x replication factor).

          I have not personally used HBase before, so can someone tell me whether I'm making the right choice here? I'm not aware of HBase latencies, and I have learned that the MOB feature of HBase has to be turned on if we have to store image bytes in one of the column families, as the average image size is 240 KB.
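
The "sharded prefixes" idea in the post above can be sketched roughly as follows. This is a back-of-the-envelope illustration, not AWS guidance: it takes the per-prefix rates quoted in the post at face value and assumes they apply independently to each prefix with requests spread evenly. The `sharded_key` and `aggregate_rates` helpers and the two-character hash prefix are hypothetical choices made for this example.

```python
import hashlib

# Per-prefix request rates as quoted in the post (assumed, not verified here).
PUTS_PER_PREFIX = 3500
GETS_PER_PREFIX = 5500


def sharded_key(image_id: str, shard_chars: int = 2) -> str:
    """Prepend a short hash-derived shard so keys spread across prefixes."""
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    return f"{digest[:shard_chars]}/{image_id}"


def aggregate_rates(prefix_count: int) -> tuple:
    """Upper-bound (puts/sec, gets/sec) if every prefix is used evenly."""
    return prefix_count * PUTS_PER_PREFIX, prefix_count * GETS_PER_PREFIX


prefix_count = 16 ** 2  # two hex characters give 256 possible prefixes
print(sharded_key("img-001"))
print(aggregate_rates(prefix_count))
```

Note that sharding in this way only raises the throughput ceiling; it does nothing for the per-request latency the post mentions, which is why a cache layer is considered separately.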

          JavaScript

          350.6K
          267K
          8.1K
          Lightweight, interpreted, object-oriented language with first-class functions
          PROS OF JAVASCRIPT
          • 1.7K
            Can be used on frontend/backend
          • 1.5K
            It's everywhere
          • 1.2K
            Lots of great frameworks
          • 896
            Fast
          • 745
            Light weight
          • 425
            Flexible
          • 392
            You can't get a device today that doesn't run js
          • 286
            Non-blocking i/o
          • 236
            Ubiquitousness
          • 191
            Expressive
          • 55
            Extended functionality to web pages
          • 49
            Relatively easy language
          • 46
            Executed on the client side
          • 30
            Relatively fast to the end user
          • 25
            Pure Javascript
          • 21
            Functional programming
          • 15
            Async
          • 13
            Full-stack
          • 12
            Setup is easy
          • 12
            Its everywhere
          • 12
            Future Language of The Web
          • 11
            JavaScript is the New PHP
          • 11
            Because I love functions
          • 10
            Like it or not, JS is part of the web standard
          • 9
            Expansive community
          • 9
            Everyone use it
          • 9
            Can be used in backend, frontend and DB
          • 9
            Easy
          • 8
            Easy to hire developers
          • 8
            No need to use PHP
          • 8
            For the good parts
          • 8
            Can be used both as frontend and backend as well
          • 8
            Powerful
          • 8
            Most Popular Language in the World
          • 7
            Popularized Class-Less Architecture & Lambdas
          • 7
            It's fun
          • 7
            Nice
          • 7
            Versatile
          • 7
            Hard not to use
          • 7
            Its fun and fast
          • 7
            Agile, packages simple to use
          • 7
            Supports lambdas and closures
          • 7
            Love-hate relationship
          • 7
            Photoshop has 3 JS runtimes built in
          • 7
            Evolution of C
          • 6
            1.6K Can be used on frontend/backend
          • 6
            Client side JS uses the visitors CPU to save Server Res
          • 6
            It lets me use Babel & Typescript
          • 6
            Easy to make something
          • 6
            Can be used on frontend/backend/Mobile/create PRO Ui
          • 5
            Promise relationship
          • 5
            Stockholm Syndrome
          • 5
            Function expressions are useful for callbacks
          • 5
            Scope manipulation
          • 5
            Everywhere
          • 5
            Client processing
          • 5
            Clojurescript
          • 5
            What to add
          • 4
            Because it is so simple and lightweight
          • 4
            Only Programming language on browser
          • 1
            Test2
          • 1
            Easy to learn
          • 1
            Easy to understand
          • 1
            Not the best
          • 1
            Hard to learn
          • 1
            Subskill #4
          • 1
            Test
          • 0
            Hard 彤
          CONS OF JAVASCRIPT
          • 22
            A constant moving target, too much churn
          • 20
            Horribly inconsistent
          • 15
            Javascript is the New PHP
          • 9
            No ability to monitor memory utilization
          • 8
            Shows Zero output in case of ANY error
          • 7
            Thinks strange results are better than errors
          • 6
            Can be ugly
          • 3
            No GitHub
          • 2
            Slow

          related JavaScript posts

          Zach Holman

          Oof. I have truly hated JavaScript for a long time. Like, for over twenty years now. Like, since the Clinton administration. It's always been a nightmare to deal with all of the aspects of that silly language.

          But wowza, things have changed. Tooling is just way, way better. I'm primarily web-oriented, and using React and Apollo together the past few years really opened my eyes to building rich apps. And I deeply apologize for using the phrase rich apps; I don't think I've ever said such Enterprisey words before.

          But yeah, things are different now. I still love Rails, and still use it for a lot of apps I build. But it's that silly rich apps phrase that's the problem. Users have way more comprehensive expectations than they did even five years ago, and the JS community does a good job at building tools and tech that tackle the problems of making heavy, complicated UI and frontend work.

          Obviously there's a lot of things happening here, so just saying "JavaScript isn't terrible" might encompass a huge amount of libraries and frameworks. But if you're like me, yeah, give things another shot- I'm somehow not hating on JavaScript anymore and... gulp... I kinda love it.

          Conor Myhrvold
          Tech Brand Mgr, Office of CTO at Uber · 44 upvotes · 10M views

          How Uber developed the open source, end-to-end distributed tracing system Jaeger, now a CNCF project:

          Distributed tracing is quickly becoming a must-have component in the tools that organizations use to monitor their complex, microservice-based architectures. At Uber, our open source distributed tracing system Jaeger saw large-scale internal adoption throughout 2016, integrated into hundreds of microservices and now recording thousands of traces every second.

          Here is the story of how we got here, from investigating off-the-shelf solutions like Zipkin, to why we switched from pull to push architecture, and how distributed tracing will continue to evolve:

          https://eng.uber.com/distributed-tracing/

          (GitHub Pages: https://www.jaegertracing.io/, GitHub: https://github.com/jaegertracing/jaeger)

          Bindings/Operator: Python, Java, Node.js, Go, C++, Kubernetes, JavaScript, OpenShift, C#, Apache Spark

          Git

          289.6K
          174K
          6.6K
          Fast, scalable, distributed revision control system
          PROS OF GIT
          • 1.4K
            Distributed version control system
          • 1.1K
            Efficient branching and merging
          • 959
            Fast
          • 845
            Open source
          • 726
            Better than svn
          • 368
            Great command-line application
          • 306
            Simple
          • 291
            Free
          • 232
            Easy to use
          • 222
            Does not require server
          • 27
            Distributed
          • 22
            Small & Fast
          • 18
            Feature based workflow
          • 15
            Staging Area
          • 13
            Most widespread VCS
          • 11
            Role-based codelines
          • 11
            Disposable Experimentation
          • 7
            Frictionless Context Switching
          • 6
            Data Assurance
          • 5
            Efficient
          • 4
            Just awesome
          • 3
            Github integration
          • 3
            Easy branching and merging
          • 2
            Compatible
          • 2
            Flexible
          • 2
            Possible to lose history and commits
          • 1
            Rebase supported natively; reflog; access to plumbing
          • 1
            Light
          • 1
            Team Integration
          • 1
            Fast, scalable, distributed revision control system
          • 1
            Easy
          • 1
            Flexible, easy, Safe, and fast
          • 1
            CLI is great, but the GUI tools are awesome
          • 1
            It's what you do
          • 0
            Phinx
          CONS OF GIT
          • 16
            Hard to learn
          • 11
            Inconsistent command line interface
          • 9
            Easy to lose uncommitted work
          • 7
            Worst documentation ever possibly made
          • 5
            Awful merge handling
          • 3
            Unexistent preventive security flows
          • 3
            Rebase hell
          • 2
            When --force is disabled, cannot rebase
          • 2
            Ironically even die-hard supporters screw up badly
          • 1
            Doesn't scale for big data

          related Git posts

          Simon Reymann
          Senior Fullstack Developer at QUANTUSflow Software GmbH · 30 upvotes · 9.2M views

          Our whole DevOps stack consists of the following tools:

          • GitHub (incl. GitHub Pages/Markdown for Documentation, GettingStarted and HowTo's) as collaborative review and code management tool
          • Git as revision control system
          • SourceTree as Git GUI
          • Visual Studio Code as IDE
          • CircleCI for continuous integration (automates the development process)
          • Prettier / TSLint / ESLint as code linter
          • SonarQube as quality gate
          • Docker as container management (incl. Docker Compose for multi-container application management)
          • VirtualBox for operating system simulation tests
          • Kubernetes as cluster management for docker containers
          • Heroku for deploying in test environments
          • nginx as web server (preferably used as facade server in production environment)
          • SSLMate (using OpenSSL) for certificate management
          • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
          • PostgreSQL as preferred database system
          • Redis as preferred in-memory database/store (great for caching)

          The main reason we have chosen Kubernetes over Docker Swarm is related to the following artifacts:

          • Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
          • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
          • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
          • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
          • Scalability: All-in-one framework for distributed systems.
          • Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), has a huge community among container orchestration tools, and is an open source and modular tool that works with any OS.
          Tymoteusz Paul
          Devops guy at X20X Development LTD · 23 upvotes · 8.2M views

          Often enough I have to explain my way of going about setting up a CI/CD pipeline with multiple deployment platforms. Since I am a bit tired of yapping the same every single time, I've decided to write it up and share it with the world this way, and send people to read it instead ;). I will explain it on a "live example" of how Rome got built, assuming that the current methodology consists only of a readme.md and wishes of good luck (as it usually does ;)).

          It always starts with an app, whatever it may be, and reading the readmes available while Vagrant and VirtualBox are installing and updating. Following that is the first hurdle to get over: converting all the instructions/scripts into Ansible playbook(s), and only stopping when doing a clean vagrant up or vagrant reload gives us a fully working environment. As our Vagrant environment is now functional, it's time to break it! This is the moment to look for how things can be done better (too rigid/too loose versioning? Sloppy environment setup?) and replace them with the right way to do stuff, one that won't bite us in the backside. This is the point, and the best opportunity, to upcycle the existing way of doing dev environment to produce a proper, production-grade product.

          I should probably digress here for a moment and explain why. I firmly believe that the way you deploy production is the same way you should deploy develop, short of a few debugging-friendly settings. This way you avoid the discrepancy between how production works and how development works, which almost always causes major pains in the back of the neck, and with the use of proper tools this should mean no more work for the developers. That's why we start with Vagrant, as developer boxes should be as easy as vagrant up, but the meat of our product lies in Ansible, which will do most of the work and can be applied to almost anything: AWS, bare metal, docker, LXC, in open net, behind vpn - you name it.

          We must also give proper consideration to monitoring and logging hoovering at this point. My generic answer here is to grab Elasticsearch, Kibana, and Logstash. While for different use cases there may be better solutions, this one is well battle-tested, performs reasonably and is very easy to scale both vertically (within some limits) and horizontally. Logstash rules are easy to write and are well supported in maintenance through Ansible, which, as I've mentioned earlier, is at the very core of things, and creating triggers/reports and alerts based on Elastic and Kibana is generally a breeze, including some quite complex aggregations.

          If we are happy with the state of the Ansible setup, it's time to move on and put all those roles and playbooks to work. Namely, we need something to manage our CI/CD pipelines. For me, the choice is obvious: TeamCity. It's modern, robust and, unlike most of the lightweight alternatives, it's transparent. What I mean by that is that it doesn't tell you how to do things, doesn't limit your ways to deploy, or test, or package for that matter. Instead, it provides a developer-friendly and rich playground for your pipelines. You can do much the same with Jenkins, but it has a quite dated look and feel to it, while also missing some key functionality that must be brought in via plugins (like a quality REST API, which comes built-in with TeamCity). It also comes with all the common, handy plugins like Slack or Apache Maven integration.

          The exact flow between CI and CD varies too greatly from one application to another to describe, so I will outline a few rules that guide me in it:

          1. Make build steps as small as possible. This way when something breaks, we know exactly where, without needing to dig and root around.
          2. All security credentials besides the development environment's must be sourced from individual Vault instances. Keys to those containers should exist only on the CI/CD box and be accessible by a few people (the fewer the better). This is pretty self-explanatory, as anything besides dev may contain sensitive data and, at times, be public-facing. Because of that, appropriate security must be present. TeamCity shines in this department with excellent secrets-management.
          3. Every part of the build chain shall consume and produce artifacts. If it creates nothing, it likely shouldn't be its own build. This way if any issue shows up with any environment or version, all a developer has to do is grab the appropriate artifacts to reproduce the issue locally.
          4. Deployment builds should be directly tied to specific Git branches/tags. This enables much easier tracking of what caused an issue, including automated identifying and tagging of the author (nothing like automated regression testing!).

          Speaking of deployments, I generally try to keep it simple but also with a close eye on the wallet. Because of that, I am more than happy with AWS or another cloud provider, but I am also constantly peeking at the loads and at whether we get the value of what we are paying for. Often enough the pattern of use is not constantly erratic, but rather has a firm baseline which could be migrated away from the cloud and onto bare metal boxes. That is another part where this approach strongly triumphs over the common Docker and CircleCI setup, where you are very much tied in to using cloud providers and getting out is expensive. Here, to embrace bare-metal hosting, all you need is the help of some container-based self-hosting software; my personal preference is Proxmox and LXC. Following that, all you must write are ansible scripts to manage the Proxmox hardware, in a similar way as you do for Amazon EC2 (ansible supports both greatly), and you are good to go. One does not exclude another, quite the opposite, as they can live in great synergy and cut your costs dramatically (the heavier your base load, the bigger the savings) while providing production-grade resiliency.

          GitHub

          279.3K
          243.6K
          10.3K
          Powerful collaboration, review, and code management for open source and private development projects
          PROS OF GITHUB
          • 1.8K
            Open source friendly
          • 1.5K
            Easy source control
          • 1.3K
            Nice UI
          • 1.1K
            Great for team collaboration
          • 867
            Easy setup
          • 504
            Issue tracker
          • 486
            Great community
          • 482
            Remote team collaboration
          • 451
            Great way to share
          • 442
            Pull request and features planning
          • 147
            Just works
          • 132
            Integrated in many tools
          • 121
            Free Public Repos
          • 116
            GitHub Gists
          • 112
            GitHub Pages
          • 83
            Easy to find repos
          • 62
            Open source
          • 60
            It's free
          • 60
            Easy to find projects
          • 56
            Network effect
          • 49
            Extensive API
          • 43
            Organizations
          • 42
            Branching
          • 34
            Developer Profiles
          • 32
            Git Powered Wikis
          • 30
            Great for collaboration
          • 24
            It's fun
          • 23
            Clean interface and good integrations
          • 22
            Community SDK involvement
          • 20
            Learn from others source code
          • 16
            Because: Git
          • 14
            It integrates directly with Azure
          • 10
            Newsfeed
          • 10
            Standard in Open Source collab
          • 8
            Fast
          • 8
            It integrates directly with Hipchat
          • 8
            Beautiful user experience
          • 7
            Easy to discover new code libraries
          • 6
            Smooth integration
          • 6
            Cloud SCM
          • 6
            Nice API
          • 6
            Graphs
          • 6
            Integrations
          • 6
            It's awesome
          • 5
            Quick Onboarding
          • 5
            Remarkable uptime
          • 5
            CI Integration
          • 5
            Hands down best online Git service available
          • 5
            Reliable
          • 4
            Free HTML hosting
          • 4
            Version Control
          • 4
            Simple but powerful
          • 4
            Unlimited Public Repos at no cost
          • 4
            Security options
          • 4
            Loved by developers
          • 4
            Uses GIT
          • 4
            Easy to use and collaborate with others
          • 3
            IAM
          • 3
            Nice to use
          • 3
            CI
          • 3
            Easy deployment via SSH
          • 2
            Good tools support
          • 2
            Leads the copycats
          • 2
            Free private repos
          • 2
            Free HTML hostings
          • 2
            Easy and efficient maintenance of the projects
          • 2
            Beautiful
          • 2
            Never dethroned
          • 2
            IAM integration
          • 2
            Very Easy to Use
          • 2
            Easy to use
          • 2
            All in one development service
          • 2
            Self Hosted
          • 2
            Issues tracker
          • 2
            Easy source control and everything is backed up
          • 1
            Profound
          CONS OF GITHUB
          • 53
            Owned by Microsoft
          • 37
            Expensive for lone developers that want private repos
          • 15
            Relatively slow product/feature release cadence
          • 10
            API scoping could be better
          • 8
            Only 3 collaborators for private repos
          • 3
            Limited featureset for issue management
          • 2
            GitHub Packages does not support SNAPSHOT versions
          • 2
            Does not have a graph for showing history like git lens
          • 1
            No multilingual interface
          • 1
            Takes a long time to commit
          • 1
            Expensive

          related GitHub posts

          Johnny Bell

          I was building a personal project in which I needed to store items in a real-time database. I am more comfortable with my frontend skills than my backend skills, so I didn't want to spend time building out anything in Ruby or Go.

          I stumbled on Firebase by #Google, and it was really all I needed. It had real-time data, an area for storing file uploads and, best of all, for the amount of data I needed it was free!

          I built out my application using tools I was familiar with, React for the framework, Redux.js to manage my state across components, and styled-components for the styling.

          Now, as this was a project I was just working on in my free time for fun, I didn't really want to pay for hosting. I did some research and found Netlify. I had actually seen them at #ReactRally the year before and had already deployed a Gatsby site to Netlify.

          Netlify was very easy to set up and link to my GitHub account: you select a repo, and with very little configuration you have a live site that deploys every time you push to master.

          With the selection of these tools I was able to build out my application, connect it to a realtime database, and deploy to a live environment all with $0 spent.

          If you're looking to build out a small app I suggest giving these tools a go as you can get your idea out into the real world for absolutely no cost.

          Russel Werner
          Lead Engineer at StackShare · 32 upvotes · 2.1M views

          StackShare Feed is built entirely with React, Glamorous, and Apollo. One of our objectives with the public launch of the Feed was to enable a server-side rendered (SSR) experience for our organic search traffic. When you visit the StackShare Feed and you aren't logged in, you are delivered the Trending feed experience. We use an in-house Node.js rendering microservice to generate this HTML. This microservice needs to run and serve requests independently of our Rails web app. Up until recently, we had a mono-repo with our Rails and React code living happily together and all served from the same web process. In order to deploy our SSR app into a Heroku environment, we needed to split out our front-end application into a separate repo in GitHub. This decision was driven mostly by limitations imposed by Heroku, specifically that processes can't communicate with each other. A new SSR app was created in Heroku and linked directly to the frontend repo so it stays in sync with changes.

          Related to this, we needed a way to "deploy" our frontend changes to various server environments without building and releasing the entire Ruby application. We built a hybrid Amazon S3 and Amazon CloudFront solution to host our Webpack bundles. A new CircleCI script builds the bundles and uploads them to S3. The final step in our rollout is to update some keys in Redis so our Rails app knows which bundles to serve. The results of these efforts were significant: our frontend team now moves independently of our backend team, our build and release process takes only a few minutes, we now use an edge CDN to serve JS assets, and we have pre-rendered React pages!
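
The rollout described above (build, upload, then update keys so the app knows which bundles to serve) might look roughly like the sketch below. It is not StackShare's actual script: plain dictionaries stand in for S3 and Redis, and the content-hashed object names and the `frontend:<env>:<bundle>` key scheme are assumptions made for illustration.

```python
import hashlib
import os
from typing import Dict


def bundle_key(name: str, content: bytes) -> str:
    """Derive an object key that changes whenever the bundle content changes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, ext = os.path.splitext(name)
    return f"bundles/{stem}.{digest}{ext}"


def release(bundles: Dict[str, bytes], storage: Dict[str, bytes],
            pointers: Dict[str, str], env: str) -> Dict[str, str]:
    """Upload every bundle, then point the environment at the new objects."""
    manifest = {}
    for name, content in bundles.items():
        key = bundle_key(name, content)
        storage[key] = content  # stand-in for an S3 upload
        manifest[name] = key
    # Update pointers only after all uploads are done, so the web app
    # never references a bundle that has not been stored yet.
    for name, key in manifest.items():
        pointers[f"frontend:{env}:{name}"] = key  # stand-in for a Redis SET
    return manifest
```

Keeping the pointer update as the last step means a rollback is just a matter of writing the previous keys back, without rebuilding or redeploying the backend.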

          #StackDecisionsLaunch #SSR #Microservices #FrontEndRepoSplit
