Apache Parquet

#107 in Databases

Discussions: 1

Followers: 190

50 Alternatives to Apache Parquet

Compare Apache Parquet to these popular alternatives based on real-world usage and developer feedback.

Vertica

It provides a unified analytics platform designed to remain independent of the underlying infrastructure.

90 stacks · 16 votes · 120 followers

Why developers like Vertica:

✓Shared nothing or shared everything architecture(3)

Compare Apache Parquet vs Vertica →

RisingWave

It is a cloud-native streaming database that uses SQL as the interface. It is designed to reduce the complexity and cost of building real-time applications. It consumes streaming data, performs continuous queries, and updates results dynamically. As a database system, it maintains results in its own storage so that users can access data efficiently.

0 stacks · 0 votes · 2 followers

Compare Apache Parquet vs RisingWave →

MySQL

The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software.

129,581 stacks · 3,754 votes · 108,568 followers

Why developers like MySQL:

✓Sql(800)
✓Free(679)
✓Easy(562)

Compare Apache Parquet vs MySQL →

PostgreSQL

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions.

103,054 stacks · 3,551 votes · 83,925 followers

Why developers like PostgreSQL:

✓Relational database(765)
✓High availability (511)
✓Enterprise class database(439)

Compare Apache Parquet vs PostgreSQL →

MongoDB

MongoDB stores data in JSON-like documents that can vary in structure, offering a dynamic, flexible schema. MongoDB was also designed for high availability and scalability, with built-in replication and auto-sharding.

96,571 stacks · 4,137 votes · 82,007 followers

Why developers like MongoDB:

✓Document-oriented storage(829)
✓No sql(594)
✓Ease of use(554)

Compare Apache Parquet vs MongoDB →

Microsoft SQL Server

Microsoft® SQL Server is a database management and analysis system for e-commerce, line-of-business, and data warehousing solutions.

21,278 stacks · 540 votes · 15,512 followers

Why developers like Microsoft SQL Server:

✓Reliable and easy to use(139)
✓High performance(101)
✓Great with .net(95)

Compare Apache Parquet vs Microsoft SQL Server →

SQLite

SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file.
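The single-file, serverless design described above can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The file name `example.db`, the `events` table, and the index name are hypothetical and chosen only for this illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file, created in a temporary directory for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "example.db")

# Connecting opens (or creates) an ordinary disk file; no server process is started.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_events_name ON events (name)")
conn.executemany("INSERT INTO events (name) VALUES (?)", [("load",), ("query",)])
conn.commit()
conn.close()

# Reopening the same file shows that the table, the index, and the rows
# are all stored in that one file.
conn = sqlite3.connect(db_path)
rows = conn.execute("SELECT name FROM events ORDER BY id").fetchall()
objects = conn.execute("SELECT type, name FROM sqlite_master ORDER BY name").fetchall()
conn.close()
```

Copying or deleting the file at `db_path` copies or deletes the whole database, which is the property the description refers to.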

19,872 stacks · 535 votes · 15,237 followers

Why developers like SQLite:

✓Lightweight(163)
✓Portable(135)
✓Simple(122)

Compare Apache Parquet vs SQLite →

MariaDB

Started by core members of the original MySQL team, MariaDB actively works with outside developers to deliver the most featureful, stable, and sanely licensed open SQL server in the industry. MariaDB is designed as a drop-in replacement of MySQL(R) with more features, new storage engines, fewer bugs, and better performance.

16,547 stacks · 468 votes · 12,826 followers

Why developers like MariaDB:

✓Drop-in mysql replacement(149)
✓Great performance(100)
✓Open source(74)

Compare Apache Parquet vs MariaDB →

Memcached

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

7,948 stacks · 473 votes · 5,692 followers

Why developers like Memcached:

✓Fast object cache(139)
✓High-performance(129)
✓Stable(91)

Compare Apache Parquet vs Memcached →

Cassandra

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

3,590 stacks · 507 votes · 3,547 followers

Why developers like Cassandra:

✓Distributed(119)
✓High performance(98)
✓High availability(81)

Compare Apache Parquet vs Cassandra →

Apache Spark

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

3,075 stacks · 140 votes · 3,530 followers

Why developers like Apache Spark:

✓Open-source(61)
✓Fast and Flexible(48)
✓Great for distributed SQL-like applications(8)

Compare Apache Parquet vs Apache Spark →

Hadoop

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

2,707 stacks · 56 votes · 2,300 followers

Why developers like Hadoop:

✓Great ecosystem(39)
✓One stack to rule them all(11)
✓Great load balancer(4)

Compare Apache Parquet vs Hadoop →

Oracle

Oracle Database is an RDBMS. An RDBMS that implements object-oriented features such as user-defined types, inheritance, and polymorphism is called an object-relational database management system (ORDBMS). Oracle Database has extended the relational model to an object-relational model, making it possible to store complex business models in a relational database.

2,606 stacks · 113 votes · 1,774 followers

Why developers like Oracle:

✓Reliable(44)
✓Enterprise(33)
✓High Availability(15)

Compare Apache Parquet vs Oracle →

H2 Database

It is a relational database management system written in Java. It can be embedded in Java applications or run in client-server mode.

1,316 stacks · 0 votes · 121 followers

Compare Apache Parquet vs H2 Database →

InfluxDB

InfluxDB is a scalable datastore for metrics, events, and real-time analytics. It has a built-in HTTP API so you don't have to write any server side code to get up and running. InfluxDB is designed to be scalable, simple to install and manage, and fast to get data in and out.

1,028 stacks · 175 votes · 1,191 followers

Why developers like InfluxDB:

✓Time-series data analysis(59)
✓Easy setup, no dependencies(30)
✓Fast, scalable & open source(24)

Compare Apache Parquet vs InfluxDB →

MSSQL

It is capable of storing any type of data that you want. It will let you quickly store and retrieve information and multiple web site visitors can use it at one time.

1,003 stacks · 3 votes · 417 followers

Why developers like MSSQL:

✓Ease of use(3)

Compare Apache Parquet vs MSSQL →

Splunk

It provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data.

773 stacks · 20 votes · 1,023 followers

Why developers like Splunk:

✓API for searching logs, running reports(3)
✓Alert system based on custom query results(3)

Compare Apache Parquet vs Splunk →

Oracle PL/SQL

It is a powerful, yet straightforward database programming language. It is easy to both write and read, and comes packed with lots of out-of-the-box optimizations and security features.

749 stacks · 8 votes · 598 followers

Compare Apache Parquet vs Oracle PL/SQL →

Azure SQL Database

It is the intelligent, scalable, cloud database service that provides the broadest SQL Server engine compatibility and up to a 212% return on investment. It is a database service that can quickly and efficiently scale to meet demand, is automatically highly available, and supports a variety of third party software.

585 stacks · 13 votes · 502 followers

Why developers like Azure SQL Database:

✓Managed(6)
✓Secure(4)
✓Scalable(3)

Compare Apache Parquet vs Azure SQL Database →

Apache Flink

Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.

534 stacks · 38 votes · 879 followers

Why developers like Apache Flink:

✓Unified batch and stream processing(16)
✓Out-of-the-box connectors to Kinesis, S3, HDFS(8)
✓Easy to use streaming apis(8)

Compare Apache Parquet vs Apache Flink →

CouchDB

Apache CouchDB is a database that uses JSON for documents, JavaScript for MapReduce indexes, and regular HTTP for its API. CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents and query your indexes with your web browser, via HTTP. Index, combine, and transform your documents with JavaScript.

529 stacks · 139 votes · 584 followers

Why developers like CouchDB:

✓JSON(43)
✓Open source(30)
✓Highly available(18)

Compare Apache Parquet vs CouchDB →

Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

521 stacks · 49 votes · 840 followers

Why developers like Amazon Athena:

✓Use SQL to analyze CSV files(16)
✓Glue crawlers give an easy data catalogue(8)
✓Cheap(7)

Compare Apache Parquet vs Amazon Athena →

HBase

Apache HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Apache Hadoop.

511 stacks · 15 votes · 498 followers

Why developers like HBase:

✓Performance(9)
✓OLTP(5)

Compare Apache Parquet vs HBase →

Couchbase

Developed as an alternative to traditionally inflexible SQL databases, the Couchbase NoSQL database is built on an open source foundation and architected to help developers solve real-world problems and meet high scalability demands.

505 stacks · 110 votes · 606 followers

Why developers like Couchbase:

✓Flexible data model, easy scalability, extremely fast(18)
✓High performance(18)
✓Mobile app support(9)

Compare Apache Parquet vs Couchbase →

Apache Hive

Hive facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage.

488 stacks · 0 votes · 475 followers

Compare Apache Parquet vs Apache Hive →

AWS Glue

A fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.

462 stacks · 9 votes · 819 followers

Why developers like AWS Glue:

✓Managed Hive Metastore(9)

Compare Apache Parquet vs AWS Glue →

HSQLDB

It offers a small, fast multi-threaded and transactional database engine with in-memory and disk-based tables and supports embedded and server modes. It includes a powerful command line SQL tool and simple GUI query tools.

449 stacks · 0 votes · 61 followers

Compare Apache Parquet vs HSQLDB →

Clickhouse

It allows analysis of data that is updated in real time. It offers instant results in most cases: the data is processed faster than it takes to create a query.

433 stacks · 85 votes · 543 followers

Why developers like Clickhouse:

✓Fast, very very fast(21)
✓Good compression ratio(11)
✓Horizontally scalable(7)

Compare Apache Parquet vs Clickhouse →

Presto

Distributed SQL Query Engine for Big Data

394 stacks · 66 votes · 1,032 followers

Why developers like Presto:

✓Works directly on files in s3 (no ETL)(18)
✓Open-source(13)
✓Join multiple databases(12)

Compare Apache Parquet vs Presto →

Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

376 stacks · 32 votes · 867 followers

Why developers like Druid:

✓Real Time Aggregations(15)
✓Batch and Real-Time Ingestion(6)
✓OLAP(5)

Compare Apache Parquet vs Druid →

Talend

It is an open source software integration platform that helps you turn data into business insights. It uses native code generation that lets you run your data pipelines seamlessly across all cloud providers and get optimized performance on all platforms.

297 stacks · 0 votes · 249 followers

Compare Apache Parquet vs Talend →

RethinkDB

RethinkDB is built to store JSON documents and to scale to multiple machines with very little effort. It has a pleasant query language that supports useful queries such as table joins and group by, and it is easy to set up and learn.

292 stacks · 307 votes · 406 followers

Why developers like RethinkDB:

✓Powerful query language(48)
✓Excellent dashboard(46)
✓JSON(42)

Compare Apache Parquet vs RethinkDB →

ArangoDB

A distributed free and open-source database with a flexible data model for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.

273 stacks · 192 votes · 442 followers

Why developers like ArangoDB:

✓Graphs and documents in one DB(37)
✓Intuitive and rich query language(26)
✓Open source(25)

Compare Apache Parquet vs ArangoDB →

Azure Data Factory

It is a service designed to allow developers to integrate disparate data sources. It is a platform somewhat like SSIS in the cloud to manage the data you have both on-prem and in the cloud.

254 stacks · 0 votes · 484 followers

Compare Apache Parquet vs Azure Data Factory →

IBM DB2

DB2 for Linux, UNIX, and Windows is optimized to deliver industry-leading performance across multiple workloads, while lowering administration, storage, development, and server costs.

245 stacks · 19 votes · 254 followers

Why developers like IBM DB2:

✓Rock solid and very scalable(7)
✓BLU Analytics is amazingly fast(5)

Compare Apache Parquet vs IBM DB2 →

TimescaleDB

TimescaleDB: An open-source database built for analyzing time-series data with the power and convenience of SQL — on premise, at the edge, or in the cloud.

227 stacks · 44 votes · 374 followers

Why developers like TimescaleDB:

✓Open source(9)
✓Easy Query Language(8)
✓Time-series data analysis(7)

Compare Apache Parquet vs TimescaleDB →

CockroachDB

CockroachDB is a distributed SQL database that can be deployed as serverless, dedicated, or on-prem. It offers elastic scale, multi-active availability for resilience, and low-latency performance.

216 stacks · 0 votes · 341 followers

Compare Apache Parquet vs CockroachDB →

Mentat

Project Mentat is a persistent, embedded knowledge base. It draws heavily on DataScript and Datomic. Mentat is implemented in Rust.

199 stacks · 0 votes · 12 followers

Compare Apache Parquet vs Mentat →

Pouchdb

PouchDB enables applications to store data locally while offline, then synchronize it with CouchDB and compatible servers when the application is back online, keeping the user's data in sync no matter where they next log in.

148 stacks · 6 votes · 242 followers

Compare Apache Parquet vs Pouchdb →

Apache Impala

Impala is a modern, open source, MPP SQL query engine for Apache Hadoop. Impala is shipped by Cloudera, MapR, and Amazon. With Impala, you can query data, whether stored in HDFS or Apache HBase – including SELECT, JOIN, and aggregate functions – in real time.

145 stacks · 18 votes · 301 followers

Why developers like Apache Impala:

✓Super fast(11)

Compare Apache Parquet vs Apache Impala →

ScyllaDB

ScyllaDB is the database for data-intensive apps that require high performance and low latency. It enables teams to harness the ever-increasing computing power of modern infrastructures – eliminating barriers to scale as data grows.

143 stacks · 8 votes · 197 followers

Compare Apache Parquet vs ScyllaDB →

Percona

It delivers enterprise-class software, support, consulting and managed services for both MySQL and MongoDB across traditional and cloud-based platforms.

143 stacks · 0 votes · 101 followers

Compare Apache Parquet vs Percona →

RocksDB

RocksDB is an embeddable persistent key-value store for fast storage. RocksDB can also be the foundation for a client-server database but our current focus is on embedded workloads. RocksDB builds on LevelDB to be scalable to run on servers with many CPU cores, to efficiently use fast storage, to support IO-bound, in-memory and write-once workloads, and to be flexible to allow for innovation.

141 stacks · 11 votes · 290 followers

Why developers like RocksDB:

✓Very fast(5)
✓Made by Facebook(3)

Compare Apache Parquet vs RocksDB →

Mule runtime engine

Its mission is to connect the world’s applications, data and devices. It makes connecting anything easy with Anypoint Platform™, the only complete integration platform for SaaS, SOA and APIs. Thousands of organizations in 60 countries, from emerging brands to Global 500 enterprises, use it to innovate faster and gain competitive advantage.

127 stacks · 8 votes · 129 followers

Why developers like Mule runtime engine:

✓Open Source(4)

Compare Apache Parquet vs Mule runtime engine →

JSONlite

JSONlite sandboxes the current working directory, similar to SQLite. The JSONlite data directory is named jsonlite.data by default, and each JSON document is saved pretty-printed as a UUID.

122 stacks · 2 votes · 19 followers

Compare Apache Parquet vs JSONlite →

Dremio

Dremio, the data lake engine, operationalizes your data lake storage and speeds up your analytics processes with a high-performance, high-efficiency query engine, while also democratizing data access for data scientists and analysts.

116 stacks · 8 votes · 348 followers

Why developers like Dremio:

✓Nice GUI to enable more people to work with Data(3)

Compare Apache Parquet vs Dremio →

Fauna

Escape the boundaries imposed by legacy databases with a data API that is simple to adopt, highly productive to use, and offers the capabilities that your business needs, without the operational pain typically associated with databases.

112 stacks · 27 votes · 153 followers

Why developers like Fauna:

✓100% ACID(5)
✓Generous free tier(4)
✓Removes server provisioning or maintenance (4)

Compare Apache Parquet vs Fauna →

LevelDB

It is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values. It has been ported to a variety of Unix-based systems, macOS, Windows, and Android.

108 stacks · 0 votes · 111 followers

Compare Apache Parquet vs LevelDB →

Delta Lake

An open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads.

105 stacks · 0 votes · 315 followers

Compare Apache Parquet vs Delta Lake →

Azure Synapse

It is an analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources—at scale. It brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.

104 stacks · 10 votes · 230 followers

Why developers like Azure Synapse:

✓ETL(4)
✓Security(3)

Compare Apache Parquet vs Azure Synapse →