Apache Storm vs riko

Overview

Apache Storm

Stacks: 207
Followers: 282
Votes: 25
GitHub Stars: 6.7K
Forks: 4.1K

riko

Stacks: 0
Followers: 6
Votes: 0
GitHub Stars: 1.6K
Forks: 75

Apache Storm vs riko: What are the differences?

Apache Storm: Distributed and fault-tolerant realtime computation. Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.

riko: A Python stream processing engine modeled after Yahoo! Pipes. riko is a pure Python library for analyzing and processing streams of structured data. riko has synchronous and asynchronous APIs, supports parallel execution, and is well suited for processing RSS feeds. riko also supplies a command-line interface for executing flows, i.e., stream processors aka workflows.

Apache Storm and riko are both primarily classified as "Stream Processing" tools.

Some of the features offered by Apache Storm are:

Storm integrates with the queueing and database technologies you already use
Simple API
Scalable
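
Storm's description also says it guarantees that your data will be processed. The sketch below illustrates the general idea behind that kind of guarantee (at-least-once delivery: a message that fails is replayed until it succeeds) in plain Python. It is a toy model under stated assumptions, not Storm's API: `run_at_least_once`, `flaky`, and the in-memory queue are hypothetical names invented for this illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


def run_at_least_once(
    messages: List[str],
    process: Callable[[str], None],
    max_attempts: int = 3,
) -> Dict[str, int]:
    """Feed every message to `process`, replaying failures.

    Hypothetical helper, not part of Storm. Returns the number of
    attempts each message needed. Messages are assumed to be unique,
    because they are used as dictionary keys.
    """
    pending: Deque[Tuple[str, int]] = deque((m, 1) for m in messages)
    attempts: Dict[str, int] = {}
    while pending:
        message, attempt = pending.popleft()
        attempts[message] = attempt
        try:
            process(message)  # a normal return counts as an "ack"
        except Exception:
            if attempt >= max_attempts:
                raise  # give up instead of retrying forever
            # a raised exception counts as a "fail": queue it for replay
            pending.append((message, attempt + 1))
    return attempts


# A processor that fails the first time it sees "b", then succeeds.
already_failed = set()
processed: List[str] = []


def flaky(message: str) -> None:
    if message == "b" and message not in already_failed:
        already_failed.add(message)
        raise RuntimeError("transient failure")
    processed.append(message)


attempts = run_at_least_once(["a", "b", "c"], flaky)
print(attempts)   # {'a': 1, 'b': 2, 'c': 1}
print(processed)  # ['a', 'c', 'b']
```

Note that replay changes the processing order ("b" completes after "c"), and a processor with side effects may see the same message more than once, which is the usual trade-off of at-least-once delivery.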

On the other hand, riko provides the following key features:

Read csv/xml/json/html files
Create text and data based flows via modular pipes
Parse, extract, and process RSS/Atom feeds
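
To make "flows via modular pipes" concrete, here is a minimal sketch of the idea using only the Python standard library. It deliberately does not use riko's own API: `read_csv`, `keep_if`, `rename`, and `flow` are hypothetical helpers written for this illustration, and the sample data is made up.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator

Item = Dict[str, str]
Pipe = Callable[[Iterable[Item]], Iterator[Item]]


def read_csv(text: str) -> Iterator[Item]:
    """Source: yield one dict per CSV row (the header row supplies the keys)."""
    yield from csv.DictReader(io.StringIO(text))


def keep_if(field: str, needle: str) -> Pipe:
    """Build a pipe that keeps items whose `field` contains `needle`."""
    def pipe(items: Iterable[Item]) -> Iterator[Item]:
        return (item for item in items if needle in item.get(field, ""))
    return pipe


def rename(old: str, new: str) -> Pipe:
    """Build a pipe that renames key `old` to `new` in every item."""
    def pipe(items: Iterable[Item]) -> Iterator[Item]:
        for item in items:
            item = dict(item)  # copy so the upstream item is not mutated
            item[new] = item.pop(old)
            yield item
    return pipe


def flow(source: Iterable[Item], *pipes: Pipe) -> Iterable[Item]:
    """Chain pipes left to right; nothing runs until the result is consumed."""
    stream = source
    for pipe in pipes:
        stream = pipe(stream)
    return stream


# Made-up sample data for the illustration.
SAMPLE = (
    "title,link\n"
    "Storm 2.0 released,https://example.com/a\n"
    "riko tips,https://example.com/b\n"
)

result = list(flow(read_csv(SAMPLE), keep_if("title", "riko"), rename("link", "url")))
print(result)  # [{'title': 'riko tips', 'url': 'https://example.com/b'}]
```

Each pipe is a small generator-based function, so stages can be reordered or swapped without touching the others, and items stream through one at a time rather than being loaded all at once.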

Apache Storm and riko are both open source tools. With 6.7K GitHub stars and 4.1K forks, Apache Storm appears to be more popular than riko, which has 1.6K GitHub stars and 75 forks.

Detailed Comparison

Features

Apache Storm:
Storm integrates with the queueing and database technologies you already use
Simple API
Scalable
Fault tolerant
Guarantees data processing
Use with any language
Easy to deploy and operate
Free and open source

riko:
Read csv/xml/json/html files
Create text and data based flows via modular pipes
Parse, extract, and process RSS/Atom feeds
Create awesome mashups, APIs, and maps
Perform parallel processing via cpus/processors or threads

Statistics

Metric	Apache Storm	riko
GitHub Stars	6.7K	1.6K
GitHub Forks	4.1K	75
Stacks	207	0
Followers	282	6
Votes	25	0

Pros & Cons

Apache Storm pros (votes): Flexible (10), Easy setup (6), Event Processing (4), Clojure (3), Real Time (2)
riko: No community feedback yet

Integrations

Apache Storm: No integrations available
riko: Python

What are some alternatives to Apache Storm and riko?

Apache NiFi

An easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.

Confluent

It is a data streaming platform based on Apache Kafka: a full-scale streaming platform capable not only of publish-and-subscribe, but also of storing and processing data within the stream.

KSQL

KSQL is an open source streaming SQL engine for Apache Kafka. It provides a simple and completely interactive SQL interface for stream processing on Kafka; no need to write code in a programming language such as Java or Python. KSQL is open-source (Apache 2.0 licensed), distributed, scalable, reliable, and real-time.

Heron

Heron is a realtime analytics platform developed by Twitter. It is the direct successor of Apache Storm, built to be backwards compatible with Storm's topology API but with a wide array of architectural improvements.

Kafka Streams

It is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology.

Kapacitor

It is a native data processing engine for InfluxDB 1.x and is an integrated component in the InfluxDB 2.0 platform. It can process both stream and batch data from InfluxDB, acting on this data in real-time via its programming language TICKscript.

Redpanda

It is a streaming platform for mission-critical workloads. It is Kafka® compatible, requires neither ZooKeeper® nor a JVM, and needs no code changes, so existing open source tooling keeps working. The project advertises it as up to 10x faster.

Faust

It is a stream processing library that ports the ideas from Kafka Streams to Python. It provides both stream processing and event processing, and resembles tools such as Kafka Streams and Apache Spark, Storm, Samza, and Flink.

Samza

It allows you to build stateful applications that process data in real-time from multiple sources including Apache Kafka.

Benthos

It is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads.