Priyab Dash
bobquest33
Data Engineer | LnPa
3 points
Tools bobquest33 is Following
GitHub
github.com
GitHub is the best place to share code with friends, co-workers, classmates, and complete strangers. Over t...
Microsoft Azure
azure.microsoft.com/en-us
Azure is an open and flexible cloud platform that enables you to quickly build, deploy and manage applicati...
Amazon Redshift
aws.amazon.com/redshift
It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less tha...
AWS Data Pipeline
aws.amazon.com/datapipeline
AWS Data Pipeline is a web service that provides a simple management system for data-driven workflows. Usin...
Python
python.org
Python is a general-purpose programming language created by Guido van Rossum. Python is most praised for it...
Kafka
kafka.apache.org
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a mess...
scikit-image
scikit-image.org
scikit-image is a collection of algorithms for image processing.
Apache Spark
spark.apache.org
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters th...
scikit-learn
scikit-learn.org/stable
scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clau...
Apache Flink
flink.apache.org
Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports bat...
Airflow
airbnb.io/projects/airflow
Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes ...
Druid
druid.io
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power expl...
TensorFlow
tensorflow.org
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in th...
Snowflake
snowflake.net
Snowflake eliminates the administration and management demands of traditional data warehouses and big data ...
Dask
dask.pydata.org/en/latest
It is a versatile tool that supports a variety of workloads. It is composed of two parts: Dynamic task sch...
Singer
singer.io
Singer powers data extraction and consolidation for all of your organization’s tools: advertising platforms...
PyTorch
pytorch.org
PyTorch is not a Python binding into a monolithic C++ framework. It is built to be deeply integrated into P...
Apache Kylin
kylin.apache.org
Apache Kylin™ is an open source Distributed Analytics Engine designed to provide a SQL interface and multi-di...
dbt
getdbt.com
dbt is a transformation workflow that lets teams deploy analytics code following software engineering best ...
azure
github.com/azure/azure-sdk-...
Microsoft Azure Client Library for Ruby.
Hevo Data
hevodata.com
It is a no-code data pipeline as a service. Start moving data from any source to your data warehouses such ...
dataflows
github.com/datahq/dataflows
A nifty data processing framework, based on data packages.