John Miller
email2miller
3 points
Tools email2miller is Following
Google BigQuery
cloud.google.com/bigquery/w...
Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google...
PostgreSQL
postgresql.org
PostgreSQL is an advanced object-relational database management system that supports an extended subset of...
Apache Spark
spark.apache.org
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters th...
Airflow
airbnb.io/projects/airflow
Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes ...
Snowflake
snowflake.net
Snowflake eliminates the administration and management demands of traditional data warehouses and big data ...
Databricks
databricks.com
Databricks Unified Analytics Platform, from the original creators of Apache Spark™, unifies data science an...
XGBoost
xgboost.ai
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scal...
Fivetran
fivetran.com
It helps you centralize data from disparate sources which you can manage directly from your browser. We ext...
redshift
github.com/aws/aws-sdk-go
AWS SDK for the Go programming language.
hdfs
github.com/colinmarc/hdfs
A native Go client for HDFS.
Dagster
dagster.io
It is an orchestrator that's designed for developing and maintaining data assets, such as tables, data sets...
LLM
github.com/rustformers/llm
It is a Rust ecosystem of libraries for running inference on large language models, inspired by llama.cpp. ...