Google BigQuery

What is Google BigQuery?

Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease: bulk load your data using Google Cloud Storage or stream it in. Easy access: use a browser tool, a command-line tool, or calls to the BigQuery REST API through client libraries for languages such as Java, PHP, or Python.

Google BigQuery is a tool in the Databases category of a tech stack.

Key Features

All behind the scenes: Your queries can execute asynchronously in the background and can be polled for status.

Import data with ease: Bulk load your data using Google Cloud Storage or stream it in bursts of up to 1,000 rows per second.

Affordable big data: The first terabyte of data processed each month is free.

The right interface: Separate interfaces for administration and developers make sure that you have access to the tools you need.
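The "execute asynchronously and poll for status" pattern from the feature list can be sketched as a simple loop. This is a minimal illustration, not BigQuery client code: `wait_for_job` and its parameters are hypothetical names, and the status source is a stand-in callable that you would replace with a real status lookup. The state names PENDING, RUNNING, and DONE are used here as assumed example values.

```python
import time
from typing import Callable


def wait_for_job(get_state: Callable[[], str],
                 poll_interval: float = 1.0,
                 timeout: float = 60.0) -> str:
    """Poll get_state() until it returns "DONE" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == "DONE":
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {state} after {timeout}s")
        # Sleep between checks so the status endpoint is not hammered.
        time.sleep(poll_interval)


# Stand-in for a real status call (assumption): PENDING, RUNNING, then DONE.
_states = iter(["PENDING", "RUNNING", "DONE"])
print(wait_for_job(lambda: next(_states), poll_interval=0.01))
```

In practice the official client libraries usually wrap this loop for you, so a hand-written version is mainly useful when calling the REST API directly.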

Google BigQuery Discussions

Discover why developers choose Google BigQuery. Read real-world technical decisions and stack choices from the StackShare community.

Sung Won Chung

Jun 5, 2019

Needs advice on

Google BigQuery

Snowflake

I use Google BigQuery because it makes it super easy to query and store data for analytics workloads. If you're using GCP, you're likely using BigQuery. However, data viz tools connected directly to BigQuery run pretty slowly. Google recently announced BI Engine, which will hopefully compete well against big players like Snowflake when it comes to concurrency.

What's nice too is that it has SQL-based ML tools, and it has great GIS support!

Nick Rockwell

SVP, Engineering at The New York Times

Sep 24, 2018

Needs advice on

Google BigQuery

Google Cloud Pub/Sub

Google Cloud Dataflow

We really drank the Google Kool-Aid on analytics. So, everything's going into Google BigQuery and almost everything is going straight into Google Cloud Pub/Sub and then doing some processing in Google Cloud Dataflow before ending up in BigQuery. We still do too much processing and augmentation on the front end before it goes into Pub/Sub. And that's using some kind of stuff we pulled together using Amazon DynamoDB and so on. And it's very brittle, actually. Actually, Dynamo throttling is one of our biggest headaches. So, I want all of that to go away and do all our augmentation in BigQuery after the data's been collected. And having it just go straight into Pub/Sub. So, we're working on that. And it'll happen, some time. #Analytics #AnalyticsPipeline

Tim Specht

Co-Founder and CTO at Dubsmash

Sep 13, 2018

Needs advice on

Google Analytics

Amazon Kinesis

AWS Lambda

In order to accurately measure & track user behaviour on our platform we moved over quickly from the initial solution using Google Analytics to a custom-built one due to resource & pricing concerns we had.

While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.
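The client-side batching described above (buffer events, then send one JSON blob when a threshold is reached or the app goes to the background) could look roughly like the sketch below. It is not the authors' code: `EventBuffer`, `on_background`, the threshold value, and the `send` callback are all hypothetical, and `send` stands in for whatever actually writes to the stream.

```python
import json
from typing import Any, Callable, Dict, List


class EventBuffer:
    """Collects events and sends them as one JSON blob per flush."""

    def __init__(self, send: Callable[[str], None], threshold: int = 20) -> None:
        self._send = send            # hypothetical transport, e.g. a stream writer
        self._threshold = threshold  # assumed value; tune per app
        self._events: List[Dict[str, Any]] = []

    def track(self, event: Dict[str, Any]) -> None:
        self._events.append(event)
        if len(self._events) >= self._threshold:
            self.flush()

    def on_background(self) -> None:
        # Call this when the app is about to go to the background.
        self.flush()

    def flush(self) -> None:
        if not self._events:
            return
        blob = json.dumps(self._events)
        self._events = []
        self._send(blob)


sent: List[str] = []
buffer = EventBuffer(send=sent.append, threshold=3)
for name in ["open", "play", "like", "share"]:
    buffer.track({"event": name})
buffer.on_background()
print(len(sent))  # 2 blobs: three events, then the remaining one
```

Clearing the buffer before calling `send` keeps a failing transport from causing the same events to be queued twice; a real client would also need retry handling, which is left out here.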

In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.
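To make the "pre-aggregate events" and "proper batch sizes" steps concrete, here is a small sketch under stated assumptions. The field names `user_id` and `event`, the per-pair counting rule, and both function names are illustrative only; the post does not say how events were aggregated or how large the batches were.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def pre_aggregate(records: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Collapse raw events into one row per (user_id, event) with a count."""
    counts: Counter = Counter((r["user_id"], r["event"]) for r in records)
    return [
        {"user_id": user_id, "event": event, "count": n}
        for (user_id, event), n in sorted(counts.items())
    ]


def batches(rows: List[Dict[str, object]], size: int) -> Iterator[List[Dict[str, object]]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


records = [
    {"user_id": "u1", "event": "play"},
    {"user_id": "u1", "event": "play"},
    {"user_id": "u2", "event": "like"},
]
for batch in batches(pre_aggregate(records), size=1):
    print(batch)  # each batch would become one queue message or insert call
```

Keeping aggregation and batching as separate functions makes it easy to tune the batch size independently of the aggregation rule.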

#ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService

Lyndon Wong

Jan 27, 2018

Needs advice on

Google BigQuery

Aggregation of user events and traits across a marketing website, SaaS web application, user account provisioning backend and Salesforce CRM. Enables full-funnel analysis of campaign ROI, customer acquisition, engagement and retention at both the user and target account level.

Meredith Fuhrman

Software Engineer (Support Tools) at ShareThis

Sep 14, 2015

Needs advice on

Google BigQuery

BigQuery allows our team to pull reports quickly using SQL-like queries against our large store of data about social sharing. We use the information throughout the company, for everything from making internal product decisions based on usage patterns to sharing certain kinds of custom reports with our publishers.
