What is Google BigQuery?

Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease: bulk load your data using Google Cloud Storage or stream it in. Easy access: use a browser tool, a command-line tool, or calls to the BigQuery REST API through client libraries for languages such as Java, PHP, or Python.
Google BigQuery is a tool in the Big Data as a Service category of a tech stack.

Who uses Google BigQuery?

Companies
350 companies reportedly use Google BigQuery in their tech stacks, including Spotify, Delivery Hero, and Stack.

Developers
675 developers on StackShare have stated that they use Google BigQuery.

Google BigQuery Integrations

Fastly, Fluentd, Looker, Redash, and Data Studio are some of the popular tools that integrate with Google BigQuery. In total, 62 tools integrate with Google BigQuery.
Pros of Google BigQuery

  • High Performance (27 upvotes)
  • Easy to use (24 upvotes)
  • Fully managed service (21 upvotes)
  • Cheap Pricing (19 upvotes)
  • Process hundreds of GB in seconds (16 upvotes)
  • Full table scans in seconds, no indexes needed (11 upvotes)
  • Big Data (10 upvotes)
  • Always on, no per-hour costs (8 upvotes)
  • Good combination with fluentd (5 upvotes)
  • Machine learning (4 upvotes)

Decisions about Google BigQuery

Here are some stack decisions, common use cases and reviews by companies and developers who chose Google BigQuery in their tech stack.

Context: I wanted to create an end-to-end IoT data pipeline simulation in Google Cloud IoT Core and other GCP services. I never touched Terraform meaningfully until working on this project, and it's one of the best explorations in my development career. The documentation and syntax are incredibly human-readable and friendly. I'm used to building infrastructure through the Google APIs via Python, but I'm so glad past Sung did not make that decision. I was tempted to use Google Cloud Deployment Manager, but the templates seemed a bit convoluted at first impression. I'm glad past Sung did not make this decision either.

Solution: Leveraging Google Cloud Build, Google Cloud Run, Google Cloud Bigtable, Google BigQuery, Google Cloud Storage, and Google Compute Engine, along with some other fun tools, I can deploy over 40 GCP resources using Terraform!

Check Out My Architecture: CLICK ME

Check out the GitHub repo attached

Shared insights on dbt and Google BigQuery

I used dbt instead of manually setting up Python wrappers around SQL scripts because it makes managing transformations within Google BigQuery much easier. This saves future Sung dozens of hours maintaining plumbing code to run a couple of SQL queries. Check out my tutorial in the link!

I haven't seen any other tool make it as easy to run dependent SQL DAGs directly in a data warehouse.
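
To make the idea of a dependent SQL DAG concrete, here is a minimal Python sketch. It is not how dbt is implemented; the model names and SQL strings are hypothetical, and nothing is sent to a warehouse. It only shows how transformations that select from one another can be ordered so that each runs after the models it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models: name -> (SQL text, names of the models it reads from).
MODELS = {
    "raw_events": ("SELECT * FROM source.events", set()),
    "clean_events": (
        "SELECT * FROM raw_events WHERE user_id IS NOT NULL",
        {"raw_events"},
    ),
    "daily_counts": (
        "SELECT event_date, COUNT(*) AS n FROM clean_events GROUP BY event_date",
        {"clean_events"},
    ),
}


def run_order(models):
    """Return model names in an order where dependencies come first."""
    graph = {name: deps for name, (_sql, deps) in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in run_order(MODELS):
        sql, _deps = MODELS[name]
        # A real runner would submit `sql` to the warehouse at this point.
        print(f"{name}: {sql}")
```

Hand-written wrapper scripts have to maintain this ordering themselves, which is the plumbing code the post refers to.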

Rory Gwozdz
CTO at Harvested Financial | 2 upvotes · 12.9K views

I'm trying to build a way to read financial data really, really fast, for low cost. We are write/update-light (in this arena) and read-heavy. Because Google BigQuery is serverless, it can keep costs very low, but queries always take a few seconds, I think because of the lack of indexing and the inability to take advantage of the structure of our common queries. I have tried various partitioning schemes on BigQuery to speed things up, with some success but nothing extraordinary. I have never used Google Cloud Bigtable but get how it works conceptually. I believe it would make date-range-based queries markedly faster. The question is: are there ways to take advantage of date ranges in BigQuery, or does it make sense to just shift to Bigtable for mega-fast reads? I'd love to get sub-50ms.
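
As a rough illustration of why date partitioning can help date-range queries, here is a toy Python sketch. It is a conceptual model only, not BigQuery's or Bigtable's actual storage engine, and the class and field names are hypothetical. Rows are bucketed by day, so a date-range query touches only the buckets inside the range instead of scanning every row.

```python
from collections import defaultdict
from datetime import date, timedelta


class DatePartitionedTable:
    """Toy table that stores rows in one bucket per calendar day."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, day, row):
        self.partitions[day].append(row)

    def query_range(self, start, end):
        """Return (rows, partitions_scanned) for start <= day <= end."""
        rows, scanned = [], 0
        day = start
        while day <= end:
            # Only buckets inside the requested range are ever read.
            if day in self.partitions:
                rows.extend(self.partitions[day])
                scanned += 1
            day += timedelta(days=1)
        return rows, scanned


if __name__ == "__main__":
    table = DatePartitionedTable()
    first = date(2020, 1, 1)
    for offset in range(365):
        table.insert(first + timedelta(days=offset), {"price": offset})
    rows, scanned = table.query_range(date(2020, 3, 1), date(2020, 3, 7))
    print(len(rows), "rows from", scanned, "of", len(table.partitions), "partitions")
```

Pruning of this kind reduces the data scanned per query; it does not by itself say anything about whether a given service can reach a particular latency target.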


Hi all,

Currently, we need to ingest data from Amazon S3 into either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is over 200 GB in size. Query performance in Athena/Redshift is not up to the mark: queries time out or run too slowly compared to Google BigQuery. How can I optimize performance and query result time? Can anyone please help me out?
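
For readers unfamiliar with the format, here is a minimal sketch of reading pipe-separated values with Python's standard `csv` module. The column names and sample rows are hypothetical. It streams one record at a time, which matters for files far larger than memory, but it does not address the Athena/Redshift tuning question itself.

```python
import csv
import io


def read_psv(stream):
    """Yield each pipe-separated record as a dict keyed by the header row."""
    yield from csv.DictReader(stream, delimiter="|")


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real .PSV file.
    sample = io.StringIO("id|name|amount\n1|alice|10.5\n2|bob|7\n")
    for record in read_psv(sample):
        print(record)
```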

Mohan Ramanujam

We are a consumer mobile app (iOS/Android) startup. The app is instrumented with Branch and Firebase. We use Google BigQuery. We are looking at tools that can support engagement and cohort analysis at an early-stage price and that we can grow with. Data Studio is the default, but it would seem Looker provides more power. We don't have much insight into Amplitude other than the fact that it is a popular PM tool. Please provide some insight.


We are reading data from an on-prem data lake into cloud storage in order to use cloud computing for resource-heavy NLP and ML operations (<10 GB total). We are trying to decide whether we need Google BigQuery here or whether we can work directly from Google Cloud Storage with a Dataproc cluster. Any thoughts on which would be the better approach would be appreciated. Thanks!



Google BigQuery's Features

  • All behind the scenes: Your queries can execute asynchronously in the background, and can be polled for status.
  • Import data with ease: Bulk load your data using Google Cloud Storage or stream it in bursts of up to 1,000 rows per second.
  • Affordable big data: The first terabyte of data processed each month is free.
  • The right interface: Separate interfaces for administration and developers will make sure that you have access to the tools you need.
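
The "execute asynchronously and poll for status" pattern from the first feature can be sketched in plain Python as follows. This is an illustration of the pattern only, not the BigQuery client API: `slow_query` and `poll_until_done` are hypothetical names, and a thread stands in for the remote job.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_query():
    """Stand-in for a long-running query; sleeps, then returns fake rows."""
    time.sleep(0.2)
    return [("row", 1), ("row", 2)]


def poll_until_done(future, interval=0.05, timeout=5.0):
    """Check the job's status at a fixed interval until it finishes."""
    deadline = time.monotonic() + timeout
    polls = 0
    while not future.done():
        if time.monotonic() > deadline:
            raise TimeoutError("job did not finish in time")
        polls += 1
        time.sleep(interval)
    return future.result(), polls


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        job = pool.submit(slow_query)  # returns immediately; work runs in background
        rows, polls = poll_until_done(job)
        print(f"got {len(rows)} rows after {polls} status checks")
```

The caller is free to do other work between status checks, which is the point of running the query in the background.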

Google BigQuery Alternatives & Comparisons

What are some alternatives to Google BigQuery?
Google Cloud Bigtable
Google Cloud Bigtable offers you a fast, fully managed, massively scalable NoSQL database service that's ideal for web, mobile, and Internet of Things applications requiring terabytes to petabytes of data. Unlike comparable market offerings, Cloud Bigtable doesn't require you to sacrifice speed, scale, or cost efficiency when your applications grow. Cloud Bigtable has been battle-tested at Google for more than 10 years; it's the database driving major applications such as Google Analytics and Gmail.
Amazon Redshift
It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
Hadoop
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
Snowflake
Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS), with no infrastructure to manage and no knobs to turn.
Google Analytics
Google Analytics lets you measure your advertising ROI as well as track your Flash, video, and social networking sites and applications.

Google BigQuery's Followers
916 developers follow Google BigQuery to keep up with related blogs and decisions.