What is Google BigQuery?
Here are some stack decisions, common use cases and reviews by companies and developers who chose Google BigQuery in their tech stack.
Context: I wanted to create an end-to-end IoT data pipeline simulation in Google Cloud IoT Core and other GCP services. I had never touched Terraform meaningfully until working on this project, and it's one of the best explorations in my development career. The documentation and syntax are incredibly human-readable and friendly. I'm used to building infrastructure through the Google APIs via Python, but I'm so glad past Sung did not make that decision. I was tempted to use Google Cloud Deployment Manager, but the templates were a bit convoluted by first impression. I'm glad past Sung did not make this decision either.
Solution: Leveraging Google Cloud Build, Google Cloud Run, Google Cloud Bigtable, Google BigQuery, Google Cloud Storage, and Google Compute Engine, along with some other fun tools, I can deploy over 40 GCP resources using Terraform!
Check out the GitHub repo attached
I used dbt over manually setting up Python wrappers around SQL scripts because it makes managing transformations within Google BigQuery much easier. This saves future Sung dozens of hours maintaining plumbing code to run a couple of SQL queries. Check out my tutorial in the link!
I haven't seen any other tool make it as easy to run dependent SQL DAGs directly in a data warehouse.
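To make "dependent SQL DAGs" concrete, here is a minimal sketch of the underlying idea: each model lists the models it selects from, and a topological sort yields a safe run order. This is not how dbt itself is implemented, and the model names are hypothetical; it only illustrates the kind of ordering logic that hand-written wrapper scripts would otherwise have to maintain.

```python
from graphlib import TopologicalSorter

# Hypothetical model names; each maps to the set of models it selects from.
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "daily_revenue": {"orders_enriched"},
}


def run_order(deps):
    """Return model names ordered so that every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


# The order in which the SQL models could be executed in the warehouse.
order = run_order(dependencies)
```

With a tool like dbt, this ordering is derived from references between models rather than from a hand-maintained dictionary.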
I'm trying to build a way to read financial data really, really fast, at low cost. We are write/update-light (in this arena) and read-heavy. Google BigQuery, being serverless, can keep costs very low, but query speeds are always a few seconds because, I think, of the lack of indexing and of any way to take advantage of the structure of our common queries. I have tried various partitioning schemes on BigQuery to speed things up, with some success but nothing extraordinary. I have never used Google Cloud Bigtable but get how it works conceptually. I believe it would make date-range-based queries markedly faster. The question is: are there ways to take advantage of date ranges in BigQuery, or does it make sense to just shift to Bigtable for mega-fast reads? I'd love to get sub-50ms.
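For reference, the date-range approach in BigQuery usually means partitioning the table on a date or timestamp column and filtering on that column with constant bounds, so only the matching partitions are scanned. The sketch below only builds the SQL text; the table name `my_dataset.prices` and the columns `ts`, `symbol`, and `price` are assumptions for illustration, not details from the post. Partition pruning mainly reduces bytes scanned, and this sketch says nothing about whether it would approach the latency target.

```python
from datetime import date

# Assumed schema for illustration: a price table partitioned by day and
# clustered by symbol.
PARTITIONED_TABLE_DDL = """
CREATE TABLE `my_dataset.prices` (
  ts TIMESTAMP,
  symbol STRING,
  price NUMERIC
)
PARTITION BY DATE(ts)
CLUSTER BY symbol
""".strip()


def date_range_query(table, start, end):
    """Build a query over the half-open range [start, end).

    Filtering the partitioning column against constant bounds lets
    BigQuery skip partitions outside the range. Real code should prefer
    query parameters over string interpolation.
    """
    return (
        f"SELECT ts, symbol, price FROM `{table}` "
        f"WHERE ts >= TIMESTAMP('{start.isoformat()}') "
        f"AND ts < TIMESTAMP('{end.isoformat()}')"
    )


query = date_range_query("my_dataset.prices", date(2024, 1, 1), date(2024, 2, 1))
```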
Currently, we need to ingest data from Amazon S3 into a database, either Amazon Athena or Amazon Redshift. The problem is that the data is in .PSV (pipe-separated values) format and is over 200 GB in size. Queries in Athena/Redshift either time out or are too slow compared to Google BigQuery. How can I optimize the performance and query result time? Can anyone please help me out?
We are a consumer mobile app (iOS/Android) startup. The app is instrumented with Branch and Firebase. We use Google BigQuery. We are looking at tools that can support engagement and cohort analysis at an early-stage price and that we can grow with. Data Studio is the default, but it would seem Looker provides more power. We don't have much insight into Amplitude other than the fact that it is a popular PM tool. Please provide some insight.
We are reading data from an on-prem data lake into cloud storage in order to use cloud computing for resource-heavy NLP and ML operations (<10 GB total). We are trying to decide whether we need Google BigQuery here or whether we can work directly from Google Cloud Storage with a Dataproc cluster. Any thoughts on which would be the better approach would be appreciated. Thanks!
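As a small aside on the file format itself: PSV is ordinary delimited text with `|` as the separator, so standard CSV tooling can read it once the delimiter is set. The sketch below uses Python's `csv` module on an in-memory sample; the column names and values are made up for illustration, and this is only a way to sanity-check parsing, not a fix for the query performance problem.

```python
import csv
import io


def read_psv(stream):
    """Yield each row of a pipe-separated stream as a dict keyed by the header."""
    yield from csv.DictReader(stream, delimiter="|")


# Made-up sample data standing in for one of the .PSV files.
sample = io.StringIO("id|name|amount\n1|alice|10.5\n2|bob|7\n")
rows = list(read_psv(sample))
```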
Google BigQuery's Features
- All behind the scenes: Your queries can execute asynchronously in the background and can be polled for status.
- Import data with ease: Bulk load your data using Google Cloud Storage or stream it in bursts of up to 1,000 rows per second.
- Affordable big data: The first terabyte of data processed each month is free.
- The right interface: Separate interfaces for administration and developers make sure that you have access to the tools you need.
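The "polled for status" behavior in the first feature can be sketched as a simple loop. This is a minimal illustration, not the official client's implementation: it assumes only a job object with a `done()` method (the `QueryJob` returned by `Client.query()` in the `google-cloud-bigquery` Python library exposes one), and `FakeJob` is a hypothetical stand-in so the example runs without credentials.

```python
import time


def wait_until_done(job, poll_interval=1.0, timeout=60.0):
    """Poll job.done() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while not job.done():
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(poll_interval)
    return job


class FakeJob:
    """Hypothetical stand-in that reports done after a fixed number of polls."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done

    def done(self):
        self.remaining -= 1
        return self.remaining <= 0


finished = wait_until_done(FakeJob(3), poll_interval=0.0)
```

In practice the client library can also block until completion on its own, so an explicit loop like this is mostly useful when other work needs to happen between polls.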