Typically, big data analytics solutions are extremely complex, with a wide range of interdependencies. You have to deal with potential latency in reading the data from storage devices, moving information through the network, and processing the data at the server or application level. When you’re dealing with huge quantities of information, the result is usually a time-consuming and expensive process.
But the big data sector has evolved and matured to the point that those cumbersome configurations are being streamlined in the cloud, and Google’s BigQuery serves as a prime example of how organizations can use a cloud-based analytics platform to accelerate and simplify analytics.
Just how fast is BigQuery?
As part of the Google Cloud Platform, BigQuery offers the performance and scalability that the ecosystem has become known for. Google’s robust data centers and long experience handling huge quantities of data make the company uniquely positioned to offer market-leading data analytics solutions.
BigQuery’s serverless infrastructure frees you from the worry of infrastructure provisioning and maintenance, and its ANSI-compliant SQL support and ODBC and JDBC drivers make it just as quick and easy to set up your data as it is to analyze it.
In the demonstration video below, Jeff Davis, an authorized Google Cloud Platform trainer, walks viewers through a query of a data set containing more than 70 million rows and multiple gigabytes of information. He sets up a simple SQL query to pin down three components of data that fall under a central theme. Before the query runs, the platform estimates how much data it will process, which allows for cost estimation. A little more than 20 seconds after the query starts, it has produced a report with hundreds of thousands of rows. Davis then narrows down the data set and runs a more specific query against this smaller pool of information. In just a few seconds, he has a highly specific, nuanced report.
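To make the cost-estimation step concrete, here is a minimal sketch of the arithmetic, assuming a query is billed by the amount of data it scans. The price constant and the function name are hypothetical values chosen for illustration, not figures from the video or from Google's price list, so check current pricing before relying on them.

```python
# Hypothetical price per TiB scanned. This is an assumption for illustration,
# not an official figure.
ASSUMED_USD_PER_TIB = 5.00

BYTES_PER_TIB = 2 ** 40


def estimate_query_cost(bytes_processed, usd_per_tib=ASSUMED_USD_PER_TIB):
    """Convert an estimated byte count into an approximate cost in USD."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / BYTES_PER_TIB * usd_per_tib


if __name__ == "__main__":
    # Example: a query that the platform estimates will scan 3 GiB.
    three_gib = 3 * 2 ** 30
    print(f"Estimated cost: ${estimate_query_cost(three_gib):.4f}")
```

The byte count passed in would be the estimate the platform displays before the query runs, so the figure can be checked before any data is actually processed.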
The scale, simplicity and speed of this query aren’t a fluke either. Davis goes on to search through a database of all Wikipedia entries to identify which person with the name “Davis” has the most popular entry. This task means going through more than 3.5 terabytes of information that includes more than 100 billion rows. In about 30 seconds the query is complete, and it turns out Miles Davis is the most popular.
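For a rough sense of what those figures imply, the sketch below works out the effective scan rate. It assumes decimal terabytes (10^12 bytes) and treats the rounded numbers quoted above as exact, so the result is only an order-of-magnitude estimate. The function name is hypothetical.

```python
def scan_rate(total_bytes, total_rows, seconds):
    """Return (bytes per second, rows per second) for a completed query."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_bytes / seconds, total_rows / seconds


if __name__ == "__main__":
    # Figures quoted in the article: about 3.5 TB, 100 billion rows, 30 seconds.
    bytes_per_s, rows_per_s = scan_rate(3.5e12, 100e9, 30)
    print(f"{bytes_per_s / 1e9:.0f} GB/s, {rows_per_s / 1e9:.1f} billion rows/s")
    # prints "117 GB/s, 3.3 billion rows/s"
```

A rate on the order of 100 GB per second is far beyond what a single disk or server delivers, which is why this kind of result depends on spreading the work across many machines in parallel.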
Speed doesn’t just stem from performance; it also comes from ease of use. During the video, Davis also highlights that BigQuery’s SQL foundation lets users take queries from other environments and run them on Google’s platform, dramatically simplifying the user experience.
How Google Cloud Platform fuels BigQuery performance
Scalable data analytics configurations are difficult to come by in a traditional setup because organizations are highly limited by the network, compute and storage resources they have in their corporate data centers. By moving into the cloud, companies can work from a vast resource framework that scales based on their demands at a given time.
Google has designed BigQuery specifically with this kind of scalability in mind:
- An ANSI-compliant SQL foundation with mainstream drivers creates familiarity and reduces the learning curve.
- With elastic capacity and scaling, you don’t have to worry about right-sizing your data warehouse.
- Integration across Google’s various cloud services allows you to pull data from diverse sources without having to manually migrate data between systems.
- Parallel execution and managed columnar storage can drive major performance gains.
- Native integration with Cloud ML Engine and TensorFlow makes it an ideal data warehouse for your machine learning and artificial intelligence initiatives.
So whether you are loading data from related sources like Google Cloud Storage or Google Cloud Datastore, streaming it in at thousands of rows per second to enable real-time analysis, or using BigQuery Data Transfer Service to automatically transfer data from external sources, BigQuery can position organizations to accelerate and optimize their big data projects.
Dito can help your business streamline its data warehousing operations and tackle large-scale data analysis with BigQuery. Schedule a data analytics workshop to discuss next steps for optimizing how your organization manages and works with big data.