Google Cloud makes significant BigQuery upgrade in pursuit of the ‘ultimate data cloud’
The tech giant is also adding support for Apache Iceberg, Delta Lake, and Apache Hudi - the leading open source table formats commonly used in data lakes
Google Cloud is upgrading BigQuery in pursuit of what it calls the 'ultimate data cloud', the tech giant revealed at its Next conference this week.
Starting today, data teams will be able to analyse structured and unstructured data in BigQuery, with easy access to Google Cloud’s capabilities in machine learning, speech recognition, computer vision, translation, and text processing, using BigQuery’s familiar SQL interface.
In the past, data teams have worked with structured data, using BigQuery to analyse data from operational databases and SaaS applications like Adobe, SAP, ServiceNow, and Workday as well as semi-structured data such as JSON log files.
This comes after Google Cloud announced a public preview of BigQuery’s native JSON data type last January, a capability that brings support for storing and analysing semi-structured data in BigQuery. With it, the company said, semi-structured data in BigQuery is now intuitive to use and query in its native format.
The company is making changes to how structured and unstructured data works in BigQuery, its data warehouse platform. It underlined that unstructured data may account for up to 90% of all data today, including video from television archives, audio from call centres or radio, and documents in different formats.
It’s also adding support for major data formats in use today. BigLake, its storage engine, will add support for Apache Iceberg, Delta Lake, and Apache Hudi, the leading open-source table formats commonly used in data lakes. Support for Apache Iceberg is entering preview now, while support for the other two formats will follow at an as-yet undetermined date.
The company hopes that by supporting these widely adopted data formats, it can help eliminate barriers that prevent organisations from getting the full value from their data.
The tech giant is also unifying two of its business intelligence products, Looker and Google Data Studio, under the new Looker umbrella to create a deep integration of Looker, Data Studio, and core Google technologies like AI and ML. This means Data Studio has now become Looker Studio, and Google hopes this will help customers make better data-driven decisions.
It says Looker Studio makes it easier to carry out self-service analytics. It currently supports more than 800 data sources with a catalogue surpassing 600 connectors, which it says makes it simple to explore data from different sources.
Access to Looker data models from Looker Studio is currently available in preview. This allows customers to explore trusted data via the Looker modelling layer, and for the first time they will be able to combine self-service analytics from ad-hoc data sources with trusted data that has already been vetted and modelled in Looker.
Additionally, customers who upgrade to Looker Studio Pro will get new enterprise management features, team collaboration capabilities, and SLAs. The tech giant underlined that this is only the first release, and it has developed a roadmap of capabilities, including Dataplex integration for data lineage and metadata visibility, which enterprise customers have been asking for.
Google Cloud also shared that Looker (Google Cloud core) is in preview. This is a new version of Looker that is available on the Google Cloud Console and deeply integrated with core cloud infrastructure services, such as key security and management services.
Enhancements for Looker and BigQuery with Tableau and Microsoft Power BI have been introduced, too, with Google Cloud branding the move a significant step forward in providing customers with the most open data cloud. It said this means Tableau and Microsoft customers can easily analyse trusted data from Looker and seamlessly connect with BigQuery.
Google Cloud also said that a data cloud should enable organisations to bring together all of their data confidently, which helps ensure that data is of high quality and enables strong, flexible management and governance capabilities.
To address this, the company is updating Dataplex to automate common processes associated with data quality. For example, users will now be able to more easily understand data lineage (where data originates and how it has been transformed and moved over time), reducing the need for manual, time-consuming processes.
“The ability to let our customers work with all kinds of data, in the formats they prefer, is the hallmark of an open data cloud,” said Gerrit Kazmaier, VP and GM of Data Analytics at Google Cloud. “We’re committed to delivering the support and integrations that customers need to remove limits from their data and avoid data lock-in.”
Google Cloud confirmed its ambition to create "the most open, extensible, and powerful data cloud" on the market. It wants customers to be able to utilise all their data from as many sources and in as many formats as necessary.
Zach Marzouk is a former ITPro, CloudPro, and ChannelPro staff writer, covering topics like security, privacy, worker rights, and startups, primarily in the Asia Pacific and the US regions. Zach joined ITPro in 2017 where he was introduced to the world of B2B technology as a junior staff writer, before he returned to Argentina in 2018, working in communications and as a copywriter. In 2021, he made his way back to ITPro as a staff writer during the pandemic, before joining the world of freelance in 2022.