Databricks injects array of AI tools into Lakehouse
With LakehouseIQ and Lakehouse AI, businesses can build better internal chatbots and create their own LLMs


Databricks has launched a raft of new features for its flagship Lakehouse data lake platform, including artificial intelligence (AI) and natural language improvements as well as data governance features.
With Lakehouse AI, customers can develop generative AI applications, including large language models (LLMs), within the platform itself, with tools spanning the breadth of the AI lifecycle. This includes data collection, model development, and monitoring.
“This gives you control over your intellectual property,” said Databricks CEO Ali Ghodsi during his keynote address at Databricks Data + AI Summit 2023. He added that every company in the next five years “will be an AI and data company” and will need the tools to create its own LLMs to stay competitive, rather than relying on third-party services.
What is LakehouseIQ?
LakehouseIQ is the firm’s ‘knowledge engine’, designed to make natural language interfaces, like chatbots, more accurate and relevant to the needs of a particular organization.
Sitting as a fabric between data and a front-end interface, the tool uses generative AI to understand a business’ unique culture, jargon, and data usage patterns, among other factors.
It ingests a vast quantity of internal material, including organizational charts, notebooks, and more, in order to understand how best to answer questions users ask via any chat interface.
It’s designed to be specific to each business, so natural language interactions are far more accurate and context-sensitive than with classic chatbots. While not a chatbot or virtual assistant itself, it’s designed to improve the reliability and performance of these tools.
Databricks has made LakehouseIQ available as an API, and the firm plans to integrate it into open source libraries.
New AI features in Lakehouse
Vector Search gives developers the capacity to improve the accuracy of generative AI responses. The feature creates vector embeddings based on files found in Unity Catalog, the central hub that gives customers access control, auditing, lineage and data discovery capabilities across their workspaces.
These embeddings are updated automatically through integrations with Databricks Model Serving. Developers can also add query filters to provide better outcomes.
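The idea of an embedding lookup narrowed by a query filter can be sketched in a few lines of plain Python. This is a minimal illustration only, not the Databricks Vector Search API: the `search` function, the toy three-dimensional vectors, and the `team` metadata field are all hypothetical, and a real system would generate embeddings with a model rather than hard-code them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(index, query_vec, top_k=2, filters=None):
    """Return the ids of the top_k most similar documents.

    `filters` is a hypothetical exact-match metadata filter, applied
    before ranking so that only permitted documents are scored.
    """
    filters = filters or {}
    candidates = [
        doc for doc in index
        if all(doc["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [doc["id"] for doc in ranked[:top_k]]

# Toy index: hand-written vectors stand in for real embeddings.
INDEX = [
    {"id": "hr-policy", "vec": [0.9, 0.1, 0.0], "meta": {"team": "hr"}},
    {"id": "sales-q2", "vec": [0.1, 0.9, 0.2], "meta": {"team": "sales"}},
    {"id": "sales-faq", "vec": [0.8, 0.2, 0.1], "meta": {"team": "sales"}},
]

print(search(INDEX, [1.0, 0.0, 0.0]))                             # ['hr-policy', 'sales-faq']
print(search(INDEX, [1.0, 0.0, 0.0], filters={"team": "sales"}))  # ['sales-faq', 'sales-q2']
```

Filtering before ranking means the second query never scores the HR document at all, which is one plausible way a filter could improve the relevance of what is passed to a generative model.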
Improvements to AutoML also add low-code functionality for fine-tuning LLMs. Any tweaks customers make to their LLMs will be owned by the enterprise, with data kept in-house and not shared with any third parties. The resulting model can also be shared within an organization through Model Serving, Unity Catalog and MLflow integrations.
Finally, Databricks has made several open source models available through its marketplace, including MosaicML’s MPT-7B and Falcon-7B (developed by the Technology Innovation Institute and distributed via Hugging Face), as well as Stable Diffusion.
On the LLMOps front, MLflow AI Gateway lets customers centrally manage credentials for software as a service (SaaS) models or APIs, and provides access-controlled querying. It also supports prediction caching to track repeat prompts and rate limiting to reduce costs.
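To make the caching and rate-limiting ideas concrete, here is a minimal sketch of a gateway that sits in front of a model. It is an illustration under stated assumptions, not the MLflow AI Gateway API: the `Gateway` class, its parameters, and the stand-in `echo_model` function are all hypothetical.

```python
import time

class Gateway:
    """Hypothetical front door for a model: caches answers to repeat
    prompts and limits how many model calls each user can trigger."""

    def __init__(self, model_fn, max_calls, per_seconds, clock=time.monotonic):
        self.model_fn = model_fn        # the (possibly paid) model call
        self.max_calls = max_calls      # allowed model calls per window
        self.per_seconds = per_seconds  # window length in seconds
        self.clock = clock              # injectable for testing
        self.cache = {}                 # prompt -> cached prediction
        self.calls = {}                 # user -> timestamps of model calls

    def query(self, user, prompt):
        # Repeat prompts are served from the cache and cost nothing.
        if prompt in self.cache:
            return self.cache[prompt]
        now = self.clock()
        recent = [t for t in self.calls.get(user, []) if now - t < self.per_seconds]
        if len(recent) >= self.max_calls:
            raise RuntimeError(f"rate limit exceeded for {user}")
        recent.append(now)
        self.calls[user] = recent
        result = self.model_fn(prompt)
        self.cache[prompt] = result
        return result

def echo_model(prompt):
    """Stand-in for a real model endpoint."""
    return prompt.upper()

gateway = Gateway(echo_model, max_calls=2, per_seconds=60)
print(gateway.query("ana", "summarise q2 sales"))  # calls the model
print(gateway.query("ben", "summarise q2 sales"))  # served from the cache
```

In this sketch cached responses do not count against a user's limit, since they never reach the model; whether a production gateway makes the same choice is a design decision, not something the announcement specifies.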
MLflow Prompt Tools, meanwhile, is a no-code feature that allows users to compare different outputs from various models based on prompts that are automatically tracked within MLflow.
New governance features in Lakehouse
Also new in Lakehouse are Lakehouse Federation capabilities, which break down the silos that previously existed between data systems on the platform. As a result, businesses can now discover, query and govern data across all their systems without needing to copy or move it.
Query federation comprises catalog and querying functionality to consolidate and map out data assets from different platforms beyond the confines of Databricks, including MySQL, PostgreSQL, Redshift, Snowflake, Azure SQL Database, Azure Synapse, BigQuery, and others. Users can secure, audit and access data from one interface rather than having to move the data between different platforms.
Additional governance tools in Unity Catalog also let customers apply access policies to any registered data asset, including tables, rows, columns and tags. This comes ahead of future plans to define data access policies and push them out to other data warehouses.
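Row-level and column-level policies of this kind can be pictured with a small sketch. The code below is purely illustrative and does not use Unity Catalog syntax: the `apply_policy` function, the policy layout, and the group names are all hypothetical.

```python
def apply_policy(rows, policy, user_groups):
    """Apply a hypothetical access policy to a list of row dicts.

    - policy["row_filter"] decides which rows a user may see.
    - policy["column_masks"] maps a column name to the set of groups
      allowed to read it unmasked.
    """
    visible = [row for row in rows if policy["row_filter"](row, user_groups)]
    masked = {
        column for column, allowed in policy["column_masks"].items()
        if not (allowed & user_groups)
    }
    return [
        {key: ("***" if key in masked else value) for key, value in row.items()}
        for row in visible
    ]

ROWS = [
    {"region": "EU", "customer": "Acme", "revenue": 120},
    {"region": "US", "customer": "Globex", "revenue": 300},
]

POLICY = {
    # Admins see every row; others only see rows for their own region group.
    "row_filter": lambda row, groups: (
        "admins" in groups or f"region_{row['region'].lower()}" in groups
    ),
    # Only finance and admins may read revenue figures.
    "column_masks": {"revenue": {"finance", "admins"}},
}

print(apply_policy(ROWS, POLICY, {"region_eu"}))
# [{'region': 'EU', 'customer': 'Acme', 'revenue': '***'}]
```

Defining the policy once and applying it wherever the data is read mirrors the single permission model the announcement describes, though the real mechanism is configured in the catalog rather than in application code.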
Lakehouse Federation also gives users a common approach to discovering and exploring both structured and unstructured data with a single query. Domain-owned data sources for analytics and AI use cases can be exposed rapidly, offering faster access to data. Finally, the changes make it possible for users to apply a single permission model across the data estate.

Keumars Afifi-Sabet is a writer and editor who specialises in public sector, cyber security, and cloud computing. He first joined ITPro as a staff writer in April 2018 and eventually became its Features Editor. Although a regular contributor to other tech sites in the past, these days you will find Keumars on LiveScience, where he runs its Technology section.