Your expert for questions
Martin Whyte
Partner in the Data & Analytics division at PwC Germany
Microsoft Fabric is a new and comprehensive Software-as-a-Service (SaaS) analytics platform offered by Microsoft. Fabric is meant to be a one-stop-shop solution for all kinds of data and analytics workflows in an enterprise, covering functionalities from data integration to data engineering, data science, real-time analytics, and reporting. Fabric essentially combines three proven Azure data services (Data Factory, Synapse, and Power BI) and enhances them with real-time and Copilot features, all in one unified platform and capacity license model. Fabric eliminates the need for manual resource setup, maintenance, and configuration of the underlying cloud services, which enables a new level of self-service for developing data products and analytics solutions. Fabric fosters collaboration among all data professionals, from data engineers and data scientists to data analysts and data consumers, on one common platform, resulting in faster value creation from data for the enterprise.
In the article below, we will explore the key new features of Microsoft Fabric and discuss how they might help enterprises build the core capabilities for a next-generation data platform infused with Data Mesh and Data Fabric principles.
The interplay of the different components of Microsoft Fabric becomes clear when looking at a use case example. Suppose, for instance, that an enterprise wants to combine financial data from an ERP system with customer data from a CRM system to predict the customer lifetime value (CLV) and adjust the customer segmentation accordingly.
Integrate: The implementation of this use case starts with ingesting finance data and customer data using Data Factory. Developers can use the Copilot functionalities in Microsoft Fabric to speed up the process.
Transform: Next, the data is transformed, harmonized, and combined via notebooks running on the Spark engine integrated into Microsoft Fabric.
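In a Fabric notebook this step would typically be written in PySpark; the following pure-Python sketch illustrates the same harmonize-and-join logic on two tiny in-memory record sets. All field names (cust_no, customer_id, revenue_eur, segment) and values are hypothetical.

```python
# Sketch of the transform step: harmonize ERP finance records and CRM
# customer records, then join them on a shared customer key.
# Schemas and values are illustrative, not from any real system.

erp_finance = [
    {"cust_no": "C-001", "revenue": "1200,50", "currency": "EUR"},
    {"cust_no": "C-002", "revenue": "980,00", "currency": "EUR"},
]
crm_customers = [
    {"customer_id": "C-001", "name": "Acme GmbH", "segment": "SMB"},
    {"customer_id": "C-002", "name": "Globex AG", "segment": "Enterprise"},
]

def harmonize(record):
    """Align the ERP schema with the CRM schema (rename key, parse numbers)."""
    return {
        "customer_id": record["cust_no"],
        # German decimal comma -> float
        "revenue_eur": float(record["revenue"].replace(",", ".")),
    }

finance = {r["customer_id"]: r for r in map(harmonize, erp_finance)}

# Join: enrich each CRM customer with its harmonized finance figures.
combined = [{**c, **finance.get(c["customer_id"], {})} for c in crm_customers]

print(combined[0])  # {'customer_id': 'C-001', 'name': 'Acme GmbH', 'segment': 'SMB', 'revenue_eur': 1200.5}
```

In a real notebook the same shape of logic would be expressed with Spark DataFrame operations (column renames, casts, and a join) over Lakehouse tables.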
Serve: The data is served via a SQL endpoint to Power BI, either via Fabric Data Warehouse (limited to structured data, organized by databases, schemas and tables) or directly via the OneLake of Microsoft Fabric (open to any data type, organized by folders, files, databases, and tables).
Consume: Both data serving approaches can be easily integrated into Power BI using DirectLake mode by loading parquet-formatted files directly from OneLake instead of importing or duplicating data into Power BI datasets.
Optimize: To complement the created Power BI report with predictive insights, machine learning models to predict CLV and classify customers are created with Synapse Data Science in Fabric.
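As a simplified illustration of what such a model computes, the sketch below uses a classic heuristic CLV formula (average order value × purchase frequency × expected lifespan) and threshold-based segmentation. In Fabric this would instead be a model trained with Synapse Data Science; the formula, thresholds, and numbers here are illustrative assumptions.

```python
# Heuristic customer-lifetime-value sketch:
#   CLV = average order value x orders per year x expected lifespan (years)
# A trained model in Synapse Data Science would replace this formula;
# all values and thresholds below are made up for illustration.

def predict_clv(avg_order_value: float, orders_per_year: float,
                expected_years: float) -> float:
    return avg_order_value * orders_per_year * expected_years

def classify(clv: float) -> str:
    # Hypothetical segmentation thresholds.
    if clv >= 10_000:
        return "high-value"
    if clv >= 2_000:
        return "mid-value"
    return "low-value"

clv = predict_clv(avg_order_value=250.0, orders_per_year=8, expected_years=6)
print(clv, classify(clv))  # 12000.0 high-value
```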
Live: To enable everyone in the organization to continuously apply CLV models and customer segments, Synapse Real-Time Analytics in Fabric can be utilized to make optimized business decisions on live data, and the Data Activator in Fabric can automatically take actions when patterns or conditions are detected in changing data, e.g., to create a new segment.
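The Data Activator idea of "act when a condition is detected in changing data" can be sketched as a simple rule watching an event stream. In Fabric this is configured declaratively rather than coded; the event shape, threshold, and action below are hypothetical.

```python
# Sketch of the Data Activator pattern: watch a stream of events and
# trigger an action when a condition is met. Event schema, the CLV
# threshold, and the action are illustrative assumptions.

from typing import Callable, Iterable

def watch(events: Iterable[dict],
          condition: Callable[[dict], bool],
          action: Callable[[dict], None]) -> None:
    for event in events:
        if condition(event):
            action(event)

triggered = []
events = [
    {"customer_id": "C-001", "clv": 2_500},
    {"customer_id": "C-002", "clv": 14_000},  # crosses the threshold
]
watch(
    events,
    condition=lambda e: e["clv"] > 10_000,
    action=lambda e: triggered.append(e["customer_id"]),  # e.g., assign a new segment
)
print(triggered)  # ['C-002']
```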
The described use case could also be implemented within Azure Synapse as a unified platform for data warehousing and analytics, or within Azure Databricks as an integrated notebook-based analytics and AI platform. However, Microsoft Fabric provides advantages such as one unified workspace, a single capacity license model, and OneLake as a common data foundation.
Microsoft Fabric combines all these technical capabilities in one workspace and therefore makes them more accessible to data citizens. But does every company now need Microsoft Fabric?
If your goal is to reduce technical complexity and simplify governance, Microsoft Fabric might be the right next evolution of your data platform.
Why?
The Data Lakehouse architecture pattern is used in most modern data platforms because it is an open and governed foundation, combining the best elements of data lakes and data warehouses.
How?
In Fabric, notebooks transform the data and store it in the Lakehouse using a common structure, e.g., the medallion architecture.
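The medallion structure mentioned above can be sketched as three layers: raw (bronze) records are validated into silver, then aggregated into gold. In a Fabric notebook these layers would be Delta tables in the Lakehouse; here plain Python lists stand in, and all records are illustrative.

```python
# Minimal medallion-architecture sketch: bronze (raw) -> silver (cleaned,
# typed) -> gold (business aggregate). In Fabric each layer would be a
# Delta table in the Lakehouse; the records here are made up.

bronze = [
    {"customer_id": "C-001", "amount": "100.0"},
    {"customer_id": "C-001", "amount": "50.0"},
    {"customer_id": None, "amount": "10.0"},  # invalid record, dropped in silver
]

# Silver: validated and typed records only.
silver = [
    {"customer_id": r["customer_id"], "amount": float(r["amount"])}
    for r in bronze
    if r["customer_id"] is not None
]

# Gold: business-level aggregate (revenue per customer).
gold: dict[str, float] = {}
for r in silver:
    gold[r["customer_id"]] = gold.get(r["customer_id"], 0.0) + r["amount"]

print(gold)  # {'C-001': 150.0}
```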
New in Fabric?
OneLake integration – Create shortcuts from Lakehouses and run SQL queries on your data, reducing data estate fragmentation.
Why?
Data Warehouses are an established approach for managing structured data for analytical purposes.
How?
In Fabric, you can use widely known stored procedures to structure, transform, and serve your data in a Data Warehouse.
New in Fabric?
The data is stored in OneLake in the open Delta-Parquet standard and can be used from the Spark environment without requiring any data movement.
Why?
Data Science enables the extraction of actionable insights from diverse datasets, fostering decision-making, optimizing processes, and enhancing efficiency.
How?
In Fabric, Data Scientists create machine learning models which can be trained and tested directly on Lakehouse, Warehouse or OneLake.
New in Fabric?
Data Science is now part of one unified platform. With Semantic Link, you can leverage Power BI datasets within Data Science experiments and easily analyze the data with Copilot.
Why?
Seamless data integration in real time is often business critical to enhance the decision process.
How?
In Fabric, there are several open-source connectors for real-time analytics.
New in Fabric?
With "one logical copy", data in KQL (Kusto Query Language) databases is also available in OneLake in Delta-Parquet format. The new Get Data experience simplifies the data ingestion process.
Use data even more effectively within your company based on modern data platform concepts like data mesh and data fabric.