Decoding Microsoft Fabric: An Introduction to Fabric OneLake

FABRIC SERIES: FUNDAMENTALS 03 — INTRODUCTION TO MICROSOFT FABRIC LINGOS — OneLake

RK Iyer
6 min read · Apr 13, 2024

In our last blog, “Decoding Microsoft Fabric: An Introduction to Its Key Terminologies”, we went through the Microsoft Fabric lingo that underpins its core concepts. In this blog, we take an in-depth look at OneLake: what it is, what it can do, and the role it plays in data management in Microsoft Fabric.

❑ What is OneLake?

OneLake-Single Unified Logical Storage

OneLake is a single, unified, logical storage layer for your whole organization. It comes automatically with every Microsoft Fabric tenant. OneLake is designed to be the single place for all your analytics data, which is why it is called the “OneDrive for data”.

Built on top of Azure Data Lake Storage (ADLS) Gen2, OneLake can store any type of file, structured or unstructured. All Fabric data items, such as data warehouses and lakehouses, automatically store their data in OneLake in Delta Parquet format. This lets data engineers load a lakehouse using Spark and SQL developers populate a fully transactional data warehouse with T-SQL, while everyone contributes to the same data lake. The Fabric compute engines are optimized to work with Delta Parquet files as their native format.

OneLake supports the same ADLS Gen2 APIs and SDKs, so it is compatible with existing ADLS Gen2 applications, including Azure Databricks.

❑ Why OneLake?

Before OneLake, it was easier for customers to create multiple data lakes for different business groups than to collaborate on a single lake. The result was often a data swamp: a deteriorated, unmanageable data lake. OneLake aims to remove these challenges by improving collaboration.

What is a Data Swamp?

When a data lake becomes unorganized and difficult to navigate because of a lack of proper data management practices, it is called a data swamp. In simple words, it’s a poorly managed data lake.

❑ What is the relationship between workspace & OneLake?

Just as every Office user can create a new Teams channel or SharePoint site without needing administrator coordination, OneLake enables distributed ownership through its workspaces. Workspaces allow different parts of the organization to operate autonomously while all contributing to the same data lake. Each workspace is administered independently, with its own access controls. Each workspace is also assigned to a capacity that resides in a region of your choice. This lets OneLake serve customers operating across multiple countries and helps them meet local data residency requirements.

OneLake across multiple workspaces

As illustrated in the diagram above, OneLake spans the globe, with different workspaces residing in different countries while remaining part of the same logical lake.

❑ OneLake Characteristics

🗹 Every customer tenant has exactly one OneLake: never zero, never more than one.

🗹 Every Fabric tenant automatically provisions OneLake, with no extra resources to set up or manage.

🗹 OneLake enables One copy of data for use across multiple analytical engines.

🗹 OneLake enables One security model living natively with the data in the lake.

🗹 OneLake is open at every level.

🗹 OneLake can be addressed as if it were one big ADLS storage account for the entire organization.

❑ How can I access my data in OneLake through a OneLake URI?

Microsoft OneLake provides open access to all of your Fabric items through existing Azure Data Lake Storage (ADLS) Gen2 APIs and SDKs. You can access your data in OneLake through any API, SDK, or tool compatible with ADLS Gen2 just by using a OneLake URI instead.
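To make this concrete, here is a minimal sketch of listing files in a Fabric item with the Azure Storage Data Lake SDK for Python. It assumes the `azure-storage-file-datalake` and `azure-identity` packages are installed and that `DefaultAzureCredential` can sign you in. The helper names `item_directory` and `list_onelake_paths` and the sample item names are hypothetical, used only for illustration. The workspace plays the role of the ADLS Gen2 file system (container).

```python
# OneLake is addressed like a single ADLS Gen2 account for the whole tenant.
ONELAKE_ACCOUNT_URL = "https://onelake.dfs.fabric.microsoft.com"


def item_directory(item: str, item_type: str, folder: str = "") -> str:
    """Build the directory path of a folder inside a Fabric item (hypothetical helper)."""
    base = f"{item}.{item_type}"
    folder = folder.strip("/")
    return f"{base}/{folder}" if folder else base


def list_onelake_paths(workspace: str, item: str, item_type: str, folder: str = "Files") -> list:
    """List path names under a folder of an item. Requires the Azure SDK packages."""
    # Imported lazily so the helper above works without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(ONELAKE_ACCOUNT_URL, credential=DefaultAzureCredential())
    # The workspace name is used where an ADLS Gen2 file system name would go.
    file_system = service.get_file_system_client(workspace)
    directory = item_directory(item, item_type, folder)
    return [p.name for p in file_system.get_paths(path=directory)]


# Example call (needs network access and Fabric permissions, so it is left commented out):
# list_onelake_paths("SalesWorkspace", "SalesLakehouse", "lakehouse", "Files/raw")
```

The only OneLake-specific parts are the account URL and the `<item>.<itemtype>` prefix in the path. Everything else is the ordinary ADLS Gen2 client.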

URI Syntax

Because OneLake exists across your entire Microsoft Fabric tenant, you can refer to anything in your tenant by its workspace, item, and path. The syntax is as follows:

https://onelake.dfs.fabric.microsoft.com/<workspace>/<item>.<itemtype>/<path>/<fileName>
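As a minimal sketch, the name-based syntax above can be assembled with a small Python helper. The function name `build_onelake_url` and the sample workspace, item, and file names are hypothetical. Percent-encoding the names with `urllib` is ordinary URL hygiene that I am assuming here, not a OneLake-specific rule.

```python
from urllib.parse import quote

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def build_onelake_url(workspace: str, item: str, item_type: str, path: str) -> str:
    """Return https://<host>/<workspace>/<item>.<itemtype>/<path>/<fileName>."""
    # Encode each segment so characters such as spaces are valid in a URL.
    workspace_part = quote(workspace, safe="")
    item_part = quote(f"{item}.{item_type}", safe="")
    # Keep "/" unescaped inside the path so folder separators survive.
    path_part = quote(path.strip("/"), safe="/")
    return f"https://{ONELAKE_HOST}/{workspace_part}/{item_part}/{path_part}"


print(build_onelake_url("SalesWorkspace", "SalesLakehouse", "lakehouse", "Files/raw/orders.csv"))
# https://onelake.dfs.fabric.microsoft.com/SalesWorkspace/SalesLakehouse.lakehouse/Files/raw/orders.csv
```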

GUID

OneLake also supports referencing workspaces and items by their globally unique identifiers (GUIDs). OneLake assigns the GUIDs, and they don’t change even if the workspace or item name changes.

You can find the GUID for your workspace or item in the URL on the Fabric portal.

When you use GUIDs, you must use them for both the workspace and the item, and you don’t need the item type.

https://onelake.dfs.fabric.microsoft.com/<workspaceGUID>/<itemGUID>/<path>/<fileName>
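The GUID form can be sketched the same way. The helper below is hypothetical (`build_onelake_guid_url` is not part of any SDK) and the GUIDs in the example are made-up placeholders. It uses Python's `uuid` module to reject values that are not well-formed GUIDs before building the URL.

```python
import uuid

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def build_onelake_guid_url(workspace_guid: str, item_guid: str, path: str) -> str:
    """Return https://<host>/<workspaceGUID>/<itemGUID>/<path>/<fileName>."""
    # uuid.UUID raises ValueError for malformed input and normalizes the text form.
    workspace_part = str(uuid.UUID(workspace_guid))
    item_part = str(uuid.UUID(item_guid))
    # No ".<itemtype>" suffix is added: the GUID form does not need the item type.
    return f"https://{ONELAKE_HOST}/{workspace_part}/{item_part}/{path.lstrip('/')}"


print(build_onelake_guid_url(
    "11111111-2222-3333-4444-555555555555",  # placeholder workspace GUID
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",  # placeholder item GUID
    "Files/raw/orders.csv",
))
```

Because GUIDs survive renames, URLs built this way keep working if someone later renames the workspace or the item.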

ABFS

OneLake also supports the Azure Blob Filesystem (ABFS) driver for more compatibility with ADLS Gen2 and Azure Blob Storage. The ABFS driver uses its own scheme identifier, abfs, and a different URI format to address files and directories in ADLS Gen2 accounts. To use this URI format with OneLake, put the workspace where the file system name would go, and include the item and item type in the path.

abfs[s]://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path>/<fileName>
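The two formats differ only in where the workspace sits, so one can be derived from the other. The converter below is a minimal sketch: `https_to_abfss` is a hypothetical helper name, and the sample URL uses made-up workspace and item names.

```python
from urllib.parse import urlsplit

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def https_to_abfss(url: str) -> str:
    """Convert a name-based OneLake HTTPS URL into the abfss:// form."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.netloc != ONELAKE_HOST:
        raise ValueError(f"not a OneLake HTTPS URL: {url}")
    # The first path segment is the workspace; the rest is <item>.<itemtype>/<path>.
    workspace, _, rest = parts.path.lstrip("/").partition("/")
    if not workspace or not rest:
        raise ValueError("URL must contain a workspace and an item path")
    # In the ABFS format the workspace takes the file system position before '@'.
    return f"abfss://{workspace}@{ONELAKE_HOST}/{rest}"


print(https_to_abfss(
    "https://onelake.dfs.fabric.microsoft.com/SalesWorkspace/SalesLakehouse.lakehouse/Files/raw/orders.csv"
))
# abfss://SalesWorkspace@onelake.dfs.fabric.microsoft.com/SalesLakehouse.lakehouse/Files/raw/orders.csv
```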

❑ OneLake file explorer for Windows

The OneLake file explorer is a Windows application that integrates OneLake with Windows File Explorer. It automatically syncs all OneLake items that you have access to into Windows File Explorer. This makes data lakes accessible even to non-technical business users.

You can download the Microsoft OneLake file explorer for Windows from the official Microsoft Download Center.

OneLake Explorer

🗹 “Sync” refers to pulling up-to-date metadata on files and folders, and sending changes made locally to the OneLake service. Syncing doesn’t download the data; it creates placeholders. You must double-click a file to download its data locally.

🗹 When you create, update, or delete a file via OneLake file explorer, it automatically syncs the changes to OneLake service.

🗹 Updates to your items made outside of the OneLake file explorer aren’t automatically synced. To pull these updates, right-click the workspace name, item name, folder name, or file in OneLake file explorer and select Sync from OneLake. This action refreshes the view for any folders that were previously synced. To pull updates for all workspaces, right-click the OneLake root folder in Windows File Explorer and select Sync from OneLake.

🗹 OneLake’s file explorer syncs updates when online and active, automatically refreshing previously synced folders upon application launch. Files added or updated offline appear as sync pending until saved, while deleted files are restored during refresh if still available on the service.

🗹 Tenant admins can restrict access to OneLake file explorer for their organization in the Microsoft Fabric admin portal.

❑ OneLake data hub

The OneLake data hub makes it easy to find, explore, and use the Fabric data items in your organization that you have access to. It provides information about the items and entry points for working with them.

OneLake data hub

You can display only data items belonging to a particular domain or find recommended items or find items by workspace.

I hope this blog helped you understand the core concepts of OneLake. There is still more to come. Happy Learning!

Please Note — All opinions expressed here are my personal views and not of my employer.

Thought of the moment-

“To succeed in your mission, you must have single-minded devotion to your goal.” - APJ Abdul Kalam

RK Iyer

Architect@Microsoft, Technology Evangelist, Sports Enthusiast! All opinions here are my personal thoughts and not my employer’s.