Understand Btrfs File System (Copy On Write, Sub-Volumes, Snapshots, Quota Group) — Part 1

Nov 2, 2023

Btrfs is an advanced file system that uses copy-on-write techniques, sub-volumes, snapshots, and quota groups to provide impressive capabilities and reliability for storing and managing data.

Hello there, I hope you are all doing well. I’m dividing this article into two parts because the Btrfs file system is a vast topic, and splitting it makes it easier to grasp. Before delving into the introduction of Btrfs, let’s clarify some fundamental terms to ensure everyone is on the same page.

Introduction

What is a file system?

A file system is like a digital organization system for your data. It manages how files and directories are stored and accessed on a storage device like a hard drive or SSD. In Linux, which is known for its versatility, there are several types of file systems supported. The most commonly used one is the Ext4 file system, known for its reliability and performance.

For those interested in advanced features like snapshots and copy-on-write, Btrfs is a compelling option, and it’s the focus of this article. The choice of file system depends on your particular needs and the balance of performance, features, and data integrity you require.

Btrfs File System?

Btrfs, short for “B-tree file system,” is a modern file system for Linux that offers various advanced features. It’s designed to provide improved data integrity, scalability, and management of large amounts of data. Btrfs relies heavily on B-trees, which are a type of self-balancing tree structure. B-trees are used to index and manage various aspects of the file system, including file data, metadata, and directory structures. They provide fast and efficient lookup, insertion, and deletion operations.

The leaves of these B-trees hold the actual items, such as inode items, directory entries, and references to the extents that hold file data. Btrfs has a concept similar to the traditional inodes found in many file systems. These inodes store metadata about files and directories, such as permissions, timestamps, and file size. As you might guess, in Btrfs inodes are organized in B-trees for efficient access.

Still not clear? Let me try this again.

On disk, your file system stores data in small blocks. Blocks are the smallest individually addressable storage units. Your computer stores each file using one or more of these blocks, with a typical block size of 4096 bytes, though this size can vary based on your hardware and the specific file system in use.
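To make the block arithmetic concrete, here is a minimal sketch in Python. It assumes the 4096-byte block size mentioned above and a deliberately simplified model in which every file occupies whole blocks; the function names are mine, not part of any Btrfs tool.

```python
BLOCK_SIZE = 4096  # bytes; assumed typical value, real systems may differ


def blocks_needed(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Return how many whole blocks a file of file_size bytes occupies."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Integer ceiling division: a partially filled block still counts as one.
    return (file_size + block_size - 1) // block_size


def slack_bytes(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the bytes allocated in the last block but not used by the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size


if __name__ == "__main__":
    for size in (1, 4096, 10_000):
        print(size, blocks_needed(size), slack_bytes(size))
```

In this simplified model, a 10,000-byte file needs three blocks, and part of the third block stays unused.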

File systems provide a means to locate the data within your files among the multitude of available storage blocks. This is achieved through components known as inodes. Inodes are data structures residing in specially formatted storage blocks and contain vital information about a file, such as its size, the locations of the storage blocks that hold its content, permissions (i.e., who can read, write, or execute the file), and other essential details.

In the Btrfs file system, these data structures are organized in B-trees (which are not binary trees: a B-tree node can hold many keys and have many children). B-trees are used to index and manage various aspects of the file system, including file data, metadata, and directory structures.
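To give a feel for how a lookup walks such a tree, here is a toy sketch in Python. It is not the Btrfs on-disk format: the Node class, the integer keys, and the hand-built tree are all assumptions made for illustration, and real keys and node layouts are more involved. The only ideas it shows are that interior nodes route the search and leaves hold the items.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                     # sorted keys
    children: list = field(default_factory=list)   # empty for a leaf
    values: list = field(default_factory=list)     # items, used by leaves only

    @property
    def is_leaf(self) -> bool:
        return not self.children


def search(node: Node, key):
    """Walk from the root down to a leaf and return the item stored under key."""
    while not node.is_leaf:
        # keys[i] separates children[i] (smaller keys) from children[i + 1].
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


# A hand-built two-level tree (hypothetical contents).
leaf_a = Node(keys=[1, 4], values=["item-1", "item-4"])
leaf_b = Node(keys=[7, 9], values=["item-7", "item-9"])
leaf_c = Node(keys=[12, 15], values=["item-12", "item-15"])
root = Node(keys=[7, 12], children=[leaf_a, leaf_b, leaf_c])

if __name__ == "__main__":
    print(search(root, 9))   # found in the middle leaf
    print(search(root, 5))   # not present
```

Because each node holds many keys, the tree stays shallow, which is why lookups need only a few steps even for large file systems.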

You can read more about it here: https://btrfs.readthedocs.io/en/latest/dev/dev-btrfs-design.html

Copy-On-Write

Among its many features, one standout, and the focus of this article, is “Copy-On-Write” (COW).

Copy-On-Write is a data management technique used in file systems like Btrfs. When you make changes to a file, instead of directly overwriting the original data (Block), Btrfs creates a copy of the data you’re modifying. This ensures that the original data remains untouched, enhancing data integrity. The new copy contains the changes you’ve made. Btrfs then updates the file’s metadata to point to the new data.

This approach has several advantages. Firstly, it minimizes the risk of data corruption since the original data remains intact until the new copy is successfully written. Secondly, it enables efficient snapshots, where you can create a point-in-time copy of your file system without duplicating all the data, making backups and versioning more efficient. Overall, Copy-On-Write in Btrfs is a powerful tool for data management and protection.
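The write path described above can be sketched in a few lines of Python. This is a toy model under my own assumptions (the CowStore class and its methods are hypothetical and have nothing to do with the real Btrfs code): a write never touches the old block, it allocates a new one and then switches the file’s metadata to point at it.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes
        self.file_map = {}   # file name -> list of block ids (the "metadata")
        self._next_id = 0

    def _alloc(self, data: bytes) -> int:
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def create(self, name: str, chunks: list) -> None:
        self.file_map[name] = [self._alloc(chunk) for chunk in chunks]

    def write(self, name: str, index: int, data: bytes):
        """Replace one block of a file; return (old block id, new block id)."""
        old_id = self.file_map[name][index]
        new_id = self._alloc(data)            # 1. write the new copy elsewhere
        new_map = list(self.file_map[name])
        new_map[index] = new_id
        self.file_map[name] = new_map         # 2. switch the metadata pointer
        # Simplification: the old block is kept forever here; a real file
        # system would free it once nothing references it any more.
        return old_id, new_id

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[b] for b in self.file_map[name])


if __name__ == "__main__":
    store = CowStore()
    store.create("a.txt", [b"hello ", b"world"])
    old_id, new_id = store.write("a.txt", 1, b"btrfs")
    print(store.read("a.txt"), store.blocks[old_id])
```

Note that the old block is still intact after the write, which is exactly what makes cheap snapshots possible.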

Sub-Volumes

A Btrfs subvolume is like a separate section in the file system with its own set of folders and files. Think of it as a mini file system within the main one, each with its own file and directory hierarchy and its own inode number namespace.

Subvolumes can share data with each other, and they are the basis for snapshots. A snapshot is essentially a subvolume whose initial contents are copied from the original subvolume. It’s important to note that Btrfs subvolumes are not the same as LVM logical volumes, whose snapshots work at the block level. Btrfs subvolumes work at the file extent level. You can visualize a Btrfs subvolume as a regular directory, but it has some additional features.

In simple terms:

Subvolumes can be thought of as organized structures within the file system, like named containers for files and directories. Each one is essentially a named B-tree that holds files and directories, and it has an inode in the tree of tree roots. Subvolumes can be owned by specific users and groups and can be given a limit on the number of blocks they can use. Once this limit is reached, no more data can be written to the subvolume. All the blocks and file extents within subvolumes are reference counted to enable snapshotting. You can create up to 2^64 subvolumes in the file system.

You can access a Btrfs subvolume in two ways:

Like any other directory that is accessible to the user.
Like a separately mounted file system, using the mount options ‘subvol’ or ‘subvolid’. When you do this, the parent directory becomes invisible and inaccessible. This operation is similar to a bind mount.

When you create a fresh Btrfs file system, it’s essentially a subvolume called the top-level subvolume, which is internally identified as id 5. This top-level subvolume cannot be removed or replaced by another subvolume. By default, it’s the subvolume that will be mounted unless you’ve changed the default subvolume setting.

Source : https://btrfs.readthedocs.io/en/latest/Subvolumes.html

Snapshots

Snapshots are very similar to subvolumes, but they start with the same root block as another subvolume. When a snapshot is created, it essentially duplicates the initial root block, and the system ensures that changes made in either the snapshot or the original subvolume remain isolated to their respective roots. Snapshots are writable, and you can create as many snapshots as needed. The original Btrfs design describes read-only snapshots as having their block quota set to one at creation time; with the btrfs tool, you ask for a read-only snapshot by passing the -r option to btrfs subvolume snapshot.

By default, snapshots are created read-write. File modifications in a snapshot do not affect the files in the original subvolume.
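Here is a small Python sketch of that behaviour. It is a toy model under my own assumptions (the Volume class and the subvolume names are hypothetical): a subvolume is just a root mapping from file names to block ids, a snapshot copies only that root, and later writes land in new blocks so each side keeps its own view.

```python
class Volume:
    """Toy model: each subvolume is a root mapping of file name -> block id."""

    def __init__(self):
        self.blocks = {}   # shared block pool: block id -> bytes
        self.roots = {}    # subvolume name -> {file name: block id}
        self._next_id = 0

    def create_subvolume(self, name: str) -> None:
        self.roots[name] = {}

    def write(self, subvol: str, filename: str, data: bytes) -> None:
        # Copy-on-write: always store the data in a fresh block.
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        self.roots[subvol][filename] = block_id

    def snapshot(self, source: str, name: str) -> None:
        # Only the root mapping is duplicated; the data blocks stay shared.
        self.roots[name] = dict(self.roots[source])

    def read(self, subvol: str, filename: str) -> bytes:
        return self.blocks[self.roots[subvol][filename]]


if __name__ == "__main__":
    vol = Volume()
    vol.create_subvolume("home")
    vol.write("home", "notes.txt", b"v1")
    vol.snapshot("home", "home-snap")
    vol.write("home-snap", "notes.txt", b"v2")
    print(vol.read("home", "notes.txt"), vol.read("home-snap", "notes.txt"))
```

Right after the snapshot, both roots point at the same block, so the snapshot costs almost nothing until one side changes a file.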

Quota Group

Quotas have a longstanding role in the Unix world. They come into play when multiple users share a single filesystem, ensuring that no one user hogs all the available space. It’s about fair resource allocation.

For files, it’s pretty straightforward. Each file has an owner and a size. Traditional quotas simply limit the total size of files owned by a user. This system is flexible because administrators can adjust quotas as needed.

However, traditional quotas don’t handle directories well. In the past, administrators had to partition the hard disk during installation, assigning separate partitions to directories like /usr or /var to set limits. The problem is that these limits couldn’t be easily changed without reinstalling the system. Btrfs subvolumes provide a solution by acting like partitions. Each subvolume looks like its own filesystem, and with subvolume quotas, you can restrict them like partitions while retaining quota flexibility. You can expand or restrict space for each subvolume as needed, without the hassle of reinstallation.

As subvolumes form the foundation for snapshots, a question arises about how to account for space when snapshots are involved. If a file is shared between a subvolume and a snapshot, who should be charged for it? The creator? Both? What if the file is modified in the snapshot? Also, sometimes you want to limit the combined space used by both the snapshot and subvolume, but others might have different requirements.

Btrfs subvolume quotas address these issues by introducing groups of subvolumes, referred to as qgroups. Each qgroup tracks two key numbers:

Referenced space: This is the amount of data reachable from any subvolume within the qgroup.
Exclusive space: This is the amount of data for which every reference can be reached from within this qgroup, meaning nothing outside the qgroup shares it.
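The two numbers are easier to see with a worked example. The Python sketch below is only a model of the definitions above, not how Btrfs computes them internally; the function name, the extent ids, and the sizes are all made up for illustration.

```python
def qgroup_usage(subvolumes: dict, members) -> tuple:
    """Return (referenced, exclusive) bytes for a qgroup.

    subvolumes maps a subvolume name to {extent id: size in bytes};
    members lists the subvolumes that belong to the qgroup.
    """
    members = set(members)
    inside = {}        # extents reachable from the qgroup
    for name in members:
        inside.update(subvolumes[name])
    outside = set()    # extents reachable from anywhere else
    for name, extents in subvolumes.items():
        if name not in members:
            outside.update(extents)
    referenced = sum(inside.values())
    exclusive = sum(size for ext, size in inside.items() if ext not in outside)
    return referenced, exclusive


# Hypothetical layout: the snapshot still shares extent "e1" with "home".
SUBVOLUMES = {
    "home": {"e1": 100, "e2": 50},
    "home-snap": {"e1": 100, "e3": 30},
}

if __name__ == "__main__":
    print(qgroup_usage(SUBVOLUMES, ["home"]))
    print(qgroup_usage(SUBVOLUMES, ["home-snap"]))
    print(qgroup_usage(SUBVOLUMES, ["home", "home-snap"]))
```

In this example the shared extent counts as referenced for each subvolume on its own, but it only becomes exclusive once both subvolumes are placed in the same qgroup.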

You can read more about it here : https://btrfs.readthedocs.io/en/latest/Qgroups.html#subvolume-quota-groups

Before proceeding, ensure that you are working on a Btrfs file system. If not, you have the option to set up a virtual machine as a test environment (Fedora and SUSE use btrfs filesystem by default). I would recommend trying this in a test setup unless you are confident in your actions.

There are several ways to find out which file system your system is using. On Linux, one option is the findmnt tool, for example:

findmnt -n -o FSTYPE /

On a Btrfs system, this command prints btrfs.

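We can also list the Btrfs subvolumes using the btrfs utility, for example with sudo btrfs subvolume list /

Or print ‘/etc/fstab’ with cat /etc/fstab and look for entries whose file system type is btrfs.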
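If you would rather check /etc/fstab from a script, here is a minimal Python sketch. It relies only on the standard fstab layout (device, mount point, type, options, then two numeric fields); the function name and the sample entries are made up for illustration, and it reports the subvol and subvolid mount options when they are present.

```python
def btrfs_entries(fstab_text: str) -> list:
    """Return the Btrfs entries found in fstab-formatted text."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4 or fields[2] != "btrfs":
            continue  # malformed line or a different file system type
        entry = {"device": fields[0], "mount_point": fields[1],
                 "subvol": None, "subvolid": None}
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            if key in ("subvol", "subvolid"):
                entry[key] = value
        entries.append(entry)
    return entries


# Made-up sample, not taken from a real machine.
SAMPLE_FSTAB = """\
# <device>      <mount> <type> <options>                    <dump> <pass>
UUID=1111-aaaa  /       btrfs  subvol=root,compress=zstd:1  0 0
UUID=1111-aaaa  /home   btrfs  subvol=home                  0 0
UUID=2222-bbbb  /boot   ext4   defaults                     1 2
"""

if __name__ == "__main__":
    for entry in btrfs_entries(SAMPLE_FSTAB):
        print(entry["mount_point"], entry["subvol"])
```

To run it against your own system, you could pass the contents of /etc/fstab (for example, open("/etc/fstab").read()) instead of the sample text.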

From here on, I’m assuming your system is using the Btrfs file system.

On a Btrfs file system, some disk utilities will not function as expected, requiring alternative commands to perform specific tasks. These alternative commands are tailored to work effectively with Btrfs.

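To get detailed disk usage information, you can use, for example:

sudo btrfs filesystem usage /

To get space usage information for a mount point, you can use, for example:

btrfs filesystem df /

That’s it for this part.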

In the next part, we’ll delve into practical tasks like creating, managing, and deleting subvolumes, creating snapshots, and setting size limits for subvolumes, among other topics.

If you have any specific questions or tasks you’d like to cover, please don’t hesitate to let me know.

I will try my best to cover it in the next part. See you in the next article!

Sources

Introduction - BTRFS documentation, btrfs.readthedocs.io