Chances of data recovery depending on the file system

As mentioned in the article on evaluating recovery potential across various data loss scenarios, the file system is one of the principal factors in determining the likelihood of success. It serves as a mechanism that enables the operating system to organize and retrieve data from a storage medium. But beyond merely defining how information is stored on the disk, each file system has its own distinct practices for handling file deletion and storage formatting. And since different operating systems employ different file systems, the chances of data recovery depend largely on the specific file system in use. The following information will guide you through assessing recovery prospects after accidental file deletion or file system formatting, based on the file system applied on your storage.

At the same time, if both the operating system and the storage device support TRIM, the possibility of successful data recovery can be virtually eliminated, regardless of the file system type in use. A more detailed exploration of TRIM's effects on the procedure can be found in the Obstacles to data recovery induced by TRIM section.

Hint: The most typical cases of data loss referred to in this article are outlined in the principles of data recovery.

Content:

File systems of Windows
File systems of macOS
File systems of Linux
File systems of BSD, Solaris, Unix
Obstacles to data recovery induced by TRIM

File systems of Windows

The primary file systems of the Windows OS family are FAT (FAT32), exFAT, and NTFS. While these are commonly used on general-purpose desktop systems, the next-generation ReFS file system is designed for advanced use cases and may be utilized on certain Windows-based servers as well as computers running professional versions of Windows (starting from Windows 10).

It is important to note that successful data recovery from these file systems can be hindered by severe file fragmentation and is only possible if the affected files have not been overwritten.

FAT/FAT32

The storage space in FAT/FAT32 is divided into equal-sized units called clusters. A file may occupy one or more of these clusters. Yet, the clusters containing data of the same file may not be situated adjacent to each other. Such files are then referred to as fragmented.

The File Allocation Table (FAT) tracks the allocation of clusters across the storage medium, determining which clusters are assigned to which files. Typically, two copies of the FAT table are maintained, in case one copy becomes corrupted. The table has an entry for each cluster in the file system. If a cluster is occupied, its FAT entry may contain a link to the subsequent cluster used by the very same file or an indication that the cluster is the final in the file’s cluster sequence.
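
The way a cluster chain is followed through the table can be illustrated with a minimal sketch. It assumes a simplified in-memory table (a dictionary mapping a cluster number to its FAT entry); the `EOC` and `FREE` values and the `follow_chain` name are illustrative assumptions rather than an exact description of the on-disk format.

```python
EOC = 0x0FFFFFFF   # assumed end-of-chain marker for this sketch
FREE = 0           # assumed value of a free (unallocated) entry

def follow_chain(fat, start_cluster):
    """Return the ordered list of clusters of a file, beginning at start_cluster."""
    chain = []
    seen = set()
    cluster = start_cluster
    while cluster != EOC:
        # A free entry or a loop means the chain can no longer be trusted.
        if cluster in seen or fat.get(cluster, FREE) == FREE:
            raise ValueError("broken chain at cluster %d" % cluster)
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

# A fragmented file occupying clusters 5 -> 6 -> 9
fat = {5: 6, 6: 9, 9: EOC}
print(follow_chain(fat, 5))
```

Once the entries of these clusters are marked as free, the same call fails at the very first cluster, which is why the links cannot be relied upon after deletion.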

The Root Directory keeps entries for all files and folders stored at the root of the storage device. Each entry records essential information about a file, such as its name, size and other attributes, together with the number of the first cluster belonging to that file.

File deletion from FAT/FAT32

The file’s directory entry is marked as deleted. The first character of the file's name is replaced by a special value indicating its deleted state. In FAT32, the starting cluster may also be updated. The FAT table entries that correspond to its clusters are marked as "free", which destroys the chain of clusters that make up the deleted file.

Recovery of non-fragmented files: When the file’s clusters are located contiguously, data recovery is relatively easy. The file’s name, size and starting cluster are still present in the directory entry, so the chances of restoring the file are often close to 100%.
Recovery of fragmented files: The chain of the file’s clusters stored in the FAT entries is destroyed, leaving no links to the intermediate and end clusters. However, the directory entry remains intact, so the file’s name, size and starting position are still known. It may be possible to predict the locations of the file’s fragments using heuristics (trial-and-error methods), yet there is no guarantee of success.
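
The reasoning behind the non-fragmented case can be sketched as follows. The function names are hypothetical, and the sketch assumes that the starting cluster, file size and cluster size have already been read from the directory entry and the boot sector. The deleted-entry marker 0xE5 is the value FAT commonly places in the first byte of the name.

```python
import math

def is_deleted_entry(raw_entry):
    """A deleted FAT directory entry has its first name byte set to 0xE5."""
    return len(raw_entry) > 0 and raw_entry[0] == 0xE5

def contiguous_clusters(start_cluster, file_size, cluster_size):
    """Clusters the file would occupy if it was stored without fragmentation."""
    count = math.ceil(file_size / cluster_size)
    return list(range(start_cluster, start_cluster + count))

# A 10,000-byte file starting at cluster 7 with 4 KiB clusters
print(contiguous_clusters(7, 10000, 4096))
```

This guess is only correct when the file really was contiguous; for a fragmented file the same range would mix in clusters that belong to other files.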

Formatting of FAT/FAT32

Both copies of the FAT table are wiped, which destroys the mapping between the file’s clusters. The Root Directory entries are also cleared. However, the data content remains on the disk until it is overwritten.

Recovery of non-fragmented files: Formatting clears the directory entries, so the names, sizes and starting clusters of files are unknown. Yet, recovery algorithms based on known file signatures (the RAW recovery method) may identify the data content of files and retrieve them successfully. At the same time, file names, directories and other attributes will be lost.
Recovery of fragmented files: The chains of clusters previously available in the FAT tables are missing, and fragmentation makes it extremely difficult to predict possible content locations. Most files are likely to be corrupted.
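
To give an idea of how signature-based (RAW) recovery works, here is a deliberately simplified sketch that searches a raw byte buffer for JPEG start and end markers. The `carve_jpegs` name is hypothetical. Real tools validate the internal structure of each format, and this sketch assumes the file is stored contiguously and contains no embedded images with their own end markers.

```python
JPEG_START = b"\xff\xd8\xff"   # start-of-image marker followed by a segment marker
JPEG_END = b"\xff\xd9"         # end-of-image marker

def carve_jpegs(image):
    """Yield (offset, data) for byte ranges that look like complete JPEG files."""
    pos = 0
    while True:
        start = image.find(JPEG_START, pos)
        if start == -1:
            return
        end = image.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            return
        end += len(JPEG_END)
        yield start, image[start:end]
        pos = end

disk = b"\x00" * 10 + b"\xff\xd8\xff\xe0" + b"abc" + b"\xff\xd9" + b"\x00" * 5
for offset, data in carve_jpegs(disk):
    print(offset, len(data))
```

The carved data carries no name or path, which matches the limitation described above: only the content can be identified by its signature.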

exFAT

As a successor to FAT/FAT32, this file system is very similar to it in structure and operation. exFAT also relies on the File Allocation Table, but the table is used to track cluster sequences for fragmented files only. Also, just one copy of it is maintained.

In addition, exFAT has a separate structure for managing cluster usage. Instead of doing this directly in the FAT entries, it employs an Allocation Bitmap. It is stored in the data region and indicates the status of each cluster – whether it’s occupied or available for new data. This approach helps exFAT to optimize data placement and reduce the extent of fragmentation.
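
A bitmap of this kind can be queried with a few lines of code. The sketch below assumes that the lowest bit of the first byte describes the first cluster of the data region and that this cluster is numbered 2; both the constant and the function names are illustrative.

```python
FIRST_CLUSTER = 2  # assumption: bit 0 of the bitmap describes cluster 2

def is_allocated(bitmap, cluster):
    """Return True if the bit for the given cluster is set in the bitmap."""
    index = cluster - FIRST_CLUSTER
    if index < 0 or index // 8 >= len(bitmap):
        raise IndexError("cluster outside the bitmap")
    return bool((bitmap[index // 8] >> (index % 8)) & 1)

def free_clusters(bitmap, total_clusters):
    """List all clusters whose bit is cleared."""
    return [c for c in range(FIRST_CLUSTER, FIRST_CLUSTER + total_clusters)
            if not is_allocated(bitmap, c)]

bitmap = bytes([0b00000101])     # clusters 2 and 4 are occupied
print(free_clusters(bitmap, 8))
```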

File deletion from exFAT

exFAT updates the Allocation Bitmap to mark the clusters used by the file as free for other data. Yet, the FAT table entries are not updated immediately and may still contain links to clusters that belong to the deleted file. The file’s content also stays on the storage until replaced by new files.

File recovery: When the FAT entries are still intact, the cluster sequence of a fragmented file can easily be reconstructed and used to retrieve the entire file. If they get overwritten, no information is available about the location of the file’s fragments. However, the RAW recovery method is more likely to yield accurate results in exFAT in view of its lower degree of fragmentation.

NTFS

The core component of NTFS is the Master File Table (MFT). It keeps detailed records for every file and folder stored within the file system. A Bitmap attribute of the MFT table marks which records in it are currently in use and which are free. An MFT record may store various file attributes, including location, name, size, the date/time of creation and last modification.

The data content of small files is stored directly within the MFT record. In contrast, larger files are stored outside the MFT, and the MFT records then contain pointers to their physical locations on the disk. Large file attributes may likewise be stored outside the MFT, and the file’s MFT record then keeps their addresses.

Directories in NTFS are represented by special files. Such files contain a list of entries for all files and subdirectories within the given directory, with references to their MFT records.

In addition to the Bitmap attribute within the MFT, NTFS maintains a separate Bitmap file for the entire file system. Using this file, it tracks which areas on the storage medium are occupied and which are free.

File deletion from NTFS

The file’s MFT record is not erased, but is marked as "unused", which means that it can be overwritten soon. The storage space occupied by the file’s content is marked as free in the Bitmap and can now be reused for other data. The file’s entry is also removed from the directory.

File recovery: With the file’s name, size, and storage location still available in the MFT record, data recovery software can accurately reconstruct the deleted file. Provided that the content hasn’t been overwritten, data recovery chances are almost 100%.
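
The logic of such a scan can be outlined with a toy model. The `MftRecord` class and its fields are hypothetical simplifications of a real MFT record, and `cluster_in_use` stands for a lookup in the volume Bitmap; a cluster reported as free may still have been overwritten and released again, so the result is only a list of candidates.

```python
from dataclasses import dataclass

@dataclass
class MftRecord:
    name: str
    size: int
    runs: list      # (start_cluster, cluster_count) pairs describing the content
    in_use: bool    # False once the record has been marked as "unused"

def recoverable(records, cluster_in_use):
    """List deleted records whose clusters have not been reallocated."""
    found = []
    for rec in records:
        if rec.in_use:
            continue
        clusters = [c for start, count in rec.runs
                    for c in range(start, start + count)]
        if not any(cluster_in_use(c) for c in clusters):
            found.append(rec)
    return found

occupied = {101}                       # cluster 101 now belongs to another file
records = [
    MftRecord("a.doc", 9000, [(100, 3)], in_use=False),
    MftRecord("b.jpg", 5000, [(200, 2)], in_use=False),
    MftRecord("c.txt", 100, [(300, 1)], in_use=True),
]
print([r.name for r in recoverable(records, lambda c: c in occupied)])
```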

Formatting of NTFS

NTFS generates a new Master File Table. This new MFT overwrites the starting portion of the previous MFT, but the rest of it remains intact.

File recovery: The information about the first 256 files is lost due to partial overwriting of the MFT. Such files can only be restored using the method of RAW recovery, without their original names, directories and other attributes. All files beyond this number can be recovered with a probability of up to 100%, unless their data content gets overwritten.

ReFS

Unlike its predecessors, ReFS organizes data using B+trees that operate similarly to databases. Such a tree consists of the root, internal nodes and leaves. Each node contains an ordered list of keys used to guide the search process and pointers that refer to the nodes of a lower level or the actual data in the leaves. B+trees represent nearly all elements in the file system, including the content of files and metadata.

The Directory is the main component of ReFS, also represented as a B+tree. It uses keys corresponding to folder object numbers, whereas files in it are stored as records rather than directory entries.

ReFS employs Copy-on-Write (CoW), ensuring that original file system entries are never modified directly. Instead, the data is copied and changes are written to new locations, preserving the original information.

File deletion from ReFS

As ReFS employs the Copy-on-Write (CoW) technique, it creates a new copy of the file’s metadata, makes the necessary changes to reflect the file's deletion and updates the storage structure only when the new metadata is successfully written.

File recovery: Thanks to Copy-on-Write (CoW), the original version of the metadata is preserved on the storage. As long as it hasn’t been overwritten by new information, the file can be recovered in its entirety, with a probability of up to 100%.
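
The effect of Copy-on-Write on recoverability can be demonstrated with a toy append-only store. The `CowStore` class is purely illustrative and does not reflect the actual ReFS structures; it only shows that committing a change writes a new copy and leaves the previous one in place.

```python
class CowStore:
    """Append-only block store: updates never overwrite an existing block."""

    def __init__(self):
        self.blocks = []    # every version ever written, in write order
        self.root = None    # index of the current metadata block

    def commit(self, metadata):
        self.blocks.append(dict(metadata))   # write the new copy elsewhere
        self.root = len(self.blocks) - 1     # then switch the root pointer

    def current(self):
        return self.blocks[self.root]

    def older_versions(self):
        return self.blocks[:self.root]

store = CowStore()
store.commit({"report.docx": [10, 11]})   # file exists, stored in blocks 10-11
store.commit({})                          # deletion: a new, empty version is written
print(store.current(), store.older_versions())
```

The current version no longer mentions the file, while the older version still lists its blocks until the space is reused.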

Hint: Please rely on the instruction if you need to recover data from the file systems of Windows.

File systems of macOS

All modern Mac computers, starting with macOS 10.14 (Mojave), use APFS as their default file system. This format is also implemented across Apple's entire product line, including devices running iOS, iPadOS, tvOS and watchOS. Although APFS is the current standard, Apple continues to support the older HFS+ file system, mainly for backward compatibility with legacy versions of macOS. In addition, Microsoft's exFAT file system is widely used on external storage devices that need to be accessed under both macOS and other operating systems.

It is important to emphasize that data recovery from the file systems of macOS is only possible until the files are overwritten.

HFS+

Storage space in HFS+ is divided into equally sized allocation blocks, which may be grouped into clumps to reduce fragmentation. The Allocation File tracks the status of each allocation block – whether it is free or occupied.

The content of files is organized using fork structures: a data fork contains the actual data, whereas all the additional information (metadata) resides in a resource fork. Each of these forks occupies some allocation blocks. A continuous range of allocation blocks assigned to a certain fork is called an extent. An extent, in its turn, is represented by a starting block and the number of blocks it occupies.

Most of the file system metadata is managed through special files that are structured as B-trees. Particularly significant is the Catalog File: it describes the directory hierarchy of the file system, keeps essential properties for each file and folder, including their names, and stores the first eight extents of the file's data and resource forks. If a fork has more extents, they are recorded in the Extents Overflow File. Any additional file properties, like extended metadata, are maintained in the Attributes File.
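
The division of a fork's extents between the two files can be pictured with a short sketch. The function names are hypothetical, and an extent is modelled simply as a (start block, block count) pair, as described earlier.

```python
INLINE_EXTENTS = 8  # number of extents kept in the Catalog File record, per the text

def split_extents(extents):
    """Split a fork's extents into catalog-resident and overflow parts."""
    return extents[:INLINE_EXTENTS], extents[INLINE_EXTENTS:]

def extent_byte_ranges(extents, block_size):
    """Convert (start_block, block_count) extents to (offset, length) in bytes."""
    return [(start * block_size, count * block_size) for start, count in extents]

# A heavily fragmented fork with ten extents of two blocks each
fork = [(i * 10, 2) for i in range(10)]
inline, overflow = split_extents(fork)
print(len(inline), len(overflow))
print(extent_byte_ranges(overflow, 4096))
```

For recovery purposes this means that a badly fragmented file depends on two structures at once: losing the overflow records leaves only the beginning of the file locatable.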

HFS+ supports hard links, allowing a single file to appear in multiple directories without duplication. The file’s data remains stored in one location on the disk, and multiple hard link entries in the Catalog File simply reference that same content.

The journal is used to record changes made to the file system. However, this journal has a limited size, so, once it becomes full, older entries are overwritten by new ones in a cyclical manner.
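
The cyclical overwriting of journal records can be modelled with a fixed-size queue. The capacity and the textual operation records below are arbitrary assumptions chosen for illustration; a real journal stores binary transaction blocks.

```python
from collections import deque

def replay(operations, capacity):
    """Return the records still present in a journal of the given capacity."""
    journal = deque(maxlen=capacity)
    for op in operations:
        journal.append(op)   # once the journal is full, the oldest record is dropped
    return list(journal)

ops = ["create a.txt", "delete a.txt", "create b.txt", "create c.txt"]
print(replay(ops, 3))
print(replay(ops, 2))
```

With the smaller capacity, the record about the deletion of a.txt has already been pushed out, which is why the usefulness of the journal declines as the file system keeps working.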

File deletion from HFS+

HFS+ updates the Catalog File by reorganizing the B-Tree and removing references to the deleted file. In case of a hard link, the reference to the file is deleted from the respective directory. The blocks occupied by the file are marked as free in the Allocation File and can be reused for other data. However, the actual content is not immediately erased. The information also stays in the journal for a certain period of time.

File recovery: The journal may contain information about the deleted file, but the likelihood depends on how much time has passed since deletion. If the journal records are overwritten, the method of RAW recovery can be used, though it is only effective for non-fragmented files and is unable to restore the initial file names, directories and other properties.

Formatting of HFS+

The Catalog File is reset to its default state, which wipes all records about previous files. However, the Journal and the actual data content remain unchanged.

File recovery: Some metadata can be restored with the help of the journal. The rest of the missing files will be reconstructed using the RAW recovery technique. The success will depend on the extent of file system fragmentation.

APFS

An APFS volume is housed within a Container, which can hold multiple file systems sharing the available storage space. All occupied and free storage blocks in the Container are tracked using a common Bitmap. However, each file system is responsible for managing its own directory hierarchy, file content and metadata.

The allocation of files and folders is organized as a B-tree, similar to the Catalog File in HFS+. Files are made up of extents, which specify the block where the content starts and its length in blocks. There is also a separate B-tree to manage extents within the file system.

Instead of modifying existing file system objects in place when changes are made, APFS creates a copy of the data and writes the new version to a different location on the storage. This approach is known as Copy-on-Write (CoW).

File deletion from APFS

APFS removes the references to the deleted file by wiping the corresponding nodes in the allocation B-tree.

File recovery: Older versions of the deleted file's metadata may still be available, offering a possibility to reconstruct the file's content. Nonetheless, APFS implements encryption as an integral part of its architecture, with this feature strongly encouraged and often enabled by default on Apple devices. When encrypted, the file system secures not only user data, but also critical metadata structures. This extensive use of encryption adds significant complexity to the recovery process.

Hint: Please rely on the instruction if you need to recover data from the file systems of macOS.

File systems of Linux

Linux is a versatile open-source project that comes in numerous versions called "distributions," each with its own peculiarities and configurations. And it is no big surprise that the file systems may also differ significantly across these distributions. In general, the core of all these operating systems – the Linux kernel – supports a wide range of storage formats. Yet, the most widely used Linux file systems are those belonging to the Ext family (Ext2, Ext3, Ext4), as well as XFS, Btrfs, F2FS, JFS and ReiserFS.

It must be pointed out that data recovery from these file systems is only possible as long as the original data remains on the storage and has not been overwritten.

XFS

XFS splits the volume into equal-sized regions called Allocation Groups. Each of them functions like an independent file system, managing its own storage space and structures.

Free space in XFS is controlled with the help of two B+trees that describe the same contiguous free areas: the first one is indexed by the starting block of each area, and the second one by the number of blocks in it. A similar mechanism is used for tracking the blocks allocated to files. Their content is stored in continuous runs of blocks, referred to as extents.
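
One way to picture the two free-space indexes is as two sorted views of the same set of free areas, one ordered by starting block and the other by length. The sketch below uses plain sorted lists instead of B+trees, and the function names are hypothetical.

```python
import bisect

free = [(100, 4), (300, 50), (20, 8)]                       # (start_block, block_count)
by_start = sorted(free)                                     # ordered by starting block
by_size = sorted((count, start) for start, count in free)   # ordered by length

def best_fit(by_size, wanted):
    """Smallest free area with at least `wanted` blocks, or None."""
    i = bisect.bisect_left(by_size, (wanted, 0))
    if i == len(by_size):
        return None
    count, start = by_size[i]
    return start, count

def free_area_at_or_after(by_start, block):
    """First free area starting at or after the given block, or None."""
    i = bisect.bisect_left(by_start, (block, 0))
    return by_start[i] if i < len(by_start) else None

print(best_fit(by_size, 5))
print(free_area_at_or_after(by_start, 50))
```

The size-ordered view answers "where is an area large enough for this file", while the position-ordered view answers "what is free near this location", which helps keep a file's blocks together.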

Every file and directory is represented by an inode, a special structure that contains their metadata, such as size, permissions, etc. For smaller files, an inode stores the information about extents allocated to the file. For larger or fragmented files, an inode points to a separate B+tree that keeps track of extents associated with the file. However, file names are not available in inodes. They reside in directory entries that map those names to their corresponding inodes. There is also a dedicated B+tree in each Allocation Group used to manage the allocation and deallocation of inodes.

XFS employs a journal for operations with metadata. All changes to it are recorded in the journal before being written to disk.

File deletion from XFS

The inode associated with the deleted file is removed from the B+tree in the Allocation Group and becomes available for reuse. The free block B+trees are updated to indicate the released space. The directory entry, which maps the file name to the corresponding inode, is erased. Yet, the extent information often remains intact.

Recovery of non-fragmented files: If the extent information has not been overwritten, the chances of recovering the file’s content are close to 100%. Recovering the file’s name is more challenging, since names are stored in the directory, which no longer references the deleted file. However, for recently deleted files, the journal may still contain metadata about the file and help in retrieving its correct name and directory.
Recovery of fragmented files: While the extents themselves may remain intact, the inode is no longer associated with the file's data, which complicates the recovery process. Without any information about the relationships between extents, it may be difficult to reconstruct the complete sequence.

Formatting of XFS

The B+trees responsible for managing space allocation are wiped. A new root directory is created, replacing the previous one.

Recovery of non-fragmented files: While the file system structures are reset, the actual data blocks may remain on the disk until they are overwritten. The chances of recovery are generally high for non-fragmented files.
Recovery of fragmented files: Since their data blocks are not stored consecutively, the prospects are lower as compared to non-fragmented files.

Hint: Please rely on the instruction if you need to recover data from the file systems of Linux.

Ext2

Ext2 uses blocks as the smallest unit of data storage. These blocks are then organized into Block Groups. Each Block Group has a Block Bitmap that keeps track of which blocks in it are free or occupied.

Inode structures are used to store metadata about all files and directories, including their sizes and locations of the blocks that hold actual data. The inodes belonging to each Block Group are kept in its Inode Table, and an Inode Bitmap records which inodes are allocated.

However, file names do not constitute part of metadata and are not stored directly in inodes. Instead, they reside in directory files, which are in fact just regular files that contain directory entries.

File deletion from Ext2

Ext2 marks the inode that describes the deleted file as free in the Inode Bitmap. The blocks used to store its data are also marked as free in the Block Bitmap, becoming available for reuse. The file’s name is removed from the directory entry, destroying the link between the name and inode number.

Recovery of non-fragmented files: The inode still keeps important information about the file, including its size and locations of data blocks. Provided that the content has not been overwritten, the chances of recovering the file are quite high. However, since the file’s name is not stored in the inode and the reference to the inode from the directory is missing, the name of the file gets permanently lost.
Recovery of fragmented files: The information needed to locate the blocks of a file can be found in the inode, so the chances of recovery are more or less the same as for non-fragmented files, although fragmentation increases the risk of partial overwriting.

Formatting of Ext2

The file system is reset, which erases the content of all Block Groups, including the inodes.

Recovery of non-fragmented files: All vital structures describing the file system are missing, so only the RAW recovery method can be applied to attempt recovery. The initial file names and directories will be lost in any case.
Recovery of fragmented files: Without the proper file system metadata, it is difficult to piece the fragments of such files back together. Therefore, fragmented files are likely to be recovered in a corrupted state.

Ext3/Ext4

Ext3 expands on Ext2 by adding a journal file. This journal tracks all changes before they're committed to the file system. So, when any modification is made, it is first recorded in the journal, which improves the file system reliability.

Ext4 builds on Ext3 by introducing extents. Extents provide a more efficient mechanism of data placement compared to the block-based one used by Ext2 and Ext3. They allow allocating larger areas of continuous space, described by the address of the starting block and the total number of blocks in the extent. For smaller files, extents are stored directly in the file’s inode. If a file has more than four extents, Ext4 keeps them in a separate hierarchical B+tree structure.

Another feature of Ext4 is called delayed allocation, which also helps to minimize fragmentation. Instead of writing the data right away, Ext4 accumulates it in memory and allocates space for it only when more data is ready to be written or the file is closed.

File deletion from Ext3/Ext4

Ext3/Ext4 logs the operation by creating a record in the journal. After that, the file’s inode is flagged as available for reuse in the Inode Bitmap. The extents used to store its content are also marked as free. The directory entry linking the file’s name to the inode isn’t entirely erased, but the order for directory reading is changed.

Recovery of non-fragmented files: The link between the file’s name and inode is broken, but the journal may still keep metadata about recently deleted files. If the journal contains records related to the file, they can be analyzed to recover both the file’s content and its name. The quality of the recovery result will depend on how long the file system has been active after deletion.
Recovery of fragmented files: Such files in Ext3/Ext4 have lower recovery chances, since the scattered blocks/extents are much harder to locate and put together. However, the journal may improve the possibility for recently deleted files, enabling their retrieval, even with the original names.

Formatting of Ext3/Ext4

This operation involves clearing all Block Groups and deleting the inodes. The journal is typically reset, losing all previous records. However, depending on the specific driver, it may still contain information about recently created or modified files.

Recovery of non-fragmented files: The method of RAW recovery usually allows recovering intact files. However, in most cases, the initial file names cannot be retrieved. Their recovery is possible only for very recent files and provided that the journal has not been cleared as part of formatting.
Recovery of fragmented files: The chances of success are low for such files due to the scattered placement of their content.

ReiserFS

A ReiserFS volume is divided into blocks of a fixed size that serve as a basic unit of storage. A dedicated bitmap tracks which file system blocks are in use and which are free.

An S+tree is used by ReiserFS to organize all files, directories and metadata. It consists of four types of items: indirect items, direct items, directory items and stat items. Direct items contain actual data; indirect items point to the locations of data blocks; directory items represent directory entries; stat items contain metadata details about files and directories. Every item in the S+tree has a key that uniquely identifies it.

ReiserFS reduces wasted space with the help of a special tail-packing technique. When files or file fragments are smaller than a full block, they are packed together and stored in the unused portion of a block, which improves storage efficiency.

Instead of writing changes directly to the S+tree, ReiserFS first logs them in the journal. After that, the changed blocks can be copied from the journal to actual locations on disk.

File deletion from ReiserFS

ReiserFS updates the S+tree by removing the nodes that correspond to the deleted file. The blocks used by the file are marked as free in the bitmap.

Recovery of non-fragmented files: ReiserFS keeps copies of the S+tree and logs changes to the file system in its journal, so older versions of the S+tree nodes associated with the deleted file may still exist in the file system. In this case, the recovery chances for this file, including its name, are up to 100%.
Recovery of fragmented files: Due to the specifics of ReiserFS architecture, the recovery prospects for fragmented files coincide with those for non-fragmented files.

Formatting of ReiserFS

ReiserFS creates a new S+tree, which overwrites the existing structure.

Recovery of non-fragmented files: The file system keeps copies of the S+tree at various stages, providing a strong possibility to recover the previous S+tree and retrieve the files with their original names. However, the chances for recovery decrease if the volume was full before formatting.
Recovery of fragmented files: The recovery chances for fragmented files do not differ from those for non-fragmented files.

JFS (JFS2)

JFS is divided into multiple regions known as Allocation Groups. Allocation Groups consist of data blocks and metadata blocks, relying on bitmaps to track the status of these blocks.

FileSets are used to organize the logical structure of files and directories within the file system. Every file and directory in JFS is associated with an inode, which not only describes it, but also points to the location where its contents are placed. Small directories are stored within their inodes, whereas larger directories are represented by separate B+tree structures.

The actual file’s data is organized as sequences of contiguous blocks, referred to as extents. All extents are indexed by a dedicated B+tree.

Two B+trees are also used to manage free space: one tracks the starting blocks of free extents, and the other monitors the number of available extents.

To ensure data consistency, JFS creates a dedicated log area and records all changes to its metadata in the journal.

File deletion from JFS

The file system updates the B+tree that tracks free space, marks the file's inode as free and then rebuilds the directory structure to reflect the file’s deletion.

Recovery of non-fragmented files: As long as the inode belonging to the deleted file is not overwritten, the chances of file recovery often approach 100%. However, recovery of its original name is unlikely, as the link between the inode and its name in the directory is missing.
Recovery of fragmented files: The prospects are the same as for non-fragmented files.

Formatting of JFS

JFS creates a new B+tree. It starts small and grows as the file system is used.

Recovery of non-fragmented files: The chances of recovery are relatively high, especially due to the small size of the new B+tree. However, as new data is written to the disk, the information about old files may be overwritten.
Recovery of fragmented files: The likelihood is generally comparable to non-fragmented files.

Btrfs

Btrfs relies entirely on B-trees to manage its data and structures. Each B-tree serves a specific purpose, as explained below.

The file system is able to span across multiple devices, without the need for any additional storage technologies. The space from these devices is combined into a single logical pool, and each block in this pool is assigned a virtual address. Such virtual addresses are used throughout the file system, instead of the real physical addresses. The Chunk B-tree tracks the mapping between virtual addresses and actual physical locations. It also records which devices constitute a part of the storage pool. On the other hand, the Device B-tree maps physical blocks on the devices to their virtual addresses.
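
The translation from virtual to physical addresses can be sketched as a lookup in a sorted table of chunks. The table contents and the `to_physical` name are illustrative assumptions, and the sketch ignores the fact that a real chunk may be mirrored or striped across several devices.

```python
import bisect

# (virtual_start, length, device_id, physical_start), illustrative values only
chunks = sorted([
    (0,    1000, 1, 5000),
    (1000, 1000, 2, 0),
    (2000, 500,  1, 9000),
])
starts = [c[0] for c in chunks]

def to_physical(virtual):
    """Translate a virtual address into (device_id, physical_address)."""
    i = bisect.bisect_right(starts, virtual) - 1
    if i < 0:
        raise KeyError(virtual)
    v_start, length, device, p_start = chunks[i]
    if virtual >= v_start + length:
        raise KeyError(virtual)      # address falls outside every chunk
    return device, p_start + (virtual - v_start)

print(to_physical(1200))
```

Because every other structure refers to data by virtual address, a recovery tool needs this mapping before any file content can be located on the physical devices.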

All information about files and directories is kept in the File System B-tree. Small files are stored directly in this B-tree as extent items. If a file is larger, its content is stored outside the File System B-tree, whereas extent items in this tree point to all extents comprising the file (continuous regions of occupied storage space). Directory items in this B-tree store the content of directories, including file names and references to their corresponding inode items. Inode items store metadata, including sizes, permissions and other attributes of files.

The placement of extents in Btrfs is highly dynamic. They are assigned according to the needs of various tasks and distributed across the available space in a non-sequential fashion. A separate Extent B-tree is used to manage the allocation of all extents within the file system.

When any data or metadata is modified, Btrfs writes it to a new location rather than overwriting the existing information. This Copy-on-Write (CoW) principle helps to maintain file system integrity and allows for easier recovery after crashes.

File deletion from Btrfs

Btrfs rebuilds the File System B-tree to remove the nodes associated with the deleted file, including its inodes, directory entries and data content. The Extent B-tree is also updated to release the extents allocated to this file. However, due to the Copy-on-Write nature of this file system, references to the deleted file’s data and metadata remain in older copies.

File recovery: By analyzing the older copies, it's possible to find both data and metadata of the deleted file. In most cases, data recovery can be successful. However, when mass deletion occurs, the rebuilding of storage allocation complicates recovery due to the non-linear nature of data distribution.

Formatting of Btrfs

Btrfs resets its primary metadata structures, including the File System and the Extent B-trees.

File recovery: Old copies of data and metadata may still exist on the storage. However, the storage allocation is reset, making it difficult to locate references to this information. Combined with non-sequential allocation, this creates challenges for data recovery.

F2FS

F2FS divides the entire storage space into segments of a fixed size. These segments are then organized into sections, while several sections make up a zone.

Data placement in F2FS is controlled with the help of nodes. These may fall under three types: direct nodes store the addresses of actual data blocks; indirect nodes hold links to blocks in other nodes; inodes contain metadata for files and directories, including names, sizes and other attributes. The mapping of these nodes to their physical locations on the storage is managed through the Node Address Table (NAT).

The actual content of files and directories is stored in the Main Area. It is divided into sections, separating data blocks from node blocks. The Segment Information Table (SIT) tracks the status of each block. Blocks are labeled as valid when they are occupied and invalid when they contain deleted data. The Segment Summary Area (SSA) records which blocks belong to which node.

F2FS has its own version of directory entries, referred to as dentries. Dentries map file names to the inodes that contain the rest of the file’s metadata.

When the amount of free space in the file system is insufficient for new data, F2FS performs cleaning in the background, usually when the system is idle. Victim segments may be selected based on the number of their used blocks or age.
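
The two selection criteria mentioned above can be illustrated with a short sketch. It is a simplified model rather than the actual F2FS code: the Segment class, the sample values and the segment size are assumptions, and the age-based score uses the textbook cost-benefit formula for log-structured file systems, (1 - u) * age / (1 + u), where u is the share of valid blocks.

```python
from dataclasses import dataclass

BLOCKS_PER_SEGMENT = 512  # assumed segment size, for illustration only


@dataclass
class Segment:
    number: int
    valid_blocks: int  # blocks that still hold live data
    age: int           # time since last modification, arbitrary units


def pick_greedy(segments):
    """Pick the segment with the fewest valid blocks (least data to move)."""
    return min(segments, key=lambda s: s.valid_blocks)


def pick_cost_benefit(segments):
    """Pick the segment with the highest (1 - u) * age / (1 + u) score."""
    def score(s):
        u = s.valid_blocks / BLOCKS_PER_SEGMENT
        return (1 - u) * s.age / (1 + u)
    return max(segments, key=score)


segments = [
    Segment(0, valid_blocks=100, age=5),
    Segment(1, valid_blocks=200, age=500),
    Segment(2, valid_blocks=500, age=900),
]
print(pick_greedy(segments).number)        # segment 0: fewest valid blocks
print(pick_cost_benefit(segments).number)  # segment 1: old and mostly empty
```

For data recovery, the relevant point is that whichever segment is picked as a victim gets reused, so any deleted content it still held is lost.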

The file system consistency is maintained with the help of Checkpoint blocks. Such blocks store information about the state of crucial file system elements at specific moments in time, acting as recovery points in the event of a crash.

File deletion from F2FS

F2FS updates the NAT and SIT tables to indicate that the file's blocks are no longer in use. The changes are kept in memory until a new Checkpoint is created, capturing the current file system state. The actual data content remains in place until eventually cleaned.

File recovery: The most recent Checkpoint can be used to locate the file's nodes and data blocks. The file can be restored with high chances of success, as long as it hasn’t been wiped during cleaning.

File systems of BSD, Solaris, Unix

The Unix family of operating systems, which includes variants like BSD and Solaris, has historically relied on the UFS (Unix File System). Over time, UFS was upgraded to UFS2, offering improvements like support for larger file sizes and better overall performance. Later on, a next-generation file system called ZFS was developed for Solaris. Since then, ZFS has been adopted by other operating systems, like FreeBSD, and is particularly favored in environments that require modern storage features, high performance and reliability.

It must be emphasized that data recovery from the file systems of BSD, Solaris and Unix is possible only as long as the original information is present on the storage and has not been overwritten.

UFS/UFS2

A UFS volume consists of one or more Cylinder Groups. Each Cylinder Group has its own set of inodes to store the information about files and data blocks to place their actual content. A bitmap is used to track which blocks and inodes in the Cylinder Group are occupied.

A file in UFS consists of data blocks and an inode. Inodes keep the metadata of files, such as sizes, permissions, etc., and direct pointers to the first 12 data blocks. If a file is larger, its inode points to an indirect block, which, in turn, contains addresses to further data blocks. However, inodes do not include file names – these are stored in directories.
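
The addressing scheme can be sketched as follows. This is a minimal illustration that models only the 12 direct pointers and one indirect block mentioned above; the function names, the block size and the pointer size are assumed values, not parameters read from a real volume.

```python
DIRECT_POINTERS = 12  # the inode addresses the first 12 data blocks itself


def locate_block(index, block_size=32768, pointer_size=8):
    """Tell where the address of the file's logical block `index` is kept.

    block_size and pointer_size are hypothetical example values.
    Only a single indirect block is modeled here.
    """
    if index < 0:
        raise ValueError("block index must be non-negative")
    if index < DIRECT_POINTERS:
        return ("inode", index)
    pointers_per_block = block_size // pointer_size
    slot = index - DIRECT_POINTERS
    if slot < pointers_per_block:
        return ("indirect", slot)
    raise ValueError("deeper levels of indirection are not modeled")


def direct_capacity(block_size=32768):
    """Bytes of file content reachable through the direct pointers alone."""
    return DIRECT_POINTERS * block_size


print(locate_block(5))    # address stored in the inode itself
print(locate_block(12))   # first address stored in the indirect block
print(direct_capacity())  # 393216 bytes with the assumed block size
```

Under these assumptions, the first 393216 bytes of a file are described solely by its inode, which is why the state of the inode matters so much for recovery.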

Directories in UFS are presented as lists of entries. Each directory entry stores the name of a file and the corresponding inode number. A single file is usually associated with a single inode. However, in case of a hard link, the same file can have multiple names, with different directory entries pointing to the same inode. The inode then keeps a count of how many hard links refer to it.
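
The relationship between names, inodes and link counts can be modeled in a few lines. The Volume class below is a hypothetical in-memory simplification, not the on-disk format: a single flat directory maps names to inode numbers, and each inode keeps only its link count.

```python
class Volume:
    """Toy model: one directory and a link counter per inode."""

    def __init__(self):
        self.directory = {}   # file name -> inode number
        self.link_count = {}  # inode number -> number of names pointing to it
        self._next_inode = 1

    def create(self, name):
        inode = self._next_inode
        self._next_inode += 1
        self.directory[name] = inode
        self.link_count[inode] = 1
        return inode

    def link(self, existing_name, new_name):
        """Add a hard link: a second name for the same inode."""
        inode = self.directory[existing_name]
        self.directory[new_name] = inode
        self.link_count[inode] += 1

    def unlink(self, name):
        """Remove one name; release the inode when no names are left."""
        inode = self.directory.pop(name)
        self.link_count[inode] -= 1
        if self.link_count[inode] == 0:
            del self.link_count[inode]


vol = Volume()
ino = vol.create("report.txt")
vol.link("report.txt", "copy.txt")
vol.unlink("report.txt")
print(vol.directory)        # the inode is still reachable as "copy.txt"
print(vol.link_count[ino])  # one remaining link
```

In this model, removing one of several names leaves the file fully intact; the inode is released only when the last name is removed.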

File deletion from UFS

UFS clears the file’s inode that describes its size and locations of the first 12 data blocks. The bitmap is updated to mark the respective data blocks and inodes as free. The directory entry that links the file’s name to its inode is deleted as well.

Recovery of non-fragmented files: The file’s inode is missing, so the information about the file’s size and the locations of its first 12 data blocks is not available. The link between the file’s name and its inode is also permanently lost. Using the RAW recovery method, it is possible to recover a non-fragmented file, though without the original name and directory. Nevertheless, you will rarely come across non-fragmented files in UFS due to the specifics of its Soft Updates algorithm.
Recovery of fragmented files: There is no information about which data blocks belong to the deleted file. Therefore, the chances of recovery are very low in case of fragmentation.
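
The idea behind RAW recovery can be shown with a minimal sketch: the storage is scanned for known byte signatures instead of relying on file system metadata. The example below is an assumption-laden illustration, not how any particular recovery product works. It looks only for JPEG start and end markers, and the function name is hypothetical.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # bytes that open a JPEG file
JPEG_FOOTER = b"\xff\xd9"      # bytes that close a JPEG file


def carve_jpegs(image: bytes):
    """Return (offset, content) pairs for header..footer runs in `image`.

    The sketch assumes each file is stored contiguously; a fragmented
    file would be cut off or mixed with foreign data.
    """
    found = []
    pos = 0
    while True:
        start = image.find(JPEG_HEADER, pos)
        if start == -1:
            break
        end = image.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break
        end += len(JPEG_FOOTER)
        found.append((start, image[start:end]))
        pos = end
    return found


# Synthetic "disk": zeros, one fake JPEG, more zeros.
disk = bytes(100) + b"\xff\xd8\xff\xe0" + b"pixels" + b"\xff\xd9" + bytes(50)
for offset, content in carve_jpegs(disk):
    print(offset, len(content))
```

The result contains the file's content and its position on the storage, but no name or directory, because those were kept in the structures that no longer exist.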

Formatting of UFS

The operation resets Cylinder Groups, which wipes the existing inodes and directory entries. Consequently, all references to files and their names are destroyed. At the same time, the data blocks remain intact until overwritten.

Recovery of non-fragmented files: Non-fragmented files can be restored using the RAW recovery method, but will lose their initial names and directories.
Recovery of fragmented files: Such files are nearly impossible to reconstruct due to the lack of inodes that could connect their fragments.

ZFS

ZFS differs from most file systems in its ability to combine multiple physical drives into a common storage pool. A pool contains one or more virtual devices, referred to as vdevs. Each vdev has a label with an Uberblock that serves as the pool’s primary metadata structure.

ZFS allocates storage in blocks of variable sizes. The blocks are organized as objects of different types. Each object is described by a dnode, a structure that records such details as the object's type, size, etc. and contains up to three block pointers. A pointer can either refer directly to a block with actual data (leaf block) or to an indirect block that links to another block.

All related objects are grouped into object sets. Each object within a set is assigned a unique identifier called an object number. The metadata for these objects is also organized as an object and is referenced through a special structure known as a metadnode. A dedicated object set called the Meta Object Set (MOS) maintains metadata for the entire storage pool.

ZFS follows the Copy-on-Write (CoW) principle when writing data. Instead of overwriting the existing blocks, it allocates new blocks in a free storage space area to save the updated data. Once the new data is successfully written, the file system metadata is updated to point to these newly written blocks, while the old blocks also remain intact.
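
The Copy-on-Write principle can be illustrated with a toy model. The CowStore class below is a hypothetical simplification that bears no resemblance to the real ZFS on-disk layout: blocks are only ever appended, and a pointer table records which block is current for each file.

```python
class CowStore:
    """Toy Copy-on-Write store: updates never overwrite existing blocks."""

    def __init__(self):
        self.blocks = []    # physical blocks, append-only in this model
        self.pointers = {}  # file name -> index of its current block

    def write(self, name, data):
        # Write the new data to a fresh block first...
        self.blocks.append(data)
        # ...then switch the pointer to it. The old block is left in place.
        self.pointers[name] = len(self.blocks) - 1

    def read(self, name):
        return self.blocks[self.pointers[name]]


store = CowStore()
store.write("notes.txt", b"version 1")
store.write("notes.txt", b"version 2")
print(store.read("notes.txt"))  # the current version
print(store.blocks[0])          # the previous version is still on "disk"
```

In a real file system, the space of old blocks is eventually reused, so the previous versions survive only until their blocks are allocated again.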

File deletion from ZFS

The dnode describing the deleted file and the blocks containing its data are unlinked. The object number associated with the file is marked as available for reuse. The reference to the file is removed from the directory object. A new Uberblock is written to reflect the updated state of the file system. However, all these modifications are performed in a Copy-on-Write (CoW) manner, so the previous versions of the affected blocks are not overwritten immediately.

File recovery: Depending on the file system usage and how full the pool is, older versions of the file’s data and metadata may exist in the storage pool for quite a long time and be used for complete file recovery with the correct name. However, recovery efforts may be complicated by the fact that data is distributed across multiple drives, in blocks of varying sizes. Without the pool metadata (the Uberblock, Meta Object Set, etc.), it is nearly impossible to assemble the storage and reconstruct its layout. Therefore, the success will also depend on the integrity of this metadata.

Hint: Please rely on the instruction if you need to recover data from the file systems of Unix, Solaris or BSD.

For the best chances of data recovery from the above-described file systems, UFS Explorer and Recovery Explorer are recommended as reliable and effective software solutions with extensive compatibility. These programs run under different operating systems and excel at recovering deleted, lost or inaccessible data from various devices formatted with file systems used in Windows, Linux, macOS, Unix, BSD and Solaris.

Obstacles to data recovery induced by TRIM

Most modern SSDs and a growing number of new SMR drives are equipped with a translator that supports TRIM. When TRIM is enabled, the file system sends a TRIM command to the storage device, informing it that specific blocks are no longer in use and should be prepared for erasure. These blocks are flagged as free in the device's internal mapping tables and await physical wiping, known as garbage collection. Such measures are aimed at optimizing the drive's performance and help to extend its lifespan. However, they also present significant, often insurmountable obstacles to recovering deleted or formatted files.

Once TRIM is executed, the affected blocks are treated by the device as empty. Although the data may still exist, the references to these blocks are effectively removed at the drive’s internal mapping level. This makes the blocks invisible not only to the file system but also to any data recovery software. And once garbage collection is performed, the blocks are wiped at the hardware level, leaving no possibility for data recovery by any means.

Garbage collection may not start immediately after the TRIM command, and it’s a relatively slow process, with the speed varying depending on the drive’s vendor, model and type of memory. In contrast, updating the translator is a quick operation, often occurring almost instantly. After this update, the drive will start returning zeros in the trimmed areas, but the actual data may not be erased yet.
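
The difference between the quick translator update and the slower garbage collection can be modeled in a few lines. The Translator class below is a hypothetical simplification of a drive's internals; the names, the block size and the data structures are assumptions made for the illustration.

```python
class Translator:
    """Toy model of a drive's logical-to-physical mapping layer."""

    BLOCK = 4  # bytes per block, tiny on purpose

    def __init__(self):
        self.flash = {}     # physical page -> stored bytes
        self.mapping = {}   # logical block address -> physical page
        self.stale = set()  # pages waiting for garbage collection
        self._next_page = 0

    def write(self, lba, data):
        page = self._next_page
        self._next_page += 1
        self.flash[page] = data
        old = self.mapping.get(lba)
        if old is not None:
            self.stale.add(old)
        self.mapping[lba] = page

    def read(self, lba):
        page = self.mapping.get(lba)
        if page is None:
            return bytes(self.BLOCK)  # unmapped blocks read back as zeros
        return self.flash[page]

    def trim(self, lba):
        """Quick step: drop the mapping, leave the page content alone."""
        page = self.mapping.pop(lba, None)
        if page is not None:
            self.stale.add(page)

    def garbage_collect(self):
        """Slow step: physically wipe the pages that are no longer mapped."""
        for page in self.stale:
            del self.flash[page]
        self.stale.clear()


drive = Translator()
drive.write(7, b"DATA")
drive.trim(7)
print(drive.read(7))                    # zeros through the normal interface
print(b"DATA" in drive.flash.values())  # content still present in memory
drive.garbage_collect()
print(b"DATA" in drive.flash.values())  # content is gone after wiping
```

In this model, the window between trim() and garbage_collect() corresponds to the state in which the data is invisible through the regular interface yet still physically present in the memory chips.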

File deletion

The file system marks the space as free for reuse, and the TRIM command is triggered shortly afterward. The translator is updated, marking the deleted data as no longer needed. The blocks it occupies are scheduled for garbage collection. The cleaning process itself may begin immediately or be delayed by a period from a few seconds to a week.

File recovery: There is a very short time window, usually about 5 seconds, to power off the device before the translator gets updated. Once TRIM is executed, standard recovery methods will be unable to access the data. On the other hand, data recovery professionals may still be able to retrieve it using specialized equipment to bypass the drive’s internal translator layer. They can put the drive into debug mode and copy the raw contents of its memory chips. Success is not guaranteed in this case and will depend on many technical factors. However, if the drive stays on for some time, the deleted data will be gradually destroyed by garbage collection.

Formatting

The operation resets the file system structure. Meanwhile, the TRIM command is activated, informing the drive that the previously occupied blocks are now "free". The drive’s translator updates its internal mapping. Garbage collection may start clearing the data in those blocks right away or be postponed, depending on the drive's workload.

File recovery: As with deletion, if the drive is disconnected before garbage collection occurs, data recovery professionals may have a chance to retrieve raw data from the blocks marked as unused by the translator. However, once garbage collection is complete, the wiped data is gone for good and cannot be recovered.

Metadata damage

In the event of data or metadata corruption, the file system is unable to identify and manage the affected blocks correctly, failing to issue the TRIM command. Without this instruction, the storage device can’t update the status for those blocks. Consequently, the blocks aren't scheduled for garbage collection and remain untouched until the drive receives valid instructions.

File recovery: Since TRIM is not executed, the data within the blocks remains physically intact, although it is no longer accessible through the operating system. Data recovery software may be able to access and reconstruct the missing files. The likelihood of success will depend on the extent of the damage and the specifics of the file system in question.

Last update: December 04, 2024
