
Understanding file systems

    

Today's computer market offers many ways to store large amounts of personal or corporate information in digital form. Storage devices include internal and external hard drives, USB flash drives, memory cards for photo and video cameras, complex RAID systems and so on. Documents, presentations, pictures, music, video, databases and email messages are stored as files, which may take up considerable space.

The following article describes in detail how information is stored on a storage device.


What is a file system?

Any computer file is stored on a storage device of a given capacity. Each storage device is in effect a linear space for reading, or for both reading and writing, digital information. Each byte on the storage has its own offset from the start of the storage (its address) and is referenced by this address. A storage can be pictured as a grid of numbered cells, each cell holding a single byte. Any file saved to the storage occupies some of these cells.

Generally, computer storage devices use a pair of values, a sector number and an in-sector offset, to reference any byte of information. A sector is a group of bytes (usually 512 bytes), the minimum addressable unit of the physical storage. For example, byte 1040 on a hard disk is referenced as sector #3 (counting sectors from 1) with an in-sector offset of 16 bytes ([sector]+[sector]+[16 bytes]). This scheme optimizes storage addressing and allows smaller numbers to reference any portion of information on the storage.
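The addressing scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `locate` is hypothetical, the 512-byte sector size is the common value mentioned in the text, and sectors are counted from 1 to match the example.

```python
SECTOR_SIZE = 512  # bytes; the usual sector size mentioned in the text


def locate(byte_address, sector_size=SECTOR_SIZE):
    """Split a linear byte address into (sector number, in-sector offset).

    Sectors are numbered from 1 here, as in the example in the text.
    """
    full_sectors, offset = divmod(byte_address, sector_size)
    return full_sectors + 1, offset


print(locate(1040))  # (3, 16): two full sectors are skipped, then 16 bytes
```

Many real interfaces number sectors from 0 instead; in that convention the same byte would sit in sector 2 with the same offset.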

To omit the second part of the address (the in-sector offset), files are usually stored starting at the beginning of a sector and occupy whole sectors. For example, a 10-byte file occupies a whole sector, a 512-byte file also occupies one whole sector, and a 514-byte file occupies two whole sectors.

Each file is stored in “unused” sectors and can later be read given its known position and size. But how do we know which sectors are used and which are not? Where are the size, position and name of the file stored? A file system answers these questions.
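The whole-sector rule is a ceiling division. The sketch below reproduces the three file sizes from the text and also reports the unused tail of the last sector; the function names are hypothetical and a 512-byte sector is assumed.

```python
def sectors_needed(file_size, sector_size=512):
    """Number of whole sectors a file of file_size bytes occupies."""
    return (file_size + sector_size - 1) // sector_size  # ceiling division


def unused_bytes(file_size, sector_size=512):
    """Bytes left unused at the end of the file's last sector."""
    return sectors_needed(file_size, sector_size) * sector_size - file_size


for size in (10, 512, 514):
    print(size, sectors_needed(size), unused_bytes(size))
```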

As a whole, a file system is a structured data representation together with a set of metadata describing the stored data. A file system may serve the whole storage, or it may belong to an isolated segment of the storage, a disk partition. Usually, a file system operates on blocks rather than sectors. File system blocks are groups of sectors that optimize storage addressing. Modern file systems generally use block sizes from 1 to 128 sectors (512-65536 bytes). Files are usually stored starting at the beginning of a block and occupy entire blocks.

Intensive write and delete operations cause file system fragmentation: files are no longer stored as whole units but are divided into fragments. For example, suppose a storage is entirely filled with files of about 4 blocks each (e.g. a collection of pictures). A user wants to store a file that takes 8 blocks and therefore deletes the first and the last files. This frees 8 blocks, but the first free segment is near the start of the storage while the second is near its end. In this case the 8-block file is split into two parts (4 blocks each) that fill the free-space "holes". The information that both fragments belong to a single file is stored in the file system.

In addition to user files, the file system also contains its own parameters (such as the block size), file descriptors (including file size, file location, its fragments etc.), file names and the directory hierarchy. It may also store security information, extended attributes and other parameters.

To meet diverse requirements such as performance, stability and reliability, many file systems have been developed, each serving particular user purposes.
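The example above can be replayed with a toy block map. This is a simplified sketch, not the allocator of any real file system: the storage is a Python list with one entry per block, `None` marks a free block, and the helper names `free_runs` and `store` are hypothetical.

```python
def free_runs(block_map):
    """Return (start, length) for every run of consecutive free blocks."""
    runs, start = [], None
    for i, owner in enumerate(block_map):
        if owner is None:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(block_map) - start))
    return runs


def store(block_map, name, size):
    """Place a file into free blocks, splitting it across holes if needed.

    Returns the list of fragments as (first block, block count) pairs.
    """
    if size <= 0 or block_map.count(None) < size:
        raise ValueError("invalid size or not enough free blocks")
    fragments = []
    for start, length in free_runs(block_map):
        take = min(length, size)
        block_map[start:start + take] = [name] * take
        fragments.append((start, take))
        size -= take
        if size == 0:
            break
    return fragments


# Five 4-block files fill a 20-block storage completely.
disk = [name for name in "ABCDE" for _ in range(4)]
# The user deletes the first and the last file.
disk = [None if owner in ("A", "E") else owner for owner in disk]
# The new 8-block file has to be split between the two holes.
fragments = store(disk, "F", 8)
print(fragments)  # [(0, 4), (16, 4)]
```

The returned fragment list plays the role of the information the file system keeps to tie both parts to a single file.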



Windows file systems

Microsoft Windows uses two major file systems: FAT, inherited from old DOS together with its later extension FAT32, and the widely used NTFS. The recently released ReFS was developed by Microsoft as a new-generation file system for Windows Server 2012 (the server counterpart of Windows 8).

FAT:
FAT (File Allocation Table) is one of the simplest types of file systems. It consists of a file system descriptor sector (boot sector or superblock), a file system block allocation table (referred to as the File Allocation Table) and plain storage space for files and folders. Files on FAT are stored in directories. Each directory is an array of 32-byte records, each describing a file or extended file attributes (e.g. a long file name). A file record refers to the first block of the file. Every subsequent block can be found through the block allocation table by using it as a linked list.

The block allocation table contains an array of block descriptors. A zero value indicates that the block is not used; a non-zero value is either the number of the next block of the file or a special value marking the end of the file.

The numbers in FAT12, FAT16 and FAT32 stand for the number of bits used to enumerate a file system block. This means that FAT12 can use up to 4096 different block references, while FAT16 and FAT32 can use up to 65536 and 4294967296 respectively. The actual maximum number of blocks is lower and depends on the implementation of the file system driver.
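Reading a file through the allocation table amounts to following a linked list. The sketch below is a toy model and does not reproduce the on-disk FAT layout: the table is a plain Python list, `END_OF_FILE` is a stand-in chosen for this example (real FAT variants use reserved high values), and `file_blocks` is a hypothetical helper name.

```python
FREE = 0           # zero means the block is not used
END_OF_FILE = -1   # stand-in for the special end-of-file value


def file_blocks(fat, first_block):
    """Follow the allocation table from a file's first block to its end."""
    chain, seen = [], set()
    block = first_block
    while block != END_OF_FILE:
        if block in seen or fat[block] == FREE:
            raise ValueError(f"corrupted chain at block {block}")
        seen.add(block)
        chain.append(block)
        block = fat[block]  # the table entry names the next block
    return chain


# A 10-block table; one file occupies blocks 2 -> 5 -> 3.
fat = [FREE] * 10
fat[2] = 5
fat[5] = 3
fat[3] = END_OF_FILE

print(file_blocks(fat, 2))  # [2, 5, 3]
```

The starting block (2 here) is what the directory record supplies; everything else comes from the table.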
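The reference counts quoted above are simply powers of two, as this short check shows (`max_references` is a hypothetical helper name). These are theoretical upper bounds only; for instance, FAT32 uses only 28 of its 32 bits for the block number, which is one reason the real limit is lower.

```python
def max_references(bits):
    """Distinct block references representable with the given bit width."""
    return 2 ** bits


for bits in (12, 16, 32):
    print(f"FAT{bits}: up to {max_references(bits)} block references")
```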

FAT12 was used for old floppy disks. FAT16 (or simply FAT) and FAT32 are widely used for flash memory cards and USB flash sticks. The system is supported by mobile phones, digital cameras and other portable devices.

FAT or FAT32 is used on Windows-compatible external storage devices and on disk partitions smaller than 2GB (for FAT) or 32GB (for FAT32). Windows cannot create a FAT32 file system larger than 32GB (Linux, however, supports FAT32 up to 2TB).

NTFS:
NTFS (New Technology File System) was introduced in Windows NT and is currently the major file system for Windows. This is the default file system for disk partitions and the only file system that supports disk partitions over 32GB. The file system is quite extensible and supports many file properties, including access control, encryption etc. Each file on NTFS is stored as a file descriptor in the Master File Table plus the file content. The Master File Table contains all information about the file: size, allocation, name etc. The first and the last sectors of the file system contain file system settings (boot record or superblock). This file system uses 48-bit and 64-bit values to reference files, which allows it to support high-capacity disk storage.

ReFS:
ReFS (Resilient File System) is Microsoft's latest development, currently available for Windows Server 2012 (the server counterpart of Windows 8). Its architecture differs completely from other Windows file systems and is mainly organized as a B+-tree. ReFS has a high tolerance to failures thanks to new features, most notably Copy-on-Write (CoW): no metadata is modified without being copied, and data is never written over existing data but into new disk space. With any file modification, a new copy of the metadata is stored in free storage space, and then the system creates a link from the older metadata to the newer one. As a result, the system keeps a significant number of older copies in different places, which makes file recovery easy as long as that storage space has not been overwritten.

For information about data recovery from these file systems please visit the Chances for recovery page.
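The copy-on-write idea can be illustrated with a toy model. This is not the ReFS on-disk format or API: `CowStore` and its methods are hypothetical names, the "disk" is an append-only Python list, and for simplicity each new record remembers where the previous version lives rather than the older record pointing forward.

```python
class CowStore:
    """Toy copy-on-write store: an update never overwrites an old record."""

    def __init__(self):
        self.space = []    # append-only stand-in for free disk space
        self.latest = {}   # file name -> index of its newest metadata record

    def write(self, name, metadata):
        """Store a new metadata copy in fresh space and link the versions."""
        previous = self.latest.get(name)
        self.space.append({"meta": dict(metadata), "previous": previous})
        self.latest[name] = len(self.space) - 1

    def history(self, name):
        """Metadata versions, newest first, that are still in the store."""
        versions = []
        index = self.latest.get(name)
        while index is not None:
            record = self.space[index]
            versions.append(record["meta"])
            index = record["previous"]
        return versions


store = CowStore()
store.write("report.doc", {"size": 10})
store.write("report.doc", {"size": 20})   # the old record stays untouched
print(store.history("report.doc"))        # [{'size': 20}, {'size': 10}]
```

Because the older record survives until its space is reused, an earlier state of the file remains recoverable, which is the property the paragraph above describes.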



MacOS file systems

Apple's MacOS operating system uses the HFS+ file system, an extension of Apple's own HFS file system used on old Macintosh computers.

The HFS+ file system is used by Apple desktop products, including Mac computers and iPods, as well as Apple X Server products. Advanced server products also use the Apple Xsan file system, a clustered file system derived from the StorNext or CentraVision file systems.

Besides files and folders, this file system also stores Finder information such as directory views, window positions etc.

For information about data recovery from these file systems please visit the Chances for recovery page.



Linux file systems

The open-source Linux OS serves as a platform for implementing, testing and using different file system concepts. The most popular Linux file systems include:

  • Ext2, Ext3, Ext4 - the “native” Linux file system, which is under active development and improvement. Ext3 is an extension of Ext2 that adds transactional file write operations with a journal. Ext4 is a further development of Ext3, extended with support for optimized file allocation information (extents) and extended file attributes. This file system is frequently used as the "root" file system of Linux installations.

  • ReiserFS - an alternative Linux file system designed for storing a huge number of small files. It offers good file search capability and compact file allocation: file tails and small files are stored along with the metadata, so that large file system blocks are not spent on them.

  • XFS - a file system developed by SGI and initially used for the company’s IRIX servers. The XFS specifications are now implemented in Linux. XFS offers great performance and is widely used to store files.

  • JFS - a file system developed by IBM for the company’s powerful computing systems. JFS1 usually stands for JFS, JFS2 is the second release. Currently, this file system is open-source and implemented in most modern Linux versions.

The concept of “hard links” used in operating systems of this kind makes most Linux file systems similar in one respect: the file name is not regarded as a file attribute but is defined as an alias for a file in a certain directory. A file object can be linked from many locations, even several times from the same directory under different names. This can lead to serious and even insurmountable difficulties in recovering file names after file deletion or file system damage.

For information about data recovery from these file systems please visit the Chances for recovery page.
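Hard links can be observed directly with the Python standard library. The sketch below assumes a POSIX-style file system that supports hard links in the temporary directory; the file names and the function name `hard_link_demo` are arbitrary choices for this example.

```python
import os
import tempfile


def hard_link_demo():
    """Create one file object with two names and inspect it."""
    with tempfile.TemporaryDirectory() as directory:
        first = os.path.join(directory, "report.txt")
        second = os.path.join(directory, "alias.txt")
        with open(first, "w") as handle:
            handle.write("same data")

        os.link(first, second)  # a second name for the same file object

        info_first, info_second = os.stat(first), os.stat(second)
        same_object = info_first.st_ino == info_second.st_ino
        link_count = info_first.st_nlink

        os.remove(first)  # removes one name only; the data is still linked
        with open(second) as handle:
            surviving_content = handle.read()
    return same_object, link_count, surviving_content


print(hard_link_demo())
```

Both names resolve to the same file object, and neither is "the" name of the file, which is why names are hard to reconstruct once the directory entries are lost.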



BSD, Solaris, Unix file systems

The most common file system for these operating systems is UFS (Unix File System), also often referred to as FFS (Fast File System).

Currently, UFS (in different editions) is supported by all Unix-family operating systems and is the major file system of the BSD OS and the Sun Solaris OS. Modern operating systems tend to replace UFS with other file systems (ZFS for Solaris, JFS and derived file systems for Unix etc.).

For information about data recovery from these file systems please visit the Chances for recovery page.



Clustered file systems

Clustered file systems are used in computer cluster systems. These file systems support distributed storage.

Distributed file systems include:

  • ZFS - the Sun “Zettabyte File System”, a new file system developed for distributed storage under the Sun Solaris OS.

  • Apple Xsan - Apple's evolution of the CentraVision and later StorNext file systems.

  • VMFS - the “Virtual Machine File System” developed by VMware for its VMware ESX Server.

  • GFS - Red Hat Linux “Global File System”.

  • JFS1 - the original (legacy) design of IBM JFS file system used in older AIX storage systems.

Common properties of these file systems include support for distributed storage, extensibility and modularity.

For more information about data recovery from these file systems please visit the Chances for recovery page.