2. Introduction to File Systems
All file systems consist of structures necessary for storing
and managing data. These structures typically include
an operating system boot record, directories, and files.
Functions of a File System:
Tracking allocated and free space
Maintaining directories and file names
Tracking where each file is physically stored on the
disk
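The three functions above can be sketched with a toy in-memory model. This is only an illustration under simple assumptions: the class name ToyFileSystem, the 512-byte default block size, and the first-fit allocation are all hypothetical and do not describe any real file system.

```python
class ToyFileSystem:
    """Tiny in-memory model: a free-space map, a directory, and per-file block lists."""

    def __init__(self, total_blocks, block_size=512):
        self.block_size = block_size
        self.free = [True] * total_blocks    # True means the block is unallocated
        self.directory = {}                  # file name -> list of block numbers
        self.blocks = [b""] * total_blocks   # simulated disk contents

    def write_file(self, name, data):
        if name in self.directory:
            raise FileExistsError(name)
        # Ceiling division; even an empty file is given one block here.
        needed = max(1, -(-len(data) // self.block_size))
        available = [i for i, is_free in enumerate(self.free) if is_free]
        if len(available) < needed:
            raise OSError("no space left on device")
        chosen = available[:needed]          # first-fit: lowest free blocks
        for n, block in enumerate(chosen):
            self.free[block] = False
            start = n * self.block_size
            self.blocks[block] = data[start:start + self.block_size]
        self.directory[name] = chosen

    def read_file(self, name):
        # The directory tells us where each piece of the file is stored.
        return b"".join(self.blocks[b] for b in self.directory[name])

    def delete_file(self, name):
        for block in self.directory.pop(name):
            self.free[block] = True
            self.blocks[block] = b""

    def free_blocks(self):
        return sum(self.free)


fs = ToyFileSystem(total_blocks=8, block_size=4)
fs.write_file("a.txt", b"hello world")   # 11 bytes need three 4-byte blocks
print(fs.directory["a.txt"], fs.free_blocks())
```

Real file systems keep these structures on the disk itself and must also survive crashes, which this sketch ignores.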
3. Definition: Computers use particular kinds of file systems
to store and organize data on media such as a hard drive,
the CDs, DVDs, and BDs in an optical drive, or a flash
drive. Any place that a PC stores data uses
some type of file system. A file system can be thought of
as an index or database containing the physical location of
every piece of data on a hard drive.
A file system is set up on a drive when the drive is formatted.
4. Types of File Systems
FAT File System
NTFS File System
Disk file systems
Flash file systems
Tape file systems
Database file systems
Transactional file systems
Network file systems
Shared disk file systems
Special file systems
Device file systems
5. FAT File System
FAT12 - The first widely used version of the FAT file system, FAT12 was
introduced in 1980 (the original 8-bit FAT dates from 1977, before MS-DOS)
and was the primary file system for Microsoft operating systems up to
MS-DOS 4.0. FAT12 supports drive sizes up to 32MB.
FAT16 - The second implementation of FAT was FAT16, introduced in
1984 with MS-DOS 3.0; its large-partition form arrived with MS-DOS 4.0
in 1988. FAT16 was the primary file system for MS-DOS 4.0 up to
Windows 95. FAT16 supports drive sizes up to 2GB.
FAT32 - FAT32 is the latest version of the FAT file system. It was
introduced in 1996 for Windows 95 OSR2 users and was the primary
file system for consumer Windows versions through Windows ME.
FAT32 supports drive sizes up to 8TB.
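The size limits quoted above follow from simple arithmetic: the number of addressable clusters multiplied by the cluster size. The sketch below reproduces the figures under stated assumptions. The maximum cluster sizes (8KB for FAT12, 32KB for FAT16 and FAT32) are assumptions chosen for illustration, the few reserved cluster values are ignored, and FAT32 is taken to use 28 of its 32 bits for the cluster number.

```python
# Bits available for the cluster number in each FAT variant.
# FAT32 entries are 32 bits wide, but only 28 bits address clusters.
CLUSTER_BITS = {"FAT12": 12, "FAT16": 16, "FAT32": 28}

# Assumed largest cluster size per variant (illustrative, not from the slides).
MAX_CLUSTER_BYTES = {"FAT12": 8 * 1024, "FAT16": 32 * 1024, "FAT32": 32 * 1024}


def max_volume_bytes(variant):
    """Upper bound: addressable clusters times cluster size."""
    return (2 ** CLUSTER_BITS[variant]) * MAX_CLUSTER_BYTES[variant]


def human(n):
    """Format a byte count using binary units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024 or unit == "TB":
            return f"{n:g}{unit}"
        n /= 1024


for variant in CLUSTER_BITS:
    print(variant, human(max_volume_bytes(variant)))
```

Other limits can apply first in practice. For example, a 32-bit sector count with 512-byte sectors caps a volume at 2TB regardless of cluster size.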
6. NTFS File System
Definition: New Technology File System (NTFS) is a file
system that was introduced by Microsoft in 1993 with
Windows NT 3.1. NTFS supports hard drive sizes up to
256TB. NTFS is the primary file system used in Microsoft's
Windows 7, Windows Vista, Windows XP, Windows 2000
and Windows NT operating systems. The Windows Server
line of operating systems also primarily uses NTFS.
The File Allocation Table (FAT) file system was the primary
file system in Microsoft's older operating systems but it is
still supported today along with NTFS.
7. Disk file systems
A disk file system takes advantage of the ability of disk storage media
to randomly address data in a short amount of time. Additional
considerations include the speed of accessing data following that
initially requested and the anticipation that the following data may also
be requested. This permits multiple users (or processes) access to
various data on the disk without regard to the sequential location of
the data. Examples include FAT (FAT12, FAT16, FAT32), exFAT, NTFS,
HFS and HFS+, HPFS, UFS, ext2, ext3, ext4, XFS, btrfs, ISO 9660,
Files-11, Veritas File System, VMFS, ZFS, ReiserFS and UDF. Some disk
file systems are journaling file systems or versioning file systems.
Optical discs
ISO 9660 and Universal Disk Format (UDF) are two common formats
that target Compact Discs, DVDs and Blu-ray discs. Mount Rainier is
an extension to UDF supported by Linux 2.6 series and Windows Vista
that facilitates rewriting to DVDs.
8. Flash file system
A flash file system considers the special abilities,
performance and restrictions of flash memory devices.
Frequently a disk file system can use a flash memory
device as the underlying storage media but it is much
better to use a file system specifically designed for a
flash device.
9. Tape file systems
A tape file system is a file system and tape format designed to
store files on tape in a self-describing form. Magnetic tapes are
sequential storage media with significantly longer random data
access times than disks, posing challenges to the creation and
efficient management of a general-purpose file system.
In a disk file system there is typically a master file directory, and
a map of used and free data regions. Any file
additions, changes, or removals require updating the directory
and the used/free maps. Random access to data regions is
measured in milliseconds so this system works well for disks.
Tape requires linear motion to wind and unwind potentially very
long reels of media. This tape motion may take several seconds
to several minutes to move the read/write head from one end of
the tape to the other.
10. Database file systems
Another concept for file management is the idea of a
database-based file system. Instead of, or in addition
to, hierarchical structured management, files are
identified by their characteristics, like type of file,
topic, author, or similar rich metadata.
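A minimal sketch of the idea, assuming a hypothetical catalog table with the columns type, topic, and author. It uses Python's built-in sqlite3 module as a stand-in for the metadata store; a real database file system would integrate this with the storage layer.

```python
import sqlite3

# Metadata columns that may be used as search criteria (hypothetical schema).
ALLOWED_COLUMNS = {"type", "topic", "author"}


def build_catalog(rows):
    """Create an in-memory catalog of (path, type, topic, author) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT, type TEXT, topic TEXT, author TEXT)")
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    return conn


def find_files(conn, **criteria):
    """Return paths of files whose metadata matches every given criterion."""
    unknown = set(criteria) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown metadata fields: {sorted(unknown)}")
    query = "SELECT path FROM files"
    if criteria:
        query += " WHERE " + " AND ".join(f"{col} = ?" for col in criteria)
    query += " ORDER BY path"
    return [row[0] for row in conn.execute(query, tuple(criteria.values()))]


catalog = build_catalog([
    ("/home/ann/notes.txt", "text", "storage", "ann"),
    ("/home/bob/fat.pdf", "pdf", "storage", "bob"),
    ("/home/ann/trip.jpg", "image", "travel", "ann"),
])
print(find_files(catalog, topic="storage"))
```

Files are located by what they are rather than where they sit in a directory tree; the path is just another attribute.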
11. Network file systems
A network file system is a file system that acts as a
client for a remote file access protocol, providing
access to files on a server. Examples of network file
systems include clients for the NFS, AFS, and SMB
protocols, and file-system-like clients for FTP and
WebDAV.
12. Shared disk file systems
A shared disk file system is one in which a number of
machines (usually servers) all have access to the same
external disk subsystem (usually a SAN). The file
system arbitrates access to that subsystem, preventing
write collisions. Examples include GFS2 from Red Hat,
GPFS from IBM, SFS from DataPlow, CXFS from SGI
and StorNext from Quantum Corporation.
13. Special file systems
A special file system presents non-file elements of an
operating system as files so they can be acted on using
file system APIs. This is most commonly done in
Unix-like operating systems, but devices are given file names
in some non-Unix-like operating systems as well.
14. Device file systems
A device file system represents I/O devices and
pseudo-devices as files, called device files. Examples in
Unix-like systems include devfs and, in Linux 2.6
systems, udev. In non-Unix-like systems, such as
TOPS-10 and other operating systems influenced by
it, where the full filename or pathname of a file can
include a device prefix, devices other than those
containing file systems are referred to by a device
prefix specifying the device, without anything
following it.
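On a Unix-like system, a device file can be told apart from an ordinary file by its type bits. The sketch below uses Python's os.stat and the stat module; the helper name describe is hypothetical, and the expectation that os.devnull is a character device is an assumption that holds on typical POSIX systems.

```python
import os
import stat


def describe(path):
    """Classify a path by the file-type bits the file system reports for it."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


if os.name == "posix":
    # /dev/null is a pseudo-device exposed through the ordinary file API.
    print(os.devnull, "->", describe(os.devnull))
```

Because the device appears as a file, programs can open, read, and write it with the same calls they use for data files.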
15. Flat file systems
In a flat file system, there are no subdirectories.
When floppy disk media was first available this type of file
system was adequate due to the relatively small amount of
data space available. CP/M machines featured a flat file
system, where files could be assigned to one of 16 user
areas and generic file operations narrowed to work on one
instead of defaulting to work on all of them. These user
areas were no more than special attributes associated with
the files, that is, it was not necessary to define specific
quota for each of these areas and files could be added to
groups for as long as there was still free storage space on
the disk.
16. Aspects of file systems
Space management
File systems allocate space in a granular
manner, usually in multiples of a physical unit on the device.
The file system is responsible for organizing files and
directories, and keeping track of which areas of the
media belong to which file and which are not being
used. For example, Apple DOS of the early
1980s used a track/sector map to keep track of the
256-byte sectors on a 140-kilobyte floppy disk.
17. Filenames
A filename (or file name) is used to
identify a storage location in the file system.
Most file systems have restrictions on the
length of filenames. In some file systems,
filenames are not case sensitive (i.e.,
filenames such as FOO and foo refer to the
same file); in others, filenames are case
sensitive (i.e., the names FOO and foo refer
to two separate files).
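The difference can be sketched with a small directory model. The class name Directory and the use of str.casefold() for case-insensitive matching are assumptions made for illustration; real file systems define their own case-folding rules.

```python
class Directory:
    """Name lookup table that is either case sensitive or case insensitive."""

    def __init__(self, case_sensitive):
        self.case_sensitive = case_sensitive
        self._entries = {}

    def _key(self, name):
        # Case-insensitive directories compare names after folding case.
        return name if self.case_sensitive else name.casefold()

    def create(self, name, data):
        self._entries[self._key(name)] = data

    def lookup(self, name):
        return self._entries[self._key(name)]

    def __len__(self):
        return len(self._entries)


insensitive = Directory(case_sensitive=False)
insensitive.create("FOO", b"first")
insensitive.create("foo", b"second")   # same file: overwrites the entry above

sensitive = Directory(case_sensitive=True)
sensitive.create("FOO", b"first")
sensitive.create("foo", b"second")     # a second, separate file

print(len(insensitive), len(sensitive))
```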
18. Directories
File systems typically have directories (also called folders)
which allow the user to group files into separate collections. This
may be implemented by associating the file name with an index
in a table of contents or an inode in a Unix-like file system.
Directory structures may be flat (i.e. linear), or allow hierarchies
where directories may contain subdirectories. The first file
system to support arbitrary hierarchies of directories was used in
the Multics operating system. The native file systems of Unix-like
systems also support arbitrary directory hierarchies, as do, for
example, Apple's Hierarchical File System, and its successor
HFS+ in classic Mac OS (HFS+ is still used in Mac OS X), the
FAT file system in MS-DOS 2.0 and later and Microsoft
Windows, the NTFS file system in the Windows NT family of
operating systems, and the ODS-2 (On-Disk Structure-2) and
higher levels of the Files-11 file system in OpenVMS.
19. Metadata
Other bookkeeping information is typically associated with
each file within a file system. The length of the data
contained in a file may be stored as the number of blocks
allocated for the file or as a byte count. The time that the
file was last modified may be stored as the file's timestamp.
File systems might store the file creation time, the time it
was last accessed, the time the file's metadata was changed,
or the time the file was last backed up. Other information
can include the file's device type (e.g. block, character,
socket, subdirectory, etc.), its owner user ID and group ID,
its access permissions and other file attributes (e.g.
whether the file is read-only, executable, etc.).
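Most of this bookkeeping can be inspected from a program. The sketch below reads it with Python's os.stat; the helper name file_metadata is hypothetical, and some fields (such as st_blocks) are only available on Unix-like platforms, so they are fetched defensively.

```python
import os
import stat
import tempfile
import time


def file_metadata(path):
    """Collect the bookkeeping information the file system keeps for a path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                       # length as a byte count
        "blocks_512": getattr(info, "st_blocks", None),   # allocated 512-byte blocks, if reported
        "modified": time.ctime(info.st_mtime),            # last data modification
        "accessed": time.ctime(info.st_atime),            # last access
        "mode": stat.filemode(info.st_mode),              # type and permissions, e.g. -rw-------
        "uid": info.st_uid,                               # owner user ID
        "gid": info.st_gid,                               # owner group ID
    }


# Create a small scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    scratch_path = handle.name

metadata = file_metadata(scratch_path)
os.remove(scratch_path)
for key, value in metadata.items():
    print(f"{key}: {value}")
```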
21. The Second Extended File System (EXT2)
The Second Extended File system was devised (by Rémy Card) as an extensible
and powerful file system for Linux. It is also the most successful file system so
far in the Linux community and is the basis for all of the currently shipping
Linux distributions.
The EXT2 file system, like many file systems, is built on the premise that
the data held in files is kept in data blocks. These data blocks are all of the
same length and, although that length can vary between different EXT2 file
systems, the block size of a particular EXT2 file system is set when it is created
(using mke2fs). Every file's size is rounded up to an integral number of blocks.
If the block size is 1024 bytes, then a file of 1025 bytes will occupy two 1024 byte
blocks. Unfortunately this means that on average you waste half a block per
file. Usually in computing you trade off CPU usage for memory and disk space
utilisation. In this case Linux, along with most operating systems, trades off a
relatively inefficient disk usage in order to reduce the workload on the CPU.
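The rounding rule and the "half a block per file" estimate can be checked with a few lines of arithmetic. The function names below are hypothetical, and the average assumes file sizes are spread evenly across one block's worth of lengths, which is a simplification.

```python
def blocks_needed(file_size, block_size=1024):
    """Round a file size up to a whole number of blocks (ceiling division)."""
    return -(-file_size // block_size)


def slack_bytes(file_size, block_size=1024):
    """Bytes allocated to the file but not used by its data."""
    return blocks_needed(file_size, block_size) * block_size - file_size


print(blocks_needed(1025), slack_bytes(1025))   # the 1025-byte example from the text

# Average waste over every size from 1 byte up to one full block.
sizes = range(1, 1024 + 1)
average_slack = sum(slack_bytes(s) for s in sizes) / len(sizes)
print(average_slack)
```

With 1024-byte blocks the average comes out just under 512 bytes, which is the half block the text describes.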
Not all of the blocks in the file system hold data; some must be used to contain
the information that describes the structure of the file system. EXT2 defines
the file system topology by describing each file in the system with an inode
data structure. An inode describes which blocks the data within a file occupies
as well as the access rights of the file, the file's modification times and the type
of the file. Every file in the EXT2 file system is described by a single inode and
each inode has a single unique number identifying it. The inodes for the file
system are all kept together in inode tables. EXT2 directories are simply special
files (themselves described by inodes) which contain pointers to the inodes of
their directory entries.
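The relationship between inodes and directories can be sketched as a small lookup. This is a simplified model, not the EXT2 on-disk format: the Inode class, the sample inode numbers, and the resolve helper are hypothetical. The one borrowed detail is that EXT2 uses inode 2 for the root directory.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    kind: str                                      # "file" or "dir"
    data: bytes = b""                              # file contents (files only)
    entries: dict = field(default_factory=dict)    # name -> inode number (dirs only)


ROOT_INODE = 2  # EXT2 reserves inode number 2 for the root directory

# The inode table: every file and directory is described by exactly one inode.
inode_table = {
    2: Inode("dir", entries={"etc": 11, "readme": 12}),
    11: Inode("dir", entries={"hosts": 13}),
    12: Inode("file", data=b"hello"),
    13: Inode("file", data=b"127.0.0.1 localhost\n"),
}


def resolve(path):
    """Walk a path one component at a time and return the final inode number."""
    number = ROOT_INODE
    for part in path.strip("/").split("/"):
        if not part:
            continue
        inode = inode_table[number]
        if inode.kind != "dir":
            raise NotADirectoryError(path)
        number = inode.entries[part]   # directory entry points at the next inode
    return number


print(resolve("/etc/hosts"), inode_table[resolve("/etc/hosts")].data)
```

Note that the file name lives in the directory entry, not in the inode itself.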
23. The EXT2 Inode
In the EXT2 file system, the inode is the basic building block; every file and
directory in the file system is described by one and only one inode. The EXT2
inodes for each Block Group are kept in the inode table together with a bitmap
that allows the system to keep track of allocated and unallocated inodes.
Figure 9.2 shows the format of an EXT2 inode. Amongst other information, it
contains the following fields:
mode: This holds two pieces of information: what this inode describes and the
permissions that users have to it. For EXT2, an inode can describe one of
file, directory, symbolic link, block device, character device or FIFO.
Owner Information: The user and group identifiers of the owners of this file or
directory. This allows the file system to correctly allow the right sort of
accesses.
Size: The size of the file in bytes.
Timestamps: The time that the inode was created and the last time that it was
modified.
Datablocks: Pointers to the blocks that contain the data that this inode is
describing. The first twelve are pointers to the physical blocks containing the
data described by this inode and the last three pointers contain more and more
levels of indirection.
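The twelve direct pointers plus single, double, and triple indirection can be worked through numerically. The sketch assumes 4-byte block pointers (32-bit block numbers), which is not stated in the text above, and the function names are hypothetical. The result is an upper bound from the pointer scheme alone; other limits may apply first in a real file system.

```python
POINTER_SIZE = 4   # bytes per block pointer (assumption: 32-bit block numbers)
DIRECT = 12        # direct pointers held in the inode itself


def pointers_per_block(block_size):
    return block_size // POINTER_SIZE


def max_file_blocks(block_size):
    """Blocks reachable through direct, single, double and triple indirection."""
    p = pointers_per_block(block_size)
    return DIRECT + p + p ** 2 + p ** 3


def indirection_level(logical_block, block_size):
    """Return 0 for a direct block, or 1, 2, 3 for the indirection level needed."""
    p = pointers_per_block(block_size)
    if logical_block < 0:
        raise ValueError("block index must be non-negative")
    if logical_block < DIRECT:
        return 0
    logical_block -= DIRECT
    for level in (1, 2, 3):
        if logical_block < p ** level:
            return level
        logical_block -= p ** level
    raise ValueError("block index beyond triple indirection")


print(max_file_blocks(1024))                 # reachable blocks with 1024-byte blocks
print(indirection_level(11, 1024), indirection_level(12, 1024))
```

Small files are served entirely by the direct pointers, so only large files pay the cost of extra block reads for indirection.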
24. The EXT2 Superblock
The Superblock contains a description of the basic size and shape of
this file system. The information within it allows the file system
manager to use and maintain the file system. Usually only the
Superblock in Block Group 0 is read when the file system is mounted
but each Block Group contains a duplicate copy in case of file system
corruption. Amongst other information it holds the:
Magic Number: This allows the mounting software to check that this is
indeed the Superblock for an EXT2 file system. For the current version
of EXT2 this is 0xEF53.
Revision Level: The major and minor revision
levels allow the mounting code to determine whether or not this file
system supports features that are only available in particular revisions
of the file system. There are also feature compatibility fields which help
the mounting code to determine which new features can safely be used
on this file system.
Mount Count and Maximum Mount Count:
Together these allow the system to determine if the file system should
be fully checked.
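A mounting tool might use these fields roughly as sketched below. The byte offsets are assumptions taken from the common EXT2 on-disk layout rather than from the text above: the Superblock is taken to start 1024 bytes into the volume, with the mount count, maximum mount count, and magic number stored as little-endian 16-bit values at offsets 52, 54, and 56 within it. The function names and the check rule are illustrative.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # assumed start of the Superblock within the volume
MNT_COUNT_OFFSET = 52      # assumed offset of mount count within the Superblock
EXT2_MAGIC = 0xEF53        # magic number quoted in the text


def read_superblock_fields(image):
    """Extract mount count, maximum mount count and magic from a raw image."""
    superblock = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    if len(superblock) < MNT_COUNT_OFFSET + 6:
        raise ValueError("image too small to contain a Superblock")
    # Three consecutive 16-bit fields; the maximum is read as signed so that
    # a negative value can mean "never force a check".
    mount_count, max_mount_count, magic = struct.unpack_from(
        "<HhH", superblock, MNT_COUNT_OFFSET
    )
    return {"mount_count": mount_count,
            "max_mount_count": max_mount_count,
            "magic": magic}


def is_ext2(fields):
    return fields["magic"] == EXT2_MAGIC


def needs_full_check(fields):
    """Illustrative rule: check once the mount count reaches a positive maximum."""
    return 0 < fields["max_mount_count"] <= fields["mount_count"]


# Build a synthetic 2KB image instead of touching a real device.
image = bytearray(2048)
struct.pack_into("<HhH", image, SUPERBLOCK_OFFSET + MNT_COUNT_OFFSET, 3, 20, EXT2_MAGIC)
fields = read_superblock_fields(bytes(image))
print(is_ext2(fields), needs_full_check(fields))
```

To try it on a real volume, the same function could be given the first 2048 bytes read from an image file; nothing here writes to the image.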
25. Conclusion
This paper discusses how the Modify-on-Access file system efficiently
extends the capabilities of conventional file systems. It demonstrates
how an active file system can simplify both applications and system
usage by performing computations on behalf of processes.
Furthermore, the paper describes the implementation of the MonA file
system and the export transformation. Section 5 shows that the
overhead of a kernel-resident transformation is very small and that
transforming data inline provides performance benefits. Furthermore, it
shows that the export transformation provides user extensibility.
The MonA file system is the first component of a suite of system
software designed for a collaborative memory system in which
intelligent peripheral devices collaborate with a host processor to
accomplish tasks. Current projects include a MonA virtual memory
system and a MonA peripheral device. These implementations are
similar to the Active Page and Active Disk simulations described in
related work.