The Linux Filesystem

The phrase "everything is a file" has long been used to describe Unix and Linux, and it captures a fundamental part of the architecture: regular files, directories, hardware devices and even kernel and process information are all exposed as paths in the filesystem.

Linux uses a tree-structured filesystem in which every path leads back to the topmost directory, / (the root). Here is a quick explanation of the key directories you will find at the top level of a Linux system:

bin/ - binaries for commands such as ls and yum
dev/ - device files for hardware, plus pseudo-devices such as random and null
home/ - home directories for regular users
lib/, lib64/ - shared code libraries
mnt/ - empty by default, but you can mount file systems here (even remote locations)
proc/ - virtual filesystem exposing process and kernel information, such as interrupts
run/ - runtime variable data is stored here
var/ - logs, mail etc., with web server content commonly kept in www
boot/ - all the files needed to boot the system. Contains the kernel.
etc/ - IMPORTANT: config files for applications
media/ - mount point for removable media such as memory sticks and other peripherals
opt/ - optional addon system software
root/ - home directory for the root user
sbin/ - system/super binaries; similar to bin, but for administrative commands usually run as root, like ifconfig, iptables and iw
sys/ - virtual filesystem exposing devices, firmware, hypervisor and power information from the kernel
usr/ - commonly read as "Unix System Resources" (a later backronym; the name originally came from "user"). It does NOT hold user home directories. It is like a second filesystem hierarchy for the system, with its own bin, lib and share.
srv/ - the server directory. This is intended for static files that are served by services such as HTTP(S) and FTP. You will commonly find it empty on most systems.
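As a small illustration of "everything is a file", the sketch below reads from and writes to entries under dev/ and proc/ using ordinary file tools. It assumes GNU coreutils (for stat -c) and a mounted /proc; the variable names are only for illustration.

```shell
# /dev/null is a character device that discards whatever is written to it.
echo "discarded" > /dev/null
null_type=$(stat -c %F /dev/null)

# /dev/urandom can be read like a file; here 4 random bytes are shown as hex.
random_hex=$(head -c 4 /dev/urandom | od -An -tx1 | tr -d ' \n')

# /proc/self/status describes the process that opens it (here, head itself).
proc_first_line=$(head -n 1 /proc/self/status)

echo "type of /dev/null: $null_type"
echo "random bytes:      $random_hex"
echo "from /proc:        $proc_first_line"
```

None of these paths hold data on the disk; the kernel generates their contents when they are opened.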

LINKS

Linux uses a system of links to map files to raw data stored on the drive. The two types of links are as follows:

Hard link - a name that points directly at a file's data. Every file name is a single hard link to the data, and more can be created, so the same file can have different names spread out across the filesystem. Deleting a link doesn't delete the file unless there are no other links.

Soft link - also called a symbolic link or symlink. It is a small separate file that stores the path of another file name (itself a hard link) rather than pointing at the data. The hard link is still the real path to the file. You can still cat, grep, etc. through the soft link, but deleting the soft link doesn't remove the file data.

The file type can be shown with "ls -la" (or "ll", a common alias for it). The first character is d for a directory, l for a soft link and - for a regular file. For soft links the target path is shown too:

drwxr-xr-x 24 root   root   4096 May  1 17:54 ../
lrwxrwxrwx  1 root   root     44 May  1 18:06 .directory -> /path/to/file...
-rw-r--r--  1 root   root      0 May 15 21:56 example

The number before the user shows how many hard links there are. Creating a hard link adds a second link, so both names report a count of 2 and lead to the same file. The soft link shows 1 because it is a separate file with a single name of its own. It refers to the name "example", and if that name is removed the soft link breaks (many terminals show dead links in red).

-rw-r--r-- 2 root root    0 May 15 21:56 example
-rw-r--r-- 2 root root    0 May 15 21:56 Hardlinktoexample
lrwxrwxrwx 1 root root    7 May 15 22:01 Softlinktoexample -> example

ln file alsofile - creates a second hard link
ln -s file SL - creates a softlink named SL to the "file" hardlink
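The behaviour described above can be tried in a throwaway directory. This is a minimal sketch, assuming GNU coreutils for stat -c; the file and variable names are made up for the example.

```shell
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "hello" > example
ln example hardlink_to_example       # second name for the same data
ln -s example softlink_to_example    # separate file that stores the path "example"

# %h prints the number of hard links (GNU stat).
links_before=$(stat -c %h example)

# Removing the original name leaves the data reachable through the other hard link...
rm example
via_hardlink=$(cat hardlink_to_example)

# ...but the soft link now points at a name that no longer exists.
if [ -L softlink_to_example ] && [ ! -e softlink_to_example ]; then
    softlink_state="dangling"
else
    softlink_state="ok"
fi

echo "links before rm:    $links_before"
echo "data via hard link: $via_hardlink"
echo "soft link is:       $softlink_state"
```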

INODES

To fully understand how links work, you must understand how files work. The Unix filesystem uses a structure of inodes. An inode is a small record that holds a file's metadata and maps the file to the disk blocks containing its data. Many filesystems create a finite number of inodes when they are formatted, usually in proportion to the capacity of the disk, so it is possible to run out of inodes before running out of space. Whenever you create a file, the file data is written to data blocks and an inode is allocated to describe it. A Hard Link is the directory entry that maps a file name to that inode. This mapping is critical: without it, the filesystem will never be able to access that particular file, whether the data exists on the disk or not.

Found within the inode metadata is:

  • Owner and Owning Group
  • Link count (number of Hard Links) and the mapping to the data blocks
  • Mode
  • ACL (where one is set)
  • Timestamps (access, modification and inode change; some filesystems also record creation)
  • File size

The name of the file is not stored in the inode. It lives in the directory entry that links to the inode, which is why one file can have several names.
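Inode numbers can be inspected directly, which makes the difference between the two link types visible. The sketch below assumes GNU stat (%i prints the inode number) and uses made-up file names.

```shell
workdir=$(mktemp -d)
echo "data" > "$workdir/original"
ln "$workdir/original" "$workdir/second_name"
ln -s original "$workdir/symlink"

# stat does not follow soft links unless -L is given,
# so the third call reports the soft link's own inode.
inode_original=$(stat -c %i "$workdir/original")
inode_second=$(stat -c %i "$workdir/second_name")
inode_symlink=$(stat -c %i "$workdir/symlink")

echo "original:    $inode_original"
echo "second_name: $inode_second   (same inode, second hard link)"
echo "symlink:     $inode_symlink   (its own inode)"

# Inode usage for the filesystem holding the directory.
df -i "$workdir"
```

The same numbers appear in the first column of "ls -li".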

The Mode and ACL are used for granular control over which users and processes have access to a given file, which we will go into next.


MODE PERMISSIONS

Because everything on Linux is a file, a permission system is needed to define who, and what, has access to which files. Mode permissions apply to 3 classes: User (the owner), Group (the owning group) and World. Each user typically has their own Group, but can be included in multiple Groups. World refers to users not specified, or simply "other users".

The 3 types of permissions for files are Read, Write and Execute. Each has a numeric value, and multiple permissions are combined by adding the values together:

4 = read/view (r) 
2 = write/modify (w) 
1 = execute/access (x)

7 = 4+2+1 
6 = 4+2 
5 = 4+1 
4 = 4 
3 = 2+1 
2 = 2 
1 = 1

So the full set of permissions for one class can be represented by a single digit, and three digits cover the owner, the group and everyone else:

chmod <OWNER><GROUP><WORLD> <PATH>

The chmod command can be used to set and change the mode of a file. For example, 777 grants read, write and execute to all three classes:

chmod 777 /path/to/file
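A more restrictive mode is usually what you want in practice. As a sketch, the following sets 640 on a temporary file and reads the result back; it assumes GNU stat (%a for the numeric mode, %A for the ls-style string), and report.txt is a hypothetical file name.

```shell
workdir=$(mktemp -d)
touch "$workdir/report.txt"

# 6 = 4+2 (rw-) for the owner, 4 (r--) for the group, 0 (---) for everyone else.
chmod 640 "$workdir/report.txt"

mode_octal=$(stat -c %a "$workdir/report.txt")   # numeric form
mode_text=$(stat -c %A "$workdir/report.txt")    # same bits as ls -l shows them

echo "$mode_octal $mode_text"
```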

ACL PERMISSIONS

ACLs, or Access Control Lists, allow far more granularity over file access rules. Most importantly, ACLs are able to target specific users and groups, where the Mode only targets the Owner, the Owning Group and everyone else. Because an entry for a named user is checked before the group and other entries, a specific user can be granted permissions that would otherwise be denied by the Mode. ACLs use standard Unix permissions:

r : read a file or list (ls) a directory
w : write to a file or modify the files in a directory
x : execute a file or cd into a directory
- : no permission (used on its own, removes all permissions)

ACLs are viewed and modified using the getfacl and setfacl commands respectively, as per the following examples:

EXAMPLE 1:

setfacl -R -m g:group:<permissions> folder

-R - recursive (for entire directories with contents)
-m - modify the existing ACL (adds or changes entries without overwriting the rest)

EXAMPLE 2

setfacl -m u:user:rw file

Grant user read-write access to a file

EXAMPLE 3 (default)

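setfacl -d -m g::rwx <directory> - default permissions for the group owner WITHOUT specifying the group. Default ACLs apply to directories and are inherited by new files created inside them.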

EXAMPLE 4 (mask)

setfacl -m default:m::rx /directory

Read/execute permissions are set as the default mask. (The ACL mask defines the maximum effective permissions for named users, named groups and the owning group.)

EXAMPLE 5 (delete)

setfacl -b file

Remove all extended ACL entries from a file

EXAMPLE 6 (match)

getfacl file1 | setfacl --set-file=- file2

Copies file1's ACL onto file2


Linux I/O and Redirection

Linux is built on a foundation of commands that accept instructions through the command line. These commands can in turn direct their output to other binaries or back to the user, and can provide error information when applicable. Each of a process's standard streams is designated a numerical identifier (its file descriptor):

0 - STDIN - (input)
1 - STDOUT - (output)
2 - STDERR - (errors)

The output of one process can be redirected and fed into another process as follows:

|  - uses the STDOUT of the left command as the STDIN of the right command
>  - redirects STDOUT to a file, overwriting it
>> - same as >, but appends to an existing file instead of overwriting

In the example below, I run cat on a nonexistent file:

[skye@skyenet]$ cat FAKEFILENAME
cat: FAKEFILENAME: No such file or directory

This writes an error to STDERR, as expected. Next I will redirect STDERR (stream 2) into a new file:

[skye@skyenet]$ cat FAKEFILENAME 2> errormessage.txt

The error produced by running cat on a nonexistent file is written to the .txt file, and nothing is printed to the terminal.
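The operators above can be combined. The sketch below reuses FAKEFILENAME and errormessage.txt from the example and adds a hypothetical log.txt; LC_ALL=C is set only so the error text is in English.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
export LC_ALL=C

# STDERR (2) goes to a file; STDOUT (1) is captured separately and stays empty.
stdout_text=$(cat FAKEFILENAME 2> errormessage.txt)

# > overwrites, >> appends.
echo "first line" > log.txt
echo "second line" >> log.txt
log_lines=$(wc -l < log.txt | tr -d ' ')

# 2>&1 sends STDERR to wherever STDOUT points, so the pipe can see the error.
error_count=$(cat FAKEFILENAME 2>&1 | grep -c "No such file")

echo "captured stdout: '$stdout_text'"
echo "lines in log.txt: $log_lines"
echo "errors seen through the pipe: $error_count"
```

Without 2>&1 the pipe would carry only STDOUT, and grep would count zero matches while the error still appeared on the terminal.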



Globbing / CLI Wildcards

As a final note, globbing (shell wildcards) can be used to match and interact with groups of files:

*  - matches any number of characters, including none
?  - matches exactly one unknown character
[nN] - single position with n or N
[1-9] - single position with value 1-9
[!x] - single position NOT equal to x
[!1-6] - single position NOT equal to 1-6
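To see these patterns expand, the sketch below creates a few empty files in a temporary directory and lets the shell match them. The file names are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch note1.txt note2.txt note7.txt Note1.txt notes.txt n.log

# The shell expands each pattern before echo runs.
any_single=$(echo note?.txt)        # exactly one character after "note"
either_case=$(echo [nN]ote1.txt)    # n or N in the first position
in_range=$(echo note[1-6].txt)      # one digit from 1 to 6
not_in_range=$(echo note[!1-6].txt) # one character that is NOT 1 to 6
star_log=$(echo *.log)              # anything ending in .log

echo "note?.txt      -> $any_single"
echo "[nN]ote1.txt   -> $either_case"
echo "note[1-6].txt  -> $in_range"
echo "note[!1-6].txt -> $not_in_range"
echo "*.log          -> $star_log"
```

If a pattern matches nothing, most shells pass it through unchanged, so it is worth testing a glob with echo or ls before using it with rm.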
