You store your information in a file, and the operating system stores the information about that file in an inode. Each inode is identified by a unique inode number, which is why the two terms are often used interchangeably.
Information about a file's data is called metadata. So you can also put it this way: “An inode is the metadata of the data.”
Whenever a user or a program needs access to a file, the operating system first looks up the file's unique inode number in a table called the inode table. The user or program then reaches the file with the help of the inode number found there.
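As a rough illustration of this lookup, the sketch below asks the operating system for the inode number of a file twice: once by name and once through an already open file descriptor. The helper name `demo_name_to_inode` and the file name `notes.txt` are made up for this example; `os.stat` and `os.fstat` are the standard Python calls.

```python
import os
import tempfile


def demo_name_to_inode():
    """Return the inode number seen via the file name and via an open descriptor."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write("hello inode\n")
            handle.flush()
            # The open descriptor refers to the inode directly, no name involved.
            by_descriptor = os.fstat(handle.fileno()).st_ino
        # Looking the file up by name leads to the same inode.
        by_name = os.stat(path).st_ino
        return by_name, by_descriptor


if __name__ == "__main__":
    by_name, by_descriptor = demo_name_to_inode()
    print("inode via name:      ", by_name)
    print("inode via descriptor:", by_descriptor)
```

Both numbers should match, since the name is only a way of finding the inode.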
To reach a file by its “name”, the system needs the inode number that corresponds to that name. To reach an inode, however, the file name is not required. In fact, the inode number alone is enough to get to the data.
How does the structure of an inode look?
This is the most important part to understand about an inode. Here we will discuss the contents of an inode.
Inode Structure of a Directory:
The inode structure of a directory consists only of a name-to-inode mapping for the files and directories inside that directory.
The first two entries in every directory are . (dot) and .. (dot dot). You have probably seen them when listing the contents of a directory. (Most of the time they are hidden; you have to pass the -a option to the “ls” command to see them.)
People who work with Linux or any *nix system know that the command “cd .” changes to the current directory itself (which means it does nothing, because you are already in that directory).
And the command “cd ..” takes you to the parent directory of the current directory. Why does that happen?
Let's understand it with an example.
[root@rafi log]# pwd
[root@rafi log]# ls -ai
393456 . 392471 boot.log 393598 gdm 392887 pm-powersave.log 404347 spooler
392449 .. 392943 btmp 393624 httpd 393490 ppp 393609 spooler-20160330
392779 anaconda.ifcfg.log 393568 btmp-20160401 393561 lastlog 523827 prelink 393484 tallylog
392465 anaconda.log 393489 ConsoleKit 404086 mail 393000 puppet 392865 vmware-tools-upgrader.log
392476 anaconda.program.log 404343 cron 404344 maillog 393903 rhsm 392806 wpa_supplicant.log
392478 anaconda.storage.log 392789 cron-20160330 393608 maillog-20160330 394649 sa 393566 wtmp
392466 anaconda.syslog 393571 cups 404345 messages 393894 samba 392922 Xorg.0.log
392467 anaconda.xlog 392468 dmesg 393606 messages-20160330 404346 secure 392871 Xorg.0.log.old
392780 anaconda.yum.log 392469 dmesg.old 404342 mysqld.log 393607 secure-20160330 392938 yum.log
524731 audit 393493 dracut.log 393610 ntpstats 392790 spice-vdagent.log
Now let's note down the inode numbers of . (dot) and .. (dot dot): 393456 and 392449 respectively.
Now let's list the /var/ directory and look at the inodes there.
[root@rafi log]# cd ..
[root@rafi var]# pwd
[root@rafi var]# ls -ai
392449 . 392452 cache 393458 empty 392450 lib 393456 log 393466 nis 393469 run 393626 www
2 .. 523733 crash 393459 games 393462 local 393465 mail 393467 opt 393470 spool 393473 yp
524735 account 393457 db 393593 gdm 393463 lock 392990 named 393468 preserve 392474 tmp
So you can clearly see that the inode of . (dot) inside the /var/log directory (393456) is equal to the inode of the log entry inside /var. And the inode of .. (dot dot) inside /var/log/ (392449) is equal to the inode of . (dot) inside the /var/ directory.
. (dot) always means the current directory simply because its inode is the same as the directory's own inode. And .. (dot dot) means the parent directory because its inode is the same as the parent directory's inode.
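The same comparison can be repeated without reading the listings by eye. The following is a minimal sketch that builds a throwaway parent directory with a child called `log` and compares the inode numbers; the function names `dot_entries` and `check_dot_entries` are invented for this example.

```python
import os
import tempfile


def dot_entries(directory):
    """Return the inode numbers of `directory/.` and `directory/..`."""
    dot = os.stat(os.path.join(directory, ".")).st_ino
    dotdot = os.stat(os.path.join(directory, "..")).st_ino
    return dot, dotdot


def check_dot_entries():
    """Create parent/log and check that . and .. match the expected inodes."""
    with tempfile.TemporaryDirectory() as parent:
        child = os.path.join(parent, "log")
        os.mkdir(child)
        dot, dotdot = dot_entries(child)
        dot_is_child = dot == os.stat(child).st_ino
        dotdot_is_parent = dotdot == os.stat(parent).st_ino
        return dot_is_child, dotdot_is_parent


if __name__ == "__main__":
    print(check_dot_entries())
```

If the reasoning above holds on your file system, both values in the returned tuple are true.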
Inode Structure of a File
Mode: This keeps information about two things: the permissions, and the type of the inode. For example, an inode can belong to a file, a directory, a block device, etc.
Owner Info: Access details such as the owner of the file, the group of the file, etc.
Size: This field stores the size of the file in bytes.
Time Stamps: This stores the file's time stamps, such as the last access time, the last modification time, and the time the inode itself was last changed.
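Most of the fields listed above can be read from user space. The sketch below collects them with Python's `os.stat`; the function name `describe_inode` and the dictionary keys are choices made for this example, not part of any standard interface.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Collect the mode, owner, size and time stamps stored for `path`."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,
        "mode": stat.filemode(info.st_mode),        # type and permissions
        "is_directory": stat.S_ISDIR(info.st_mode),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "size_bytes": info.st_size,
        "accessed": time.ctime(info.st_atime),
        "modified": time.ctime(info.st_mtime),
        "inode_changed": time.ctime(info.st_ctime),
    }


def describe_sample_file():
    """Write a 15-byte temporary file and describe its inode."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"x" * 15)
        return describe_inode(path)


if __name__ == "__main__":
    for field, value in describe_sample_file().items():
        print(f"{field:>14}: {value}")
```

Note that the file name is not among these fields; it lives in the directory that points to the inode.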
Now comes the important thing to understand: how a file is saved in a partition with the help of an inode.
Block Size: Whenever a partition is formatted with a file system, it normally gets formatted with a default block size. The block size is the size of the chunks in which data is spread across the disk. So if the block size is 4K, a file of 15K will take 4 blocks (because 4K * 4 = 16K), and technically speaking you waste 1K.
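The arithmetic in this example can be written down as a small sketch. The helper names `blocks_needed` and `wasted_bytes` are made up here, and the 4K default simply follows the example above.

```python
def blocks_needed(file_size, block_size=4096):
    """Number of whole blocks needed to hold `file_size` bytes (rounds up)."""
    return -(-file_size // block_size)  # ceiling division on integers


def wasted_bytes(file_size, block_size=4096):
    """Bytes allocated in the last block but not used by the file."""
    return blocks_needed(file_size, block_size) * block_size - file_size


if __name__ == "__main__":
    size = 15 * 1024  # the 15K file from the example
    print(blocks_needed(size))  # 4
    print(wasted_bytes(size))   # 1024
```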
Direct Block Pointers:
In an ext2 file system, an inode contains 15 block pointers. The first 12 are called direct block pointers, which means that they point to the addresses of the blocks containing the data of the file. 12 block pointers can point to 12 data blocks, so in total the direct block pointers can address only 48K (12 * 4K) of data. This means that if the file is 48K or smaller, the inode itself can address all the blocks containing the data of the file.
Now what if the file size is above 48K?
Indirect Block Pointers:
Whenever the size of the data goes above 48K (considering a block size of 4K), the 13th pointer in the inode comes into use. It points to an indirect block: a block that holds not file data but the addresses of further blocks where the data is stored.
As we have taken our block size to be 4K, the indirect block can point to 1024 data blocks (taking the size of a block pointer as 4 bytes, one 4K block holds 1024 pointers, because 4 bytes * 1024 = 4K).
This means an indirect block pointer can address up to 4MB of data (1024 pointers, each addressing one 4K block, gives 1024 * 4K = 4M).
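The numbers so far can be checked with a short sketch. The constants below are the assumptions used in this article (4K blocks, 4-byte pointers, 12 direct pointers); the function names are invented for the example.

```python
BLOCK_SIZE = 4096      # bytes per block, as assumed in the text
POINTER_SIZE = 4       # bytes per block pointer, as assumed in the text
DIRECT_POINTERS = 12   # direct block pointers in the inode


def pointers_per_block(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE):
    """How many block pointers fit into one block."""
    return block_size // pointer_size


def direct_capacity(block_size=BLOCK_SIZE):
    """Bytes addressable through the direct pointers alone."""
    return DIRECT_POINTERS * block_size


def single_indirect_capacity(block_size=BLOCK_SIZE, pointer_size=POINTER_SIZE):
    """Bytes addressable through one indirect block."""
    return pointers_per_block(block_size, pointer_size) * block_size


if __name__ == "__main__":
    print(pointers_per_block())        # 1024
    print(direct_capacity())           # 49152 bytes, i.e. 48K
    print(single_indirect_capacity())  # 4194304 bytes, i.e. 4M
```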
Double Indirect Block Pointers:
If the size of the file is above 4MB + 48K, the inode starts using the double indirect block pointer to address data blocks. The double indirect block pointer in an inode points to a block of pointers to indirect blocks, which in turn point to the blocks where the data is stored.
The double indirect block is also a 4K block, as every block is 4K, and block pointers are 4 bytes in size, as mentioned previously. So the double indirect block pointer can address 1024 indirect blocks (which means 1024 * 4M = 4G). With the help of the double indirect block pointer, the size of the data can go up to 4G.
Triple Indirect Block Pointers:
The triple indirect block pointer can address up to 4G * 1024 = 4TB of file data. The fifteenth block pointer in the inode points to a block that in turn points to 1024 double indirect blocks.
So after the 12 direct block pointers, the 13th block pointer in the inode is the indirect block pointer, the 14th is the double indirect block pointer, and the 15th is the triple indirect block pointer.
This is the main reason why there is a limit on the size of a single file that you can have in a file system.
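To tie the four pointer levels together, here is a minimal sketch that works out which kind of pointer is responsible for a given byte offset in a file, and how much data the 15 pointers can address in total. It only models the arithmetic described in this article (4K blocks, 4-byte pointers) and is not how any real file system driver is written; the names `LEVELS`, `pointer_level` and `max_addressable_bytes` are hypothetical.

```python
BLOCK_SIZE = 4096                        # bytes per block (assumed)
POINTER_SIZE = 4                         # bytes per block pointer (assumed)
DIRECT_POINTERS = 12                     # direct pointers in the inode
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # 1024 pointers fit in one block

# Number of data blocks reachable through each kind of pointer.
LEVELS = [
    ("direct", DIRECT_POINTERS),
    ("single indirect", PER_BLOCK),
    ("double indirect", PER_BLOCK ** 2),
    ("triple indirect", PER_BLOCK ** 3),
]


def pointer_level(byte_offset):
    """Name the kind of pointer that leads to the block holding `byte_offset`."""
    block_index = byte_offset // BLOCK_SIZE
    for name, block_count in LEVELS:
        if block_index < block_count:
            return name
        block_index -= block_count  # skip the blocks covered by this level
    raise ValueError("offset is beyond what 15 block pointers can address")


def max_addressable_bytes():
    """Total bytes reachable through all 15 pointers."""
    return sum(block_count for _, block_count in LEVELS) * BLOCK_SIZE


if __name__ == "__main__":
    for offset in (0, 48 * 1024, 5 * 1024 ** 2, 5 * 1024 ** 3):
        print(offset, "->", pointer_level(offset))
    print("maximum:", max_addressable_bytes(), "bytes")
```

Under these assumptions the total is 48K + 4M + 4G + 4T, so the triple indirect level dominates the limit.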
An interesting fact is that all inodes are created at the time the file system is created, which means there is an upper limit on the number of inodes a file system can have. Once that limit has been reached, you will not be able to create any more files on the file system, even if you have space left on the partition.
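To see how close a file system is to its inode limit, you can query the counters the kernel exposes. The sketch below uses Python's `os.statvfs`, which is available on Unix-like systems; the function name `inode_usage` and the choice of "/" as the default path are assumptions for this example. Some file systems allocate inodes dynamically and may report a total of 0.

```python
import os


def inode_usage(path="/"):
    """Return total, free and used inode counts for the file system holding `path`."""
    stats = os.statvfs(path)
    total = stats.f_files   # total number of inodes
    free = stats.f_ffree    # inodes still available
    return {"total": total, "free": free, "used": total - free}


if __name__ == "__main__":
    usage = inode_usage("/")
    print("total inodes:", usage["total"])
    print("free inodes: ", usage["free"])
    print("used inodes: ", usage["used"])
```

When the free count reaches zero, creating a new file fails even though data blocks may still be available.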