How to protect disk with Linux Semaphore Lock, inotify and poll?
This post records a case I ran into at work. It made me more familiar with Linux semaphores and the inotify mechanism.
Semaphore in Linux
There is more than one kind of semaphore in Linux: System V semaphores, POSIX semaphores, and the kernel's reader-writer semaphores (see init_rwsem).
Moreover, POSIX semaphores come in two forms. A named semaphore is identified by a name, and two processes can operate on the same named semaphore by passing the same name to sem_open(3).
An unnamed semaphore does not have a name. Instead, the semaphore is placed in a region of memory that is shared between multiple threads (a thread-shared semaphore) or processes (a process-shared semaphore, placed in, e.g., a System V shared memory segment created using shmget(2), or a POSIX shared memory object created using shm_open(3)).
For more information, see: sem_overview — overview of POSIX semaphores
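To make the two forms concrete, here is a minimal sketch, not taken from the framework discussed later. The semaphore name `/demo_disk_lock` and both function names are made up for illustration.

```c
#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>
#include <stdio.h>

#define DEMO_SEM_NAME "/demo_disk_lock"   /* hypothetical name */

/* Named semaphore: every process that passes the same name to sem_open()
 * operates on the same semaphore. Returns 0 on success, -1 on error. */
int named_semaphore_demo(void)
{
    sem_t *sem = sem_open(DEMO_SEM_NAME, O_CREAT, 0600, 1);
    if (sem == SEM_FAILED) {
        perror("sem_open");
        return -1;
    }
    sem_wait(sem);               /* acquire: value 1 -> 0 */
    /* ... exclusive work ... */
    sem_post(sem);               /* release: value 0 -> 1 */
    sem_close(sem);
    sem_unlink(DEMO_SEM_NAME);   /* remove the name once it is not needed */
    return 0;
}

/* Unnamed semaphore: lives in memory provided by the caller. The second
 * argument of sem_init() is 0 for a thread-shared semaphore; a nonzero
 * value requires the sem_t to sit in shared memory.
 * Returns the value observed while the semaphore is held, or -1. */
int unnamed_semaphore_demo(void)
{
    sem_t sem;
    int held_value = -1;

    if (sem_init(&sem, 0, 1) == -1) {
        perror("sem_init");
        return -1;
    }
    sem_wait(&sem);
    sem_getvalue(&sem, &held_value);
    sem_post(&sem);
    sem_destroy(&sem);
    return held_value;
}
```

On older glibc versions this needs to be linked with -pthread.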
The differences between System V and POSIX semaphores are not discussed here; see: Differences between System V and Posix semaphores
Scenario
In our system, disks are involved in tons of operations. One can imagine that, for instance, while a disk is running a SMART test (see: SMART tests with smartctl), we should not perform a firmware upgrade on it (see Download Microcode).
The question is: how do we prevent users from misusing the disks?
In our system, there is a framework to cover this situation. Any process that wants to use a disk must first register, or claim, that it is going to use it. While one process is using a disk, other processes are not able to use it.
This framework is based on a System V named semaphore. Every time a process tries to register its use of a disk, it is actually trying to acquire the semaphore. That means that if we can guarantee that everyone follows a rule like:
acquire_lock(disk)
...
do_something_on_disk()
...
release_lock(disk)
We ensure the safety of operations on disks.
Smart testing
However, a SMART test is not something that finishes quickly. What actually happens is:
- The user sends a web API request to the system
- A CGI program parses the parameters and invokes an executable, which invokes smartctl to send the command that starts the SMART test on the disk
- The CGI program immediately responds to the user that the disk has started the test
- A daemon monitors the progress of the SMART test using SMART commands and saves the progress to a set of cache files
- The cache files are then used to determine the status of the SMART test
- Once the SMART test is finished or terminated, the cache files are removed
So the question is: we need to acquire the semaphore to use the disk exclusively, but when and where should we release it?
Should we block and wait until the SMART test is done, release the semaphore, and only then respond to the user? Then every time a user clicks the SMART test button, he or she has to wait 10 to 20 minutes before making the next move.
Should we skip acquiring the semaphore and just run the SMART test? Then a user can mistakenly start a firmware upgrade while the SMART test is still running.
The Framework and System V named Semaphore
System V Semaphores: https://docs.oracle.com/cd/E19683-01/816-5042/auto32/index.html
In the framework, there are lots of files under a certain path.
Given a device name (e.g. /dev/sda), the semaphore is acquired as follows:
- call ftok(file, id) to get the System V IPC key
- pass the key to semget to get the semaphore identifier, then use semop or semtimedop to set the semaphore
Releasing it is almost the same; the only difference is the parameters passed to semop or semtimedop.
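The steps above can be sketched as follows. This is not the framework's code: the function names, the project id 'D', the initial value of 1 and the use of IPC_NOWAIT are all assumptions made for illustration. Since ftok only produces a key, the sketch also calls semget to turn the key into a semaphore identifier.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* On Linux the caller has to define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Map a key file to a set with one semaphore, created with value 1
 * (unlocked) on first use. Returns the semaphore id, or -1 on error.
 * Note: creation and SETVAL are two steps, so a real implementation has
 * to deal with another process arriving in between. */
int disk_lock_open(const char *key_file)
{
    key_t key = ftok(key_file, 'D');   /* 'D' is an arbitrary project id */
    if (key == (key_t)-1)
        return -1;

    int semid = semget(key, 1, IPC_CREAT | IPC_EXCL | 0600);
    if (semid != -1) {
        union semun arg = { .val = 1 };
        if (semctl(semid, 0, SETVAL, arg) == -1)
            return -1;
        return semid;
    }
    if (errno != EEXIST)
        return -1;
    return semget(key, 1, 0);          /* already exists: look it up */
}

/* delta = -1 acquires, delta = +1 releases. With IPC_NOWAIT a failed
 * acquire returns -1 (errno EAGAIN) instead of blocking. */
static int disk_lock_op(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = IPC_NOWAIT };
    return semop(semid, &op, 1);
}

int disk_lock_acquire(int semid) { return disk_lock_op(semid, -1); }
int disk_lock_release(int semid) { return disk_lock_op(semid, +1); }
```

Acquire and release share one helper on purpose: as noted above, only the semop parameters differ. semtimedop could replace semop to wait with a timeout rather than fail immediately.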
The first idea that crossed my mind was to acquire the semaphore in the process that starts the test and release it in the monitor daemon, because a semaphore, unlike a mutex, does not have to be released by the same process that acquired it. That is, one can acquire it in Process A and release it in Process B.
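This idea is not applicable, because the process that acquires the semaphore terminates right after sending the SMART command, and the System V semaphore is released automatically when that process terminates (this is what the kernel does for operations performed with the SEM_UNDO flag).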
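The automatic release on termination is what System V semaphores do for operations made with the SEM_UNDO flag. Whether the framework sets that flag is my assumption; the sketch below only demonstrates the kernel behavior with a private semaphore, and the function name is made up.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* A child acquires the semaphore with the given flags and exits without
 * releasing it. Returns the semaphore value the parent sees afterwards,
 * or -1 on error. */
int value_after_holder_exits(short flags)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg = { .val = 1 };          /* 1 = unlocked */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = flags };
        semop(semid, &op, 1);                /* acquire: 1 -> 0 */
        _exit(0);                            /* terminate while holding it */
    }
    waitpid(pid, NULL, 0);

    int value = semctl(semid, 0, GETVAL);
    semctl(semid, 0, IPC_RMID);              /* remove the demo semaphore */
    return value;
}
```

Calling it with SEM_UNDO shows the value going back to 1 after the child exits, while calling it with 0 leaves the semaphore held. So a lock taken with SEM_UNDO cannot outlive the process that took it.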
As a result, I came up with a second idea: use inotify + poll to watch the cache file, and release the semaphore after the cache file is removed.
The PoC is right here.
Notify mechanisms in the Linux kernel
As of Linux kernel v5.13 (see), there are generally three notification frameworks: dnotify, inotify and fanotify.
In this work, I chose inotify because I need to capture a file-deleted event. dnotify does not help, because releasing a semaphore cannot be achieved, or is difficult to achieve, via a command. fanotify does not help either because, according to the manual, "there was no support for create, delete, and move events" in its original API (the manual adds that support for those events arrived in Linux 5.1).
Generally speaking, using inotify involves the following steps.
- Initialization: inotify_init() or inotify_init1(). On success, we get a file descriptor that stands for the inotify instance.
- Add a watch: inotify_add_watch(). We have to add a watch for certain events on a file or directory. In my case, I want to capture the deletion of a certain file A under /path, so I call inotify_add_watch(inotify_fd, "/path/A", IN_DELETE_SELF);
- Combine with poll() to wait until the event is triggered.
You may see the details in the PoC above.
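For readers without the PoC at hand, the three steps can be sketched as below. This is my own minimal version, not the PoC itself; the function names and the buffer size are assumptions.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Steps 1 and 2: create an inotify instance and watch `path` for its own
 * deletion. Returns the inotify file descriptor, or -1 on error. */
int open_delete_watch(const char *path)
{
    int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (fd == -1) {
        perror("inotify_init1");
        return -1;
    }
    if (inotify_add_watch(fd, path, IN_DELETE_SELF) == -1) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 3: poll() the inotify descriptor. Returns 1 once IN_DELETE_SELF
 * has been read, 0 on timeout, -1 on error. The timeout restarts after
 * every batch of unrelated events. */
int wait_for_delete(int fd, int timeout_ms)
{
    /* Aligned buffer, as recommended by inotify(7). */
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready <= 0)
            return ready;                 /* 0 = timeout, -1 = error */

        ssize_t len = read(fd, buf, sizeof(buf));
        if (len == -1 && errno == EAGAIN)
            continue;                     /* nothing to read after all */
        if (len <= 0)
            return -1;

        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->mask & IN_DELETE_SELF)
                return 1;
            p += sizeof(*ev) + ev->len;   /* events are variable-length */
        }
    }
}
```

In the scenario of this post, the caller would release the semaphore when wait_for_delete() returns 1. Keep in mind that IN_DELETE_SELF is only generated once the last reference to the file is gone, so a process that still holds the cache file open delays the event.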
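There are several events you can choose from. For instance, the inotify(7) manual gives this example, in which dir1/xx and dir2/yy are the only links to the same file and the application watches dir1, dir2, dir1/xx, and dir2/yy: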
unlink("dir2/yy");
    Generates an IN_ATTRIB event for xx (because its link count changes) and an IN_DELETE event for dir2.
unlink("dir1/xx");
    Generates IN_ATTRIB, IN_DELETE_SELF, and IN_IGNORED events for xx, and an IN_DELETE event for dir1.
You may see more details here.
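When it is unclear which events a watch actually receives, it helps to print the mask of every event by name. Below is a small hypothetical helper for that; the function name is made up and the table covers only the events mentioned in this post.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Write the names of the bits set in `mask` into `out`, separated by '|'.
 * Only events mentioned in this post are listed. Returns `out`. */
const char *inotify_mask_names(uint32_t mask, char *out, size_t size)
{
    static const struct { uint32_t bit; const char *name; } table[] = {
        { IN_ATTRIB,      "IN_ATTRIB" },
        { IN_DELETE,      "IN_DELETE" },
        { IN_DELETE_SELF, "IN_DELETE_SELF" },
        { IN_MOVED_FROM,  "IN_MOVED_FROM" },
        { IN_MOVED_TO,    "IN_MOVED_TO" },
        { IN_IGNORED,     "IN_IGNORED" },
    };

    if (size == 0)
        return out;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (!(mask & table[i].bit))
            continue;
        size_t used = strlen(out);
        int n = snprintf(out + used, size - used, "%s%s",
                         used ? "|" : "", table[i].name);
        if (n < 0 || (size_t)n >= size - used) {
            out[used] = '\0';             /* drop the truncated name */
            break;
        }
    }
    return out;
}
```

Printing `inotify_mask_names(ev->mask, buf, sizeof(buf))` for each event read from the inotify descriptor makes sequences like the ones quoted above visible.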
Note that, when you do something like
char *tmpFile = "cache.tmp";
char *cacheFile = "cache";
rename(tmpFile, cacheFile);
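Beyond what the documentation describes, this does not only cause IN_MOVED_FROM and IN_MOVED_TO (reported for the containing directory).
If cacheFile already exists, rename() replaces it, and once the reference count of the old cacheFile drops to zero, IN_DELETE_SELF is generated for it as well. I learned this the hard way.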
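This behavior can be reproduced with a short sketch: watch the existing cache file for IN_DELETE_SELF, replace it with rename(), and collect the event masks that arrive. The function name and the 200 ms timeout are arbitrary choices for illustration.

```c
#include <poll.h>
#include <stdio.h>      /* rename() */
#include <sys/inotify.h>
#include <unistd.h>

/* Watch `cache_file` for IN_DELETE_SELF, then replace it with `tmp_file`
 * via rename(). Returns the OR of all event masks received on the watch
 * (0 if none arrived), or -1 on error. */
int events_after_replace(const char *tmp_file, const char *cache_file)
{
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    int seen = 0;

    int fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1)
        return -1;

    /* The watch is attached to the file that is about to be replaced. */
    if (inotify_add_watch(fd, cache_file, IN_DELETE_SELF) == -1 ||
        rename(tmp_file, cache_file) == -1) {
        close(fd);
        return -1;
    }

    /* Drain whatever events show up within 200 ms of the last one. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    while (poll(&pfd, 1, 200) > 0) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len <= 0)
            break;
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            seen |= (int)ev->mask;
            p += sizeof(*ev) + ev->len;
        }
    }
    close(fd);
    return seen;
}
```

If the returned mask contains IN_DELETE_SELF, a watcher that treats that event as "the cache file is gone" would fire even though a file with the same name still exists. Checking whether the path still exists after the event is one possible way to tell the two cases apart.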
Conclusion
If you want to protect a hardware resource that the OS has no good way to cover, perhaps my experience can be of use. SMART testing is an example: the test runs inside the disk, and one has to query its status via SMART commands.
First, you need to develop a system of cache files.
Second, use a semaphore or any other kind of lock to prevent simultaneous operations.
Third, use the PoC above to release the lock or semaphore at the right time.
Please let me know if you have any questions or if there is any error. Thank you for your time.