The NFS network lock manager

File locking and system crashes

Locking prevents multiple processes from modifying the same file at the same time and allows cooperating processes to synchronize access to shared files. Users reach the network locking service through the standard fcntl system call interface and rarely need detailed knowledge of how it works. The kernel maps user calls to the fcntl system call or the lockf library routine into RPC-based messages to the local lock manager. The fact that the filesystem may be spread across multiple machines is not a complication until a crash occurs.
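As an illustration of the interface described above, the following is a minimal sketch of an application taking a whole-file write lock through fcntl before appending to a shared file. The helper names lock_whole_file and append_locked are hypothetical and chosen only for this example; the fcntl call, the F_SETLKW command, and struct flock are the standard POSIX interface. The application code is the same whether the file is local or NFS-mounted.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: lock or unlock the whole file open on fd.
 * type is F_RDLCK, F_WRLCK, or F_UNLCK. F_SETLKW blocks until the
 * request can be granted. Returns 0 on success, -1 on error. */
int lock_whole_file(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* a length of 0 means "to end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical helper: append text to path while holding a write lock.
 * Returns 0 if the whole string was written, -1 otherwise. */
int append_locked(const char *path, const char *text)
{
    size_t len = strlen(text);
    ssize_t written;
    int fd;

    fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (lock_whole_file(fd, F_WRLCK) < 0) {
        close(fd);
        return -1;
    }
    written = write(fd, text, len);
    lock_whole_file(fd, F_UNLCK);   /* closing fd would also drop the lock */
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}
```

A cooperating process would call append_locked("/mnt/shared/log", "entry\n") (the path here is only an example); a second caller blocks in F_SETLKW until the first releases its lock.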

All computers crash from time to time. In an NFS environment, where multiple machines can have access to the same file at the same time, the process of recovering from a crash is necessarily more complex than in a non-network environment. First, locking is inherently stateful: the server must maintain information about the locks it has granted. If a server crashes, clients with locked files must be able to recover their locks. If a client crashes, its servers must release the locks held by processes running on the client. Second, to preserve NFS's overall transparency, the recovery of lost locks must not require the intervention of the applications themselves.

The Network Lock Manager solves both of these problems by cooperating with the Network Status Monitor so that the lock manager is notified of relevant machine crashes. The lock manager's own protocol then allows it to recover the lock information it needs when crashed machines recover.
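The two recovery cases can be pictured with a toy in-memory model. The sketch below is not the lock manager's protocol or its data structures; every name in it (lock_table, table_lock, table_client_crashed, table_server_restarted) is hypothetical, and it assumes exclusive whole-file locks only. It shows the bookkeeping idea: when a client is reported as crashed, the server drops that client's locks, and when the server restarts with an empty table, clients reestablish their locks by requesting them again.

```c
#include <string.h>

#define MAX_LOCKS 16
#define NAME_LEN  32

/* One entry in a toy server-side lock table (illustrative only). */
struct lock_entry {
    char client[NAME_LEN];      /* host that holds the lock */
    char file[NAME_LEN];        /* file the lock applies to */
    int  in_use;
};

struct lock_table {
    struct lock_entry entries[MAX_LOCKS];
};

/* Record an exclusive lock on file for client. Returns 0 on success,
 * -1 if the file is already locked or the table is full. */
int table_lock(struct lock_table *t, const char *client, const char *file)
{
    struct lock_entry *e;
    int free_slot = -1;
    int i;

    for (i = 0; i < MAX_LOCKS; i++) {
        if (t->entries[i].in_use) {
            if (strcmp(t->entries[i].file, file) == 0)
                return -1;      /* someone already holds this file */
        } else if (free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0)
        return -1;

    e = &t->entries[free_slot];
    strncpy(e->client, client, NAME_LEN - 1);
    e->client[NAME_LEN - 1] = '\0';
    strncpy(e->file, file, NAME_LEN - 1);
    e->file[NAME_LEN - 1] = '\0';
    e->in_use = 1;
    return 0;
}

/* Client crash reported: release every lock that client held.
 * Returns the number of locks released. */
int table_client_crashed(struct lock_table *t, const char *client)
{
    int released = 0;
    int i;

    for (i = 0; i < MAX_LOCKS; i++) {
        if (t->entries[i].in_use &&
            strcmp(t->entries[i].client, client) == 0) {
            t->entries[i].in_use = 0;
            released++;
        }
    }
    return released;
}

/* Server restart: all lock state is lost, so the table starts empty.
 * Clients then reclaim by calling table_lock again. */
void table_server_restarted(struct lock_table *t)
{
    memset(t, 0, sizeof *t);
}
```

In this model the application never takes part in recovery: the release after a client crash and the reclaim after a server restart are both driven by the lock-management layer, which is the transparency requirement stated above.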

For a representation of the relationship between the lock and status managers on the server and the clients, refer to ``NFS locking manager architecture''.

NFS locking manager architecture

The following figure depicts the overall architecture of the locking service.

Architecture of the locking service over NFS

© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 22 April 2004