Sunday 17 March 2013

Trouble Unmounting? Try These Commands


If you receive an "is busy" message when unmounting a file system on AIX, you have a few commands at your disposal to find the cause and achieve a successful unmount. My favorite commands are:
  • ps
  • fuser
  • procfiles
  • lsof
Let's take a look at how to use these commands.

An Unmounting Scenario

Suppose you want to unmount the /storix file system, so you run the command:
  # unmount /storix
(Note that unmount and umount are just different names for the same command.) Because a process is still active on that device, you get the following message in the output:
  umount: 0506-349 Cannot unmount /dev/lv00: The requested resource is busy.
Before you start troubleshooting with ps, fuser, procfiles, or lsof, check for three "gotchas." First, make sure that your own login session isn't currently sitting in that file system. (This is more common than you'd think.) Second, make sure that a process isn't being respawned from /etc/inittab. If it is, comment out the entry and get init to reread the inittab by issuing the command:
  # telinit q
Finally, make sure that all applications using that file system have been shut down.
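The first gotcha is easy to check from a script. What follows is only a minimal sketch: cwd_in_fs is a hypothetical helper name, not a system command, and it assumes you hand it a physical path (for example, from pwd -P) so that symbolic links don't hide where the session really is.

```shell
# Hypothetical helper: succeeds if directory $1 lies on or below mount point $2.
# This is a pure string comparison, so pass a physical path such as "$(pwd -P)".
cwd_in_fs() {
    dir=$1
    mp=${2%/}                    # drop a trailing slash; "/" becomes ""
    case "$dir/" in
        "$mp"/*) return 0 ;;     # /storix and /storix/data match; /storixx does not
        *)       return 1 ;;
    esac
}
```

For example, cwd_in_fs "$(pwd -P)" /storix && cd / would move the current shell out of the way before the unmount is attempted.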
Assuming that the problem wasn't due to any of these three gotchas, you’re going to have to do more digging to determine which process is stopping the unmounting of the /storix file system. To do that, you can use the ps, fuser, procfiles, and lsof commands.
Once you find the culprit process, confirm that it can safely be stopped before you kill it. Killing a process can disrupt the application it belongs to and create even more problems. Be especially cautious with database-related applications, because the process could be in the middle of writing to the file system.
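Once you have decided that a process really can be stopped, one cautious approach is to ask it to exit first and only force the issue if it doesn't. The sketch below is an illustration rather than part of the original procedure: stop_pid is a hypothetical function name, and the five-second default grace period is an arbitrary assumption.

```shell
# Hypothetical helper: send SIGTERM, wait up to $2 seconds (default 5),
# then fall back to SIGKILL if the process is still there.
stop_pid() {
    pid=$1
    grace=${2:-5}
    # If the signal cannot be delivered (process already gone, or not ours),
    # there is nothing more this function can do.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$pid" 2>/dev/null                # last resort
}
```

SIGTERM gives a well-behaved application the chance to flush its writes and close its files, which matters most for the database-style processes mentioned above.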

The Faithful ps Command

Using the good old ps command with grep, you can search for a common pattern. In this example, you would search for storix, because that’s the name of the file system:
  # ps -ef | grep storix
From the ps command's output
  root 5767400 3670250  68 12:26:36 pts/1  3:12 /bin/sh ./storix_rep
you can see the culprit straightaway: the script storix_rep. At this point, you can use the ps -fT command to get the process identifiers (PIDs) of all the child processes:
  # ps -fT 5767400
In this instance, the output shows that there are no child processes involved. Thus, this process can be terminated with the kill command so that the unmount command will be successful. If you know the PID but not the process name, you can use the following command to return the process name associated with the PID:
  # ps eww 5767400
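If you want just the PIDs from the ps search, for example to feed them to another command, a small filter helps. This is a sketch under assumptions: pids_matching is a hypothetical name, and it relies on the PID being the second column, as it is in the ps -ef output shown above. The pattern is passed through the environment so that the filter's own command line doesn't contain the search string and match itself.

```shell
# Hypothetical helper: read "ps -ef" style lines on stdin and print the PID
# (second column) of every line matching the pattern given as $1.
pids_matching() {
    PS_PATTERN=$1 awk '$0 ~ ENVIRON["PS_PATTERN"] { print $2 }'
}
```

It would be used as ps -ef | pids_matching storix.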

Everybody's Favorite Command: fuser

The fuser command is by far the most popular method for listing and killing the processes associated with a file system or file. In this example, you can specify the file system in the command:
  # fuser -u /storix
Alternatively, you can specify the logical volume on which the file system resides, like this:
  # fuser -uc /dev/lv00
The output of these two commands is
  /storix:  5767400c(root) 5898440c(dxtans)
  /dev/lv00:  5767400c(root) 5898440c(dxtans)
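As you can see, fuser returns the PID and login name of each process using the file system; the c after each PID means the process has its current directory there. In this case, two processes are using /storix: PID 5767400, owned by root, and PID 5898440, owned by dxtans. To kill those processes, you can stay with fuser. The -k flag sends SIGKILL to each process, and rerunning fuser -u afterward confirms that nothing is left:
  # fuser -k /storix
  # fuser -u /storix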
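Before reaching for -k, it can be useful to turn the fuser report into a plain list of PIDs and owners that you can review or log. The following sketch assumes the report has the same shape as the sample output above (a name followed by entries such as 5767400c(root)); fuser_users is a hypothetical name.

```shell
# Hypothetical helper: read fuser -u style lines on stdin and print one
# "PID user" pair per process. Entries without "(user)" get "-" as the user.
fuser_users() {
    awk '{
        for (i = 2; i <= NF; i++) {
            pid = $i
            sub(/[^0-9].*$/, "", pid)            # keep only the leading digits
            if (pid == "") continue
            user = "-"
            if (match($i, /\(.*\)/))
                user = substr($i, RSTART + 1, RLENGTH - 2)
            print pid, user
        }
    }'
}
```

Because fuser typically writes the PIDs to standard output and everything else to standard error, you would combine the two streams first, as in fuser -u /storix 2>&1 | fuser_users.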

The Little-Known procfiles Command

The procfiles command will return open files and descriptors from a given process. To use it, you first need to determine the PID. You then use that PID in the procfiles command. In this case, you first use the ps command:
  # ps -ef | grep storix
From its output
  root 5832924 3670250  69 12:44:50 pts/1  1:29 /bin/sh ./storix_rep
you get the PID 5832924, which you then use in the procfiles command:
  # procfiles -n 5832924
Figure 1 shows the procfiles output, which has been truncated to save space.

Figure 1: Truncated output from the procfiles command that identifies problematic files
The output gives you a lot of useful information. In particular, it tells you the actual names of the files that are causing the problem. In this case, there's one problematic file: /usr/local/bin/storix_rep. You could further confirm this with the ps command if desired.

The Big Hitter Command: lsof

The lsof command is the best method for finding open files. Not only is it fast, but it can also be easily used within shell scripts. In this instance, you'd use the command:
  # lsof | grep storix
As Figure 2 shows, the output identifies the culprits straightaway and presents their PIDs, making it quite easy to kill these processes.

Figure 2: Output from the lsof command that identifies problematic processes
If you want the output in a more user-friendly format, you could run the command:
  # lsof /storix
Because everything on AIX is a file, you can use the following command to show all the files used by that PID:
  # lsof -p 5832924
Figure 3 shows the results.

Figure 3: Output from the lsof command that shows all the files used by a PID
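Since lsof fits well into shell scripts, here is one hedged example of doing so. It relies on lsof's -t option, which prints only PIDs, and on the fact that naming a mount point makes lsof report open files across that file system. The function name busy_pids is hypothetical.

```shell
# Hypothetical helper: print the unique PIDs that have files open on the
# file system mounted at $1, one per line, in numeric order.
busy_pids() {
    lsof -t "$1" | sort -un
}
```

Running busy_pids /storix gives a list you can inspect with ps before deciding whether anything should be stopped.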

Forcing an Unmount

Unmounting an NFS-based file system can be troublesome, especially when the client has gone down or is unreachable. If this is the case, use the showmount command to list all the clients that have remotely mounted a file system, then use the unmount command with the force option to unmount the target file system:
  # showmount -a <remote_name>
  # unmount -f <file_system>
The force option can also be used on an enhanced journaled file system (JFS2), although it's always better to first discover which process is using the device, as previously described. To force an unmount of /storix regardless of what is still using it, you could use:
  # unmount -f /storix
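To keep the force option as a last resort, you could wrap the ordinary unmount so that a failure reports who is holding the file system instead of forcing anything. This is a sketch only: try_umount is a hypothetical name, and it deliberately never calls the force option itself.

```shell
# Hypothetical helper: attempt a normal unmount of $1. On failure, list the
# processes using the file system (fuser -cu) and return non-zero.
try_umount() {
    mp=$1
    if umount "$mp"; then
        echo "unmounted $mp"
        return 0
    fi
    echo "$mp is busy; processes using it:" >&2
    fuser -cu "$mp" >&2
    return 1
}
```

With the holders listed, you can decide case by case whether to stop them or, if that isn't possible, to fall back on the forced unmount.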
Most systems administrators use the df command to see their currently mounted file systems. However, if you want to see more information about all the mounted file systems, you can use the mount command:
  # mount
Figure 4 shows sample results.

Figure 4: Output from the mount command that shows details about the mounted file systems

Quickly Identify the Culprits

Getting an "is busy" message is an inconvenience when trying to unmount a file system. However, by using the tools highlighted in this article, you can quickly identify the culprits.
