Sunday, January 5, 2014

How to troubleshoot - RHEL

If the trouble report is coming from a user, gather more data before getting into the troubleshooting phase:
  • Ask for specific error messages and/or output.
  • Ask how the user became aware of the problem when it first occurred.
  • Determine whether the problem is ongoing or intermittent.
  • Ask for detailed steps on how to reproduce the issue.
  • If unknown to you, ask the user what the expected outcome should have been.
  • What changes have been made recently?
  • If the issue is reported by a user, determine what they may have changed recently.
    • Review all recent changes in available change management logs, if they exist.
    • Review patch management systems to determine if the environment has been updated.
    • Check configuration file time stamps and run comparisons on configuration files related to the issue.
    • If time permits, contact anyone who has access to make changes to the environment to discover if undocumented changes have occurred.
  • In RHEL, the main log file is /var/log/messages; this is the first log to check.
    • Not all applications write entries into this log file, and not all log files are written in the /var/log directory.
    • You may have to look at the application documentation to determine the correct log file location.
      • Documentation for most RHEL packages is available in /usr/share/doc/[packagename].
      • The application's man page can also show where its log files are located.
  • A quick way to determine which logs are being updated in /var/log is the ls -ltr command, which lists the files in a directory sorted by time stamp, with the most recently modified file listed last.

# ls -ltr /var/log
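The same ls -ltr listing also helps with the configuration file checks suggested earlier (time stamps and comparisons). A minimal sketch, assuming a known-good copy of the file exists to compare against; the temporary directory and the httpd.conf file names and contents are made up for illustration:

```shell
# Hypothetical demo files stand in for a real config file and its known-good backup.
workdir=$(mktemp -d)
printf 'Listen 80\nServerName www.example.com\n' > "$workdir/httpd.conf.orig"
printf 'Listen 8080\nServerName www.example.com\n' > "$workdir/httpd.conf"

# List the files sorted by modification time, most recently modified last.
ls -ltr "$workdir"

# Find files modified within the last 24 hours.
recent=$(find "$workdir" -type f -mtime -1)
echo "$recent"

# Show what differs between the backup and the current file.
# diff exits with status 1 when the files differ, so that is not treated as a failure.
changes=$(diff -u "$workdir/httpd.conf.orig" "$workdir/httpd.conf" || true)
echo "$changes"
```

On a real system the same commands would be pointed at a directory such as /etc and at whatever backup copy is available.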

  • To see kernel and hardware related events you can use the dmesg command.
    • This command will display kernel related events that have occurred recently.
    • The system only keeps a small amount of this data resident and will overwrite it as new events occur.
    • The /var/log/dmesg file contains a snapshot of dmesg output at boot time. It is useful for determining what may have happened to the system's hardware when the system last started.
  • Kernel events should also show up in /var/log/messages, but are sometimes harder to find among all the other logging that goes into that file.
  • When dealing with log files, knowing how to parse them is extremely important.
  • The grep command is probably the most used command to find text in log files.
    • The command grep "httpd" /var/log/messages will return any line containing the string httpd in /var/log/messages.
    • Conversely, grep -v "httpd" /var/log/messages will return any line not containing the string httpd in /var/log/messages.
  • The grep command can also be used to parse output of commands:

# ps ax | grep init
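The basic grep and grep -v searches described above can be tried safely on a scratch file. This is only a sketch: the log lines below are invented sample data standing in for /var/log/messages, and the host name web01 is hypothetical.

```shell
# Hypothetical sample data standing in for /var/log/messages.
log=$(mktemp)
cat > "$log" <<'EOF'
May  6 18:50:01 web01 httpd[2101]: server reached MaxClients setting
May  6 18:51:12 web01 kernel: eth0: link down
May  6 18:52:30 web01 sshd[3050]: Accepted password for admin
May  6 18:53:47 web01 httpd[2101]: caught SIGTERM, shutting down
EOF

# Lines that contain the string httpd.
grep "httpd" "$log"

# Every line that does not contain httpd.
grep -v "httpd" "$log"

# Kernel events only, which are easy to miss among the other entries.
kernel_lines=$(grep "kernel:" "$log")
echo "$kernel_lines"
```

Filtering on "kernel:" is one way to pull kernel events out of a busy messages file.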

  • More information on grep can be obtained in its man page, grep --help, or in /usr/share/doc/grep-*.
  • Multiple strings can be specified when using grep by using the -E flag and the pipe (|) character between search strings to specify the or operator:
# ps ax | grep -E "findSTRING1|findSTRING2|findSTRING3"
# ps ax | grep -Ev "ignorestring1|ignorestring2"
  • Multiple grep commands can be chained together. In this example, we return lines containing findstring but ignore lines containing ignorestring.
# ps ax | grep findstring | grep -v ignorestring
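To see how -E alternation, -Ev, and chained greps behave without depending on what is running on the system, here is a small sketch against a saved listing. The process list is invented sample data in roughly the shape of ps ax output.

```shell
# Hypothetical sample data shaped like "ps ax" output.
procs=$(mktemp)
cat > "$procs" <<'EOF'
    1 ?        Ss     0:02 /sbin/init
 1204 ?        Ss     0:00 /usr/sbin/sshd
 2101 ?        Ss     0:05 /usr/sbin/httpd
 2102 ?        S      0:01 /usr/sbin/httpd
 3300 ?        Ss     0:00 crond
EOF

# Keep lines that match either pattern (the "or" operator).
either=$(grep -E "sshd|crond" "$procs")
echo "$either"

# Drop lines that match any of the patterns.
neither=$(grep -Ev "httpd|init" "$procs")
echo "$neither"

# Chain two greps: keep /usr/sbin lines, then drop the httpd ones.
chained=$(grep "/usr/sbin" "$procs" | grep -v "httpd")
echo "$chained"
```

With this sample, the first two searches happen to return the same two lines (sshd and crond), and the chained search leaves only the sshd line.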

Table 1: Common grep Options

Option      Description
-i          Perform a case-insensitive search
-v          Exclude lines that contain the pattern
-c          Display a count of lines with the matching pattern
-l          Only list file names; do not display the matched lines
-n          Precede matched lines with line number
--color     Highlight the matched string
-A, -B      When followed by a number, these options print that many lines after or before each match. This is useful for seeing the context in which a match appears within a file.
-r          Perform a recursive search of files starting with the named directory
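A few of the options from Table 1 can be tried on scratch files. This is a rough sketch using the standard grep flags -i, -c, -n, -A, -r and -l; the directory, file names and contents are made up for the example.

```shell
# Hypothetical scratch directory with two small "log" files.
logdir=$(mktemp -d)
printf 'Error: disk full\nall good\nerror: retrying\ndone\n' > "$logdir/app.log"
printf 'nothing to report\n' > "$logdir/other.log"

# -i ignores case and -c prints the number of matching lines instead of the lines.
count=$(grep -ic "error" "$logdir/app.log")
echo "$count"

# -n prefixes each match with its line number.
numbered=$(grep -n "retrying" "$logdir/app.log")
echo "$numbered"

# -A 1 also prints one line of context after each match.
context=$(grep -A 1 "retrying" "$logdir/app.log")
echo "$context"

# -r searches the directory recursively and -l prints only the matching file names.
files=$(grep -rl "disk full" "$logdir")
echo "$files"
```

Adding --color to any of the searches that print matched lines highlights the matched string on a terminal.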

  • The head and tail commands help limit the amount of data the system administrator has to go through to read files or parse command output.
  • The head -number command will show the first number lines of command output or of a text file.

# head -1 /etc/passwd

  • The tail -number command will show the last number lines of command output or of a text file.

# tail -2 /etc/passwd
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
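For a self-contained way to try head and tail, the sketch below uses a made-up five-line file. It spells the count as -n number, which is the portable form of the head -number and tail -number shorthand used above.

```shell
# Hypothetical five-line sample file.
sample=$(mktemp)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$sample"

# First line only.
first=$(head -n 1 "$sample")
echo "$first"

# Last two lines.
last_two=$(tail -n 2 "$sample")
echo "$last_two"

# Both commands also read from a pipe: here, lines 2 and 3 of the file.
middle=$(head -n 3 "$sample" | tail -n 2)
echo "$middle"
```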

  • The tail -n +number command will show all of the lines of command output or of a text file starting at line number, so tail -n +2 skips only the first line.
  • This is useful to remove a header line in command output:
# ps aux | tail -n +2
root      1       0.0     00      19328   1412    ?       Ss      May04          0:02           /sbin/init
root      2       0.0     00      0       0       ?       Ss      May04          0:02           [kthreadd]
root      3       0.0     00      0       0       ?       Ss      May04          0:02           [migration/0]
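The header-stripping behavior of tail -n +2 is easy to confirm on a small saved report. A minimal sketch, with an invented three-line report standing in for real command output:

```shell
# Hypothetical report: one header line followed by two data rows.
report=$(mktemp)
printf 'USER PID COMMAND\nroot 1 /sbin/init\nroot 2 [kthreadd]\n' > "$report"

# Start output at line 2, which drops only the header.
body=$(tail -n +2 "$report")
echo "$body"

# Counting the remaining lines now gives the number of data rows.
rows=$(echo "$body" | wc -l)
echo "$rows"
```

The same idea applies to live output; for example, piping ps aux through tail -n +2 and then wc -l counts processes without counting the header line.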

  • To follow a text file as it is updated in real time, use tail -f (the shorter tailf command can be used as well):

# tail -f /var/log/messages
May 6 18:53:47  nas01     smbd[14012]:  failed to retrieve printer list: NT-STATUS_UNSUCCES
