Techrock2013: Troubleshooting

Killing Zombie Process

Zombie (defunct) processes are dead processes that have finished executing and have released their CPU and memory resources, but still have an entry in the process table.

Usually, when a child or sub-process finishes its task and exits, its parent is supposed to call the “wait” system call and collect the status of the process. Until the parent process checks the child’s exit status, the child is a zombie process waiting for its parent to check its status.

You can use the ‘top’ or ‘ps aux’ command to check for zombie processes. Every process with “Z” in its STAT column is a zombie process.
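The check can also be scripted. The sketch below is only an illustration: `list_zombies` is a hypothetical helper name, and it reads rows of "PID PPID STAT COMMAND" from standard input so it can be tried on saved output as well as on a live system.

```shell
# list_zombies: read "PID PPID STAT COMMAND" rows on stdin and print
# the PID, parent PID and command of every row whose state starts with Z.
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Typical use on a live system (each "=" suppresses that column's header):
#   ps -e -o pid= -o ppid= -o stat= -o comm= | list_zombies
```

Printing the parent PID next to each zombie is convenient because the parent, not the zombie, is the process you have to act on.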

A zombie is already dead, so it cannot be killed directly. To get rid of zombie processes, you can send a SIGCHLD signal to the zombie’s parent process, which prompts the parent to reap its zombie children.

kill -s SIGCHLD {PPID}

If the above command doesn’t work, the last option is to kill the parent process.

You can find the parent’s process ID with the following command:

ps -o ppid= -p {zombie_process_id}

kill -9 {PPID}

Once a zombie process loses its parent process, it becomes an orphan and is adopted by the “init” process. Init periodically executes the wait system call to reap any zombies that have init as their parent.
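On Linux, the parent PID can also be read straight from /proc, which is handy inside scripts. This is a minimal sketch, assuming /proc is mounted; `parent_pid` is a hypothetical helper name.

```shell
# parent_pid: print the parent PID of the given PID by reading the
# "PPid:" field of /proc/<pid>/status (Linux only).
parent_pid() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Example: ask the parent of zombie {zombie_process_id} to reap it.
#   kill -s SIGCHLD "$(parent_pid {zombie_process_id})"
```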

Orphan Process

Orphaned processes are processes whose parent process has died. Re-parenting occurs immediately: the ‘init’ process adopts the orphaned ones. Even after re-parenting, the process is still considered an orphan, because the parent that created it is dead.

In order to find any orphaned process, issue the following command:

[root@nagios ~]# ps -elf | head -1; ps -elf | awk '{if ($5 == 1 && $3 != "root") {print $0}}'| head

F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD

5 S dbus 872 1 0 80 0 - 742 - May16 ? 00:00:00 dbus-daemon --system

5 S 68 901 1 0 80 0 - 1466 - May16 ? 00:00:00 hald

1 S nagios 25443 1 0 80 0 - 3630 - Jun06 ? 00:47:34 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

You can kill the orphan process as follows:

kill -15 {PID}

If the above doesn’t work, you can try with -9 option as follows:

kill -9 {PID}
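The two steps above (try -15 first, fall back to -9) can be combined in a small function. This is a sketch under assumptions: `graceful_kill` and its timeout argument are hypothetical, not part of any standard tool.

```shell
# graceful_kill PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds
# (default 5) for the process to disappear, then send SIGKILL.
# Note: "kill -0" also succeeds for a dead process that has not been
# reaped yet, so SIGKILL may still be sent to an unreaped zombie.
graceful_kill() {
    local pid=$1 timeout=${2:-5} i=0
    kill -TERM "$pid" 2>/dev/null || return 1   # no such process or no permission
    while [ "$i" -lt "$timeout" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # process is gone
        sleep 1
        i=$((i + 1))
    done
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Example: graceful_kill {PID} 10
```

Giving the process a few seconds after SIGTERM lets it flush files and clean up, which SIGKILL never allows.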

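How to Find a Single User’s Disk Space Usage

The command below adds up the sizes (column 7 of the find -ls output, in bytes) of all files owned by the user naresh. The -fstype value must match your filesystem: 4.2 is the BSD/SunOS type, so on Linux use your own type, for example ext4.

find / -user naresh -fstype ext4 ! -path '/dev/*' -ls | awk '{sum+=$7}; END {print "User naresh total disk use = " sum}'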
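The same idea can be wrapped in a reusable function. This is a sketch assuming GNU find (for -printf); `user_disk_bytes` is a hypothetical helper name, and unlike the command above it counts only regular files and stays on one filesystem.

```shell
# user_disk_bytes USER DIR: total size in bytes of the regular files
# under DIR owned by USER (a user name or a numeric UID).
# -xdev keeps find on a single filesystem; -printf '%s\n' prints each
# file's size in bytes (GNU find).
user_disk_bytes() {
    find "$2" -xdev -type f -user "$1" -printf '%s\n' 2>/dev/null |
        awk '{ sum += $1 } END { print sum + 0 }'
}

# Example: user_disk_bytes naresh /home
```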

Swap management

Step1:First check how much swap space the system has and how much of it is in use

#free -m

#swapon -s

-s prints a summary of swap usage per device

#cat /proc/swaps

free -m reports the figures in megabytes (m); if you want the output in kilobytes, use free -k

Step2:Before starting any swap management, take precautions: make sure no user is logged in, then switch off all swap

#swapoff -a

Step3:Check whether the system has any raw (unallocated) disk space

#fdisk -l

Step4:If the system has free space, create a partition of the required size with the Linux swap type (partition type 82)

#fdisk /dev/hda

p #press p to print the partition table

n #press n to create a new partition

+256M #specify the amount of swap required (fdisk expects a leading + for a size)

t #press t to change the partition type

8 #enter the number of the partition to change (here the swap is created on /dev/hda8)

82 #specify partition type 82 (Linux swap)

p #press p to print the partition table again and confirm the type of /dev/hda8

w #press w to write the changes to the partition table (if anything went wrong, press q instead to quit fdisk without saving)

Step5:Inform the kernel of the partition table changes so that there is no need to restart the system/server

#partprobe

Step6:Make the swap partition permanent by adding it to the /etc/fstab file

#vi /etc/fstab

/dev/hda8 swap swap defaults 0 0

Add the above entry to the fstab file, then save it and exit the editor

:wq

Step7:Format the newly created partition by creating a swap signature on it

#mkswap /dev/hda8

Note:step6 and step7 are interchangeable.

Step8:update the mount table to kernel

#mount -a

Step9:Now turn the swap on so that it is available for use.

#swapon -a

Step10:Check whether the swap has been updated

#free -m

#swapon -s

#cat /proc/swaps
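To read the result of the last check more easily, the kilobyte figures in /proc/swaps can be converted to megabytes. This is a minimal sketch; `swap_summary` is a hypothetical helper name, and the optional file argument exists only so the function can be tried on a saved copy.

```shell
# swap_summary [FILE]: for each swap area listed in FILE (default
# /proc/swaps) print "name size_MB used_MB". The Size and Used columns
# of /proc/swaps are in kilobytes; the first line is a header.
swap_summary() {
    awk 'NR > 1 { printf "%s %d %d\n", $1, $3 / 1024, $4 / 1024 }' "${1:-/proc/swaps}"
}

# Example: swap_summary
```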

Removing swap:

Step1:Before doing anything with swap, first switch it off

#swapoff -a

Step2:Remove or comment out the swap entry in the /etc/fstab file

#vi /etc/fstab

then save and exit

Step3:Update the kernel with the mount table changes

#mount -a

Step4:Remove the partition used by swap (run fdisk on the disk, not on the partition)

#fdisk /dev/hda

d #press d to delete the partition

8 #specify the partition no to be deleted

w #press w to write the changes to partition table and quit the fdisk utility

Step5:Inform the kernel of the partition table changes without rebooting the system/server

#partprobe

Step6:Now turn the remaining swap back on

#swapon -a

Step7:Now check whether the swap has been updated properly

#free -m

#swapon -s

#cat /proc/swaps
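Before rebooting, it can be worth confirming that no active swap entry is left in /etc/fstab. This is a sketch only; `fstab_swap_entries` is a hypothetical helper name, and the optional file argument lets you point it at a copy of the file.

```shell
# fstab_swap_entries [FILE]: print the device or file of every
# non-commented entry in FILE (default /etc/fstab) whose filesystem
# type (third field) is "swap".
fstab_swap_entries() {
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}

# Example: an empty result means no swap will be enabled from fstab.
#   fstab_swap_entries
```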

Advanced Swap management

Creating a separate swap partition is the preferable setup, but what can we do if the server has no raw space left and the swap still has to be increased to keep the system productive?

Linux provides a solution for this situation: we can create a swap file inside an already existing, in-use partition, provided that partition has sufficient free space.

Step1:Switch off all the swap before any swap management

#swapoff -a

Step2:Decide how much swap is required (here 128MB) and execute the following command with count equal to 131072 (131072 blocks of 1024 bytes make 128MB)

#dd if=/dev/zero of=/swapfile bs=1024 count=131072

Step3:Now set up this file so that it can be used as swap

#mkswap /swapfile

Step4:Edit the /etc/fstab file to register the swap file as permanent swap space

#vi /etc/fstab

/swapfile swap swap defaults 0 0

:wq

Step5:Update the kernel about the mount table changes

#mount -a

Step6:After setting up the swap, turn it on

#swapon -a

Step7:Check whether the swap space has been updated by using the following commands

#free -m

#swapon -s

#cat /proc/swaps
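The count passed to dd is simply the size in megabytes multiplied by 1024, because each block is 1024 bytes. The sketch below only prints the commands for a given size so they can be reviewed before being run as root; `swapfile_commands` is a hypothetical helper name, and the chmod line is an addition that is not part of the steps above (mkswap warns when a swap file is readable by other users).

```shell
# swapfile_commands PATH SIZE_MB: print (do not run) the commands that
# create and enable a swap file of SIZE_MB megabytes at PATH.
swapfile_commands() {
    local path=$1 size_mb=$2
    local count=$((size_mb * 1024))     # number of 1024-byte blocks
    printf 'dd if=/dev/zero of=%s bs=1024 count=%s\n' "$path" "$count"
    printf 'chmod 600 %s\n' "$path"     # assumption: added for safety
    printf 'mkswap %s\n' "$path"
    printf 'swapon %s\n' "$path"
}

# Example: swapfile_commands /swapfile 128
```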

Removing swap:

Step1:Before doing anything with swap, first switch it off

#swapoff -v /swapfile

Step2:Remove or comment out the swap file entry in the /etc/fstab file

#vi /etc/fstab

then save and exit

Step3:Update the kernel with the mount table changes

#mount -a

Step4:Remove the swapfile permanently

#rm /swapfile

Linux Kernel panic reboot

By default, after a kernel panic the Linux kernel simply halts and waits for a system administrator to press the reset button or power-cycle the machine. This is because the "kernel.panic" parameter is set to 0.

[root@linux23 ~]# cat /proc/sys/kernel/panic
0
[root@linux23 ~]# sysctl -a | grep kernel.panic
kernel.panic = 0
[root@linux23 ~]#

To change this and make Linux reboot automatically after a kernel panic, set the "kernel.panic" parameter to an integer N greater than zero, where N is the number of seconds to wait before the automatic reboot.

For example, if you set N = 10, the system waits 10 seconds before rebooting automatically. To make the setting permanent, add it to /etc/sysctl.conf.

[root@linux23 ~]# echo "10" > /proc/sys/kernel/panic
[root@linux23 ~]# grep kernel.panic /etc/sysctl.conf
kernel.panic = 10
[root@linux23 ~]#
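Editing /etc/sysctl.conf by hand works, but the change can also be scripted. This is a sketch, assuming GNU sed (for -i); `set_sysctl_conf` is a hypothetical helper name. It rewrites the line if the key is already present and appends it otherwise.

```shell
# set_sysctl_conf KEY VALUE [FILE]: set "KEY = VALUE" in FILE
# (default /etc/sysctl.conf), replacing an existing line for KEY
# or appending a new one.
set_sysctl_conf() {
    local key=$1 value=$2 file=${3:-/etc/sysctl.conf}
    local key_re
    # Escape the dots so "kernel.panic" is matched literally.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (as root), then load the file with "sysctl -p":
#   set_sysctl_conf kernel.panic 10
```

Matching the key up to the "=" sign keeps related parameters such as kernel.panic_on_oops untouched.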

This removes the need for manual intervention after a kernel panic. Set up a kernel crash dump facility or netdump to capture the kernel crash debug information.

Display Date And Time For Each Command

If HISTTIMEFORMAT is set, the time stamp associated with each history entry is written to the history file, marked with the history comment character. Define the environment variable as follows:
$ HISTTIMEFORMAT="%d/%m/%y %T "
OR
echo 'export HISTTIMEFORMAT="%d/%m/%y %T "' >> ~/.bash_profile
Where,
%d – Day
%m – Month
%y – Year
%T – Time
To see the history, type:
# history
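HISTTIMEFORMAT takes the same strftime-style format codes as the date command, so a format can be previewed before it is put into ~/.bash_profile. This is a sketch assuming GNU date (for -d @SECONDS); `preview_time_format` is a hypothetical helper name, and the output is shown in UTC.

```shell
# preview_time_format FORMAT EPOCH: print the time EPOCH (seconds since
# 1970-01-01 00:00:00 UTC) rendered with FORMAT, in UTC.
preview_time_format() {
    TZ=UTC date -d "@$2" +"$1"
}

# Example: preview the format used above for the current time.
#   preview_time_format '%d/%m/%y %T' "$(date +%s)"
```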

Clear RAM cache

The Linux kernel lets us drop the page cache and/or the dentry and inode caches, which can free a lot of memory.
The following command shows the current memory usage.
[root@server~]# free -m
In order to free pagecache use following command:
[root@server~]# echo 1 > /proc/sys/vm/drop_caches
In order to free dentries and inodes:
[root@server~]# echo 2 > /proc/sys/vm/drop_caches
In order to free pagecache, dentries and inodes:
[root@server]# echo 3 > /proc/sys/vm/drop_caches
This operation is non-destructive, but dirty objects are not freeable, so run sync first to write them to disk.
[root@server]# sync; echo 3 > /proc/sys/vm/drop_caches
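To see how much memory a cache drop actually released, the Cached figure in /proc/meminfo can be compared before and after. This is a minimal sketch; `meminfo_kb` is a hypothetical helper name, and the optional file argument exists only so the function can be tried on a saved copy.

```shell
# meminfo_kb FIELD [FILE]: print the value in kB of FIELD (for example
# Cached or MemFree) from FILE (default /proc/meminfo).
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Example (as root):
#   before=$(meminfo_kb Cached)
#   sync; echo 3 > /proc/sys/vm/drop_caches
#   after=$(meminfo_kb Cached)
#   echo "page cache freed: $(( (before - after) / 1024 )) MB"
```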


Thursday 10 January 2013
