Showing posts with label Unix. Show all posts

Wednesday, September 12, 2012

Daemon and rstatd daemon

In Unix and other multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user. Typically daemon names end with the letter d: for example, syslogd is the daemon that implements the system logging facility and sshd is a daemon that services incoming SSH connections.

In a Unix environment, the parent process of a daemon is often, but not always, the init process. A daemon is usually created by a process forking a child process and then immediately exiting, thus causing init to adopt the child process. In addition, a daemon or the operating system typically must perform other operations, such as dissociating the process from any controlling terminal (tty). Such procedures are often implemented in various convenience routines such as daemon(3) in Unix.

Systems often start daemons at boot time: they often serve the function of responding to network requests, hardware activity, or other programs by performing some task. Daemons can also configure hardware (like udevd on some GNU/Linux systems), run scheduled tasks (like cron), and perform a variety of other tasks.

Daemon stands for Disk and Execution Monitor. A daemon is a long-running background process that answers requests for services. The term originated with Unix, but most operating systems use daemons in some form or another. In Windows NT, 2000, and XP, for example, daemons are called "services". In Unix, the names of daemons conventionally end in "d". Some examples include inetd, httpd, nfsd, sshd, named, and lpd.

rstatd Daemon

Purpose
Returns performance statistics obtained from the kernel.

Syntax
/usr/sbin/rpc.rstatd

Description
The rstatd daemon is a server that returns performance statistics obtained from the kernel. The rstatd daemon is normally started by the inetd daemon.

Files
/etc/inetd.conf TCP/IP configuration file that starts RPC daemons and other TCP/IP daemons.
/etc/services Contains an entry for each server available through Internet.

Thursday, September 6, 2012

Top 10 SQL Server Counters for Monitoring SQL Server Performance

Do you have a list of SQL Server Counters you review when monitoring your SQL Server environment? Counters allow you a method to measure current performance, as well as performance over time. Identifying the metrics you like to use to measure SQL Server performance and collecting them over time gives you a quick and easy way to identify SQL Server problems, as well as graph your performance trend over time.

Below is my top 10 list of SQL Server counters in no particular order. For each counter I have described what it is, and in some cases I have described the ideal value of these counters. This list should give you a starting point for developing the metrics you want to use to measure database performance in your SQL Server environment.

1. SQLServer: Buffer Manager: Buffer cache hit ratio

The buffer cache hit ratio counter represents how often SQL Server is able to find data pages in its buffer cache when a query needs a data page. The higher this number the better, because it means SQL Server was able to get data for queries out of memory instead of reading from disk. You want this number to be as close to 100 as possible. Having this counter at 100 means that 100% of the time SQL Server has found the needed data pages in memory. A low buffer cache hit ratio could indicate a memory problem.

2. SQLServer: Buffer Manager: Page life expectancy

The page life expectancy counter measures how long pages stay in the buffer cache in seconds. The longer a page stays in memory, the more likely SQL Server will not need to read from disk to resolve a query. You should watch this counter over time to determine a baseline for what is normal in your database environment. Some say anything below 300 (or 5 minutes) means you might need additional memory.

3. SQLServer: SQL Statistics: Batch Requests/Sec

Batch Requests/Sec measures the number of batches SQL Server is receiving per second. This counter is a good indicator of how much activity is being processed by your SQL Server box. The higher the number, the more queries are being executed on your box. Like many counters, there is no single number that can be used universally to indicate your machine is too busy. Today’s machines are getting more and more powerful all the time and therefore can process more batch requests per second. You should review this counter over time to determine a baseline number for your environment.

4. SQLServer: SQL Statistics: SQL Compilations/Sec

The SQL Compilations/Sec measure the number of times SQL Server compiles an execution plan per second. Compiling an execution plan is a resource-intensive operation. Compilations/Sec should be compared with the number of Batch Requests/Sec to get an indication of whether or not complications might be hurting your performance. To do that, divide the number of batch requests by the number of compiles per second to give you a ratio of the number of batches executed per compile. Ideally you want to have one compile per every 10 batch requests.

5. SQLServer: SQL Statistics: SQL Re-Compilations/Sec

When the execution plan is invalidated due to some significant event, SQL Server will re-compile it. The Re-compilations/Sec counter measures the number of time a re-compile event was triggered per second. Re-compiles, like compiles, are expensive operations so you want to minimize the number of re-compiles. Ideally you want to keep this counter less than 10% of the number of Compilations/Sec.

6. SQLServer: General Statistics: User Connections

The user connections counter identifies the number of different users that are connected to SQL Server at the time the sample was taken. You need to watch this counter over time to understand your baseline user connection numbers. Once you have some idea of your high and low water marks during normal usage of your system, you can then look for times when this counter exceeds the high and low marks. If the value of this counter goes down and the load on the system is the same, then you might have a bottleneck that is not allowing your server to handle the normal load. Keep in mind though that this counter value might go down just because less people are using your SQL Server instance.

7. SQLServer: Locks: Lock Waits / Sec: _Total

In order for SQL Server to manage concurrent users on the system, SQL Server needs to lock resources from time to time. The lock waits per second counter tracks the number of times per second that SQL Server is not able to retain a lock right away for a resource. Ideally you don't want any request to wait for a lock. Therefore you want to keep this counter at zero, or close to zero at all times.

8. SQLServer: Access Methods: Page Splits / Sec

This counter measures the number of times SQL Server had to split a page when updating or inserting data per second. Page splits are expensive, and cause your table to perform more poorly due to fragmentation. Therefore, the fewer page splits you have the better your system will perform. Ideally this counter should be less than 20% of the batch requests per second.

9. SQLServer: General Statistic: Processes Block

The processes blocked counter identifies the number of blocked processes. When one process is blocking another process, the blocked process cannot move forward with its execution plan until the resource that is causing it to wait is freed up. Ideally you don't want to see any blocked processes. When processes are being blocked you should investigate.

10. SQLServer: Buffer Manager: Checkpoint Pages / Sec

The checkpoint pages per second counter measures the number of pages written to disk by a checkpoint operation. You should watch this counter over time to establish a baseline for your systems. Once a baseline value has been established you can watch this value to see if it is climbing. If this counter is climbing, it might mean you are running into memory pressures that are causing dirty pages to be flushed to disk more frequently than normal.

Friday, August 31, 2012

vmstat command in Unix

vmstat command

The first tool to use is the vmstat command, which quickly provides compact information about various system resources and their related performance problems.

The vmstat command reports statistics about kernel threads in the run and wait queue, memory, paging, disks, interrupts, system calls, context switches, and CPU activity. The reported CPU activity is a percentage breakdown of user mode, system mode, idle time, and waits for disk I/O.

Note: If the vmstat command is used without any interval, then it generates a single report. The single report is an average report from when the system was started. You can specify only the Count parameter with the Interval parameter. If the Interval parameter is specified without the Count parameter, then the reports are generated continuously.

As a CPU monitor, the vmstat command is superior to the iostat command in that its one-line-per-report output is easier to scan as it scrolls and there is less overhead involved if there are many disks attached to the system. The following example can help you identify situations in which a program has run away or is too CPU-intensive to run in a multiuser environment.

# vmstat 2
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10
 1  0 22506  1609   0   0   0   0    0   0 214 1476 186 48 37  0 16
 0  0 22498  1582   0   0   0   0    0   0 248 1470 226 55 36  0  9

 2  0 22534  1465   0   0   0   0    0   0 238  903 239 77 23  0  0
 2  0 22534  1445   0   0   0   0    0   0 209 1142 205 72 28  0  0
 2  0 22534  1426   0   0   0   0    0   0 189 1220 212 74 26  0  0
 3  0 22534  1410   0   0   0   0    0   0 255 1704 268 70 30  0  0
 2  1 22557  1365   0   0   0   0    0   0 383  977 216 72 28  0  0

 2  0 22541  1356   0   0   0   0    0   0 237 1418 209 63 33  0  4
 1  0 22524  1350   0   0   0   0    0   0 241 1348 179 52 32  0 16
 1  0 22546  1293   0   0   0   0    0   0 217 1473 180 51 35  0 14

This output shows the effect of introducing a program in a tight loop to a busy multiuser system. The first three reports (the summary has been removed) show the system balanced at 50-55 percent user, 30-35 percent system, and 10-15 percent I/O wait. When the looping program begins, all available CPU cycles are consumed. Because the looping program does no I/O, it can absorb all of the cycles previously unused because of I/O wait. Worse, it represents a process that is always ready to take over the CPU when a useful process relinquishes it. Because the looping program has a priority equal to that of all other foreground processes, it will not necessarily have to give up the CPU when another process becomes dispatchable. The program runs for about 10 seconds (five reports), and then the activity reported by the vmstat command returns to a more normal pattern.

Optimum use would have the CPU working 100 percent of the time. This holds true in the case of a single-user system with no need to share the CPU. Generally, if us + sy time is below 90 percent, a single-user system is not considered CPU constrained. However, if us + sy time on a multiuser system exceeds 80 percent, the processes may spend time waiting in the run queue. Response time and throughput might suffer.

To check if the CPU is the bottleneck, consider the four cpu columns and the two kthr (kernel threads) columns in the vmstat report. It may also be worthwhile looking at the faults column:

cpu
Percentage breakdown of CPU time usage during the interval. The cpu columns are as follows:
- us
  The us column shows the percent of CPU time spent in user mode. A UNIX process can execute in either user mode or system (kernel) mode. When in user mode, a process executes within its application code and does not require kernel resources to perform computations, manage memory, or set variables.
- sy
  The sy column details the percentage of time the CPU was executing a process in system mode. This includes CPU resource consumed by kernel processes (kprocs) and others that need access to kernel resources. If a process needs kernel resources, it must execute a system call and is thereby switched to system mode to make that resource available. For example, reading or writing of a file requires kernel resources to open the file, seek a specific location, and read or write data, unless memory mapped files are used.
- id
  The id column shows the percentage of time which the CPU is idle, or waiting, without pending local disk I/O. If there are no threads available for execution (the run queue is empty), the system dispatches a thread called wait, which is also known as the idle kproc. On an SMP system, one wait thread per processor can be dispatched. The report generated by the ps-k or -g 0 option) identifies this as kproc or wait. If the ps report shows a high aggregate time for this thread, it means there were significant periods of time when no other thread was ready to run or waiting to be executed on the CPU. The system was therefore mostly idle and waiting for new tasks. command (with the
- wa
  The wa column details the percentage of time the CPU was idle with pending local disk I/O and NFS-mounted disks. If there is at least one outstanding I/O to a disk when wait is running, the time is classified as waiting for I/O. Unless asynchronous I/O is being used by the process, an I/O request to disk causes the calling process to block (or sleep) until the request has been completed. Once an I/O request for a process completes, it is placed on the run queue. If the I/Os were completing faster, more CPU time could be used.
  
  A wa value over 25 percent could indicate that the disk subsystem might not be balanced properly, or it might be the result of a disk-intensive workload.
  
  For information on the change made to wa, see Wait I/O time reporting.
kthr
Number of kernel threads in various queues averaged per second over the sampling interval. The kthr columns are as follows:
- r
  Average number of kernel threads that are runnable, which includes threads that are running and threads that are waiting for the CPU. If this number is greater than the number of CPUs, there is at least one thread waiting for a CPU and the more threads there are waiting for CPUs, the greater the likelihood of a performance impact.
- b
  Average number of kernel threads in the VMM wait queue per second. This includes threads that are waiting on filesystem I/O or threads that have been suspended due to memory load control.
  
  If processes are suspended due to memory load control, the blocked column (b) in the vmstat report indicates the increase in the number of threads rather than the run queue.
- p
  For vmstat -I The number of threads waiting on I/Os to raw devices per second. Threads waiting on I/Os to filesystems would not be included here.
faults
Information about process control, such as trap and interrupt rate. The faults columns are as follows:
- in
  Number of device interrupts per second observed in the interval. Additional information can be found in Assessing disk performance with the vmstat command.
- sy
  The number of system calls per second observed in the interval. Resources are available to user processes through well-defined system calls. These calls instruct the kernel to perform operations for the calling process and exchange data between the kernel and the process. Because workloads and applications vary widely, and different calls perform different functions, it is impossible to define how many system calls per-second are too many. But typically, when the sy column raises over 10000 calls per second on a uniprocessor, further investigations is called for (on an SMP system the number is 10000 calls per second per processor). One reason could be "polling" subroutines like the select() subroutine. For this column, it is advisable to have a baseline measurement that gives a count for a normal sy value.
- cs
  Number of context switches per second observed in the interval. The physical CPU resource is subdivided into logical time slices of 10 milliseconds each. Assuming a thread is scheduled for execution, it will run until its time slice expires, until it is preempted, or until it voluntarily gives up control of the CPU. When another thread is given control of the CPU, the context or working environment of the previous thread must be saved and the context of the current thread must be loaded. The operating system has a very efficient context switching procedure, so each switch is inexpensive in terms of resources. Any significant increase in context switches, such as when cs is a lot higher than the disk I/O and network packet rate, should be cause for further investigation.

Tuesday, August 7, 2012

SAR Command in Unix

Using sar you can monitor performance of various Linux subsystems (CPU, Memory, I/O..) in real time.

Using sar, you can also collect all performance data on an on-going basis, store them, and do historical analysis to identify bottlenecks.

Sar is part of the sysstat package.

This article explains how to install and configure sysstat package (which contains sar utility) and explains how to monitor the following Linux performance statistics using sar.

Collective CPU usage
Individual CPU statistics
Memory used and available
Swap space used and available
Overall I/O activities of the system
Individual device I/O activities
Context switch statistics
Run queue and load average data
Network statistics
Report sar data from a specific time

This is the only guide you’ll need for sar utility. So, bookmark this for your future reference.

I. Install and Configure Sysstat

Install Sysstat Package

First, make sure the latest version of sar is available on your system. Install it using any one of the following methods depending on your distribution.

sudo apt-get install sysstat
(or)
yum install sysstat
(or)
rpm -ivh sysstat-10.0.0-1.i586.rpm

Install Sysstat from Source

wget http://pagesperso-orange.fr/sebastien.godard/sysstat-10.0.0.tar.bz2

tar xvfj sysstat-10.0.0.tar.bz2

cd sysstat-10.0.0

./configure --enable-install-cron

Note: Make sure to pass the option –enable-install-cron. This does the following automatically for you. If you don’t configure sysstat with this option, you have to do this ugly job yourself manually.

Creates /etc/rc.d/init.d/sysstat
Creates appropriate links from /etc/rc.d/rc*.d/ directories to /etc/rc.d/init.d/sysstat to start the sysstat automatically during Linux boot process.
For example, /etc/rc.d/rc3.d/S01sysstat is linked automatically to /etc/rc.d/init.d/sysstat

After the ./configure, install it as shown below.

make

make install

Note: This will install sar and other systat utilities under /usr/local/bin

Once installed, verify the sar version using “sar -V”. Version 10 is the current stable version of sysstat.

$ sar -V
sysstat version 10.0.0
(C) Sebastien Godard (sysstat  orange.fr)

Finally, make sure sar works. For example, the following gives the system CPU statistics 3 times (with 1 second interval).

$ sar 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Utilities part of Sysstat

Following are the other sysstat utilities.

sar collects and displays ALL system activities statistics.
sadc stands for “system activity data collector”. This is the sar backend tool that does the data collection.
sa1 stores system activities in binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
sa2 creates daily summary of the collected statistics. sa2 runs from cron.
sadf can generate sar report in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
iostat generates CPU, I/O statistics
mpstat displays CPU statistics.
pidstat reports statistics based on the process id (PID)
nfsiostat displays NFS I/O statistics.
cifsiostat generates CIFS statistics.

This article focuses on sysstat fundamentals and sar utility.

Collect the sar statistics using cron job – sa1 and sa2

Create sysstat file under /etc/cron.d directory that will collect the historical sar data.

# vi /etc/cron.d/sysstat
*/10 * * * * root /usr/local/lib/sa/sa1 1 1
53 23 * * * root /usr/local/lib/sa/sa2 -A

If you’ve installed sysstat from source, the default location of sa1 and sa2 is /usr/local/lib/sa. If you’ve installed using your distribution update method (for example: yum, up2date, or apt-get), this might be /usr/lib/sa/sa1 and /usr/lib/sa/sa2.

/usr/local/lib/sa/sa1

This runs every 10 minutes and collects sar data for historical reference.
If you want to collect sar statistics every 5 minutes, change */10 to */5 in the above /etc/cron.d/sysstat file.
This writes the data to /var/log/sa/saXX file. XX is the day of the month. saXX file is a binary file. You cannot view its content by opening it in a text editor.
For example, If today is 26th day of the month, sa1 writes the sar data to /var/log/sa/sa26
You can pass two parameters to sa1: interval (in seconds) and count.
In the above crontab example: sa1 1 1 means that sa1 collects sar data 1 time with 1 second interval (for every 10 mins).

/usr/local/lib/sa/sa2

This runs close to midnight (at 23:53) to create the daily summary report of the sar data.
sa2 creates /var/log/sa/sarXX file (Note that this is different than saXX file that is created by sa1). This sarXX file created by sa2 is an ascii file that you can view it in a text editor.
This will also remove saXX files that are older than a week. So, write a quick shell script that runs every week to copy the /var/log/sa/* files to some other directory to do historical sar data analysis.

II. 10 Practical Sar Usage Examples

There are two ways to invoke sar.

sar followed by an option (without specifying a saXX data file). This will look for the current day’s saXX data file and report the performance data that was recorded until that point for the current day.
sar followed by an option, and additionally specifying a saXX data file using -f option. This will report the performance data for that particular day. i.e XX is the day of the month.

In all the examples below, we are going to explain how to view certain performance data for the current day. To look for a specific day, add “-f /var/log/sa/saXX” at the end of the sar command.

All the sar command will have the following as the 1st line in its output.

$ sar -u
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

Linux 2.6.18-194.el5PAE – Linux kernel version of the system.
(dev-db) – The hostname where the sar data was collected.
03/26/2011 – The date when the sar data was collected.
_i686_ – The system architecture
(8 CPU) – Number of CPUs available on this system. On multi core systems, this indicates the total number of cores.

1. CPU Usage of ALL CPUs (sar -u)

This gives the cumulative real-time CPU usage of all CPUs. “1 3″ reports for every 1 seconds a total of 3 times. Most likely you’ll focus on the last field “%idle” to see the cpu load.

$ sar -u 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Following are few variations:

sar -u Displays CPU usage for the current day that was collected until that point.
sar -u 1 3 Displays real time CPU usage every 1 second for 3 times.
sar -u ALL Same as “sar -u” but displays additional fields.
sar -u ALL 1 3 Same as “sar -u 1 3″ but displays additional fields.
sar -u -f /var/log/sa/sa10 Displays CPU usage for the 10day of the month from the sa10 file.

2. CPU Usage of Individual CPU or Core (sar -P)

If you have 4 Cores on the machine and would like to see what the individual cores are doing, do the following.

“-P ALL” indicates that it should displays statistics for ALL the individual Cores.

In the following example under “CPU” column 0, 1, 2, and 3 indicates the corresponding CPU core numbers.

$ sar -P ALL 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:34:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:34:13 PM       all     11.69      0.00      4.71      0.69      0.00     82.90
01:34:13 PM         0     35.00      0.00      6.00      0.00      0.00     59.00
01:34:13 PM         1     22.00      0.00      5.00      0.00      0.00     73.00
01:34:13 PM         2      3.00      0.00      1.00      0.00      0.00     96.00
01:34:13 PM         3      0.00      0.00      0.00      0.00      0.00    100.00

“-P 1″ indicates that it should displays statistics only for the 2nd Core. (Note that Core number starts from 0).

$ sar -P 1 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:36:25 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:36:26 PM         1      8.08      0.00      2.02      1.01      0.00     88.89

Following are few variations:

sar -P ALL Displays CPU usage broken down by all cores for the current day.
sar -P ALL 1 3 Displays real time CPU usage for ALL cores every 1 second for 3 times (broken down by all cores).
sar -P 1 Displays CPU usage for core number 1 for the current day.
sar -P 1 1 3 Displays real time CPU usage for core number 1, every 1 second for 3 times.
sar -P ALL -f /var/log/sa/sa10 Displays CPU usage broken down by all cores for the 10day day of the month from sa10 file.

3. Memory Free and Used (sar -r)

This reports the memory statistics. “1 3″ reports for every 1 seconds a total of 3 times. Most likely you’ll focus on “kbmemfree” and “kbmemused” for free and used memory.

$ sar -r 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:28:06 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact
07:28:07 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:08 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:09 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Average:      6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204

Following are few variations:

sar -r
sar -r 1 3
sar -r -f /var/log/sa/sa10

4. Swap Space Used (sar -S)

This reports the swap statistics. “1 3″ reports for every 1 seconds a total of 3 times. If the “kbswpused” and “%swpused” are at 0, then your system is not swapping.

$ sar -S 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:31:06 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
07:31:07 AM   8385920         0      0.00         0      0.00
07:31:08 AM   8385920         0      0.00         0      0.00
07:31:09 AM   8385920         0      0.00         0      0.00
Average:      8385920         0      0.00         0      0.00

Following are few variations:

sar -S
sar -S 1 3
sar -S -f /var/log/sa/sa10

Notes:

Use “sar -R” to identify number of memory pages freed, used, and cached per second by the system.
Use “sar -H” to identify the hugepages (in KB) that are used and available.
Use “sar -B” to generate paging statistics. i.e Number of KB paged in (and out) from disk per second.
Use “sar -W” to generate page swap statistics. i.e Page swap in (and out) per second.

5. Overall I/O Activities (sar -b)

This reports I/O statistics. “1 3″ reports for every 1 seconds a total of 3 times.

Following fields are displays in the example below.

tps – Transactions per second (this includes both read and write)
rtps – Read transactions per second
wtps – Write transactions per second
bread/s – Bytes read per second
bwrtn/s – Bytes written per second

$ sar -b 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:56:28 PM       tps      rtps      wtps   bread/s   bwrtn/s
01:56:29 PM    346.00    264.00     82.00   2208.00    768.00
01:56:30 PM    100.00     36.00     64.00    304.00    816.00
01:56:31 PM    282.83     32.32    250.51    258.59   2537.37
Average:       242.81    111.04    131.77    925.75   1369.90

Following are few variations:

sar -b
sar -b 1 3
sar -b -f /var/log/sa/sa10

Note: Use “sar -v” to display number of inode handlers, file handlers, and pseudo-terminals used by the system.

6. Individual Block Device I/O Activities (sar -d)

To identify the activities by the individual block devices (i.e a specific mount point, or LUN, or partition), use “sar -d”

$ sar -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM    dev8-0      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM    dev8-1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM dev120-64      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM dev120-65      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM  dev120-0      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM  dev120-1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM dev120-96      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM dev120-97      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91

In the above example “DEV” indicates the specific block device.

For example: “dev53-1″ means a block device with 53 as major number, and 1 as minor number.

The device name (DEV column) can display the actual device name (for example: sda, sda1, sdb1 etc.,), if you use the -p option (pretty print) as shown below.

$ sar -p -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM       sda      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sda1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sdb1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sdc1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sde1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sdf1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sda2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM      sdb2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91

Following are few variations:

sar -d
sar -d 1 3
sar -d -f /var/log/sa/sa10
sar -p -d

7. Display context switch per second (sar -w)

This reports the total number of processes created per second, and total number of context switches per second. “1 3″ reports for every 1 seconds a total of 3 times.

$ sar -w 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

08:32:24 AM    proc/s   cswch/s
08:32:25 AM      3.00     53.00
08:32:26 AM      4.00     61.39
08:32:27 AM      2.00     57.00

Following are few variations:

sar -w
sar -w 1 3
sar -w -f /var/log/sa/sa10

8. Reports run queue and load average (sar -q)

This reports the run queue size and load average of last 1 minute, 5 minutes, and 15 minutes. “1 3″ reports for every 1 seconds a total of 3 times.

$ sar -q 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

06:28:53 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
06:28:54 AM         0       230      2.00      3.00      5.00         0
06:28:55 AM         2       210      2.01      3.15      5.15         0
06:28:56 AM         2       230      2.12      3.12      5.12         0
Average:            3       230      3.12      3.12      5.12         0

Note: The “blocked” column displays the number of tasks that are currently blocked and waiting for I/O operation to complete.

Following are few variations:

sar -q
sar -q 1 3
sar -q -f /var/log/sa/sa10

9. Report network statistics (sar -n)

This reports various network statistics. For example: number of packets received (transmitted) through the network card, statistics of packet failure etc.,. “1 3″ reports for every 1 seconds a total of 3 times.

sar -n KEYWORD

KEYWORD can be one of the following:

DEV – Displays network devices vital statistics for eth0, eth1, etc.,
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.

$ sar -n DEV 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:11:13 PM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
01:11:14 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:11:14 PM      eth0    342.57    342.57  93923.76 141773.27      0.00      0.00      0.00
01:11:14 PM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00

10. Report Sar Data Using Start Time (sar -s)

When you view historic sar data from the /var/log/sa/saXX file using “sar -f” option, it displays all the sar data for that specific day starting from 12:00 a.m for that day.

Using “-s hh:mi:ss” option, you can specify the start time. For example, if you specify “sar -s 10:00:00″, it will display the sar data starting from 10 a.m (instead of starting from midnight) as shown below.

You can combine -s option with other sar option.

For example, to report the load average on 26th of this month starting from 10 a.m in the morning, combine the -q and -s option as shown below.

$ sar -q -f /var/log/sa/sa23 -s 10:00:01
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
...
11:20:01 AM         0       127      5.00      3.00      3.00         0
12:00:01 PM         0       127      4.00      2.00      1.00         0

There is no option to limit the end-time. You just have to get creative and use head command as shown below.

For example, starting from 10 a.m, if you want to see 7 entries, you have to pipe the above output to “head -n 10″.

$ sar -q -f /var/log/sa/sa23 -s 10:00:01 | head -n 10
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
10:30:01 AM         0       127      3.00      5.00      2.00         0
10:40:01 AM         0       127      4.00      2.00      1.00         2
10:50:01 AM         0       127      3.00      5.00      5.00         0
11:00:01 AM         0       127      2.00      1.00      6.00         0
11:10:01 AM         0       127      1.00      3.00      7.00         2

There is lot more to cover in Linux performance monitoring and tuning. We are only getting started. More articles to come in the performance series.

SSH Command in Unix

Below listed are the 5 basic usage of SSH command

Identify SSH client version
Login to remote host
Transfer Files to/from remote host
Debug SSH client connection
SSH escape character usage: (Toggle SSH session, SSH session statistics etc.)

1. SSH Client Version:

Sometimes it may be necessary to identify the SSH client that you are currently running and it’s corresponding version number, which can be identified as shown below. Please note that Linux comes with OpenSSH.

$ ssh -V
OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003

$ ssh -V
ssh: SSH Secure Shell 3.2.9.1 (non-commercial version) on i686-pc-linux-gnu

2. Login to remote host:

The First time when you login to the remotehost from a localhost, it will display the host key not found message and you can give “yes” to continue. The host key of the remote host will be added under .ssh2/hostkeys directory of your home directory, as shown below.

localhost$ ssh -l jsmith remotehost.example.com

Host key not found from database.
Key fingerprint:
xabie-dezbc-manud-bartd-satsy-limit-nexiu-jambl-title-jarde-tuxum
You can get a public key’s fingerprint by running
% ssh-keygen -F publickey.pub
on the keyfile.
Are you sure you want to continue connecting (yes/no)? yes
Host key saved to /home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub
host key for remotehost.example.com, accepted by jsmith Mon May 26 2008 16:06:50 -0700
jsmith@remotehost.example.com password:
remotehost.example.com$

The Second time when you login to the remote host from the localhost, it will prompt only for the password as the remote host key is already added to the known hosts list of the ssh client.

         localhost$ ssh -l jsmith remotehost.example.com
         jsmith@remotehost.example.com password: 
         remotehost.example.com$

For some reason, if the host key of the remote host is changed after you logged in for the first time, you may get a warning message as shown below. This could be because of various reasons such as 1) Sysadmin upgraded/reinstalled the SSH server on the remote host 2) someone is doing malicious activity etc., The best possible action to take before saying “yes” to the message below, is to call your sysadmin and identify why you got the host key changed message and verify whether it is the correct host key or not.

        localhost$ ssh -l jsmith remotehost.example.com
         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
         @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
         IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
         Someone could be eavesdropping on you right now (man-in-the-middle attack)!
         It is also possible that the host key has just been changed.
         Please contact your system administrator.
         Add correct host key to "/home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub"
         to get rid of this message.
        Received server key's fingerprint:
        xabie-dezbc-manud-bartd-satsy-limit-nexiu-jambl-title-jarde-tuxum
        You can get a public key's fingerprint by running
         % ssh-keygen -F publickey.pub
         on the keyfile.
         Agent forwarding is disabled to avoid attacks by corrupted servers.
         Are you sure you want to continue connecting (yes/no)? yes
         Do you want to change the host key on disk (yes/no)? yes
         Agent forwarding re-enabled.
         Host key saved to /home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub
         host key for remotehost.example.com, accepted by jsmith Mon May 26 2008 16:17:31 -0700
         jsmith @remotehost.example.com's password: 
        remotehost$

3. File transfer to/from remote host:

Another common use of ssh client is to copy files from/to remote host using scp.

Copy file from the remotehost to the localhost:

        localhost$scp jsmith@remotehost.example.com:/home/jsmith/remotehostfile.txt remotehostfile.txt

Copy file from the localhost to the remotehost:

        localhost$scp localhostfile.txt jsmith@remotehost.example.com:/home/jsmith/localhostfile.txt

4. Debug SSH Client:

Sometimes it is necessary to view debug messages to troubleshoot any SSH connection issues. For this purpose, pass -v (lowercase v) option to the ssh as shown below.

Example without debug message:

        localhost$ ssh -l jsmith remotehost.example.com
        warning: Connecting to remotehost.example.com failed: No address associated to the name
        localhost$

Example with debug message:

        locaclhost$ ssh -v -l jsmith remotehost.example.com
        debug: SshConfig/sshconfig.c:2838/ssh2_parse_config_ext: Metaconfig parsing stopped at line 3.
        debug: SshConfig/sshconfig.c:637/ssh_config_set_param_verbose: Setting variable 'VerboseMode' to 'FALSE'.
        debug: SshConfig/sshconfig.c:3130/ssh_config_read_file_ext: Read 17 params from config file.
        debug: Ssh2/ssh2.c:1707/main: User config file not found, using defaults. (Looked for '/home/jsmith/.ssh2/ssh2_config')
        debug: Connecting to remotehost.example.com, port 22... (SOCKS not used)
        warning: Connecting to remotehost.example.com failed: No address associated to the name

5. Escape Character: (Toggle SSH session, SSH session statistics etc.)

Escape character ~ get’s SSH clients attention and the character following the ~ determines the escape command.
Toggle SSH Session: When you’ve logged on to the remotehost using ssh from the localhost, you may want to come back to the localhost to perform some activity and go back to remote host again. In this case, you don’t need to disconnect the ssh session to the remote host. Instead follow the steps below.

Login to remotehost from localhost: localhost$ssh -l jsmith remotehost
Now you are connected to the remotehost: remotehost$
To come back to the localhost temporarily, type the escape character ~ and Control-Z. When you type ~ you will not see that immediately on the screen until you press <Control-Z> and press enter. So, on the remotehost in a new line enter the following key strokes for the below to work: ~<Control-Z>

    remotehost$ ~^Z
    [1]+  Stopped                 ssh -l jsmith remotehost
    localhost$

Now you are back to the localhost and the ssh remotehost client session runs as a typical unix background job, which you can check as shown below:

    localhost$ jobs
    [1]+  Stopped                 ssh -l jsmith remotehost

You can go back to the remote host ssh without entering the password again by bringing the background ssh remotehost session job to foreground on the localhost

    localhost$ fg %1
    ssh -l jsmith remotehost
    remotehost$

SSH Session statistics: To get some useful statistics about the current ssh session, do the following. This works only on SSH2 client.

Login to remotehost from localhost: localhost$ssh -l jsmith remotehost
On the remotehost, type ssh escape character ~ followed by s as shown below. This will display lot of useful statistics about the current SSH connection.

        remotehost$  [Note: The ~s is not visible on the command line when you type.] 
        remote host: remotehost
        local host: localhost
        remote version: SSH-1.99-OpenSSH_3.9p1
        local version:  SSH-2.0-3.2.9.1 SSH Secure Shell (non-commercial)
        compressed bytes in: 1506
        uncompressed bytes in: 1622
        compressed bytes out: 4997
        uncompressed bytes out: 5118
        packets in: 15
        packets out: 24
        rekeys: 0
        Algorithms:
        Chosen key exchange algorithm: diffie-hellman-group1-sha1
        Chosen host key algorithm: ssh-dss
        Common host key algorithms: ssh-dss,ssh-rsa
        Algorithms client to server:
        Cipher: aes128-cbc
        MAC: hmac-sha1
        Compression: zlib
        Algorithms server to client:
        Cipher: aes128-cbc
        MAC: hmac-sha1
        Compression: zlib
        localhost$

Friday, June 8, 2012

15 Practical Grep Command Examples In Linux / UNIX

You should get a grip on the Linux grep command.

This is part of the on-going 15 Examples series, where 15 detailed examples will be provided for a specific command or functionality. Earlier we discussed 15 practical examples for Linux find command, Linux command line history and mysqladmin command.

In this article let us review 15 practical examples of Linux grep command that will be very useful to both newbies and experts.

First create the following demo_file that will be used in the examples below to demonstrate grep command.

$ cat demo_file
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.

1. Search for the given string in a single file

The basic usage of grep command is to search for a specific string in the specified file as shown below.

Syntax:
grep "literal_string" filename

$ grep "this" demo_file
this line is the 1st lower case line in this file.
Two lines above this line is empty.

2. Checking for the given string in multiple files.

Syntax:
grep "string" FILE_PATTERN

This is also a basic usage of grep command. For this example, let us copy the demo_file to demo_file1. The grep output will also include the file name in front of the line that matched the specific pattern as shown below. When the Linux shell sees the meta character, it does the expansion and gives all the files as input to grep.

$ cp demo_file demo_file1

$ grep "this" demo_*
demo_file:this line is the 1st lower case line in this file.
demo_file:Two lines above this line is empty.
demo_file:And this is the last line.
demo_file1:this line is the 1st lower case line in this file.
demo_file1:Two lines above this line is empty.
demo_file1:And this is the last line.

3. Case insensitive search using grep -i

Syntax:
grep -i "string" FILE

This is also a basic usage of the grep. This searches for the given string/pattern case insensitively. So it matches all the words such as “the”, “THE” and “The” case insensitively as shown below.

$ grep -i "the" demo_file
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.
And this is the last line.

4. Match regular expression in files

Syntax:
grep "REGEX" filename

This is a very powerful feature, if you can use use regular expression effectively. In the following example, it searches for all the pattern that starts with “lines” and ends with “empty” with anything in-between. i.e To search “lines[anything in-between]empty” in the demo_file.

$ grep "lines.*empty" demo_file
Two lines above this line is empty.

From documentation of grep: A regular expression may be followed by one of several repetition operators:

    ? The preceding item is optional and matched at most once.
    * The preceding item will be matched zero or more times.
    + The preceding item will be matched one or more times.
    {n} The preceding item is matched exactly n times.
    {n,} The preceding item is matched n or more times.
    {,m} The preceding item is matched at most m times.
    {n,m} The preceding item is matched at least n times, but not more than m times.

5. Checking for full words, not for sub-strings using grep -w

If you want to search for a word, and to avoid it to match the substrings use -w option. Just doing out a normal search will show out all the lines.

The following example is the regular grep where it is searching for “is”. When you search for “is”, without any option it will show out “is”, “his”, “this” and everything which has the substring “is”.

$ grep -i "is" demo_file
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.
Two lines above this line is empty.
And this is the last line.

The following example is the WORD grep where it is searching only for the word “is”. Please note that this output does not contain the line “This Line Has All Its First Character Of The Word With Upper Case”, even though “is” is there in the “This”, as the following is looking only for the word “is” and not for “this”.

$ grep -iw "is" demo_file
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
Two lines above this line is empty.
And this is the last line.

6. Displaying lines before/after/around the match using grep -A, -B and -C

When doing a grep on a huge file, it may be useful to see some lines after the match. You might feel handy if grep can show you not only the matching lines but also the lines after/before/around the match.

Please create the following demo_text file for this example.

$ cat demo_text
4. Vim Word Navigation

You may want to do several navigation in relation to the words, such as:

* e - go to the end of the current word.
* E - go to the end of the current WORD.
* b - go to the previous (before) word.
* B - go to the previous (before) WORD.
* w - go to the next word.
* W - go to the next WORD.

WORD - WORD consists of a sequence of non-blank characters, separated with white space.
word - word consists of a sequence of letters, digits and underscores.

Example to show the difference between WORD and word

* 192.168.1.1 - single WORD
* 192.168.1.1 - seven words.

6.1 Display N lines after match

-A is the option which prints the specified N lines after the match as shown below.

Syntax:
grep -A <N> "string" FILENAME

The following example prints the matched line, along with the 3 lines after it.

$ grep -A 3 -i "example" demo_text
Example to show the difference between WORD and word

* 192.168.1.1 - single WORD
* 192.168.1.1 - seven words.

6.2 Display N lines before match

-B is the option which prints the specified N lines before the match.

Syntax:
grep -B <N> "string" FILENAME

When you had option to show the N lines after match, you have the -B option for the opposite.

$ grep -B 2 "single WORD" demo_text
Example to show the difference between WORD and word

* 192.168.1.1 - single WORD

6.3 Display N lines around match

-C is the option which prints the specified N lines before the match. In some occasion you might want the match to be appeared with the lines from both the side. This options shows N lines in both the side(before & after) of match.

$ grep -C 2 "Example" demo_text
word - word consists of a sequence of letters, digits and underscores.

Example to show the difference between WORD and word

* 192.168.1.1 - single WORD

7. Highlighting the search using GREP_OPTIONS

As grep prints out lines from the file by the pattern / string you had given, if you wanted it to highlight which part matches the line, then you need to follow the following way.

When you do the following export you will get the highlighting of the matched searches. In the following example, it will highlight all the this when you set the GREP_OPTIONS environment variable as shown below.

$ export GREP_OPTIONS='--color=auto' GREP_COLOR='100;8'

$ grep this demo_file
this line is the 1st lower case line in this file.
Two lines above this line is empty.
And this is the last line.

8. Searching in all files recursively using grep -r

When you want to search in all the files under the current directory and its sub directory. -r option is the one which you need to use. The following example will look for the string “ramesh” in all the files in the current directory and all it’s subdirectory.

$ grep -r "ramesh" *

9. Invert match using grep -v

You had different options to show the lines matched, to show the lines before match, and to show the lines after match, and to highlight match. So definitely You’d also want the option -v to do invert match.

When you want to display the lines which does not matches the given string/pattern, use the option -v as shown below. This example will display all the lines that did not match the word “go”.

$ grep -v "go" demo_text
4. Vim Word Navigation

You may want to do several navigation in relation to the words, such as:

WORD - WORD consists of a sequence of non-blank characters, separated with white space.
word - word consists of a sequence of letters, digits and underscores.

Example to show the difference between WORD and word

* 192.168.1.1 - single WORD
* 192.168.1.1 - seven words.

10. display the lines which does not matches all the given pattern.

Syntax:
grep -v -e "pattern" -e "pattern"

$ cat test-file.txt
a
b
c
d

$ grep -v -e "a" -e "b" -e "c" test-file.txt
d

11. Counting the number of matches using grep -c

When you want to count that how many lines matches the given pattern/string, then use the option -c.

Syntax:
grep -c "pattern" filename

$ grep -c "go" demo_text
6

When you want do find out how many lines matches the pattern

$ grep -c this demo_file
3

When you want do find out how many lines that does not match the pattern

$ grep -v -c this demo_file
4

12. Display only the file names which matches the given pattern using grep -l

If you want the grep to show out only the file names which matched the given pattern, use the -l (lower-case L) option.

When you give multiple files to the grep as input, it displays the names of file which contains the text that matches the pattern, will be very handy when you try to find some notes in your whole directory structure.

$ grep -l this demo_*
demo_file
demo_file1

13. Show only the matched string

By default grep will show the line which matches the given pattern/string, but if you want the grep to show out only the matched string of the pattern then use the -o option.

It might not be that much useful when you give the string straight forward. But it becomes very useful when you give a regex pattern and trying to see what it matches as

$ grep -o "is.*line" demo_file
is line is the 1st lower case line
is line
is is the last line

14. Show the position of match in the line

When you want grep to show the position where it matches the pattern in the file, use the following options as

Syntax:
grep -o -b "pattern" file

$ cat temp-file.txt
12345
12345

$ grep -o -b "3" temp-file.txt
2:3
8:3

Note: The output of the grep command above is not the position in the line, it is byte offset of the whole file.
15. Show line number while displaying the output using grep -n

To show the line number of file with the line matched. It does 1-based line numbering for each file. Use -n option to utilize this feature.

$ grep -n "go" demo_text
5: * e - go to the end of the current word.
6: * E - go to the end of the current WORD.
7: * b - go to the previous (before) word.
8: * B - go to the previous (before) WORD.
9: * w - go to the next word.
10: * W - go to the next WORD.

Unix Commands Part-1

Listed here are a few system monitoring commands which should give you a rough idea of how the server is running.

# server information
uname -a

# server config information
prtconf
sysdef -i

# server up time
uptime

# disk free, listed in KB
df -kt

# mounted devices
mount

# network status
netstat -rn

# network configuration info
ifconfig -a

# processes currently running
ps -elf

# user processes
w
whodo
who am i
finger
ps

# virtual memory statistics
vmstat 5 5

# system activity reporter (Solaris/AIX)
sar 5 5

# report per processor statistics (Solaris)
mpstat 5 5
psrinfo

# swap disk status (Solaris)
swap -l

# shared memory
ipcs -b

Solaris note: SAR numbers can be misleading: as memory use by processes is freed, but not considered 'available' by the reporting tool. Solaris support has recommended using the SR (swap rate) column of vmstats to monitor the availability of memory. When this number reaches 150+, a kernel panic may ensue.

System startup

The kernel is loaded by the boot command, which is executed during startup in a machine-specific way. The kernel may exist on a local disk, CD-ROM, or network. After the kernel loads, the necessary file systems are mounted (located in /etc/vfstab), and /sbin/init is run, which brings the system up to the "initdefault" state set in /etc/inittab. Subsystems are started by scripts in the /etc/rc1.d,/etc/rc2.d, and /etc/rc3.d directories.

System shutdown

# shutdown the server in 60 seconds, restart system in administrative state
# (Solaris)
/usr/sbin/shutdown -y -g60 -i1  "System is begin restarted"

# shutdown the server immediately, cold state
# (Solaris)
/usr/sbin/shutdown -y -g0 -i0

# shutdown AIX server, reboot .. also Ctrl-Ctrl/Alt
shutdown -Fr



# restart the server
/usr/sbin/reboot

User accounts

Adding a unix account involves creating the login and home directory, assigning a group, adding a description, and setting the password. The .profile script should then be placed in the home directory.

# add jsmith account .. the -m parm forces the home dir creation
useradd -c "Jim Smith" -d /home/jsmith -m -s "/usr/bin/ksh" jsmith

# change group for jsmith
chgrp staff jsmith

# change jsmith password
passwd jsmith

# change jsmith description
usermod -c "J.Smith" jsmith

# remove ksmith account
userdel ksmith

# display user accounts
cat /etc/passwd

/* here is a sample .profile script, for sh or ksh */
stty istrip
stty erase ^H
PATH=/usr/bin:/usr/ucb:/etc:/usr/lib/scripts:/usr/sbin:.
export PATH
PS1='BOXNAME:$PWD>'
export PS1

Displaying files

# display file contents
cat myfile

# determine file type
file myfile

# display file, a screen at a time (Solaris)
pg myfile

# display first 100 lines of a file
head -100 myfile

# display last 50 lines of a file
tail -50 myfile

# display file that is changing, dynamically
tail errlog.out -f

File permissions

Permission flags: r = read, w = write, x = execute Permissions are displayed for owner, group, and others.

# display files, with permissions
ls -l
# make file readable, writeable, and executable for group/others
chmod 777 myfile

# make file readable and executable for group/others
chmod 755 myfile

# make file inaccessible for all but the owner
chmod 700 myfile

# make file readable and executable for group/others,
# user assumes owner's group during execution
chmod 4755 myfile

# change permission flags for directory contents
chmod -R mydir

# change group to staff for this file
chgrp staff myfile

# change owner to jsmith for this file
chown jsmith myfile

Listing files

See scripting examples for more elaborate file listings.

# list all files, with directory indicator, long format
ls -lpa

# list all files, sorted by date, ascending
ls -lpatr

# list all text files
ls *.txt

Moving/copying files

See scripting examples for moving and renaming collections of files.

# rename file to backup copy
mv myfile myfile.bak

# copy file to backup copy
cp myfile myfile.bak

# move file to tmp directory
mv myfile /tmp

# copy file from tmp dir to current directory
cp /tmp/myfile .

Deleting files

See scripting examples for group dissection routines.

# delete the file
rm myfile

# delete directory
rd mydir

# delete directory, and all files in it
rm -r mydir

Disk usage

# display disk free, in KB
df -kt

# display disk usage, in KB for directory
du -k mydir

# display directory disk usage, sort by largest first
du -ak / | sort -nr | pg

Using tar

# display contents of a file
tar tvf myfile.tar

# display contents of a diskette (Solaris)
volcheck
tar tvf /vol/dev/rdiskette0/unnamed_floppy

# copy files to a tar file
tar cvf myfile.tar *.sql

# format floppy, and copy files to it (Solaris)
fdformat -U -b floppy99
tar cvf /vol/dev/rdiskette0/floppy99 *.sql

# append files to a tar file
tar rvfn myfile.tar *.txt

# extract files from a tar filem to current dir
tar xvf myfile.tar

Starting a process

This section briefly describes how to start a process from the command line.

Glossary:

   & - run in background
   nohup (No Hang Up) - lets process continue, even if session is disconnected


# run a script, in the background
runbackup &

# run a script, allow it to continue after logging off
nohup runbackup &


# Here nohup.out will still be created, but any output will
# show up in test70.log.  Errors will appear in nohup.out.

nohup /export/spare/hmc/scripts/test70 > test70.log &



# Here nohup.out will not be created; any output will
# show up in test70.log.  Errors will appear test70.log also !

nohup /export/spare/hmc/scripts/test70 > test70.log 2>&1  &

Killing a process

1) In your own session;  e.g. jobs were submitted, but you never logged out:

ps                           # list jobs
kill -9  < process id>       # kill it



2) In a separate session

# process ID appears as column 4
ps -elf | grep -i 

kill -9  < process id>       # kill it



3)  For device (or file)

# find out who is logged in from where

w

# select device, and add /dev ... then use the fuser command

fuser -k /dev/pts/3

Redirecting output

Output can be directed to another program or to a file.

# send output to a file
runbackup > /tmp/backup.log

# also redirect error output
runbackup > /tmp/backup.log 2> /tmp/errors.log

# send output to grep program
runbackup | grep "serious"

Date stamping, and other errata

Other errata is included in this section

# Date stamping files
# format is :
# touch -t yyyymmddhhmi.ss filename

touch -t 199810311530.00 hallowfile

# lowercase functions (ksh)
typeset -u newfile=$filename

# date formatting, yields 112098 for example
date '+%m%d%y'

# display a calendar (Solaris / AIX)
cal

# route output to both test.txt and std output
./runbackup | tee test.txt

# sleep for 5 seconds
sleep 5

# send a message to all users
wall "lunch is ready"

# edit file, which displays at login time (message of the day)
vi /etc/motd

# edit file, which displays before login time (Solaris)
vi /etc/issue