Friday, August 31, 2012

vmstat command in Unix

vmstat command

The first tool to use is the vmstat command, which quickly provides compact information about various system resources and their related performance problems.
The vmstat command reports statistics about kernel threads in the run and wait queue, memory, paging, disks, interrupts, system calls, context switches, and CPU activity. The reported CPU activity is a percentage breakdown of user mode, system mode, idle time, and waits for disk I/O.
Note: If the vmstat command is used without any interval, then it generates a single report. The single report is an average report from when the system was started. You can specify only the Count parameter with the Interval parameter. If the Interval parameter is specified without the Count parameter, then the reports are generated continuously.
As a CPU monitor, the vmstat command is superior to the iostat command in that its one-line-per-report output is easier to scan as it scrolls and there is less overhead involved if there are many disks attached to the system. The following example can help you identify situations in which a program has run away or is too CPU-intensive to run in a multiuser environment.
# vmstat 2
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10
 1  0 22506  1609   0   0   0   0    0   0 214 1476 186 48 37  0 16
 0  0 22498  1582   0   0   0   0    0   0 248 1470 226 55 36  0  9

 2  0 22534  1465   0   0   0   0    0   0 238  903 239 77 23  0  0
 2  0 22534  1445   0   0   0   0    0   0 209 1142 205 72 28  0  0
 2  0 22534  1426   0   0   0   0    0   0 189 1220 212 74 26  0  0
 3  0 22534  1410   0   0   0   0    0   0 255 1704 268 70 30  0  0
 2  1 22557  1365   0   0   0   0    0   0 383  977 216 72 28  0  0

 2  0 22541  1356   0   0   0   0    0   0 237 1418 209 63 33  0  4
 1  0 22524  1350   0   0   0   0    0   0 241 1348 179 52 32  0 16
 1  0 22546  1293   0   0   0   0    0   0 217 1473 180 51 35  0 14
 
This output shows the effect of introducing a program in a tight loop to a busy multiuser system. The first three reports (the summary has been removed) show the system balanced at 50-55 percent user, 30-35 percent system, and 10-15 percent I/O wait. When the looping program begins, all available CPU cycles are consumed. Because the looping program does no I/O, it can absorb all of the cycles previously unused because of I/O wait. Worse, it represents a process that is always ready to take over the CPU when a useful process relinquishes it. Because the looping program has a priority equal to that of all other foreground processes, it will not necessarily have to give up the CPU when another process becomes dispatchable. The program runs for about 10 seconds (five reports), and then the activity reported by the vmstat command returns to a more normal pattern.
Optimum use would have the CPU working 100 percent of the time. This holds true in the case of a single-user system with no need to share the CPU. Generally, if us + sy time is below 90 percent, a single-user system is not considered CPU constrained. However, if us + sy time on a multiuser system exceeds 80 percent, the processes may spend time waiting in the run queue. Response time and throughput might suffer.
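As an illustrative sketch (not part of vmstat itself), the us + sy rules of thumb above can be expressed in a few lines of Python; the function name and thresholds simply restate the text:

```python
def cpu_constrained(us, sy, multiuser=True):
    """Rule of thumb from the text: a multiuser system is considered
    CPU constrained when us + sy exceeds 80 percent; a single-user
    system only when it exceeds 90 percent."""
    threshold = 80 if multiuser else 90
    return us + sy > threshold

# First sample from the vmstat output above: us=57, sy=32 (89 percent total)
print(cpu_constrained(57, 32))                   # prints: True  (multiuser)
print(cpu_constrained(57, 32, multiuser=False))  # prints: False (single-user)
```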
To check if the CPU is the bottleneck, consider the four cpu columns and the two kthr (kernel threads) columns in the vmstat report. It may also be worthwhile looking at the faults column:
  • cpu
    Percentage breakdown of CPU time usage during the interval. The cpu columns are as follows:
    • us
      The us column shows the percent of CPU time spent in user mode. A UNIX process can execute in either user mode or system (kernel) mode. When in user mode, a process executes within its application code and does not require kernel resources to perform computations, manage memory, or set variables.
    • sy
      The sy column details the percentage of time the CPU was executing a process in system mode. This includes CPU resource consumed by kernel processes (kprocs) and others that need access to kernel resources. If a process needs kernel resources, it must execute a system call and is thereby switched to system mode to make that resource available. For example, reading or writing of a file requires kernel resources to open the file, seek a specific location, and read or write data, unless memory mapped files are used.
    • id
      The id column shows the percentage of time which the CPU is idle, or waiting, without pending local disk I/O. If there are no threads available for execution (the run queue is empty), the system dispatches a thread called wait, which is also known as the idle kproc. On an SMP system, one wait thread per processor can be dispatched. The report generated by the ps command (with the -k or -g 0 option) identifies this as kproc or wait. If the ps report shows a high aggregate time for this thread, it means there were significant periods of time when no other thread was ready to run or waiting to be executed on the CPU. The system was therefore mostly idle and waiting for new tasks.
    • wa
      The wa column details the percentage of time the CPU was idle with pending local disk I/O and NFS-mounted disks. If there is at least one outstanding I/O to a disk when wait is running, the time is classified as waiting for I/O. Unless asynchronous I/O is being used by the process, an I/O request to disk causes the calling process to block (or sleep) until the request has been completed. Once an I/O request for a process completes, it is placed on the run queue. If the I/Os were completing faster, more CPU time could be used.
      A wa value over 25 percent could indicate that the disk subsystem might not be balanced properly, or it might be the result of a disk-intensive workload.
      For information on the change made to wa, see Wait I/O time reporting.
  • kthr
    Number of kernel threads in various queues averaged per second over the sampling interval. The kthr columns are as follows:
    • r
      Average number of kernel threads that are runnable, which includes threads that are running and threads that are waiting for the CPU. If this number is greater than the number of CPUs, there is at least one thread waiting for a CPU and the more threads there are waiting for CPUs, the greater the likelihood of a performance impact.
    • b
      Average number of kernel threads in the VMM wait queue per second. This includes threads that are waiting on filesystem I/O or threads that have been suspended due to memory load control.
      If processes are suspended due to memory load control, the blocked column (b) in the vmstat report indicates the increase in the number of threads rather than the run queue.
    • p
      For vmstat -I, the number of threads waiting on I/O to raw devices per second. Threads waiting on I/O to filesystems are not included here.
  • faults
    Information about process control, such as trap and interrupt rate. The faults columns are as follows:
    • in
      Number of device interrupts per second observed in the interval. Additional information can be found in Assessing disk performance with the vmstat command.
    • sy
      The number of system calls per second observed in the interval. Resources are available to user processes through well-defined system calls. These calls instruct the kernel to perform operations for the calling process and exchange data between the kernel and the process. Because workloads and applications vary widely, and different calls perform different functions, it is impossible to define how many system calls per second are too many. But typically, when the sy column rises above 10000 calls per second on a uniprocessor, further investigation is called for (on an SMP system the number is 10000 calls per second per processor). One cause could be "polling" subroutines such as the select() subroutine. For this column, it is advisable to have a baseline measurement that gives a count for a normal sy value.
    • cs
      Number of context switches per second observed in the interval. The physical CPU resource is subdivided into logical time slices of 10 milliseconds each. Assuming a thread is scheduled for execution, it will run until its time slice expires, until it is preempted, or until it voluntarily gives up control of the CPU. When another thread is given control of the CPU, the context or working environment of the previous thread must be saved and the context of the current thread must be loaded. The operating system has a very efficient context switching procedure, so each switch is inexpensive in terms of resources. Any significant increase in context switches, such as when cs is a lot higher than the disk I/O and network packet rate, should be cause for further investigation.
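To tie the columns together, here is a small hypothetical helper (not part of vmstat) that splits one report line into the fields described above; the duplicate sy header is renamed so the keys stay unique:

```python
COLUMNS = ["r", "b", "avm", "fre", "re", "pi", "po", "fr", "sr", "cy",
           "in", "sy", "cs", "us", "sy_cpu", "id", "wa"]
# "sy" appears twice in the real header (system calls under faults,
# system time under cpu); the second occurrence is renamed sy_cpu here.

def parse_vmstat_line(line):
    """Split one vmstat report line into the named columns above."""
    return dict(zip(COLUMNS, (int(field) for field in line.split())))

sample = " 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10"
stats = parse_vmstat_line(sample)
print(stats["r"], stats["us"], stats["wa"])  # prints: 1 57 10
```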

Friday, August 10, 2012

Top 10 performance issues with a Database

Here is a list of top 10 performance issues with a Database and their most probable solutions

Too many calls to the DB
- There might be multiple trips to a single DB from various middleware components, and any of the following scenarios can occur:
1. More data is requested than necessary, primarily for faster rendering (but in the end slowing down overall performance)
2. Multiple applications requesting the same data.
3. Multiple queries are executed which in the end return the same result.
This kind of problem generally arises when there is too much object orientation. The key is to strike a balance between how many objects to create and what to put in each object. Object-oriented programming may be good for maintenance, but it surely degrades performance if not handled correctly

Too much synchronization
– Most developers tend to over-synchronize: large pieces of code are synchronized by writing even larger pieces of code. This is generally fine under low load, but under high load the performance of the application will definitely take a beating. How do you determine if the application has synchronization issues? The easiest way (though not 100% foolproof) is to chart CPU time and execution time:
CPU time – the time spent on the CPU by the executed code
Execution time – the total time the method takes to execute. It includes everything: CPU, I/O, waiting to enter a synchronized block, etc.
Generally the gap between the two times gives the waiting time. If the troublesome method makes neither an I/O call nor an external call, then it is most probably a synchronization issue that is causing the slowness.
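A minimal sketch of that comparison in Python, using a sleep to stand in for time spent blocked on a lock (the helper name is made up for illustration):

```python
import time

def measure(fn):
    """Return (execution time, CPU time) for a call, mirroring the two
    metrics charted above."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    fn()
    return time.perf_counter() - wall0, time.process_time() - cpu0

# A sleep stands in for time spent waiting to enter a synchronized block:
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"execution={wall:.2f}s cpu={cpu:.2f}s wait={wall - cpu:.2f}s")
```

A large gap between the two numbers with no I/O or external calls in the method is the signal described above.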
Joining too many tables – The worst kind of SQL issues creep in when too many tables are joined and data has to be extracted from the result. Sometimes it is just unfortunate that so many tables must be joined to pull out the necessary data.
There are two ways to attack this problem
1) Is it possible to denormalize a few tables so that each holds more of the required data?
2) Is it possible to create a summary table with most of the information that will be updated periodically?
Returning a large result set - Generally no user will go through thousands of records in the result set. Most users will limit themselves to only the first few hundred (or the first 3-4 pages). By returning all the results, the developer is not only slowing the database but also choking the network. Breaking the result set into batches (on the database side) will generally solve this issue (though it is not always possible)
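An illustrative sketch of database-side batching using sqlite3 (table and column names are made up). Keyset pagination is used here because OFFSET-based paging forces the database to skip rows on every call:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany("INSERT INTO orders (item) VALUES (?)",
                 [(f"item-{i}",) for i in range(1000)])

def fetch_batch(last_id, batch_size=100):
    # Keyset pagination: seek past the last id already shown instead of
    # returning all 1000 rows at once.
    return conn.execute(
        "SELECT id, item FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size)).fetchall()

first_page = fetch_batch(0)
print(len(first_page), first_page[0])  # prints: 100 (1, 'item-0')
```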

Joining tables in the middleware
– SQL is a fantastic language for data manipulation and retrieval. There is simply no need to move data to a middle tier and join tables there. Joining data in the middle tier generally causes:
1. Unnecessary load on the network, as it has to transport data back and forth
2. Increased memory requirements on the application server to handle the extra load
3. A drop in server performance, as the app tier is mainly held up with processing large queries
The best way to approach this problem is to use inner and outer joins right in the database itself. This way, all the power of SQL and the database is utilized for processing the query.
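A small sketch of the point above using sqlite3 (the schema is made up for illustration): the join and aggregation run inside the database, so only the final rows cross the wire.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 5.0);
""")
# One round trip; only the joined, aggregated rows leave the database:
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.name
""").fetchall()
print(rows)  # prints: [('Ada', 65.0), ('Grace', 5.0)]
```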
Ad hoc queries – just because SQL gives the privilege to create and use ad hoc queries, there is no point in abusing them. In quite a few cases, ad hoc queries create more mess than advantage. The best way is to use stored procedures. This is not always possible; sometimes ad hoc queries are necessary, and then there is no option but to use them. But whenever possible, it is recommended to use stored procedures. The main advantages of stored procedures are:
1. Pre-compiled and ready
2. Optimized by the DB
3. The stored procedure lives on the DB server, i.e. no network transmission of large SQL requests.

Lack of indices
– You see that the data is not large, yet the DB seems to be taking an abnormally long time to retrieve results. The most likely cause of this problem is a missing or misconfigured index. At first sight it might seem trivial, but when the data grows large it plays a significant role. There can be a significant hit in performance if the indices are not configured properly.
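The effect is easy to see with sqlite3's EXPLAIN QUERY PLAN (the schema here is made up; the principle is the same on any database): the same query goes from a full table scan to an index search once an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def plan_for(query, args=()):
    # The last column of EXPLAIN QUERY PLAN output describes the access path.
    return conn.execute("EXPLAIN QUERY PLAN " + query, args).fetchone()[-1]

query = "SELECT * FROM users WHERE email = ?"
before = plan_for(query, ("a@example.com",))
conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = plan_for(query, ("a@example.com",))
print(before)  # e.g. "SCAN users" -- a full table scan
print(after)   # e.g. "SEARCH users USING INDEX idx_users_email (email=?)"
```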
Fill factor – One of the other things to consider along with indexes is the fill factor. MSDN describes fill factor as a percentage that indicates how much the Database Engine should fill each index page during index creation or rebuild. The fill-factor setting applies only when the index is created or rebuilt. Why is this so important? If the fill factor is too high and a new record is inserted, then the DB will more often than not split the index page (page splitting) into a new page. This is very resource intensive and causes fragmentation. On the other hand, a very low fill factor means that a lot of space is reserved for the index alone. The easiest way to decide is to look at the type of queries that come to the DB: if they are mostly SELECT queries, it is best to leave the default fill factor. On the other hand, if there are lots of INSERT, UPDATE and DELETE operations, a fill factor other than 0 or 100 can be good for performance if the new data is evenly distributed throughout the table.
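A back-of-the-envelope sketch of the trade-off (assuming SQL Server's 8 KB index page size): the lower the fill factor, the more bytes each page reserves at build time for future inserts.

```python
PAGE_BYTES = 8192  # SQL Server index page size

def reserved_bytes(fill_factor):
    """Free space left on each page at index build time for a given
    fill factor (the percentage of the page that gets filled)."""
    return int(PAGE_BYTES * (100 - fill_factor) / 100)

for ff in (100, 90, 70):
    print(ff, reserved_bytes(ff))
# fill factor 100 reserves nothing (every mid-page insert risks a page split);
# fill factor 70 sets aside roughly 2.4 KB per page for future rows.
```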
My query was fine last week but it is slow this week – We get to see a lot of this. The load test ran fine last week, but this week the search page is taking a long time. What is wrong with the database? The main issue could be that the execution plan (the way the query is going to get executed on the DB) has changed. The easiest way to check is to get the current explain plan and the explain plan from the previous week, compare them, and look for the differences.
High CPU and Memory Utilization on the DB – There is a high CPU and high Memory utilization on the database server. There could be a multitude of possible reasons for this.
1. See if there are full table scans happening (soln: create index and update stats)
2. See if there is too much context switching (soln: increase the memory)
3. Look for memory leaks (in terms of tables not being freed even after their usage is complete) (soln: recode!)
There can be many more reasons, but these are the most common ones.

Low CPU and memory utilization yet poor performance – This is another case (though not a frequent one). The CPU and memory are optimally used, yet the performance is still slow. This usually comes down to one of two reasons:
1. Bad network – the database server is waiting for a socket read or write
2. Bad disk management – the database server is waiting for a disk controller to become free
As always these are only the most common database performance issues that might come up in any performance test. There are many more of them out there...

Mitigation of Performance Testing Impediments

An impediment is anything that prevents people from doing their job. Here are some impediments that performance testing teams have encountered.

A. Unavailability of subject matter / technical experts such as developers and operations staff.

B. Unavailability of applications to test due to delays or defects in the functionality of the system under test.

C. Lack of connectivity/access to resources due to network security ports not being available or other network blockage.

D. The script recorder fails to recognize applications (due to non-standard security apparatus or other complexity in the application).

E. Not enough Test Data to cover unique conditions necessary during runs that usually go several hours.

F. Delays in obtaining or having enough software licenses and hardware in the performance testing environment.

G. Lack of correspondence between versions of applications in performance versus in active development.

H. Managers not familiar with the implications of ad-hoc approaches to performance testing.

Fight Or Flight? Proactive or Reactive?

      Some call the list above "risks" which an organization may theoretically face.

      Risks become "issues" when they actually impact a project.

      A proactive management style at a particular organization sees value in investing up-front to ensure that desired outcomes occur rather than "fight fires" which occur without preparation.

      A reactive management style at a particular organization believes in "conserving resources" by not dedicating resources to situations that may never occur, and addressing risks when they become actual reality.

  
Subject Matter Expertise

      The Impediment
      Knowledge about a system and how it works are usually not readily available to those outside the development team.

      Whatever documents are written are often one or more versions behind what is under development.

      Requirements and definitions are necessary to separate whether a particular behavior is intended or is a deviation from that requirement.

      Even if load testers have access to up-to-the-minute wiki entries, load testers usually are not free to interact as a peer of developers.

      Load testers are usually not considered a part of the development team or even the development process, so are therefore perceived as an intrusion to developers.

      To many developers, performance testers are a nuisance who waste time poking around a system that is "already perfect" or "one we already know is slow".

      What can reactive load testers do?
      Work among developers and eavesdrop on their conversations (like those studying animals in the wild).

      What can proactive load testers do?
      Up-front, an executive formally establishes expectations for communication and coordination between developers and load testers.

      Ideally, load testers participate in the development process from the moment a development team is formed so that they are socially bonded with the developers.

      Recognizing that developers are under tight deadlines, the load test team member defines exactly what is needed from the developer and when it is needed.

      This requires up-front analysis of the development organization:

          o the components of the application
          o which developers work on which component
          o contact information for each developer
          o existing documents available and who wrote each document
          o comments in blogs written by each developer

      An executive assigns a "point person" within the development organization who can provide this information.

      Assignments for each developer need to originate from the development manager for whom that developer works.

            When one asks/demands something without the authority to do so, that person will over time be perceived as a nuisance.

            No one can serve two masters. For you will hate one and love the other; you will be devoted to one and despise the other.

      A business analyst who is familiar with the application's intended behavior makes a video recording of the application using a utility such as Camtasia from TechSmith. A recording has the advantage of capturing the timing as well as the steps.

               

The U.S. military developed the web-based CAVNET system to collaborate on innovations to improvise around impediments found in the field.


Availability of applications

      The Impediment
      Parts of an application under active development become inaccessible while developers are in the middle of working on them.

      The application may not have been built successfully. There are many root causes for bad builds:

          o Specifications of what goes into each build are not accurate or complete.
          o Resources intended to go into a particular build are not made available.
          o An incorrect version of a component is built with newer incompatible components.
          o Build scripts and processes do not recognize these potential errors, leading to build errors.
          o Inadequate verification of build completeness.

      What can reactive load testers do?
      Frequent (nightly) builds may give testers more opportunities than losing perhaps weeks waiting for the next good build.

      Testers switch to another project/application when one application cannot be tested.

      What can proactive load testers do?
      Use a separate test environment that is updated from the development system only when parts of the application become stable enough to test.

      Have a separate test environment for each version so that work on a prior version can occur when a build is not successful on one particular environment.

      Develop a "smoke test" suite to ensure that applications are testable.

      Coordinate testing schedules with what is being changed by developers.

      Analyze the root causes of why builds are not successful, and track progress on eliminating those causes over time.

               
Connectivity/access to resources

      The Impediment
      Workers may not be able to reach the application because of network (remote VPN) connectivity or security access.

      What can reactive load testers do?
      Work near the physical machine.

      Grant unrestricted access to those working on the system.

      What can proactive load testers do?
      Analyze the access for each functionality required by each role.

      Pre-schedule when those who grant access are available to the project.

               
 Script Recorder Recognition

      The Impediment
      Load test script creation software such as LoadRunner works by listening to and capturing what goes across the wire, then displaying those conversations as script code which may be modified by humans.

      Such recording mechanisms are designed to recognize only standard protocols going through the wire.

      Standard recording mechanisms will not recognize custom communications, especially within applications using advanced security mechanisms.

      Standard recording mechanisms also have difficulty recognizing complex use of Javascript or CSS syntax in SAP portal code.

      What can reactive load testers do?
      Skip (de-scope) portions which cannot be easily recognized.

      What can proactive load testers do?
      To ensure that utility applications (such as LoadRunner) can be installed, install them before locking down the system.


               
 Test Data

      The Impediment
      Applications often only allow a certain combination of values to be accepted. An example of this is only specific postal zip codes being valid within a certain US state.

      Using the same value repeatedly during load testing does not create a realistic emulation of actual behavior because most modern systems cache data in memory, which is 100 times faster than retrieving data from a hard drive.
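A tiny sketch of why repeated values skew a test, using an in-process cache to stand in for the system's caching layer (the function and values are made up): once a value is cached, repeated requests never touch the slow backing store.

```python
import functools

backend_reads = {"count": 0}

@functools.lru_cache(maxsize=None)
def city_for_zip(zip_code):
    backend_reads["count"] += 1   # stands in for a slow disk/database read
    return f"city-for-{zip_code}"

for _ in range(1000):
    city_for_zip("10001")         # the same value, repeated 1000 times
print(backend_reads["count"])     # prints: 1 -- the backend was hit only once
```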

      This discussion also includes role permissions having a different impact on the system. For example, the screen of an administrator or manager would have more options. The more options, the more resources it takes just to display the screen as well as to edit input fields.

      A wide variation in data values forces databases to take time to scan through files. Specifying an index used to retrieve data is the most common approach to make applications more efficient.

      Growth in the data volume handled by a system can render indexing schemes inefficient at the new level of data.

      What can reactive load testers do?
      Use a single role for all testing.

      Qualify results from each test with the amount of data used to conduct each test.

      Use trial-and-error approaches to find combinations of values which meet field validation rules.

      Examine application source code to determine the rules.

      Analyze existing logs to define the distribution of function invocations during test runs.

      What can proactive load testers do?
      Project the likely growth of the application in terms of impact on the number of rows in each key data variable. This information is then used to define the growth in rows in each table.

      Define procedures for growing the database size, using randomized data values in names.
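A hypothetical sketch of such a procedure: generate randomized name values so that load-test queries cannot all be served from cache, seeded so that the grown data set can be rebuilt identically for reruns.

```python
import random
import string

def random_name(rng, length=8):
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def grow_rows(count, seed=42):
    # Seeded so that a grown database can be rebuilt identically for reruns.
    rng = random.Random(seed)
    return [(row_id, random_name(rng)) for row_id in range(count)]

rows = grow_rows(1000)
distinct = {name for _, name in rows}
print(len(rows), len(distinct))   # 1000 rows, (almost) all names distinct
```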

               
 Test Environment

      The Impediment
      Creating a separate environment for load testing can be expensive for a large, complex system.

      In order to avoid overloading the production network, the load testing environment is often set up so that no communication is possible with the rest of the network. This makes it difficult to deploy resources into the environment and then retrieve run result files from the environment.

      A closed environment requires its own set of utility services such as DNS, authentication (LDAP), time synchronization, etc.

      What can reactive load testers do?
      Change network firewalls temporarily while using the development environment for load testing (when developers do not use it).

      Use the production fail-over environment temporarily and hope that it is not needed during the test.

      What can proactive load testers do?
      Build up a production environment and use it for load testing before it is used in actual production.

               
Correspondence Between Versions

      The Impediment
      Defects found in the version running on the perftest environment may not be reproducible by developers in the development/unit test environments running a different (more recent) version.

      Developers may have moved on to a different version, different projects, or even different employers.

      What can reactive load testers do?
      Rerun short load tests on development servers. If the server is shared, the productivity of developers would be affected.

      What can proactive load testers do?
      Before testing, freeze the total state of the application in a full back-up so that the exact state of the system can be restored, even after changes are tried to diagnose or fix the application on the system where the defect was found.

      Run load tests with trace logging enabled. Note that this does not duplicate how the system actually runs in production mode.

               
 Ad-hoc Approaches

      The Impediment
      Most established professional fields (such as accounting and medicine) have laws, regulations, and defined industry practices which give legitimacy to certain approaches. People are trained to follow them. The consequences of certain courses of action are known.

      But the profession of performance and load testing has not matured to that point.

      The closest industry document, ITIL, is not yet universally adopted. And ITIL does not clarify the work of performance testing in much detail.

      Consequently, each individual involved with load testing is likely to have his/her own opinions about what actions should be taken.

      This makes rational exploration of the implications of specific courses of action a conflict-ridden and thus time-consuming and expensive endeavor.

      What can reactive load testers do?
      Allocate time for planning before starting actual work, until concurrence on the project plan is achieved among the stakeholders.

      Revise project completion estimates or scope as new information becomes available.

      What can proactive load testers do?
      Before the project gets under way, agree on the rationale for elements of the project plan and who will do what when (commitments of tasks and deliverables). This is difficult for those who are not accustomed to being accountable, and requests for it may result in withdrawal or other defensive behavior.

      Identify alternative approaches and analyze them before managers come up with them themselves.

      Up-front, identify how to contact each stakeholder and keep them updated at least weekly, and immediately if decisions impact what they are actively working on.

      If a new manager is inserted in the project after it starts, review the project plan and rationale for its elements.

Tuesday, August 7, 2012

Performance Testing in Agile Framework



This is again not a new thing in the industry. It has been there for some time now. At first look, performance testing within an Agile framework does not seem to fit in, as performance testing has for a large part always been done within the functionally-stable-code paradigm.
On the other side, performance testing at an early stage looks like a very good idea. The reason is that there will be limited possibility of hotspots being present in the code base. The main performance bottlenecks that may be present would be more on the infrastructure side, which are easier to fix and need almost no functional regression testing. Thus we do have a very good case for performance testing to be in the Agile framework.
Although agile delivers a workable product after each sprint, there are some challenges that would crop up here -
  1. Stakeholders Buy-in
  2. Project Management Vision
  3. Definition of the SLAs
  4. Unstable builds
  5. Development and Execution of Test Cases
  6. Highly skilled performance team
1. Stakeholders Buy-in
This is the most important aspect for any project. From a performance point of view it becomes all the more important. Having an additional performance test team from the early phases of the project does put an additional strain on the budget and the timeline. Furthermore, the advantages of performance testing are more intangible than, say, those of functional testing. Thus it becomes a tad more difficult to get stakeholder buy-in.
2. Project Management Vision
Despite the importance of the performance of an application, many project managers are still unaware of performance test processes and the challenges a performance team faces. This apathy has been the undoing of many good projects. Within an Agile framework, this would cause more conflicts between project managers and performance teams. The ideal way would be to have a detailed meeting between the developers, performance architect/team and project managers and chalk out a plan which would decide the following -
  • Modules to be targeted for performance testing
  • Approach of performance testing
  • In which sprint to begin performance testing
  • Identification of Module-wise load that needs to be simulated
  • Profiling Tool Selection
  • Transition of Performance Tool – This would happen, as initially tools like JPerfUnit may be used and at a later stage end-to-end load testing tools like LoadRunner or SilkPerformer may be used.
These are some of the many parameters that need to be defined at a project management level.
3. Definition of SLAs
This will be one of the most challenging aspects, as the business can usually give insight only into the end-to-end SLA; at a module level it becomes more challenging. Here arises the need for the developer and performance architect to arrive at an estimated number. Apart from this, there may be a need to de-scope some modules, as not all modules may be critical from a performance standpoint.
4. Development and Execution of Test Cases
Although Agile delivers a workable product at the end of every sprint, we may not be able to use standard load testing tools, as not all the components such a tool requires would be present. So in most cases there is an inherent challenge in simulating the load. Tools may vary between phases: for instance, JPerfUnit, a stub, or a test harness might be used in the initial phases. Extra development time for this has to be estimated in the initial project planning phase. Finally, once the harness is created, development and execution of test cases can follow. With the help of a profiling tool, most performance hotspots can be identified early.
At later stages of the project, traditional performance scripting and execution will replace the harness created earlier. This implies significant rework, which is itself a challenge given the sprint timeline, so sprints should be planned with this in mind too.
5. Unstable builds
As development and testing go on simultaneously, a break-off time is needed within each sprint; the alternative, rebuilding the script continuously as the code changes, is quite difficult. After the break-off point the performance team creates the required test script and tests the code. The break-off point exists only to let the performance team work against a stable test case; if major changes occur after it, they are incorporated in the next sprint.
6. Highly Skilled Performance Team
Last but not least, as the cliché goes. More than the skill level, it is imperative that team members have the confidence to get their hands dirty, learning about the system and innovating as challenges crop up during the various sprints, while keeping the performance test approach in mind, and thus ensuring a quality product.
The Agile methodology does present a lot of challenges, at least from a performance standpoint. However, if the performance angle is kept in mind from the beginning, it will certainly reduce a lot of pain later. From a performance tester's standpoint, Agile is a gold mine, as you get involved with developers at an early stage of the application. This leads to a greater understanding of the application and its internal workings, which eventually helps in better identification of performance bottlenecks.

Performance Testing and Non-Functional Requirements

Performance Testing

Performance testing is one of those phrases on which there is little agreement as to what it precisely means. However, the various descriptions given for performance testing show broad agreement about the non-functional requirements that need to be tested.
Another complicating factor is that the various types of performance testing are all done using a load testing tool. The tool, like all tools, enables somebody to do a job; it does not do the testing by itself. For that you need a trained and experienced performance or load tester.

 

Part-1

Performance Testing and Non-Functional Requirements

The performance-based non-functional requirements to be tested cover a number of questions which the owners of the system can reasonably ask. These include:
  • How many users can the system deal with?
  • What performance will each type of transaction have?
  • What happens if very high usage occurs?
  • What happens if the system fails?
  • Can the system operate reliably over a period of time?
Each of these questions then has detailed numbers placed against it to allow testing to take place.

Types of Performance Testing

For each of these questions a separate type of test can be defined. The most common confusion is between load and performance testing: some documentation does not mention one or the other, and some swaps their meanings, which results in a general lack of clarity. In essence the two types of testing are mirror images of each other.
This article argues that although a load testing tool is used for each type of testing, the key difference between them is the Acceptance Criteria or objectives of each type of test. The types of test matching the non-functional questions are:
  • Load Testing – How many users supported.
  • Performance Testing – The performance of each transaction type.
  • Stress Testing – Explores high but valid usage.
  • Robustness Testing – Explores the effect of unavailability of resources.
  • Endurance Testing – Checks long term running of the system.
To illustrate each type of testing, a simple real-world model will be used. Imagine our system is a road network connecting houses with shops, schools and workplaces. Our transactions are users going from one point in the network to another, such as from the house to the workplace. More complicated transactions might be leaving the house to drop off children at school and driving on from there to work. The functional testing already done would have checked that each type of transaction is possible, that is, that roads exist between the network points for each route users want to follow. The following paragraphs illustrate how the various types of load testing apply to this model.

Load Testing

Load Testing asks whether, within a given performance range, varying numbers of users can be dealt with. It is concerned with checking various patterns of load and seeing whether the overall pattern of usage is delivered, rather than checking the details of individual transactions.
In terms of the road network model this means simulating a wide variety of usage patterns. Does the school and work rush-hour peak load perform acceptably? Can picking up children during the school homeward rush be dealt with? Can delivery vans reach the shops in time without blocking the roads? These are all issues of checking the volume and mix of transactions.

Performance Testing

Performance Testing itself checks that, for a given range of user loads, each of the specific transaction performance criteria is met. It is concerned with finding where the problems are with individual transactions, rather than with the delivery of the overall load.
The road network would show this by tracking individual users during the various traffic patterns and seeing whether they get to school or work on time, whether they are held up in traffic jams, and where the problems occur, so that changes can be made to the road network.

Stress Testing

Stress Testing explores what happens to the system as it is put under valid load in excess of its design parameters. It is sometimes described as testing to break the system, but breaking it is not the point of the test, though it may be the result. The purpose is to put the system under a number of very high load patterns and observe what happens: does the system merely slow down, come to a stop, or suffer serious damage?
For a road network this would be the equivalent of putting in many more vehicles. Perhaps the school is having an exchange visit, there is a conference in town, or the shops are having a special sale that attracts out-of-town visitors. Any or several of these would put stress on the network.

Robustness Testing

Robustness Testing, in contrast to Stress Testing, deals with the response of the system under various loads when resources, such as a dependent system, hardware, or the network, become unavailable. Again the purpose is not to break the system but to study its response to the various resource failures under several load patterns.
The equivalent for the road network would be roads collapsing, schools or workplaces shutting down, or the shops deciding not to open. The purpose is to see whether everybody could still get around town.

Endurance Testing

Endurance Testing checks how stable the system is over a period of time. It applies a high but normal load pattern of transactions to the system for a continuous period. Checks are made during the run to see whether any subtle errors build up over time that can affect the performance or integrity of the system.
In road network terms this would be checking usage over several consecutive days to see what happens. It may be that vehicles break down and stop people getting around, or that because of extended jams schools and shops do not open on time. Such errors build up over a period of time.

Load Testing Tools

All these tests use a load testing tool to assist in the running of the tests. The Acceptance Criteria and tests have to be designed by the testers and then implemented in the tool. The tool itself has three key functions:
  1. Provide a simulated load to the system which can vary with time.
  2. Record the results of running test loads.
  3. Provide tools to display the results of tests.
In road terms this means providing more vehicles to go on runs within the network, having watchers count vehicles and record what is going on, and then writing a report on what is found.

When to Performance Test

  • Types of Testing employed during a project including when performance testing takes place.
  • User Requirements - three areas to include, such as non-functional requirements, and one to avoid when writing user requirements.

 

Part-2

Before starting performance testing, the performance test engineer should collect the non-functional requirements. Below is a comprehensive list of performance testing requirements.

  • Total user base: How many users will have access to the application?
    • The customer will provide the answer as to how many users have access to the application.
    • If the application is an existing one, the customer can get the data from previous years to understand how many users use it. Third-party tools like WebTrends can analyze the traffic of an existing application.
    • If the application is new and not yet live, the business analyst and the application owners should be able to provide the total user base.
  • Concurrent User Load: What is the total number of users who will access the application together at any given point in time?
    • As an industry rule of thumb, about 10% of the total user base is taken as the concurrent load, but this formula is not applicable to all domains.
    • Third-party tools like WebTrends can also be useful here.
  • Business Processes (Performance Scenarios):
    • What are the business processes (workflows) to be performance tested with concurrent users? The objective is to find the few business processes that cover up to 80% of the application code.
    • Load Distribution: What is the user/load distribution across each of the business processes listed?
    • What is the time taken for each business process to complete end to end (workflow)?
    • How often do users perform each business process (workflow) in an hour, a day, etc.? For an existing application, usage logs can provide this data.
  • Service Level Agreement (Performance Service Level Agreement):
    • What are the acceptable response times for each transaction (page) in all the business flows to be performance tested? Mention the SLA against each identified transaction (page), for example:
    • 1. All pages within the portal shall load in 3 seconds or less.
    • 2. Reports should load in 5 seconds.
    • 3. The yearly report should load in 20 seconds.
    • 4. SecgetAccountHistory API’s roundtrip response time should be < 250 msec
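A page SLA like the ones above can be checked mechanically. The sketch below compares a measured response time against the 3-second portal SLA; the measured value is hard-coded for illustration, and the URL in the comment is hypothetical. In practice you would capture the timing with curl's `%{time_total}` write-out variable.

```shell
# SLA check sketch. In a real run, capture the timing with e.g.:
#   measured=$(curl -o /dev/null -s -w '%{time_total}' https://portal.example.com/home)
measured=2.4   # seconds (hypothetical sample value)
sla=3.0        # "all pages shall load in 3 seconds or less"
awk -v m="$measured" -v s="$sla" 'BEGIN { if (m <= s) print "PASS"; else print "FAIL" }'
# prints "PASS"
```

Run in a loop over each identified transaction, this gives a crude but automatable pass/fail report against the SLA table.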

SAR Command in Unix

Using sar you can monitor the performance of various Linux subsystems (CPU, memory, I/O, etc.) in real time.
Using sar, you can also collect all performance data on an ongoing basis, store it, and do historical analysis to identify bottlenecks.

Sar is part of the sysstat package.
This article explains how to install and configure sysstat package (which contains sar utility) and explains how to monitor the following Linux performance statistics using sar.
  1. Collective CPU usage
  2. Individual CPU statistics
  3. Memory used and available
  4. Swap space used and available
  5. Overall I/O activities of the system
  6. Individual device I/O activities
  7. Context switch statistics
  8. Run queue and load average data
  9. Network statistics
  10. Report sar data from a specific time
This is the only guide you’ll need for sar utility. So, bookmark this for your future reference.

I. Install and Configure Sysstat

Install Sysstat Package

First, make sure the latest version of sar is available on your system. Install it using any one of the following methods depending on your distribution.
sudo apt-get install sysstat
(or)
yum install sysstat
(or)
rpm -ivh sysstat-10.0.0-1.i586.rpm

Install Sysstat from Source

wget http://pagesperso-orange.fr/sebastien.godard/sysstat-10.0.0.tar.bz2

tar xvfj sysstat-10.0.0.tar.bz2

cd sysstat-10.0.0

./configure --enable-install-cron
Note: Make sure to pass the option --enable-install-cron. This does the following automatically for you. If you don't configure sysstat with this option, you have to do this ugly job manually yourself.
  • Creates /etc/rc.d/init.d/sysstat
  • Creates appropriate links from /etc/rc.d/rc*.d/ directories to /etc/rc.d/init.d/sysstat to start the sysstat automatically during Linux boot process.
  • For example, /etc/rc.d/rc3.d/S01sysstat is linked automatically to /etc/rc.d/init.d/sysstat
After the ./configure, install it as shown below.
make

make install
Note: This will install sar and the other sysstat utilities under /usr/local/bin
Once installed, verify the sar version using “sar -V”. Version 10 is the current stable version of sysstat.
$ sar -V
sysstat version 10.0.0
(C) Sebastien Godard (sysstat  orange.fr)
Finally, make sure sar works. For example, the following gives the system CPU statistics 3 times (with 1 second interval).
$ sar 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Utilities part of Sysstat

Following are the other sysstat utilities.
  • sar collects and displays all system activity statistics.
  • sadc stands for "system activity data collector". This is the sar backend tool that does the data collection.
  • sa1 stores system activities in a binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
  • sa2 creates a daily summary of the collected statistics. sa2 runs from cron.
  • sadf can generate sar reports in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
  • iostat generates CPU and I/O statistics.
  • mpstat displays CPU statistics.
  • pidstat reports statistics based on the process ID (PID).
  • nfsiostat displays NFS I/O statistics.
  • cifsiostat generates CIFS statistics.
This article focuses on sysstat fundamentals and sar utility.

Collect the sar statistics using cron job – sa1 and sa2

Create a sysstat file under the /etc/cron.d directory to collect the historical sar data.
# vi /etc/cron.d/sysstat
*/10 * * * * root /usr/local/lib/sa/sa1 1 1
53 23 * * * root /usr/local/lib/sa/sa2 -A
If you’ve installed sysstat from source, the default location of sa1 and sa2 is /usr/local/lib/sa. If you’ve installed using your distribution update method (for example: yum, up2date, or apt-get), this might be /usr/lib/sa/sa1 and /usr/lib/sa/sa2.
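To see which location applies on your machine, you can probe the common paths. This is just a convenience sketch; the directory list covers the paths mentioned above plus /usr/lib64/sa as an assumption for 64-bit package installs.

```shell
# Probe common install locations for sa1 (source install vs package install).
for dir in /usr/local/lib/sa /usr/lib/sa /usr/lib64/sa; do
  [ -x "$dir/sa1" ] && echo "sa1: $dir/sa1"
done
echo "probe complete"
```

Use whichever path is reported in your /etc/cron.d/sysstat entries.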

/usr/local/lib/sa/sa1

  • This runs every 10 minutes and collects sar data for historical reference.
  • If you want to collect sar statistics every 5 minutes, change */10 to */5 in the above /etc/cron.d/sysstat file.
  • This writes the data to /var/log/sa/saXX file. XX is the day of the month. saXX file is a binary file. You cannot view its content by opening it in a text editor.
  • For example, if today is the 26th day of the month, sa1 writes the sar data to /var/log/sa/sa26
  • You can pass two parameters to sa1: interval (in seconds) and count.
  • In the above crontab example, sa1 1 1 means that sa1 collects sar data once, at a 1-second interval, every 10 minutes.

/usr/local/lib/sa/sa2

  • This runs close to midnight (at 23:53) to create the daily summary report of the sar data.
  • sa2 creates the /var/log/sa/sarXX file (note that this is different from the saXX file created by sa1). The sarXX file created by sa2 is an ASCII file that you can view in a text editor.
  • It also removes saXX files that are older than a week. So, write a quick shell script that runs every week to copy the /var/log/sa/* files to another directory for historical sar data analysis.
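A minimal sketch of such an archive script follows. The SRC/DEST paths are assumptions: DEST defaults to /tmp/sa-archive so the script is safe to try, so point it at real storage (and run it from weekly cron) before relying on it.

```shell
#!/bin/sh
# Copy the saXX binary files out of /var/log/sa before sa2 purges
# week-old ones. SRC and DEST defaults are assumptions; adjust them.
SRC="${SRC:-/var/log/sa}"
DEST="${DEST:-/tmp/sa-archive}/$(date +%Y-%m-%d)"
mkdir -p "$DEST"
# Copy any day files (sa01..sa31) that exist; a missing source is ignored.
cp "$SRC"/sa[0-3][0-9] "$DEST"/ 2>/dev/null || true
echo "archive dir: $DEST"
```

Dropped into weekly cron, this preserves each week's binary files so sadf or "sar -f" can still read them months later.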

II. 10 Practical Sar Usage Examples

There are two ways to invoke sar.
  1. sar followed by an option (without specifying a saXX data file). This looks for the current day's saXX data file and reports the performance data recorded up to that point for the current day.
  2. sar followed by an option, additionally specifying a saXX data file using the -f option. This reports the performance data for that particular day (XX is the day of the month).
In all the examples below, we explain how to view certain performance data for the current day. To look at a specific day, add "-f /var/log/sa/saXX" at the end of the sar command.
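Since XX is just the zero-padded day of the month, the current day's file path can be computed rather than typed by hand. A small sketch:

```shell
# Compute today's sar data file path; date +%d gives the zero-padded
# day of the month (e.g. "26"), matching the saXX naming convention.
today_file="/var/log/sa/sa$(date +%d)"
echo "$today_file"
# Usage (only meaningful if the file exists):  sar -u -f "$today_file"
```

This is handy in monitoring scripts that must always read the current day's data file.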
Every sar command will have the following as the first line of its output.
$ sar -u
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)
  • Linux 2.6.18-194.el5PAE – Linux kernel version of the system.
  • (dev-db) – The hostname where the sar data was collected.
  • 03/26/2011 – The date when the sar data was collected.
  • _i686_ – The system architecture
  • (8 CPU) – Number of CPUs available on this system. On multi core systems, this indicates the total number of cores.

1. CPU Usage of ALL CPUs (sar -u)

This gives the cumulative real-time CPU usage of all CPUs. "1 3" reports every 1 second, a total of 3 times. Most likely you'll focus on the last field, "%idle", to see the CPU load.
$ sar -u 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50
Following are a few variations:
  • sar -u Displays CPU usage for the current day that was collected until that point.
  • sar -u 1 3 Displays real-time CPU usage every 1 second, 3 times.
  • sar -u ALL Same as "sar -u" but displays additional fields.
  • sar -u ALL 1 3 Same as "sar -u 1 3" but displays additional fields.
  • sar -u -f /var/log/sa/sa10 Displays CPU usage for the 10th day of the month from the sa10 file.

2. CPU Usage of Individual CPU or Core (sar -P)

If you have 4 cores on the machine and would like to see what the individual cores are doing, do the following.
"-P ALL" indicates that statistics should be displayed for all the individual cores.
In the following example, under the "CPU" column, 0, 1, 2, and 3 indicate the corresponding CPU core numbers.
$ sar -P ALL 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:34:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:34:13 PM       all     11.69      0.00      4.71      0.69      0.00     82.90
01:34:13 PM         0     35.00      0.00      6.00      0.00      0.00     59.00
01:34:13 PM         1     22.00      0.00      5.00      0.00      0.00     73.00
01:34:13 PM         2      3.00      0.00      1.00      0.00      0.00     96.00
01:34:13 PM         3      0.00      0.00      0.00      0.00      0.00    100.00
"-P 1" indicates that statistics should be displayed only for the 2nd core. (Note that core numbering starts from 0.)
$ sar -P 1 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:36:25 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:36:26 PM         1      8.08      0.00      2.02      1.01      0.00     88.89
Following are a few variations:
  • sar -P ALL Displays CPU usage broken down by all cores for the current day.
  • sar -P ALL 1 3 Displays real-time CPU usage every 1 second, 3 times (broken down by all cores).
  • sar -P 1 Displays CPU usage for core number 1 for the current day.
  • sar -P 1 1 3 Displays real-time CPU usage for core number 1, every 1 second, 3 times.
  • sar -P ALL -f /var/log/sa/sa10 Displays CPU usage broken down by all cores for the 10th day of the month from the sa10 file.

3. Memory Free and Used (sar -r)

This reports the memory statistics. "1 3" reports every 1 second, a total of 3 times. Most likely you'll focus on "kbmemfree" and "kbmemused" for free and used memory.
$ sar -r 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:28:06 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact
07:28:07 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:08 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:09 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Average:      6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Following are a few variations:
  • sar -r
  • sar -r 1 3
  • sar -r -f /var/log/sa/sa10

4. Swap Space Used (sar -S)

This reports the swap statistics. "1 3" reports every 1 second, a total of 3 times. If "kbswpused" and "%swpused" are at 0, your system is not swapping.
$ sar -S 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:31:06 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
07:31:07 AM   8385920         0      0.00         0      0.00
07:31:08 AM   8385920         0      0.00         0      0.00
07:31:09 AM   8385920         0      0.00         0      0.00
Average:      8385920         0      0.00         0      0.00
Following are a few variations:
  • sar -S
  • sar -S 1 3
  • sar -S -f /var/log/sa/sa10
Notes:
  • Use “sar -R” to identify number of memory pages freed, used, and cached per second by the system.
  • Use “sar -H” to identify the hugepages (in KB) that are used and available.
  • Use “sar -B” to generate paging statistics. i.e Number of KB paged in (and out) from disk per second.
  • Use “sar -W” to generate page swap statistics. i.e Page swap in (and out) per second.

5. Overall I/O Activities (sar -b)

This reports I/O statistics. "1 3" reports every 1 second, a total of 3 times.
The following fields are displayed in the example below.
  • tps – Transactions per second (this includes both read and write)
  • rtps – Read transactions per second
  • wtps – Write transactions per second
  • bread/s – Bytes read per second
  • bwrtn/s – Bytes written per second
$ sar -b 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:56:28 PM       tps      rtps      wtps   bread/s   bwrtn/s
01:56:29 PM    346.00    264.00     82.00   2208.00    768.00
01:56:30 PM    100.00     36.00     64.00    304.00    816.00
01:56:31 PM    282.83     32.32    250.51    258.59   2537.37
Average:       242.81    111.04    131.77    925.75   1369.90
Following are a few variations:
  • sar -b
  • sar -b 1 3
  • sar -b -f /var/log/sa/sa10
Note: Use “sar -v” to display number of inode handlers, file handlers, and pseudo-terminals used by the system.

6. Individual Block Device I/O Activities (sar -d)

To identify the activities of individual block devices (i.e., a specific mount point, LUN, or partition), use "sar -d".
$ sar -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM    dev8-0      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM    dev8-1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM dev120-64      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM dev120-65      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM  dev120-0      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM  dev120-1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM dev120-96      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM dev120-97      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
In the above example, "DEV" indicates the specific block device.
For example, "dev53-1" means a block device with 53 as the major number and 1 as the minor number.
The device name (DEV column) can display the actual device name (for example: sda, sda1, sdb1, etc.) if you use the -p option (pretty print), as shown below.
$ sar -p -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM       sda      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sda1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sdb1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sdc1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sde1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sdf1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sda2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM      sdb2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
Following are a few variations:
  • sar -d
  • sar -d 1 3
  • sar -d -f /var/log/sa/sa10
  • sar -p -d

7. Display context switch per second (sar -w)

This reports the total number of processes created per second and the total number of context switches per second. "1 3" reports every 1 second, a total of 3 times.
$ sar -w 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

08:32:24 AM    proc/s   cswch/s
08:32:25 AM      3.00     53.00
08:32:26 AM      4.00     61.39
08:32:27 AM      2.00     57.00
Following are a few variations:
  • sar -w
  • sar -w 1 3
  • sar -w -f /var/log/sa/sa10

8. Reports run queue and load average (sar -q)

This reports the run queue size and the load average over the last 1 minute, 5 minutes, and 15 minutes. "1 3" reports every 1 second, a total of 3 times.
$ sar -q 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

06:28:53 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
06:28:54 AM         0       230      2.00      3.00      5.00         0
06:28:55 AM         2       210      2.01      3.15      5.15         0
06:28:56 AM         2       230      2.12      3.12      5.12         0
Average:            3       230      3.12      3.12      5.12         0
Note: The “blocked” column displays the number of tasks that are currently blocked and waiting for I/O operation to complete.
Following are a few variations:
  • sar -q
  • sar -q 1 3
  • sar -q -f /var/log/sa/sa10

9. Report network statistics (sar -n)

This reports various network statistics: for example, the number of packets received (and transmitted) through the network card, packet failure statistics, and so on. "1 3" reports every 1 second, a total of 3 times.
sar -n KEYWORD
KEYWORD can be one of the following:
  • DEV – Displays vital statistics for network devices such as eth0, eth1, etc.
  • EDEV – Display network device failure statistics
  • NFS – Displays NFS client activities
  • NFSD – Displays NFS server activities
  • SOCK – Displays sockets in use for IPv4
  • IP – Displays IPv4 network traffic
  • EIP – Displays IPv4 network errors
  • ICMP – Displays ICMPv4 network traffic
  • EICMP – Displays ICMPv4 network errors
  • TCP – Displays TCPv4 network traffic
  • ETCP – Displays TCPv4 network errors
  • UDP – Displays UDPv4 network traffic
  • SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
  • ALL – This displays all of the above information. The output will be very long.
$ sar -n DEV 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:11:13 PM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
01:11:14 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:11:14 PM      eth0    342.57    342.57  93923.76 141773.27      0.00      0.00      0.00
01:11:14 PM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00

10. Report Sar Data Using Start Time (sar -s)

When you view historic sar data from the /var/log/sa/saXX file using the "sar -f" option, it displays all the sar data for that specific day, starting from 12:00 a.m.
Using the "-s hh:mi:ss" option, you can specify the start time. For example, "sar -s 10:00:00" displays the sar data starting from 10 a.m. (instead of from midnight), as shown below.
You can combine the -s option with other sar options.
For example, to report the load average on the 23rd of this month starting from 10 a.m., combine the -q and -s options as shown below.
$ sar -q -f /var/log/sa/sa23 -s 10:00:01
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
...
11:20:01 AM         0       127      5.00      3.00      3.00         0
12:00:01 PM         0       127      4.00      2.00      1.00         0
Recent sysstat versions can limit the end time with the "-e hh:mm:ss" option; otherwise you have to get creative and use the head command as shown below.
For example, starting from 10 a.m., if you want to see 7 entries, pipe the above output to "head -n 10".
$ sar -q -f /var/log/sa/sa23 -s 10:00:01 | head -n 10
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
10:30:01 AM         0       127      3.00      5.00      2.00         0
10:40:01 AM         0       127      4.00      2.00      1.00         2
10:50:01 AM         0       127      3.00      5.00      5.00         0
11:00:01 AM         0       127      2.00      1.00      6.00         0
11:10:01 AM         0       127      1.00      3.00      7.00         2
There is a lot more to cover in Linux performance monitoring and tuning. We are only getting started; more articles are to come in the performance series.

SSH Command in Unix

Listed below are 5 basic usages of the SSH command:
  1. Identify SSH client version
  2. Login to remote host
  3. Transfer Files to/from remote host
  4. Debug SSH client connection
  5. SSH escape character usage (toggle SSH session, SSH session statistics, etc.)

1. SSH Client Version:

Sometimes it may be necessary to identify the SSH client that you are currently running and its corresponding version number, which can be done as shown below. Please note that Linux typically comes with OpenSSH.
$ ssh -V
OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003

$ ssh -V
ssh: SSH Secure Shell 3.2.9.1 (non-commercial version) on i686-pc-linux-gnu
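If a script needs the version string programmatically, note that OpenSSH prints the -V banner to stderr rather than stdout, so redirect it before filtering. A minimal sketch (the grep pattern assumes an OpenSSH-style banner like the first example above):

```shell
# OpenSSH writes the -V banner to stderr; capture it with 2>&1 before parsing
banner=$(ssh -V 2>&1)        # e.g. "OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003"
printf '%s\n' "$banner" | grep -o 'OpenSSH_[^,]*'
```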

2. Login to remote host:

  • The first time you log in to the remotehost from the localhost, a host-key-not-found message is displayed, and you can answer “yes” to continue. The host key of the remote host is then added under the .ssh2/hostkeys directory of your home directory, as shown below.
localhost$ ssh -l jsmith remotehost.example.com

Host key not found from database.
Key fingerprint:
xabie-dezbc-manud-bartd-satsy-limit-nexiu-jambl-title-jarde-tuxum
You can get a public key’s fingerprint by running
% ssh-keygen -F publickey.pub
on the keyfile.
Are you sure you want to continue connecting (yes/no)? yes
Host key saved to /home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub
host key for remotehost.example.com, accepted by jsmith Mon May 26 2008 16:06:50 -0700
jsmith@remotehost.example.com password:
remotehost.example.com$
  • The second time you log in to the remote host from the localhost, it prompts only for the password, since the remote host key is already in the known-hosts list of the ssh client.
         localhost$ ssh -l jsmith remotehost.example.com
         jsmith@remotehost.example.com password: 
         remotehost.example.com$
  • If the host key of the remote host changes after you logged in for the first time, you may get a warning message as shown below. This can happen for various reasons: 1) the sysadmin upgraded or reinstalled the SSH server on the remote host, or 2) someone is doing something malicious. Before answering “yes” to the prompt below, the best course of action is to call your sysadmin, find out why the host key changed, and verify whether the new key is correct.
        localhost$ ssh -l jsmith remotehost.example.com
         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
         @       WARNING: HOST IDENTIFICATION HAS CHANGED!         @
         @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
         IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
         Someone could be eavesdropping on you right now (man-in-the-middle attack)!
         It is also possible that the host key has just been changed.
         Please contact your system administrator.
         Add correct host key to "/home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub"
         to get rid of this message.
        Received server key's fingerprint:
        xabie-dezbc-manud-bartd-satsy-limit-nexiu-jambl-title-jarde-tuxum
        You can get a public key's fingerprint by running
         % ssh-keygen -F publickey.pub
         on the keyfile.
         Agent forwarding is disabled to avoid attacks by corrupted servers.
         Are you sure you want to continue connecting (yes/no)? yes
         Do you want to change the host key on disk (yes/no)? yes
         Agent forwarding re-enabled.
         Host key saved to /home/jsmith/.ssh2/hostkeys/key_22_remotehost.example.com.pub
         host key for remotehost.example.com, accepted by jsmith Mon May 26 2008 16:17:31 -0700
         jsmith@remotehost.example.com's password: 
        remotehost$
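The transcripts above are from the SSH2 (SSH Secure Shell) client, which stores keys under ~/.ssh2/hostkeys. If you use OpenSSH instead, host keys live in ~/.ssh/known_hosts, and once your sysadmin has confirmed that the key change is legitimate, you can drop the stale entry with ssh-keygen. A sketch, reusing the example hostname above:

```shell
# OpenSSH equivalent: after verifying the new key with your sysadmin,
# remove the stale entry so the next connection can record the new one.
# -R deletes all keys belonging to the given hostname from known_hosts.
mkdir -p ~/.ssh && touch ~/.ssh/known_hosts   # ensure the file exists
ssh-keygen -R remotehost.example.com
```

On the next login, ssh prompts you to accept the new key exactly as on a first-time connection.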

3. File transfer to/from remote host:

Another common use of ssh client is to copy files from/to remote host using scp.
  • Copy file from the remotehost to the localhost:
        localhost$ scp jsmith@remotehost.example.com:/home/jsmith/remotehostfile.txt remotehostfile.txt
  • Copy file from the localhost to the remotehost:
        localhost$ scp localhostfile.txt jsmith@remotehost.example.com:/home/jsmith/localhostfile.txt

4. Debug SSH Client:

Sometimes it is necessary to view debug messages to troubleshoot SSH connection issues. For this purpose, pass the -v (lowercase v) option to ssh as shown below.
  • Example without debug message:
        localhost$ ssh -l jsmith remotehost.example.com
        warning: Connecting to remotehost.example.com failed: No address associated to the name
        localhost$
  • Example with debug message:
        localhost$ ssh -v -l jsmith remotehost.example.com
        debug: SshConfig/sshconfig.c:2838/ssh2_parse_config_ext: Metaconfig parsing stopped at line 3.
        debug: SshConfig/sshconfig.c:637/ssh_config_set_param_verbose: Setting variable 'VerboseMode' to 'FALSE'.
        debug: SshConfig/sshconfig.c:3130/ssh_config_read_file_ext: Read 17 params from config file.
        debug: Ssh2/ssh2.c:1707/main: User config file not found, using defaults. (Looked for '/home/jsmith/.ssh2/ssh2_config')
        debug: Connecting to remotehost.example.com, port 22... (SOCKS not used)
        warning: Connecting to remotehost.example.com failed: No address associated to the name

5. Escape Character: (Toggle SSH session, SSH session statistics etc.)

The escape character ~ gets the SSH client's attention, and the character following the ~ determines the escape command.
Toggle SSH Session: When you have logged on to the remotehost using ssh from the localhost, you may want to come back to the localhost to perform some activity and then go back to the remote host. In this case, you don't need to disconnect the ssh session to the remote host; instead, follow the steps below.
  • Login to remotehost from localhost: localhost$ ssh -l jsmith remotehost
  • Now you are connected to the remotehost: remotehost$
  • To come back to the localhost temporarily, type the escape character ~ followed by Control-Z. Note that the ~ is not echoed to the screen when you type it. So, at the start of a new line on the remotehost, enter the following keystrokes: ~<Control-Z>
    remotehost$ ~^Z
    [1]+  Stopped                 ssh -l jsmith remotehost
    localhost$ 
  • Now you are back on the localhost, and the ssh session to the remotehost runs as a typical Unix background job, which you can check as shown below:
    localhost$ jobs
    [1]+  Stopped                 ssh -l jsmith remotehost
  • You can go back to the remote host without entering the password again by bringing the backgrounded ssh job to the foreground on the localhost:
    localhost$ fg %1
    ssh -l jsmith remotehost
    remotehost$
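The ~<Control-Z> trick above is just one of several escape sequences. In OpenSSH (the SSH2 client's set differs slightly), typing ~? at the start of a line prints the full list; commonly used ones include:

```shell
# Typed at the start of a line inside an OpenSSH session:
#   ~.     terminate the connection
#   ~^Z    suspend ssh into the local shell (the toggle technique shown above)
#   ~#     list forwarded connections
#   ~?     display a summary of all escape characters
#   ~~     send a literal ~ character to the remote host
```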
SSH Session statistics: To get some useful statistics about the current ssh session, do the following. This works only with the SSH2 client.
  • Login to remotehost from localhost: localhost$ ssh -l jsmith remotehost
  • On the remotehost, type the ssh escape character ~ followed by s as shown below. This displays a lot of useful statistics about the current SSH connection.
        remotehost$  [Note: The ~s is not visible on the command line when you type.] 
        remote host: remotehost
        local host: localhost
        remote version: SSH-1.99-OpenSSH_3.9p1
        local version:  SSH-2.0-3.2.9.1 SSH Secure Shell (non-commercial)
        compressed bytes in: 1506
        uncompressed bytes in: 1622
        compressed bytes out: 4997
        uncompressed bytes out: 5118
        packets in: 15
        packets out: 24
        rekeys: 0
        Algorithms:
        Chosen key exchange algorithm: diffie-hellman-group1-sha1
        Chosen host key algorithm: ssh-dss
        Common host key algorithms: ssh-dss,ssh-rsa
        Algorithms client to server:
        Cipher: aes128-cbc
        MAC: hmac-sha1
        Compression: zlib
        Algorithms server to client:
        Cipher: aes128-cbc
        MAC: hmac-sha1
        Compression: zlib
        localhost$