| Using EVAPerf and PerfMon for HP EVA performance monitoring |
|
This section describes the EVA performance information that the Windows utility PerfMon displays when it is used in conjunction with EVAPerf. Each metric is explained, together with how it is collected. A future release of this document will describe how to manipulate these metrics to obtain additional information, and how to use them to perform bottleneck analysis and capacity planning on the EVA.
Sampling interval
The EVA continuously collects performance information, updating various internal counters as events occur. Once a second, the data in these internal counters is transferred to an area that is accessible to external applications (such as PerfMon), overwriting any old information in that area. The EVA internal counters are then zeroed, and performance information is collected for another one-second interval. Because this transfer and zeroing occurs once every second, the performance information that is available externally represents the average value over the preceding second only.
Although the sampling interval in PerfMon can be set to any value, the data collected and displayed by PerfMon represents the performance of the EVA only for the one-second interval preceding the instant when PerfMon sampled the data. Thus, if the PerfMon sampling interval is set to 15 seconds (the default value), the information displayed by PerfMon represents a one-second average, taken once every 15 seconds. Because of this once-per-second averaging performed by the EVA, if PerfMon is set to an interval other than one second, the displayed data may not represent actual conditions on the EVA over that interval. If the EVA performance is steady and invariant, the data will be accurate. If, however, changes in the I/O workload or other conditions cause transient behavior on the EVA, the displayed PerfMon data may not accurately represent the true performance of the EVA.
It is therefore recommended that the PerfMon sampling interval be set as low as possible in order to ascertain whether there is any transient performance behavior. Long sampling intervals should be used with caution, since they can miss valuable information and give only a partial picture of the EVA performance.
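The effect of a long sampling interval can be illustrated with a small Python sketch. It is only a model of the behavior described above, not EVAPerf or PerfMon code: the workload values, the burst timing, and the helper name `sample_every` are all hypothetical.

```python
def sample_every(per_second, interval):
    """Return the values a monitor would record if it read the
    one-second averages once every `interval` seconds."""
    return per_second[interval - 1::interval]


# Hypothetical workload: 60 one-second averages of request rate.
# Steady 1,000 requests/s, with a 5-second burst to 9,000 at t = 20..24.
per_second = [1000] * 60
for t in range(20, 25):
    per_second[t] = 9000

true_mean = sum(per_second) / len(per_second)
samples_15s = sample_every(per_second, 15)  # reads t = 14, 29, 44, 59
samples_1s = sample_every(per_second, 1)    # reads every second

print("true mean over 60 s:", round(true_mean, 2))
print("15 s sampling sees: ", samples_15s)
print("1 s sampling peak:  ", max(samples_1s))
```

In this made-up case, every 15-second sample falls outside the burst, so the 15-second view shows a flat 1,000 requests per second, while the one-second view captures the 9,000 peak.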
At the same time, it should be noted that very small sampling intervals will result in large amounts of data, and in extreme conditions may have a measurable impact on host performance. The correct choice of sampling interval should be based on these tradeoffs, and will vary from one situation to another.

Metrics layout
The remainder of this document deals with the individual performance metrics. The metrics are divided into six main groups, shown as “Objects” within PerfMon. Within each object group, the metrics are further subdivided into specific counters, shown as “Counters” within PerfMon. Finally, each counter for each object has one or more “Instances”, where an instance represents a particular occurrence of that metric. Since the instances are identical within an object group, they are shown below immediately following the object description and preceding the individual counter descriptions. As an example, one object is an EVA virtual disk. For each virtual disk, there are several counters, such as the read and write request rates. For each of these counters, there is one instance for every virtual disk on the EVA.

Kilobytes
When measuring quantities, there are two standards currently in use. For disks and performance information, the generally accepted industry standard is that powers of 10 are used, so a kilobyte (KB) is equal to 10^3, or 1,000 bytes. For memory sizes, most Microsoft utilities, the SAN Appliance, and the EVA performance metrics, powers of 2 are used, so 1 KB is equal to 2^10, or 1,024 bytes. Although the difference between the two methods is slight, it is nonetheless measurable, and may cause confusion when the results of performance tests run with external tools that use powers of 10 are compared against EVAPerf, which uses powers of 2.

Object: EVA Host Connection
The EVA host connection object covers performance information that is common to all host connections on the EVA. The two counters in this object deal with the host port connections on the EVA, and provide information for each external host adapter that is connected to the EVA.

Instances
There is one instance of each of the following counters for each host HBA that is connected to the EVA. For each instance, the WWID of the host HBA is present as an identifier.

Counter: Queue Depth
This counter keeps track of the number of outstanding host requests. For each I/O that is issued by the host, this counter is incremented, and when a host request is completed, the counter is decremented.

Object: EVA Host Port Statistics
The EVA host port statistics collect information on the four EVA host ports (two per controller). These metrics provide information on the performance and data flow through each of these four ports.

Instances
There are a total of four instances for each of the following counters, one for each EVA port. The instances are identified as follows:
• Port 0 THIS
• Port 1 THIS
• [Port 2 THIS]
• [Port 3 THIS]
• Port 0 OTHER
• Port 1 OTHER
• [Port 2 OTHER]
• [Port 3 OTHER]

Counter: Av Queue Depth
This counter tracks the total number of outstanding host commands for one specific EVA host port. The count represents the number of outstanding I/O requests from all hosts that are connected to a specific EVA host port.

Counter: Read KBS
This counter tabulates the total KB of data that was sent through a specific host port as a result of read commands. Since the update frequency of the EVA data collection is once per second, this translates directly into read KB per second.

Counter: Read Latency
This counter tracks the amount of time from when the EVA receives a read request until it returns completion of that request to the requesting host over a specific EVA host port. This time, which is measured in microseconds, is an average of the request latency for all virtual disks on the system, and includes both cache hits and misses.

Counter: Read RPS
This counter tabulates the number of read commands that were sent from a host over a specific EVA host port. Since the update frequency of the EVA data collection is once per second, this translates directly into read requests per second. Note that this counter only counts read requests; read completions are not tabulated.

Counter: Write KBS
This counter tabulates the total KB of data that was sent over a specific EVA host port as a result of write commands. Since the update frequency of the EVA data collection is once per second, this translates directly into write KB per second.

Counter: Write Latency
This counter tracks the amount of time from when the EVA receives a write request over a specific EVA host port until it returns completion of that request to the requesting host. This time, which is measured in microseconds, is an average of all write requests to all virtual disks on the system.

Counter: Write RPS
This counter tabulates the number of write commands that were sent from a host over a specific EVA host port. Since the update frequency of the EVA data collection is once per second, this translates directly into write requests per second. Note that this counter only counts requests; completions are not tabulated.

Object: EVA Physical Disk
The physical disk counters keep track of information on each physical disk on the system. There is no information relating these disks to a specific disk group, nor is the activity broken out by the underlying cause of the I/O (host driven, cache flushes, read-ahead, leveling, snapshot activity, etc.).

Instances
There is one instance of these counters for each physical disk on the EVA. Each disk is uniquely identified by a 4-digit hexadecimal number. This number is an internal representation of the disk used by the EVA known as a “noid”, and has no relationship to the shelf or bay where this disk resides.
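The KBS counters described above for the host ports, like those that follow for the physical disks, count kilobytes of 1,024 bytes, as explained under Kilobytes. The following Python sketch shows one way to convert such a reading to power-of-10 units before comparing it with an external tool. The function names and the sample value are hypothetical and are not part of EVAPerf or PerfMon.

```python
BYTES_PER_KB_BINARY = 1024   # convention used by the EVA performance metrics
BYTES_PER_KB_DECIMAL = 1000  # convention used by tools that count in powers of 10


def kbs_to_decimal_kbs(kbs):
    """Convert a KBS reading (1 KB = 1,024 bytes) to decimal KB per second."""
    return kbs * BYTES_PER_KB_BINARY / BYTES_PER_KB_DECIMAL


def kbs_to_decimal_mbs(kbs):
    """Convert a KBS reading to decimal MB per second (1 MB = 1,000,000 bytes)."""
    return kbs * BYTES_PER_KB_BINARY / 1_000_000


reading = 100_000  # hypothetical Read KBS sample
print(kbs_to_decimal_kbs(reading))  # 102400.0
print(kbs_to_decimal_mbs(reading))  # 102.4
```

The two conventions differ by 2.4 percent, so a reading of 100,000 KBS corresponds to 102,400 decimal KB per second.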

Counter: Drive latency
[Need to determine if this is drive or volume latency]
This counter keeps track of the time between when a data transfer command is sent to a disk and when command completion is returned from the disk. This time, which is measured in microseconds, is not broken into read and write latencies, but is simply a “command processing” time. Note that completion of a disk command does not necessarily imply host I/O completion, since the I/O to a specific disk may be only a part of a larger I/O operation.

Counter: Drive Queue Depth
[Need to determine if this is drive or volume queue]
This counter keeps track of the total number of requests that have been sent to the drive but not yet completed. It is incremented whenever a command is sent to the disk, and decremented whenever a command completes.

Counter: Read KBS
This counter keeps track of the amount of data (in KB) that has been read from a drive. Since this counter is updated once per second, this translates directly into read KB per second.

Counter: Read RPS
This counter keeps track of the number of read requests that have been sent to the disk drive. Since this counter is updated once per second, this translates directly into the read requests per second.

Counter: Write KBS
This counter keeps track of the amount of data (in KB) that has been sent to a drive. Since this counter is updated once per second, this translates directly into write KB per second.

Counter: Write RPS
This counter keeps track of the number of write requests that have been sent to the disk drive. Since this counter is updated once per second, this translates directly into the write requests per second.

Object: EVA Port Status
The port status counter keeps track of the fibre ports on the EVA (back-end, mirror, etc.). There is only one counter for this object.

Instances
There are a total of 12 instances for the counter, one for each port on each controller in the EVA.
The instances are as follows:
• DP-1A (Back-end device port)
• [DP-1B (Back-end device port)]
• [DP-1C (Back-end device port)]
• DP-2A (Back-end device port)
• [DP-2B (Back-end device port)]
• [DP-2C (Back-end device port)]
• FP1 (Front end host port)
• FP2 (Front end host port)
• [FP3 (Front end host port)]
• [FP4 (Front end host port)]
• MP1 (Mirror port)
• MP2 (Mirror port)
Each instance is the queue depth for a specific port on one of the two controllers within the EVA, and is identified with a unique ASCII string for each controller.

Counter: Queue Depth
As with other queue counters, this counter tabulates the number of requests that are currently outstanding on each port. The specific commands and data counted for each port vary depending on the function of that port. As an example, the “MP” instance counts the number of mirror port requests that are currently outstanding from one controller to the other.

Object: EVA Storage Cell
The storage cell object keeps track of information that is related to the overall storage system. In essence, it is a quick roll-up of several of the important metrics associated with overall EVA performance.

Instances
There is only a single instance for these counters. With two exceptions (the % data transfer time and % of processor time), this single instance represents the sum total of both controllers.

Counter: % Data Transfer Time
[This counter should not be present in EVAPerf]
This counter displays the percentage of time that the data flow process within the controller is running. This process is responsible for data movement within the controller, and is a primary indicator of how much of the controller’s CPU is being used for user I/O. Although it does track processor utilization for a specific process within the controller, it does not correlate to the percentage of available bandwidth that is currently in use. The value ranges from 0 (idle) to 100 (completely saturated). It is important to note that this counter only shows the CPU utilization of one controller in the EVA pair.
[Need to distinguish which controller EVAPerf is connected to and how to monitor the other one]

Counter: % Processor Time
This counter presents the percentage of time that the CPU on the EVA controller is busy. A completely idle controller will show 0%, while one that is saturated will show a value of 100%. The “% Data Transfer Time” counter is included in this counter, so any difference between the two counters can be attributed to processing within the controller that is not directly related to user I/O (such as leveling and sparing). It is important to note that this counter only shows the CPU utilization of one controller in the EVA pair.
[Need to distinguish which controller EVAPerf is connected to and how to monitor the other one]

Counter: Total host KBS
This counter keeps track of the total KB that has been read and written by all hosts connected to the EVA. Since this information is updated once per second, this translates directly into the total KB per second that the EVA is processing. Note that this is the sum of both read and write data.

Counter: Total host RPS
This counter keeps track of the total number of I/O requests that have been issued by all hosts connected to the EVA. Since this information is updated once per second, this translates directly into the total requests per second that the EVA is processing. Note that this is the sum of both read and write requests.

Object: EVA VDisk
The VDisk object tracks performance for each virtual disk (LUN) on the EVA. It is similar to the physical disk object, but keeps track of virtual LUNs instead. Note that a VDisk could also be a snapshot or snapclone.

Instances
There is one instance of these counters for each virtual disk on the EVA. Each VDisk is uniquely identified by a 4-digit hexadecimal number.
This number is an internal representation of the LUN used by the EVA known as a “noid”, and has no relationship to the LUN number.

Counter: Read Hit KBS
This counter keeps track of the amount of data that has been read from the EVA’s cache memory as a result of a virtual disk access. It is measured in KB, and since this data is updated once per second, it translates directly into KB per second. This counter only applies to data that has been transferred from the EVA’s cache memory. If the data was not in memory and was instead read from disk, it will not be counted here (see Counter: Read Miss KBS). Note that this value includes not only cache hits generated from random access activity, but also any data that was a cache hit as a result of a prefetch operation generated by a sequential read data stream.

Counter: Read Hit Latency
This counter keeps track of the time taken from when a host read request is received until that request has been satisfied from the EVA’s cache memory. The time, which is measured in microseconds, only applies to read commands that are satisfied from read cache. If the read command is a cache miss, the time will not be tabulated here (see Counter: Read Miss Latency). Note that this value includes not only the latency from cache hits generated from random access activity, but also the latency associated with a cache hit as a result of a prefetch operation generated by a sequential read data stream.

Counter: Read Hit RPS
This counter keeps track of the number of times a host read request was satisfied from the EVA’s cache memory. Since the counter is updated once per second, this translates directly into read cache hits per second. This counter only applies to read hits, so if the read command is a cache miss, this counter will not be incremented (see Counter: Read Miss RPS). Note that this value includes not only cache hits generated from random access activity, but also any requests that were a cache hit as a result of a prefetch operation generated by a sequential read data stream.

Counter: Read Miss KBS
This counter keeps track of the total amount of data that was not present in the EVA’s cache memory and had to be read from physical disks. It is measured in KB, and since this data is updated once per second, it translates directly into KB per second. This counter only applies to data that has been read from disk. If the data was in cache memory and read directly from there, it will not be counted here (see Counter: Read Hit KBS).

Counter: Read Miss Latency
This counter keeps track of the time taken from when a host read request is received until that request has been satisfied from the physical disks.
The time, which is measured in microseconds, only applies to read commands where the data is not in read cache and must be read from disk. If the read command results in the data being read from cache, the time will not be tabulated here (see Counter: Read Hit Latency).

Counter: Read Miss RPS
This counter keeps track of the number of times a host read request could not be satisfied from the EVA’s cache memory and had to be read from the physical disks. Since the counter is updated once per second, this translates directly into read cache misses per second (disk accesses per second). This counter only applies to read misses, so if the read command is a cache hit, this counter will not be incremented (see Counter: Read Hit RPS).

Counter: Write KBS
This counter keeps track of the total amount of data that was written to the virtual disk by all hosts. It is measured in KB, and since this data is updated once per second, it translates directly into KB per second.

Counter: Write Latency
This counter keeps track of the time, measured in microseconds, between when a write command is received from a host and when command completion is returned.

Counter: Write RPS
This counter keeps track of the total number of write requests to a virtual disk that were received from all hosts. Since this data is updated once per second, it translates directly into write requests per second.
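As an illustration of how the VDisk counters might be combined, the following Python sketch derives a read cache hit ratio and a request-weighted mean read latency from one sample of the hit and miss counters. It assumes that each latency counter is an average over the requests counted by the matching RPS counter in the same one-second interval, which this document does not state explicitly. The function names and sample values are hypothetical.

```python
def read_hit_ratio(hit_rps, miss_rps):
    """Fraction of read requests served from cache in one sample.
    Returns None when the sample contains no reads."""
    total = hit_rps + miss_rps
    if total == 0:
        return None
    return hit_rps / total


def mean_read_latency_us(hit_rps, hit_latency_us, miss_rps, miss_latency_us):
    """Request-weighted mean read latency in microseconds.
    Assumes each latency value is an average over its own requests."""
    total = hit_rps + miss_rps
    if total == 0:
        return None
    return (hit_rps * hit_latency_us + miss_rps * miss_latency_us) / total


# Hypothetical one-second sample for a single virtual disk.
hit_rps, hit_latency_us = 750, 200
miss_rps, miss_latency_us = 250, 8000

ratio = read_hit_ratio(hit_rps, miss_rps)
latency = mean_read_latency_us(hit_rps, hit_latency_us, miss_rps, miss_latency_us)
print(ratio)    # 0.75
print(latency)  # 2150.0
```

Weighting by request count matters here: a plain average of the two latency values would give 4,100 microseconds, which overstates the latency seen by a workload in which three out of four reads are cache hits.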
|


