Monday, 21 April 2014

Zabbix and iostat: monitoring the disk subsystem

The disk subsystem is one of the most important parts of a server: how quickly content is served or how fast a database responds often depends directly on how loaded the disks are. This mainly concerns mail servers, file servers and database servers. In short, disk performance is something you need to monitor. With graphs of disk subsystem performance at hand, you can decide that extra capacity is needed long before trouble strikes.


It is also simply interesting to watch, from release to release, how the developers' work affects the load level. Below: what is monitored and how to configure it.

Dependencies: the monitoring is implemented through the Zabbix agent and two utilities, awk and iostat (the sysstat package). awk ships with distributions by default; for iostat you need to install the sysstat package (special thanks here to Sebastien Godard and contributors).

Known limitations: sysstat must be at least version 9.1.2, because that release brought a very important change: «Added r_await and w_await fields to iostat's extended statistics». So be careful: some distributions, such as CentOS, ship a slightly more "stable" and less feature-rich version of sysstat. A quick way to check is sketched below.

As for Zabbix itself, it does not matter in principle which version you start from (2.0 or 2.2); it works on both. 1.8 will not work, because low level discovery is used.
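A minimal sketch of that check, assuming a Debian- or RHEL-style package manager (adjust for your distribution):

    # show the installed sysstat/iostat version; it should be 9.1.2 or newer
    iostat -V

    # the extended-statistics header should mention r_await and w_await
    iostat -x | grep await

    # install sysstat if it is missing
    apt-get install sysstat    # Debian/Ubuntu
    yum install sysstat        # CentOS/RHEL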

    
What it gives you (purely subjective, in descending order of usefulness):

low level discovery (hereinafter simply LLD) for automatic detection of block devices on a host;
    
block device utilization, % — a convenient metric for tracking the overall load on a device;
    
latency, or responsiveness — available both as overall responsiveness and as separate read/write responsiveness;
    
queue length (in requests) and average request size (in sectors) — lets you judge the nature and intensity of the load on the device;
    
current read/write speed of the device, in kilobytes per second;
    
the number of read/write requests per second that were merged while queued for execution;
    
IOPS — the number of read/write operations per second;
    
average service time of requests (svctm). It is deprecated in general; the developers have long promised to remove it, but have not yet got around to it.

In other words, practically every metric that iostat exposes is available here (if you are not familiar with this utility, I strongly recommend reading man iostat).

Available graphs: graphs are drawn per device. LLD detects devices matching the regular expression "(xvd|sd|hd|vd)[a-z]", so if your drives are named differently you can easily make the appropriate changes. This regex is there to pick up only the devices that are parents of everything else — partitions, LVM volumes, MDRAID arrays and so on — so that we do not collect too much.
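If you want to see which devices that expression would pick up on a particular host, here is a quick check (listing top-level block devices from sysfs; this is a sketch, not part of the template itself):

    # devices the LLD regular expression would discover on this host
    ls /sys/block | grep -E '(xvd|sd|hd|vd)[a-z]'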

    
But I digress; the list of graphs:

Disk await — device responsiveness (r_await, w_await);
    
Disk merges — request merge operations in the queue (rrqm/s, wrqm/s);
    
Disk queue — queue state (avgrq-sz, avgqu-sz);
    
Disk read and write — current read/write rates of the device (rkB/s, wkB/s);
    
Disk utilization — disk utilization and IOPS values (%util, r/s, w/s) — makes it easy to track spikes in utilization and whether reads or writes caused them.

Similarities and differences: Zabbix has keys for similar monitoring out of the box, vfs.dev.read and vfs.dev.write. They are fine and work well, but they are less informative than iostat: iostat also gives you metrics such as latency and utilization. There is also a similar template from Michael Noman; as far as I can tell the only differences are that it targets older versions of iostat, plus a few small syntactic changes.

Where to get it: the monitoring consists of a configuration file for the agent, two scripts that collect/parse the data, and a template for the web interface. All of this is available in a repository on GitHub, so use any method you like (git clone, wget, curl, etc.) to get it onto the machines you want to monitor, then proceed to the next step.
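A minimal sketch of fetching it; the repository URL below is only a placeholder, substitute the actual one linked from the post:

    # download the config, scripts and template to the machine you want to monitor
    git clone https://github.com/<user>/<repo>.git
    # or, without git:
    wget https://github.com/<user>/<repo>/archive/master.tar.gz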

    
How to configure:

iostat.conf — the contents of this file should either be merged into the Zabbix agent configuration file, or the file should be dropped into the directory referenced by the Include option of the main agent configuration. It really depends on local policy; I use the second option, keeping custom configs in a separate directory.
    
scripts/iostat-collect.sh and scripts/iostat-parse.sh — these two working scripts should be copied to /usr/libexec/zabbix-extensions/scripts/. You can place them anywhere convenient, but in that case do not forget to adjust the paths in the parameters defined in iostat.conf. Be sure to check that the scripts are executable (mode 755). The whole sequence is sketched after these steps.

Now everything is ready: start the agent, then go to the monitoring server and run the following command (do not forget to replace agent_ip):

# zabbix_get -s agent_ip -k iostat.discovery

This checks from the monitoring server that iostat.conf has been loaded and is returning data, and at the same time shows that LLD works. The response is JSON with the names of the discovered devices. If nothing comes back, something has gone wrong.

How to configure the web interface: there is a template, iostat-disk-utilization-template.xml. Import it into the Templates section through the web interface and assign it to our host. That is all. Now you have to wait about an hour — that is the interval set in the LLD rule (it is configurable too). Alternatively, keep an eye on the Iostat section in Latest Data for the monitored host. Once values appear there, you can go to the Graphs section and look at the first data.
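For reference, a minimal sketch of the installation and check described above. The Include directory and the agent service name are assumptions for a typical setup, and the {#HARDDISK} macro in the sample output is purely illustrative — the real macro name comes from the template:

    # drop the agent config where the Include option will pick it up
    # (assuming Include points at /etc/zabbix/zabbix_agentd.d)
    cp iostat.conf /etc/zabbix/zabbix_agentd.d/

    # install the two scripts and make sure they are executable (mode 755)
    mkdir -p /usr/libexec/zabbix-extensions/scripts
    cp scripts/iostat-collect.sh scripts/iostat-parse.sh /usr/libexec/zabbix-extensions/scripts/
    chmod 755 /usr/libexec/zabbix-extensions/scripts/iostat-*.sh

    # restart the agent, then check discovery from the monitoring server
    service zabbix-agent restart
    zabbix_get -s agent_ip -k iostat.discovery
    # expected: JSON with the discovered devices, something like
    # {"data":[{"{#HARDDISK}":"sda"},{"{#HARDDISK}":"vda"}]}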

And finally, a few screenshots of graphs from a local host :)

Data directly in Latest Data:
[screenshot]

Responsiveness (latency) graphs:
[screenshot]

Utilization and IOPS graph:
[screenshot]
