Using PyRRD to gather system statistics
python observability
Last week, I spent some time benchmarking state-of-the-art WSGI application
servers for SurveyMonkey Contribute. It is quite challenging to configure
various WSGI app servers for an apples-to-apples comparison because their
concurrency paradigms are so diverse. System statistics, such as CPU
consumption and memory usage, provide another perspective on performance.
RRDtool is the go-to solution for collecting, retrieving, and visualizing
system statistics, and the pyrrd and subprocess modules let us use Python
instead of Perl to glue rrdtool and other system utilities together:
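from pyrrd.rrd import DataSource, RRA, RRD

# Two GAUGE data sources; a gap of more than 4 seconds between updates
# (the heartbeat) makes the value unknown.
dss = [
    DataSource(dsName='cpu', dsType='GAUGE', heartbeat=4),
    DataSource(dsName='mem', dsType='GAUGE', heartbeat=4),
]
# One archive holding 100 rows, each the AVERAGE over a single step.
rras = [RRA(cf='AVERAGE', xff=0.5, steps=1, rows=100)]
# The base sampling interval (step) is 1 second.
rrd = RRD('/tmp/heartbeat.rrd', ds=dss, rra=rras, step=1)
rrd.create()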
The above snippet creates the RRD file /tmp/heartbeat.rrd with two data
sources, cpu and mem, both defined as GAUGE type. We then define a
round-robin archive (RRA) that saves up to 100 data points, one per step.
Finally, we create the RRD file from this configuration with a 1-second
sampling interval. The pyrrd modules use the same terminology as rrdtool, so
you can leverage your existing knowledge and enjoy the convenience of Python.
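To make the shared terminology concrete, here is a minimal sketch that builds the argument list of a roughly equivalent `rrdtool create` call. The helper names `rrd_create_args` and `retention_seconds` are hypothetical, not part of pyrrd, and the sketch assumes the data sources have unbounded minimum and maximum values (`U:U`).

```python
def rrd_create_args(path, step, data_sources, archives):
    """Build the argument list for an `rrdtool create` call.

    data_sources: list of (name, type, heartbeat) tuples
    archives: list of (cf, xff, steps, rows) tuples
    """
    args = ["rrdtool", "create", path, "--step", str(step)]
    for name, ds_type, heartbeat in data_sources:
        # DS:name:type:heartbeat:min:max, with U:U meaning "no bounds" (assumed)
        args.append("DS:%s:%s:%d:U:U" % (name, ds_type, heartbeat))
    for cf, xff, steps, rows in archives:
        # RRA:cf:xff:steps:rows
        args.append("RRA:%s:%s:%d:%d" % (cf, xff, steps, rows))
    return args


def retention_seconds(step, steps, rows):
    """Time span covered by one archive: step * steps * rows seconds."""
    return step * steps * rows


if __name__ == "__main__":
    args = rrd_create_args(
        "/tmp/heartbeat.rrd",
        1,
        [("cpu", "GAUGE", 4), ("mem", "GAUGE", 4)],
        [("AVERAGE", 0.5, 1, 100)],
    )
    print(" ".join(args))
    print(retention_seconds(1, 1, 100))
```

With a 1-second step, one step per row, and 100 rows, the archive covers only the most recent 100 seconds, which suits a short benchmarking session.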
With the subprocess module, we can manipulate pipes just as easily as in bash
or Perl:
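import re
import subprocess
import time

# pids: list of process IDs (as strings) to monitor, defined elsewhere
pattern = re.compile(r'\s+')
command = '/bin/ps --no-headers -o pcpu,pmem -p %s' % ' '.join(pids)
while True:
    # decode() keeps the split below working when check_output returns bytes
    ps = subprocess.check_output(command, shell=True).decode()
    pcpu = 0.0
    pmem = 0.0
    # Sum %CPU and %MEM over all monitored processes
    for line in ps.split('\n'):
        if line.strip():
            cpu, mem = map(float, pattern.split(line.strip()))
            pcpu += cpu
            pmem += mem
    # rrd is the RRD object created earlier
    rrd.bufferValue(time.time(), pcpu, pmem)
    rrd.update()
    time.sleep(1)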
ps does all the heavy lifting in the sampling phase: it prints the %CPU and
%MEM for all the pids we are interested in; the output is then parsed,
aggregated, and dumped to the RRD file.
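The parsing and aggregation step can be tried in isolation, without running ps or touching an RRD file. The following is a small sketch; the function name `aggregate_ps_output` and the sample text are made up for illustration.

```python
def aggregate_ps_output(output):
    """Sum the %CPU and %MEM columns of `ps --no-headers -o pcpu,pmem` output."""
    total_cpu = 0.0
    total_mem = 0.0
    for line in output.splitlines():
        fields = line.split()  # split on any run of whitespace
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        total_cpu += float(fields[0])
        total_mem += float(fields[1])
    return total_cpu, total_mem


if __name__ == "__main__":
    # Hypothetical output for two processes, followed by a blank line
    sample = " 12.5  3.0\n  0.5  1.5\n\n"
    print(aggregate_ps_output(sample))
```

Keeping the parsing in its own function makes it easy to check against captured ps output before wiring it into the sampling loop.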
Please bear in mind that this is not a typical rrdtool use case: the system statistics are sampled in real time because the benchmarking session is relatively short. In the real world, data are usually sampled at a much coarser granularity and consolidated statistically. You can download the sampling script here.
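As a rough illustration of what consolidation means, the sketch below averages every few consecutive samples into one point, which is the idea behind an AVERAGE archive with steps greater than 1. It is a simplification: the function name `consolidate_average` is hypothetical, and the sketch ignores how rrdtool handles unknown values (the xff setting) and time alignment.

```python
def consolidate_average(samples, steps):
    """Average each run of `steps` consecutive samples into one point.

    Trailing samples that do not fill a complete group are left out.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    points = []
    for i in range(0, len(samples) - steps + 1, steps):
        group = samples[i:i + steps]
        points.append(sum(group) / float(steps))
    return points


if __name__ == "__main__":
    # Five 1-second samples consolidated two at a time
    print(consolidate_average([1.0, 3.0, 2.0, 4.0, 5.0], 2))
```

With steps=1, as in the archive used in this post, every sample is stored unchanged.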
I had a hard time grasping the pyrrd.graph module, though; the extra
abstraction does not make things any less complicated, and I ended up using
rrdtool directly, for example:
rrdtool graph /tmp/heartbeat.png --start 1401919870 --end 1401919879 \
DEF:cpu=/tmp/heartbeat.rrd:cpu:AVERAGE LINE2:cpu#FF0000 \
DEF:mem=/tmp/heartbeat.rrd:mem:AVERAGE LINE:mem#ccff00
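If you would rather stay in Python while still calling rrdtool directly, the same command can be assembled as an argument list and handed to subprocess. This is only a sketch: `rrd_graph_args` and `render_graph` are hypothetical helpers, and running `render_graph` assumes the rrdtool binary is on the PATH.

```python
import subprocess


def rrd_graph_args(png_path, rrd_path, start, end, series):
    """Build the argument list for an `rrdtool graph` call.

    series: list of (ds_name, line_width, hex_color) tuples, where
    line_width is a string such as "2", or "" for a plain LINE.
    """
    args = ["rrdtool", "graph", png_path,
            "--start", str(start), "--end", str(end)]
    for name, width, color in series:
        args.append("DEF:%s=%s:%s:AVERAGE" % (name, rrd_path, name))
        args.append("LINE%s:%s#%s" % (width, name, color))
    return args


def render_graph(args):
    """Run rrdtool; raises CalledProcessError if the command fails."""
    subprocess.check_call(args)


if __name__ == "__main__":
    args = rrd_graph_args(
        "/tmp/heartbeat.png", "/tmp/heartbeat.rrd",
        1401919870, 1401919879,
        [("cpu", "2", "FF0000"), ("mem", "", "ccff00")],
    )
    print(" ".join(args))
```

Passing a list instead of a shell string avoids quoting problems with the `#` in the color codes.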