Blog Article:

Monitoring OpenNebula events with Graphite + Grafana

Christian González

Cloud Engineer at OpenNebula Systems

Nov 18, 2019

With the introduction of the new OpenNebula Hook Subsystem, the OpenNebula Hook Manager has been adapted to capture all OpenNebula events (API calls and state changes of VMs and Hosts) and publish them over a ZeroMQ publisher socket.

By taking advantage of this new feature, we can monitor the OpenNebula API and keep an eye on the changes of state in our VMs and Hosts. In this post I will discuss this new architecture and some of the use cases that it enables.

Architecture

For this example (see diagram below), we have decided to use Graphite as the data storage system, with Grafana on top of it to plot the data.

[Diagram: monitoring OpenNebula events with Graphite and Grafana]

In order to feed the data into Graphite, we need to develop a custom component that will be ‘subscribed’ to the events that we actually want to monitor. This component can be just a simple script acting as a proxy between the OpenNebula Hook Manager and Graphite, like in the following example.

Monitoring API calls

To monitor the API calls we are going to use this simple Ruby script:

#!/usr/bin/ruby

require 'ffi-rzmq'
require 'base64'
require 'statsd' # provided by the statsd-ruby gem

# Set up a global Statsd client for a server on localhost:8125
$statsd = Statsd.new 'localhost', 8125

# Set up ZeroMQ subscriber socket
@context    = ZMQ::Context.new(1)
@subscriber = @context.socket(ZMQ::SUB)

# Subscribe to API events
@subscriber.setsockopt(ZMQ::SUBSCRIBE, "EVENT API")
@subscriber.connect("tcp://localhost:2101")

key = ''
content = ''

loop do
    # Receive event
    @subscriber.recv_string(key)
    @subscriber.recv_string(content)
    
    # Build the Graphite metric name from the API call name; dots are
    # replaced because Graphite uses them as path separators
    key_p = key.split[2].gsub('.', '_')
    success = key.split[3]
    
    $statsd.increment "one.call.total"
    $statsd.increment "one.call.#{key_p}"

    if success.to_i == 1
        $statsd.increment "one.call.success"
    else
        $statsd.increment "one.call.failure"
    end

    puts "KEY: #{key_p} SUCCESS: #{success}"
end

This script connects its ZeroMQ subscriber socket to the Hook Manager's publisher socket, which is the one publishing the API events. Each time an API event arrives, the script increments three counters through a StatsD client that feeds Graphite: the counter for that specific call, the total number of calls, and either the successful or the failed calls.
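The key handling in the loop can be isolated into a small helper, which makes it easy to check without a running OpenNebula. This is only a minimal sketch: the helper name `metrics_for` is hypothetical, and the key layout (`EVENT API <call> <success>`) is an assumption inferred from the way the script above splits the key.

```ruby
# Hypothetical helper: maps an event key to the StatsD counters to increment.
# Assumes keys look like "EVENT API one.vmpool.info 1", as implied by the
# script above (third field = API call, fourth field = success flag).
def metrics_for(key)
  fields  = key.split
  call    = fields[2].gsub('.', '_') # dots would create extra Graphite levels
  outcome = fields[3].to_i == 1 ? 'success' : 'failure'

  ['one.call.total', "one.call.#{call}", "one.call.#{outcome}"]
end

puts metrics_for('EVENT API one.vmpool.info 1').inspect
```

Inside the loop, each returned name could then be passed to `$statsd.increment`.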

In order to feed data into Graphite, we just need to run the script above and generate some data by running, for instance, the onevm top command:

$ ./event-client.rb
KEY: one_clusterpool_info SUCCESS: 1
KEY: one_vmpool_info SUCCESS: 1
KEY: one_vmpool_info SUCCESS: 1
KEY: one_vmpool_info SUCCESS: 1
KEY: one_vmpool_info SUCCESS: 1
KEY: one_vmpool_info SUCCESS: 1

Note that other OpenNebula components (e.g. the scheduler) can also be performing API calls at the same time, so you’ll probably end up seeing some calls you didn’t perform manually.

Once we have generated enough data, we can plot it with Grafana. After installing Grafana, we just need to configure it to use Graphite as a data source, and then create a new panel.

With the query below, for example, we can see the total number of calls every 10 seconds:

[Screenshot: Grafana query for the total number of calls]

Once the query is added, don’t forget to refresh the panel to start seeing the data. If everything goes fine, you should get something like this:

[Screenshot: Grafana panel plotting the total number of calls]

Note that each point covers a 10-second interval and is plotted as a per-second rate, so a point with a value of 1.8 means there were 18 calls in those 10 seconds.
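The conversion is a simple multiplication. As a quick sketch (the function name `calls_in_interval` is hypothetical, and the 10-second default simply mirrors the interval mentioned above):

```ruby
# Converts a plotted per-second rate into the number of calls that
# occurred during one interval. The 10-second default matches the
# interval used in this post; adjust it if your setup differs.
def calls_in_interval(rate_per_second, interval_seconds = 10)
  (rate_per_second * interval_seconds).round
end

puts calls_in_interval(1.8)
```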

We can also check how the ratio of successful to failed calls evolves over time:

[Screenshot: Grafana panel comparing successful and failed calls over time]

To generate some failed calls, you can, for example, repeatedly try to show a non-existing VM:

$ for i in `seq 0 30`; do onevm show <non_existing_vm_id>; done

You should see this kind of output:

[Screenshot: resulting Grafana panel after the failed calls]

Monitoring VM creation

Let’s now monitor the creation of VMs. To do so we are going to set up a Hook that will be triggered every time a VM switches to the RUNNING state. First we will create the Hook script (increment_running.rb) inside /var/lib/one/remotes/hooks:

#!/usr/bin/ruby

require 'statsd' # provided by the statsd-ruby gem

# Set up a global Statsd client for a server on localhost:8125
$statsd = Statsd.new 'localhost', 8125

$statsd.increment "one.call.running_vms"

Now you can create the Hook from the following template with the onehook create hook.tmpl command:

$ cat hook.tmpl

NAME = hook-running
TYPE = state
COMMAND = increment_running.rb
ON = RUNNING
RESOURCE = VM

Once the Hook is ready, every time a VM switches to the RUNNING state, the value of one.call.running_vms will be incremented in the Graphite storage.
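To get a feel for what such an increment looks like on the wire, the sketch below sends a counter in the plain StatsD line format (`<metric>:<value>|c`) over UDP to a local stand-in receiver. It uses only Ruby's standard library; the stand-in receiver and the ephemeral port are assumptions made for illustration, whereas in the real setup the datagram goes to the StatsD daemon on localhost:8125.

```ruby
require 'socket'

# Stand-in for the StatsD daemon: a UDP socket on an ephemeral local port.
receiver = UDPSocket.new
receiver.bind('127.0.0.1', 0)
port = receiver.addr[1]

# A counter increment in the StatsD line format: <metric>:<value>|c
sender = UDPSocket.new
sender.send('one.call.running_vms:1|c', 0, '127.0.0.1', port)

# Read the datagram back, as a StatsD daemon would
datagram, _addr = receiver.recvfrom(512)
puts datagram

sender.close
receiver.close
```

Because the transport is UDP, the Hook script does not wait for an acknowledgement, so a stopped StatsD daemon should not block the Hook itself.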

Now let’s create a panel in Grafana so that we can plot all this data. For that, we’ll need to add a query that shows one.call.running_vms:

[Screenshot: Grafana query for one.call.running_vms]

At this point you should see something like this:

[Screenshot: Grafana panel plotting one.call.running_vms]

Note that the peaks in our graph correspond to the times when the scheduler was actually scheduling the VMs.

Conclusions

This post is intended to give the OpenNebula community a taste of the potential that the new Hook Manager adds to OpenNebula environments. These are just some examples, but remember: it is now in your hands to decide how to use these tools. As a wise toy once said: to infinity and beyond!
