A Sneak Peek into OneFlow’s New Performance Improvements

Alejandro Huertas

Cloud Engineer at OpenNebula

Development



Mar 23, 2020

Use Case | Virtualization

After months of work, we have finally finished the OneFlow revamp and internal logic redesign! 🎉 OneFlow is one of the most powerful components in OpenNebula, as it allows our users to run complex services in their private cloud.

We have focused on improving the internal logic while leaving the OneFlow API unchanged and keeping the same functionality as before. This way, users who have built their own applications on top of this API do not need to change anything 😉

The most important benefit of this revamp is an overall performance improvement, obtained by reducing the time that each individual operation takes. We have achieved this mainly by using OpenNebula’s new hook event manager system. Instead of constantly polling the state of each virtual machine, OneFlow now takes the messages that are published in the events queue and uses them to decide which operation to execute.
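To make the contrast with polling more concrete, here is a minimal Python sketch of event dispatch by key prefix. It is not OneFlow's code: `EventBus`, `subscribe` and `publish` are hypothetical names, the in-memory bus only stands in for the real events queue, and matching by prefix is an assumption borrowed from how publish/subscribe sockets such as ZeroMQ filter messages.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, str], None]


class EventBus:
    """Hypothetical in-memory stand-in for an events queue."""

    def __init__(self) -> None:
        # Maps a subscription prefix to the callbacks registered for it.
        self._subs: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, prefix: str, callback: Callback) -> None:
        """Register a callback for every event key starting with prefix."""
        self._subs[prefix].append(callback)

    def publish(self, key: str, payload: str = "") -> int:
        """Deliver an event and return how many callbacks received it."""
        delivered = 0
        for prefix, callbacks in list(self._subs.items()):
            if key.startswith(prefix):
                for callback in callbacks:
                    callback(key, payload)
                    delivered += 1
        return delivered


seen: List[str] = []
bus = EventBus()
bus.subscribe("EVENT STATE VM/", lambda key, payload: seen.append(key))

# Only the VM state event reaches the subscriber; nothing is polled.
bus.publish("EVENT STATE VM/ACTIVE/RUNNING/7")
bus.publish("EVENT API one.vm.update 1")
print(seen)
```

The subscriber does no work until a matching event arrives, which is why the reaction time no longer depends on a polling interval.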

For instance, for a service that uses the straight strategy, we can subscribe to the queue to monitor changes in virtual machine state:

EVENT STATE VM/#{state}/#{lcm_state}/#{vm_id}

We get a message every time this virtual machine changes its state. In applications that require multiple VMs to implement a specific workflow, we can therefore detect very quickly when a parent process is running and deploy the child virtual machines at that point.
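The subscription key and the parent/child trigger described above could be sketched as follows in Python. This is an illustration under assumptions, not OneFlow's implementation: `vm_state_key`, `parse_vm_state_key` and `StraightService` are hypothetical names, and the example assumes the state fields appear in the key as the strings `ACTIVE` and `RUNNING`. The role name `Slave` is taken from the benchmark logs later in this post.

```python
from typing import List, Optional, Set, Tuple

KEY_PREFIX = "EVENT STATE VM/"


def vm_state_key(state: str, lcm_state: str, vm_id: int) -> str:
    """Build a key following the pattern VM/#{state}/#{lcm_state}/#{vm_id}."""
    return f"{KEY_PREFIX}{state}/{lcm_state}/{vm_id}"


def parse_vm_state_key(key: str) -> Optional[Tuple[str, str, int]]:
    """Split a key into (state, lcm_state, vm_id); None if it does not match."""
    if not key.startswith(KEY_PREFIX):
        return None
    parts = key[len(KEY_PREFIX):].split("/")
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    return parts[0], parts[1], int(parts[2])


class StraightService:
    """Hypothetical tracker: deploy the child role once all parent VMs run."""

    def __init__(self, parent_vms: Set[int], child_role: str) -> None:
        self.pending = set(parent_vms)   # parent VMs not yet running
        self.child_role = child_role
        self.deployed: List[str] = []    # roles whose deployment was triggered

    def on_event(self, key: str) -> None:
        parsed = parse_vm_state_key(key)
        if parsed is None:
            return
        state, lcm_state, vm_id = parsed
        if (state, lcm_state) != ("ACTIVE", "RUNNING"):
            return
        self.pending.discard(vm_id)
        if not self.pending and self.child_role not in self.deployed:
            self.deployed.append(self.child_role)


service = StraightService(parent_vms={1, 2}, child_role="Slave")
service.on_event(vm_state_key("ACTIVE", "RUNNING", 1))
print(service.deployed)  # still empty: VM 2 is not running yet
service.on_event(vm_state_key("ACTIVE", "RUNNING", 2))
print(service.deployed)  # the child role is triggered exactly once
```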

If the service uses the straight strategy and also waits for each VM to report that it is ready, we can subscribe to the queue for virtual machine update operations:

EVENT API one.vm.update 1

In this case, we get a message every time the virtual machine is updated. When OneGate (the service that allows virtual machine guests to pull and push VM information from OpenNebula) updates it and writes READY="YES", we can immediately deploy the child virtual machines.
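As a rough illustration of the ready check, the Python sketch below looks for `READY="YES"` in a block of VM attributes. The flat `KEY="VALUE"` text format and the helper names `parse_attributes` and `reports_ready` are assumptions made for this example; the post does not describe the actual payload that arrives with the `one.vm.update` event.

```python
from typing import Dict


def parse_attributes(template: str) -> Dict[str, str]:
    """Parse lines of the form KEY="VALUE" (assumed format) into a dict."""
    attrs: Dict[str, str] = {}
    for line in template.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that are not KEY=VALUE pairs
        attrs[name.strip().upper()] = value.strip().strip('"')
    return attrs


def reports_ready(template: str) -> bool:
    """Return True when the attributes contain READY with the value YES."""
    return parse_attributes(template).get("READY", "").upper() == "YES"


before = 'HOSTNAME="alpine"'
after = 'HOSTNAME="alpine"\nREADY="YES"'

print(reports_ready(before))
print(reports_ready(after))
```

With a check like this in the event handler, the child role can be deployed as soon as the update that sets the flag is published, with no need to re-read the VM periodically.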

Apart from these two changes, we have also fully redesigned OneFlow’s code to make it more readable and better structured. We have also improved error handling, so most errors are now shown in both the CLI and the GUI (Sunstone). Again, all this without modifying the API, don’t worry!

Time to check those performance improvements! For comparison, we have run some benchmarks using the OneFlow component from hotfix version 5.10.2 (released on February 12, 2020) and the new version of OneFlow that will be part of the forthcoming minor release 5.12.

For these tests, we have used the following service template:

This is a very simple template with just two roles, using the Alpine Linux 3.8 appliance from OpenNebula’s public Marketplace for the two VMs that we’ll be deploying.

We have executed some of the most performance-sensitive operations (deploy, scale and warning) on both versions of OneFlow. Here are the results:

OpenNebula 5.10.2

Deploy (~1 minute)

11:15:26 [I]: [SER] New state: DEPLOYING
11:15:26 [I]: [ROL] Role Master new state: DEPLOYING
11:15:57 [I]: [ROL] Role Master new state: RUNNING
11:15:57 [I]: [ROL] Role Slave new state: DEPLOYING
11:16:27 [I]: [ROL] Role Slave new state: RUNNING
11:16:27 [I]: [SER] New state: RUNNING

Scale (~1 minute 20 seconds)

11:24:10 [I]: [ROL] Role Master scaling down from 2 to 1 nodes
11:24:10 [I]: [ROL] Role Master new state: SCALING
11:24:10 [I]: [SER] New state: SCALING
11:25:03 [I]: [ROL] Role Master new state: COOLDOWN
11:25:03 [I]: [SER] New state: COOLDOWN
11:25:33 [I]: [ROL] Role Master new state: RUNNING
11:25:33 [I]: [SER] New state: RUNNING

Note: the cooldown period is 10 seconds, so the scaling itself takes about 1 minute and 10 seconds.

Warning (23 seconds)

12:30:18 [Z0][DiM][D]: Powering off VM 1
12:30:41 [I]: [ROL] Role Slave new state: WARNING
12:30:41 [I]: [SER] New state: WARNING

OpenNebula 5.12.0

Deploy (11 seconds)

11:47:09 [I]: [SER] New state: DEPLOYING
11:47:09 [I]: [ROL] Role Master new state: DEPLOYING
11:47:15 [I]: [ROL] Role Master new state: RUNNING
11:47:15 [I]: [ROL] Role Slave new state: DEPLOYING
11:47:20 [I]: [ROL] Role Slave new state: RUNNING
11:47:20 [I]: [SER] New state: RUNNING

Scale (15 seconds)

12:00:48 [I]: [ROL] Role Master scaling up from 1 to 2 nodes
12:00:48 [I]: [ROL] Role Master new state: SCALING
12:00:48 [I]: [SER] New state: SCALING
12:00:53 [I]: [SER] New state: COOLDOWN
12:00:53 [I]: [ROL] Role Master new state: COOLDOWN
12:01:03 [I]: [ROL] Role Master new state: RUNNING
12:01:03 [I]: [SER] New state: RUNNING

Note: the cooldown period is 10 seconds, so the scaling itself takes 5 seconds.

Warning (1 second)

12:07:01 [Z0][DiM][D]: Powering off VM 10
12:07:02 [I]: [SER] New state: WARNING
12:07:02 [I]: [ROL] Role Master new state: WARNING

As you can see, the total time has dropped dramatically: deploy went from about 1 minute to 11 seconds, scale from about 1 minute 20 seconds to 15 seconds, and warning from 23 seconds to 1 second. Keep in mind that this test used just two virtual machines and two roles, so for a service with more roles and more dependencies you can expect an even greater improvement!
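For readers who want to check the arithmetic, this short Python sketch recomputes the elapsed times from the first and last timestamps of each log excerpt above and derives the speedup ratios. The helper names are hypothetical; the timestamps are copied from the logs. Note that the scale figures include the cooldown period, and that the 5.10.2 run scaled down while the 5.12.0 run scaled up, so that ratio is only indicative.

```python
from datetime import datetime
from typing import Dict, Tuple

FMT = "%H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps taken on the same day."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# operation -> ((start, end) on 5.10.2, (start, end) on 5.12.0)
RUNS = {
    "deploy": (("11:15:26", "11:16:27"), ("11:47:09", "11:47:20")),
    "scale": (("11:24:10", "11:25:33"), ("12:00:48", "12:01:03")),
    "warning": (("12:30:18", "12:30:41"), ("12:07:01", "12:07:02")),
}


def speedups() -> Dict[str, Tuple[int, int, float]]:
    """Return (old seconds, new seconds, old/new ratio) per operation."""
    result = {}
    for operation, (old, new) in RUNS.items():
        old_s = elapsed_seconds(*old)
        new_s = elapsed_seconds(*new)
        result[operation] = (old_s, new_s, round(old_s / new_s, 1))
    return result


for operation, (old_s, new_s, ratio) in speedups().items():
    print(f"{operation}: {old_s}s -> {new_s}s ({ratio}x faster)")
# deploy: 61s -> 11s (5.5x faster)
# scale: 83s -> 15s (5.5x faster)
# warning: 23s -> 1s (23.0x faster)
```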

But this is not all the new OneFlow brings along! Apart from these performance improvements, we have made some other interesting changes:

As we announced a few weeks ago, we have introduced the ability to create virtual networks automatically.
You will be able to use your own custom attributes for each service, and they will be passed on to the VMs via the context section. This will be really useful for your own contextualization process!

Well, that’s all for now 🤓 Hope you’ll find these new features as cool as we do! Feedback, as usual, is always very welcome: feel free to use the comments section below or our Community Forum. Cheers!

4 Comments

darkfader on Mar 25, 2020 at 3:33 am

Thanks for being able to pass arguments, and putting them into the context directly.
This is not just a strong simplification of the OneFlow workflow, but also makes it easier to have user scripts in templates that are hybrid and know how to handle both cases – with the same template.
- Alejandro Huertas on Mar 25, 2020 at 3:56 pm
  
  Thank you! Glad to know you find it useful! Any feedback is very welcome, we will keep improving it!
Nazeer on May 30, 2020 at 4:40 am

OneFlow is stuck in DEPLOYING and no oneflow.error file is generated. I am trying to follow this document to set up a master/slave k8 setup using OneFlow.

Fri May 29 23:32:34 2020 [I]: [ROL] Role master new state: DEPLOYING
Fri May 29 23:33:05 2020 [I]: [ROL] Role master new state: DEPLOYING
…
Fri May 29 23:37:38 2020 [I]: [ROL] Role master new state: DEPLOYING
- Alberto P. Martí on May 30, 2020 at 5:56 pm
  
  Hi Nazeer! Thanks for your feedback. Please use our Forum to report this issue: https://forum.opennebula.io Other members of the Community might be able to assist with this, and maybe give you some advice. Cheers!