
Installation of HA OpenNebula on CentOS 7 with Ceph as a datastore and IPoIB as backend network

Alexey Vyrodov

Apr 15, 2015

Introduction.

This article walks through installing HA OpenNebula with Ceph as the datastore on three nodes (each with 6 x 240 GB SSDs, an IPoIB backend network, and CentOS 7), plus one additional node for backup.

Scheme of equipment below:

We use this solution to virtualize our imagery processing servers.

Preparing.

Perform all of the steps below on every node. On kosmo-arch, skip bridge-utils and the FrontEnd network.

yum install bridge-utils

FrontEnd network.

Configure bond0 (mode 0) and run the script below to create the frontend bridge interface for VMs (OpenNebula):

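#!/bin/bash
# Move the bond0 configuration onto a bridge (nab1) that VMs can attach to.
Device=bond0
cd /etc/sysconfig/network-scripts
if [ ! -f ifcfg-nab1 ]; then
  # Keep a backup of the original bond configuration.
  cp -p ifcfg-$Device bu-ifcfg-$Device
  # The bond becomes a bridge port without an IP address of its own.
  echo -e "DEVICE=$Device\nTYPE=Ethernet\nBOOTPROTO=none\nNM_CONTROLLED=no\nONBOOT=yes\nBRIDGE=nab1" > ifcfg-$Device
  grep ^HW bu-ifcfg-$Device >> ifcfg-$Device
  # The bridge takes over the remaining settings. TYPE=Bridge is case-sensitive.
  echo -e "DEVICE=nab1\nNM_CONTROLLED=no\nONBOOT=yes\nTYPE=Bridge" > ifcfg-nab1
  egrep -v "^#|^DEV|^HWA|^TYP|^UUI|^NM_|^ONB" bu-ifcfg-$Device >> ifcfg-nab1
fi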

BackEnd network. Configuration of IPoIB:

yum groupinstall -y "Infiniband Support"
yum install opensm

Enable IPoIB and switch InfiniBand to connected mode. Connected mode allows a much larger MTU (up to 65520 bytes) than datagram mode.

 cat /etc/rdma/rdma.conf
# Load IPoIB
IPOIB_LOAD=yes
# Setup connected mode
SET_IPOIB_CM=yes

Start Infiniband services.

systemctl enable rdma opensm
systemctl start rdma opensm

Check that it works:

ibv_devinfo

hca_id: mlx4_0
      transport:                      InfiniBand (0)
      fw_ver:                         2.7.000
      node_guid:                      0025:90ff:ff07:3368
      sys_image_guid:                 0025:90ff:ff07:336b
      vendor_id:                      0x02c9
      vendor_part_id:                 26428
      hw_ver:                         0xB0
      board_id:                       SM_1071000001000
      phys_port_cnt:                  2
              port:   1
                      state:                  PORT_ACTIVE (4)
                      max_mtu:                4096 (5)
                      active_mtu:             4096 (5)
                      sm_lid:                 8
                      port_lid:               4
                      port_lmc:               0x00
                      link_layer:             InfiniBand
              port:   2
                      state:                  PORT_ACTIVE (4)
                      max_mtu:                4096 (5)
                      active_mtu:             4096 (5)
                      sm_lid:                 4
                      port_lid:               9
                      port_lmc:               0x00
                      link_layer:             InfiniBand

and

iblinkinfo
CA: kosmo-virt1 mlx4_0:
    0x002590ffff073385     13    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       2   10[  ] "Infiniscale-IV Mellanox Technologies" ( )
Switch: 0x0002c90200482d08 Infiniscale-IV Mellanox Technologies:
         2    1[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    2[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    3[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    4[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    5[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    6[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    7[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    8[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2    9[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   10[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>      13    1[  ] "kosmo-virt1 mlx4_0" ( )
         2   11[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       4    1[  ] "kosmo-virt2 mlx4_0" ( )
         2   12[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   13[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   14[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   15[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   16[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   17[  ] ==(                Down/ Polling)==>             [  ] "" ( )
         2   18[  ] ==(                Down/ Polling)==>             [  ] "" ( )
CA: kosmo-virt2 mlx4_0:
    0x002590ffff073369      4    1[  ] ==( 4X          10.0 Gbps Active/  LinkUp)==>       2   11[  ] "Infiniscale-IV Mellanox Technologies" ( )

Set up bond1 (mode 1) from the two IB interfaces. Assign the IP 172.19.254.X, where X is the node number. Example:

 cat /etc/modprobe.d/bonding.conf
 alias bond0 bonding
 alias bond1 bonding
 cat /etc/sysconfig/network-scripts/ifcfg-bond1
 DEVICE=bond1
 TYPE=bonding
 BOOTPROTO=static
 USERCTL=no
 ONBOOT=yes
 IPADDR=172.19.254.x
 NETMASK=255.255.255.0
 BONDING_OPTS="mode=1 miimon=500 primary=ib0"
 MTU=65520
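The example above shows only the bond itself; each IB port also needs a file that enslaves it to bond1. A minimal sketch, assuming the ports are named ib0 and ib1 (only ib0 is named in BONDING_OPTS) and using the standard initscripts keys. The helper name ib_slave_cfg is hypothetical.

```shell
# Hypothetical helper: print an ifcfg file body that enslaves one IPoIB port to a bond.
# Assumptions: interface names ib0/ib1 and bond name bond1, as in the example above.
ib_slave_cfg() {
    dev="$1"
    master="$2"
    cat <<EOF
DEVICE=$dev
TYPE=InfiniBand
BOOTPROTO=none
ONBOOT=yes
NM_CONTROLLED=no
MASTER=$master
SLAVE=yes
CONNECTED_MODE=yes
EOF
}

# Preview the file for ib0. To install it, redirect the output to
# /etc/sysconfig/network-scripts/ifcfg-ib0 (and repeat for ib1).
ib_slave_cfg ib0 bond1
```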

Disable firewall
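On a default CentOS 7 install the firewall is firewalld, so disabling it could look like the sketch below. It assumes firewalld is the active firewall and only prints the commands, so they can be reviewed first. The helper name firewall_off_cmds is hypothetical.

```shell
# Hypothetical helper: print the commands that stop firewalld now
# and keep it from starting at boot.
firewall_off_cmds() {
    printf 'systemctl %s firewalld\n' stop disable
}

# Review the output, then run it as root with: firewall_off_cmds | sh
firewall_off_cmds
```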

Tuning sysctl.

net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.rmem_default=16777216
net.core.wmem_default=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_mem=16777216 16777216 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
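The article does not say where these values go. One way to make them persistent is a drop-in file under /etc/sysctl.d, sketched below. The file name 90-ceph-net.conf and the helper name write_sysctl_tuning are assumptions.

```shell
# Hypothetical helper: write the tuning values listed above to the given file.
write_sysctl_tuning() {
    cat > "$1" <<'EOF'
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.rmem_default=16777216
net.core.wmem_default=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_mem=16777216 16777216 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
EOF
}

# Write to a scratch location first so the result can be inspected.
conf="${TMPDIR:-/tmp}/90-ceph-net.conf"
write_sysctl_tuning "$conf"
cat "$conf"
# As root, install and load it: cp "$conf" /etc/sysctl.d/ && sysctl --system
```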
 

Installing Ceph.

Preparation

Configure passwordless SSH access between the nodes for user root. Create the key on one node, then copy it to /root/.ssh/ on the other nodes.

ssh-keygen -t dsa   # create the key; leave the passphrase empty
cd /root/.ssh
cat id_dsa.pub >> authorized_keys
chown root:root authorized_keys
chmod 600 authorized_keys
echo "StrictHostKeyChecking no" > config

Disable SELinux on all nodes

In /etc/selinux/config
SELINUX=disabled

setenforce 0

Raise the maximum number of open files in /etc/security/limits.conf on all nodes (adjust the value to your requirements):

* hard nofile 1000000
* soft nofile 1000000

Setup /etc/hosts on all nodes:

172.19.254.1 kosmo-virt1
172.19.254.2 kosmo-virt2
172.19.254.3 kosmo-virt3  
172.19.254.150 kosmo-arch
192.168.14.42 kosmo-virt1
192.168.14.43 kosmo-virt2
192.168.14.44 kosmo-virt3  
192.168.14.150 kosmo-arch

Installing

Install a kernel newer than 3.15 on all nodes (needed for the CephFS client):

rpm -ivh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml -y

Set the new kernel as the default boot entry.

grep ^menuentry /boot/grub2/grub.cfg 
grub2-set-default 0 # index of the new kernel's menu entry (0 = first)
grub2-editenv list
grub2-mkconfig -o /boot/grub2/grub.cfg
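grub2-set-default expects the zero-based position of the menu entry, which is easy to miscount by eye. A small sketch that prints each entry with its index. The helper name list_menuentries is hypothetical and the sample titles are made up for illustration.

```shell
# Hypothetical helper: print GRUB2 menu entries with their zero-based index.
list_menuentries() {
    awk -F"'" '/^menuentry / { printf "%d: %s\n", n++, $2 }' "$1"
}

# On a node: list_menuentries /boot/grub2/grub.cfg
# Demonstration on a small sample file:
sample=$(mktemp)
cat > "$sample" <<'EOF'
menuentry 'CentOS Linux (4.0.0-1.el7.elrepo.x86_64) 7 (Core)' --class centos {
}
menuentry 'CentOS Linux (3.10.0-123.el7.x86_64) 7 (Core)' --class centos {
}
EOF
list_menuentries "$sample"
```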

Reboot.

Set up the repository (on all nodes). The heredoc delimiter is quoted so that the shell does not expand $basearch:

 cat << 'EOT' > /etc/yum.repos.d/ceph.repo
 [ceph]
 name=Ceph packages for $basearch
 baseurl=http://ceph.com/rpm/el7/$basearch
 enabled=1
 gpgcheck=1
 type=rpm-md
 gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
 
 [ceph-noarch]
 name=Ceph noarch packages
 baseurl=http://ceph.com/rpm/el7/noarch
 enabled=1
 gpgcheck=1
 type=rpm-md
 gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
 EOT

Import gpgkey: (on all nodes)

 rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'

Setup ntpd. (on all nodes)

yum install ntp

Edit /etc/ntp.conf and start ntpd (on all nodes):

systemctl enable ntpd
systemctl start ntpd

Install: (on all nodes)

yum install libunwind -y
yum install -y  ceph-common ceph ceph-fuse ceph-deploy

Deploying.

(on kosmo-virt1) 
cd /etc/ceph
ceph-deploy new kosmo-virt1 kosmo-virt2 kosmo-virt3

MON deploying: (on kosmo-virt1)

ceph-deploy  mon create-initial

OSD deploying:

(on kosmo-virt1)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt1
 ceph-deploy disk zap kosmo-virt1:sdb
 ceph-deploy osd prepare kosmo-virt1:sdb
 ceph-deploy disk zap kosmo-virt1:sdc
 ceph-deploy osd prepare kosmo-virt1:sdc
 ceph-deploy disk zap kosmo-virt1:sdd
 ceph-deploy osd prepare kosmo-virt1:sdd
 ceph-deploy disk zap kosmo-virt1:sde
 ceph-deploy osd prepare kosmo-virt1:sde
 ceph-deploy disk zap kosmo-virt1:sdf
 ceph-deploy osd prepare kosmo-virt1:sdf
 ceph-deploy disk zap kosmo-virt1:sdg
 ceph-deploy osd prepare kosmo-virt1:sdg

(on kosmo-virt2)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt2
 ceph-deploy disk zap kosmo-virt2:sdb
 ceph-deploy osd prepare kosmo-virt2:sdb
 ceph-deploy disk zap kosmo-virt2:sdc
 ceph-deploy osd prepare kosmo-virt2:sdc
 ceph-deploy disk zap kosmo-virt2:sdd
 ceph-deploy osd prepare kosmo-virt2:sdd
 ceph-deploy disk zap kosmo-virt2:sde
 ceph-deploy osd prepare kosmo-virt2:sde
 ceph-deploy disk zap kosmo-virt2:sdf
 ceph-deploy osd prepare kosmo-virt2:sdf
 ceph-deploy disk zap kosmo-virt2:sdg
 ceph-deploy osd prepare kosmo-virt2:sdg

(on kosmo-virt3)

 cd /etc/ceph
 ceph-deploy gatherkeys kosmo-virt3
 ceph-deploy disk zap kosmo-virt3:sdb
 ceph-deploy osd prepare kosmo-virt3:sdb
 ceph-deploy disk zap kosmo-virt3:sdc
 ceph-deploy osd prepare kosmo-virt3:sdc
 ceph-deploy disk zap kosmo-virt3:sdd
 ceph-deploy osd prepare kosmo-virt3:sdd
 ceph-deploy disk zap kosmo-virt3:sde
 ceph-deploy osd prepare kosmo-virt3:sde
 ceph-deploy disk zap kosmo-virt3:sdf
 ceph-deploy osd prepare kosmo-virt3:sdf
 ceph-deploy disk zap kosmo-virt3:sdg
 ceph-deploy osd prepare kosmo-virt3:sdg

where sd[b-g] are the SSD disks.
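The per-disk commands above follow one pattern, so they can be generated instead of typed. A minimal sketch that prints the zap/prepare pair for each of the six SSDs on a node. The helper name osd_commands is hypothetical.

```shell
# Hypothetical helper: print the zap/prepare pair for every SSD (sdb..sdg) on one node.
osd_commands() {
    node="$1"
    for disk in sdb sdc sdd sde sdf sdg; do
        echo "ceph-deploy disk zap $node:$disk"
        echo "ceph-deploy osd prepare $node:$disk"
    done
}

# Preview for kosmo-virt1. To execute, run from /etc/ceph: osd_commands kosmo-virt1 | sh
osd_commands kosmo-virt1
```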

MDS deploying:

The Giant release of Ceph no longer creates the data and metadata pools by default.
Use ceph osd lspools to check.

 ceph osd pool create data 1024
 ceph osd pool set data min_size 1
 ceph osd pool set data size 2
 ceph osd pool create metadata 1024
 ceph osd pool set metadata min_size 1
 ceph osd pool set metadata size 2
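The article does not say how the value 1024 was chosen. A commonly cited rule of thumb from the Ceph documentation is (number of OSDs x 100) / replica count, rounded up to the next power of two, and for this cluster (18 OSDs, size 2) it gives the same number. The sketch below is only that estimate, and the helper name pg_count is hypothetical.

```shell
# Hypothetical helper: (OSDs * 100) / replicas, rounded up to the next power of two.
pg_count() {
    osds="$1"
    replicas="$2"
    target=$(( osds * 100 / replicas ))
    pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}

# 3 nodes x 6 SSDs = 18 OSDs, replica size 2: 900 rounds up to 1024.
pg_count 18 2
```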

Check pool id of data and metadata with

 ceph osd lspools

Configure FS

 ceph mds newfs 4 3 --yes-i-really-mean-it

where 4 is the id of the metadata pool and 3 is the id of the data pool.
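ceph mds newfs takes the metadata pool id first and the data pool id second, and the ids will differ between clusters. A sketch that looks them up by name instead of reading them off by eye. It assumes lspools prints entries as "ID NAME," on one line, and the helper name pool_id is hypothetical.

```shell
# Hypothetical helper: look up a pool id by name in "ceph osd lspools" output on stdin.
# Assumption: entries look like "ID NAME," on a single line.
pool_id() {
    tr ',' '\n' | awk -v name="$1" '$2 == name { print $1 }'
}

# Sample output for illustration; on a node use: ceph osd lspools | pool_id data
sample="0 rbd,3 data,4 metadata,"
meta=$(echo "$sample" | pool_id metadata)
data=$(echo "$sample" | pool_id data)
echo "ceph mds newfs $meta $data --yes-i-really-mean-it"
```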

Configure MDS

(on kosmo-virt1)

 cd /etc/ceph
 ceph-deploy mds create kosmo-virt1

(on kosmo-virt2)

 cd /etc/ceph
 ceph-deploy mds create kosmo-virt2

(on all nodes)

 chkconfig ceph on

Configure kosmo-arch.

Copy /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring from any of the kosmo-virt nodes to kosmo-arch.

 

Preparing Ceph for OpenNebula.

Create pool:

 ceph osd pool create one 4096
 ceph osd pool set one min_size 1
 ceph osd pool set one size 2

Setup authorization to pool one:

 ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=one' > /etc/ceph/ceph.client.oneadmin.keyring

Get key from keyring:

  awk '$1 == "key" {print $3}' /etc/ceph/ceph.client.oneadmin.keyring > /etc/ceph/oneadmin.key

Checking:

 ceph auth list

Copy /etc/ceph/ceph.client.oneadmin.keyring and /etc/ceph/oneadmin.key to the other nodes.

 

Preparing for OpenNebula HA

 

Configuring MariaDB cluster

Configure MariaDB cluster on all nodes except kosmo-arch

Setup repo:

 cat << EOT > /etc/yum.repos.d/mariadb.repo
 [mariadb]
 name = MariaDB
 baseurl = http://yum.mariadb.org/10.0/centos7-amd64
 gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
 gpgcheck=1
 EOT

Install:

 yum install MariaDB-Galera-server MariaDB-client rsync galera

start service:

 service mysql start
 chkconfig mysql on
 mysql_secure_installation

prepare for cluster:

 mysql -p
 GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'PASS';
 GRANT ALL PRIVILEGES on *.* to sst_user@'%';
 FLUSH PRIVILEGES;
 exit
 service mysql stop

configuring cluster: (for kosmo-virt1)

 cat << EOT > /etc/my.cnf
 [mysqld]
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.1' # setup real node ip
 wsrep_node_name='kosmo-virt1' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT

(for kosmo-virt2)

 cat << EOT > /etc/my.cnf
 [mysqld]
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.2' # setup real node ip
 wsrep_node_name='kosmo-virt2' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT

(for kosmo-virt3)

 cat << EOT > /etc/my.cnf
 [mysqld]
 collation-server = utf8_general_ci
 init-connect = 'SET NAMES utf8'
 character-set-server = utf8
 binlog_format=ROW
 default-storage-engine=innodb
 innodb_autoinc_lock_mode=2
 innodb_locks_unsafe_for_binlog=1
 query_cache_size=0
 query_cache_type=0
 bind-address=0.0.0.0
 datadir=/var/lib/mysql
 innodb_log_file_size=100M
 innodb_file_per_table
 innodb_flush_log_at_trx_commit=2
 wsrep_provider=/usr/lib64/galera/libgalera_smm.so
 wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
 wsrep_cluster_name='scanex_galera_cluster'
 wsrep_node_address='172.19.254.3' # setup real node ip
 wsrep_node_name='kosmo-virt3' #  setup real node name
 wsrep_sst_method=rsync
 wsrep_sst_auth=sst_user:PASS
 EOT
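The three files differ only in wsrep_node_address and wsrep_node_name, so generating them from one template avoids copy-paste slips. A minimal sketch: the helper name galera_cnf is hypothetical, and the [mysqld] group header is included on the assumption that these options belong to the server section of the option file.

```shell
# Hypothetical helper: print my.cnf for one Galera node.
# Only the node name and node address differ between nodes.
galera_cnf() {
    node_name="$1"
    node_ip="$2"
    cat <<EOF
[mysqld]
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
wsrep_cluster_name='scanex_galera_cluster'
wsrep_node_address='$node_ip'
wsrep_node_name='$node_name'
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOF
}

# Preview for kosmo-virt3; on the node itself redirect the output to /etc/my.cnf.
galera_cnf kosmo-virt3 172.19.254.3
```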

(on kosmo-virt1)

 /etc/init.d/mysql start --wsrep-new-cluster

(on kosmo-virt2)

 /etc/init.d/mysql start

(on kosmo-virt3)

 /etc/init.d/mysql start

check on all nodes:

 mysql -p
 show status like 'wsrep%';

Variable_name Value

wsrep_local_state_uuid 739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_protocol_version 7
wsrep_last_committed 0
wsrep_replicated 0
wsrep_replicated_bytes 0
wsrep_repl_keys 0
wsrep_repl_keys_bytes 0
wsrep_repl_data_bytes 0
wsrep_repl_other_bytes 0
wsrep_received 6
wsrep_received_bytes 425
wsrep_local_commits 0
wsrep_local_cert_failures 0
wsrep_local_replays 0
wsrep_local_send_queue 0
wsrep_local_send_queue_max 1
wsrep_local_send_queue_min 0
wsrep_local_send_queue_avg 0.000000
wsrep_local_recv_queue 0
wsrep_local_recv_queue_max 1
wsrep_local_recv_queue_min 0
wsrep_local_recv_queue_avg 0.000000
wsrep_local_cached_downto 18446744073709551615
wsrep_flow_control_paused_ns 0
wsrep_flow_control_paused 0.000000
wsrep_flow_control_sent 0
wsrep_flow_control_recv 0
wsrep_cert_deps_distance 0.000000
wsrep_apply_oooe 0.000000
wsrep_apply_oool 0.000000
wsrep_apply_window 0.000000
wsrep_commit_oooe 0.000000
wsrep_commit_oool 0.000000
wsrep_commit_window 0.000000
wsrep_local_state 4
wsrep_local_state_comment Synced
wsrep_cert_index_size 0
wsrep_causal_reads 0
wsrep_cert_interval 0.000000
wsrep_incoming_addresses 172.19.254.1:3306,172.19.254.3:3306,172.19.254.2:3306
wsrep_evs_delayed  
wsrep_evs_evict_list  
wsrep_evs_repl_latency 0/0/0/0/0
wsrep_evs_state OPERATIONAL
wsrep_gcomm_uuid 7397d6d6-d6de-11e4-a515-d3302a8c2342
wsrep_cluster_conf_id 2
wsrep_cluster_size 2
wsrep_cluster_state_uuid 739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_cluster_status Primary
wsrep_connected ON
wsrep_local_bf_aborts 0
wsrep_local_index 0
wsrep_provider_name Galera
wsrep_provider_vendor Codership Oy info@codership.com
wsrep_provider_version 25.3.9(r3387)
wsrep_ready ON
wsrep_thread_count 2


Creating user and database:

mysql -p
create database opennebula;
GRANT USAGE ON opennebula.* to oneadmin@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on opennebula.* to oneadmin@'%';
FLUSH PRIVILEGES;

Remember: if all nodes go down, the node with the most recent data must be started first with /etc/init.d/mysql start --wsrep-new-cluster, so you have to identify that node. If you bootstrap the cluster from a node with a stale state, the other nodes will abort with an error like this (see the logs): [ERROR] WSREP: gcs/src/gcs_group.cpp:void group_post_state_exchange(gcs_group_t*)():319: Reversing history: 0 -> 0, this member has applied 140536161751824 more events than the primary component.Data loss is possible. Aborting.
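One way to find the most recent node is to compare the seqno recorded in grastate.dat in each node's data directory: the node with the highest value should be bootstrapped. A sketch, assuming the datadir /var/lib/mysql from my.cnf above. The helper name grastate_seqno is hypothetical and the seqno in the sample file is invented.

```shell
# Hypothetical helper: print the seqno recorded in a Galera state file.
grastate_seqno() {
    awk '$1 == "seqno:" { print $2 }' "$1"
}

# On each node: grastate_seqno /var/lib/mysql/grastate.dat
# Demonstration on a sample file:
sample=$(mktemp)
cat > "$sample" <<'EOF'
# GALERA saved state
version: 2.1
uuid:    739895d5-d6de-11e4-87f6-3a3244f26574
seqno:   1234
cert_index:
EOF
grastate_seqno "$sample"
```

A seqno of -1 means the node was not shut down cleanly; in that case the last committed position has to be recovered first, for example with mysqld --wsrep-recover.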

Configuring HA cluster

Unfortunately, pcs conflicts with the OpenNebula server, so we will use pacemaker, corosync and crmsh instead.

Installing HA

Set up repo on all nodes except kosmo-arch:

 cat << EOT > /etc/yum.repos.d/network\:ha-clustering\:Stable.repo
 [network_ha-clustering_Stable]
 name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
 type=rpm-md
 baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
 gpgcheck=1
 gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
 enabled=1
 EOT

Install on all nodes except kosmo-arch:

 yum install corosync pacemaker crmsh resource-agents -y

On kosmo-virt1 create configuration

 vi /etc/corosync/corosync.conf
 totem {
 version: 2  
 secauth: off
 cluster_name: cluster
 transport: udpu
 }
 nodelist {
 node {
      ring0_addr: kosmo-virt1
      nodeid: 1
     }
 node {
      ring0_addr: kosmo-virt2
      nodeid: 2
     }
  node {
      ring0_addr: kosmo-virt3
      nodeid: 3
     }
 }
 quorum {
 provider: corosync_votequorum
 }
 logging {
 to_syslog: yes
 }

and create authkey on kosmo-virt1

 cd /etc/corosync
 corosync-keygen

Copy corosync.conf and authkey to kosmo-virt2 and kosmo-virt3.

Enabling (on all nodes except kosmo-arch):

 systemctl enable pacemaker corosync

Starting (on all nodes except kosmo-arch):

 systemctl start pacemaker corosync

Checking:

 crm status
 
 Last updated: Mon Mar 30 18:33:14 2015
 Last change: Mon Mar 30 18:23:47 2015 via crmd on kosmo-virt2
 Stack: corosync
 Current DC: kosmo-virt2 (2) - partition with quorum
 Version: 1.1.10-32.el7_0.1-368c726
 3 Nodes configured
 0 Resources configured
 Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3]

Add properties:

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=stop
 

Installing OpenNebula

Installing

Setup repo on all nodes except kosmo-arch:

 cat << EOT > /etc/yum.repos.d/opennebula.repo
 [opennebula]
 name=opennebula
 baseurl=http://downloads.opennebula.org/repo/4.12/CentOS/7/x86_64/
 enabled=1
 gpgcheck=0
 EOT

Installing (on all nodes except kosmo-arch):

 yum install -y opennebula-server opennebula-sunstone opennebula-node-kvm qemu-img qemu-kvm

Ruby Runtime Installation:

 /usr/share/one/install_gems

Change password oneadmin:

 passwd oneadmin

Create passwordless access for oneadmin (on kosmo-virt1):

 su oneadmin
 cd ~/.ssh
 ssh-keygen -t dsa
 cat id_dsa.pub >> authorized_keys
 chown oneadmin:oneadmin authorized_keys
 chmod 600 authorized_keys
 echo "StrictHostKeyChecking no" > config

Copy to other nodes (remember that oneadmin home directory is /var/lib/one).

Change listen for sunstone-server (on all nodes):

 sed -i 's/host:\ 127\.0\.0\.1/host:\ 0\.0\.0\.0/g' /etc/one/sunstone-server.conf

on kosmo-virt1:

copy all /var/lib/one/.one/*.auth and one.key files to OTHER_NODES:/var/lib/one/.one/

Start the services on kosmo-virt1 to test them:

 
 systemctl start opennebula opennebula-sunstone

Try to connect to http://node:9869.
Check logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log).
If no errors:

 systemctl stop opennebula opennebula-sunstone

Add ceph support for qemu-kvm for all nodes except kosmo-arch

 qemu-img -h | grep rbd
 /usr/libexec/qemu-kvm --drive format=? | grep rbd

If there is no rbd support, then you have to compile and install:

 qemu-kvm-rhev
 qemu-kvm-common-rhev 
 qemu-img-rhev

Download:

 yum groupinstall -y "Development Tools"
 yum install -y yum-utils rpm-build
 yumdownloader --source qemu-kvm
 rpm -ivh qemu-kvm-1.5.3-60.el7_0.11.src.rpm

Compiling.

 
 cd ~/rpmbuild/SPECS
 vi qemu-kvm.spec

Change %define rhev 0 to %define rhev 1.

 rpmbuild -ba qemu-kvm.spec

Installing (for all nodes except kosmo-arch).

 rpm -e --nodeps libcacard-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-img-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-kvm-common-1.5.3-60.el7_0.11.x86_64
 rpm -e --nodeps qemu-kvm-1.5.3-60.el7_0.11.x86_64
 rpm -ivh libcacard-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-img-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-kvm-common-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
 rpm -ivh qemu-kvm-rhev-1.5.3-60.el7.centos.11.x86_64.rpm

Check for ceph support.

 qemu-img -h | grep rbd
 Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify    blkdebug
 /usr/libexec/qemu-kvm --drive format=? | grep rbd
 Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug

Try to write image (for all nodes except kosmo-arch):

 qemu-img create -f rbd rbd:one/test-virtN 10G

where N is the node number.

Add ceph support for libvirt

On all nodes:

 systemctl enable messagebus.service
 systemctl start messagebus.service
 systemctl enable libvirtd.service
 systemctl start libvirtd.service

On kosmo-virt1 create uuid:

 uuidgen
 cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5

Create secret.xml

 
 cat > secret.xml <<EOF
 <secret ephemeral='no' private='no'>
 <uuid>cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5</uuid>
 <usage type='ceph'>
 <name>client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==</name>
 </usage>
 </secret>
 EOF

Here AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q== is the content of /etc/ceph/oneadmin.key.
Copy secret.xml to other nodes.

Add key to libvirt (for all nodes except kosmo-arch)

 virsh secret-define --file secret.xml
 virsh secret-set-value --secret cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 --base64 $(cat /etc/ceph/oneadmin.key)

check

 virsh secret-list
 UUID                                 Usage
 -----------------------------------------------------------
 cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 ceph client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==

Restart libvirtd:

 systemctl restart libvirtd.service

Converting the database to MySQL:

Downloading script:

 wget http://www.redmine.org/attachments/download/6239/sqlite3-to-mysql.py

Converting:

 sqlite3 /var/lib/one/one.db .dump | ./sqlite3-to-mysql.py > mysql.sql   
 mysql -u oneadmin -p opennebula < mysql.sql

Change /etc/one/oned.conf from

 DB = [ backend = "sqlite" ]

to

 DB = [ backend = "mysql",
      server  = "localhost",
      port    = 0,
      user    = "oneadmin",
      passwd  = "PASS",
      db_name = "opennebula" ]

Copy oned.conf to the other nodes (except kosmo-arch) as root.

Check kosmo-virt2 and kosmo-virt3 nodes in turn:

   systemctl start opennebula opennebula-sunstone

check logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log)

   systemctl stop opennebula opennebula-sunstone
 

Creating HA resources

On all nodes except kosmo-arch:

 systemctl disable opennebula opennebula-sunstone opennebula-novnc

From any of the nodes except kosmo-arch:

 crm
 configure
 primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.14.41" cidr_netmask="24" op monitor interval="30s"
 primitive opennebula_p systemd:opennebula \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 primitive opennebula-sunstone_p systemd:opennebula-sunstone \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 primitive opennebula-novnc_p systemd:opennebula-novnc \
 op monitor interval=60s timeout=20s \
 op start interval="0" timeout="120s" \
 op stop  interval="0" timeout="120s" 
 group Opennebula_HA ClusterIP opennebula_p opennebula-sunstone_p  opennebula-novnc_p
 exit

Check

 crm status
 Last updated: Tue Mar 31 16:43:00 2015
 Last change: Tue Mar 31 16:40:22 2015 via cibadmin on kosmo-virt1
 Stack: corosync
 Current DC: kosmo-virt2 (2) - partition with quorum
 Version: 1.1.10-32.el7_0.1-368c726
 3 Nodes configured
 4 Resources configured
 Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
 Resource Group: Opennebula_HA
   ClusterIP  (ocf::heartbeat:IPaddr2):       Started kosmo-virt1
   opennebula_p       (systemd:opennebula):   Started kosmo-virt1
   opennebula-sunstone_p      (systemd:opennebula-sunstone):  Started kosmo-virt1
   opennebula-novnc_p (systemd:opennebula-novnc):     Started kosmo-virt1
 

Configuring OpenNebula

http://active_node:9869 – web management.

With web management:

1. Create Cluster.

2. Add hosts (using 192.168.14.0 networks).

Console management.

3. Add net. (su oneadmin)

 
 cat << EOT > def.net
 NAME    = "Shared LAN"
 TYPE    = RANGED
 # Now we'll use the host private network (physical)
 BRIDGE  = nab1
 NETWORK_SIZE    = C
 NETWORK_ADDRESS = 192.168.14.0
 EOT
 onevnet create def.net

4. Create image rbd datastore. (su oneadmin)

 cat << EOT > rbd.conf
 NAME = "cephds"
 DS_MAD = ceph
 TM_MAD = ceph
 DISK_TYPE = RBD
 POOL_NAME = one
 BRIDGE_LIST ="192.168.14.42 192.168.14.43 192.168.14.44"
 CEPH_HOST ="172.19.254.1:6789 172.19.254.2:6789 172.19.254.3:6789"
 CEPH_SECRET ="cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5" #uuid key, looked at libvirt authentication for ceph
 CEPH_USER = oneadmin
 EOT
 onedatastore create rbd.conf
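CEPH_SECRET must be the UUID of the libvirt secret, not the base64 Ceph key itself, and the two are easy to mix up. A small sanity check before creating the datastore, as a sketch. The helper name is_uuid is hypothetical.

```shell
# Hypothetical helper: succeed if the argument looks like a UUID (8-4-4-4-12 hex digits).
is_uuid() {
    echo "$1" | grep -Eq '^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$'
}

secret="cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5"
if is_uuid "$secret"; then
    echo "CEPH_SECRET looks like a UUID"
else
    echo "CEPH_SECRET is not a UUID; use the value shown by 'virsh secret-list'"
fi
```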

5. Create system ceph datastore.

Check the highest datastore id, N:

onedatastore list

On all nodes, create the directory and mount CephFS (replace N+1 with the actual number, i.e. the id the new datastore will get):

mkdir /var/lib/one/datastores/N+1
# The secret is the key from /etc/ceph/ceph.client.admin.keyring
echo "172.19.254.K:6789:/ /var/lib/one/datastores/N+1 ceph rw,relatime,name=admin,secret=AQB4jxJV8PuhJhAAdsdsdRBkSFrtr0VvnQNljBw==,nodcache 0 0" >> /etc/fstab
mount /var/lib/one/datastores/N+1

where K is the last octet of the current node's IP address.

From one node, change the permissions:

chown oneadmin:oneadmin /var/lib/one/datastores/N+1

Create system ceph datastore (su oneadmin):

 cat << EOT > sys_fs.conf
 NAME    = system_ceph
 TM_MAD  = shared
 TYPE    = SYSTEM_DS
 EOT

 onedatastore create sys_fs.conf

6. Add the nodes, vnets and datastores to the created cluster with web management.

HA VM

The setup follows the official documentation, with one change: I use migrate (the -m argument) instead of the recreate command.

 /etc/one/oned.conf
 HOST_HOOK = [
  name      = "error",
  on        = "ERROR",
  command   = "host_error.rb",
  arguments = "$HID -m",
  remote    = no ]
 

BACKUP

Some words about backup.

Use persistent image type for this work scheme.

For backup we use a single Linux server, kosmo-arch (a Ceph client), with ZFS on Linux installed. Deduplication is turned on for the zpool. (Remember that deduplication requires about 2 GB of memory per 1 TB of storage space.)

Example of a simple script run by cron:

#!/bin/sh
# Daily snapshot and export of RBD images, keeping 60 days of history.
currdate=$(/bin/date +%Y-%m-%d)
olddate=$(/bin/date --date="60 days ago" +%Y-%m-%d)
imagelist="one-21" # space-delimited list of images in pool "one"
for i in $imagelist
do
  # Look for today's snapshot and the 60-day-old snapshot of this image.
  snapcurchk=$(/usr/bin/rbd snap ls one/$i | grep "$currdate")
  snapoldchk=$(/usr/bin/rbd snap ls one/$i | grep "$olddate")
  if test -z "$snapcurchk"
  then
    /usr/bin/rbd snap create --snap "$currdate" one/$i
    /usr/bin/rbd export one/$i@$currdate /rbdback/$i-$currdate
  else
    echo "current snapshot exists"
  fi
  if test -z "$snapoldchk"
  then
    echo "old snapshot doesn't exist"
  else
    /usr/bin/rbd snap rm one/$i@$olddate
    /bin/rm -f /rbdback/$i-$olddate
  fi
done


Use the onevm utility or the web interface (see the template) to find out which image is assigned to a VM.

onevm list
onevm show "VM_ID" -a | grep IMAGE_ID
 

PS

Don’t forget to change the VM's storage driver to virtio (vda); Windows guests need the virtio drivers installed. Without it you will see low I/O performance (no more than 100 MB/s).
I saw 415 MB/s with virtio drivers.

 


 

7 Comments

  1. Jeremie

    Hi,

    What a how to ! Thank you _very_ much for sharing 🙂
    Now, I must find some time to reproduce it in my home lab… in the meantime, would you please clarify why OpenNebula isn’t compatible with pcs?

    Thanks again for sharing
    Bests

  2. subhash

    We had followed this document for setup CEPH cluster integration with OpenNebula 5.2. but we faced some issues on that, after creating CEPH cluster when try to integrating with opennebula frontend node then its not getting up and not showing disk uasble space.

    [oneadmin@node1 ~]$ cat cephds.conf
    NAME = “cephds”
    DS_MAD = ceph
    TM_MAD = ceph
    DISK_TYPE = RBD
    POOL_NAME = data
    CEPH_HOST = storage1-storage2-storage3
    CEPH_USER = oneadmin
    CEPH_SECRET = “AQCU1GxYHmbkBxAA6TlD6q+PyMTL+/a+AB+2bg==2dsf”
    BRIDGE_LIST = storage1

    Plz suggest us.

  3. subhash

    [cephuser@storage1 ~]$ ceph -s
    cluster dd8ea244-d2e4-4948-b3b8-ec2ddedeff80
    health HEALTH_OK
    monmap e1: 3 mons at {storage1=XX:XX:XX:137:6789/0,storage2=XX.XX.XX.138:6789/0,storage3=XX.XX.XX.139:6789/0}
    election epoch 70, quorum 0,1,2 storage1,storage2,storage3
    osdmap e204: 3 osds: 3 up, 3 in
    flags sortbitwise,require_jewel_osds
    pgmap v312560: 2112 pgs, 3 pools, 14641 kB data, 24 objects
    19914 MB used, 11104 GB / 11123 GB avail
    2112 active+clean

  4. Alexey

    Hi subhash!

    I see two errors:
    >CEPH_HOST = storage1-storage2-storage3
    CEPH_HOST = “storage1 storage2 storage3”
    >CEPH_SECRET = “AQCU1GxYHmbkBxAA6TlD6q+PyMTL+/a+AB+2bg==2dsf”
    CEPH_SECRET should be uuid

    Be attentive.
    PS: You don’t need to use crmsh. Just use pcs cluster.

  5. subhash

    Some more updates on CEPH Cluster:
    I have setup 3 nodes CEPH storage cluster and all are working fine and data replicated on each node after that we want to integrate with OpenNebula 5.2 we have setup as below setting as shared storage but no getting up as shared storage on opennebula frontend.
    Please suggest us..

