Introduction.
This article describes the process of installing HA OpenNebula with Ceph as the datastore on three nodes (disks: 6x 240 GB SSD per node, backend network IPoIB, OS CentOS 7), plus one additional node used for backup.
The equipment scheme is shown below:
We use this solution to virtualize our imagery-processing servers.
All actions should be performed on all nodes; on kosmo-arch, perform everything except the bridge-utils installation and the FrontEnd network configuration.
yum install bridge-utils
FrontEnd network.
Configure bond0 (mode 0) and run the script below to create the frontend bridge interface for VMs (OpenNebula):
#!/bin/bash
Device=bond0
cd /etc/sysconfig/network-scripts
if [ ! -f ifcfg-nab1 ]; then
    cp -p ifcfg-$Device bu-ifcfg-$Device
    echo -e "DEVICE=$Device\nTYPE=Ethernet\nBOOTPROTO=none\nNM_CONTROLLED=no\nONBOOT=yes\nBRIDGE=nab1" > ifcfg-$Device
    grep ^HW bu-ifcfg-$Device >> ifcfg-$Device
    echo -e "DEVICE=nab1\nNM_CONTROLLED=no\nONBOOT=yes\nTYPE=bridge" > ifcfg-nab1
    egrep -v "^#|^DEV|^HWA|^TYP|^UUI|^NM_|^ONB" bu-ifcfg-$Device >> ifcfg-nab1
fi
BackEnd network. Configuration of IPoIB:
yum groupinstall -y "Infiniband Support"
yum install opensm
Enable IPoIB and switch InfiniBand to connected mode. This link explains the differences between connected and datagram modes.
cat /etc/rdma/rdma.conf
# Load IPoIB
IPOIB_LOAD=yes
# Setup connected mode
SET_IPOIB_CM=yes
Start Infiniband services.
systemctl enable rdma opensm
systemctl start rdma opensm
Check that it works:
ibv_devinfo
hca_id: mlx4_0
    transport:          InfiniBand (0)
    fw_ver:             2.7.000
    node_guid:          0025:90ff:ff07:3368
    sys_image_guid:     0025:90ff:ff07:336b
    vendor_id:          0x02c9
    vendor_part_id:     26428
    hw_ver:             0xB0
    board_id:           SM_1071000001000
    phys_port_cnt:      2
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         8
            port_lid:       4
            port_lmc:       0x00
            link_layer:     InfiniBand
        port:   2
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         4
            port_lid:       9
            port_lmc:       0x00
            link_layer:     InfiniBand
and
iblinkinfo
CA: kosmo-virt1 mlx4_0:
      0x002590ffff073385     13    1[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>    2   10[  ] "Infiniscale-IV Mellanox Technologies" ( )
Switch: 0x0002c90200482d08 Infiniscale-IV Mellanox Technologies:
           2    1[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    2[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    3[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    4[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    5[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    6[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    7[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    8[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2    9[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   10[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>   13    1[  ] "kosmo-virt1 mlx4_0" ( )
           2   11[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>    4    1[  ] "kosmo-virt2 mlx4_0" ( )
           2   12[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   13[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   14[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   15[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   16[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   17[  ] ==(                Down/ Polling)==>          [  ] "" ( )
           2   18[  ] ==(                Down/ Polling)==>          [  ] "" ( )
CA: kosmo-virt2 mlx4_0:
      0x002590ffff073369      4    1[  ] ==( 4X 10.0 Gbps Active/  LinkUp)==>    2   11[  ] "Infiniscale-IV Mellanox Technologies" ( )
Set up bond1 (mode 1) over the two IB interfaces and assign the IP address 172.19.254.X, where X is the node number. Example below:
cat /etc/modprobe.d/bonding.conf
alias bond0 bonding
alias bond1 bonding
cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
TYPE=bonding
BOOTPROTO=static
USERCTL=no
ONBOOT=yes
IPADDR=172.19.254.x
NETMASK=255.255.255.0
BONDING_OPTS="mode=1 miimon=500 primary=ib0"
MTU=65520
Disable the firewall.
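On CentOS 7 this usually means stopping and disabling firewalld; a minimal sketch, assuming the default firewalld service is in use:

systemctl stop firewalld
systemctl disable firewalld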
Tuning sysctl.
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.rmem_default=16777216
net.core.wmem_default=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_mem=16777216 16777216 16777216
net.ipv4.tcp_rmem=4096 87380 16777216
net.ipv4.tcp_wmem=4096 65536 16777216
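Assuming these values are added to /etc/sysctl.conf, they can be applied without a reboot, for example:

sysctl -p    # reload /etc/sysctl.conf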
Installing Ceph.
Preparation
Configure passwordless SSH access between the nodes for the root user. The key should be created on one node and then copied to the others into /root/.ssh/.
ssh-keygen -t dsa    # create the key (leave the passphrase empty for passwordless access)
cd /root/.ssh
cat id_dsa.pub >> authorized_keys
chown root.root authorized_keys
chmod 600 authorized_keys
echo "StrictHostKeyChecking no" > config
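A possible way to distribute the key directory to the other nodes (run from the node where the key was generated; this assumes root password logins are still allowed at this point):

for host in kosmo-virt2 kosmo-virt3 kosmo-arch; do
    scp -r /root/.ssh $host:/root/
done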
Disable SELinux on all nodes.
In /etc/selinux/config:
SELINUX=disabled
setenforce 0
Add max open files limits to /etc/security/limits.conf on all nodes (the values depend on your requirements):
* hard nofile 1000000
* soft nofile 1000000
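After logging in again, the new limit can be verified, for example:

ulimit -n    # should report 1000000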
Setup /etc/hosts on all nodes:
172.19.254.1 kosmo-virt1
172.19.254.2 kosmo-virt2
172.19.254.3 kosmo-virt3
172.19.254.150 kosmo-arch
192.168.14.42 kosmo-virt1
192.168.14.43 kosmo-virt2
192.168.14.44 kosmo-virt3
192.168.14.150 kosmo-arch
Installing
Install a kernel > 3.15 on all nodes (this is needed for the CephFS client):
rpm -ivh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml -y
Set up new kernel for booting.
grep ^menuentry /boot/grub2/grub.cfg
grub2-set-default 0    # number of our kernel
grub2-editenv list
grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot.
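After the reboot you can verify that the new kernel is running, for example:

uname -r    # should report the kernel-ml version installed above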
Set up the repository (on all nodes). Note that the heredoc delimiter is quoted so that $basearch is written literally for yum rather than expanded by the shell:
cat << 'EOT' > /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for $basearch
baseurl=http://ceph.com/rpm/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
EOT
Import gpgkey: (on all nodes)
rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
Setup ntpd. (on all nodes)
yum install ntp
Edit /etc/ntp.conf and start ntpd (on all nodes):
systemctl enable ntpd
systemctl start ntpd
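To verify that time synchronization works, you can check the peer list, for example:

ntpq -p    # the currently selected time source is marked with an asterisk (*)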
Install: (on all nodes)
yum install libunwind -y
yum install -y ceph-common ceph ceph-fuse ceph-deploy
Deploying.
(on kosmo-virt1)
cd /etc/ceph
ceph-deploy new kosmo-virt1 kosmo-virt2 kosmo-virt3
MON deploying: (on kosmo-virt1)
ceph-deploy mon create-initial
OSD deploying:
(on kosmo-virt1)
cd /etc/ceph
ceph-deploy gatherkeys kosmo-virt1
ceph-deploy disk zap kosmo-virt1:sdb
ceph-deploy osd prepare kosmo-virt1:sdb
ceph-deploy disk zap kosmo-virt1:sdc
ceph-deploy osd prepare kosmo-virt1:sdc
ceph-deploy disk zap kosmo-virt1:sdd
ceph-deploy osd prepare kosmo-virt1:sdd
ceph-deploy disk zap kosmo-virt1:sde
ceph-deploy osd prepare kosmo-virt1:sde
ceph-deploy disk zap kosmo-virt1:sdf
ceph-deploy osd prepare kosmo-virt1:sdf
ceph-deploy disk zap kosmo-virt1:sdg
ceph-deploy osd prepare kosmo-virt1:sdg
(on kosmo-virt2)
cd /etc/ceph
ceph-deploy gatherkeys kosmo-virt2
ceph-deploy disk zap kosmo-virt2:sdb
ceph-deploy osd prepare kosmo-virt2:sdb
ceph-deploy disk zap kosmo-virt2:sdc
ceph-deploy osd prepare kosmo-virt2:sdc
ceph-deploy disk zap kosmo-virt2:sdd
ceph-deploy osd prepare kosmo-virt2:sdd
ceph-deploy disk zap kosmo-virt2:sde
ceph-deploy osd prepare kosmo-virt2:sde
ceph-deploy disk zap kosmo-virt2:sdf
ceph-deploy osd prepare kosmo-virt2:sdf
ceph-deploy disk zap kosmo-virt2:sdg
ceph-deploy osd prepare kosmo-virt2:sdg
(on kosmo-virt3)
cd /etc/ceph
ceph-deploy gatherkeys kosmo-virt3
ceph-deploy disk zap kosmo-virt3:sdb
ceph-deploy osd prepare kosmo-virt3:sdb
ceph-deploy disk zap kosmo-virt3:sdc
ceph-deploy osd prepare kosmo-virt3:sdc
ceph-deploy disk zap kosmo-virt3:sdd
ceph-deploy osd prepare kosmo-virt3:sdd
ceph-deploy disk zap kosmo-virt3:sde
ceph-deploy osd prepare kosmo-virt3:sde
ceph-deploy disk zap kosmo-virt3:sdf
ceph-deploy osd prepare kosmo-virt3:sdf
ceph-deploy disk zap kosmo-virt3:sdg
ceph-deploy osd prepare kosmo-virt3:sdg
where sd[b-g] are the SSD disks.
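At this point it is worth checking the cluster state, for example:

ceph -s          # overall status, should eventually report HEALTH_OK
ceph osd tree    # all 18 OSDs (6 per node) should be listed and up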
MDS deploying:
The new Giant release of Ceph no longer creates the data and metadata pools by default.
Use ceph osd lspools to check.
ceph osd pool create data 1024
ceph osd pool set data min_size 1
ceph osd pool set data size 2
ceph osd pool create metadata 1024
ceph osd pool set metadata min_size 1
ceph osd pool set metadata size 2
Check the pool IDs of data and metadata with:
ceph osd lspools
Configure FS
ceph mds newfs 4 3 --yes-i-really-mean-it
where 4 is the ID of the metadata pool and 3 is the ID of the data pool.
Configure MDS
(on kosmo-virt1)
cd /etc/ceph
ceph-deploy mds create kosmo-virt1
(on kosmo-virt2)
cd /etc/ceph
ceph-deploy mds create kosmo-virt2
(on all nodes)
chkconfig ceph on
Configure kosmo-arch.
Copy /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring from any of the kosmo-virt nodes to kosmo-arch.
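For example, from kosmo-virt1 (assuming the default /etc/ceph location):

scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring kosmo-arch:/etc/ceph/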
Preparing Ceph for OpenNebula.
Create pool:
ceph osd pool create one 4096
ceph osd pool set one min_size 1
ceph osd pool set one size 2
Set up authorization for the pool one:
ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=one' > /etc/ceph/ceph.client.oneadmin.keyring
Get key from keyring:
cat /etc/ceph/ceph.client.oneadmin.keyring | grep key | awk '{print $3}' >> /etc/ceph/oneadmin.key
Checking:
ceph auth list
Copy /etc/ceph/ceph.client.oneadmin.keyring and /etc/ceph/oneadmin.key to the other nodes.
Preparing for OpenNebula HA
Configuring MariaDB cluster
Configure MariaDB cluster on all nodes except kosmo-arch
Setup repo:
cat << EOT > /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.0/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
EOT
Install:
yum install MariaDB-Galera-server MariaDB-client rsync galera
Start the service:
service mysql start
chkconfig mysql on
mysql_secure_installation
Prepare for the cluster:
mysql -p
GRANT USAGE ON *.* to sst_user@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on *.* to sst_user@'%';
FLUSH PRIVILEGES;
exit
service mysql stop
Configuring the cluster (on kosmo-virt1):
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
wsrep_cluster_name='scanex_galera_cluster'
wsrep_node_address='172.19.254.1' # setup real node ip
wsrep_node_name='kosmo-virt1' # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT
(for kosmo-virt2)
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
wsrep_cluster_name='scanex_galera_cluster'
wsrep_node_address='172.19.254.2' # setup real node ip
wsrep_node_name='kosmo-virt2' # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT
(for kosmo-virt3)
cat << EOT > /etc/my.cnf
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
datadir=/var/lib/mysql
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://172.19.254.1,172.19.254.2,172.19.254.3"
wsrep_cluster_name='scanex_galera_cluster'
wsrep_node_address='172.19.254.3' # setup real node ip
wsrep_node_name='kosmo-virt3' # setup real node name
wsrep_sst_method=rsync
wsrep_sst_auth=sst_user:PASS
EOT
(on kosmo-virt1)
/etc/init.d/mysql start --wsrep-new-cluster
(on kosmo-virt2)
/etc/init.d/mysql start
(on kosmo-virt3)
/etc/init.d/mysql start
Check on all nodes:
mysql -p
show status like 'wsrep%';
+------------------------------+--------------------------------------+
| Variable_name                | Value                                |
+------------------------------+--------------------------------------+
wsrep_local_state_uuid | 739895d5-d6de-11e4-87f6-3a3244f26574 |
wsrep_protocol_version | 7 |
wsrep_last_committed | 0 |
wsrep_replicated | 0 |
wsrep_replicated_bytes | 0 |
wsrep_repl_keys | 0 |
wsrep_repl_keys_bytes | 0 |
wsrep_repl_data_bytes | 0 |
wsrep_repl_other_bytes | 0 |
wsrep_received | 6 |
wsrep_received_bytes | 425 |
wsrep_local_commits | 0 |
wsrep_local_cert_failures | 0 |
wsrep_local_replays | 0 |
wsrep_local_send_queue | 0 |
wsrep_local_send_queue_max | 1 |
wsrep_local_send_queue_min | 0 |
wsrep_local_send_queue_avg | 0.000000 |
wsrep_local_recv_queue | 0 |
wsrep_local_recv_queue_max | 1 |
wsrep_local_recv_queue_min | 0 |
wsrep_local_recv_queue_avg | 0.000000 |
wsrep_local_cached_downto | 18446744073709551615 |
wsrep_flow_control_paused_ns | 0 |
wsrep_flow_control_paused | 0.000000 |
wsrep_flow_control_sent | 0 |
wsrep_flow_control_recv | 0 |
wsrep_cert_deps_distance | 0.000000 |
wsrep_apply_oooe | 0.000000 |
wsrep_apply_oool | 0.000000 |
wsrep_apply_window | 0.000000 |
wsrep_commit_oooe | 0.000000 |
wsrep_commit_oool | 0.000000 |
wsrep_commit_window | 0.000000 |
wsrep_local_state | 4 |
wsrep_local_state_comment | Synced |
wsrep_cert_index_size | 0 |
wsrep_causal_reads | 0 |
wsrep_cert_interval | 0.000000 |
wsrep_incoming_addresses | 172.19.254.1:3306,172.19.254.3:3306,172.19.254.2:3306 |
wsrep_evs_delayed | |
wsrep_evs_evict_list | |
wsrep_evs_repl_latency | 0/0/0/0/0 |
wsrep_evs_state | OPERATIONAL |
wsrep_gcomm_uuid | 7397d6d6-d6de-11e4-a515-d3302a8c2342 |
wsrep_cluster_conf_id | 2 |
wsrep_cluster_size | 2 |
wsrep_cluster_state_uuid | 739895d5-d6de-11e4-87f6-3a3244f26574 |
wsrep_cluster_status | Primary |
wsrep_connected | ON |
wsrep_local_bf_aborts | 0 |
wsrep_local_index | 0 |
wsrep_provider_name | Galera |
wsrep_provider_vendor | Codership Oy info@codership.com |
wsrep_provider_version | 25.3.9(r3387) |
wsrep_ready | ON |
wsrep_thread_count | 2 |
+------------------------------+--------------------------------------+
Creating user and database:
mysql -p
create database opennebula;
GRANT USAGE ON opennebula.* to oneadmin@'%' IDENTIFIED BY 'PASS';
GRANT ALL PRIVILEGES on opennebula.* to oneadmin@'%';
FLUSH PRIVILEGES;
Remember: if all nodes go down, the most up-to-date node must be started with /etc/init.d/mysql start --wsrep-new-cluster, so you first need to find that node. If you bootstrap from a node with an outdated view, the other nodes will report an error in their logs: [ERROR] WSREP: gcs/src/gcs_group.cpp:void group_post_state_exchange(gcs_group_t*)():319: Reversing history: 0 → 0, this member has applied 140536161751824 more events than the primary component. Data loss is possible. Aborting.
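A common way to find the most up-to-date node is to compare the Galera state files on each node (this assumes datadir=/var/lib/mysql as configured above):

cat /var/lib/mysql/grastate.dat    # the node with the highest seqno is the one to bootstrap from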
Configuring HA cluster
Unfortunately, the pcs cluster stack conflicts with the OpenNebula server, so we will go with pacemaker, corosync, and crmsh.
Installing HA
Set up repo on all nodes except kosmo-arch:
cat << EOT > /etc/yum.repos.d/network\:ha-clustering\:Stable.repo
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
enabled=1
EOT
Install on all nodes except kosmo-arch:
yum install corosync pacemaker crmsh resource-agents -y
On kosmo-virt1 create the configuration:
vi /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: cluster
    transport: udpu
}
nodelist {
    node {
        ring0_addr: kosmo-virt1
        nodeid: 1
    }
    node {
        ring0_addr: kosmo-virt2
        nodeid: 2
    }
    node {
        ring0_addr: kosmo-virt3
        nodeid: 3
    }
}
quorum {
    provider: corosync_votequorum
}
logging {
    to_syslog: yes
}
and create authkey on kosmo-virt1
cd /etc/corosync
corosync-keygen
Copy corosync.conf and authkey to kosmo-virt2 and kosmo-virt3.
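For example:

scp /etc/corosync/corosync.conf /etc/corosync/authkey kosmo-virt2:/etc/corosync/
scp /etc/corosync/corosync.conf /etc/corosync/authkey kosmo-virt3:/etc/corosync/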
Enabling (on all nodes except kosmo-arch):
systemctl enable pacemaker corosync
Starting (on all nodes except kosmo-arch):
systemctl start pacemaker corosync
Checking:
crm status
Last updated: Mon Mar 30 18:33:14 2015
Last change: Mon Mar 30 18:23:47 2015 via crmd on kosmo-virt2
Stack: corosync
Current DC: kosmo-virt2 (2) - partition with quorum
Version: 1.1.10-32.el7_0.1-368c726
3 Nodes configured
0 Resources configured
Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
Add properties:
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=stop
Installing OpenNebula
Installing
Setup repo on all nodes except kosmo-arch:
cat << EOT > /etc/yum.repos.d/opennebula.repo
[opennebula]
name=opennebula
baseurl=http://downloads.opennebula.org/repo/4.12/CentOS/7/x86_64/
enabled=1
gpgcheck=0
EOT
Installing (on all nodes except kosmo-arch):
yum install -y opennebula-server opennebula-sunstone opennebula-node-kvm qemu-img qemu-kvm
Ruby Runtime Installation:
/usr/share/one/install_gems
Change the oneadmin password:
passwd oneadmin
Create passwordless SSH access for oneadmin (on kosmo-virt1):
su oneadmin
cd ~/.ssh
ssh-keygen -t dsa
cat id_dsa.pub >> authorized_keys
chown oneadmin:oneadmin authorized_keys
chmod 600 authorized_keys
echo "StrictHostKeyChecking no" > config
Copy it to the other nodes (remember that the oneadmin home directory is /var/lib/one).
Change listen for sunstone-server (on all nodes):
sed -i 's/host:\ 127\.0\.0\.1/host:\ 0\.0\.0\.0/g' /etc/one/sunstone-server.conf
On kosmo-virt1:
Copy all /var/lib/one/.one/*.auth files and the one.key file to OTHER_NODES:/var/lib/one/.one/.
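One possible way is to copy the whole .one directory contents (run as root on kosmo-virt1 and fix the ownership afterwards):

for host in kosmo-virt2 kosmo-virt3; do
    scp /var/lib/one/.one/* $host:/var/lib/one/.one/
    ssh $host chown -R oneadmin:oneadmin /var/lib/one/.one
done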
Start the services on kosmo-virt1:
systemctl start opennebula opennebula-sunstone
Try to connect to http://node:9869.
Check logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log).
If there are no errors, stop the services:
systemctl stop opennebula opennebula-sunstone
Add Ceph support to qemu-kvm on all nodes except kosmo-arch. First check whether rbd is already supported:
qemu-img -h | grep rbd
/usr/libexec/qemu-kvm --drive format=? | grep rbd
If there is no rbd support, then you have to compile and install:
qemu-kvm-rhev qemu-kvm-common-rhev qemu-img-rhev
Download:
yum groupinstall -y "Development Tools" yum install -y yum-utils rpm-build yumdownloader --source qemu-kvm rpm -ivh qemu-kvm-1.5.3-60.el7_0.11.src.rpm
Compiling.
cd ~/rpmbuild/SPECS
vi qemu-kvm.spec
Change %define rhev 0 to %define rhev 1.
rpmbuild -ba qemu-kvm.spec
Installing (for all nodes except kosmo-arch).
rpm -e --nodeps libcacard-1.5.3-60.el7_0.11.x86_64
rpm -e --nodeps qemu-img-1.5.3-60.el7_0.11.x86_64
rpm -e --nodeps qemu-kvm-common-1.5.3-60.el7_0.11.x86_64
rpm -e --nodeps qemu-kvm-1.5.3-60.el7_0.11.x86_64
rpm -ivh libcacard-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
rpm -ivh qemu-img-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
rpm -ivh qemu-kvm-common-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
rpm -ivh qemu-kvm-rhev-1.5.3-60.el7.centos.11.x86_64.rpm
Check for ceph support.
qemu-img -h | grep rbd
Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug
/usr/libexec/qemu-kvm --drive format=? | grep rbd
Supported formats: vvfat vpc vmdk vhdx vdi sheepdog sheepdog sheepdog rbd raw host_cdrom host_floppy host_device file qed qcow2 qcow parallels nbd nbd nbd iscsi gluster gluster gluster gluster dmg cow cloop bochs blkverify blkdebug
Try to write an image (on all nodes except kosmo-arch):
qemu-img create -f rbd rbd:one/test-virtN 10G
where N is the node number.
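You can then verify and clean up the test images, for example:

rbd ls -p one            # the test-virtN images should be listed
rbd rm one/test-virtN    # optionally remove a test image (replace N with the node number)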
Add ceph support for libvirt
On all nodes:
systemctl enable messagebus.service
systemctl start messagebus.service
systemctl enable libvirtd.service
systemctl start libvirtd.service
On kosmo-virt1 create uuid:
uuidgen
cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5
Create secret.xml
cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5</uuid>
  <usage type='ceph'>
    <name>client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==</name>
  </usage>
</secret>
EOF
where AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q== is the contents of /etc/ceph/oneadmin.key.
Copy secret.xml to other nodes.
Add the key to libvirt (on all nodes except kosmo-arch):
virsh secret-define --file secret.xml
virsh secret-set-value --secret cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5 --base64 $(cat /etc/ceph/oneadmin.key)
Check:
virsh secret-list
 UUID                                  Usage
-----------------------------------------------------------
 cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5  ceph client.oneadmin AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q==
Restart libvirtd:
systemctl restart libvirtd.service
Converting the database to MySQL:
Downloading script:
wget http://www.redmine.org/attachments/download/6239/sqlite3-to-mysql.py
Converting:
sqlite3 /var/lib/one/one.db .dump | ./sqlite3-to-mysql.py > mysql.sql
mysql -u oneadmin -p opennebula < mysql.sql
Change /etc/one/oned.conf from
DB = [ backend = "sqlite" ]
to
DB = [ backend = "mysql", server = "localhost", port = 0, user = "oneadmin", passwd = "PASS", db_name = "opennebula" ]
Copy oned.conf as root to the other nodes except kosmo-arch.
Check kosmo-virt2 and kosmo-virt3 nodes in turn:
systemctl start opennebula opennebula-sunstone
Check the logs for errors (/var/log/one/oned.log, /var/log/one/sched.log, /var/log/one/sunstone.log), then stop the services:
systemctl stop opennebula opennebula-sunstone
Creating HA resources
On all nodes except kosmo-arch:
systemctl disable opennebula opennebula-sunstone opennebula-novnc
From any of the nodes except kosmo-arch:
crm configure
primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.14.41" cidr_netmask="24" op monitor interval="30s"
primitive opennebula_p systemd:opennebula \
    op monitor interval=60s timeout=20s \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s"
primitive opennebula-sunstone_p systemd:opennebula-sunstone \
    op monitor interval=60s timeout=20s \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s"
primitive opennebula-novnc_p systemd:opennebula-novnc \
    op monitor interval=60s timeout=20s \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s"
group Opennebula_HA ClusterIP opennebula_p opennebula-sunstone_p opennebula-novnc_p
commit
exit
Check
crm status
Last updated: Tue Mar 31 16:43:00 2015
Last change: Tue Mar 31 16:40:22 2015 via cibadmin on kosmo-virt1
Stack: corosync
Current DC: kosmo-virt2 (2) - partition with quorum
Version: 1.1.10-32.el7_0.1-368c726
3 Nodes configured
4 Resources configured
Online: [ kosmo-virt1 kosmo-virt2 kosmo-virt3 ]
 Resource Group: Opennebula_HA
     ClusterIP              (ocf::heartbeat:IPaddr2):        Started kosmo-virt1
     opennebula_p           (systemd:opennebula):            Started kosmo-virt1
     opennebula-sunstone_p  (systemd:opennebula-sunstone):   Started kosmo-virt1
     opennebula-novnc_p     (systemd:opennebula-novnc):      Started kosmo-virt1
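To test failover you can, for example, put the active node into standby, watch the resources move, and then bring it back online:

crm node standby kosmo-virt1
crm status                      # Opennebula_HA should now be running on another node
crm node online kosmo-virt1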
Configuring OpenNebula
Web management is available at http://active_node:9869.
Using the web interface: 1. Create a cluster. 2. Add the hosts (using the 192.168.14.0 network).
Console management.
3. Add a network (as oneadmin):
cat << EOT > def.net
NAME = "Shared LAN"
TYPE = RANGED
# Now we'll use the host private network (physical) bridge created earlier
BRIDGE = nab1
NETWORK_SIZE = C
NETWORK_ADDRESS = 192.168.14.0
EOT
onevnet create def.net
4. Create the RBD image datastore (as oneadmin):
cat << EOT > rbd.conf
NAME = "cephds"
DS_MAD = ceph
TM_MAD = ceph
DISK_TYPE = RBD
POOL_NAME = one
BRIDGE_LIST = "192.168.14.42 192.168.14.43 192.168.14.44"
CEPH_HOST = "172.19.254.1:6789 172.19.254.2:6789 172.19.254.3:6789"
CEPH_SECRET = "cfb34c4b-d95c-4abc-a4cc-f8a2ae532cb5" # uuid used for the libvirt Ceph authentication
CEPH_USER = oneadmin
EOT
onedatastore create rbd.conf
5. Create system ceph datastore.
Check the last datastore ID number (N):
onedatastore list
On all nodes, create the directory and mount CephFS:
mkdir /var/lib/one/datastores/N+1
echo "172.19.254.K:6789:/ /var/lib/one/datastores/N+1 ceph rw,relatime,name=admin,secret=AQB4jxJV8PuhJhAAdsdsdRBkSFrtr0VvnQNljBw==,nodcache 0 0" >> /etc/fstab    # the secret is in /etc/ceph/ceph.client.admin.keyring
mount /var/lib/one/datastores/N+1
where 172.19.254.K is the IP address of the current node.
From one node change the permissions:
chown oneadmin:oneadmin /var/lib/one/datastores/N+1
Create system ceph datastore (su oneadmin):
cat << EOT > sys_fs.conf
NAME = system_ceph
TM_MAD = shared
TYPE = SYSTEM_DS
EOT
onedatastore create sys_fs.conf
6. Add the nodes, vnets, and datastores to the created cluster using the web interface.
Here is the official documentation.
One comment: I use the migrate command instead of recreate.
/etc/one/oned.conf
HOST_HOOK = [
    name = "error",
    on = "ERROR",
    command = "host_error.rb",
    arguments = "$HID -m",
    remote = no ]
BACKUP
Some words about backup.
Use the persistent image type for this backup scheme.
For backup we used a single Linux server, kosmo-arch (a Ceph client), with ZFS on Linux installed; deduplication is enabled on the zpool. (Remember that deduplication requires about 2 GB of RAM per 1 TB of storage space.)
Example of a simple script started by cron:
#!/bin/sh
currdate=`/bin/date +%Y-%m-%0e`
olddate=`/bin/date --date="60 days ago" +%Y-%m-%0e`
imagelist="one-21" # space-delimited list of images to back up
for i in $imagelist
do
    # check whether today's and the 60-day-old snapshots already exist
    snapcurchk=`/usr/bin/rbd -p one ls | grep $i | grep $currdate`
    snapoldchk=`/usr/bin/rbd -p one ls | grep $i | grep $olddate`
    if test -z "$snapcurchk"
    then
        # create today's snapshot and export it to the backup directory
        /usr/bin/rbd snap create --snap $currdate one/$i
        /usr/bin/rbd export one/$i@$currdate /rbdback/$i-$currdate
    else
        echo "current snapshot exists"
    fi
    if test -z "$snapoldchk"
    then
        echo "old snapshot doesn't exist"
    else
        # remove the 60-day-old snapshot and its exported copy
        /usr/bin/rbd snap rm one/$i@$olddate
        /bin/rm -f /rbdback/$i-$olddate
    fi
done
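To restore such a backup, the exported file can be imported back into the pool; a sketch with a hypothetical file name and target image:

rbd import /rbdback/one-21-2015-03-30 one/one-21-restored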
Use the onevm utility or the web interface (see the VM template) to find out which image is assigned to a VM.
onevm list
onevm show "VM_ID" -a | grep IMAGE_ID
PS
Don't forget to change the VM storage driver to vda (virtio; drivers for Windows are available). Without that you will face low I/O performance (no more than 100 MB/s).
I saw 415 MB/s with the virtio drivers.
Links.
1. Official OpenNebula documentation
2. Official Ceph documentation
3. Converting SQLite to MySQL
4. Converting the OpenNebula DB to MySQL
5. HA on RHEL 7
6. Cluster with crmsh
Hi,
What a how to ! Thank you _very_ much for sharing 🙂
Now, I must find some time to reproduce it in a home lab… in the meantime, would you please clarify why OpenNebula isn’t compatible with pcs?
Thanks again for sharing
Bests
Hi,
OpenNebula and pcs depend on different Ruby gem packages which conflict with each other.
To solve the problem, see the following link: https://forum.opennebula.org/t/ruby-problem-centos-7-pcsd-and-opennebula/356.
We followed this document to set up a Ceph cluster integrated with OpenNebula 5.2, but we faced some issues: after creating the Ceph cluster, when we try to integrate it with the OpenNebula frontend node, the datastore does not come up and does not show usable disk space.
[oneadmin@node1 ~]$ cat cephds.conf
NAME = “cephds”
DS_MAD = ceph
TM_MAD = ceph
DISK_TYPE = RBD
POOL_NAME = data
CEPH_HOST = storage1-storage2-storage3
CEPH_USER = oneadmin
CEPH_SECRET = “AQCU1GxYHmbkBxAA6TlD6q+PyMTL+/a+AB+2bg==2dsf”
BRIDGE_LIST = storage1
Plz suggest us.
[cephuser@storage1 ~]$ ceph -s
cluster dd8ea244-d2e4-4948-b3b8-ec2ddedeff80
health HEALTH_OK
monmap e1: 3 mons at {storage1=XX:XX:XX:137:6789/0,storage2=XX.XX.XX.138:6789/0,storage3=XX.XX.XX.139:6789/0}
election epoch 70, quorum 0,1,2 storage1,storage2,storage3
osdmap e204: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v312560: 2112 pgs, 3 pools, 14641 kB data, 24 objects
19914 MB used, 11104 GB / 11123 GB avail
2112 active+clean
subhash, I’m sure you’ll get help if you send this question to the forum:
https://forum.opennebula.org/c/support
Hi subhash!
I see two errors:
>CEPH_HOST = storage1-storage2-storage3
CEPH_HOST = “storage1 storage2 storage3”
>CEPH_SECRET = “AQCU1GxYHmbkBxAA6TlD6q+PyMTL+/a+AB+2bg==2dsf”
CEPH_SECRET should be uuid
Be attentive.
PS: You don’t need to use crmsh. Just use pcs cluster.
Some more updates on CEPH Cluster:
I have set up a 3-node Ceph storage cluster; all nodes are working fine and data is replicated to each node. After that we wanted to integrate it with OpenNebula 5.2. We set it up with the settings below as shared storage, but it is not coming up as shared storage on the OpenNebula frontend.
Please suggest us..