Openstack Kolla

List of Hosts:

10.240.169.3: MAAS, docker, openstack-kolla, and kolla are all installed here

Local Docker Registry

To install multi-node Docker OpenStack, we need a local registry service. Nexus3 is an easy-to-use registry server with a web GUI.
Install it via docker:
create ./nexus3/data/docker-compose.yml

=====================================
nexus:
  image: sonatype/nexus3:latest
  ports:
    - "8081:8081"
    - "5000:5000"
  volumes:
    - ./data:/nexus-data
=======================================

and then run "docker-compose up -d" to create the container. You may need to pip install docker-compose first.

Launch a web browser to 10.240.169.3:8081 (the docker host). The default account is admin/admin123. Create a new repository of type docker (hosted), set it to listen on port 5000, and enable the Docker V1 API.

Verify that each docker host can log in to this private registry: docker login -p admin123 -u admin 10.240.169.3:5000

To build the images and push them to the local registry, run the following on 10.240.169.3:

pip install kolla
kolla-build --base ubuntu --type source --registry 10.240.169.3:5000 --push

This builds every available image (fetching sources from the Internet) and pushes it to the local registry.
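To confirm the push worked, the registry catalog can be queried over the Docker Registry v2 API (assuming Nexus exposes it on the port-5000 connector configured above):

curl -u admin:admin123 http://10.240.169.3:5000/v2/_catalog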

Prepare hosts for ceph osd

Partition and label the disks on each host:
parted /dev/sdb -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdc -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdd -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sde -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdf -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdg -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP 1 -1
parted /dev/sdh -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_1 1 -1
parted /dev/sdi -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_2 1 -1
parted /dev/sdj -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_1_J 1 -1
parted /dev/sdk -s -- mklabel gpt mkpart KOLLA_CEPH_OSD_BOOTSTRAP_2_J 1 -1

Each host needs the following installed:

apt install python-pip -y
pip install -U docker-py

apt-get install bridge-utils debootstrap ifenslave ifenslave-2.6 lsof lvm2 ntp ntpdate openssh-server sudo tcpdump python-dev vlan -y

There is no need to install docker.io manually, as kolla-ansible has a bootstrap command for this: kolla-ansible -i multinode bootstrap-servers

If a deployment fails, copy /usr/local/share/kolla-ansible/tools/cleanup-containers to each host and run it to clean up the containers before redeploying.

"kolla-ansible -i multinode destroy" removes all deployed containers on all nodes, but the Ceph partitions are kept. To erase the partitioned disks, run the following on each host:

umount /dev/sdb1
umount /dev/sdc1
umount /dev/sdd1
umount /dev/sde1
umount /dev/sdf1
umount /dev/sdg1
umount /dev/sdh1
umount /dev/sdi1
dd if=/dev/zero of=/dev/sdb bs=512 count=1
dd if=/dev/zero of=/dev/sdc bs=512 count=1
dd if=/dev/zero of=/dev/sdd bs=512 count=1
dd if=/dev/zero of=/dev/sde bs=512 count=1
dd if=/dev/zero of=/dev/sdf bs=512 count=1
dd if=/dev/zero of=/dev/sdg bs=512 count=1
dd if=/dev/zero of=/dev/sdh bs=512 count=1
dd if=/dev/zero of=/dev/sdi bs=512 count=1
dd if=/dev/zero of=/dev/sdj bs=512 count=1
dd if=/dev/zero of=/dev/sdk bs=512 count=1
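A compact equivalent of the commands above (a sketch; it assumes the same /dev/sdb through /dev/sdk layout and ignores unmount errors on disks that are not mounted):

for d in b c d e f g h i j k; do
  umount /dev/sd${d}1 2>/dev/null
  dd if=/dev/zero of=/dev/sd${d} bs=512 count=1
done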

Openstack Ansible

Same External/Internal IP

When deploying via openstack-ansible, if the external and internal VIPs use the same IP, SSL must be disabled; otherwise pip installs will fail:

FAILED - RETRYING: TASK: pip_install : Install pip packages (fall back mode) (2 retries left).
FAILED - RETRYING: TASK: pip_install : Install pip packages (fall back mode) (1 retries left).
fatal: [infra01_galera_container-ff9ac443]: FAILED! => {"attempts": 5, "changed": false, "cmd": "/usr/local/bin/pip2 install -U --isolated --constraint http://10.240.169.102:8181/os-releases/master/requirements_absolute_requirements.txt ", "failed": true, "msg": "\n:stderr: Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', BadStatusLine(\"''\",))': /os-releases/master/requirements_absolute_requirements.txt\nRetrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', BadStatusLine(\"''\",))': /os-releases/master/requirements_absolute_requirements.txt\nRetrying (Retry(total=2, .......... Max retries exceeded with url: /os-releases/master/requirements_absolute_requirements.txt (Caused by ProtocolError('Connection aborted.', BadStatusLine(\"''\",)))\n"}

Resolution: add the following to /etc/openstack_deploy/user_variables.yml:

openstack_service_publicuri_proto: http
openstack_external_ssl: false
haproxy_ssl: false

Openstack-ansible playbook

To make openstack-ansible build OpenStack, first generate random passwords for the components:
cd /opt/openstack-ansible/scripts
python pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml
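Then run the playbooks (a sketch; these are the standard openstack-ansible playbook names):

cd /opt/openstack-ansible/playbooks
openstack-ansible setup-hosts.yml
openstack-ansible setup-infrastructure.yml
openstack-ansible setup-openstack.yml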

After openstack-ansible is fully deployed, lxc-attach to the utility container and source the /root/openrc file for the OpenStack environment variables.

If, after a reboot, all Galera servers are down and can't be started (showing up as a MySQL service failure), run "openstack-ansible galera-install.yml --tags galera-bootstrap" to recover them.

Openstack ceph-ansible

To make OpenStack use cinder with Ceph, we need to manually install Ceph from git first.

git clone https://github.com/ceph/ceph-ansible/
cd ceph-ansible/
cp site.yml.sample site.yml
cp group_vars/all.yml.sample group_vars/all.yml
cp group_vars/mons.yml.sample group_vars/mons.yml
cp group_vars/osds.yml.sample group_vars/osds.yml

Edit the inventory file:
[root@ansible ~]# vi inventory_hosts
[mons]
10.240.173.102
10.240.173.103
10.240.173.104

[osds]
10.240.173.102
10.240.173.103
10.240.173.104
10.240.173.105
10.240.173.106

[rgws]
10.240.173.102

Verify they are reachable via SSH:
ansible -m ping -i inventory_hosts all

Edit site.yml and comment out anything not needed.
Edit group_vars/all.yml:
ceph_origin: upstream
ceph_stable: true
ceph_stable_release: jewel
monitor_interface: br-storage
journal_size: 1024
public_network: 10.240.173.0/24

Edit group_vars/osds.yml to indicate which disks are used for OSDs and journals.
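A minimal sketch of group_vars/osds.yml (variable names differ between ceph-ansible releases; a simple device list with collocated journals is assumed here):

devices:
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd
journal_collocation: true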
Then run the playbook: ansible-playbook site.yml -i inventory_hosts

Make sure ceph health is OK; if PGs are stuck in an inactive state, check the MTU on the Ethernet ports.

If ceph -s complains about "too few PGs per OSD", change the pool's PG count. For 10-50 OSDs, use 1024 PGs:
# ceph osd pool set rbd pg_num 1024
# ceph osd pool set rbd pgp_num 1024

After installation, generate keyrings on all ceph-mon nodes, otherwise openstack-ansible will complain about missing keyrings:
ceph auth get-or-create client.cinder
ceph auth get-or-create client.glance
ceph auth get-or-create client.cinder-backup

On a mon node, you can check the keyrings with "ceph auth list".

Also add permissions for each client, e.g. ceph auth caps client.cinder mon 'allow *' osd 'allow *'
otherwise cinder-volume will fail.
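The same broad capabilities can be granted to the other clients created above (a sketch; tighten the caps as needed):

ceph auth caps client.glance mon 'allow *' osd 'allow *'
ceph auth caps client.cinder-backup mon 'allow *' osd 'allow *'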

Another option is to uncomment these settings in ceph-ansible/group_vars/mons.yml.

Access rbd image from ceph mon

To directly access a Ceph disk, which is the actual backing store of a VM image or volume, use "rbd map" to map the image into the mon's system and mount it to a folder. Use "rbd -p poolname ls" to list the images inside a pool, and "rbd -p poolname info imagename" to see the details.

On Ubuntu 16.04 with Ceph Jewel, some newly enabled image features are not supported by the kernel client, so they need to be disabled on a per-image basis: "rbd feature disable pool/imagename deep-flatten fast-diff object-map exclusive-lock". After that, "rbd map pool/image" will work. A /dev/rbdX device will be created, and if it's the right image it will have a sub-partition that can be mounted.
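A minimal walk-through, with hypothetical pool and image names:

rbd -p volumes ls
rbd -p volumes info volume-test
rbd feature disable volumes/volume-test deep-flatten fast-diff object-map exclusive-lock
rbd map volumes/volume-test
mount /dev/rbd0p1 /mnt
# when finished:
umount /mnt
rbd unmap /dev/rbd0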

Nova access cinder-volume

To make Nova able to attach or mount a Cinder volume, an rbd_secret_uuid needs to be set in both cinder.conf and nova.conf; otherwise it will complain with a "NoneType" error.
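A minimal sketch of the settings involved (the [rbd] backend section name and the example UUID are assumptions; the UUID must match the libvirt secret that holds the client.cinder key):

# cinder.conf
[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

# nova.conf (on each compute node)
[libvirt]
rbd_user = cinder
rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337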

Horizon Issue

To fix the URL option missing under the image tab on the OpenStack dashboard, add this line to /etc/horizon/local_settings.py in the Horizon container on all three controllers:
IMAGES_ALLOW_LOCATION = True
then restart apache2,
or add it to /etc/ansible/roles/os_horizon/templates/horizon_local_settings.py.j2, since newer OpenStack releases omit it by default for security reasons.

To make the original location visible in glance image-list, change /etc/glance/glance-api.conf inside each glance container:
#display URL address
show_image_direct_url = True
#display available multiple locations
show_multiple_locations = True
then restart the glance-api service

"Cannot read property 'data' of undefined" while creating new images can have multiple causes. Check which stores are defined for your input field: if "File" upload is enabled on Horizon, check HORIZON_IMAGES_UPLOAD_MODE in Horizon's local_settings.py; if URL is enabled, check your Glance store settings. If a URL link is used, all instances will download the image from that URL when they first boot.
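For reference, a minimal sketch of the relevant local_settings.py entry ('direct' is one accepted value; 'legacy' and 'off' are the others):

HORIZON_IMAGES_UPLOAD_MODE = 'direct'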

Glance Issue

Add an extra image location and path to an existing image.
To authenticate the API calls we need a token, so

$ keystone token-get
+-----------+----------------------------------+
|  Property |              Value               |
+-----------+----------------------------------+
|  expires  |       2015-05-06T14:22:16Z       |
|     id    | 2602709084d64417b7f3480fccfa1785 |
| tenant_id | 486ab7509bfd46c386d4a8353b80a08d |
|  user_id  | 0b78d6793b1c4305ad6e76fa232b5a74 |
+-----------+----------------------------------+

and then reuse this token to make the API call:

$ curl -i -X PATCH -H 'Content-Type: application/openstack-images-v2.1-json-patch' \
-H "X-Auth-Token: 2602709084d64417b7f3480fccfa1785" \
http://192.168.0.60:9292/v2/images/90674766-dbaa-4a6e-a344-2a4116af9fab \
-d '[{"op": "add", "path": "/locations/-", "value": {"url": "rbd://5de961fb-2368-4f77-8725-7b002732e214/images/7bb0484c-cb6b-4700-88bb-0a18b8f3a8f5/snap", "metadata": {}}}]'

HTTP/1.1 200 OK
Content-Length: 955
Content-Type: application/json; charset=UTF-8
X-Openstack-Request-Id: req-req-29faba33-657e-4959-b508-fcffe8081d8f
Date: Wed, 06 May 2015 14:21:21 GMT

{"status": "active", "virtual_size": null, "name": "CirrOS-0.3.3", "tags": [], "container_format": "bare", "created_at": "2015-05-06T09:29:40Z", "size": 13200896, "disk_format": "qcow2", "updated_at": "2015-05-06T14:21:20Z", "visibility": "private", "locations": [{"url": "rbd://5de961fb-2368-4f77-8725-7b002732e214/images/90674766-dbaa-4a6e-a344-2a4116af9fab/snap", "metadata": {}}, {"url": "rbd://5de961fb-2368-4f77-8725-7b002732e214/images/7bb0484c-cb6b-4700-88bb-0a18b8f3a8f5/snap", "metadata": {}}], "self": "/v2/images/90674766-dbaa-4a6e-a344-2a4116af9fab", "min_disk": 0, "protected": false, "id": "90674766-dbaa-4a6e-a344-2a4116af9fab", "file": "/v2/images/90674766-dbaa-4a6e-a344-2a4116af9fab/file", "checksum": "133eae9fb1c98f45894a4e60d8736619", "owner": "486ab7509bfd46c386d4a8353b80a08d", "direct_url": "rbd://5de961fb-2368-4f77-8725-7b002732e214/images/90674766-dbaa-4a6e-a344

Ceilometer issue

Ceilometer only works with MongoDB, and openstack-ansible doesn't have a MongoDB role, so we need to install it manually.

apt-get install mongodb-server mongodb-clients python-pymongo

Add smallfiles = true to /etc/mongodb.conf, restart the service, and add the ceilometer user:

mongo --host 127.0.0.1 --eval 'db = db.getSiblingDB("ceilometer"); db.addUser({user: "ceilometer", pwd: "CEILOMETER_DBPASS", roles: [ "readWrite", "dbAdmin" ]})'

then add the following to user_variables.yml:

ceilometer_db_type: mongodb
ceilometer_db_ip: localhost
ceilometer_db_port: 27017

This way, each Ceilometer instance uses its own local MongoDB database.

Nova boot process illustration

(Diagrams: nova-boot1.PNG, nova-boot2.PNG)

Rabbitmq

To get a general view of what's going on with the AMQP traffic, we need access to the RabbitMQ management GUI.

Enable the management GUI plugin and add an admin user for it:

rabbitmq-plugins enable rabbitmq_management
rabbitmqctl add_user test test
rabbitmqctl set_user_tags test administrator
rabbitmqctl set_permissions -p / test ".*" ".*" ".*"
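Once the plugin is enabled, the management GUI and API listen on port 15672 by default (assuming the listener was not changed). A quick check with the test user created above:

curl -u test:test http://localhost:15672/api/overview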

SRIOV config

1.#change /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="nomdmonddf nomdmonisw intel_iommu=on"
update-grub

#add vif
echo '7' > /sys/class/net/eth6/device/sriov_numvfs

2.#change compute /etc/nova/nova.conf to enable vif passthrough
[DEFAULT]
pci_passthrough_whitelist = { "devname": "eth6", "physical_network": "sriov" }
service nova-compute restart

3.#change neutron server nodes to support sriov
/etc/neutron/plugins/ml2/ml2_conf.ini
mechanism_drivers = sriovnicswitch

#(optional)add /etc/neutron/plugins/ml2/ml2_conf_sriov.ini
supported_pci_vendor_devs = 8086:10ed
service neutron-server restart

4.#add on each nova-scheduler node
[DEFAULT]
scheduler_default_filters = PciPassthroughFilter

service nova-scheduler restart

5.#each compute nodes
apt-get install neutron-plugin-sriov-agent
/etc/neutron/plugins/ml2/sriov_agent.ini
[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[sriov_nic]
physical_device_mappings = sriov:eth6
exclude_devices =

#apply new ini to agent
neutron-sriov-nic-agent \
--config-file /etc/neutron/neutron.conf \
--config-file /etc/neutron/plugins/ml2/sriov_agent.ini

#change neutron.conf so a newer TLS version (e.g. TLSv1.2) can be negotiated, as the default TLSv1 is no longer supported by rabbitmq
kombu_ssl_version = SSLv23

service neutron-sriov-agent restart
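To verify the setup end to end, a port with vnic_type direct can be created and attached to a new instance (a sketch; the network, image, and flavor names are hypothetical):

neutron port-create sriov-net --name sriov-port1 --binding:vnic_type direct
nova boot --flavor m1.small --image xenial --nic port-id=<port-id-from-above> sriov-test-vm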

OVS traffic capture

OVS traffic flow: VM -> tap+"qbr(linuxbridge)"+qvb -> qvo+"br-int"+patch-br-ex -> patch-br-int+"br-ex"+port# -> external network

If DVR is not used, all traffic goes from the compute nodes to the neutron nodes and leaves through the neutron nodes' ports.

If DVR is used, every host has a qrouter (same MAC and IP). When a VM has no floating IP, it can go out directly from the compute node without passing through the neutron node. If there is a floating IP, the floating IP resides on the neutron node, so traffic goes from the VM to the neutron node first, is NATed, and is then sent to the external network; traffic initiated from outside first hits the floating IP on the neutron node, then is filtered and NATed to the VM.

A regular tcpdump can be run on the host's ports, but that is only usable down to qvo. For patch-br-ex -> patch-br-int, you need to do the following:

$ ip link add name snooper0 type dummy
$ ip link set dev snooper0 up

$ ovs-vsctl add-port br-int snooper0

$ ovs-vsctl -- set Bridge br-int mirrors=@m -- --id=@snooper0 \
  get Port snooper0 -- --id=@patch-tun get Port patch-tun \
  -- --id=@m create Mirror name=mymirror select-dst-port=@patch-tun \
  select-src-port=@patch-tun output-port=@snooper0 select_all=1

You can then run the tcpdump:

$ tcpdump -i snooper0

To clear it:

$ ovs-vsctl clear Bridge br-int mirrors

$ ovs-vsctl del-port br-int snooper0

$ ip link delete dev snooper0

virsh based Openstack hint

install ubuntu 16.04 LTS

create / partition with 1TB

512MB for EFI

50000MB for SWAP

Leave about 1.8TB unallocated for future PV use. Don't use up the entire sdb for / with LVM; it will cause issues when resizing LV space later, and the partition table will get messed up.

 

sudo fdisk -l

shows the partitioned disks

fdisk /dev/sdb

partition the unallocated space into a new PV, e.g. sdb4 with 1.8TB

 

sudo apt install lvm2

installs LVM for volume management; a reboot is then needed to fix the "unreachable" issue

 

sudo pvcreate /dev/sdb4

makes the entire /dev/sdb4 usable as a PV

sudo vgcreate Openstack /dev/sdb4

creates the volume group Openstack on the 1.8TB sdb4

sudo lvcreate --size 20GB -n maas Openstack

creates a 20GB volume called maas in the newly created VG Openstack

New LVs need a filesystem before they can be mounted as a drive (mkfs.ext4 /dev/Openstack/maas), but this is not required for virsh as the VM will format the disk itself.

*** Run resize2fs /dev/Openstack/maas to use all of the space after extending the LV with lvextend ***
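A minimal sketch of the extend-and-grow sequence (the +100G increment is just an example):

lvextend -L +100G /dev/Openstack/maas
resize2fs /dev/Openstack/maas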

Open virt-manager and create maas vm

URL install source for the MAAS VM (Xenial installer repo):

http://us.archive.ubuntu.com/ubuntu/dists/xenial/main/installer-amd64/

After the MAAS server boots up, install MAAS: apt-get install maas

Set up the region controller address and DHCP for the zones. Install libvirt-bin for qemu PXE, and delete the default network generated by virsh: virsh net-destroy default; virsh net-undefine default.

 

Create a new VM in virt-manager called ctl1, using the LV ctl1 that was previously created.

Open the MAAS GUI and add a node for ctl1. The power address is qemu+ssh://chz8494@192.168.100.2/system, where 192.168.100.2 is the hypervisor address, and the power ID is the VM name in virt-manager, which is ctl1.

Repeat the above steps for all components: juju, ctl, neutron, computes.

sudo virt-install --name juju --ram 2048 --vcpus=2 --disk /dev/mapper/Openstack-juju,bus=scsi --network bridge=PXE@vlan2,model=virtio --network bridge=br3,model=virtio --network bridge=br4,model=virtio --network bridge=br5,model=virtio --network bridge=br6,model=virtio --noautoconsole --vnc --pxe

 

install juju-2.0

sudo apt-get install juju

 

Create a clouds.yaml file to be used later by add-cloud:

clouds.yaml

clouds:
  maas:
    type: maas
    auth-types: [ oauth1 ]
    regions:
      home:
        endpoint: http://192.168.100.10/MAAS/

 

Add the new MAAS cloud to juju's usable clouds:

juju add-cloud maas clouds.yaml

 

Add credentials for the maas cloud (juju will prompt for the MAAS API key):

juju add-credential maas

And now all config will be saved at ~/.local/share/juju

Bootstrap juju on a machine with the tag bootstrap, on the cloud maas, naming the controller juju.

juju bootstrap --constraints tags=bootstrap juju maas

This will generate a model called "controller" (don't confuse it with a real OpenStack controller), and the bootstrap will be installed in it.

You can also create new models for new deployment environments. This is a new feature in juju 2.0 that supports multiple environments under the same juju host.

juju add-model Openstack

Use "juju switch openstack" to switch from the controller model to the openstack model; "juju deploy" will then deploy the yaml file to your current model only. This means the old approach of redoing the whole deployment is no longer necessary: "juju destroy-model openstack" erases only the openstack model and leaves the controller bootstrap untouched. You can still destroy the whole juju setup; the previous "juju destroy-environment" command is now "juju kill-controller".

Juju may have many different models and controllers. Models did not exist in juju 1.25, and a controller may contain multiple models. The previous "juju switch maas" switched between environments; it now switches between models. After the juju bootstrap is installed, you can see the bootstrap machine as machine 0 under "juju status".

Optionally, juju-gui can be installed on machine 0, the juju bootstrap VM:

juju deploy juju-gui --to=0

juju gui --show-credentials

Check the admin password for logging in to juju-gui.

juju debug-log

to see the current debug messages across the deployment.

In juju 2.0, there is no more "local:" prefix for bundle.yaml. Instead, it uses a path like "charm: ./xenial/ntp". If you want to use charm store resources, use "charm: cs:ntp".

The OpenStack yaml installs its components into LXD containers (Xenial) on each controller. Instead of caching the LXC config in /etc/lxc/default.conf, LXD uses the "lxc profile" command to change port bindings.

lxc profile device set default eth0 parent juju-br0

This sets the default profile to bind the LXC container's eth0 port to the host's juju-br0 bridge.

If your bridged host port is set to a static IP, the newly created LXC container will boot with eth0 set to manual. If you want DHCP instead, you need to change it inside the container, as in the sketch below.
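A minimal sketch, assuming the Xenial container still uses ifupdown with the classic /etc/network/interfaces file (the container name is hypothetical):

lxc exec juju-machine-0-lxd-0 -- bash
# inside the container, change the eth0 stanza in /etc/network/interfaces from
#   iface eth0 inet manual
# to
#   iface eth0 inet dhcp
# then bring the interface up again:
ifdown eth0; ifup eth0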