I'm running OpenStack Grizzly on CentOS, installed by Mirantis Fuel.
[root@controller-20 ~]# cat /etc/redhat-release
CentOS release 6.4 (Final)
[root@controller-20 ~]# rpm -qa | grep -i openstack-nova
openstack-nova-console-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-common-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-scheduler-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-conductor-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-objectstore-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-novncproxy-0.4-8.el6.noarch
openstack-nova-cert-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-api-2013.1.1.fuel3.0-mira.2.noarch
The topology is currently one controller node and three compute nodes, all running on modern Dell rackmount hardware. I've provisioned roughly 25 VMs today before the issues started.
At some point while creating/deleting VMs, a fixed IP got stuck in an indeterminate state. Now I'm having trouble creating new VMs: OpenStack tries to assign IPs that it thinks still belong to an old VM, and the new VM fails to build.
My fixed network is 10.129.0.0/24.
Here's a list of the problem IPs from the nova-manage command line:
# nova-manage fixed list | grep -E 'network|WARNING' -A 1
network IP address hostname host
10.129.0.0/24 10.129.0.0 None None
--
WARNING: fixed ip 10.129.0.20 allocated to missing instance
10.129.0.0/24 10.129.0.20 None None
--
WARNING: fixed ip 10.129.0.23 allocated to missing instance
10.129.0.0/24 10.129.0.23 None None
--
WARNING: fixed ip 10.129.0.25 allocated to missing instance
10.129.0.0/24 10.129.0.25 None None
WARNING: fixed ip 10.129.0.26 allocated to missing instance
10.129.0.0/24 10.129.0.26 None None
WARNING: fixed ip 10.129.0.27 allocated to missing instance
10.129.0.0/24 10.129.0.27 None None
--
WARNING: fixed ip 10.129.0.30 allocated to missing instance
10.129.0.0/24 10.129.0.30 None None
WARNING: fixed ip 10.129.0.31 allocated to missing instance
10.129.0.0/24 10.129.0.31 None None
WARNING: fixed ip 10.129.0.32 allocated to missing instance
10.129.0.0/24 10.129.0.32 None None
WARNING: fixed ip 10.129.0.33 allocated to missing instance
10.129.0.0/24 10.129.0.33 None None
WARNING: fixed ip 10.129.0.34 allocated to missing instance
10.129.0.0/24 10.129.0.34 None None
WARNING: fixed ip 10.129.0.35 allocated to missing instance
10.129.0.0/24 10.129.0.35 None None
WARNING: fixed ip 10.129.0.36 allocated to missing instance
10.129.0.0/24 10.129.0.36 None None
WARNING: fixed ip 10.129.0.37 allocated to missing instance
10.129.0.0/24 10.129.0.37 None None
WARNING: fixed ip 10.129.0.38 allocated to missing instance
10.129.0.0/24 10.129.0.38 None None
WARNING: fixed ip 10.129.0.39 allocated to missing instance
10.129.0.0/24 10.129.0.39 None None
WARNING: fixed ip 10.129.0.40 allocated to missing instance
10.129.0.0/24 10.129.0.40 None None
WARNING: fixed ip 10.129.0.41 allocated to missing instance
10.129.0.0/24 10.129.0.41 None None
WARNING: fixed ip 10.129.0.42 allocated to missing instance
10.129.0.0/24 10.129.0.42 None None
WARNING: fixed ip 10.129.0.43 allocated to missing instance
10.129.0.0/24 10.129.0.43 None None
WARNING: fixed ip 10.129.0.44 allocated to missing instance
10.129.0.0/24 10.129.0.44 None None
WARNING: fixed ip 10.129.0.45 allocated to missing instance
10.129.0.0/24 10.129.0.45 None None
WARNING: fixed ip 10.129.0.46 allocated to missing instance
10.129.0.0/24 10.129.0.46 None None
--
WARNING: fixed ip 10.129.0.48 allocated to missing instance
10.129.0.0/24 10.129.0.48 None None
WARNING: fixed ip 10.129.0.49 allocated to missing instance
10.129.0.0/24 10.129.0.49 None None
WARNING: fixed ip 10.129.0.50 allocated to missing instance
10.129.0.0/24 10.129.0.50 None None
--
WARNING: fixed ip 10.129.0.52 allocated to missing instance
10.129.0.0/24 10.129.0.52 None None
WARNING: fixed ip 10.129.0.53 allocated to missing instance
10.129.0.0/24 10.129.0.53 None None
--
WARNING: fixed ip 10.129.0.55 allocated to missing instance
10.129.0.0/24 10.129.0.55 None None
WARNING: fixed ip 10.129.0.56 allocated to missing instance
10.129.0.0/24 10.129.0.56 None None
WARNING: fixed ip 10.129.0.57 allocated to missing instance
10.129.0.0/24 10.129.0.57 None None
--
WARNING: fixed ip 10.129.0.59 allocated to missing instance
10.129.0.0/24 10.129.0.59 None None
WARNING: fixed ip 10.129.0.60 allocated to missing instance
10.129.0.0/24 10.129.0.60 None None
WARNING: fixed ip 10.129.0.61 allocated to missing instance
10.129.0.0/24 10.129.0.61 None None
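For reference, this is how I pulled just the stuck addresses out of that output (the IP is field 4 of each WARNING line; adjust the field number if your output format differs):

```shell
# List only the addresses flagged as allocated to missing instances
nova-manage fixed list 2>&1 \
  | awk '/WARNING: fixed ip .* allocated to missing instance/ {print $4}'
```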
I know that 10.129.0.20 is the IP from the VM instantiation that started the problems. The issue manifests as a failure to provision new VMs.
[root@controller-20 ~]# nova --os-username demetri --os-tenant-name admin --os-auth-url http://localhost:5000/v2.0/ fixed-ip-get 10.129.0.20
OS Password:
+-------------+---------------+----------+-----------------------+
| address | cidr | hostname | host |
+-------------+---------------+----------+-----------------------+
| 10.129.0.20 | 10.129.0.0/24 | devdbl9 | compute-21.domain.tld |
+-------------+---------------+----------+-----------------------+
The nova-manage commands don't seem to offer any tool to reclaim these IPs. I've tried reserve/unreserve, but that doesn't do the trick. These IPs are also represented in a nova MySQL table called fixed_ips. Example:
+---------------------+---------------------+------------+-----+--------------+------------+-----------+--------+----------+----------------------+-----------------------+--------------------------------------+---------+
| created_at          | updated_at          | deleted_at | id  | address      | network_id | allocated | leased | reserved | virtual_interface_id | host                  | instance_uuid                        | deleted |
+---------------------+---------------------+------------+-----+--------------+------------+-----------+--------+----------+----------------------+-----------------------+--------------------------------------+---------+
| 2013-08-05 11:10:19 | 2013-10-16 11:32:20 | NULL       | 21  | 10.129.0.20  | 1          | 0         | 0      | 0        | NULL                 | NULL                  | df2e9214-78cf-49d3-b256-e35d48818f29 | 0       |
+---------------------+---------------------+------------+-----+--------------+------------+-----------+--------+----------+----------------------+-----------------------+--------------------------------------+---------+
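I was tempted to clear the stale instance association directly in MySQL with something like the following (not yet run; the UUID is the one from my row above, and I'd back up the nova database and quiesce the nova services first), but I'm wary of corrupting state by hand-editing the DB:

```sql
-- Candidate cleanup: detach the stale instance reference so the
-- allocator can hand the address out again. Back up the nova DB first.
UPDATE fixed_ips
   SET instance_uuid = NULL,
       allocated = 0,
       leased = 0
 WHERE address = '10.129.0.20'
   AND instance_uuid = 'df2e9214-78cf-49d3-b256-e35d48818f29';
```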
Further confirming that the problem relates to fixed-IP networking, the UI shows the VM's address incrementing, say starting at .21, going to .22, then .23, before the instance ultimately fails with state "ERROR".
All this by way of saying: since this started happening, most (but not all) attempts to launch a new VM fail. How can I troubleshoot this further, and ultimately how can I get back to smoothly provisioning new VMs?
Thanks.
I was able to track this down to a buggy/faulty install of rabbitmq; its logs had started filling with errors.
I upgraded from the installed package, rabbitmq-server-2.8.7-2.el6.noarch.rpm, to the package hosted on the rabbitmq site, rabbitmq-server-3.2.0-1.noarch.rpm. Now I can provision nodes successfully!
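For anyone in the same spot, the upgrade was roughly the following package swap (a sketch, not an exact transcript; service names are taken from the openstack-nova package list above, and you should fetch and verify the RPM from rabbitmq.com yourself):

```shell
# Stop the broker, swap the package, restart, then bounce the nova
# services so they reconnect to the new rabbitmq.
service rabbitmq-server stop
yum remove -y rabbitmq-server
# download rabbitmq-server-3.2.0-1.noarch.rpm from www.rabbitmq.com first
yum install -y ./rabbitmq-server-3.2.0-1.noarch.rpm
service rabbitmq-server start

for svc in api scheduler conductor cert console; do
    service openstack-nova-$svc restart
done
```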