I have a bunch of OpenStack VMs running on Grizzly. I need to change their domain, which is currently managed by cloud-init. How do I update the user-data?
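For reference, this is how I check what user-data a running VM currently sees; the metadata URL is the EC2-compatible endpoint, and the /var/lib/cloud path is what my cloud-init version uses, so adjust as needed:

# from inside a guest: what the metadata service serves right now
curl http://169.254.169.254/latest/user-data
# cached copy from the last cloud-init run
cat /var/lib/cloud/instance/user-data.txt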
I'm running OpenStack Grizzly on CentOS, installed by Mirantis Fuel.
[root@controller-20 ~]# cat /etc/redhat-release
CentOS release 6.4 (Final)
[root@controller-20 ~]# rpm -qa | grep -i openstack-nova
openstack-nova-console-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-common-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-scheduler-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-conductor-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-objectstore-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-novncproxy-0.4-8.el6.noarch
openstack-nova-cert-2013.1.1.fuel3.0-mira.2.noarch
openstack-nova-api-2013.1.1.fuel3.0-mira.2.noarch
The topology is currently one controller node and three compute nodes, all running on modern Dell rackmount hardware. I had provisioned roughly 25 VMs today before the issues started.
For some reason, while creating and deleting VMs, a fixed IP got stuck in an indeterminate state. Now I'm having trouble creating new VMs: OpenStack tries to use IPs that it thinks still belong to an old VM and fails to build the new one.
My fixed network is 10.129.0.0/24.
Here's a list of the problem IPs from the nova-manage command line:
# nova-manage fixed list | grep -E 'network|WARNING' -A 1
network IP address hostname host
10.129.0.0/24 10.129.0.0 None None
--
WARNING: fixed ip 10.129.0.20 allocated to missing instance
10.129.0.0/24 10.129.0.20 None None
--
WARNING: fixed ip 10.129.0.23 allocated to missing instance
10.129.0.0/24 10.129.0.23 None None
--
WARNING: fixed ip 10.129.0.25 allocated to missing instance
10.129.0.0/24 10.129.0.25 None None
WARNING: fixed ip 10.129.0.26 allocated to missing instance
10.129.0.0/24 10.129.0.26 None None
WARNING: fixed ip 10.129.0.27 allocated to missing instance
10.129.0.0/24 10.129.0.27 None None
--
WARNING: fixed ip 10.129.0.30 allocated to missing instance
10.129.0.0/24 10.129.0.30 None None
WARNING: fixed ip 10.129.0.31 allocated to missing instance
10.129.0.0/24 10.129.0.31 None None
WARNING: fixed ip 10.129.0.32 allocated to missing instance
10.129.0.0/24 10.129.0.32 None None
WARNING: fixed ip 10.129.0.33 allocated to missing instance
10.129.0.0/24 10.129.0.33 None None
WARNING: fixed ip 10.129.0.34 allocated to missing instance
10.129.0.0/24 10.129.0.34 None None
WARNING: fixed ip 10.129.0.35 allocated to missing instance
10.129.0.0/24 10.129.0.35 None None
WARNING: fixed ip 10.129.0.36 allocated to missing instance
10.129.0.0/24 10.129.0.36 None None
WARNING: fixed ip 10.129.0.37 allocated to missing instance
10.129.0.0/24 10.129.0.37 None None
WARNING: fixed ip 10.129.0.38 allocated to missing instance
10.129.0.0/24 10.129.0.38 None None
WARNING: fixed ip 10.129.0.39 allocated to missing instance
10.129.0.0/24 10.129.0.39 None None
WARNING: fixed ip 10.129.0.40 allocated to missing instance
10.129.0.0/24 10.129.0.40 None None
WARNING: fixed ip 10.129.0.41 allocated to missing instance
10.129.0.0/24 10.129.0.41 None None
WARNING: fixed ip 10.129.0.42 allocated to missing instance
10.129.0.0/24 10.129.0.42 None None
WARNING: fixed ip 10.129.0.43 allocated to missing instance
10.129.0.0/24 10.129.0.43 None None
WARNING: fixed ip 10.129.0.44 allocated to missing instance
10.129.0.0/24 10.129.0.44 None None
WARNING: fixed ip 10.129.0.45 allocated to missing instance
10.129.0.0/24 10.129.0.45 None None
WARNING: fixed ip 10.129.0.46 allocated to missing instance
10.129.0.0/24 10.129.0.46 None None
--
WARNING: fixed ip 10.129.0.48 allocated to missing instance
10.129.0.0/24 10.129.0.48 None None
WARNING: fixed ip 10.129.0.49 allocated to missing instance
10.129.0.0/24 10.129.0.49 None None
WARNING: fixed ip 10.129.0.50 allocated to missing instance
10.129.0.0/24 10.129.0.50 None None
--
WARNING: fixed ip 10.129.0.52 allocated to missing instance
10.129.0.0/24 10.129.0.52 None None
WARNING: fixed ip 10.129.0.53 allocated to missing instance
10.129.0.0/24 10.129.0.53 None None
--
WARNING: fixed ip 10.129.0.55 allocated to missing instance
10.129.0.0/24 10.129.0.55 None None
WARNING: fixed ip 10.129.0.56 allocated to missing instance
10.129.0.0/24 10.129.0.56 None None
WARNING: fixed ip 10.129.0.57 allocated to missing instance
10.129.0.0/24 10.129.0.57 None None
--
WARNING: fixed ip 10.129.0.59 allocated to missing instance
10.129.0.0/24 10.129.0.59 None None
WARNING: fixed ip 10.129.0.60 allocated to missing instance
10.129.0.0/24 10.129.0.60 None None
WARNING: fixed ip 10.129.0.61 allocated to missing instance
10.129.0.0/24 10.129.0.61 None None
I know that the 10.129.0.20 IP marks the VM instantiation that started the problems. The issue manifests itself in a failure to provision new VMs.
[root@controller-20 ~]# nova --os-username demetri --os-tenant-name admin --os-auth-url http://localhost:5000/v2.0/ fixed-ip-get 10.129.0.20
OS Password:
+-------------+---------------+----------+-----------------------+
| address | cidr | hostname | host |
+-------------+---------------+----------+-----------------------+
| 10.129.0.20 | 10.129.0.0/24 | devdbl9 | compute-21.domain.tld |
+-------------+---------------+----------+-----------------------+
The nova-manage commands don't seem to offer any tool to reclaim these IPs; I've tried reserve/unreserve, but that doesn't do the trick. These IPs are also represented in a nova MySQL table called fixed_ips. Example:
+---------------------+---------------------+------------+-----+--------------+------------+-----------+--------+----------+----------------------+-----------------------+--------------------------------------+---------+
| created_at | updated_at | deleted_at | id | address | network_id | allocated | leased | reserved | virtual_interface_id | host | instance_uuid | deleted |
+---------------------+---------------------+------------+-----+--------------+------------+-----------+--------+----------+----------------------+-----------------------+--------------------------------------+---------+
| 2013-08-05 11:10:19 | 2013-10-16 11:32:20 | NULL | 21 | 10.129.0.20 | 1 | 0 | 0 | 0 | NULL | NULL | df2e9214-78cf-49d3-b256-e35d48818f29 | 0 |
To further confirm that the problem relates to fixed-IP networking, the UI shows the VM's IP address incrementing, say from .21 to .22 to .23, before the build ultimately fails with state "ERROR".
All this is by way of saying that, since this started happening, most (but not all) attempts to launch a new VM fail. How can I troubleshoot this further, and ultimately, how can I return to smoothly provisioning new VMs?
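For what it's worth, the only workaround I can think of is clearing the stale rows in the fixed_ips table by hand, something like the sketch below (column names taken from the table dump above; 10.129.0.20 is just the first stuck address). I haven't run it because I'm not sure it's safe:

# review the orphaned rows first (assumes mysql client credentials are set up)
mysql nova -e "SELECT id, address, allocated, leased, instance_uuid FROM fixed_ips WHERE address LIKE '10.129.0.%' AND instance_uuid IS NOT NULL;"
# then, per stuck address, release it back to the pool
mysql nova -e "UPDATE fixed_ips SET instance_uuid = NULL, virtual_interface_id = NULL, allocated = 0, leased = 0 WHERE address = '10.129.0.20';"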
Thanks.
We're experiencing a frustrating problem on our LAN. Periodically, DNS queries to our ISP nameservers time out, forcing a 5 second delay. Even if I bypass /etc/resolv.conf by running dig directly against one of our DNS servers, I still encounter the problem. Here's an example:
mv-m-dmouratis:~ dmourati$ time dig www.google.com @209.81.9.1
; <<>> DiG 9.8.3-P1 <<>> www.google.com @209.81.9.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14473
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 4, ADDITIONAL: 4
;; QUESTION SECTION:
;www.google.com. IN A
;; ANSWER SECTION:
www.google.com. 174 IN A 74.125.239.148
www.google.com. 174 IN A 74.125.239.147
www.google.com. 174 IN A 74.125.239.146
www.google.com. 174 IN A 74.125.239.144
www.google.com. 174 IN A 74.125.239.145
;; AUTHORITY SECTION:
google.com. 34512 IN NS ns2.google.com.
google.com. 34512 IN NS ns1.google.com.
google.com. 34512 IN NS ns3.google.com.
google.com. 34512 IN NS ns4.google.com.
;; ADDITIONAL SECTION:
ns2.google.com. 212097 IN A 216.239.34.10
ns3.google.com. 207312 IN A 216.239.36.10
ns4.google.com. 212097 IN A 216.239.38.10
ns1.google.com. 212096 IN A 216.239.32.10
;; Query time: 8 msec
;; SERVER: 209.81.9.1#53(209.81.9.1)
;; WHEN: Fri Jul 26 14:44:25 2013
;; MSG SIZE rcvd: 248
real 0m5.015s
user 0m0.004s
sys 0m0.002s
Other times, the queries respond instantly, in under 20 ms or so. I've done a packet trace and discovered something interesting: the DNS server is responding, but the client ignores the initial response and sends a second, identical query, which is answered immediately.
See the packet trace linked below; note the identical source port on both queries (62076).
Question: what is causing the first DNS query to fail?
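For anyone who wants to reproduce this, a loop like the one below shows the pattern: most queries return in milliseconds, but the bad ones take about 5 seconds (209.81.9.1 is the ISP resolver from the example above; en0 is an assumed interface name):

# time 50 queries; watch for wall-clock times far above the reported query time
for i in $(seq 1 50); do
  ( time dig www.google.com @209.81.9.1 > /dev/null ) 2>&1 | grep ^real
  sleep 1
done

# in another terminal, capture the DNS traffic for the packet trace
sudo tcpdump -n -i en0 -w dns-delay.pcap host 209.81.9.1 and port 53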
UPDATE
Resources:
Packet trace:
http://www.cloudshark.org/captures/8b1c32d9d015
dtruss (strace for Mac):
https://gist.github.com/dmourati/6115180
Related question on apple.stackexchange.com: Mountain Lion firewall is randomly delaying DNS requests
UPDATE 2
System Software Overview:
System Version: OS X 10.8.4 (12E55)
Kernel Version: Darwin 12.4.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Computer Name: mv-m-dmouratis
User Name: Demetri Mouratis (dmourati)
Secure Virtual Memory: Enabled
Time since boot: 43 minutes
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro10,1
Processor Name: Intel Core i7
Processor Speed: 2.7 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 6 MB
Memory: 16 GB
Firewall Settings:
Mode: Limit incoming connections to specific services and applications
Services:
Apple Remote Desktop: Allow all connections
Screen Sharing: Allow all connections
Applications:
com.apple.java.VisualVM.launcher: Block all connections
com.getdropbox.dropbox: Allow all connections
com.jetbrains.intellij.ce: Allow all connections
com.skype.skype: Allow all connections
com.yourcompany.Bitcoin-Qt: Allow all connections
org.m0k.transmission: Allow all connections
org.python.python: Allow all connections
Firewall Logging: Yes
Stealth Mode: No
I have an IP address assigned as an AWS Elastic IP (EIP). I'd like to know which DNS records in my Route 53 domain point at that address.
In a traditional DNS service, I would run dig -x $EIP and get back a PTR record and be done.
Amazon only allows actual PTR records if you fill out a form.
Otherwise, the PTR records point to amazonaws.com.
The Route 53 API doesn't seem to support the dig -x approach either. Moreover, the data comes back as XML, which makes it a bit challenging to work with from the command line.
So, how can I get this data?
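For context, the kind of thing I'm trying to avoid hand-rolling is a full zone dump filtered client-side, e.g. with the AWS CLI (Z123EXAMPLE is a placeholder hosted zone ID and $EIP holds the address):

# list all record sets in the zone and keep the names whose value matches the EIP
aws route53 list-resource-record-sets --hosted-zone-id Z123EXAMPLE \
  --query "ResourceRecordSets[?ResourceRecords[?Value=='${EIP}']].Name" --output text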
We have a server that acts as a "dropbox" for (outside) users to upload data to us over sftp/ssh. We need to process these files (gpg decrypt, unzip, etc.) as they come in. In the past, we simply processed every file in each user's home directory, regardless of whether we had already processed it. That turned out to be wasteful. I updated (rewrote) our processing script to rely on a mechanism like:
FILESTOPROCESS=$(find -H /home/$CUST -type f -newer /home/$CUST/marker-file)
This, combined with a touch /home/$CUST/marker-file, has worked great, and our workload was dramatically reduced.
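Roughly, the processing loop has this shape (process_upload is a stand-in for the gpg decrypt / unzip steps, and the touch ordering is simplified here):

FILESTOPROCESS=$(find -H /home/$CUST -type f -newer /home/$CUST/marker-file)
for f in $FILESTOPROCESS; do
  process_upload "$f"    # stand-in for gpg --decrypt, unzip, etc.
done
touch /home/$CUST/marker-file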
A day or so ago, we had a configuration issue with our SSH server that temporarily prevented users from uploading files to us. When the script next ran, it overlooked a file that a user had initially failed to upload but then re-uploaded via sftp with the "-p" (preserve timestamp) option. That set the file's timestamps to a day or so older than the marker file, so it was ignored.
I'd like to disallow users from uploading with "-p" so that files are created with current timestamps.
Can I do this in sshd_config?
We use keepalived to manage our Linux Virtual Server (LVS) load balancer. The LVS VIPs are setup to use a FWMARK as configured in iptables.
virtual_server fwmark 300000 {
delay_loop 10
lb_algo wrr
lb_kind NAT
persistence_timeout 180
protocol TCP
real_server 10.10.35.31 {
weight 24
MISC_CHECK {
misc_path "/usr/local/sbin/check_php_wrapper.sh 10.10.35.31"
misc_timeout 30
}
}
real_server 10.10.35.32 {
weight 24
MISC_CHECK {
misc_path "/usr/local/sbin/check_php_wrapper.sh 10.10.35.32"
misc_timeout 30
}
}
real_server 10.10.35.33 {
weight 24
MISC_CHECK {
misc_path "/usr/local/sbin/check_php_wrapper.sh 10.10.35.33"
misc_timeout 30
}
}
real_server 10.10.35.34 {
weight 24
MISC_CHECK {
misc_path "/usr/local/sbin/check_php_wrapper.sh 10.10.35.34"
misc_timeout 30
}
}
}
http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.fwmark.html
[root@lb1 ~]# iptables -L -n -v -t mangle
Chain PREROUTING (policy ACCEPT 182G packets, 114T bytes)
190M 167G MARK tcp -- * * 0.0.0.0/0 w1.x1.y1.4 multiport dports 80,443 MARK set 0x493e0
62M 58G MARK tcp -- * * 0.0.0.0/0 w1.x1.y2.4 multiport dports 80,443 MARK set 0x493e0
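(The mangle rules above correspond to iptables commands along these lines; the VIPs are masked as w1.x1.y1.4 / w1.x1.y2.4 in the output, and 300000 decimal is the 0x493e0 shown in the counters.)

iptables -t mangle -A PREROUTING -d w1.x1.y1.4 -p tcp -m multiport --dports 80,443 -j MARK --set-mark 300000
iptables -t mangle -A PREROUTING -d w1.x1.y2.4 -p tcp -m multiport --dports 80,443 -j MARK --set-mark 300000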
[root@lb1 ~]# ipvsadm -L
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
FWM 300000 wrr persistent 180
-> 10.10.35.31:0 Masq 24 1 0
-> dis2.domain.com:0 Masq 24 3 231
-> 10.10.35.33:0 Masq 24 0 208
-> 10.10.35.34:0 Masq 24 0 0
At the time the realservers were set up, DNS was misconfigured for some hosts in the 10.10.35.0/24 network. We have since fixed the DNS, but the hosts above still show up only by their IP addresses (10.10.35.31, 10.10.35.33, 10.10.35.34).
[root@lb1 ~]# host 10.10.35.31
31.35.10.10.in-addr.arpa domain name pointer dis1.domain.com.
The OS is CentOS 6.3, ipvsadm is ipvsadm-1.25-10.el6.x86_64, the kernel is kernel-2.6.32-71.el6.x86_64, and keepalived is keepalived-1.2.7-1.el6.x86_64.
How can we get ipvsadm -L to list all realservers by their proper hostnames?
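In case it's relevant, this is the quick check I use to confirm reverse resolution works for all four realservers, both via straight DNS (host) and via NSS (getent), which I assume is the path ipvsadm takes when resolving names:

for ip in 10.10.35.31 10.10.35.32 10.10.35.33 10.10.35.34; do
  host $ip          # straight DNS PTR lookup
  getent hosts $ip  # reverse lookup via NSS (/etc/nsswitch.conf)
done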
We run a web application serving web APIs for an increasing number of clients. To start, the clients were generally home, office, or other wireless networks submitting chunked HTTP uploads to our API. We've now branched out into handling more mobile clients. The files range from a few KB to several GB and are broken into smaller chunks and reassembled on our API servers.
Our current load balancing is performed at two layers. First, we use round-robin DNS to advertise multiple A records for our api.company.com address. At each IP, we host a Linux LVS (http://www.linuxvirtualserver.org/) load balancer that looks at the source IP address of a request to decide which API server to hand the connection to. These LVS boxes are configured with heartbeatd to take over external VIPs and internal gateway IPs from one another.
Lately, we've seen two new error conditions.
The first error is that clients oscillate or migrate from one LVS to another mid-upload. This in turn causes our load balancers to lose track of the persistent connection and send the traffic to a new API server, breaking the chunked upload across two or more servers. Our intent was for the round-robin DNS TTL for api.company.com (which we've set to 1 hour) to be honored by the downstream caching nameservers, OS caching layers, and client application layers. This error occurs for approximately 15% of our uploads.
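As a sanity check, one way to see whether a given downstream resolver actually honors the 1-hour TTL is to watch the returned record set and the TTL counting down (8.8.8.8 is just an example resolver here):

dig +noall +answer api.company.com @8.8.8.8
sleep 60
dig +noall +answer api.company.com @8.8.8.8   # same record set, TTL ~60s lower if the cache honors it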
The second error we've seen much less commonly. A client initiates traffic to an LVS box and is routed to realserver A behind it. Thereafter, the client comes in from a new source IP address, which the LVS box does not recognize, and its ongoing traffic is routed to realserver B, also behind that LVS.
Given our architecture as described in part above, I'd like to hear about people's experiences with a better approach that would let us handle each of these error cases more gracefully.
Edit 5/3/2010:
This looks like what we need. Weighted GSLB hashing on the source IP address.