We have just installed a cluster of 6 Proxmox servers, using 3 nodes as Ceph storage, and 3 nodes as compute nodes.
We are experiencing strange and critical issues with the performance and stability of our cluster.
VMs and the Proxmox web interface tend to hang for no obvious reason, from a few seconds to a few minutes, whether we access them via SSH, RDP or the VNC console directly. Even the Proxmox hosts themselves seem to become unreachable, as can be seen in this monitoring capture. This also causes Proxmox cluster issues, with some servers falling out of sync.
For instance, when pinging between host nodes, it works perfectly for a few pings, hangs, carries on (without any increase in round-trip time - still <1 ms), hangs again, and so on.
We had some performance issues initially, but those were fixed by raising the NICs' MTU to 9000 (+1300% read/write improvement). Now we need to make all of this stable, because right now it is not production ready.
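For reference, the MTU change itself is a one-liner per node; the interface name and peer address below are placeholders, not our real ones. The second command is a quick way to confirm that jumbo frames actually pass end-to-end (a switch that is not configured for them will silently drop the oversized frames):

# raise the MTU on the 10 Gbps cluster NIC (interface name is an example)
ip link set dev enp3s0f0 mtu 9000
# verify jumbo frames end-to-end: 8972 = 9000 - 20 (IP header) - 8 (ICMP header)
ping -M do -s 8972 <peer cluster IP>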
Hardware configuration
We have a network architecture similar to the one described in Ceph's official documentation, with a 1 Gbps public network and a 10 Gbps cluster network. These are connected to two physical network cards in each of the 6 servers.
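To make the layout concrete, a per-node /etc/network/interfaces looks roughly like this (interface names are placeholders and the addresses follow the anonymized ranges used further down; on the compute nodes the public NIC actually sits behind the usual vmbr0 bridge for the VMs):

# 1 Gbps NIC on the public network (Proxmox management, Ceph public traffic)
auto eno1
iface eno1 inet static
    address 1.2.3.100/30

# 10 Gbps NIC on the Ceph cluster (internal) network, jumbo frames enabled
auto enp3s0f0
iface enp3s0f0 inet static
    address 192.168.0.101/30
    mtu 9000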
Storage server nodes:
- CPU: Xeon E-2136 (6 cores, 12 threads), 3.3 GHz, Turbo 4.5 GHz
- RAM: 16 GB
- Storage:
  - 2x 256 GB NVMe in RAID 1, LVM:
    - system root logical volume: 15 GB (~55% free)
    - swap: 7.4 GB
    - WAL for OSD2: 80 GB
  - 4 TB SATA SSD (OSD1)
  - 12 TB SATA HDD (OSD2)
- Network interface controllers:
  - Intel Corporation I350 Gigabit: connected to the public 1 Gbps network
  - Intel Corporation 82599 10 Gigabit: connected to the 10 Gbps cluster (internal) network
Compute server nodes:
- CPU: Xeon E-2136 (6 cores, 12 threads), 3.3 GHz, Turbo 4.5 GHz
- RAM: 64 GB
- Storage:
  - 2x 256 GB SATA SSD in RAID 1:
    - system root logical volume: 15 GB (~65% free)
Software (on all 6 nodes):
- Proxmox 7.0-13, installed on top of Debian 11
- Ceph v16.2.6, installed with Proxmox GUI
- Ceph Monitor on each storage node
- Ceph Manager on storage nodes 1 and 3
Ceph configuration
ceph.conf of the cluster:
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 192.168.0.100/30
fsid = 97637047-5283-4ae7-96f2-7009a4cfbcb1
mon_allow_pool_delete = true
mon_host = 1.2.3.100 1.2.3.101 1.2.3.102
ms_bind_ipv4 = true
ms_bind_ipv6 = false
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 1.2.3.100/30
[client]
keyring = /etc/pve/priv/$cluster.$name.keyring
[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring
[mds.asrv-pxdn-402]
host = asrv-pxdn-402
mds_standby_for_name = pve
[mds.asrv-pxdn-403]
host = asrv-pxdn-403
mds_standby_for_name = pve
[mon.asrv-pxdn-401]
public_addr = 1.2.3.100
[mon.asrv-pxdn-402]
public_addr = 1.2.3.101
[mon.asrv-pxdn-403]
public_addr = 1.2.3.102
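One detail worth noting from this config: only the OSDs get an address on the cluster network; the monitors, managers and MDS bind exclusively on the public network. A quick way to see what each daemon actually listens on (run on a storage node, standard iproute2 tooling):

# list the listening sockets of the Ceph daemons
ss -tlnp | grep ceph
# ceph-mon should only appear on 1.2.3.x (ports 3300/6789),
# while each ceph-osd listens on both 1.2.3.x and 192.168.0.x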
Questions:
- Is our architecture correct?
- Should the Ceph Monitors and Managers be accessed through the public network? (Which is what Proxmox's default configuration gave us)
- Does anyone know where these disturbances/instabilities come from and how to fix them?
[edit]
- Is it correct to use a default pool size of 3 when you have 3 storage nodes? I was initially tempted to use 2, but couldn't find similar examples and decided to use the default config.
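For context on that last question, this is how the pool settings can be read back (the pool name is an example). With size 3 and min_size 2 across 3 storage hosts, each host holds one replica, so losing one host keeps I/O running while losing two pauses it:

# show size / min_size (and more) for every pool
ceph osd pool ls detail
# or per pool
ceph osd pool get vm-pool size
ceph osd pool get vm-pool min_size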
Noticed issues
- We noticed that arping somehow returns replies from two MAC addresses (the public NIC's and the private NIC's), which doesn't make any sense since these are separate NICs, each connected to a separate switch. This may be part of the network issue.
- A backup task on one of the VMs (to a physically remote Proxmox Backup Server) somehow seems to affect the cluster: the VM gets stuck in backup/locked mode, even though the backup itself appears to have finished properly (it is visible and accessible on the backup server).
- Since that first backup issue, Ceph has been trying to rebuild itself but hasn't managed to. It is in a degraded state, indicating that it lacks an MDS daemon. However, I double-checked and there are working MDS daemons on storage nodes 2 & 3 (see the check after the status output below). It was rebuilding itself until it got stuck in this state.
Here's the status:
root@storage-node-2:~# ceph -s
  cluster:
    id:     97637047-5283-4ae7-96f2-7009a4cfbcb1
    health: HEALTH_WARN
            insufficient standby MDS daemons available
            Slow OSD heartbeats on back (longest 10055.902ms)
            Slow OSD heartbeats on front (longest 10360.184ms)
            Degraded data redundancy: 141397/1524759 objects degraded (9.273%), 156 pgs degraded, 288 pgs undersized

  services:
    mon: 3 daemons, quorum asrv-pxdn-402,asrv-pxdn-401,asrv-pxdn-403 (age 4m)
    mgr: asrv-pxdn-401(active, since 16m)
    mds: 1/1 daemons up
    osd: 6 osds: 4 up (since 22h), 4 in (since 21h)

  data:
    volumes: 1/1 healthy
    pools:   5 pools, 480 pgs
    objects: 691.68k objects, 2.6 TiB
    usage:   5.2 TiB used, 24 TiB / 29 TiB avail
    pgs:     141397/1524759 objects degraded (9.273%)
             192 active+clean
             156 active+undersized+degraded
             132 active+undersized
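For reference, the MDS double-check mentioned above boils down to something like this (hostnames are the ones from the ceph.conf above; the systemctl commands are run on the respective nodes):

# CephFS / MDS state as Ceph sees it
ceph fs status
ceph mds stat
# daemon state on the two nodes that carry an MDS
systemctl status ceph-mds@asrv-pxdn-402
systemctl status ceph-mds@asrv-pxdn-403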
[edit 2]
root@storage-node-2:~# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME               STATUS  REWEIGHT  PRI-AFF
-1         43.65834  root default
-3         14.55278      host asrv-pxdn-401
 0    hdd  10.91409          osd.0               up   1.00000  1.00000
 3    ssd   3.63869          osd.3               up   1.00000  1.00000
-5         14.55278      host asrv-pxdn-402
 1    hdd  10.91409          osd.1               up   1.00000  1.00000
 4    ssd   3.63869          osd.4               up   1.00000  1.00000
-7         14.55278      host asrv-pxdn-403
 2    hdd  10.91409          osd.2             down         0  1.00000
 5    ssd   3.63869          osd.5             down         0  1.00000
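Both down OSDs are on asrv-pxdn-403. In case it helps, this is roughly how their daemon state and recent logs can be pulled on that node (standard ceph-osd@<id> systemd units):

# on asrv-pxdn-403, where osd.2 and osd.5 are reported down
systemctl status ceph-osd@2 ceph-osd@5
journalctl -u ceph-osd@2 -u ceph-osd@5 --since "24 hours ago"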