I have a Dell server running VMware ESX, with 12 TB of local SSD storage, 1 TB of memory, a Xeon Gold processor, and a single Debian VM.
On that VM, when I perform simultaneous disk writes, or even just run the following command:
dd if=/dev/urandom of=/local/ssd/drive/path/largefile bs=1M count=1024
I get a critical disk latency alert in vSphere for that VM.
The dd command finished successfully after 10 minutes.
Why does vSphere trigger a critical alert for something that does not seem critical?
How is it possible to overwhelm high-end SSD drives with a single dd command?
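For reference, here is a variant of the test I could run to rule out the guest page cache and the CPU cost of reading /dev/urandom (just a sketch, not run yet; same placeholder path as above):

# Bypass the guest page cache so the measured time reflects device latency,
# and read zeros so the random generator is not the bottleneck
dd if=/dev/zero of=/local/ssd/drive/path/largefile bs=1M count=1024 oflag=direct

# Alternative: keep the cache, but include the final flush in the timing
dd if=/dev/zero of=/local/ssd/drive/path/largefile bs=1M count=1024 conv=fdatasync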
EDIT 1:
The critical alert is triggered if the latency exceeds 75 ms over a five-minute period.
In practice, the disk latency seems to be around 200-250 ms for that VM.
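To cross-check the value vSphere reports from inside the guest, I could watch the per-device write latency with iostat (a sketch; it assumes the sysstat package is installed and that the SSD-backed virtual disk shows up as sdb, which may differ):

# Extended device stats every 5 seconds; the w_await column is the
# average write latency in ms as seen by the guest
iostat -x sdb 5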
EDIT 2:
- Provisioning: thick, lazy zeroed (unfortunately not eager zeroed)
EDIT 3:
I tried setting an IOPS limit on that disk at the VM level (as you can see in the graph below).
I tried 1000 IOPS, then 800, 600, 400, 200, and 100. The critical disk latency alert is triggered even at 100 IOPS.
What is strange (as you can see in the graph) is that decreasing the limit (from 1000 IOPS to 100 IOPS) tends to increase the disk latency reported by vSphere. With a 100 IOPS limit, the latency is 16,000 ms.
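To see whether the same queuing effect shows up inside the guest, I could throttle a synthetic workload with fio (a sketch; it assumes fio is available, and the path, sizes and rate are only examples):

# 4 writers capped at 25 IOPS each (~100 total); fio's completion latency (clat)
# should grow as the cap forces requests to wait in the queue
fio --name=throttled --directory=/local/ssd/drive/path \
    --rw=write --bs=1M --size=256M --direct=1 \
    --ioengine=libaio --iodepth=4 --numjobs=4 \
    --rate_iops=25 --group_reporting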
EDIT 4:
On the software side, I tried reducing the maximum number of simultaneous file writes from 24 to 4. The latency went from 200 ms to 100 ms, but the write bandwidth dropped from 100 MB/s to 50 MB/s.
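The same trade-off should be reproducible with a synthetic test (again a sketch assuming fio; --numjobs mirrors the 24 vs 4 writers in the application):

# 24 parallel 1 MiB sequential writers, similar to the application's behaviour;
# rerun with --numjobs=4 to compare completion latency against total bandwidth
fio --name=parallel-writers --directory=/local/ssd/drive/path \
    --rw=write --bs=1M --size=256M --direct=1 \
    --ioengine=libaio --iodepth=1 --numjobs=24 --group_reporting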
EDIT 5:
Switching from thick lazy zeroed provisioning to thick eager zeroed has not changed anything regarding the latency, which is still around 200 ms.