The header lists all CPUs/cores and keeps resizing as I go back in time with t and T. I read through the help and tried searching.
How do I hide that header?
The TCP retransmission rate on a host is often a good indicator of network problems. How do I find out the source and destination IPs of the packets that are being retransmitted?
For context, on hosts that have sar installed, one can see the retransmission rates like so:
sar -n ETCP
10:11:02 AM atmptf/s estres/s retrans/s isegerr/s orsts/s
10:12:01 AM 0.07 1.95 0.08 0.00 1.18
10:13:01 AM 0.07 1.30 0.02 0.00 0.83
10:14:01 AM 0.07 1.40 0.02 0.00 0.85
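To actually see which flows the retransmissions belong to, I have been thinking of something along these lines (a rough sketch; eth0 is a placeholder for the busy interface, and depending on the tshark version the display-filter flag is -Y or -R):
# capture live and print only segments that Wireshark's analysis flags as retransmissions
tshark -i eth0 -Y 'tcp.analysis.retransmission' -T fields -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport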
The current HBase stable release is hbase-0.90.4; what version(s) of HDFS is it compatible with?
On a Solaris / OpenIndiana NFS server, is there a way to get per-client stats?
On our cluster we would sometimes have nodes go down when a new process would request too much memory. I was puzzled why the OOM killer does not just kill the guilty process.
The reason turned out to be that some processes get an oom_adj of -17. That makes them off-limits for the OOM killer (unkillable!).
I can clearly see that with the following script:
#!/bin/bash
# List processes whose oom_adj is non-zero (e.g. -17, which the OOM killer will not touch).
# Matching the value 0 as a whole word avoids dropping PIDs that merely contain a 0.
for i in $(grep -vw 0 /proc/[0-9]*/oom_adj | awk -F/ '{print $3}'); do
    ps -p "$i" | grep -v CMD
done
OK, it makes sense for sshd, udevd, and dhclient, but then I see regular user processes get -17 as well. Once such a user process causes an OOM event it will never get killed. This makes the OOM killer go insane: NFS rpc.statd, cron, everything that happens not to be at -17 gets wiped out. As a result the node goes down.
I have Debian 6.0 (Linux 2.6.32-3-amd64).
Does anyone know where to control the -17 oom_adj assignment behaviour?
Could launching sshd and Torque mom from /etc/rc.local be causing the overprotective behaviour?
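As a stopgap I am considering resetting the score for a user's processes by hand, roughly like this (just a sketch; USER is a placeholder, and this obviously does not explain where the -17 comes from):
# reset oom_adj back to the default of 0 for every process owned by USER
for pid in $(pgrep -u USER); do
    echo 0 > /proc/$pid/oom_adj
done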
On a misconfigured or buggy network filer (NFS NAS), writing a large file can cause the filer to freeze.
For diagnostics I need to be able to temporarily suspend all of a user's processes and later resume them.
Basically, like a kill -s SIGSTOP and kill -s SIGCONT, but for the entire user.
To do that, is there a way to temporarily take away all CPU time from a user in Linux?
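The closest I have come up with is signalling every process owned by the user (a sketch; USER is a placeholder):
pkill -STOP -u USER    # freeze everything the user owns
# ... poke at the filer ...
pkill -CONT -u USER    # let the user's processes continue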
This question is related to NexentaStor vs FreeNAS and Is FreeNAS reliable?
I have been using OpenIndiana / Illumos as the OS for my self-built NAS.
There is nothing much to it:
I also wrote a few Bash scripts, cronned to run every minute, that write down the output of zfs get all to the shared filesystem so that I can monitor things like disk usage, compression ratio, and dedup ratio on the client side.
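For reference, the cron side amounts to something like this (a sketch; the pool name and output path are placeholders):
# crontab entry: dump all ZFS properties to a file on the shared filesystem once a minute
* * * * * /usr/sbin/zfs get all tank > /tank/share/zfs-properties.txt 2>&1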
I don't need any other features.
How will FreeNAS compare to OpenSolaris in terms of speed, driver availability, and robustness?
I have NFS shared among 30 cluster nodes. The nodes run Debian 5 and 6. The NFS server is OpenSolaris 2009. We have good hardware and a 20Gbit InfiniBand network.
On the cluster nodes, fs operations are snappy but not when it comes to:
Rscript <(echo "library(GOstats)")
They all get stuck for a few minutes on one of the following system calls:
fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1073741824, len=1})
or
fcntl(3, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1})
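To confirm it is really this lock call that blocks, and for how long, something like the following strace invocation might help (a sketch; strace is assumed to be installed on the Debian nodes):
# follow child processes and print relative timestamps for every fcntl call
strace -f -r -e trace=fcntl Rscript <(echo "library(GOstats)")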
What could be the cause? How do I diagnose and fix it?
Would switching the NFS server to OpenIndiana oi_148 fix it?
I have been exporting NFS from OpenSolaris like this (successfully):
zfs set sharenfs=root=rw=host1:host2:host3 pool1
I'm acting according to the sharefs and share_nfs man pages, but the following does not work:
zfs set sharenfs=root=rw=host1:host2:host3,ro=host4 pool1
All hosts lose access permission.
How can I share to some hosts as read/write and to some as read only?
On a Linux DHCP server I'm getting a bunch of these log lines:
dhcpd: DHCPDISCOVER from 00:30:48:fe:5c:9c via eth1: network 192.168.2.0/24: no free leases
I don't have any machines with 00:30:48:fe:5c:9c and I don't intend to give out an IP to 00:30:48:fe:5c:9c (whatever that could be).
I tracked down the server that this is coming from and killed all the DHCP clients that were running, but the DHCPDISCOVER requests do not stop.
I can prove that this is the sending server by pulling the Ethernet cable - the requests stop.
The strange thing is that the sending server only has two Ethernet interfaces, with MACs 00:30:48:fe:5c:9a and 00:30:48:fe:5c:9b (see the output below).
What could be the cause of the off-by-one MAC address? Who could be sending the requests?
My DHCP client is the default in Debian 6.0 (Squeeze) http://packages.debian.org/squeeze/isc-dhcp-client
On the DHCP client host:
root@n34:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 100
link/ether 00:30:48:fe:5c:9a brd ff:ff:ff:ff:ff:ff
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
link/ether 00:30:48:fe:5c:9b brd ff:ff:ff:ff:ff:ff
4: ib0: <BROADCAST,MULTICAST> mtu 2044 qdisc noop state DOWN qlen 256
link/infiniband 80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:08:81:9f brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
5: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 256
link/infiniband 80:00:00:49:fe:80:00:00:00:00:00:00:00:02:c9:03:00:08:81:a0 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
On the DHCP client host (same info as above):
root@n34:~# ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:30:48:fe:5c:9a
inet addr:192.168.2.234 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::230:48ff:fefe:5c9a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:72544 errors:0 dropped:0 overruns:0 frame:0
TX packets:152773 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:4908592 (4.6 MiB) TX bytes:89815782 (85.6 MiB)
Memory:dfd60000-dfd80000
eth1 Link encap:Ethernet HWaddr 00:30:48:fe:5c:9b
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Memory:dfde0000-dfe00000
ib0 Link encap:UNSPEC HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
BROADCAST MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
ib1 Link encap:UNSPEC HWaddr 80-00-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
inet addr:192.168.3.234 Bcast:192.168.3.255 Mask:255.255.255.0
inet6 addr: fe80::202:c903:8:81a0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:1330 errors:0 dropped:0 overruns:0 frame:0
TX packets:255 errors:0 dropped:5 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:716415 (699.6 KiB) TX bytes:17584 (17.1 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:560 (560.0 B) TX bytes:560 (560.0 B)
The nodes were imaged with Perseus which uses kexec instead of rebooting.
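To see the rogue DISCOVERs on the wire and which MAC actually emits them, a capture on the DHCP server along these lines should work (a sketch; eth1 is the interface the dhcpd log line points at):
# print DHCP traffic with link-layer (MAC) headers and no name resolution
tcpdump -i eth1 -e -n 'udp port 67 or udp port 68'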
I have the following NFS-based storage setup:
Compute nodes are Linux. The NFS servers are Solaris.
A not-so-important user runs a bunch of read-intensive jobs on a subset of the compute nodes. As a result, the whole group of compute nodes becomes very slow (ls blocks for 30 seconds). I was able to track this down to the dedicated NFS server hitting the limit of the SAN's read throughput.
How can I implement quality of service (QoS) that limits NFS bandwidth per node, process, or user?
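One direction I have been considering is plain Linux traffic shaping on the compute nodes, roughly like this (a rough sketch; eth0, the rates, and the NFS port are assumptions, and it only shapes traffic sent towards the server, so limiting read throughput would still need policing on ingress or shaping on the server side):
# cap this node's traffic to the NFS port at 100 Mbit/s, everything else gets the full 1 Gbit/s
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 1gbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 100mbit
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dport 2049 0xffff flowid 1:20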
Rsyslog is backwards-compatible with Syslog configuration files.
The syslog.conf man page has:
You may prefix each entry with the minus ``-'' sign to omit syncing the file after every logging. Note that you might lose information if the system crashes right behind a write attempt. Nevertheless this might give you back some performance, especially if you run programs that use logging in a very verbose manner.
but I could not find anything about the - sign in man rsyslog.conf.
What does rsyslog do when it reads a - in the config file?
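For concreteness, this is the kind of entry I mean (the path is just the stock Debian example):
# the leading "-" means: do not sync the file after every write
mail.*    -/var/log/mail.log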
When a server gets rooted (e.g. a situation like this), one of the first things you may decide to do is containment. Some security specialists advise not to start remediation immediately and to keep the server online until forensics are completed. That advice usually applies to APTs; it's different if you have an occasional script-kiddie breach, where you may decide to remediate (fix things) early. One of the steps in remediation is containment of the server. Quoting from Robert Moir's answer: "disconnect the victim from its muggers".
A server can be contained by pulling the network cable or the power cable.
Which method is better?
Taking into consideration the need for:
Edit: 5 assumptions
Assuming:
There is a Xen vs. KVM performance question on ServerFault.
What will be the speed difference if the choice is between Xen and OpenVZ?
Searching for such benchmarks does not show any results newer than 2008.
What would be some important performance measurements to compare OpenVZ against Xen?
Some may say "you're comparing oranges and pineapples", but I have to choose one of the two and it needs to be a wise choice. Performance is most important to us. We may switch away from OpenVZ because Xen is more ubiquitous, but only if the performance overhead is not significant. Next month (January 2011) I'm thinking of doing my own performance comparison - here is the project planning blog.
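To make the planned comparison concrete, these are the kinds of micro-benchmarks I have in mind, run inside an OpenVZ container and a Xen domU on the same hardware (a sketch; sysbench is an assumption, any similar tool would do):
# raw CPU throughput
sysbench --test=cpu --cpu-max-prime=20000 run
# random read/write file I/O on an 8 GB working set
sysbench --test=fileio --file-total-size=8G prepare
sysbench --test=fileio --file-total-size=8G --file-test-mode=rndrw run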
We have a Fibre Channel SAN managed by two OpenSolaris 2009.06 NFS servers.
The slow 50TB server was installed a few months ago and was working fine. Users filled up 2TB. I did a small experiment (created 1000 filesystems, with 24 snapshots on each). Everything went well as far as creating and accessing the filesystems with snapshots, and NFS-mounting a few of them.
When I tried destroying the 1000 filesystems, the first fs took several minutes and then failed, reporting that the fs was in use. I issued a system shutdown, but it took more than 10 minutes. I did not wait longer and shut the power off.
Now when booting, OpenSolaris hangs. The lights on the 32 drives are blinking rapidly. I left it for 24 hours - still blinking but no progress.
I booted into a system snapshot taken before the zpool was created and tried importing the zpool.
pfexec zpool import bigdata
Same situation: LEDs blinking and the import hangs forever.
Dtracing the "zpool import" process shows only the ioctl system call:
dtrace -n 'syscall:::entry /pid == 31337/ { @syscalls[probefunc] = count(); }'
ioctl 2499
Is there a way to fix this?
Edit: Yes. Upgrading OpenSolaris to snv_134b did the trick:
pkg publisher # shows opensolaris.org
beadm create opensolaris-updated-on-2010-12-17
beadm mount opensolaris-updated-on-2010-12-17 /mnt
pkg -R /mnt image-update
beadm unmount opensolaris-updated-on-2010-12-17
beadm activate opensolaris-updated-on-2010-12-17
init 6
Now I have zfs version 3. Bigdata zpool stays at version 14. And it's back in production!
But what was it doing with all that heavy I/O for more than 24 hours (before the software upgrade)?
Looks like Debian 6.0 (Squeeze) will be supporting ZFS via the official GNU/kFreeBSD kernel.
This opens up the possibility of converting our Debian GNU/Linux cluster's dedicated NAS server from OpenSolaris 2009.06 to Debian. The server connects to the SAN via a Fibre Channel HBA and to the LAN via an InfiniBand HBA. It would probably be pretty hard to get the drivers to work on kFreeBSD.
Supposing all the drivers actually work, would this be a stable setup?
For 3 years we have had an LSI SAN with 48 300GB Seagate Cheetah 15K.5 (model ST3300655FC) 3.5-inch drives. About 7 drives have failed in total, the bulk of them recently: six drives since May 2010.
That's a rate of 0.02 (drives failed)/(month)/(drives in array) over the last 6-month period.
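(The arithmetic: 6 failures / (6 months × 48 drives) ≈ 0.02 failures per drive per month, which works out to roughly a 25% annualized failure rate.)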
There is an older SAN from HP running in the same room; I think its drives are 36GB 15K units. Those never failed.
Is it common that 300GB 15K RPM drives start failing at this rate after 3 years?
I have an OpenSolaris 2009.06 server providing ZFS over NFSv3 to a Linux 2.6.26 client. (The problem below does not happen when accessing the files via NFSv4.)
I'm very happy with it: it catches silent data corruption in our LSI SAN, performance is great, and it gives us snapshots, compression, and transaction-log-replay backups. Most importantly, we no longer have the FS caching issues and freezes that occurred on a Linux server.
There is one strange thing: Empty files are inaccessible from the Linux NFS client. When I try to ls, cat, or stat them I get:
stat: cannot stat `/srv/zpools/a/write.lock': Invalid argument
Rsync backups report:
rsync: readlink "/srv/zpools/a/write.lock" failed: Invalid argument (22)
rsync: readlink "/srv/zpools/userX/.netbeans/6.9/var/cache/mavenindex/netbeans/write.lock" failed: Invalid argument (22)
rsync: readlink "/srv/zpools/userX/.netbeans/6.9/var/cache/mavenindex/local/write.lock" failed: Invalid argument (22)
rsync: readlink "/srv/zpools/userX/javaPrograms/mavenProjects/thesis/libbn/target/test-classes/.netbeans_automatic_build" failed: Invalid argument (22)
rsync: readlink "/srv/zpools/userX/javaPrograms/mavenProjects/scalaCommon/target/test-classes/.netbeans_automatic_build" failed: Invalid argument (22)
I cannot reproduce it by creating new empty files; it only happens with some old files.
Can anyone tell what the reason could be?
Edit: On the ZFS server, when stat'ing the strange files I found that the modification time was back in 1927. :) Touching the file on the server fixed the problem on the NFS client.
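To catch any remaining affected files in one pass, something along these lines could work on the server (a sketch; the path is the one from above, and touching resets the mtime to now, which has to be acceptable):
# find files whose mtime is absurdly old (more than ~27 years back) and reset it
find /srv/zpools -mtime +10000 -exec touch {} \;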
What is the fundamental difference between:
InfiniBand Universal I/O Card (e.g. Supermicro AOC-UINF-M2)
and
InfiniBand Host Channel Adapter (e.g. QLogic QLE7240-CK)
Can't both of those do IP-over-IB?
We are looking for a ~32TB external storage solution for two 48-core AMD servers. These will be used for a small Linux OpenVZ cloud running CPU-intensive web servers and data warehousing. Dual-path with automatic failover is pretty much a must. Hopefully the enclosure and the SAS controller would cost around $9k and 16 drives around $4k.
We initially looked at Promise's VTrak E610sD: http://www.promise.com/media_bank/Download%20Bank/Manual/VTrak_E-Class_PM_v3.2.pdf (page 35 shows the topology that we would want)
A colleague suggested Infortrend's EonStor DS S16S-R2240: http://www.infortrend.com/products/models/ESDS%20S16S-R2240
Has anyone had experience with these systems?
What are some alternatives to the above Promise and Infortrend SAS products for a two-server web+db cloud application?
This could be a good option: http://www.raidinc.com/xanadu_230.php
Would something like this also work? https://www.thinkmate.com/System/STX_JE16-0300/14991