I'm trying to set up two machines synchronizing with DRBD. The storage is layered as follows: PV -> LVM -> DRBD -> CLVM -> GFS2.
DRBD is set up in dual-primary mode. The first server is up and running fine as primary, and its drives already have data on them. I've set up the second server and I'm trying to bring up the DRBD resource. I created all the base LVs to match the first server. After initializing the metadata with
drbdadm create-md storage
I'm bringing up the resource by issuing
drbdadm up storage
After issuing that command, I get a kernel panic and the server reboots about 30 seconds later. Here's a screen capture.
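The crash happens before I can check the resource state at all; normally I'd expect to watch it connect and start syncing from the peer with:

cat /proc/drbd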
My configuration is as follows:
OS: CentOS 6
uname -a
Linux host.structuralcomponents.net 2.6.32-279.5.2.el6.x86_64 #1 SMP Fri Aug 24 01:07:11 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
rpm -qa | grep drbd
kmod-drbd84-8.4.1-2.el6.elrepo.x86_64
drbd84-utils-8.4.1-2.el6.elrepo.x86_64
cat /etc/drbd.d/global_common.conf
global {
    usage-count yes;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        become-primary-on both;
        wfc-timeout 30;
        degr-wfc-timeout 10;
        outdated-wfc-timeout 10;
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }
    net {
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
        protocol C;
        allow-two-primaries yes;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
}
cat /etc/drbd.d/storage.res
resource storage {
    device /dev/drbd0;
    meta-disk internal;

    on host.structuralcomponents.net {
        address 10.10.1.120:7788;
        disk /dev/vg_storage/lv_storage;
    }

    on host2.structuralcomponents.net {
        address 10.10.1.121:7788;
        disk /dev/vg_storage/lv_storage;
    }
}
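Both nodes have identical copies of these files. As a sanity check, drbdadm can re-parse the configuration and print the effective settings; running this on each node and diffing the output confirms the two sides agree:

drbdadm dump storage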
/var/log/messages is not logging anything about the crash.
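Since nothing reaches disk before the reboot, my next step is to try capturing the panic over the network with netconsole (a sketch: the log host 10.10.1.125 and the MAC address are made up; the parameter format is src-port@src-ip/dev,dst-port@dst-ip/dst-mac):

modprobe netconsole netconsole=6665@10.10.1.120/eth0,6666@10.10.1.125/00:25:90:aa:bb:cc

# on the log host, collect the UDP stream (exact syntax varies by netcat flavor)
nc -u -l 6666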
I've been trying to find the cause of this but have come up with nothing. Can anyone help me out? Thanks.
A machine check exception (MCE) indicates a hardware failure. You can use
mcelog
to interpret it, if you can boot the system. The resolution is to replace the failing hardware. Since it looks like you're most likely leasing the server, contact the provider.
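For example, assuming you can transcribe the MCE lines from your screen capture into a text file (mce.txt is just a placeholder name):

yum install mcelog
# decode a machine check record captured as text (e.g. from the panic screen)
mcelog --ascii < mce.txt
# or, on a system that still boots, read any pending records from /dev/mcelog
mcelog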
Looks like the kernel panic was caused by the network adapter. The server was set up with a dedicated NIC for the DRBD traffic. When I switched the DRBD traffic onto the onboard NICs, the crashes stopped. I'll report back if I find a better explanation for why this was happening (other traffic over that interface seems to be working fine).
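For reference, the change was just re-pointing the resource at the onboard interfaces and re-applying the config; the 10.10.2.x addresses below stand in for whatever your onboard NICs actually use:

# /etc/drbd.d/storage.res (identical on both nodes)
on host.structuralcomponents.net {
    address 10.10.2.120:7788;
    disk /dev/vg_storage/lv_storage;
}
on host2.structuralcomponents.net {
    address 10.10.2.121:7788;
    disk /dev/vg_storage/lv_storage;
}

# then, on each node, apply the change without a full restart
drbdadm adjust storage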