We have four Red Hat boxes (Dell PowerEdge R630; call them a, b, c, d) running the following OS/packages:
Red Hat EL 6.5, MySQL Enterprise 5.6, DRBD 8.4, Corosync 1.4.7
We have set up 4-way stacked DRBD resources as below:
Cluster-1: servers a and b, connected to each other over the local LAN. Cluster-2: servers c and d.
Cluster-1 and Cluster-2 are linked through stacked DRBD using virtual IPs and sit in different data centres.
The drbd0 devices (1 GB each) have been created locally on every server and are in turn stacked under drbd10.
The overall setup comprises four layers: Tomcat frontend application -> RabbitMQ -> Memcached -> MySQL/DRBD.
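For reference, this is roughly how the layering can be verified on a node (a minimal sketch; device and resource names are taken from the configs below, exact output will differ):
cat /proc/drbd                     # connection and disk state of both DRBD devices
drbd-overview                      # summary view shipped with drbd-utils
lsblk /dev/drbd0 /dev/drbd10       # confirm drbd10 really sits on top of drbd0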
We are experiencing very high disk I/O even though there is not much activity at the moment. Traffic will increase in a couple of weeks, so we are worried this will have a very bad impact on performance. I/O usage goes high only on the stacked sites (90% and above at times); the secondary sites do not have this issue. Sometimes the usage goes high even when the application is idle.
So kindly share any advice/tuning guidelines that will help us resolve the issue.
resource clusterdb {
protocol C;
handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
}
startup {
degr-wfc-timeout 120; # 2 minutes.
outdated-wfc-timeout 2; # 2 seconds.
}
disk {
on-io-error detach;
no-disk-barrier;
no-md-flushes;
}
net {
cram-hmac-alg "sha1";
shared-secret "clusterdb";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
}
syncer {
rate 10M;
al-extents 257;
on-no-data-accessible io-error;
}
on server-1 {
device /dev/drbd0;
disk /dev/sda2;
address 10.170.26.28:7788;
meta-disk internal;
}
on server-2 {
device /dev/drbd0;
disk /dev/sda2;
address 10.170.26.27:7788;
meta-disk internal;
}
}
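For completeness, bringing up this lower-level resource follows the standard drbdadm workflow (a sketch, not the exact commands from our runbook; metadata is internal as per the config above):
drbdadm create-md clusterdb        # on both nodes: write internal metadata on /dev/sda2
drbdadm up clusterdb               # on both nodes: attach the disk and connect to the peer
cat /proc/drbd                     # both sides should reach Connected/UpToDate
drbdadm primary --force clusterdb  # on one node only, for the very first promotion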
The stacked configuration:
resource clusterdb_stacked {
protocol A;
handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
}
startup {
degr-wfc-timeout 120; # 2 minutes.
outdated-wfc-timeout 2; # 2 seconds.
}
disk {
on-io-error detach;
no-disk-barrier;
no-md-flushes;
}
net {
cram-hmac-alg "sha1";
shared-secret "clusterdb";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
}
syncer {
rate 10M;
al-extents 257;
on-no-data-accessible io-error;
}
stacked-on-top-of clusterdb {
device /dev/drbd10;
address 10.170.26.28:7788;
}
stacked-on-top-of clusterdb_DR {
device /dev/drbd10;
address 10.170.26.60:7788;
}
}
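The stacked resource is then handled with drbdadm's --stacked switch on whichever node is primary for the lower resource (again a sketch under the same assumptions):
drbdadm primary clusterdb                            # the node that will carry the stacked device
drbdadm --stacked create-md clusterdb_stacked        # metadata for the upper layer
drbdadm --stacked up clusterdb_stacked
drbdadm --stacked primary --force clusterdb_stacked  # first-time promotion of the device MySQL uses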
The requested data:
Time || svctm || w_wait || %util
10:32:01 3.07 55.23 94.11
10:33:01 3.29 50.75 96.27
10:34:01 2.82 41.44 96.15
10:35:01 3.01 72.30 96.86
10:36:01 4.52 40.41 94.24
10:37:01 3.80 50.42 83.86
10:38:01 3.03 72.54 97.17
10:39:01 4.96 37.08 89.45
10:41:01 3.55 66.48 70.19
10:45:01 2.91 53.70 89.57
10:46:01 2.98 49.49 94.73
10:55:01 3.01 48.38 93.70
10:56:01 2.98 43.47 97.26
11:05:01 2.80 61.84 86.93
11:06:01 2.67 43.35 96.89
11:07:01 2.68 37.67 95.41
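A similar per-minute view can be collected going forward with something like this (sketch; assumes GNU awk for strftime and that drbd0 is the device in question):
# sample drbd0 once a minute and keep only the columns shown above
iostat -dx drbd0 60 | awk '/^drbd0/ { print strftime("%H:%M:%S"), $(NF-1), $(NF-2), $NF }'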
Update to the question as per the comments:
The latency is indeed high when comparing the local link with the stacked one.
Between local servers
[root@pri-site-valsql-a]#ping pri-site-valsql-b
PING pri-site-valsql-b.csn.infra.sm (10.170.24.23) 56(84) bytes of data.
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=1 ttl=64 time=0.143 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=2 ttl=64 time=0.145 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=3 ttl=64 time=0.132 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=4 ttl=64 time=0.145 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=5 ttl=64 time=0.150 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=6 ttl=64 time=0.145 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=7 ttl=64 time=0.132 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=8 ttl=64 time=0.127 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=9 ttl=64 time=0.134 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=10 ttl=64 time=0.149 ms
64 bytes from pri-site-valsql-b.csn.infra.sm (10.170.24.23): icmp_seq=11 ttl=64 time=0.147 ms
^C
--- pri-site-valsql-b.csn.infra.sm ping statistics ---
11 packets transmitted, 11 received, 0% packet loss, time 10323ms
rtt min/avg/max/mdev = 0.127/0.140/0.150/0.016 ms
Between two stacked servers
[root@pri-site-valsql-a]#ping dr-site-valsql-b
PING dr-site-valsql-b.csn.infra.sm (10.170.24.48) 56(84) bytes of data.
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=1 ttl=64 time=9.68 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=2 ttl=64 time=4.51 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=3 ttl=64 time=4.53 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=4 ttl=64 time=4.51 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=5 ttl=64 time=4.51 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=6 ttl=64 time=4.52 ms
64 bytes from dr-site-valsql-b.csn.infra.sm (10.170.24.48): icmp_seq=7 ttl=64 time=4.52 ms
^C
--- dr-site-valsql-b.csn.infra.sm ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6654ms
rtt min/avg/max/mdev = 4.510/5.258/9.686/1.808 ms
[root@pri-site-valsql-a]#
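To rule out the WAN link itself, raw throughput between the sites can also be measured (sketch; assumes iperf is installed on both ends, which is not the case by default):
iperf -s                            # on dr-site-valsql-b
iperf -c dr-site-valsql-b -t 30     # on pri-site-valsql-a: 30-second TCP test towards the DR node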
The output showing high I/O:
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
drbd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.06 0.00 0.00 99.94
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
drbd0 0.00 0.00 0.00 2.00 0.00 16.00 8.00 0.90 1.50 452.25 90.45
avg-cpu: %user %nice %system %iowait %steal %idle
0.25 0.00 0.13 0.50 0.00 99.12
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
drbd0 0.00 0.00 1.00 44.00 8.00 352.00 8.00 1.07 2.90 18.48 83.15
avg-cpu: %user %nice %system %iowait %steal %idle
0.13 0.00 0.06 0.25 0.00 99.56
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
drbd0 0.00 0.00 0.00 31.00 0.00 248.00 8.00 1.01 2.42 27.00 83.70
avg-cpu: %user %nice %system %iowait %steal %idle
0.19 0.00 0.06 0.00 0.00 99.75
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
drbd0 0.00 0.00 0.00 2.00 0.00 16.00 8.00 0.32 1.50 162.25 32.45
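For comparison, the backing disk can be sampled alongside the DRBD device to see where the time is actually spent (sketch; sda is assumed to be the disk holding the /dev/sda2 backing partition):
iostat -dx drbd0 sda 5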
Edited the properties file, but still no luck:
disk {
on-io-error detach;
no-disk-barrier;
no-disk-flushes;
no-md-flushes;
c-plan-ahead 0;
c-fill-target 24M;
c-min-rate 80M;
c-max-rate 300M;
al-extents 3833;
}
net {
cram-hmac-alg "sha1";
shared-secret "clusterdb";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
max-epoch-size 20000;
max-buffers 20000;
unplug-watermark 16;
}
syncer {
rate 100M;
on-no-data-accessible io-error;
}