Kamiel Wanrooij's questions -server

Kamiel Wanrooij

Asked: 2012-11-09 10:11:09 +0800 CST

How to track things that SHOULD happen, but might not have

-1

I am running into a couple of issues with some applications we've deployed and maintain. I have the feeling we have approached this with some anti-patterns up to now, but I would like to see how to make this more flexible and stable.

In one situation, we have a server at a client which pushes data to us to parse every night (yes, Windows Task Scheduler). This is highly unstable, however: about once a month the push fails for reasons outside our control. This heavily impacts our business, since we then run with stale data.

In another scenario we have a lot of background job processes that should be running. We already keep them up using bluepill ( http://www.github.com/arya/bluepill ) but obviously restarts happen, both automatically and manually, and people forget things or systems mess up.

What I would like to track is events that should occur or things that should be available, such as the existence of a process, the execution of a program, or the creation/age of a file, and to be alerted when they don't happen or exist.

We develop most things in Ruby on Rails, use NewRelic, Bluepill and Munin, and run on Ubuntu. I've been toying around with counting ps aux | grep processname | wc -l in Munin scripts, or capturing the age of a file and raising alerts over 24-26 hours, stuff like that.
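As a minimal sketch of the two checks described above (file age and process count), assuming Python is available on the monitoring host. The function names, the 26-hour default and the `/proc` scan are illustrative choices, not part of the current Munin setup:

```python
import os
import time


def file_age_hours(path, now=None):
    """Return the age of `path` in hours, based on its modification time."""
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) / 3600.0


def is_stale(path, max_age_hours=26, now=None):
    """True if the file is missing or older than `max_age_hours` (assumed threshold)."""
    if not os.path.exists(path):
        return True
    return file_age_hours(path, now) > max_age_hours


def count_processes(name, proc_root="/proc"):
    """Count processes whose command line contains `name` (Linux /proc layout)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                cmdline = handle.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # the process exited between listdir() and open()
        if name in cmdline:
            count += 1
    return count
```

A check script could alert when `is_stale(...)` is true or when `count_processes(...)` drops below the expected number of workers.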

Is there better tooling to track things that should happen, and raise alerts if they don't?

P.S. I know some things are suboptimal, like having to define bluepill manually for each application and then forgetting to do so. The same goes for the push-based approach of the first application: a dedicated daemon on the client side, which we control and whose connection to us we can track, might be a much better solution.

Kamiel Wanrooij

Asked: 2012-02-16 08:16:49 +0800 CST

Varnish before HAProxy seems to reuse (wrong) backend connection

2

We have the following setup: Nginx -> Varnish -> HAProxy -> App Server A / App Server B.

Most requests we process are proxied by HAProxy to App Server A. This is done based on host header values. A few hosts should be routed to App Server B. Traffic to App Server B is very low.

Most requests work fine. Every once in a while, requests that should be proxied to App Server B return a 404 status code. These requests show up in the logs of Nginx, Varnish, and App Server A, but not HAProxy.

Requests to App Server A and B that work fine are properly logged by HAProxy.

It seems that Varnish reuses connections to App Server A established by HAProxy, preventing those requests from being reevaluated and sent to the proper backend. Is this a plausible cause of my problems? Is there a way to force Varnish to reconnect to the backend, or HAProxy to remain in between these servers? What are advantages / disadvantages of each solution?

Thanks!

Edit:

This is the part from my Varnish log file:

   12 SessionClose c Connection: close
   12 StatSess     c 127.0.0.1 54331 0 1 1 0 0 1 652 0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 PONG 1329319173 1.0
   12 SessionOpen  c 127.0.0.1 54334 :6081
   12 ReqStart     c 127.0.0.1 54334 1884874126
   12 RxRequest    c GET
   12 RxURL        c /path/to/page
   12 RxProtocol   c HTTP/1.0
   12 RxHeader     c Host: www.example.com
   12 RxHeader     c Connection: close
   12 RxHeader     c User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
   12 RxHeader     c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
   12 RxHeader     c Referer: http://www.example.com/path/to/previous/
   12 RxHeader     c Accept-Encoding: gzip,deflate,sdch
   12 RxHeader     c Accept-Language: en-US,en;q=0.8
   12 RxHeader     c Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
   12 RxHeader     c Cookie: 71666-bl0098f964_r=0; 71666-bl0098f964_s=145818062; _advocaten-cms_session=BAh7BjoPc2Vzc2lvbl9pZCIlMmRhNGFhMWY3MDRlMzljNGRkZTQ0MTI3MjJhN2E2NzY%3D--5d089ff2c72e4ad91d415cb14020834387c2077e; 71666-bl0098f964_i=272; 71666-bl0098f964_vt=1329319215438
   12 VCL_call     c recv
   12 VCL_return   c pass
   12 VCL_call     c hash
   12 VCL_return   c hash
   12 VCL_call     c pass
   12 VCL_return   c pass
   12 Backend      c 14 default default
   14 TxRequest    b GET
   14 TxURL        b /path/to/page/
   14 TxProtocol   b HTTP/1.0
   14 TxHeader     b Host: www.example.com
   14 TxHeader     b User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7
   14 TxHeader     b Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
   14 TxHeader     b Referer: http://www.example.com/path/to/previous/
   14 TxHeader     b Accept-Encoding: gzip,deflate,sdch
   14 TxHeader     b Accept-Language: en-US,en;q=0.8
   14 TxHeader     b Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
   14 TxHeader     b Cookie: 71666-bl0098f964_r=0; 71666-bl0098f964_s=145818062; _advocaten-cms_session=BAh7BjoPc2Vzc2lvbl9pZCIlMmRhNGFhMWY3MDRlMzljNGRkZTQ0MTI3MjJhN2E2NzY%3D--5d089ff2c72e4ad91d415cb14020834387c2077e; 71666-bl0098f964_i=272; 71666-bl0098f964_vt=1329319215438
   14 TxHeader     b X-Forwarded-For: 127.0.0.1
   14 TxHeader     b X-Varnish: 1884874126
   14 RxProtocol   b HTTP/1.1
   14 RxStatus     b 404
   14 RxResponse   b Not Found
   14 RxHeader     b Date: Wed, 15 Feb 2012 15:20:06 GMT
   14 RxHeader     b Server: Apache
   14 RxHeader     b Vary: Accept-Encoding
   14 RxHeader     b Content-Encoding: gzip
   14 RxHeader     b Content-Length: 243
   14 RxHeader     b Connection: close
   14 RxHeader     b Content-Type: text/html; charset=iso-8859-1
   12 TTL          c 1884874126 RFC 120 1329319174 0 0 0 0
   12 VCL_call     c fetch
   12 VCL_return   c pass
   12 ObjProtocol  c HTTP/1.1
   12 ObjStatus    c 404
   12 ObjResponse  c Not Found
   12 ObjHeader    c Date: Wed, 15 Feb 2012 15:20:06 GMT
   12 ObjHeader    c Server: Apache
   12 ObjHeader    c Vary: Accept-Encoding
   12 ObjHeader    c Content-Encoding: gzip
   12 ObjHeader    c Content-Type: text/html; charset=iso-8859-1
   14 Length       b 243
   14 BackendClose b default
   12 VCL_call     c deliver
   12 VCL_return   c deliver
   12 TxProtocol   c HTTP/1.1
   12 TxStatus     c 404
   12 TxResponse   c Not Found
   12 TxHeader     c Server: Apache
   12 TxHeader     c Vary: Accept-Encoding
   12 TxHeader     c Content-Encoding: gzip
   12 TxHeader     c Content-Type: text/html; charset=iso-8859-1
   12 TxHeader     c Content-Length: 243
   12 TxHeader     c Date: Wed, 15 Feb 2012 15:19:34 GMT
   12 TxHeader     c X-Varnish: 1884874126
   12 TxHeader     c Age: 0
   12 TxHeader     c Via: 1.1 varnish
   12 TxHeader     c Connection: close
   12 Length       c 243
   12 ReqEnd       c 1884874126 1329319174.638826609 1329319174.639825106 0.000061750 0.000943899 0.000054598
   12 SessionClose c Connection: close
   12 StatSess     c 127.0.0.1 54334 0 1 1 0 1 1 260 243
    0 CLI          - Rd ping
    0 CLI          - Wr 200 PONG 1329319176 1.0

HAProxy is the default backend.

Could it also have something to do with the fact that HAProxy does not log subsequent requests for the same session? I assumed since HAProxy did not log them, they didn't pass through it, but according to the docs this is intended behavior. Since Varnish is the client as far as HAProxy is concerned, multiple requests (from multiple visitors) can belong to the same session?

Edit 2: It appears that if I add

option forceclose
option http-pretend-keepalive

to my HAProxy configuration the problem disappears or becomes much less frequent. This feels like a workaround to a deeper problem regarding Varnish / HAProxy interaction. Varnish receives a Connection: Close header, but does not specify that header in the request to the backend. Force closing the connection makes sure HAProxy evaluates every single request from Varnish.
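To make the suspected mechanism concrete, here is a toy model in Python. It is not HAProxy code. It only assumes that a proxy might pick the backend once per connection instead of once per request, and the host names and backend names are made up:

```python
# Toy model only: not HAProxy internals. Hosts and backend names are hypothetical.
ROUTES = {"b.example.com": "app_server_b"}
DEFAULT_BACKEND = "app_server_a"


def pick_backend(host):
    """Choose a backend from the Host header, as the routing rules would."""
    return ROUTES.get(host, DEFAULT_BACKEND)


def route_connection(hosts, evaluate_every_request):
    """Return the backend used for each request sent over ONE client connection."""
    chosen = []
    pinned = None
    for host in hosts:
        # Without per-request evaluation, the first decision sticks
        # for the lifetime of the connection.
        if evaluate_every_request or pinned is None:
            pinned = pick_backend(host)
        chosen.append(pinned)
    return chosen
```

With `evaluate_every_request=False`, a request for `b.example.com` that follows a request for another host on the same reused connection ends up on `app_server_a`, which would match the 404s described above. With `True`, each request is routed on its own Host header.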

Kamiel Wanrooij

Asked: 2011-07-15 04:16:34 +0800 CST

Chef clients are unable to connect after Ubuntu upgrade

1

After I upgraded Ubuntu on my chef server from 10.04 to 10.10, all my knife and chef clients stopped working. I received 401: Unauthorized exceptions for every query and operation.

I tried reregistering my clients (knife client reregister CLIENT) which didn't work.

I tried regenerating my chef authentication data (removed /etc/chef/validation.pem, restarted chef-server, and ran knife configure --initial on the server with chef-validation as the admin user and the newly generated /etc/chef/validation.pem as the certificate), which enabled me to connect to chef again with my new credentials, but now my configuration data is empty! Running knife node list, for instance, returns nothing.

This indicated that the CouchDB database is empty. And indeed, there is a /var/lib/couchdb/0.10.0/chef.couch file of 1.1GB, and an almost empty /var/lib/couchdb/1.0.1/chef.couch file.

I am still figuring out how to recover my data, but has anyone had a similar experience? How did you manage to migrate your chef database to the new CouchDB version?

Kamiel Wanrooij

Asked: 2011-05-21 02:20:00 +0800 CST

Why does my memory used (accounting for cache) not correspond to my processes' memory usage?

1

I am using Ubuntu 10.04 where a number of background jobs run every day. free and top both report that 3.9GB of the 4GB of memory is in use (of which ~90MB is cache/buffers).

When I run top or ps aux and add up the memory usage of my apps, I get to about 50%.

I am only running mysql, apache+passenger, and redis on the machine. It also hosts an NFS share.

Is there another way to check the remaining 49% of in use memory? And to free it without rebooting the server?

This is the output of free:

             total       used       free     shared    buffers     cached
Mem:       4010060    3820592     189468          0      84208     194168
-/+ buffers/cache:    3542216     467844
Swap:       741372     741292         80

ps aux:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  23864   472 ?        Ss   Feb02   0:13 /sbin/init
root         2  0.0  0.0      0     0 ?        S    Feb02   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Feb02   2:32 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Feb02   0:14 [migration/0]
root         5  0.0  0.0      0     0 ?        S    Feb02   0:00 [watchdog/0]
root         6  0.0  0.0      0     0 ?        S    Feb02   0:25 [migration/1]
root         7  0.0  0.0      0     0 ?        S    Feb02   1:30 [ksoftirqd/1]
root         8  0.0  0.0      0     0 ?        S    Feb02   0:00 [watchdog/1]
root         9  0.0  0.0      0     0 ?        S    Feb02   5:17 [events/0]
root        10  0.0  0.0      0     0 ?        S    Feb02   2:06 [events/1]
root        11  0.0  0.0      0     0 ?        S    Feb02   0:00 [cpuset]
root        12  0.0  0.0      0     0 ?        S    Feb02   0:00 [khelper]
root        13  0.0  0.0      0     0 ?        S    Feb02   0:00 [netns]
root        14  0.0  0.0      0     0 ?        S    Feb02   0:00 [async/mgr]
root        15  0.0  0.0      0     0 ?        S    Feb02   0:00 [pm]
root        17  0.0  0.0      0     0 ?        S    Feb02   0:45 [sync_supers]
root        18  0.0  0.0      0     0 ?        S    Feb02   0:47 [bdi-default]
root        19  0.0  0.0      0     0 ?        S    Feb02   0:00 [kintegrityd/0]
root        20  0.0  0.0      0     0 ?        S    Feb02   0:00 [kintegrityd/1]
root        21  0.0  0.0      0     0 ?        S    Feb02   0:47 [kblockd/0]
root        22  0.0  0.0      0     0 ?        S    Feb02   0:27 [kblockd/1]
root        23  0.0  0.0      0     0 ?        S    Feb02   0:00 [kacpid]
root        24  0.0  0.0      0     0 ?        S    Feb02   0:00 [kacpi_notify]
root        25  0.0  0.0      0     0 ?        S    Feb02   0:00 [kacpi_hotplug]
root        26  0.0  0.0      0     0 ?        S    Feb02   0:00 [ata_aux]
root        27  0.0  0.0      0     0 ?        S    Feb02   0:00 [ata_sff/0]
root        28  0.0  0.0      0     0 ?        S    Feb02   0:00 [ata_sff/1]
root        29  0.0  0.0      0     0 ?        S    Feb02   0:00 [khubd]
root        30  0.0  0.0      0     0 ?        S    Feb02   0:00 [kseriod]
root        31  0.0  0.0      0     0 ?        S    Feb02   0:00 [kmmcd]
root        32  0.0  0.0      0     0 ?        S    Feb02   0:07 [khungtaskd]
root        33  0.0  0.0      0     0 ?        S    Feb02  53:01 [kswapd0]
root        34  0.0  0.0      0     0 ?        SN   Feb02   0:00 [ksmd]
root        35  0.0  0.0      0     0 ?        S    Feb02   0:00 [aio/0]
root        36  0.0  0.0      0     0 ?        S    Feb02   0:00 [aio/1]
root        37  0.0  0.0      0     0 ?        S    Feb02   0:00 [ecryptfs-kthrea]
root        38  0.0  0.0      0     0 ?        S    Feb02   0:00 [crypto/0]
root        39  0.0  0.0      0     0 ?        S    Feb02   0:00 [crypto/1]
root        44  0.0  0.0      0     0 ?        S    Feb02   0:00 [pciehpd]
root        45  0.0  0.0      0     0 ?        S    Feb02   0:00 [scsi_eh_0]
root        46  0.0  0.0      0     0 ?        S    Feb02   0:00 [scsi_eh_1]
root        47  0.0  0.0      0     0 ?        S    Feb02   0:00 [kstriped]
root        49  0.0  0.0      0     0 ?        S    Feb02   0:00 [kmpathd/0]
root        50  0.0  0.0      0     0 ?        S    Feb02   0:00 [kmpathd/1]
root        51  0.0  0.0      0     0 ?        S    Feb02   0:00 [kmpath_handlerd]
root        52  0.0  0.0      0     0 ?        S    Feb02   0:00 [ksnapd]
root        53  0.0  0.0      0     0 ?        S    Feb02   0:00 [kondemand/0]
root        54  0.0  0.0      0     0 ?        S    Feb02   0:00 [kondemand/1]
root        55  0.0  0.0      0     0 ?        S    Feb02   0:00 [kconservative/0]
root        56  0.0  0.0      0     0 ?        S    Feb02   0:00 [kconservative/1]
root       246  0.0  0.0      0     0 ?        S    Feb02   2:58 [mpt_poll_0]
root       256  0.0  0.0      0     0 ?        S    Feb02   0:00 [mpt/0]
root       259  0.0  0.0      0     0 ?        S    Feb02   0:00 [scsi_eh_2]
root       278  0.0  0.0      0     0 ?        S    Feb02   0:08 [kdmflush]
root       282  0.0  0.0      0     0 ?        S    Feb02   0:00 [kdmflush]
root       297  0.0  0.0      0     0 ?        S    Feb02   0:36 [jbd2/dm-0-8]
root       298  0.0  0.0      0     0 ?        S    Feb02   0:00 [ext4-dio-unwrit]
root       299  0.0  0.0      0     0 ?        S    Feb02   0:00 [ext4-dio-unwrit]
root       338  0.0  0.0  17096     0 ?        S    Feb02   0:00 upstart-udev-bridge --daemon
root       340  0.0  0.0 147836    80 ?        Ss   Mar02   7:54 /usr/sbin/apache2 -k start
root       349  0.0  0.0  16880     0 ?        S<s  Feb02   0:00 udevd --daemon
root       554  0.0  0.0      0     0 ?        S    Feb02   0:00 [kpsmoused]
root       577  0.0  0.0      0     0 ?        S    Feb02 127:26 [vmmemctl]
root       657  0.0  0.0      0     0 ?        S    Feb02 126:36 [jbd2/sdb1-8]
root       658  0.0  0.0      0     0 ?        S    Feb02   0:00 [ext4-dio-unwrit]
root       659  0.0  0.0      0     0 ?        S    Feb02   0:00 [ext4-dio-unwrit]
root       788  0.0  0.0   6128     4 tty4     Ss+  Feb02   0:00 /sbin/getty -8 38400 tty4
root       791  0.0  0.0   6128     4 tty5     Ss+  Feb02   0:00 /sbin/getty -8 38400 tty5
root       793  0.0  0.0  49312   104 ?        Ss   Feb02   0:04 /usr/sbin/sshd
root       796  0.0  0.0   6128     4 tty2     Ss+  Feb02   0:00 /sbin/getty -8 38400 tty2
root       797  0.0  0.0   6128     4 tty3     Ss+  Feb02   0:00 /sbin/getty -8 38400 tty3
root       801  0.0  0.0   6128     4 tty6     Ss+  Feb02   0:00 /sbin/getty -8 38400 tty6
daemon     804  0.0  0.0  18932     0 ?        Ss   Feb02   0:02 atd
root       805  0.0  0.0  21128   156 ?        Ss   Feb02   0:47 cron
root       851  0.0  0.0  11360   304 ?        Ss   Feb02  15:18 /usr/sbin/irqbalance
root      1374  0.0  0.0      0     0 ?        S    Mar16   0:01 [jbd2/dm-2-8]
root      1375  0.0  0.0      0     0 ?        S    Mar16   0:00 [ext4-dio-unwrit]
root      1376  0.0  0.0      0     0 ?        S    Mar16   0:00 [ext4-dio-unwrit]
root      4088  0.0  0.0      0     0 ?        S    11:58   0:00 [flush-251:0]
root      4329  0.0  0.0  70660   720 ?        Ss   11:59   0:00 sshd: myuser [priv]
myuser    4347  0.0  0.0  70660   744 ?        S    12:00   0:00 sshd: myuser@pts/1
myuser    4348  0.0  0.1  26028  6432 pts/1    Ss   12:00   0:00 -bash
munin     5150  0.0  0.0  39916  2040 ?        Ss   Apr08   8:04 /usr/sbin/munin-node
redis     8110  3.0 38.7 2244244 1553116 ?     Ss   May19  27:58 /usr/bin/redis-server /etc/redis/re
www-data  8433  0.0  0.0 148344  1928 ?        S    May17   0:01 /usr/sbin/apache2 -k start
www-data  8435  0.0  0.0 148336  2132 ?        S    May17   0:01 /usr/sbin/apache2 -k start
myuser   10115  0.0  0.0  26624   592 ?        Ss   12:50   0:00 SCREEN -dR
myuser   10116  0.0  0.0  22808  2548 pts/0    Ss   12:50   0:00 /bin/bash
root     11410  0.0  0.0  27248   396 ?        Ss   Mar16   0:03 rpc.idmapd
root     11439  0.0  0.0      0     0 ?        S    Mar16   0:00 [lockd]
root     11440  0.0  0.0      0     0 ?        S    Mar16   0:03 [nfsd4]
root     11441  0.0  0.0      0     0 ?        S    Mar16   0:00 [nfsd4_callbacks]
root     11442  0.0  0.0      0     0 ?        S    Mar16   0:19 [nfsd]
root     11443  0.0  0.0      0     0 ?        S    Mar16   0:21 [nfsd]
root     11444  0.0  0.0      0     0 ?        S    Mar16   0:20 [nfsd]
root     11445  0.0  0.0      0     0 ?        S    Mar16   0:22 [nfsd]
root     11446  0.0  0.0      0     0 ?        S    Mar16   0:21 [nfsd]
root     11447  0.0  0.0      0     0 ?        S    Mar16   0:20 [nfsd]
root     11448  0.0  0.0      0     0 ?        S    Mar16   0:21 [nfsd]
root     11449  0.0  0.0      0     0 ?        S    Mar16   0:20 [nfsd]
root     11453  0.0  0.0  19032   316 ?        Ss   Mar16   0:09 /usr/sbin/rpc.mountd --manage-gids
myapp     11886  0.0  0.0  11984   196 pts/0    S+   13:03   0:00 /bin/bash shared/worker.sh
myapp     11887  0.1  2.0 180724 81252 pts/0    S+   13:04   0:03 resque-1.10.0: Forked 14745 at 1305
myapp     13085  0.1  2.0 195248 83180 ?        S    13:17   0:03 Rails: /var/www/myapp-prd/current
myapp     14745 80.9  2.0 185908 80900 pts/0    R+   13:29  19:09 resque-1.10.0: Processing myapp_import
www-data 15516  0.0  0.0 148224  1608 ?        S    May17   0:01 /usr/sbin/apache2 -k start
myuser   17579  0.0  0.0  17760  1220 pts/1    R+   13:53   0:00 ps aux
mysql    18736  2.5  1.1 345568 46396 ?        Ssl  Mar16 2349:04 /usr/sbin/mysqld
daemon   21340  0.0  0.0   8304     4 ?        Ss   Mar16   0:00 portmap
statd    21564  0.0  0.0  14600     4 ?        Ss   Mar16   0:00 rpc.statd -L
root     21823  0.0  0.0      0     0 ?        S    Mar16   0:00 [rpciod/0]
root     21824  0.0  0.0      0     0 ?        S    Mar16   0:00 [rpciod/1]
www-data 22353  0.0  0.0 148484  1352 ?        S    May16   0:02 /usr/sbin/apache2 -k start
root     23471  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kslowd000]
root     23472  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kslowd001]
root     23473  0.0  0.0      0     0 ?        S    Mar16   0:00 [nfsiod]
www-data 27001  0.0  0.0 148496  1588 ?        S    May17   0:01 /usr/sbin/apache2 -k start
root     27365  0.0  0.0  23432    20 ?        Ssl  May15   0:00 PassengerWatchdog
root     27368  0.0  0.0 164064  1064 ?        Sl   May15   6:51 PassengerHelperAgent
root     27370  0.0  0.2  61904  8408 ?        S    May15   0:02 Passenger spawn server
nobody   27373  0.0  0.0  72172    28 ?        Sl   May15   0:01 PassengerLoggingAgent
www-data 27383  0.0  0.0 148356  1740 ?        S    May15   0:02 /usr/sbin/apache2 -k start
www-data 27384  0.0  0.0 148492  1444 ?        S    May15   0:02 /usr/sbin/apache2 -k start
www-data 27385  0.0  0.0 148224  1568 ?        S    May15   0:02 /usr/sbin/apache2 -k start
www-data 27386  0.0  0.0 148496  1860 ?        S    May15   0:02 /usr/sbin/apache2 -k start
root     29402  0.0  0.0  16876     4 ?        S<   Mar16   0:00 udevd --daemon
root     29800  0.0  0.0      0     0 ?        S    Mar16   0:00 [kdmflush]
root     29804  0.0  0.0  16876     4 ?        S<   Mar16   0:00 udevd --daemon
www-data 30231  0.0  0.0 148352  1904 ?        S    May16   0:02 /usr/sbin/apache2 -k start
root     31072  0.0  0.0      0     0 ?        S    May12   0:37 [flush-8:16]
syslog   31116  0.0  0.0 126028   388 ?        Sl   May12   0:04 rsyslogd -c4
root     32217  0.0  0.0   6128     4 tty1     Ss+  Mar17   0:00 /sbin/getty -8 38400 tty1

cat /proc/meminfo:

MemTotal:        4010060 kB
MemFree:          155604 kB
Buffers:           48232 kB
Cached:           149308 kB
SwapCached:        42036 kB
Active:          1572044 kB
Inactive:         639616 kB
Active(anon):    1496792 kB
Inactive(anon):   517464 kB
Active(file):      75252 kB
Inactive(file):   122152 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        741372 kB
SwapFree:             64 kB
Dirty:              1532 kB
Writeback:             0 kB
AnonPages:       1972572 kB
Mapped:             7920 kB
Shmem:                96 kB
Slab:              24384 kB
SReclaimable:      12224 kB
SUnreclaim:        12160 kB
KernelStack:        1368 kB
PageTables:        11632 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     2746400 kB
Committed_AS:    3150752 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      281448 kB
VmallocChunk:   34359449320 kB
HardwareCorrupted:     0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       12288 kB
DirectMap2M:     4132864 kB

One of the things I notice is the high value of Active(anon): 1496792 kB, which according to the docs is:

Memory that has been used more recently and usually not reclaimed unless absolutely necessary.

That's almost 1.5GB right there, is that normal? I do have very long (permanent) running processes (Resque) that fork each time a job comes in.
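The comparison above can be sketched in Python as follows. This is only an illustration: the helper names are made up, and summed RSS is an approximation because pages shared between processes are counted more than once.

```python
def sum_rss_kb(ps_aux_output):
    """Sum the RSS column (in kB) of `ps aux` output, skipping the header line."""
    total = 0
    for line in ps_aux_output.strip().splitlines()[1:]:
        # Columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)  # COMMAND may itself contain spaces
        if len(fields) < 6:
            continue
        total += int(fields[5])
    return total


def used_excluding_cache_kb(used, buffers, cached):
    """Memory in use once buffers and page cache are subtracted (all in kB)."""
    return used - buffers - cached


# Values taken from the `free` output above:
print(used_excluding_cache_kb(3820592, 84208, 194168))  # 3542216, the "-/+ buffers/cache" used figure
```

Feeding the `ps aux` listing into `sum_rss_kb` and comparing the result with that figure shows how much of the used memory is not attributed to any process.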

Kamiel Wanrooij

Asked: 2011-04-14 10:40:15 +0800 CST

How does `sudo` search the path for executables?

8

I am using rubygems (1.3.7) with gems that require root privileges on Ubuntu 10.10. When I compare my setup to an ubuntu 9.10 with rubygems 1.3.6 installation, I see the following difference in gem environment:

1.3.7 / 10.10 - EXECUTABLE DIRECTORY: /var/lib/gems/1.8/bin

1.3.6 / 09.10 - EXECUTABLE DIRECTORY: /usr/bin

The output is the same whether I use sudo or not. To fix this (I don't know why it is different in the first place), I tried to modify my path variable.

My question is, where does sudo look for executables? If I install a gem (using sudo) the executable is placed in the /var path obviously. I added this path to my ~/.profile and /etc/environment files, but I cannot get sudo to execute the executables.

If I run:

$ gemname: it runs my tool correctly.
$ sudo gemname: it merely tells me command not found.
$ sudo echo $PATH: it does show the correct path.
$ sudo -i gemname: it runs correctly.
$ sudo sudo -V: it shows that the PATH is preserved.

Does sudo honour ~/.profile and/or /etc/environment? If so, then why can't it find my executable while the directory is shown in the $PATH environment variable?

I have read the documentation of sudo, and I have also searched and looked through a ton of topics on stackoverflow and serverfault (for instance How to override a PATH environment variable in sudo?, but my example shows that $PATH contains the correct path), but they never actually show how to run a gem via sudo.
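One detail that may matter here: in `sudo echo $PATH`, the shell expands `$PATH` before sudo starts, so the output is the caller's PATH and not necessarily the one sudo searches. As a minimal sketch of what a PATH lookup does, assuming Python is at hand (the function name is made up, and the PATH strings you pass in are whatever you want to compare):

```python
import shutil


def resolve(command, path_string):
    """Return the full path of `command` if found on `path_string`, else None."""
    # shutil.which searches only the directories listed in `path_string`,
    # in order, much like a shell or sudo searches its own PATH.
    return shutil.which(command, path=path_string)
```

Calling `resolve("gemname", ...)` once with the PATH from the login shell and once with the PATH that a command run under sudo actually sees (for example from `sudo env`) shows whether /var/lib/gems/1.8/bin is really on the latter.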

Kamiel Wanrooij

Asked: 2010-10-19 07:54:51 +0800 CST

How can I diagnose an Ubuntu system freeze after reboot

0

One of our servers froze yesterday, apparently refusing to serve any HTTP requests. The tech guy on site could not connect remotely to the machine, so he rebooted the (virtual) machine from the VMware Infrastructure Client, and everything was up and running again.

Now I want to figure out what went wrong. I looked at a couple of log files, and all of them just stop logging anything at 5:00am and start logging again with a boot sequence. I could not find anything suspicious, other than the fact that a number of cron jobs ran at 5:00am. These were all fairly simple jobs, not interacting with anything critical, and there was at least some activity after they completed.

The freeze lasted a couple of hours. We did not have any other issues on other virtual machines on the same box, which all have a very similar configuration.

Is there any place I should start looking for clues? What can I tell people to do should this happen again before just resetting the machine? Magic SysRq maybe?

Kamiel Wanrooij

Asked: 2010-07-28 03:50:15 +0800 CST

How to move multiple Apache Rails applications with minimal downtime

0

I have a server hosting some 80 small rails applications. We have recently upgraded the disk space, so now we have to move all sites to this new disk. The host is a VMware ESX server, so all disks are virtual.

We have a virtual host and Apache config file for each site. In this file we define a development, testing, acceptance and production environment, each on its own domain. The production environment can have multiple domains. The websites are currently in a directory on the '/' partition (bad idea, I know), and need to be moved to a fresh partition. The websites run a Sqlite database, so that has to be copied as well.

We want to move these websites to a different disk. The easiest thing to do is shut down Apache, copy the files, and remount the disk at the old location. This would cause significant downtime, since it's around 100GB of data that needs to be copied.

Is there a way to synchronize the new disk with the old files, and then swap them 'instantly'? Or maybe to move the websites automatically one by one, to minimize downtime for each? My greatest fear is corrupting the Sqlite databases if they are written to while the operation is in progress.
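For the Sqlite part specifically, one option is to copy each database through SQLite's online backup API instead of copying the file while it may be written to. A minimal sketch, assuming Python 3.7 or newer is available on the server. The function name is made up and the paths are whatever each site uses:

```python
import sqlite3


def copy_sqlite_database(source_path, target_path):
    """Copy a SQLite database via the online backup API rather than a file copy."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        # backup() copies a consistent snapshot, even if other
        # connections write to the source while the copy runs.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

The remaining application files could be synchronized separately, with the database copied this way just before a site is switched over.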
