The small business that I do sysadmin work for on the side uses a mid-2011 Mac Mini Server (running OS X 10.7 Lion) as a file server and FileMaker database host. Its two 750 GB HDDs are mirrored in a RAID 1 set, and it runs Time Machine backups over USB to a RAID 1 array of two 1 TB disks.
I set it up about a year and a half ago and had no problems with it until a few months ago, when I opened Disk Utility and found that the RAID had degraded and was running on only one disk. I went out and bought another 750 GB HDD, installed it, and rebuilt the array.
Everything was fine for a week - then the array degraded again. I rebuilt it, and it was fine until last week, when the array degraded once more. It always degrades on the same device: disk1 has always been fine, but disk2 keeps dropping out, regardless of which physical hard drive is in that slot. Since it happens across different drives, I don't think it's a hardware issue.
What should I do? I would reinstall OS X, but I've never restored a backup from Time Machine before and I'm not sure what to expect. If things go sideways, I would have to reconfigure a lot, including about 10 user accounts, the network shares, and the FileMaker setup. This is just a side job for me, and I really don't want to burn a nonstop Friday-night-to-Monday-morning weekend on this because something went wrong and I lost everything.
Have you read any log files that might give you a hint about what the issue is? I would definitely not rule out a hardware issue - it's not only the disks that might be damaged; cables and even connectors on the main board can be the culprit if they are not up to spec for whatever reason. These can be hard to get repaired, though, especially if the errors are only sporadic - many companies, including Apple (in my experience), will disregard errors they can't reproduce after a few seconds of testing.
You will want to be very systematic about isolating the failure: save the system logs, watch them for filesystem errors, and challenge your assumptions.
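For example, here's a quick way to pull disk- and RAID-related kernel messages out of the logs on 10.7 - the grep pattern is my guess at what a failing mirror member would log, so widen or narrow it once you've seen a real failure entry (and adjust the second line if your rotated logs are compressed differently):

    # Current system log, then the bzip2-rotated ones.
    grep -iE 'disk[0-9]|I/O error|AppleRAID' /var/log/system.log
    bzcat /var/log/system.log.*.bz2 | grep -iE 'disk[0-9]|I/O error|AppleRAID'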
Why rule out disk1? If there is an error writing data to the two drives, the system has to pick one member to drop, and perhaps there isn't a good reason behind which one it picks - the algorithm may be based on something arbitrary, like whether the day/week/second when the error is detected is even or odd, and you have too few documented failures to spot such a pattern.
From the phrasing of the question, you are mixing two problems: the lack of a tested restore strategy, and how to isolate a RAID issue. Be frank with yourself and your employer about the risks, and let them make a business decision about which problem to attack and with what budget.
As to the main question here - you could also just script a simple check like
    diskutil list
and have it send you an alert / page / capture the logs when the next RAID problem is detected. I would also disable the RAID software's AutoRebuild, if you have it enabled, just in case the problem is physical (someone jiggling the server) and the system picks the wrong spindle to re-mirror from when the cables reconnect.
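Here is a rough sketch of such a check, meant to be run from cron every few minutes. Two assumptions to verify by hand before trusting it: that `diskutil appleRAID list` on 10.7 prints a status containing "Degraded" or "Failed" when a mirror member drops, and that the box can actually send mail. The log directory and address are placeholders.

    #!/bin/sh
    # RAID health check sketch - verify the assumptions above first.
    STATUS=$(diskutil appleRAID list)
    if echo "$STATUS" | grep -qE 'Degraded|Failed'; then
        # Preserve the RAID state and recent kernel messages at failure
        # time, so evidence survives even if someone rebuilds the array.
        LOGDIR=/var/log/raid-failures        # hypothetical path
        TS=$(date +%Y%m%d-%H%M%S)
        mkdir -p "$LOGDIR"
        { echo "$STATUS"; tail -n 500 /var/log/system.log; } > "$LOGDIR/raid-$TS.txt"
        echo "RAID degraded on $(hostname), details in $LOGDIR/raid-$TS.txt" \
            | mail -s "RAID ALERT: $(hostname)" admin@example.com
    fi

As for AutoRebuild, something like `diskutil appleRAID update AutoRebuild 0 <raid volume>` should turn it off - check `man diskutil` on the machine for the exact syntax, since I'm going from memory here.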