I've just put a second SiI 3114 SATARaid card in my home server so that I could add another pair of SATA drives and increase my storage space. Annoyingly, it doesn't seem to work:
[ 32.816030] ata5: lost interrupt (Status 0x0)
[ 32.816072] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 32.816091] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[ 32.816094] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 32.816101] ata5.00: status: { DRDY }
[ 32.816117] ata5: hard resetting link
[ 33.136082] ata5: SATA link down (SStatus 0 SControl 0)
[ 36.060940] irq 18: nobody cared (try booting with the "irqpoll" option)
[ 36.060949] Pid: 0, comm: swapper Not tainted 2.6.31-20-generic #58-Ubuntu
[ 36.060954] Call Trace:
[ 36.060977] [] ? printk+0x18/0x1c
[ 36.060997] [] __report_bad_irq+0x27/0x90
[ 36.061005] [] note_interrupt+0x150/0x190
[ 36.061011] [] handle_fasteoi_irq+0xac/0xd0
[ 36.061023] [] handle_irq+0x18/0x30
[ 36.061029] [] do_IRQ+0x47/0xc0
[ 36.061042] [] ? irq_exit+0x50/0x70
[ 36.061058] [] ? smp_apic_timer_interrupt+0x57/0x90
[ 36.061065] [] common_interrupt+0x30/0x40
[ 36.061075] [] ? native_safe_halt+0x5/0x10
[ 36.061082] [] default_idle+0x46/0xd0
[ 36.061088] [] cpu_idle+0x8c/0xd0
[ 36.061103] [] rest_init+0x55/0x60
[ 36.061111] [] start_kernel+0x2e6/0x2ec
[ 36.061117] [] ? unknown_bootoption+0x0/0x19e
[ 36.061133] [] i386_start_kernel+0x7c/0x83
[ 36.061137] handlers:
[ 36.061139] [] (sil_interrupt+0x0/0xb0)
[ 36.061151] Disabling IRQ #18
[ 38.136014] ata5: hard resetting link
[ 38.456022] ata5: SATA link down (SStatus 0 SControl 0)
[ 43.456013] ata5: hard resetting link
[ 43.776022] ata5: SATA link down (SStatus 0 SControl 0)
[ 43.776035] ata5.00: disabled
[ 43.776055] ata5.00: device reported invalid CHS sector 0
[ 43.776074] sd 4:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 43.776082] sd 4:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
[ 43.776092] Descriptor sense data with sense descriptors (in hex):
[ 43.776097] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 43.776112] 00 00 00 00
[ 43.776118] sd 4:0:0:0: [sde] Add. Sense: No additional sense information
[ 43.776127] end_request: I/O error, dev sde, sector 0
[ 43.776136] Buffer I/O error on device sde, logical block 0
[ 43.776170] ata5: EH complete
[ 43.776187] ata5.00: detaching (SCSI 4:0:0:0)
root@core:~# cat /proc/interrupts
           CPU0
  0:         47   IO-APIC-edge      timer
  1:          8   IO-APIC-edge      i8042
  6:          3   IO-APIC-edge      floppy
  7:          0   IO-APIC-edge      parport0
  8:          0   IO-APIC-edge      rtc0
  9:          0   IO-APIC-fasteoi   acpi
 14:      53069   IO-APIC-edge      pata_sis
 15:      53004   IO-APIC-edge      pata_sis
 17:     112265   IO-APIC-fasteoi   sata_sil
 18:     200002   IO-APIC-fasteoi   sata_sil, SiS SI7012
 19:     111140   IO-APIC-fasteoi   eth0
 20:          0   IO-APIC-fasteoi   ohci_hcd:usb2
 21:          0   IO-APIC-fasteoi   ohci_hcd:usb3
 23:          0   IO-APIC-fasteoi   ehci_hcd:usb1
NMI:          0   Non-maskable interrupts
LOC:    6650492   Local timer interrupts
SPU:          0   Spurious interrupts
CNT:          0   Performance counter interrupts
PND:          0   Performance pending work
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
MCE:          0   Machine check exceptions
MCP:        160   Machine check polls
ERR:          0
MIS:          0

root@core:~# lspci | grep Raid
00:09.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
00:0a.0 RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)

root@core:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 9.10
Release:        9.10
Codename:       karmic

root@core:~# uname -a
Linux core.topper.me.uk 2.6.31-20-generic #58-Ubuntu SMP Fri Mar 12 05:23:09 UTC 2010 i686 GNU/Linux
I've tried a combination of different kernel options (irqpoll, noapic, noacpi, pci=noapic) all to no avail. Does anyone have any bright ideas about how I can go about making this work?
Swapping PCI cards around isn't an option as there are only two slots in this motherboard (an ASRock K7S41GX). The BIOS doesn't appear to offer much in the way of configuration options for IRQ usage.
Plan B is to ditch this server completely and buy a new QNAP for these drives to go in, but I was hoping to avoid doing this right now.
On someone else's suggestion I tried swapping the two PCI cards around in the motherboard. This seems (a little annoyingly) to have worked. Can anyone explain why?
Perhaps there was just a touch of corrosion on the socket contacts on the motherboard. I've had this happen to me many times, and a common cure is to simply unplug and replug the offending device (powered off, of course). This is especially common in equipment that is exposed to humidity and dust. If you have continued problems, try using a product such as Deoxit on the contacts. I've used this to good effect when working on 30-year-old radar sets.
This is speculation on my part, but the BIOS could be assigning the cards different (effectively random) IRQs when it sees a new card in a slot. When you swapped the cards, it may have changed the assigned IRQs and resolved the IRQ conflict that way.
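To check whether an IRQ line is being shared before and after a swap like this, you can compare the handler lists in /proc/interrupts. Below is a rough sketch in Python that lists every numbered IRQ with more than one handler attached. It assumes the 2.6-era column layout shown in the question (per-CPU counts, then controller type, then a comma-separated handler list); newer kernels add columns, so the parsing would need adjusting there. The function name `shared_irqs` is my own invention, not part of any tool.

```python
def shared_irqs(text):
    """Return {irq_number: [handler, ...]} for IRQs with more than one handler.

    Assumes the /proc/interrupts layout from the question:
    'IRQ: <count per CPU> <controller type> <handler>[, <handler>...]'
    """
    shared = {}
    lines = text.splitlines()
    if not lines:
        return shared
    # The header row names the CPUs (CPU0, CPU1, ...), one count column each.
    ncpus = len(lines[0].split())
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip NMI, LOC, ERR and other non-numeric rows
        # Split off the per-CPU counts and the controller type;
        # whatever remains is the handler list.
        fields = rest.split(None, ncpus + 1)
        if len(fields) < ncpus + 2:
            continue  # no handler registered on this line
        handlers = [h.strip() for h in fields[ncpus + 1].split(",")]
        if len(handlers) > 1:
            shared[int(label)] = handlers
    return shared


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, handlers in sorted(shared_irqs(f.read()).items()):
            print("IRQ %d shared by: %s" % (irq, ", ".join(handlers)))
```

Run against the output in the question, this would flag IRQ 18, where sata_sil shares a line with the SiS SI7012 audio device. Sharing alone isn't proof of a fault, but it shows which devices to look at if the "nobody cared" message comes back.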