Is there a way on Linux to get statistics about the various reasons packets were dropped?
On all network interfaces (openSUSE 12.3) on several servers, ifconfig and netstat -i report dropped packets on receive. When I run tcpdump, the number of dropped packets stops increasing, which suggests the interface queues are not filling up and dropping the data. So there must be other reasons why this is happening (e.g. multicast packets received while the interface is not part of that multicast group).
Where can I find such information? (/proc? /sys? some logs?)
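To make the tcpdump observation measurable, one option is to sample the receive drop counter twice and compare. This is only a sketch: it assumes the counter is exposed as /sys/class/net/<dev>/statistics/rx_dropped, and the function names, the `base` parameter and the injectable `sleep` are made up for illustration.

```python
import time
from pathlib import Path


def read_counter(dev, name, base="/sys/class/net"):
    """Read one integer counter from <base>/<dev>/statistics/<name>."""
    return int((Path(base) / dev / "statistics" / name).read_text().strip())


def drop_delta(dev, interval=5.0, base="/sys/class/net", sleep=time.sleep):
    """Return how much rx_dropped grew over `interval` seconds."""
    before = read_counter(dev, "rx_dropped", base)
    sleep(interval)  # injectable so the function can be exercised without waiting
    after = read_counter(dev, "rx_dropped", base)
    return after - before
```

Calling drop_delta("eth0") once with tcpdump running and once without should show whether the counter really stops growing during a capture.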
Example of statistics (merge of the /sys/class/net/<dev>/statistics and ethtool output):
alloc_rx_buff_failed: 0
collisions: 0
dropped_smbus: 0
multicast: 1644
rx_align_errors: 0
rx_broadcast: 23626
rx_bytes: 1897203
rx_compressed: 0
rx_crc_errors: 0
rx_csum_offload_errors: 0
rx_csum_offload_good: 0
rx_dropped: 4738
rx_errors: 0
rx_fifo_errors: 0
rx_flow_control_xoff: 0
rx_flow_control_xon: 0
rx_frame_errors: 0
rx_length_errors: 0
rx_long_byte_count: 1998731
rx_long_length_errors: 0
rx_missed_errors: 0
rx_multicast: 1644
rx_no_buffer_count: 0
rx_over_errors: 0
rx_packets: 25382
rx_short_length_errors: 0
rx_smbus: 0
tx_aborted_errors: 0
tx_abort_late_coll: 0
tx_broadcast: 7
tx_bytes: 11300
tx_carrier_errors: 0
tx_compressed: 0
tx_deferred_ok: 0
tx_dropped: 0
tx_errors: 0
tx_fifo_errors: 0
tx_flow_control_xoff: 0
tx_flow_control_xon: 0
tx_heartbeat_errors: 0
tx_multicast: 43
tx_multi_coll_ok: 0
tx_packets: 63
tx_restart_queue: 0
tx_single_coll_ok: 0
tx_smbus: 0
tx_tcp_seg_failed: 0
tx_tcp_seg_good: 0
tx_timeout_count: 0
tx_window_errors: 0
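A merged listing like the one above could be produced roughly as follows. This is a sketch under assumptions: it reads every file under /sys/class/net/<dev>/statistics, runs `ethtool -S <dev>`, and assumes the ethtool output consists of "name: value" lines after a header line. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def sysfs_stats(dev, base="/sys/class/net"):
    """Return {counter_name: value} for every file in the statistics directory."""
    stats_dir = Path(base) / dev / "statistics"
    return {p.name: int(p.read_text().strip())
            for p in stats_dir.iterdir() if p.is_file()}


def parse_ethtool_stats(text):
    """Parse 'name: value' lines; lines without an integer value are skipped."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            continue  # e.g. the "NIC statistics:" header line
    return stats


def merged_stats(dev, base="/sys/class/net"):
    """Merge sysfs counters with driver counters reported by ethtool -S."""
    out = subprocess.run(["ethtool", "-S", dev], capture_output=True,
                         text=True, check=True).stdout
    merged = sysfs_stats(dev, base)
    merged.update(parse_ethtool_stats(out))  # ethtool wins on name clashes
    return merged
```

Printing sorted(merged_stats("eth0").items()) one pair per line gives a listing in the same "name: value" shape. The exact set of names depends on the NIC driver.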
Try /sys/class/net/eth0/statistics/ (i.e. for eth0). It's not perfect, but it breaks down errors by transmit/receive and by type: carrier, window, fifo, crc, frame, length (and a few more).

Drops are not the same as "ignored". netstat shows interface-level statistics; a multicast packet ignored by a higher level (layer 3, the IP stack) won't show as a drop (though it might show up as "filtered" in some NIC stats). Statistics may be complicated somewhat by various offload features.

You can get more stats if you have ethtool:

Some statistics depend on the NIC driver, as will their exact meaning. The above is from an Intel e1000. Having looked at a handful of drivers, some collect many more statistics than others (the stats available to ethtool tend to be kept in a separate source file, e.g. drivers/net/ethernet/intel/e1000/e1000_ethtool.c, if you need to rummage).

ethtool -i eth0 will show the driver details; the output of lspci -v should be more detailed, though with a bit of clutter too.

Update

In tg3.c, function tg3_rx(), there's only one place that looks likely, with a tp->rx_dropped++, but the code is littered with gotos, so there are several causes other than the obvious one, i.e. anything with goto drop_it or goto drop_it_no_recycle. (Note that the drop counter is one of the few maintained by the driver; the rest are maintained by the device itself.)

The driver source I have to hand is 3.123. My best guess is this code:
Check the MTU; possible causes are jumbo frames, or slightly oversized Ethernet frames to allow for encapsulation. I cannot explain why tcpdump might change the behaviour; it's not known to change the interface MTU. Note also that you may "see" packets larger than the MTU with tcpdump if TSO/LRO is enabled (explanation).

errors: Poorly or incorrectly negotiated mode and speed, or a damaged network cable.
dropped: Possibly due to iptables or other filtering rules, but more likely due to a lack of network buffer memory.
overrun: Number of times the network interface ran out of buffer space.
carrier: Damaged or poorly connected network cable, or switch problems.
collsns: Number of collisions, which should always be zero on a switched LAN. Non-zero indicates problems negotiating the appropriate duplex mode. A small number that never grows means it happened when the interface came up but hasn't happened since.
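The list above can be turned into a small lookup that prints a hint for every non-zero counter. This is a rough sketch: the hint texts are paraphrased from the list, and the dictionary keys simply reuse the ifconfig / netstat -i column names, which is an assumption about how the caller labels the counters.

```python
# Hints paraphrased from the list above; keys follow the ifconfig / netstat -i
# column names used there (an assumption, not a fixed interface).
HINTS = {
    "errors": "poorly or incorrectly negotiated mode and speed, or damaged cable",
    "dropped": "possibly filtering rules (iptables), more likely lack of buffer memory",
    "overrun": "interface ran out of buffer space",
    "carrier": "damaged or poorly connected cable, or switch problems",
    "collsns": "collisions; check duplex negotiation on a switched LAN",
}


def diagnose(counters):
    """Return one hint line for every known counter that is non-zero."""
    return [f"{name} = {counters[name]}: {hint}"
            for name, hint in HINTS.items()
            if counters.get(name, 0) > 0]
```

With the receive figures from the question, diagnose({"errors": 0, "dropped": 4738, "overrun": 0}) returns a single line for dropped. The hints are starting points for investigation, not a diagnosis.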