[H-GEN] DMA query

Tim Browne dugb at netspace.net.au
Thu May 6 01:07:58 EDT 2004


Jason,
A while ago I was running a FreeBSD box with a Realtek 10/100 card that had both
RJ45 and BNC connectors on it. Even though the card was plugged into a 100T switch
and should, in theory, have auto-detected the link speed, it would always lock up
after a period of time. Disconnecting and reconnecting the cable would temporarily
fix the problem. I discovered that I had to specify 100UTP to the card during boot
to solve the problem. If my memory serves me correctly, there were additional
options passed to the card with ifconfig to specify 100UTP. At the moment I can't
recall what they were, but if you really get stuck I may be able to dig them up.
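
For reference, forcing the media type on FreeBSD generally looks something like
the sketch below. These are not the exact options used on the box described
above. The interface name rl0 (the usual FreeBSD driver name for RTL8139 cards)
and the media keywords 100baseTX and full-duplex are assumptions; `ifconfig -m`
on the interface lists what the card and driver actually accept. The helper
name force_media_cmd is made up for illustration, and it only prints the
command so it can be reviewed before being run as root.

```shell
# Assumed values; override from the environment to match the real hardware.
IFACE="${IFACE:-rl0}"                 # assumption: Realtek card shows up as rl0
MEDIA="${MEDIA:-100baseTX}"           # assumption: 100 Mbit over twisted pair
MEDIAOPT="${MEDIAOPT:-full-duplex}"   # assumption: switch port is full duplex

# Hypothetical helper: build the ifconfig command that disables
# autonegotiation by pinning the media type and duplex.
force_media_cmd() {
    printf 'ifconfig %s media %s mediaopt %s\n' "$1" "$2" "$3"
}

# Dry run: show the command instead of executing it.
force_media_cmd "$IFACE" "$MEDIA" "$MEDIAOPT"
```

To apply the setting at boot, the same media and mediaopt words can be appended
to the interface's ifconfig_<interface> line in /etc/rc.conf.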

Tim Browne.


Jason Parker-Burlingham wrote:
> [ Humbug *General* list - semi-serious discussions about Humbug and     ]
> [ Unix-related topics. Posts from non-subscribed addresses will vanish. ]
> 
> I'm trying to come up with an explanation for strange behavior
> observed on a SAMBA-based fileserver.  The computer in question
> doesn't really have a shortage of problems, but lately it has started
> to cease answering requests from the network---users can't get
> profiles, they call us, usually after unplugging anything that looks
> handy and rebooting like mad.
> 
> I managed to get there quickly enough today to discover that although
> the relevant processes were running, the host didn't appear to be on
> the network (as I recall, it would respond to its own broadcast pings,
> but nothing else would; neither could it be pinged from other hosts).
> 
> So I decided to start working the problem from the physical layer on
> up, but as soon as I unplugged and reinserted the host's network cable
> on the switch, everything started to work again (I had a continuous
> ping running on a machine across the room and noticed almost
> immediately).  So I think I have a badly behaved network card.
> 
> But I think that may not be the true cause:
> 
> I'd been noticing DMA timeouts and IDE resets, too.  The night before
> the events described above, I removed a suspect disk, hoping that'd
> put an end to them, started a backup to tape, and left.  Here's what
> happened, starting with when the kernel detected my using the tape
> unit, and ending with the last messages from the kernel before I
> arrived back at the client's a few hours later:
> 
>    hdd: attached ide-tape driver.
>    ide-tape: hdd <-> ht0: Seagate STT20000A rev 8A51
>    ide-tape: hdd <-> ht0: 1000KBps, 6*54kB buffer, 9720kB pipeline, 110ms tDSC, DMA
>    hdd: error waiting for DMA
>    hdd: dma timeout retry: status=0xd0 { Busy }
>    hdd: DMA disabled
>    hdd: ATAPI reset complete
>    ide-tape: ht0: I/O error, pc =  a, key =  2, asc =  4, ascq =  1
>    [lots of ht0 errors just like above]
>    ide-tape: Couldn't write a filemark
>    ide-tape: ht0: I/O error, pc = 10, key =  2, asc =  4, ascq =  1
>    ide-tape: ht0: I/O error, pc = 10, key =  2, asc =  4, ascq =  1
>    hdc: attached ide-cdrom driver.
>    hdc: ATAPI CD-ROM drive, 128kB Cache, UDMA(33)
>    Uniform CD-ROM driver Revision: 3.12
>    cdrom: This disc doesn't have any tracks I recognize!
> 
> Now, what's curious is that I had disabled and reenabled networking
> (basically, re-running ifconfig) before deciding to work more
> methodically; but the messages from the kernel about eth0 make me
> suspicious:
> 
>    # dmesg|grep eth0:
>    eth0: RealTek RTL8139 at 0xc8868f00, 00:50:bf:3a:3c:9e, IRQ 9
>    eth0:  Identified 8139 chip type 'RTL-8139B'
>    eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
> 
> [this is when I started a packet-sniff after things broke]
>    eth0: Promiscuous mode enabled.
>    eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
>    eth0: link down
>    eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
>    eth0: link down
>    eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
> 
> Notice the unpaired "link up".
> 
> My best theory is that the IDE bus reset tickles the network card into
> an unresponsive state, and reinserting the network cable (or
> rebooting) is sufficient to fix it.  The only problem with this theory
> is that I don't know enough to rule it in or out.
> 
> This may all be moot since the client is getting badly-needed
> replacement hardware.  But I still want to know if I'm just making
> shit up, or if I could conceivably be on the right track.
> 
> (Oh, and backups to tape when you've disabled DMA on all devices are
> *awful*.)
> 
> jason
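
The unpaired "link up" noted in the quoted dmesg output can also be spotted
mechanically. The following is a minimal sketch, assuming the driver logs lines
of the form "eth0: link up" and "eth0: link down" as shown in the quote. The
function name unpaired_link_ups is made up for illustration. It counts every
"link up" that follows another "link up" with no "link down" in between.

```shell
# Hypothetical helper: read kernel log lines on stdin and print how many
# "link up" messages for the given interface were not preceded by a
# matching "link down" since the previous "link up".
unpaired_link_ups() {
    awk -v ifc="$1" '
        $0 ~ (ifc ": link up")   { if (state == "up") n++; state = "up" }
        $0 ~ (ifc ": link down") { state = "down" }
        END { print n + 0 }
    '
}

# Typical use (interface name is an assumption):
#   dmesg | unpaired_link_ups eth0
```

A non-zero count only shows that the driver reported the link coming up twice
in a row; it does not say why.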





