From: Bernard Pidoux F6BVP (mvrhu@dejarnette.com)
Date: Sun May 04 2003 - 12:16:46 EEST
Lars E. Pettersson wrote:
>On Tuesday 29 April 2003 08:12, Kjell Jarl wrote:
>...
>
>
>>- I issue a netrom connect to a distant node.
>>- When connected, I send several "b" commands.
>>- When (unsure of the number) one or two "b" have been sent, there comes
>>a netrom disconnect from the remote station.
>>- In my vnc window, the last thing I saw was another b being sent to my
>>neighboring node after the disconnect arrived.
>>
>>
>
>My kernel panics (it gets hung in the interrupt handler) seem to come when I
>have a netrom connection with outstanding frames (if I remember correctly)
>and we get a timeout from the ax25 connection. When the ax25 connection,
>initiated by the netrom connect, times out, we get the hang.
>
>Anyone with kernel knowledge that gets any wiser by this?
>
>73 de Lars, sm6rpz
>
>
I have already reported my findings about kernel 2.4.x panics to the list.
For me there is no question about the origin of the problem: it is not
specifically related to netrom but to ax25.
Here I am using a serial mkiss interface, and I think the problem is
related to the serial management part of the code, which makes intensive
use of clear interrupt (cli) instructions. In 2.4 kernels interrupts are
handled differently than in 2.2 kernels, through a new mechanism called
softirq, which apparently is unable to recover from all the interrupts
generated by the ax25 code.
I have traced the oops five times following the recommendations in
Documentation/oops-tracing.txt (see also the REPORTING-BUGS and README
files in /usr/src/linux/).
All reports gave the same message:
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
To make it short (I have five complete trace listings with the addresses
of the last routines processed by the CPU): the sequence of 29 routines
given by the trace before the kernel panics is not always exactly the
same, but it always starts at
sock_def_write_space (sock.c)
and the last 13 are always the same, beginning with do_sysctl_strategy
in sysctl.c and finishing with ksoftirq in softirq.c.
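The comparison described above (finding the part of the call sequence that is identical at the end of every trace) can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes each trace has already been reduced to an ordered list of routine names, and the helper name common_tail and the entries "routine_a" and "routine_b" are hypothetical placeholders, not taken from the real listings.

```python
def common_tail(traces):
    """Return the longest run of routine names shared by the end of every trace."""
    if not traces:
        return []
    tail = []
    # Walk all traces backwards in step and stop at the first disagreement.
    for names in zip(*(reversed(t) for t in traces)):
        if len(set(names)) != 1:
            break
        tail.append(names[0])
    tail.reverse()
    return tail


# Placeholder traces: only the routine names quoted in the message are real;
# "routine_a" and "routine_b" are invented for illustration.
traces = [
    ["sock_def_write_space", "routine_a", "do_sysctl_strategy", "ksoftirq"],
    ["sock_def_write_space", "routine_b", "do_sysctl_strategy", "ksoftirq"],
]
print(common_tail(traces))
```

Applied to the real listings, the shared tail would be the 13 routines that are common to all five traces.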
I guess the important point is how the code sequence leading to
a kernel panic is started and which routines are involved.
In my case, the routines ax25_rcv, which issues a lot of cli() instructions,
and ax25_kiss_rcv were often involved in the fatal sequence.
I am aware that a lot of code cleanup is being performed in 2.5.x
kernels following the decision to remove the cli() / sti() mechanism. This
should prevent the system from hanging, but so far I have not been able to
run such an experimental kernel (no display at boot!).
I would certainly be interested in testing ax25 after these intensive
modifications to the interrupt mechanism.
73 de Bernard F6BVP