LisaEm Bugs - LisaTest crash on CPU

rayarachelian:
Nothing useful yet. The last bits of code before this happens turn on the VIA1 IRQ for CA1, then there's a time delay. When the vertical retrace happens, the ISR for autovector 1 is either not loaded or is in a bad page, so a bus error occurs; but the bus error vector is zero, so it dies.
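To make the failure chain concrete, here is a minimal sketch of where those two vectors live on a 68000: bus error is vector 2 and the level 1 autovector is vector 25, each a 32-bit big-endian handler address at vector number times 4. The helper names and the flat `ram` image are assumptions for illustration only; they are not LisaEm functions, and on the real Lisa these accesses go through the MMU.

```c
#include <assert.h>
#include <stdint.h>

/* Vector numbers from the 68000 exception vector table. */
enum {
    VEC_BUS_ERROR    = 2,   /* bus error */
    VEC_AUTOVECTOR_1 = 25   /* level 1 interrupt autovector */
};

/* Each vector is a 32-bit handler address stored at vector_number * 4. */
uint32_t vector_address(unsigned vector)
{
    assert(vector < 256);
    return (uint32_t)vector * 4u;
}

/* Hypothetical helper: read a big-endian 32-bit vector from a flat RAM
 * image. The caller must supply at least 1024 bytes (256 vectors). */
uint32_t read_vector(const uint8_t *ram, unsigned vector)
{
    uint32_t a = vector_address(vector);
    return ((uint32_t)ram[a] << 24) | ((uint32_t)ram[a + 1] << 16) |
           ((uint32_t)ram[a + 2] << 8) | (uint32_t)ram[a + 3];
}

/* A vector of zero means no handler address was installed. */
int vector_is_unset(const uint8_t *ram, unsigned vector)
{
    return read_vector(ram, vector) == 0;
}
```

Run against a RAM dump taken just before the crash, a check like `vector_is_unset(ram, VEC_BUS_ERROR)` would confirm the "bus error vector is zero" part of the diagnosis.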

The only odd thing is this MMU mismatch warning, but when I check the RAM dump and MMU dump, I see they all have valid values, and the MMU cache matches the MMU registers, so I'm not sure what's wrong. (Look for *MISMATCH*.)

I don't think LisaTest sets up a vector for bus errors, so this is an unexpected event. The real question then is why this IRQ happened and wasn't handled. I do see some code before the mismatches that seems to copy low RAM elsewhere, though, so perhaps it was getting ready to test low memory before this happened.

I do see some of the code winds up executing with the high address byte set (on the Lisa and pre-32-bit-clean Macs the high address byte is ignored and is used for flags). It does look like LOS uses this too, and some of that is addressed in 1.2.7, but I don't see any obvious thing that I can point at and say where things went wrong to cause the bus error to be triggered. So whatever the bug is, it's still hidden.
Here's a bit of the cleaned-up tracelog, as well as a larger uncleaned one towards the end, but I don't know what else is useful there.
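For anyone following along, the high-byte behaviour boils down to masking: the 68000 only drives 24 address lines, so software can park flags in the top byte. A minimal sketch, with helper names that are assumptions for illustration and not taken from the LisaEm source (the mask itself is the same `0x00ffffff` used in the emulator code quoted later in this thread):

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MASK_24 0x00ffffffu  /* the 24 address bits the CPU actually uses */

/* Strip the ignored high byte, leaving the address the bus would see. */
uint32_t addr24(uint32_t addr32)
{
    return addr32 & ADDR_MASK_24;
}

/* Extract whatever software stored in the ignored high byte. */
uint8_t addr_flags(uint32_t addr32)
{
    return (uint8_t)(addr32 >> 24);
}

/* Two 32-bit values refer to the same location if they differ only in
 * the high byte. */
int same_location(uint32_t a, uint32_t b)
{
    return addr24(a) == addr24(b);
}
```

Any lookup keyed on the full 32-bit value instead of `addr24()` would treat flagged and unflagged copies of the same address as different locations, which is one way a cache could end up out of step with memory.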

rayarachelian:
So I've stopped looking to fix this on 1.2.6 and instead am trying to fix this (or other related bugs) on 1.2.7. There's some code in reg68k.c that allocates IPCTs (Instruction Pointer Cache Tables) which libGenerator uses as an instruction cache.

I think either something is clobbering the data stored in that cache or in the MMU pointers, or something else is causing corruption there.

I've eliminated the free IPC table as the source of the bug by adding code that walks through the free IPCTs and checks that they are initialized to zero and that none are "dirty". The bug doesn't happen there.
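A check of that kind can be sketched roughly as follows. The struct layout, the `next` link, and the table size of 256 entries are assumptions made up for this example; the real IPCT definitions in libGenerator and reg68k.c will differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_IPCS_PER_TABLE 256  /* assumed table size */

/* Hypothetical stand-ins for the real IPC and IPCT structures. */
typedef struct sketch_ipc {
    void (*function)(void);  /* handler for the decoded instruction */
    uint16_t opcode;         /* opcode word this entry was built from */
} sketch_ipc;

typedef struct sketch_ipct {
    sketch_ipc ipc[SKETCH_IPCS_PER_TABLE];
    struct sketch_ipct *next;  /* assumed free-list link */
} sketch_ipct;

/* Walk the free list and count tables holding any non-zero entry.
 * Fields are compared one by one rather than with memcmp so that
 * struct padding cannot produce false positives. */
size_t count_dirty_free_tables(const sketch_ipct *free_list)
{
    size_t dirty = 0;
    for (const sketch_ipct *t = free_list; t != NULL; t = t->next) {
        for (size_t i = 0; i < SKETCH_IPCS_PER_TABLE; i++) {
            if (t->ipc[i].function != NULL || t->ipc[i].opcode != 0) {
                dirty++;
                break;  /* one dirty entry is enough to flag the table */
            }
        }
    }
    return dirty;
}
```

A result of zero after every allocation and release is what rules the free list out, as described above.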

The symptom in 1.2.7 is a segfault in this code when following ipc->opcode.
According to gdb, ipc is not NULL, but dereferencing it causes a crash, which means it's pointing to the wrong place: either something has corrupted it, or invalid pointers were somehow written to it, which should never happen. It's possible some other code somewhere is clobbering either the IPCTs or the MMU translation cache.

This code is a little bit unoptimized vs 1.2.6 because I'm trying to track down exactly where the segfault is happening. Using -O3 optimizes some of the variables out, so it's hard to tell; the segfault happens on if (flag).


--- Code: ---
               abort_opcode=2;

               #ifndef EVALUATE_EACH_IPC
               static int flag;
               flag=(ipc==NULL);
               if (!flag) flag=(ipc->function==NULL);

               if (!flag)
                  {  uint16 myword=fetchword(pc24 & 0x00ffffff);
                     if (ipc->opcode!=myword) flag=1;
                  }
               if (flag) //==13256== Conditional jump or move depends on uninitialised value(s)
               #endif
                {
                    if (abort_opcode==1) break;
                    if (!mt->table) mt->table=get_ipct();  //we can skip free_ipct
                    cpu68k_makeipclist(pc24 & 0x00ffffff); if (abort_opcode==1) break; //==24726== Conditional jump or move depends on uninitialised value(s)
                    ipc=&(mt->table->ipc[(pc24 & 0x1ff)>>1]);
                }
                abort_opcode=0;

--- End code ---
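Pulled out of the surrounding loop, the lookup and the staleness test in that snippet amount to roughly the following. This is a sketch with made-up type and function names, not the actual reg68k.c code; the 512-byte window with one entry per 16-bit word is inferred from the `(pc24 & 0x1ff)>>1` expression above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real IPC structure. */
typedef struct lookup_ipc {
    void (*function)(void);
    uint16_t opcode;
} lookup_ipc;

/* Placeholder handler so an entry can look "built" in examples. */
void lookup_nop(void) {}

/* Index of the entry for a given PC: drop the high byte, keep the
 * offset inside the 512-byte window, divide by two (word-sized slots). */
size_t ipc_index(uint32_t pc)
{
    return ((pc & 0x00ffffffu) & 0x1ffu) >> 1;
}

/* An entry must be rebuilt if it is missing, was never filled in, or
 * was decoded from a different opcode than the one now in memory. */
int ipc_needs_rebuild(const lookup_ipc *ipc, uint16_t word_in_memory)
{
    if (ipc == NULL) return 1;
    if (ipc->function == NULL) return 1;
    return ipc->opcode != word_in_memory;
}
```

Note that none of these tests can detect a pointer that is non-NULL but garbage: `ipc_needs_rebuild` would dereference it and crash, which matches the symptom gdb shows.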

rayarachelian:
Found a minor initialization bug that was causing the 1.2.7 boot-up crashes on LisaTest. When I use calloc to allocate memory dynamically, that guarantees the allocated space is initialized to zero. Memory that doesn't come from calloc (malloc'd blocks, or arrays declared as automatic variables) carries no such guarantee, and was apparently garbage-filled. I did clear most of the fields in mmu_trans, except the table pointer, which had random junk in it; this caused the segfault in the previous reply.
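The general shape of the fix is to reset every field explicitly, pointer included, no matter where the storage came from. A minimal sketch follows; the struct is a made-up two-field stand-in and the function names are hypothetical, since the real mmu_trans entries have more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an MMU translation cache entry. */
typedef struct sketch_mmu_trans {
    uint32_t address;  /* placeholder for the translation fields */
    void *table;       /* pointer to the IPC table for this page, or NULL */
} sketch_mmu_trans;

/* Reset every field by assignment, including the table pointer, so the
 * entries are valid whether the storage was zeroed beforehand or not. */
void sketch_mmu_trans_reset(sketch_mmu_trans *e, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        e[i].address = 0;
        e[i].table = NULL;
    }
}

/* calloc zero-fills the block; the explicit reset on top of it keeps
 * the pointers NULL without relying on all-bits-zero meaning NULL. */
sketch_mmu_trans *sketch_mmu_trans_alloc(size_t n)
{
    sketch_mmu_trans *e = calloc(n, sizeof *e);
    if (e != NULL) sketch_mmu_trans_reset(e, n);
    return e;
}
```

With the pointer reliably NULL, a test such as `if (!mt->table) mt->table=get_ipct();` in the earlier snippet takes the allocation path instead of following junk.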

This might or might not be the issue with LisaTest failing. As mentioned in another post tonight, I had some mouse movement issues that prevented running LisaTest properly, but it did expose a possible easter egg in Monitor OS (or perhaps it's just not well known).
