LisaList2

Author Topic: Troubleshooting Bus Errors  (Read 10939 times)

sigma7

  • Administrator
  • Sr. Member
  • *****
  • Karma: +148/-1
  • Offline Offline
  • Posts: 394
  • Warning: Memory errors found. Verify comments.
Troubleshooting Bus Errors
« on: May 10, 2022, 05:52:36 pm »

Here is a first go at troubleshooting bus errors... if it turns into something useful hopefully it will end up in the FAQ. -- corrections & suggestions requested!

Debugging Bus Errors caused by hardware issues v.2022-05-14-A

A bus error is generated if a timeout occurs when the CPU attempts to read or write to an address (memory or I/O).

There are circumstances when bus errors are expected, such as when checking to see if an expansion card is present in a slot. In these cases the software intercepts the bus error and carries on.

In the case where a bus error is not expected, it indicates a hardware or software fault, and so typically an error message is shown.

The bus timeout signal is generated by the 556 timer beside the CPU. It is re-started at the beginning of each bus cycle, so the timeout only occurs when a bus cycle does not terminate properly.

Bus cycles are terminated by an acknowledgement signal, which signifies to the CPU that the requested data is now available (for a read), or has been accepted by the hardware (for a write).

There are two acknowledgement signals, VPA and DTACK. VPA provides compatibility with older 6800 peripheral timing: when the CPU receives VPA, it slows down completion of the bus cycle to match. DTACK is the native 68000 signal, indicating that the bus cycle can terminate more or less immediately.

In the Lisa hardware, VPA and DTACK are generated by different circuits depending on what hardware is accessed, ie. the I/O board generates DTACK or VPA for accesses to hardware on the I/O board; the CPU board generates DTACK or VPA for accesses to the various hardware control registers, MMU, and ROM on the CPU board, as well as DTACK for slot RAM; and individual expansion slot cards generate their own DTACK or VPA.

So when a bus error occurs, the fault may be on any of these boards and so you may be in for some interesting troubleshooting.

Swapping boards with a working machine is helpful to isolate the problem board, but even having isolated a particular board, there is usually more than one source of VPA/DTACK to check.

If one knows the address that generated the bus error, that usually will narrow down the circuit substantially. While the Lisa is still operating from the CPU ROM, it will try to help provide the information needed...

Page 25 of the Lisa Boot ROM Manual v1.3 (Feb84) indicates that when an exception (such as a bus error) occurs, the address is stored in the long word at $282. If this is not populated with the address that caused the bus error, one will need to use another technique to isolate the problem (see reply below).

The default memory map setup programmed into the MMU by the ROM looks like this:

    000000 - 1FFFFF* Slot Memory (a combination of both slots up to 2MB), with the video page using the highest 32K
                               *If there is less than 2MB, then the memory range will be smaller, eg. 1MB 000000 - 0FFFFF
                               Thanks to the magic of the MMU, the slot memory always starts at address 0 regardless of
                               which slot or how much memory is populated
                               A Lisa modified to support 4MB is special as the additional address line bypasses the MMU, but
                               roughly speaking the slot memory is mapped up to 3FFFFF

    200000*- FBFFFF This is unmapped space, ie. "nothing" is mapped in this address range, and so a bus error will occur if access to it is attempted. When an operating environment is loaded, this is likely to change.

    FC0000 - FC3FFF Slot 1  Note that it is up to each individual expansion card to
    FC4000 - FC7FFF Slot 2  determine which addresses to respond to in its space
    FC8000 - FCBFFF Slot 3

    FCC000 - FCCFFF Floppy Disk Controller shared memory
    FCD000 - FCDFFF I/O Board Hardware
    FCE000 - FCEFFF CPU Board Hardware

    FCF000 - FDFFFF Also unmapped (?)

    FE0000 - FFFFFF CPU ROM

Lisa addresses are 3 bytes long, as the 68000 is limited to 16MB of address space (the 68020 and later have more). The high byte of a long word address is ignored by the hardware and almost always ignored by software, although various components of Mac software use the high byte for flags, eg. to indicate a locked handle. Service mode pads an address you enter with leading zeros, but if you are putting addresses into memory, you will likely need to provide the leading byte, eg. 00xx yyzz.
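
As a rough aid for matching a faulting address to a board, the default map above can be expressed as a small lookup. This is only a sketch: the function name classify and the ram_top parameter are made up for illustration, it assumes the ROM's default mapping is still in effect, and it masks off the high byte as described above.

```python
# Default ROM memory map as listed above (inclusive ranges, 24-bit addresses).
# Only valid while the ROM's initial MMU setup is still in effect.
REGIONS = [
    (0xFC0000, 0xFC3FFF, "Slot 1"),
    (0xFC4000, 0xFC7FFF, "Slot 2"),
    (0xFC8000, 0xFCBFFF, "Slot 3"),
    (0xFCC000, 0xFCCFFF, "Floppy Disk Controller shared memory"),
    (0xFCD000, 0xFCDFFF, "I/O Board Hardware"),
    (0xFCE000, 0xFCEFFF, "CPU Board Hardware"),
    (0xFE0000, 0xFFFFFF, "CPU ROM"),
]


def classify(address, ram_top=0x1FFFFF):
    """Name the region an address falls in.

    ram_top is a hypothetical parameter: the highest slot memory address
    (0x1FFFFF for 2MB, 0x0FFFFF for 1MB, and so on).
    """
    addr = address & 0xFFFFFF  # the high byte of a long word is ignored
    if addr <= ram_top:
        return "Slot Memory"
    for low, high, name in REGIONS:
        if low <= addr <= high:
            return name
    return "Unmapped"  # an access here is expected to end in a bus error


print(classify(0x00FCDD81))
print(classify(0x580000))
```

With a 1MB machine one would call classify(addr, ram_top=0x0FFFFF) so that addresses from 100000 upward are reported as unmapped.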

Once the problem board is isolated, the missing DTACK/VPA signal path can be investigated.

For example, by inspecting "Schematic System I/O Lisa" "050-4008-" page 2 of 5, we see that VPA can be generated by U10D-3 (when one of the VIAs is addressed) or U5E-8 (when the SCC is addressed).

In addition, page 3 shows that DTACK is generated by U5E-6 when the 9512 is addressed, and controlled by the 9512 "Pause" signal that will delay the termination of the bus cycle until it has finished its operation.

Once the problem device is determined, one can examine activity on the signal path at the time of the bus error to isolate where the problem lies at the chip level.

Additional addressing details that may be useful to know (but probably unnecessary) ...

Later versions of the Lisa Hardware Reference Manual have an errata page that indicates the I/O mapping in the manual proper is incorrect. However that errata assumes the I/O mapping will persist as initially set up by the ROM, which is not necessarily true. The MMU can be used to map I/O to many different places/pages in the address range of the 68000, and even to more than one page. The manual proper is written such that it doesn't assume where the I/O will be mapped and so provides offsets from the beginning of that page rather than absolute addresses. In most cases one can use the addresses provided in the errata, but there may be an operating environment where they are not correct (likely suspects are limited to LOS and the Workshop).

One simple example is the SCSI expansion card. On the Mac Plus, the SCSI port base address is $580000. For compatibility with Mac Plus software, MacWorks +/II will map an alias of the expansion card slot that contains the SCSI card such that the 5380 SCSI chip also appears at $580000 (as well as in the $FCxxxx space).

The CPU and I/O board hardware addresses are not fully decoded. This means that the devices will also appear at other locations in the address space.

For example, by inspecting "Schematic System I/O Lisa" "050-4008-" page 2 of 5 we see that U7C, the Keyboard & Mouse 6522 VIA, has chip select pin 23 driven by a circuit that decodes A10, A11, and A12 (as well as INTIO, which is generated by the CPU board and asserted for eg. FCC000 - FCDFFF). Also note that A1-A4 go to U7C to select 1 of 16 addresses in the chip. As A5-A9 are not decoded, the VIA addresses will be aliased at any variation of A5-A9, eg. as well as the typical base address of FCDD81, U7C will also be found at FCDE81, FCDDC1, etc.
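
To see what "any variation of A5-A9" means in practice, here is a short sketch that lists every alias of a given address by trying all combinations of the undecoded lines. The function name aliases and the constant UNDECODED_LINES are invented for this example; the only facts taken from the schematic discussion above are that A5-A9 are not decoded and that FCDD81 is the typical base address.

```python
from itertools import product

# Address lines that the U7C chip select circuit does not decode (A5-A9).
UNDECODED_LINES = (5, 6, 7, 8, 9)


def aliases(base):
    """Return every address that differs from base only in A5-A9."""
    mask = sum(1 << line for line in UNDECODED_LINES)
    cleared = base & ~mask  # base address with A5-A9 all forced to 0
    found = set()
    for bits in product((0, 1), repeat=len(UNDECODED_LINES)):
        addr = cleared
        for line, bit in zip(UNDECODED_LINES, bits):
            addr |= bit << line
        found.add(addr)
    return sorted(found)


via_aliases = aliases(0xFCDD81)
print(len(via_aliases), "aliases")
print(", ".join(f"{a:06X}" for a in via_aliases[:4]), "...")
```

Five undecoded lines give 2^5 = 32 locations for the same register, all of them inside the FCD000 - FCDFFF I/O board range.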

edit 1: 2022-05-10-A corrected VMA to VPA
edit 2: 2022-05-10-B fixed unmapped area below ROM, reference to 4MB modification
edit 3: 2022-05-11-A added SCSI port example
edit 4: 2022-05-11-B changed designation of I/O schematic as variant available on bitsavers differs http://www.bitsavers.org/pdf/apple/lisa/hardware/050-4008-H_IO.pdf
edit 5: 2022-05-14-A Corrected that DTACK for RAM is generated by the CPU board, not on the memory board
« Last Edit: May 14, 2022, 05:20:41 pm by sigma7 »
Warning: Memory errors found. ECC non-functional. Verify comments if accuracy is important to you.

rayarachelian

  • Administrator
  • Hero Member
  • *****
  • Karma: +105/-0
  • Offline Offline
  • Posts: 772
  • writing the code,writing the code,writing the code
    • LisaEm
Re: Debugging Bus Errors
« Reply #1 on: May 10, 2022, 08:19:29 pm »

Many thanks for this excellent write up!
You don't know what it's like, you don't have a clue, if you did you'd find yourselves doing the same thing, too, Writing the code, Writing the code

sigma7

Re: Troubleshooting Bus Errors
« Reply #2 on: May 10, 2022, 11:14:51 pm »

"What if I don't know the address that is causing the bus error?"

More Difficult Bus Errors v.2022-05-15-A

There may be circumstances when you don't know the address causing the bus error. For example, you may not be able to get into service mode because the hardware isn't working well enough to get there.

In this case one has to troubleshoot solely by examining what the hardware is doing.

On startup, the ROM does not expect to encounter any Bus Errors in a working system until the point where expansion card slots are tested and found empty. Since this is the last step of the self-test, a bus error generated during the self test due to a problem with the Lisa hardware is likely to be the first bus error that occurs.

Troubleshooting

This is straightforward to troubleshoot with a logic analyzer, so assuming that one is not available...

If an oscilloscope is available, one can trigger off the Bus Timeout signal on the CPU board (eg. U15B pin 9). Then use the other channel to probe signals of interest just at/before the time the fatal bus error occurs after reset.

If an oscilloscope or logic analyzer is not available, you are likely in advanced difficulty mode and will need to be creative. Even with an oscilloscope the following may be helpful for efficient troubleshooting...

Disable the bus timeout delay
  • If the problem bus error is the first encountered after reset, then one can disable the bus timeout by grounding pin 8 of U15B on the CPU board. In this configuration the CPU will wait forever for the problematic bus cycle to complete, and one can investigate the state of signals with even a simple voltmeter or logic probe.

Extend the bus timeout delay
  • If the bus error is not the first after reset (eg. occurs while loading an operating environment), then the bus timeout may be extended instead of disabled.

    On the CPU board, U15B pin 8 is the signal that resets the bus timeout timer. As long as pin 8 is mostly low (perhaps toggling rapidly between low and not quite low), the CPU is executing bus cycles normally. When a bus error is in progress, pin 8 will rise in voltage to approximately 3.3V which then triggers the bus timeout (the voltage at pin 8 will be changing slowly as the timeout capacitor charges). If the timeout delay is long enough, one has time to investigate signals around the Lisa.

    During the long timeout delay of the problematic bus cycle, one can disable the bus timeout as described above.

    The 556 timer's timeout period is generally given by t = 1.1RC
    The stock values of C9 = 0.01uF and R3 = 5.1K give a timeout of about 56 us
    To increase the timeout to approximately 1 second, one could change R3 to 100K and C9 to 10uF

    While experimenting with a long delay (~60 seconds), the Lisa would reset at the end of the self test, as if in power cycle mode. It is not clear whether this is due to ROM programming (ie. this signifies power cycle mode) or some unexpected consequence of using a long bus timeout (more testing needed).
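
For anyone wanting to try other component values, the 1.1RC arithmetic above can be checked with a few lines. This is just the generic timer formula applied to the values quoted in this post; the function name timeout_seconds is made up for the example, and real parts will vary with component tolerance.

```python
def timeout_seconds(r_ohms, c_farads):
    """Approximate 556 timeout period using t = 1.1 * R * C."""
    return 1.1 * r_ohms * c_farads


# Stock values: R3 = 5.1K, C9 = 0.01uF
stock = timeout_seconds(5.1e3, 0.01e-6)

# Suggested values for a roughly 1 second timeout: R3 = 100K, C9 = 10uF
extended = timeout_seconds(100e3, 10e-6)

print(f"stock timeout:    {stock * 1e6:.1f} us")
print(f"extended timeout: {extended:.2f} s")
```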
     
Eliminating the bus error caused by an empty expansion slot
  • If the problem bus error is the first to occur after checking expansion slots, it may be desirable to disable the bus timeout and have the slots appear populated (so they do not generate bus errors).

    The straightforward solution is to install 3 working expansion cards, but if that is not possible, one can make an empty expansion slot appear populated with a ghost card by adding a diode to assert DTACK when the empty slot is accessed.

    To do this, one diode is required for each slot to be affected, with its cathode (banded end) connected to /SLn on the CPU board:
    • U13F-15 for slot 1
    • U13F-13 for slot 2
    • U13F-11 for slot 3

    The anode of each diode is connected to DTACK:
    • U3A-1 on the CPU board

    This modification prevents a bus error due to an empty expansion slot from occurring during the self-test. At the end of the tests there may be an error message showing "bad expansion slot card", which is due to the ghost expansion card not having a valid identification ROM; regardless of this error, one can then proceed to the startup-from menu and continue to load an operating environment.

Double Bus Faults

There is a more unusual hardware error called a "Double Bus Fault"; this is what happens when another bus error occurs while trying to handle a bus error exception (ie. what would be an endless loop of bus error exceptions). A Double Bus Fault causes the CPU to halt and stop attempting to process instructions. A Double Bus Fault caused by hardware is probably due to some problem with the core bus cycle logic on the CPU board, in which case it is likely the Lisa will not run at all. The procedure above to troubleshoot Bus Errors is probably useful in this case too.

A Double Bus Fault can also be caused by software, eg. if the bus error vector (at $000008) points to unmapped space, or address $8 itself is unmapped (presumably due to an error in MMU programming), then a bus error will result in a Double Bus Fault.

edit 1: 2022-05-14-A Substantially rewritten due to erroneous thought that bus errors may occur before empty slots are encountered
edit 2: 2022-05-15-A Updated after testing ghost expansion card modification, formatting adjusted in an attempt to improve readability
« Last Edit: May 15, 2022, 04:45:15 pm by sigma7 »

sigma7

Re: Troubleshooting Bus Errors
« Reply #3 on: May 14, 2022, 05:21:38 pm »

Major rewrite

Note posting above: "Substantially rewritten due to erroneous thought that bus errors may occur before empty slots are encountered"

The technique described to hold a problematic bus cycle indefinitely for hardware inspection may simplify troubleshooting substantially!

It seems that the CPU board generates DTACK for all accesses to addresses that the MMU has been programmed to treat as RAM.

To determine the amount of RAM installed, the MMU is initially programmed to address 2MB of memory (4 MB in 3A 'square pixel' or correspondingly modified ROMs), then the ROM code tests areas of that RAM to see if it is indeed unique memory or just empty space. Once the size and physical address of RAM blocks is determined, the MMU is re-programmed to make the RAM contiguous starting at logical address $0.
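
The idea of probing for RAM and then remapping it to be contiguous can be sketched as below. This is an illustration only, not the ROM's actual algorithm: the 128K block size, the function names, and the use of a set to stand in for "this block responds like real memory" are all assumptions made for the example.

```python
BLOCK = 0x20000  # 128K probe granularity, an assumption for this sketch


def find_ram_blocks(populated, span=0x200000):
    """Return physical base addresses of blocks that respond like RAM.

    populated is a set of block base addresses standing in for the hardware;
    real code would write a pattern to each block and read it back.
    """
    return [base for base in range(0, span, BLOCK) if base in populated]


def build_logical_map(physical_blocks):
    """Assign logical addresses, contiguous from 0, to the blocks found."""
    return {index * BLOCK: phys for index, phys in enumerate(physical_blocks)}


# Hypothetical example: 512K of RAM that physically sits at 080000 - 0FFFFF
populated = {0x080000, 0x0A0000, 0x0C0000, 0x0E0000}
blocks = find_ram_blocks(populated)
logical_map = build_logical_map(blocks)

for logical, physical in logical_map.items():
    print(f"logical {logical:06X} -> physical {physical:06X}")
```

However the memory is physically arranged, the resulting map starts at logical address 0 with no gaps, which is the behaviour described above.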

« Last Edit: May 15, 2022, 04:41:47 pm by sigma7 »