LisaList2


Author Topic: Re: Xenix adb reverse engineering (informix brand/serial)  (Read 5541 times)

rayarachelian

  • Administrator
  • Hero Member
  • *****
  • Karma: +101/-0
  • Offline Offline
  • Posts: 772
  • writing the code,writing the code,writing the code
    • LisaEm
Re: Xenix adb reverse engineering (informix brand/serial)
« on: October 19, 2020, 03:10:30 pm »

So about Informix: as usual, these are tar floppies, that is, they were made by using the tar command to archive the binaries into a single tar file that spans multiple floppy images. To make life easier, I wrote a small program to merge the 3 Disk Copy disk images into a single tar file, so I can play with them in Linux, and not just on the Lisa. This will let me disassemble them. So dc42-to-tar will be included in the release (RC4). It can be used for other software distributed for Xenix, and will likely work for UniPlus as well.
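
A minimal sketch of the merge step, not the actual dc42-to-tar source (which is not shown in this thread). It assumes the standard 84-byte Disk Copy 4.2 header (data size at offset 0x40, tag size at 0x44, magic word 0x0100 at 0x52) and, as a further assumption, that the tar stream simply continues from one disk's data area into the next with no per-disk header. The names `dc42_data` and `merge_images` are made up for illustration.

```python
import struct

DC42_HEADER_SIZE = 84   # fixed-size Disk Copy 4.2 header
DC42_MAGIC = 0x0100     # big-endian word at offset 0x52


def dc42_data(image):
    """Return only the sector data of one Disk Copy 4.2 image (bytes)."""
    if len(image) < DC42_HEADER_SIZE:
        raise ValueError("too short to be a DC42 image")
    data_size, _tag_size = struct.unpack_from(">II", image, 0x40)
    (magic,) = struct.unpack_from(">H", image, 0x52)
    if magic != DC42_MAGIC:
        raise ValueError("missing DC42 magic word")
    end = DC42_HEADER_SIZE + data_size
    if end > len(image):
        raise ValueError("image is truncated")
    # Tag bytes, if any, follow the data area and are dropped here.
    return image[DC42_HEADER_SIZE:end]


def merge_images(images):
    """Concatenate the data areas of several images, in disk order."""
    return b"".join(dc42_data(img) for img in images)
```

Writing the result of `merge_images` to a file should give something `tar xvf` can read on Linux, provided the assumption about how the archive spans disks holds. Unused space at the end of a disk is not handled here.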

So I was able to extract these guys:

Code: [Select]
./informix$ tar xvf ../floppy-tar.tar
usr/bin/informix
usr/bin/dbbuild
usr/bin/dbstatus
usr/bin/enter1
usr/bin/perform
usr/bin/formbuild
./informix$ ls -l usr/bin/
total 251
-rwxr-xr-x 1 ray ray 80004 Jun 28  1984 dbbuild
-rwxr-xr-x 1 ray ray 73446 Jun 28  1984 dbstatus
-rwxr-xr-x 1 ray ray 65586 Jun 28  1984 enter1
-rwxr-xr-x 1 ray ray 48128 Jun 28  1984 formbuild
-rwxr-xr-x 1 ray ray 42192 Jun 28  1984 informix
-rwxr-xr-x 1 ray ray 94420 Jun 28  1984 perform

Now the bad/good news is that all of them do their own serial number check, so I'll have to crack each of these binaries, but most likely they'll use the same kind of code, so should be somewhat portable.

I dug up my notes on how to use adb for this.

On the informix binary I found (with adb) the offset to the string "Invalid Serial Number" and so now I'll need to step through a run to figure out what function prints this, and what it did right before that and find some branch that I can change.
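
Once the binary has been extracted on Linux, the same string hunt can be sketched outside adb. This is only an illustration: `find_all` is a made-up helper, and the offsets it returns are file offsets, which will differ from the addresses adb shows by whatever the executable header and segment layout add (that mapping is not worked out here).

```python
def find_all(data, needle):
    """Return every file offset at which `needle` occurs in `data`."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

Typical use would be something like `find_all(open("usr/bin/informix", "rb").read(), b"Invalid Serial Number")`.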

Unfortunately these binaries have been stripped. That shouldn't matter too much, but it would have been easier with symbols, like (I forget which) either brand or Multiplan had left in. I expect/hope that once I find the serial check function, it will be the same across all the binaries.

In case anyone cares, here are some adb commands (mostly as a quick reference for my future self when I need to do this again for some other Xenix stuff):
Code: [Select]
0 - set dot address to address 0 (or if it had symbols you could type in _main or __start, etc.)
?i disassemble a single instruction at the current address.
?16i disassemble 16 instructions
?x  hex dump one 16-bit word
?32x hex dump 32 16-bit words
?32X hex dump 32 32-bit words
?s print strings as ascii
?w 0x1234 - write word 0x1234 at the current address - must start adb with -w but note this will immediately overwrite the binary as well so keep a backup before running adb -w
?W 0x12345678 - write 0x12345678 32 bit word at the current address - must start adb with -w but note this will overwrite the binary as well

0x800050:b set breakpoint at 0x800050
:b set a breakpoint at the current address
:d delete the breakpoint at the current address
:r run the program - can interrupt with control-c if you time it right.
:e single step (:s does not work, it's equivalent to continue)
:c continue run

 $b - display breakpoints
 $e - environment
 $r - registers
 $m - map
 $c - stack trace - shows who called what and what args were passed, so very useful.


You can find more adb here: https://docs.oracle.com/cd/E19455-01/806-0624/6j9vek509/index.html - but this is from a much newer version of adb.
there is also this guy: https://wolfram.schneider.org/bsd/7thEdManVol2/adb/adb.pdf
« Last Edit: October 22, 2020, 10:34:35 am by rayarachelian »
Logged
You don't know what it's like, you don't have a clue, if you did you'd find yourselves doing the same thing, too, Writing the code, Writing the code

rayarachelian

Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #1 on: October 22, 2020, 09:10:47 am »

Meh, I've been setting breakpoints and trying to break this since Monday night, but not getting very far. I got several stack traces, and found a function call that takes the "INFORMIX Master Menu" logo string as a parameter, but it's taking too long to do this on an actual Lisa.

Instead, I'm going back to trying to get Xenix working on LisaEm, that way I can use the tracelog to look at the entire run instead. I have some notes in a notebook I took of what subroutines when called will cause the program to quit and what they in turn call and where they succeed vs eventually quit.

The most annoying thing is that :s doesn't step in Xenix adb, rather it's equivalent to :c and continues the run of the program, so at first I was disassembling basic blocks and whenever it was about to do a JSR or BSR or a branch, I'd set two breakpoints and see where it stopped and noted it down on paper. Then repeated again and again and again.

Eventually I found that :e does step properly, and have been using that, only it doesn't output anything at all, so I also have to do ?i and then $r to see the instruction and the register changes between instructions, which is cumbersome and worse than setting breakpoints. (This is why running it under LisaEm with its tracelog mechanism in Xenix's single user mode would be easier, and getting that working would also be a much better use of my time.)

I also tried to play with simple binaries such as true, false, yes, echo, and cat, and find that some binaries seem to have a base (that is starting address) of 0x100000 and others 0x800000, but it's confusing to find out from adb ahead of time before running what the entry point address is so it's hard to know where to set breakpoints before doing :r (and you can't do :e to step before :r is started). $m doesn't seem to be consistent there but I think the b1 is the base where the code is loaded.
Weirdly, I would have thought that true/false would be less than a dozen bytes, but this isn't the case. Rather they seem to be a few hundred bytes in size where a lot of the startup code is linked in that's not used for anything productive. (You'd only need true/false to return a zero or a one and quit the program.) So these were compiled from C and linked to stdlib. (Xenix uses statically linked libraries, but the linker is a bit smart in that it only links the library functions used by the program. LisaOS on the other hand seems to have all the shared code in memory already and uses various A-line and other traps as well as A5 for iopaslib/graf/os calls.)

While the text segment is sometimes at 0x800000 (for informix it is), strings and other static data sit at low addresses, offset from address 0:
Code: [Select]
0x888 - serial number string in binary SCOSNHXL000000SBP0
0x921 - "Invalid Serial Number" <<<-- this is the thing we want to set a watchpoint for via :b
0x1b3 - "INFORMIX Master Menu"... <<< - found this in a stack trace. called like so: 0x801cd4(0x1b3)

Instead, this excess startup code deals with the argument count, arguments, environ, setup, etc. (I was looking for what traps/syscalls quit a Xenix program vs. what trap prints a string to stdout, etc.) I suspect trap #0 with d0=0 outputs a character, but haven't quite found what terminates a program yet; likely it's some other value passed to trap #0. I think what happens is that d0 or a0 are passed as argc, *argv[], and either something else passes *environ right after that or it calculates it and pushes them all on the stack as args to main(). Then the __start function eventually calls _main at 0x800050. It might do some other stuff like set up the stack/heap memory areas before it calls _main... but _main shows up in the stack trace, and __start does not (shows up as ? ? ? ?)

(The linker takes functions like main(int argc, char *argv[], char *env[]) and renames them to _main, while the start function I think is named __start, or maybe it remains just start, but start is part of stdlib and is written in assembler.)

__start basically is a wrapper around main. So after main is called, it takes the return value off the stack and then calls _exit with it, and then calls trap 0 with d0=1 which is the syscall for exit. Since the binary is statically linked stdlib is at the end of the code. So with stripped binaries you could do some pattern analysis and recognize which stdlib functions are linked in by their code.

I haven't done this, but should be theoretically possible to go through the documentation, create a program that calls all the stdlib functions, not strip that binary and get the opcodes for all of them, then use those to find what stdlib functions are linked into a stripped binary, so you can recognize what a stripped binary calls that way, and thus build an strace/truss equivalent to help with reverse engineering.
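
The signature idea can be sketched as a masked byte match. This is only an outline under stated assumptions: the signatures would have to come from an unstripped reference binary, the positions of absolute addresses inside each function must be marked as wildcards because static linking relocates them, and code is assumed to sit at even file offsets. `make_signature`, `match_at`, and `scan` are hypothetical names.

```python
def make_signature(code, wildcard_offsets):
    """Pair each byte with a flag; flagged positions (for example relocated
    absolute addresses) are ignored when matching."""
    wild = set(wildcard_offsets)
    return [(b, i in wild) for i, b in enumerate(code)]


def match_at(binary, pos, signature):
    """True if `signature` matches `binary` starting at offset `pos`."""
    if pos + len(signature) > len(binary):
        return False
    return all(is_wild or binary[pos + i] == b
               for i, (b, is_wild) in enumerate(signature))


def scan(binary, signatures):
    """Return {name: [offsets]} for every signature found in `binary`."""
    hits = {}
    for name, sig in signatures.items():
        # 68000 instructions are word-aligned, so only even offsets are tried.
        found = [p for p in range(0, len(binary) - len(sig) + 1, 2)
                 if match_at(binary, p, sig)]
        if found:
            hits[name] = found
    return hits
```

Short functions will produce false positives, so a real tool would probably want a minimum signature length or some cross-checking between neighbouring library functions.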

(This might be a project for a future date ofc.) A better way would be to figure out the ar .a and .o formats (possibly using nm) and analyze all of the libraries that way to build code signatures.

A side effect of static linking is that since each program will be of different size, stdlib (and other libs) will be relocated so in one binary printf will have one address, in another it'll be totally different - which is why such an analyzer would be useful and which is why a stripped informix binary is much more annoying. I don't expect the authors to have left helpful function names such as "serial_number_check" - rather they would have obfuscated the names to something else, but knowing what printf is would be very helpful.

As a heuristic you can also assume any functions at the beginning of the binary are program code and towards the end are linked in libraries.

However, having looked at _start, I know that _exit() is at 0x80784a (as that's called immediately after JSR 0x800050 - which I know is _main() ), and can use _exit()'s address to set a breakpoint when it's called. I'm assuming that normal startup in informix will print the banner, and then do a serial number check, and then when that fails immediately call exit rather than return from main, and so I can use $C to see the stack right when exit is called, so I can then disassemble the caller functions and figure out from there what decisions were made to reach exit() and thus be able to find the serial number check. So I'll give this one more shot before switching back to trying to get Xenix to boot on LisaEm.
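
Finding main and exit from the startup code can be sketched as a scan for the first two `jsr` instructions with a 32-bit absolute target (opcode word 0x4EB9). The sketch assumes `text` already holds the raw text segment bytes and that `base` is the address it loads at (0x800000 for informix); getting those bytes out of the executable file is not covered. A linear scan like this can mistake operand words for opcodes, so treat the output as a hint. `jsr_targets` is a made-up name.

```python
import struct

JSR_ABS_LONG = 0x4EB9   # 68000 opcode word for jsr <32-bit absolute address>


def jsr_targets(text, base, limit=0x60):
    """List (address_of_jsr, target) pairs found in the first `limit` bytes."""
    found = []
    pos = 0
    end = min(limit, len(text))
    while pos + 6 <= end:
        (word,) = struct.unpack_from(">H", text, pos)
        if word == JSR_ABS_LONG:
            (target,) = struct.unpack_from(">I", text, pos + 2)
            found.append((base + pos, target))
            pos += 6          # skip the opcode word plus its 4-byte operand
        else:
            pos += 2
    return found
```

On informix the first target should come out as 0x800050 (main) and the second as 0x80784a (exit), which is the address to set the breakpoint on.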

It went a lot faster with Multiplan because that test was very early on in the startup and, if I remember correctly, all the shifting/rotating and multiplying was pretty obvious; also brand, I think, had the symbols left in, so it was easy to tell one function from another and what each does.

So tl;dr: going to first try tracing the stack from a breakpoint set on _exit as a last-ditch attempt before pivoting back to trying to get Xenix to boot on LisaEm, and then will do a single tracelog of this under LisaEm and see the whole run that way vs. the notes I took.
« Last Edit: October 22, 2020, 10:49:30 am by rayarachelian »

rayarachelian

Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #2 on: October 22, 2020, 08:34:52 pm »

Ok, so it looks like that did the trick. I found a routine that pointed to "Invalid Serial Number", but the pointer was off by two bytes. This isn't a Pascal string, however; those two bytes aren't a length. I don't know if informix will actually work or not, but now it displays a menu instead of quitting, and ofc I'll have to see if I can fix the other binaries and whether the methods they use are similar enough for me to autodetect.

To patch /usr/bin/informix to work, you'd do this:

Code: [Select]
cp /usr/bin/informix /usr/bin/informix.bak
adb -w /usr/bin/informix
0x801ce4?w 0x1dd8
$q

Here are my notes if you want to play along.

Basically I started backwards by setting a breakpoint for the exit routine 0x80784a, whose pointer I got by looking at the _start routine that starts at 0x800000 - this does a jsr to _main which is 0x800050, and then it picks up the return value, then calls _exit (0x80784a) and then sets D0=1, and does trap #0 to exit. Working backwards from there by disassembling the stack and setting a bunch of other breakpoints, at some point I set a breakpoint at 0x801e12 and looked at what called it, and saw a lot of math being done there which is a clue.

I saw a weird bit of code that sets A0 to 0x1526 and then did JSR (a0). Looking at the memory at 0x1526, I found it contained 0x801de8. Looking at the instructions there, I found a reference to memory address 0x920, looking at that I realized that at 0x922 it contained the string "Invalid Serial Number" - so therefore this routine is likely what eventually prints "Invalid Serial Number" and then quits.

I then was looking for references to 1526, when I found a routine that pushed three routine pointers, one of which was 0x801de8, but then it pushed two other ones right above it, so I realized that the other two were likely the success functions. There's a bsr to some internal routine that does something with these, but didn't bother looking as changing all 3 pointers to 0x801dd8 caused the informix menu to come up.
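
To see why a single 16-bit write is enough, here is a small sketch of how one of those pushes is laid out. `move.l #<imm32>,-(a7)` is the opcode word 0x2F3C followed by the 32-bit pointer, high word first, so the low word of the pointer sits 4 bytes past the start of the instruction. The helper names are made up, and the shortcut only works when the old and new pointers share the same high word.

```python
import struct

MOVE_L_IMM_PUSH = 0x2F3C   # opcode word for move.l #<imm32>,-(a7)


def encode_push(pointer):
    """Return the 6 bytes of move.l #pointer,-(a7)."""
    return struct.pack(">HI", MOVE_L_IMM_PUSH, pointer)


def low_word_patch(instr_addr, new_pointer):
    """Return (address, 16-bit value) suitable for an adb ?w write that
    replaces only the low word of the pushed pointer."""
    return instr_addr + 4, new_pointer & 0xFFFF
```

For the push at 0x801ce0, `low_word_patch(0x801ce0, 0x801dd8)` gives address 0x801ce4 and value 0x1dd8.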

So there's a lot more to patch as there's a handful of binaries. I also found that my dc42-to-tar code doesn't ignore the end of a disk so what happens is that tar quits early and doesn't allow the full file list, will have to fix that.

Another issue is that I did this on a machine where Xenix lives on a 5MB ProFile (since it's a 2/5 and Xenix has a dumbass bug that limits it to 5MB). So I didn't have enough space to extract all of informix, its sample DBs and libs and other binaries, and I can only extract the binaries one by one. Will revisit the rest after I get Xenix booting up on LisaEm.

I've already turned the Lisa off for the night, but I recall the /once/init.informix script might also be branding some libraries, I'm not sure, been doing this too long and I'm too tired. :) but if that's the case, I'll have to figure out what it does with the libraries.

Note that init.informix will remove brand as well as other files if it fails to serialize so you should edit it using vi to remove the "rm" commands in it if you want to pick up from where I started and play. I didn't use brand at all, but I did confirm this version of brand is different than the one that came with multiplan and that it's stripped.

There's some weird-ass code in here, such as movem.l #<>,-(a7), that's got me wondering what it's for; off the top of my head, this should do nothing.
Likely this is compiler boilerplate on function entry/exit where it flags registers to be saved/restored and it happens to have none, but still emits bullshit opcodes slowing down the app for no good reason.

Code: [Select]
from stack trace:
0x8019a8: ??? () -> 0x801e12 this is ??? since it was called by JSR (a0) so $c couldn't figure out how it got here.
  (0x8019a8 has a lot of math in it. it does move.l 0x1526,a0 (1526 contains 0x801de8), then JSR (A0) )
0x801d2a: 0x8018f4()
0x8006b6: 0x801cd4(0x1b3) 1b3 is the "INFORMIX Master Menu" string
0x80008a: 0x8006a2()
0x80003a: 0x800050(1,0x7ffe2e) <- this is __start calling _main()

0x801cd4: link a6,0
0x801cd8: tst.b -144,(a7)
0x801cdc: movem.l #<>,-(a7) #wtf? 0x48e7 0x0000
0x801ce0: push 0x801dE8 <- failure  <- change this guy!: 0x801ce4?w 0x1dd8
0x801ce6: push 0x801dd8 <- likely success pointer
0x801cec: push 0x801dd8 <- likely success pointer.
0x801cf2: bsr  0x8018b4
...
 v- so this does nothing useful, but it's used as a sink vs the routine that bails out below.
 0x801dd8: link a6,0
 0x801ddc: tst.b -132,(a7)
 0x801de0: movem #<>,-(a7)  #wtf? 0x48e7 0x0000
 0x801de4: unlk a6
 0x801de6: rts

 v- so this is called by pointer and if we got here, it's invalid serial #
 0x801de8: link a6,#0
 0x801dec  tst.b -136,(a7) - this probably ensures MMU has allocated memory on stack.
 0x801df0  move.l #0x920,(a7)  push pointer to: 0x0a,0x49+"Invalid Serial Number"!!! so this routine is the exit   
 0x801df6  jsr 0x8031fc
 0x801dfc  addq 0x4,a7
 0x801dfe  move.l 0xcfe,-(a7)
 0x801e04  jsr 0x8032c0 <- haven't looked at what this does.
 0x801e0a  addq #4,a7
 0x801e0c  jsr 0x800d4a <- haven't looked at what this does.
 0x801e12: moveq 1,d0 <- exit status 1, pushed as the argument to the call below.
 0x801e14: move d0,-(a7)
 0x801e16: jsr 0x80784a <- this is _exit() (the same address _start calls after _main returns).
 0x801e1c: addq 4,a7
 0x801e1e: unlk a6
 0x801e20: rts


rayarachelian

Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #3 on: October 23, 2020, 04:05:50 pm »

I tried another binary today, and it looks like it will all be boilerplate (hopefully for the most part). This time I tried with dbbuild:

Code: [Select]

I did the usual 0x800000?i until I saw the jsr to main. This time it's not at 0x800050, but it doesn't matter: a couple of opcodes after that, right before the code that sets d0=1 and does trap #0, is a 2nd jsr, to 0x80e4aa, which is the exit routine. Setting a breakpoint on that target with 0x80e4aa:b and doing :r for a run returned the following stack trace (you get the stack trace by using $c):

(That 2nd JSR can usually be found at 0x800044, don't set the breakpoint to 0x800044, set it to the address it goes to.)

$c
0x804316: ? ? ? ()
(there might be another line here but it doesn't matter, you're after the line below)
0x804698: 0x804262(0x21a2) * <- dbbuild data description language compiler banner - this is the droid we're looking for.
0x80003a: 0x801a82(1,0x7ffe32)

Disassemble at the 2nd address on the line that has a single parameter (in this case the one with 0x21a2 in it), so we now run 0x804262?i to disassemble one instruction at a time until we see 3 move.l's and a bsr. This time it looks like this:

(we ignore this) link
(we ignore this) testb
(we ignore this) movem
0x80464e move.l #804756,-(a7) ; this is the bad serial routine
0x804654 move.l #804746,-(a7) ; this is the good serial routine
0x80465a move.l #804746,-(a7) ; this is the good serial routine
0x804660 bsr 0x804222
... and we don't care about the rest.

The first one pushes the bad serial routine address, so dumping it with 0x80464e?x gives us:
 
0x80464e: 0x2f3c ; this is the move.l opcode
0x804650: 0x0080 ; this is the high word of the pointer 0x804756
0x804652: 0x4756 ; this is the low word that we want to change to 0x4746 so it matches the other two; we do this with this command:
0x804652?w 0x4746

;and now let's test it by running it:
:r

and we're in (it now quits with incorrect number of command line arguments, see error 2014)

I suspect this recipe will work for all the binaries.

Edit: and sure enough I tested on the other binaries and it did work.
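
The recipe can be sketched as an offline patcher that works on the extracted file. This is not the C program mentioned later in the thread, just an illustration under assumptions: it looks for three consecutive `move.l #<imm32>,-(a7)` pushes (opcode 0x2F3C) followed by a `bsr` (first byte 0x61) at even file offsets, and overwrites the first pushed pointer with the second. `find_check_sites` and `patch` are made-up names, and a pattern this short could match unrelated code, so every hit should be checked by hand.

```python
import struct

MOVE_L_IMM_PUSH = b"\x2f\x3c"   # move.l #<imm32>,-(a7)
BSR_FIRST_BYTE = 0x61           # bsr with short or word displacement


def find_check_sites(data):
    """Return [(file_offset, [ptr1, ptr2, ptr3])] for each 3-push + bsr run."""
    sites = []
    for pos in range(0, len(data) - 18, 2):
        if (data[pos:pos + 2] == MOVE_L_IMM_PUSH
                and data[pos + 6:pos + 8] == MOVE_L_IMM_PUSH
                and data[pos + 12:pos + 14] == MOVE_L_IMM_PUSH
                and data[pos + 18] == BSR_FIRST_BYTE):
            ptrs = [struct.unpack_from(">I", data, pos + 2 + 6 * i)[0]
                    for i in range(3)]
            sites.append((pos, ptrs))
    return sites


def patch(data):
    """Return a copy of `data` with each site's first pointer replaced by
    the second one (the presumed success routine)."""
    out = bytearray(data)
    for off, (bad, good, _third) in find_check_sites(data):
        if bad != good:
            struct.pack_into(">I", out, off + 2, good)
    return bytes(out)
```

Running `patch` on a copy and comparing it against the original before copying it back to the Lisa keeps the change easy to review.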

One problem I ran into is that adb doesn't rewrite the file in place, so if there's not enough free space, the change will silently fail. In my case I was making backups of each of the binaries before editing, but this isn't strictly needed since you can always restore them by untarring the specific binary from the floppies.
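
Since a failed write gives no error, one way to confirm a patch landed is to compare the patched file against a backup or a fresh copy from the tar. A minimal sketch, with `changed_offsets` as a made-up helper operating on the two files' contents:

```python
def changed_offsets(original, patched):
    """Return the file offsets at which two equal-length byte strings differ."""
    if len(original) != len(patched):
        raise ValueError("files differ in length")
    return [i for i, (a, b) in enumerate(zip(original, patched)) if a != b]
```

An empty list means the write never happened. For the patches in this thread, a single changed byte would be expected, since only the low byte of the pointer differs.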
« Last Edit: October 23, 2020, 04:51:59 pm by rayarachelian »

blusnowkitty

  • Sr. Member
  • ****
  • Karma: +69/-0
  • Offline Offline
  • Posts: 244
Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #4 on: October 23, 2020, 05:24:03 pm »

Another issue is that I did this on a machine where Xenix lives on a 5MB profile (since it's a 2/5 and xenix has a dumbass bug that limits it to 5mb). So I didn't have enough space to extract all of informix, it's sample db's and libs and other binaries. So not going to be able to extract all the binaries except one by one. Will revisit the rest after I get Xenix booting up on LisaEm.

So, Xenix itself is actually broken? I just thought something was wrong with my X/Profile when I tried to get it going on a 10MB image.
Logged
You haven't lived until you've heard the sound of a Sony 400k drive.

rayarachelian

Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #5 on: October 23, 2020, 06:35:23 pm »

So, Xenix itself is actually broken? I just thought something was wrong with my X/Profile when I tried to get it going on a 10MB image.

Yes, Xenix uses the I/O ROM to *assume* the size of the hard drive attached to it. If you have a 2/5 it should have ROM H/A8 rather than H/88 for example. If it sees I/O ROM version A8 it assumes the hard drive is 5MB, if it sees 88, it assumes it's 10MB.

You can install a 2nd (and 3rd) hard drive if you have the dual port parallel card, and then it will have enough space. When I first had Xenix going back in the late 80s, I set up one drive for the /usr partition, etc. There's no such limit on hard drives attached over the expansion slots.

see: https://lisafaq.sunder.net/single.html#lisafaq-sw-xen_about
« Last Edit: October 23, 2020, 06:38:09 pm by rayarachelian »

rayarachelian

Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #6 on: October 24, 2020, 01:10:40 pm »

I've posted the deserialized informix along with a small C program to deserialize the originals here: https://lisalist2.com/index.php/topic,123.msg896.html#msg896

Lisa2

  • Administrator
  • Sr. Member
  • *****
  • Karma: +64/-0
  • Offline Offline
  • Posts: 153
  • See why 1983 was more like Y2K...
    • Lisa2.com
Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #7 on: October 24, 2020, 05:42:53 pm »

Great job Ray!
Thanks.
Logged

Todd

  • Jr. Member
  • **
  • Karma: +5/-0
  • Offline Offline
  • Posts: 16
Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #8 on: November 25, 2020, 04:01:07 pm »

Incredible!
Logged

compu_85

  • Sr. Member
  • ****
  • Karma: +66/-0
  • Offline Offline
  • Posts: 233
Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #9 on: December 03, 2020, 11:56:36 am »

So, Xenix itself is actually broken? I just thought something was wrong with my X/Profile when I tried to get it going on a 10MB image.
...If it sees I/O ROM version A8 it assumes the hard drive is 5MB, if it sees 88, it assumes it's 10MB.

...There's no such limit on hard drives attached over the expansion slots.


When I tried hanging extra 10m disks off of my Xenix system, it would let me format them as 10m, but any data written past 5m would be lost  :o

-J
Logged

D.Finni

  • Sr. Member
  • ****
  • Karma: +37/-0
  • Offline Offline
  • Posts: 135
Re: Re: Xenix adb reverse engineering (informix brand/serial)
« Reply #10 on: December 07, 2020, 02:30:05 pm »

When I tried hanging extra 10m disks off of my Xenix system, it would let me format them as 10m, but any data written past 5m would be lost  :o

Weren't computer systems from the 1980s so wonderful!  :P

Check this out: it's a long read, but definitely an eye-opener on "business as usual" in the 1980s with Lisa: "experience at Rutgers with large intro courses"
Logged
Pages: [1]   Go Up