x86 real mode function calls not executing


I have some x86 real-mode assembly code that isn't behaving as expected. I believe the issue relates to an incorrectly calculated jmp/call offset, but I might be mistaken.

Here is the assembly code:

[org 0x7c00]

mov ah, 0x0e

mov al, 'h'
int 0x10

mov al, 'e'
int 0x10

mov al, 'l'
int 0x10

mov al, 'l'
int 0x10

mov al, 'o'
int 0x10

mov al, '!'
;int 0x10
call print_char

;loop:
;    jmp loop

mov si, mystring
call print_string

jmp $


; fill to 512 bytes
times 510 - ($ - $$) db 0
dw 0xAA55

; the address is stored in si
print_string:
    pusha
    ; load character from si
    mov al, [si]
    cmp al, 0x00
    jz print_string_end
    call print_char ; print the char using the print_char function
    inc si ; increment the string printing index si
print_string_end:
    popa
    ret

; print function: print a single character
; the character is stored in al
print_char:
    pusha
    mov ah, 0x0e
    int 0x16
    popa            ; don't know what registers int 0x16 modifies
    ret

mystring:
db "loading operating system",0x00

And here is the disassembly, produced with objdump -D -b binary -m i8086 -M intel bootsector.bin:

bootsector.bin:     file format binary


Disassembly of section .data:

00000000 <.data>:
   0:   b4 0e                   mov    ah,0xe
   2:   b0 68                   mov    al,0x68
   4:   cd 10                   int    0x10
   6:   b0 65                   mov    al,0x65
   8:   cd 10                   int    0x10
   a:   b0 6c                   mov    al,0x6c
   c:   cd 10                   int    0x10
   e:   b0 6c                   mov    al,0x6c
  10:   cd 10                   int    0x10
  12:   b0 6f                   mov    al,0x6f
  14:   cd 10                   int    0x10
  16:   b0 21                   mov    al,0x21
  18:   e8 f2 01                call   0x20d
  1b:   be 14 7e                mov    si,0x7e14
  1e:   e8 df 01                call   0x200
  21:   eb fe                   jmp    0x21
    ...
 1fb:   00 00                   add    BYTE PTR [bx+si],al
 1fd:   00 55 aa                add    BYTE PTR [di-0x56],dl
 200:   60                      pusha  
 201:   8a 04                   mov    al,BYTE PTR [si]
 203:   3c 00                   cmp    al,0x0
 205:   74 04                   je     0x20b
 207:   e8 03 00                call   0x20d
 20a:   46                      inc    si
 20b:   61                      popa   
 20c:   c3                      ret    
 20d:   60                      pusha  
 20e:   b4 0e                   mov    ah,0xe
 210:   cd 16                   int    0x16
 212:   61                      popa   
 213:   c3                      ret    
 214:   6c                      ins    BYTE PTR es:[di],dx
 215:   6f                      outs   dx,WORD PTR ds:[si]
 216:   61                      popa   
 217:   64 69 6e 67 20 6f       imul   bp,WORD PTR fs:[bp+0x67],0x6f20
 21d:   70 65                   jo     0x284
 21f:   72 61                   jb     0x282
 221:   74 69                   je     0x28c
 223:   6e                      outs   dx,BYTE PTR ds:[si]
 224:   67 20 73 79             and    BYTE PTR [ebx+0x79],dh
 228:   73 74                   jae    0x29e
 22a:   65 6d                   gs ins WORD PTR es:[di],dx
    ...

The file was assembled with nasm bootsector.asm -f bin -o bootsector.bin

At offset 0x1e there is the instruction call 0x200. Unless I misunderstand, this pushes the return address (the address of the instruction after the call) onto the stack and jumps to the code at offset 0x200. That is somewhere in memory below the origin, 0x7c00, so it appears to be a different address from where the function print_string resides.

At least I think that is what is happening, but I might be completely wrong as I'm new to this.

Also, maybe I'm not allowed to have a file that exceeds 512 bytes as a boot sector?


There are 2 best solutions below

rupertreynolds On

(It may help to remember that x86 assembly can map one mnemonic to several opcodes, so there are several different encodings of "call" to watch out for.)

The "call 0x200" disassembled at offset +1e is encoded as e8 df 01, which the CPU executes as a near call relative to the address of the next instruction, i.e. next instruction + 0x01df.

Because the disassembly defaulted to starting at offset +0, the target is shown as 0x21 + 0x1df = 0x200, or 512 in decimal. Remember that print_string was assembled immediately after your 512-byte boot sector, so that adds up.

If your code is loaded at 0000:7c00 then the relative call goes to 0000:7e00, which is calculated correctly. But, as others have said, that code will not be there, because the BIOS loads only the first sector.
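To make the arithmetic above concrete, here is a minimal sketch in NASM; the labels start, after_call and target are made up for illustration. The displacement stored after the e8 opcode is the target address minus the address of the instruction following the call, so the org value cancels out. That is why the same displacement works for both views of your code: 0x200 - 0x21 = 0x1df in the disassembly, and 0x7e00 - 0x7c21 = 0x1df at run time.

```nasm
; Sketch only: shows how a near call's rel16 displacement is formed.
; Assemble with: nasm rel16.asm -f bin -o rel16.bin
bits 16
org 0x7c00

start:
    call target         ; e8 02 00 : rel16 = target - after_call = 2
after_call:
    jmp $               ; eb fe (2 bytes), so target is 2 bytes further on

target:
    ret                 ; c3
```

Changing the org line to any other value leaves the assembled bytes unchanged, because both addresses in the subtraction shift by the same amount.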

I wrote a boot sector a while ago and my advice is: a) it's easy to run out of space, so keep the code compact; b) don't assume anything other than that CS:IP points to your code. If you rely on DS, ES, or SS, you may find they are set differently by different BIOSes and emulators, so try "mov ax,cs" and "mov ds,ax" etc. near the top, to be safe.
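A minimal sketch of that kind of setup follows. Note one deliberate difference: it zeroes the segment registers instead of copying CS. Copying CS gives the right result only when the BIOS enters at 0000:7c00; some enter at 07c0:0000, which is the same physical address but would not match offsets assembled with [org 0x7c00]. The stack location just below the boot sector is an assumption, not a requirement.

```nasm
; Sketch only: set up known segment registers and a stack at boot.
bits 16
org 0x7c00

start:
    cli                 ; keep interrupts off while SS:SP is being changed
    xor ax, ax
    mov ds, ax          ; DS = 0 so that labels under org 0x7c00 resolve
    mov es, ax
    mov ss, ax
    mov sp, 0x7c00      ; stack grows down from just below the boot sector
    sti
    cld                 ; string instructions will count upwards

    jmp $               ; hang; real code would continue here

times 510 - ($ - $$) db 0
dw 0xaa55
```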

Your code uses "mov al,[si]" to load string data. si is paired with ds by default, so it loads from ds:[si]. So perhaps your unexpected output is because ds:[si] points to the wrong data. If you get Bochs set up, you'll be able to find out.
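As a sketch of what reading through ds:[si] looks like once DS is known, the following self-contained boot sector prints a zero-terminated string. The label names are illustrative. Two things differ from the code in the question and are assumptions about its intent: the routine loops until it reaches the terminator, and it prints through int 0x10 (video teletype, ah = 0x0e) rather than int 0x16, which is the BIOS keyboard service.

```nasm
; Sketch only: print a zero-terminated string via ds:si.
bits 16
org 0x7c00

start:
    xor ax, ax
    mov ds, ax              ; lodsb reads from ds:si, so DS must match org
    cld                     ; make lodsb increment si
    mov si, message
    call print_string       ; uses whatever stack the BIOS left in place
    jmp $

; print the zero-terminated string at ds:si
print_string:
    pusha
.next:
    lodsb                   ; al = byte at ds:si, then si = si + 1
    test al, al
    jz .done                ; stop at the 0x00 terminator
    mov ah, 0x0e            ; BIOS teletype output
    mov bh, 0               ; display page 0
    int 0x10
    jmp .next
.done:
    popa
    ret

message:
    db "loading operating system", 0

; everything above sits before the signature, inside the first sector
times 510 - ($ - $$) db 0
dw 0xaa55
```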

usernameisunavaible On

Your code exceeds 512 bytes. The part past the first sector isn't loaded into RAM, so the call actually jumps to uninitialized memory. You have to either load the next sector (before the jump/call) or restructure the code like this:

; maybe you should set up the stack somewhere here at the start
; ...

; ...
call func
; ...

; your hang instruction
jmp $


; the code below is not reached unless you call it explicitly
; it ends in ret, so it must be entered with call; if you jmp to it,
; ret has no return address to go back to
; this part is before 0xaa55, so it is loaded into memory
func:
   ; ... stuff
   ret

; the padding
times 510 - ($ - $$) db 0
dw 0xaa55
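The other option, loading the next sector before calling into it, could look roughly like the sketch below. It uses BIOS int 0x13 function 0x02 (read sectors, CHS addressing) and assumes the second sector directly follows the boot sector on the boot drive, and that DL still holds the drive number the BIOS passed in. The labels second_stage and disk_error are made up for illustration.

```nasm
; Sketch only: read the sector after the boot sector to 0000:7e00, then call it.
bits 16
org 0x7c00

start:
    xor ax, ax
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7c00          ; stack below the boot sector
    ; DL is untouched so far and still holds the boot drive number

    mov ah, 0x02            ; int 0x13 function 2: read sectors
    mov al, 1               ; number of sectors to read
    mov ch, 0               ; cylinder 0
    mov cl, 2               ; sector 2 (numbering starts at 1, the boot sector)
    mov dh, 0               ; head 0
    mov bx, 0x7e00          ; destination es:bx = 0000:7e00
    int 0x13
    jc disk_error           ; carry flag set means the read failed

    call second_stage       ; safe now: the code is in memory
    jmp $

disk_error:
    jmp $                   ; nothing sensible to do in this sketch

times 510 - ($ - $$) db 0
dw 0xaa55

; ---- second sector: lives at 0x7e00 once loaded ----
second_stage:
    mov ah, 0x0e            ; BIOS teletype output
    mov al, '!'
    mov bh, 0               ; display page 0
    int 0x10
    ret

times 1024 - ($ - $$) db 0  ; pad so the image holds a full second sector
```

Because org 0x7c00 applies to the whole file, labels in the second sector already resolve to 0x7e00 and above, which matches where the read places them.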