I'm having some problems trying to solve this expression in assembler.
`$`z=(5*a-b/7)/(3/b+a*a)
I would like to know how do you convert a word to a double word ( unsigned solution ) ,
do i have to use the cwb command or do i use AX:BX , if i do have to use those last registers ,
how do i properly write the command ?
I will be testing the code in Turbo Debugger under DosBox .
My full code
assume cs:code,ds:data
data segment
;
a db ?
b db ?
rez dw ?
;
data ends
;
code segment
;
mov ax,data
mov ds,ax
;
;
;#####prima paranteza
;
mov al,a ;ah=a
mul 5 ;ax=a*5
mov cx,ax ;cx=ax
mov ah,b ; il mut pe ah in b ( pregatire pt conversie fara semn )
mov al,0 ; l-am convertit pe b in word ( pe 2 octeti )
div 7 ; am impartit double word-ul b la 7 , catul a ramas in ah , restul a ramas in al
sub cx,ax ; (am tinut cont de faptul ca in ax a ramas rezultatul dupa impartire ) , cx=cx-ax, a*5 - b/7
;
; ####a 2 a paranteza
;
mov ah,3
mov al,0 ; conversie de la b la w ( fara semn )
div b ; ax=3/b
mov bx,ax ; bx = ax
mov al,a
mul a ; ax = a * a
add ax,bx ; ax = ax + bx
;
;
; #### calcul final
mov bx,ax ; bx = ax ( rezultatul celei de a 2 a paranteze )
mov ax,cx ; ax = cx ( rezultatul primei paranteze )
word to double-word?
Let's see if I got you:
word -> 8bit
double-word -> 16bit
AX, BX, CX and DX are 16 bit registers, and they are formed by two other 8-bit registers [ABCD]H and [ABCD]L, so, AX would be:
AH AL
|0|0|0|0|0|0|0|0| - |0|0|0|0|0|0|0|0|
When you use AX, you're using those two at the same time. So, if you want to convert a word to a double word, you just clear the whole [ABCD]X register, and then move your word to the [ABCD]L register, leaving [ABCD]X with the word value.
Cheers
Well, the worst-case scenario would be to zero-out the double word and then copy the word to the lower part (The first 0-x bits / my assembler is rusty) of the double word. There might be a more efficient way to do this, I'll admit
First, what are the range/s of the variables? Are they all unsigned (or are some of them signed)?
Second, what sort of accuracy do you need; and how does this influence things like accuracy loss due to rounding in integer divisions?
Third, how important is performance? How important is memory usage?
Fourth, what sorts of "CPU type" constraints are there? Is it "8086 only" (where you can't use 32-bit registers), or is it "80386 or later" (where you can use 32-bit registers in 16-bit code)? Can you use floating point and the FPU?
Depending on all of the above, the resulting code might look like this:
; Assumes a ranges from 0 to 7
; Assumes b ranges from 1 to 1234
; Assumes 80386 or later
; Note: requires a pre-computed 19728-byte lookup table
movzx eax,word a
movzx ebx,word b
lea eax,[0xFFFFFFF8 + ebx*8+eax]
mov ax,[myTable + eax*2]
Of course depending on all of the above the resulting code might be radically different too...
Normally it would be done with DX:AX in 16-bit mode. You can use CWD to sign-extend AX into DX, or you could just clear DX.
Related
I'm trying to create a line-drawing algorithm in assembly (more specifically Bresenham's line algorithm). After trying an implementation of this algorithm, the algorithm fails to work properly even though I almost exactly replicated the plotLineLow() function from this Wikipedia page.
It should be drawing a line between 2 points, but when I test it, it draws points in random places in the window. I really don't know what could be going wrong because debugging in assembly is difficult.
I'm using NASM to convert the program to binary, and I run the binary in QEMU.
[bits 16] ; 16-bit mode
[org 0x7c00] ; memory origin
section .text ; code segmant
global _start ; tells the kernal where to begin the program
_start: ; where to start the program
; main
call cls ; clears the screen
update: ; main loop
mov cx, 0x0101 ; line pos 1
mov bx, 0x0115 ; line pos 2
call line ; draws the line
jmp update ; jumps to the start of the loop
; functions
cls: ; function to clear the screen
mov ah, 0x00 ; set video mode
mov al, 0x03 ; text mode (80x25 16 colours)
int 0x10 ; BIOS interrupt
ret ; returns to where it was called
point: ; function to draw a dot at a certain point (dx)
mov bx, 0x00ff ; clears the bx register and sets color
mov cx, 0x0001 ; clears the cx register and sets print times
mov ah, 0x02 ; set cursor position
int 0x10 ; BIOS interrupt
mov ah, 0x09 ; write character
mov al, ' ' ; character to write
int 0x10 ; BIOS interrupt
ret ; returns to where it was called
line: ; function to draw a line at two points (bx, cx)
push cx ; saves cx for later
push bx ; saves bx for later
sub bh, ch ; gets the value of dx
mov [dx_L], bh ; puts it into dx
sub bl, cl ; gets the value of dy
mov [dy_L], bl ; puts it into dy
mov byte [yi_L], 1 ; puts 1 into yi (positive slope)
cmp byte [dy_L], 0 ; checks if the slope is negative
jl .negative_y ; jumps to the corresponding sub-label
jmp .after_negative_y ; if not, jump to after the if
.negative_y: ; if statement destination
mov byte [yi_L], -1 ; sets yi to -1 (negative slope)
neg byte [dy_L] ; makes dy negative as well
.after_negative_y: ; else statement destination
mov ah, [dy_L] ; moves dy_L into a temporary register
add ah, ah ; multiplies it by 2
sub ah, [dx_L] ; subtracts dx from that
mov [D_L], ah ; moves the value into D
pop bx ; pops bx to take a value off
mov [y_L], bh ; moves the variable into the output
pop cx ; pops the stack back into cx
mov ah, bh ; moves x0 into ah
mov al, ch ; moves x1 into al
.loop_x: ; loop to go through every x iteration
mov dh, ah ; moves the iteration count into dh
mov dl, [y_L] ; moves the y value into dl to be plotted
call point ; calls the point function
cmp byte [D_L], 0 ; compares d to 0
jg .greater_y ; if greater, jumps to the if statement
jmp .else_greater_y ; if less, jumps to the else statement
mov bh, [dy_L] ; moves dy into a temporary register
.greater_y: ; if label
mov bl, [yi_L] ; moves yi into a temporary register
add [y_L], bl ; increments y by the slope
sub bh, [dx_L] ; dy and dx
add bh, bh ; multiplies bh by 2
add [D_L], bh ; adds bh to D
jmp .after_greater_y ; jumps to after the if statement
.else_greater_y: ; else label
add bh, bh ; multiplies bh by 2
add [D_L], bh ; adds bh to D
.after_greater_y: ; after teh if statement
inc ah ; increments the loop variable
cmp ah, al ; checks to see if the loop should end
je .end_loop_x ; if it ended jump to the end of teh loop
jmp .loop_x ; if not, jump back to the start of the loop
.end_loop_x: ; place to send the program when the loop ends
ret ; returns to where it was called
section .data ; data segmant
dx_L: db 0 ; used for drawing lines
dy_L: db 0 ; ^
yi_L: db 0 ; ^
xi_L: db 0 ; ^
D_L: db 0 ; ^
y_L: db 0 ; ^
x_L: db 0 ; ^
section .text ; code segmant
; boot the OS
times 510-($-$$) db 0 ; fills up bootloader space with empty bytess
db 0x55, 0xaa ; defines the bootloader bytes
I see no video mode
just 80x25 text VGA mode (mode = 3) you set at start with cls so how can you render points? You should set the video mode you want assuming VGA or VESA/VBE see the link above.
why to heck use VGA BIOS for point rendering?
that will be slooooooooow and I have no idea what it does when no gfx mode is present. You can render points by direct access to VRAM (at segment A000h) Ideal use 8/16/24/32bit video modes as they have pixels aligned to BYTEs ... my favorite is 320x200x256c (mode = 19) as it fits into 64K segment so no paging is needed and pixels are Bytes.
In case you are using characters instead of pixels then you still can use access to VRAM in the same way just use segment B800h and chars are 16 bit (color and ASCII).
integer DDA is faster then Bresenham on x86 CPUs since 80386
I do not code in NASM for around 2 decades and closest thing to line I found in my archive is this:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
line: pusha ;ax=x0,bx=x1,dl=y0,dh=y1,cl=col
push ax ;expecting ES = A000h
mov si,bx
sub si,ax
sub ah,ah
mov al,dl
mov bx,ax
mov al,dh
sub ax,bx
mov di,ax
mov ax,320
sub dh,dh
mul dx
pop bx
add ax,bx
mov bp,ax
mov ax,1
mov bx,320
cmp si,32768
jb .r0
neg si
neg ax
.r0: cmp di,32768
jb .r1
neg di
neg bx
.r1: cmp si,di
ja .r2
xchg ax,bx
xchg si,di
.r2: mov [.ct],si
.l0: mov [es:bp],cl
add bp,ax
sub dx,di
jnc .r3
add dx,si
add bp,bx
.r3: dec word [.ct]
jnz .l0
popa
ret
.ct: dw 0
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
So you have something to cross check (took me a while to find it in my archives as I coded whole 3D polygonal engines with textures at that time so I do not have much 2D code in NASM...)
The example expects 320x200x256c VGA video mode
If you were writing a DOS .com file, things would be a bit simpler: segment registers would all be set the same, with your code/data at offset 100 from them. And you could end with ret.
[BITS 16]
[ORG 100h]
[SEGMENT .text]
ret
As #bitRAKE and #PeterCoders pointed out in case You run this in BOOT SECTOR the org is ok. However in such case there is no OS present so if you were going to do more with the stack or any other block of memory outside your 512 bytes, you'd want to point the stack to somewhere known. (It does start out valid, though, because interrupts are enabled.)
More importantly, you need to initialize DS to match your ORG setting, so ds:org reaches a linear address of 7C00. With org 0x7c00, that means you want DS=0. Otherwise instructions like mov [dx_L], bh would be using memory at some unknown location.
[BITS 16]
[ORG 7C00h]
[SEGMENT .text]
mov ax,0000h
mov ds,ax ; DS=0 to match ORG
mov ss,ax ; if you set SS:SP at all, do it back-to-back
mov sp,7C00h ; so an interrupt can't fire half way through.
; here do your stuff
l0:
hlt ; save power
jmp l0
Hope you are using VC or NC configured as IDE for NASM and not compiling/linking manually
This one is usable in MS-DOS so if you are running BOT SECTOR you out of luck. Still You can create a *.com executable debug and once its working in dos change to BOOT SECTOR...
see Is there a way to link object files for DOS from Linux? on how to setup MS-DOS Volkov commander to automatically compile and link your asm source code just by hitting enter on it ... You can also run it just by adding line to the vc.ext line ... but I prefer not to so you can inspect error log first
Convenient debugging
You can try to use MS-DOS (DOSBox) with ancient Borland Turbo C/C++ or Pascal and use their inline asm { .... } code which can be traced and stepped directly in the IDE. However it uses TASM (different syntax to NASM) and have some restrictions ...
Sadly I never saw any decent IDE for asm on x86 platform. The best IDE for asm I worked with was Herkules on ZX Spectrum ... was possible to done things even modern C++ IDEs doesnt have.
I have to prepare program for 8086 processor which converts quaternary to octal number.
My idea:
Multiplying every digit by exponents of 4 and add to register. Later check the highest exponent of 8 not higher than sum from first step.
Divide by exponents of 8 until remainder equals 0. Every result of dividing is one digit in octal.
But for 16-digits number last exponent of 4 is 4^15. I suppose It isn't optimal algorithm.
Is there any other way? Maybe to binary and group by 3 digits.
Turns out you can indeed process values 3 digits at a time. Done this way, you can process strings of arbitrary length, without being limited by the size of a register. Not sure why you might need to, unless aliens try to communicate with us using ascii strings of quaternary digits with huge lengths. Might happen.
It's possible to do the translation either way (Right to Left or Left to Right). However, there's a bit of a challenge to either:
If you are processing RtL, you need to know the length of the output string before you start (so that you know where to write the digits as you compute them). This is do-able, but a bit tricky. In simplest terms, the length is ((strlen(Q) + 2) / 3) * 2. That almost gets it. However, you can end up with a blank space at the beginning for a number of cases. "1" as well as "10" will give the blank space. "20" won't. The correct value can be computed, but it's annoying.
Likewise, processing LtR has a similar problem. You don't have the problem of figuring out where to write digits, but consider: If the string to convert it "123", then the conversion is simple (33 octal). But what if you start processing, and the complete string is "1231" (155 octal)? In that case what you need to process it like "001231" (01 55). IOW, digits can be processed in groups of 3, but you need to handle the initial case where the number of digits doesn't evenly divide by 3.
Posting solutions to homework is usually something I avoid. However I doubt you are going to turn this in as your 'solution,' and it's (barely) possible that google might send someone here who needs something similar.
A few things to note:
This code is intended to be called from C using Microsoft's fastcall (it made testing easier) and compiled with masm.
While it is written in 32bit (my environment), there's nothing that particularly requires 32bit in it. Since you said you were targeting 8086, I've tried to avoid any 'advanced' instructions. Converting to 16bit or even 64bit should not present much of a challenge.
It processes from left to right.
As with any well-written routine, it validates its parameters. It outputs a zero length string on error, such as invalid digits in the input string.
It will crash if the output buffer is NULL. I suppose I could return a bool on error (currently returns void), but, well, I didn't.
I'm sure the code could be tighter (couldn't it always?), but for "homework project quality," it seems reasonable.
Other than that, that comments should explain the code.
.386
.model flat
.code
; Call from C via:
; extern "C" void __fastcall PrintOct(const char *pQuat, char *pOct);
; On Entry:
; ecx: pQuat
; edx: pOct
; On Exit:
; eax, ecx, edx clobbered
; all others preserved
; If pOct is zero bytes long, an error occurred (probably invalid digits)
#PrintOct#8 PROC
; -----------------------
; If pOct is NULL, there's nothing we can do
test edx, edx
jz Failed
; -----------------------
; Save the registers we modify (except for
; eax, edx and ecx which we treat as scratch).
push esi
push ebx
push edi
mov esi, ecx
mov edi, edx
xor ebx, ebx
; -----------------------
; esi: pQuat
; edi: pOct
; ebx: zero (because we use lea)
; ecx: temp pointer to pQuat
; Reject NULL pQuat
test esi, esi
jz WriteNull
; -----------------------
; Reject 0 length pQuat
mov bl, BYTE PTR [esi]
test bl, bl
jz WriteNull
; -----------------------
; How many chars in pQuat?
mov dl, bl ; bl is first digit as ascii. Preserve it.
CountLoop:
inc ecx ; One more valid char
; While we're counting, check for invalid digits
cmp dl, '0'
jl WriteNull
cmp dl, '3'
jg WriteNull
mov dl, BYTE PTR [ecx] ; Read the next char
test dl, dl ; End of string?
jnz CountLoop
sub ecx, esi
; -----------------------
; At this point, there is at least 1 valid digit, and
; ecx contains # digits
; bl still contains first digit as ascii
; Normally we process 3 digits at a time. But the number of
; digits to process might not be an even multiple of 3.
; This code finds the 'remainder' when dividing ecx by 3.
; It might seem like you could just use 'div' (and you can),
; but 'div' is so insanely expensive, that doing all these
; lines is *still* cheaper than a single div.
mov eax, ecx
mov edx, 0AAAAAAABh
mul edx
shr edx, 1
lea edx, [edx+edx*2]
sub ecx, edx ; This gives us the remainder (0-2).
; If the remainder is zero, use the normal 3 digit load
jz LoadTriplet
; -----------------------
; Build a triplet from however many leading 'odd' digits
; there are (1 or 2). Result is in al.
lea eax, DWORD PTR [ebx-48] ; This get us the first digit
; If there was only 1 digit, don't try to load 2
cmp cl, 1
je OneDigit
; Load the other digit
shl al, 2
mov bl, BYTE PTR [esi+1]
sub bl, 48
or al, bl
OneDigit:
add esi, ecx ; Update our pQuat pointer
jmp ProcessDigits
; -----------------------
; Build a triplet from the next 3 digits.
; Result is in al.
; bl contains the first digit as ascii
LoadTriplet:
lea eax, DWORD PTR [ebx-48]
shl al, 4 ; Make room for the other 2 digits.
; Second digit
mov cl, BYTE PTR [esi+1]
sub cl, '0'
shl cl, 2
or al, cl
; Third digit
mov bl, BYTE PTR [esi+2]
sub bl, '0'
or al, bl
add esi, 3 ; Update our pQuat pointer
; -----------------------
; At this point
; al: Triplet
; ch: DigitWritten (initially zeroed when computing remainder)
ProcessDigits:
mov dl, al
shr al, 3 ; left digit
and dl, 7 ; right digit
; If we haven't written any digits, and we are
; about to write a zero, skip it. This deals
; with both "000123" and "2" (due to OneDigit,
; the 'left digit' might be zero).
; If we haven't written any digits yet (ch == 0), and the
; value we are are about to write is zero (al == 0), skip
; the write.
or ch, al
jz Skip1
add al, '0' ; Convert to ascii
mov BYTE PTR [edi], al ; Write a digit
inc edi ; Update pointer to output buffer
jmp Skip1a ; No need to check again
Skip1:
or ch, dl ; Both check and update DigitWritten
jz Skip2
Skip1a:
add dl, '0' ; Convert to ascii
mov BYTE PTR [edi], dl ; Write a digit
inc edi ; Update pointer to output buffer
Skip2:
; Load the next digit.
mov bl, BYTE PTR [esi]
test bl, bl
jnz LoadTriplet
; -----------------------
; All digits processed. We know there is at least 1 valid digit
; (checked on entry), so if we never wrote anything, the value
; must have been zero. Since we skipped it to avoid
; unnecessary preceding zeros, deal with it now.
test ch, ch
jne WriteNull
mov BYTE PTR [edi], '0'
inc edi
; -----------------------
; Write the trailing NULL. Note that if the returned string is
; 0 bytes long, an error occurred (probably invalid digits).
WriteNull:
mov BYTE PTR [edi], 0
; -----------------------
; Cleanup
pop edi
pop ebx
pop esi
Failed:
ret
#PrintOct#8 ENDP
end
I've run a string with 1,000,000,000 quaternary digit thru it as well as all the values from 0-4,294,967,295. Seems to work.
I for one welcome our new 4-digited alien overlords.
I am relatively new to assembler, but when creating code what works with arrays and calculates the average of each row, I encountered a problem that suggests I don't know how division really works. This is my code:
.model tiny
.code
.startup
Org 100h
Jmp Short Start
N Equ 2 ;columns
M Equ 3 ;rows
Matrix DW 2, 2, 3 ; elements
DW 4, 6, 6 ; elements]
Vector DW M Dup (?)
S Equ Type Matrix
Start:
Mov Cx, M;20
Lea Di, Vector
Xor Si, Si
Cols: Push Cx
Mov Cx, N
Xor Bx, Bx
Xor Ax, Ax
Rows:
Add Ax, Matrix[Bx][Si]
Next:
Add Bx, S*M
Loop Rows
Add Si, S
Mov [Di], Ax
Add Di, S
Pop Cx
Loop Cols
Xor Bx, Bx
Mov Cx, M
Mov DX, 2
Print: Mov Ax, Vector[Bx]
IDiv Dx; div/idiv error here
Add Bx, S
Loop Print
.exit 0
There are no errors when compiling. Elements are counted correctly, but when division happens the debugger shows the program jumping to apparently random code. Why is this happening and how can I resolve it?
If you use x86 architecture, IDiv with 16-bit operand will also take Dx as a part of the integer to be divided and throw an exception (interrupt) if the quotient is too large to fit in 16bits.
Try something like this:
Mov Di, 2
Print: Mov Ax, Vector[Bx]
Cwd ; sign extend Ax to Dx:Ax
IDiv Di
I've tried reading about this all over the internet but here it is my problem. I am given a string of doublewords.
I have to order in decreasing order the string of the low words (least significant) from these doublewords. The high words remain unchanged.
For ex: strin DD 12345678h 1256ABCDh, 12AB4344h
the result would be 1234ABCDh, 12565678h, 12AB4344h.
Now I tried my best writting some code but it's not working properly, my insertion procedure. If you could take a look and tell me what I'm doing wrong, I'd be greatful.
I tried running it in td mode but I just can't figure out.
assume cs:code, ds:data
data segment
s dd 12345678h, 1256ABCDh, 12AB4344h
ls equ ($-s)/4 ;this is supposed to be the length of my source string
d dd ls dup (?) ;this is my destination string
aux dw ?
aux2 dw ?
data ends
code segment
insert proc
push di ;here I use the stack to get more free registers
push cx
cmp di, offset d ;if di=offset d it means that I didn't store any number yet
je addPrim
std ;we plan on working form right to left on the string for the next part
mov cx, di
sub cx, offset d ;here I find out with how many words I have to compare the word from AX
dec di
dec di ;since I work with doublewords, for some reason I thought I should decrease di
dec di ;3 times but here my procedure gets fuzzy and doesn't work properly anymore
repeta1: ;this repeat is supposed to compare the word from AX with the rest of the least
scasw ;significant words from es:di
jge DIplus2 ;if my number from AX is bigger or equal than what's in es:di, I increment
;di twice and store it
mov bx, word ptr es:[di+1] ;this part is supposed to interchange words but it's not
;working how I planned so I don't know how to change it
mov word ptr es:[di+2], bx
loop repeta1
jmp DIplus1
DIplus2:
inc di
DIplus1:
inc di
addPrim: ;this label just adds the first word in the destination string
stosw
pop cx
pop di
inc di
inc di
cld
ret
insert endp
start:
mov ax, data
mov ds, ax
mov es, ax
mov si, offset s
mov di, offset d
mov cx, ls ; store in cx the length of the strings
jcxz exit
repeta:
lodsw ;because of little endian, my first word will be my least significant word in the
;in the doubleword so right after it is moved in ax, i apply the procedure insert
call insert
lodsw ;here it moves in ax my most significan word in the dd, so i auto store it
stosw ;in my destination string
loop repeta
exit:
mov ax, 4c00h
int 21h
code ends
end start
ls equ ($-s)/4 ;this is supposed to be the length of my source string
This actually calculates the number of elements.
mov cx, di
sub cx, offset d ;here I find out with how many words ...
At the second invocation of your insert proc this will set CX=4 which is too big given a list of only 3 values. I suggest you divide CX by 4.
dec di
dec di ;since I work with doublewords...
dec di ;3 times but here my procedure gets fuzzy
This is certainly wrong. SCASW indicates you either decrement by 4 or not decrement at all!
mov bx, word ptr es:[di+1] ;this part is supposed to interchange words...
mov word ptr es:[di+2], bx
This cannot work since the offsets are only 1 byte apart!
jmp DIplus1
This yields an single increment of DI and thus an error because you want to store a word at that spot.
I'm one day into learning ASM and I've done a few tutorials, and even successfully modified the tutorial content to use jmp and cmp, etc instead of the MASM .if and .while macros.
I've decided to try and write something very, very simple to begin with before I continue with more advanced tutorials. I'm writing a Fibonacci number generator. Here is the source I have so far:
.386
.model flat, stdcall
option casemap :none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib
.code
start:
mov eax, 1
mov ecx, 1
_a:
push eax
add eax, ecx
pop ecx
; Jump to _b if there is an overflow on eax
; Print Values Here
jmp _a
_b:
push 0
call ExitProcess
end start
I intend to check for overflows on eax/ecx but right now I'm just interested in displaying the values of eax/ecx on the screen.
I know how to push the address of a constant string from .data and call StdOut which was the first example in the hello world tutorial, but this appears to be quite different (?).
There is this code provided by Microsoft itself
http://support.microsoft.com/kb/85068
Note that this code outputs AX register on 16 bit systems. But you can get the idea, you just need to convert AX value into ASCII characters by looping through each character. Skip the interrupts part and use your StdOut function.
mov dx, 4 ; Loop will print out 4 hex characters.
nexthex:
push dx ; Save the loop counter.
mov cl, 4 ; Rotate register 4 bits.
rol ax, cl
push ax ; Save current value in AX.
and al, 0Fh ; Mask off all but 4 lowest bits.
cmp al, 10 ; Check to see if digit is 0-9.
jl decimal ; Digit is 0-9.
add al, 7 ; Add 7 for Digits A-F.
decimal:
add al, 30h ; Add 30h to get ASCII character.
mov dl, al
;Use StdOut to print value of dl
;mov ah, 02h ; Prepare for interrupt.
;int 21h ; Do MS-DOS call to print out value.
pop ax ; Restore value to AX.
pop dx ; Restore the loop counter.
dec dx ; Decrement loop counter.
jnz nexthex ; Loop back if there is another character
; to print.
See here as well:
http://www.masm32.com/board/index.php?PHPSESSID=fa4590ba57dbaad4bc44088172af0b49&action=printpage;topic=14410.0