x86 Asm Insertion sort - sorting

I've read about this all over the internet, but I still can't solve my problem. I am given a string (array) of doublewords.
I have to sort, in decreasing order, the low (least significant) words of these doublewords. The high words remain unchanged.
For example: string DD 12345678h, 1256ABCDh, 12AB4344h
The result would be 1234ABCDh, 12565678h, 12AB4344h.
I tried my best to write some code, but my insertion procedure is not working properly. If you could take a look and tell me what I'm doing wrong, I'd be grateful.
I tried stepping through it in td, but I just can't figure it out.
assume cs:code, ds:data
data segment
s dd 12345678h, 1256ABCDh, 12AB4344h
ls equ ($-s)/4 ;this is supposed to be the length of my source string
d dd ls dup (?) ;this is my destination string
aux dw ?
aux2 dw ?
data ends
code segment
insert proc
push di ;here I use the stack to get more free registers
push cx
cmp di, offset d ;if di=offset d it means that I didn't store any number yet
je addPrim
std ;we plan on working form right to left on the string for the next part
mov cx, di
sub cx, offset d ;here I find out with how many words I have to compare the word from AX
dec di
dec di ;since I work with doublewords, for some reason I thought I should decrease di
dec di ;3 times but here my procedure gets fuzzy and doesn't work properly anymore
repeta1: ;this repeat is supposed to compare the word from AX with the rest of the least
scasw ;significant words from es:di
jge DIplus2 ;if my number from AX is bigger or equal than what's in es:di, I increment
;di twice and store it
mov bx, word ptr es:[di+1] ;this part is supposed to interchange words but it's not
;working how I planned so I don't know how to change it
mov word ptr es:[di+2], bx
loop repeta1
jmp DIplus1
DIplus2:
inc di
DIplus1:
inc di
addPrim: ;this label just adds the first word in the destination string
stosw
pop cx
pop di
inc di
inc di
cld
ret
insert endp
start:
mov ax, data
mov ds, ax
mov es, ax
mov si, offset s
mov di, offset d
mov cx, ls ; store in cx the length of the strings
jcxz exit
repeta:
lodsw ;because of little endian, my first word will be my least significant word in the
;in the doubleword so right after it is moved in ax, i apply the procedure insert
call insert
lodsw ;here it moves in ax my most significan word in the dd, so i auto store it
stosw ;in my destination string
loop repeta
exit:
mov ax, 4c00h
int 21h
code ends
end start

ls equ ($-s)/4 ;this is supposed to be the length of my source string
This actually calculates the number of elements.
mov cx, di
sub cx, offset d ;here I find out with how many words ...
At the second invocation of your insert proc this will set CX=4 which is too big given a list of only 3 values. I suggest you divide CX by 4.
dec di
dec di ;since I work with doublewords...
dec di ;3 times but here my procedure gets fuzzy
This is certainly wrong. Your elements are 4 bytes wide and SCASW itself moves DI by 2, so you either decrement by 4 or don't decrement at all!
mov bx, word ptr es:[di+1] ;this part is supposed to interchange words...
mov word ptr es:[di+2], bx
This cannot work since the offsets are only 1 byte apart!
jmp DIplus1
This yields a single increment of DI and thus an error, because you want to store a word at that spot and DI ends up on an odd offset inside the data.
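To make the target behaviour concrete, here is a minimal Python sketch of what the finished program should compute. It is only a model of the task, not a translation of the assembly: the function name sort_low_words_desc is made up for illustration, and it assumes the low words are compared as unsigned values, which is what the expected result in the question implies.

```python
def sort_low_words_desc(dwords):
    """Insertion-sort the low 16-bit words in decreasing (unsigned) order.

    The high word of every doubleword stays at its original position.
    """
    highs = [(d >> 16) & 0xFFFF for d in dwords]
    lows = []
    for d in dwords:
        word = d & 0xFFFF
        i = len(lows)
        lows.append(word)
        # Shift smaller words one slot to the right to make room.
        while i > 0 and lows[i - 1] < word:
            lows[i] = lows[i - 1]
            i -= 1
        lows[i] = word
    # Recombine the untouched high words with the sorted low words.
    return [(h << 16) | l for h, l in zip(highs, lows)]


result = sort_low_words_desc([0x12345678, 0x1256ABCD, 0x12AB4344])
print([hex(d) for d in result])
```

Note that the shifting loop moves whole words, one element at a time, which is the step the DI arithmetic in the assembly has to reproduce.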

Related

Converting quaternary to octal. ASM 8086

I have to write a program for the 8086 processor that converts a quaternary number to octal.
My idea:
Multiply every digit by the corresponding power of 4 and add it to a register. Then find the highest power of 8 that is not greater than the sum from the first step.
Divide by descending powers of 8 until the remainder equals 0. The quotient of each division is one octal digit.
But for a 16-digit number the last power of 4 is 4^15. I suppose this isn't an optimal algorithm.
Is there any other way? Maybe convert to binary and group the bits by 3.
Turns out you can indeed process values 3 digits at a time. Done this way, you can process strings of arbitrary length, without being limited by the size of a register. Not sure why you might need to, unless aliens try to communicate with us using ascii strings of quaternary digits with huge lengths. Might happen.
It's possible to do the translation either way (Right to Left or Left to Right). However, there's a bit of a challenge to either:
If you are processing RtL, you need to know the length of the output string before you start (so that you know where to write the digits as you compute them). This is do-able, but a bit tricky. In simplest terms, the length is ((strlen(Q) + 2) / 3) * 2. That almost gets it. However, you can end up with a blank space at the beginning for a number of cases. "1" as well as "10" will give the blank space. "20" won't. The correct value can be computed, but it's annoying.
Likewise, processing LtR has a similar problem. You don't have the problem of figuring out where to write digits, but consider: If the string to convert is "123", then the conversion is simple (33 octal). But what if you start processing, and the complete string is "1231" (155 octal)? In that case you need to process it like "001231" (01 55). IOW, digits can be processed in groups of 3, but you need to handle the initial case where the number of digits doesn't evenly divide by 3.
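As a quick model of the left-to-right idea, here is a minimal Python sketch. It is not the assembly routine, just the same arithmetic: three quaternary digits are 6 bits, which is exactly two octal digits. The function name quat_to_oct is made up for illustration, and returning an empty string on bad input is an assumption borrowed from the error convention described for the assembly version.

```python
def quat_to_oct(quat):
    """Convert an ASCII quaternary string to an ASCII octal string."""
    if not quat or any(ch not in "0123" for ch in quat):
        return ""  # empty output signals an error
    # Left-pad so the digit count is a multiple of 3 ("1231" -> "001231").
    padded = "0" * (-len(quat) % 3) + quat
    out = []
    for i in range(0, len(padded), 3):
        a, b, c = (ord(ch) - ord("0") for ch in padded[i:i + 3])
        triplet = (a << 4) | (b << 2) | c  # 6-bit value, 0..63
        out.append(chr(ord("0") + (triplet >> 3)))  # left octal digit
        out.append(chr(ord("0") + (triplet & 7)))   # right octal digit
    # Drop leading zeros, but keep a single "0" for the value zero.
    return "".join(out).lstrip("0") or "0"


print(quat_to_oct("123"), quat_to_oct("1231"))
```

Padding up front is the simplest way to handle the "odd" leading digits in a high-level language; the assembly has to special-case them instead because it works in place on the input string.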
Posting solutions to homework is usually something I avoid. However I doubt you are going to turn this in as your 'solution,' and it's (barely) possible that google might send someone here who needs something similar.
A few things to note:
This code is intended to be called from C using Microsoft's fastcall (it made testing easier) and compiled with masm.
While it is written in 32bit (my environment), there's nothing that particularly requires 32bit in it. Since you said you were targeting 8086, I've tried to avoid any 'advanced' instructions. Converting to 16bit or even 64bit should not present much of a challenge.
It processes from left to right.
As with any well-written routine, it validates its parameters. It outputs a zero length string on error, such as invalid digits in the input string.
It will crash if the output buffer is NULL. I suppose I could return a bool on error (currently returns void), but, well, I didn't.
I'm sure the code could be tighter (couldn't it always?), but for "homework project quality," it seems reasonable.
Other than that, the comments should explain the code.
.386
.model flat
.code
; Call from C via:
; extern "C" void __fastcall PrintOct(const char *pQuat, char *pOct);
; On Entry:
; ecx: pQuat
; edx: pOct
; On Exit:
; eax, ecx, edx clobbered
; all others preserved
; If pOct is zero bytes long, an error occurred (probably invalid digits)
@PrintOct@8 PROC
; -----------------------
; If pOct is NULL, there's nothing we can do
test edx, edx
jz Failed
; -----------------------
; Save the registers we modify (except for
; eax, edx and ecx which we treat as scratch).
push esi
push ebx
push edi
mov esi, ecx
mov edi, edx
xor ebx, ebx
; -----------------------
; esi: pQuat
; edi: pOct
; ebx: zero (because we use lea)
; ecx: temp pointer to pQuat
; Reject NULL pQuat
test esi, esi
jz WriteNull
; -----------------------
; Reject 0 length pQuat
mov bl, BYTE PTR [esi]
test bl, bl
jz WriteNull
; -----------------------
; How many chars in pQuat?
mov dl, bl ; bl is first digit as ascii. Preserve it.
CountLoop:
inc ecx ; One more valid char
; While we're counting, check for invalid digits
cmp dl, '0'
jl WriteNull
cmp dl, '3'
jg WriteNull
mov dl, BYTE PTR [ecx] ; Read the next char
test dl, dl ; End of string?
jnz CountLoop
sub ecx, esi
; -----------------------
; At this point, there is at least 1 valid digit, and
; ecx contains # digits
; bl still contains first digit as ascii
; Normally we process 3 digits at a time. But the number of
; digits to process might not be an even multiple of 3.
; This code finds the 'remainder' when dividing ecx by 3.
; It might seem like you could just use 'div' (and you can),
; but 'div' is so insanely expensive, that doing all these
; lines is *still* cheaper than a single div.
mov eax, ecx
mov edx, 0AAAAAAABh
mul edx
shr edx, 1
lea edx, [edx+edx*2]
sub ecx, edx ; This gives us the remainder (0-2).
; If the remainder is zero, use the normal 3 digit load
jz LoadTriplet
; -----------------------
; Build a triplet from however many leading 'odd' digits
; there are (1 or 2). Result is in al.
lea eax, DWORD PTR [ebx-48] ; This gets us the first digit
; If there was only 1 digit, don't try to load 2
cmp cl, 1
je OneDigit
; Load the other digit
shl al, 2
mov bl, BYTE PTR [esi+1]
sub bl, 48
or al, bl
OneDigit:
add esi, ecx ; Update our pQuat pointer
jmp ProcessDigits
; -----------------------
; Build a triplet from the next 3 digits.
; Result is in al.
; bl contains the first digit as ascii
LoadTriplet:
lea eax, DWORD PTR [ebx-48]
shl al, 4 ; Make room for the other 2 digits.
; Second digit
mov cl, BYTE PTR [esi+1]
sub cl, '0'
shl cl, 2
or al, cl
; Third digit
mov bl, BYTE PTR [esi+2]
sub bl, '0'
or al, bl
add esi, 3 ; Update our pQuat pointer
; -----------------------
; At this point
; al: Triplet
; ch: DigitWritten (initially zeroed when computing remainder)
ProcessDigits:
mov dl, al
shr al, 3 ; left digit
and dl, 7 ; right digit
; If we haven't written any digits, and we are
; about to write a zero, skip it. This deals
; with both "000123" and "2" (due to OneDigit,
; the 'left digit' might be zero).
; If we haven't written any digits yet (ch == 0), and the
; value we are about to write is zero (al == 0), skip
; the write.
or ch, al
jz Skip1
add al, '0' ; Convert to ascii
mov BYTE PTR [edi], al ; Write a digit
inc edi ; Update pointer to output buffer
jmp Skip1a ; No need to check again
Skip1:
or ch, dl ; Both check and update DigitWritten
jz Skip2
Skip1a:
add dl, '0' ; Convert to ascii
mov BYTE PTR [edi], dl ; Write a digit
inc edi ; Update pointer to output buffer
Skip2:
; Load the next digit.
mov bl, BYTE PTR [esi]
test bl, bl
jnz LoadTriplet
; -----------------------
; All digits processed. We know there is at least 1 valid digit
; (checked on entry), so if we never wrote anything, the value
; must have been zero. Since we skipped it to avoid
; unnecessary preceding zeros, deal with it now.
test ch, ch
jne WriteNull
mov BYTE PTR [edi], '0'
inc edi
; -----------------------
; Write the trailing NULL. Note that if the returned string is
; 0 bytes long, an error occurred (probably invalid digits).
WriteNull:
mov BYTE PTR [edi], 0
; -----------------------
; Cleanup
pop edi
pop ebx
pop esi
Failed:
ret
@PrintOct@8 ENDP
end
I've run a string with 1,000,000,000 quaternary digits through it, as well as all the values from 0 to 4,294,967,295. Seems to work.
I for one welcome our new 4-digited alien overlords.

assembly: sorting numbers using only conditional statements

I am new to assembly and I am trying to write a program that gets five numbers from the user, stores them in variables num1-num5, sorts them (without using arrays) so that num5 has the greatest value and num1 the lowest, and then displays the sorted numbers. I am having trouble figuring out how to approach this. I got the 5 numbers and stored them in the variables, but I am confused about how to start sorting. I have tried a few things but I keep getting errors. This is the code that I can actually get running, but it isn't working the way I want it to.
TITLE MASM Template (main.asm)
INCLUDE Irvine32.inc
.data
getnumber byte "Please enter a number between 0 and 20",0ah,0dh,0
num1 byte 0
num2 byte 0
num3 byte 0
num4 byte 0
num5 byte 0
.code
main PROC
call Clrscr
;************* get the information from the user*******************
mov edx, offset getnumber ;ask to input number
call writestring
call readint
mov bl, al
mov num1, bl ;get the number and move to num1 variable
mov edx, offset getnumber ;ask to input number
call writestring
call readint
mov bl, al
mov num2, bl ;get the number and move to num2 variable
mov edx, offset getnumber ;ask to input number
call writestring
call readint
mov bl, al
mov num3, bl ;get the number and move to num3 variable
mov edx, offset getnumber ;ask to input number
call writestring
call readint
mov bl,al
mov num4, bl ;get the number and move to num4 variable
mov edx, offset getnumber ;ask to input number
call writestring
call readint
mov bl, al
mov num5, bl ;get the number and move to num5 variable
;***show the user inputed numbers****
mov al, num1
call writeint
mov al, num2
call writeint
mov al,num3
call writeint
mov al, num4
call writeint
mov al,num5
call writeint
;*****start comparing***
cmp bl,num5
jl jumptoisless
jg jumptoisgreater
jumptoisless:
call writeint
jumptoisgreater:
mov bl, num5
mov dl, num4
mov num5, dl
mov num4, bl
call writeint
jmp imdone
imdone:
call dumpregs
exit
main ENDP
END main
Some notes to your code:
call readint
mov bl, al
mov num2, bl
Why don't you simply store al directly to memory, as: mov [num2],al? You don't use the bl anyway.
Except here:
;*****start comparing***
cmp bl,num5
jl jumptoisless
jg jumptoisgreater
Here I would worry about what call writeint does to ebx (or have you done your homework, and you know by heart that call writeint preserves ebx?).
And if ebx is preserved, then bl still contains num5 from the input, so the comparison will always find them equal.
Funnily enough, when they are equal, execution continues into the jumptoisless: part of the code, which outputs whatever is left over in al, and then falls through into the jumptoisgreater: part, effectively executing all of it.
Can you watch the CPU for a while in a debugger, single-stepping over the instructions, to understand a bit better how it works? It's a state machine, i.e. based on the current values in registers and the contents of memory, it changes the state of registers and memory in a deterministic way.
So unless you jump away, the next instruction is executed after the current one, and jl + jg don't cover the "equal" case (at least you do cmp only once, so hopefully you understand that the jcc instructions don't change flags and both jl/jg operate on the same cmp result in flags). The assembler doesn't care about the names of your labels, and it will not warn you that the "isgreater" code is executed even when the "isless" code was executed first.
About how to solve your task:
I can't think of anything reasonably short unless you treat the num1-num5 memory as an array, so that you can address it generically through a pointer with an index. So I will gladly let you try on your own; just a reminder that a comparison sort needs on the order of n*log2(n) compares to sort n values, so as a rough estimate even a very efficient sort would need about 5*3 = 15 cmp instructions (log2(5) rounds up to 3, as 2^3 = 8).
On the other hand, an inefficient (but simple to write and understand) bubble sort over an array can be done with a single cmp inside two loops.
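If you really must stay with five separate variables, the bubble sort can be unrolled into a fixed sequence of compare-and-swap steps. Here is a minimal Python sketch of that idea (the function name sort5 is made up for illustration); each `if` corresponds to one cmp plus a conditional jump around a swap in assembly, and there are 4+3+2+1 = 10 of them.

```python
def sort5(num1, num2, num3, num4, num5):
    """Sort five separate variables ascending with 10 compare-and-swap steps."""
    # Pass 1: the largest value bubbles up into num5.
    if num1 > num2:
        num1, num2 = num2, num1
    if num2 > num3:
        num2, num3 = num3, num2
    if num3 > num4:
        num3, num4 = num4, num3
    if num4 > num5:
        num4, num5 = num5, num4
    # Pass 2: the second largest ends up in num4.
    if num1 > num2:
        num1, num2 = num2, num1
    if num2 > num3:
        num2, num3 = num3, num2
    if num3 > num4:
        num3, num4 = num4, num3
    # Pass 3
    if num1 > num2:
        num1, num2 = num2, num1
    if num2 > num3:
        num2, num3 = num3, num2
    # Pass 4
    if num1 > num2:
        num1, num2 = num2, num1
    return num1, num2, num3, num4, num5


print(sort5(7, 20, 0, 13, 4))
```

Every step only skips or performs a swap, so there is no fall-through problem: after each `if` the code simply continues with the next compare.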
rcgldr made me curious, so I have been trying a few things...
With insertion sort it's possible to use at most 8 cmp (I hope the pseudo-syntax is understandable to him):
Insert(0, num1)
// ^ no cmp
Insert((num2 <= [0] ? 0 : 1), num2)
// ^ 1x cmp executed
Insert((num3 <= [0] ? 0 : (num3 <= [1] ? 1 : 2)), num3)
// ^ at most 2 cmp executed
Insert((num4 <= [1] ? (num4 <= [0] ? 0 : 1) : (num4 <= [2] ? 2 : 3)), num4)
// ^ always 2 of 3 cmp executed
Insert((num5 <= [1] ? (num5 <= [0] ? 0 : 1) : (num5 <= [2] ? 2 : (num5 <= [3] ? 3 : 4))), num5)
// ^ at most 3 of 4 cmp executed
=> total at most 8 cmp executed.
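Of course doing the "insert" with "position" over fixed variables would be a total PITA... ;) So this is a half-joking proposal, just to see whether 8x cmp is enough.
("6 compares" turned out to be my brain-fart, not possible AFAIK: five values have 5! = 120 possible orderings and 2^6 = 64 < 120, so at least 7 compares are needed.)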
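To check the "at most 8 cmp" claim, here is a minimal Python sketch that follows the same decision tree and counts every comparison. It cheats by using a list for the sorted part, which the original task forbids, so treat it purely as a way to verify the count; the names sort5_counting and le are made up for illustration.

```python
from itertools import permutations


def sort5_counting(values):
    """Binary-insertion sort of 5 values; returns (sorted list, compares used)."""
    compares = 0

    def le(a, b):
        nonlocal compares
        compares += 1  # one cmp instruction
        return a <= b

    s = [values[0]]  # no cmp needed for the first value
    n = values[1]
    s.insert(0 if le(n, s[0]) else 1, n)
    n = values[2]
    s.insert(0 if le(n, s[0]) else (1 if le(n, s[1]) else 2), n)
    n = values[3]
    s.insert((0 if le(n, s[0]) else 1) if le(n, s[1])
             else (2 if le(n, s[2]) else 3), n)
    n = values[4]
    if le(n, s[1]):
        pos = 0 if le(n, s[0]) else 1
    elif le(n, s[2]):
        pos = 2
    else:
        pos = 3 if le(n, s[3]) else 4
    s.insert(pos, n)
    return s, compares


worst = max(sort5_counting(p)[1] for p in permutations(range(5)))
print("worst case compares:", worst)
```

Running it over all 120 permutations shows the worst case directly instead of relying on counting the branches by hand.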

Jumping to random code when using IDIV

I am relatively new to assembler, but when writing code that works with arrays and calculates the average of each row, I encountered a problem that suggests I don't know how division really works. This is my code:
.model tiny
.code
.startup
Org 100h
Jmp Short Start
N Equ 2 ;columns
M Equ 3 ;rows
Matrix DW 2, 2, 3 ; elements
DW 4, 6, 6 ; elements]
Vector DW M Dup (?)
S Equ Type Matrix
Start:
Mov Cx, M;20
Lea Di, Vector
Xor Si, Si
Cols: Push Cx
Mov Cx, N
Xor Bx, Bx
Xor Ax, Ax
Rows:
Add Ax, Matrix[Bx][Si]
Next:
Add Bx, S*M
Loop Rows
Add Si, S
Mov [Di], Ax
Add Di, S
Pop Cx
Loop Cols
Xor Bx, Bx
Mov Cx, M
Mov DX, 2
Print: Mov Ax, Vector[Bx]
IDiv Dx; div/idiv error here
Add Bx, S
Loop Print
.exit 0
There are no errors when compiling. Elements are counted correctly, but when division happens the debugger shows the program jumping to apparently random code. Why is this happening and how can I resolve it?
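On x86, IDiv with a 16-bit operand divides the 32-bit value in Dx:Ax, so Dx is part of the dividend, and the CPU raises a divide-error exception (interrupt 0) if the quotient is too large to fit in 16 bits. In your loop Dx holds 2, so the dividend is 2*65536 + Ax and the quotient cannot fit; the jump to "random" code is most likely that interrupt being taken. Using Dx as the divisor is also a problem because IDiv leaves the remainder in Dx.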
Try something like this:
Mov Di, 2
Print: Mov Ax, Vector[Bx]
Cwd ; sign extend Ax to Dx:Ax
IDiv Di
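To see why the Cwd matters, here is a minimal Python model of a 16-bit IDiv. It is only a sketch of the documented behaviour (dividend in Dx:Ax, quotient to Ax, remainder to Dx, fault when the quotient does not fit in a signed 16-bit value); the names idiv16, cwd, to_signed and DivideError are made up for illustration.

```python
class DivideError(Exception):
    """Models the x86 divide-error fault (interrupt 0)."""


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def cwd(ax):
    """Return the Dx value that Cwd produces: the sign of Ax in every bit."""
    return 0xFFFF if ax & 0x8000 else 0x0000


def idiv16(dx, ax, divisor):
    """Divide Dx:Ax by a 16-bit divisor; returns (quotient, remainder)."""
    dividend = to_signed(((dx & 0xFFFF) << 16) | (ax & 0xFFFF), 32)
    d = to_signed(divisor, 16)
    if d == 0:
        raise DivideError("division by zero")
    quotient = abs(dividend) // abs(d)  # x86 truncates toward zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -0x8000 <= quotient <= 0x7FFF:
        raise DivideError("quotient does not fit in 16 bits")
    return quotient & 0xFFFF, remainder & 0xFFFF


print(idiv16(cwd(6), 6, 2))  # dividend is really 6
try:
    idiv16(2, 6, 2)          # dividend is 2*65536 + 6
except DivideError as err:
    print("fault:", err)
```

The first call models the corrected code, where Dx is the sign extension of Ax; the second models the original loop, where Dx still holds the divisor 2.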

Assembler (TASM x64) arrays and elements

I have an array of nine names:
.model tiny
.data
vardas1 db "Rokas",0ah,'$'
vardas2 db "Tomas",0ah,'$'
vardas3 db "Matas",0ah,'$'
vardas4 db "Domas",0ah,'$'
vardas5 db "Augis",0ah,'$'
vardas6 db "Vofka",0ah,'$'
vardas7 db "Marka",0ah,'$'
vardas8 db "Auris",0ah,'$'
vardas9 db "Edvis",0ah,'$'
vardai dw offset vardas1, offset vardas2, offset vardas3, offset vardas4, offset vardas5, offset vardas6, offset vardas7, offset vardas8, offset vardas9
.code
org 100h
I need to read a digit from the keyboard and then print the corresponding name. For example, if I press 5, the console should show "Augis". BTW, the second code block isn't the whole program, just the loop that doesn't work.
paieska:
mov dx, offset _comment1 ; Just string name asking user to input digit
mov ah, 9
int 21h
mov j, 00h ; Trying to input the digit from keyboard
mov ah, 01h
mov dl, 0ah
int 21h
mov bx, offset vardai ; Add array "names" to bx register
add bx, cx ; Add cx for indexing
mov dx, [bx] ; Add first array element to dx register
add cx, 2 ; Increasing cx by 2, because I'm using data word not data byte
mov ah, 9 ; Try to print it
int 21h
cmp cx, j ; Try to compare cx (index of array) to mine inputed digit "j"
jne paieska
je end
mov ah, 01h
mov dl, 0ah ;NO NEED FOR THIS - INT21/01 DOES NOT USE DL
int 21h
MOV AH, '1' ; MIN INPUT CHAR
mov bx, offset vardai ; Add array "names" to bx register WELL, ASSIGN ACTUALLY
MOV CX,2 ;NUMBER OF BYTES TO ADD (WORDS, NOT BYTES)
LOOPN:
mov dx, [bx] ; name-pointer array element to dx register
CMP AH,AL ; MATCHING char?
JE PNAME ; YES, PRINT NAME
add bx, cx ; Add cx=2 for next name
inc AH ; next possible character input
CMP AH,'9'+1 ; allowed is '1'..'9'
jne loopn ; in allowed range
; input not 1..9
mov dx, offset errormessage
PNAME:
mov ah, 9 ; Try to print it
int 21h
jmp end
Well, I tried to edit your approach with CAPS, but it became too complicated.
Essentially, you are reading a character from the keyboard using function 01. This character arrives in AL. If all goes well, it should be '1'..'9'. Notice these are the ASCII characters '1'..'9', that is hex 31..39.
Next step is to set BX to the start of the table, AH to the minimum character you anticipate and CX to 2 because the table contains words, not bytes.
Now we have a loop. Load DX from the table, and check whether AL is equal to AH. If the user input 1, these will be equal, so go print the string.
Otherwise, add 2 to BX to point to the next entry in the table (this could have been done by ADD BX,2 or INC BX INC BX, which would mean the MOV CX,2 would be unnecessary - just the way I wrote it...) and increment the '1' in AH to '2'.
The end-condition for the loop is when AH gets incremented from '9' to - well, ':' or '9'+1. If it hasn't reached that end-condition, then run around the loop until all of the values '1'..'9' have been tested. If you haven't got to PNAME yet, then there's an error because the character input wasn't allowed, so point to an error message and then print it.
Now jumping to the end - probably you'd want to terminate the program, so you'd execute
MOV AH,4CH
INT 21H
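For comparison, here is a minimal Python sketch of the lookup loop described above, next to a direct-indexing alternative that is not in my assembly (subtract '1' from the character and use the result as the table index). The names lookup_scan, lookup_direct and ERROR_MESSAGE are made up for illustration, and the error text is just a placeholder.

```python
NAMES = ["Rokas", "Tomas", "Matas", "Domas", "Augis",
         "Vofka", "Marka", "Auris", "Edvis"]
ERROR_MESSAGE = "invalid input"  # placeholder for the errormessage string


def lookup_scan(key):
    """Walk the table while counting a candidate character from '1' up."""
    candidate = ord("1")  # AH: minimum allowed input character
    index = 0             # BX: position in the pointer table
    while candidate != ord("9") + 1:
        if candidate == ord(key):  # CMP AH,AL
            return NAMES[index]
        index += 1      # next table entry (add 2 bytes in the assembly)
        candidate += 1  # next possible character
    return ERROR_MESSAGE


def lookup_direct(key):
    """Alternative: turn the character into an index in one step."""
    index = ord(key) - ord("1")
    if 0 <= index < len(NAMES):
        return NAMES[index]
    return ERROR_MESSAGE


print(lookup_scan("5"), lookup_direct("5"))
```

Both versions treat anything outside '1'..'9' as an error; the direct version just replaces the loop with one subtraction and a range check.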

ASCII code interpretation (assembly)

First of all, thanks for all the help thus far.
Complete code can be found here
I have trouble understanding these lines. I wrote some comments...
The line, for example, mov es:[si+6], al means move data in al to the memory address marked by si+6 (I think this would be an offset calculation).
Then what is add si,40 in the loop?
Any help means everything to me! Thank you.
L0_95: ; this segment prints ASCII code 0 - 95
mov si,6 ; refers to the string we declared at the beginning
mov cx,4 ; I think this is the height?
C1A:
; this loop adds the name of the column
mov al,string[0]
mov es:[si],al
mov al,string[2]
mov es:[si+6],al
mov al,string[4]
mov es:[si+24],al
mov al,string[6]
mov es:[si+28],al
add si,40 ;;;; what is this line?
loop C1A
mov si,122 ;;;; and these three lines?
mov bx,0
mov cx,4
C1B:push cx
mov cx,24
add si,40
C1C:push cx
call DEC_CONVERT
add si,2
call HEX_CONVERT
add si,2
call BIN_CONVERT
add si,2
call CHAR_CONVERT
inc bx
add si,126
pop cx
loop C1C
pop cx
sub si,3840
loop C1B
ret
L96_191:
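add si,40 advances the si register by 40. Since si is used as es:[si], it is a byte offset into the segment in es, so each pass of the C1A loop writes the four characters 40 bytes further along.
mov si,122 sets the si register to 122, which is again an offset rather than a value to print; it is probably the starting position for the table that follows (the later sub si,3840 is 24*160, which would fit 24 rows of an 80-column text screen at 2 bytes per character, but that is a guess without the full code). The remaining two instructions load bx with 0 (it is incremented once per converted value) and cx with 4, the counter for the outer loop C1B.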
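If es does point at text-mode video memory (an assumption, the posted excerpt does not show how es is set up), the offsets can be turned into screen positions with a little arithmetic. Here is a minimal Python sketch; the constants assume an 80-column screen with 2 bytes per character cell, and the name cell_of is made up for illustration.

```python
BYTES_PER_CELL = 2  # assumed: character byte + attribute byte
COLUMNS = 80        # assumed: 80-column text mode
BYTES_PER_ROW = COLUMNS * BYTES_PER_CELL


def cell_of(offset):
    """Return (row, column) of the character cell at a byte offset."""
    row, rest = divmod(offset, BYTES_PER_ROW)
    return row, rest // BYTES_PER_CELL


# Offsets used by the C1A loop: si starts at 6 and grows by 40 each pass.
for si in (6, 46, 86, 126):
    print(si, cell_of(si))

# 24 rows of 160 bytes: matches the sub si,3840 in the code.
print(24 * BYTES_PER_ROW)
```

Under that assumption, add si,40 moves 20 character columns to the right, which would place the four column headers across the first row.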

Resources