Access specific bit in embedded X86 assembly - visual-studio-2010

I am trying to acces a specific bit and modify it.
I have moved 0x01ABCDEF (hex value) into ecx and want to be able to check bit values at specific position.
For example I must take byte 0 of 0x01ABCDEF (0xEF)
check if bit at position 7 is 1
set the middle 4 bits to 1 and the rest to 0.

Under x86 the most simple solution is using bit manipulation instructions like BT (bit test), BTC (bit test and complement), BTR (bit test and reset) and BTS (bit test and set).
Bit test example:
mov dl, 7 //test 7th bit
bt ecx, edx //test 7th bit in register ecx
Remember: only last 5 bits in register edx is used.
or
bt ecx, 7
In both cases the result is stored in carry flag.

It's been years since I've done asm, but you want to and your value with 0x80 and if the result is zero your bit is not set so you jump out, otherwise continue along and set your eax to the value you want (I assume the four bits you mean are the 00111100 in the fourth byte.
For example (treat this as pseudo code as it's been far too long):
and eax, 0x80
jz exit
mov eax, 0x3C
exit:

Most CPUs do not support bit-wise access, so you have to use OR to set and AND to clear bits.
As I'm not really familiar with assembly I will just give you C-ish pseudocode, but you should easily be able to transform that to assembly instructions.
value = 0x01ABCDEF;
last_byte = value & 0xFF; // = 0xEF
if (last_byte & 0x40) { // is the 7th bit set? (0x01 = 1st, 0x02 = 2nd, 0x04 = 3rd, 0x08 = 4th, 0x10 = 5th, 0x20 = 6th, 0x40 = 7th, 0x80 = 8th)
value = value & 0xFFFFFF00; // clear last byte
value = value | 0x3C; // set the byte with 00111100 bits (0x3C is the hex representation of these bits)
}
Not that you can remove the last_byte assignment and directly check value & 0x40. However, if you want to check something which is not the least significant part, you have to do shifting first. For example, to extract ABCD you would use the following:
middle_bytes = (value & 0xFFFF00) >> 8;
value & 0cFFFF00 gets rif og the more significant byte(s) (0x01) and >> 8 shifts the result left by one byte and thus gets rid of the last byte (0xEF).

Related

Outputting Az-Za in MS-Debug

I'm currently struggling in outputting AzByCx in ms debug since I don't really know too much about it.
Included here are the Basic commands that my teacher sent to us.
I can output A-Z easily, but not AzBxCy, there's not much of a detailed tutorial online so I came here to ask.
-e 200 "Az"
-a
1BD8:0100 mov bx, word [200]
1BD8:0104 mov cx, 3
1BD8:0107 mov ah, 2
1BD8:0109 mov dl, bl
1BD8:010B int 21
1BD8:010D mov dl, bh
1BD8:010F int 21
1BD8:0111 add bx, FF01
1BD8:0115 loop 109
1BD8:0117 mov dl, 0D
1BD8:0119 int 21
1BD8:011B mov dl, 0A
1BD8:011D int 21
1BD8:011F mov ax, 4C00
1BD8:0122 int 21
1BD8:0124
-g
AzByCx
What does this do?
initialises word at ds:200h to the string "Az" (little endian low byte is "A", high byte is "z")
use A command to assemble the following program:
load from memory into register bx (to initialise the register without having to look up the numeric ASCII codepoints)
set up cx as loop counter for three iterations
set up register ah for the interrupt 21h service 02h (display character in dl)
display character read from bl (low byte of bx)
display character from bh (high byte of bx)
add 1 to bl and -1 (in two's complement resulting in 0FFh) to bh which increments the codepoint in bl and decrements the one in bh
loop back to display subsequent letters
display a linebreak (codepoints 13 and 10, or 0Dh 0Ah) to make the output easier to read
terminate program
use G command to run the program
Microsoft DEBUG.EXE can indeed be used to write a simple program in Intel-syntax assembly, see this example. Replace the string definition db "Hello world$" with db "AzByCx$" and you're done.
However, this is definitely not a good way how to learn assembly language.
Find a better teacher.

I still don't get how IMUL works in Assembly

I am a begginer with assembly i just started learning it and i don't get how the instruction IMUL really works
For example i'm working on this piece of code on visual studio:
Mat = 0A2A(hexadecimal)
__asm {
MOV AX, Mat
AND AL,7Ch
OR AL,83h
XOR BL,BL
SUB BL,2
IMUL BL
MOV Ris5,AX
}
the result in Ris5 should be 00AA (in hexadecimal), for the first couple lines i'm all good, from the first line to 'SUB BL,2'
the results are AL = AB (AX =0AAB)
but then starting from IMUL i'm stuck.
I know that IMUL executes a signed multiply of AL by a register or a byte or a word .. and stores the result in AX (here) but i can't find the same result (00AA)
MOV AX, Mat AX = 0x0A2A (...00101010)
AND AL,7Ch AX = 0x0A28 (...00101000)
OR AL,83h AX = 0x0AAB (...10101011)
XOR BL,BL BL = 0x00
SUB BL,2 BL = 0xFE
IMUL BL AX = 0xFFAB * 0xFFFE = 0x00AA
MOV Ris5,AX Ris5 = 0x00AA
When you multiply two N bit numbers the lower N bits don't care about signed vs unsigned, but as you pad the numbers then you get into signed vs unsigned multiply instructions as you will see in some instruction sets. To not lose precision you desire a 2*N number of bits result, grade school math:
00000000aaaaaaaa
* 00000000bbbbbbbb
=====================
AAAAAAAAAAaaaaaa
* BBBBBBBBBBbbbbbb
====================
Signed vs unsigned with the Capital letter representing the sign extension
0xAB = 171 unsigned = -85 signed
0xFE = 254 unsigned = -2 signed
unsigned multiply 171 * 254 = 43434 = 0xA9AA
signed multiply -85 * -2 = 170 = 0x00AA
The lower byte is the same as they are 8 bit operands and the sign extension doesn't come into play:
bbbbbbbb *a[0]
bbbbbbbb *a[1]
bbbbbbbb *a[2]
bbbbbbbb *a[3]
bbbbbbbb *a[4]
bbbbbbbb *a[5]
bbbbbbbb *a[6]
+ bbbbbbbb *a[7]
==================
cyyyyyyyxxxxxxxx
If you look up the columns the x bits are not affected by the sign extension so are the same for unsigned and signed. y bits are affected as well as the carry out of the msbit c which makes up the 16th bit of the result.
Now the tool is not complaining about this syntax, is it?
Mat = 0A2A(hexadecimal)
Without an h at the end or 0x or $ up-front that looks like octal, but the A's would cause an error if octal (or if decimal). Assuming you start with 0x0A2A, I think your understanding is solid.

MS-DOS Debugger, move to next memory location

I am using DosBox and I have a string read in from the buffer using an interrupt. I know where the first character is stored in memory, how do I increment to the next character?
0100 mov ah, 0a
0102 mov dx, 111
0105 int 21
0107 mov dl, [113] ;first character here
010b mov ah, 02
010d int 21
010f int 20
0111 db 0f
The question is how do I increment to the next character in the string? If I input the string "Hello" and then use inc dl it simply gives me the letter "I" instead of "e".
You need to pick a register to point at the current character. Once you have this there are instructions to advance your pointer as well as instructions to read the byte the pointer is pointing to.
Since you don't know how many characters will be read by the user input function, and you want this function to work with any input value, you'll need some sort of loop so you only print as many characters as have been read, although if you wanted to make it simple, you could make it only print the first 3 characters regardless of what was typed in. I'm going to assume you want to print out exactly what was typed by the user.
A good way to tackle this is to use the LOOP instruction which repeats (jumps into) the loop body by the count that has been placed into the CX register. Since the user input function gives us the character count, all we have to do is read that into the lower portion of the CX register (CL) for loop initialization. We'll also need a pointer to point to the current character. Once in the loop body, we'll just read whatever character is at our "current character" pointer and then call the DOS character-print function to output it to the console. Then we can advance the pointer and LOOP until all characters have been done. Note that each time the LOOP instruction is encountered, CX is decremented until it reaches zero. Once it is zero, LOOP no longer jumps back into the body and simply advances to the instruction following the LOOP.
NOTE: If you don't output a CRLF after reading the user input, there will be no line feed and the new output will overwrite where the input was read on the console. Effectively it will look like nothing happened.
Here's your modified sample:
0100 MOV AH,0A
0102 MOV DX,0123
0105 INT 21 ;input string using buffer at 0123
0107 MOV DX,0120
010A MOV AH,09
010C INT 21 ;output CRLF sequence first
010E MOV SI,0124 ;point SI at byte containing chars read
0111 XOR CX,CX ;CX = 0
0113 MOV CL,[SI] ;CX = chars_read
0115 INC SI ;mov SI to next char to display
0116 MOV DL,[SI] ;DL = character SI is pointing at
0118 MOV AH,02
011A INT 21 ;display character
011C LOOP 0115 ;loop back to INC instruction until no more chars left
011E INT 20 ;exit
0120 DB 0D ;CR
0121 DB 0A ;LF
0122 DB 24 ;"$" DOS string terminator
0123 DB 20 ;buffer start; max characters = 32
0124 DB 00 ; chars read goes here
0125 DB 00 ; input chars are read here

Assembler instruction: rdtsc

Could someone help me understand the assembler given in https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
It goes like this:
uint64_t msr;
asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
"shl $32, %%rdx\n\t" // Shift the upper bits left.
"or %%rdx, %0" // 'Or' in the lower bits.
: "=a" (msr)
:
: "rdx");
How is it different from:
uint64_t msr;
asm volatile ( "rdtsc\n\t"
: "=a" (msr));
Why do we need shift and or operations and what does rdx at the end do?
EDIT: added what is still unclear to the original question.
What does "\n\t" do?
What do ":" do?
delimiters output/input/clobbers...
Is rdx at the end equal to 0?
Just to recap. First line loads the timestamp in registers eax and edx. Second line shifts the value in eax and stores in rdx. Third line ors the value in edx with the value in rdx and saves it in rdx. Fourth line assigns the value in rdx to my variable. The last line sets rdx to 0.
Why are the first three lines without ":"?
They are a template. First line with ":" is output, second is optional input and third one is optional list of clobbers (changed registers).
Is a actually eax and d - edx? Is this hard-coded?
Thanks again! :)
EDIT2: Answered some of my questions...
uint64_t msr;
asm volatile ( "rdtsc\n\t" // Returns the time in EDX:EAX.
"shl $32, %%rdx\n\t" // Shift the upper bits left.
"or %%rdx, %0" // 'Or' in the lower bits.
: "=a" (msr)
:
: "rdx");
Because the rdtsc instruction returns it's results in edx and eax, instead of a straight 64-bit register on a 64-bit machine (See the intel system's programming manual for more information; it's an x86 instruction), the 2nd
instruction shifts the rdx register to the left 32 bits so that edx will be on the upper 32 bits instead of the lower 32 bits.
"=a" (msr) will move the contents of eax into msr (the %0), i.e. into the lower 32 bits of it, so in total you have edx (higher 32 bits) and eax (lower 32 bits) into rdx which is msr.
rdx is a clobber which will represent the msr C variable.
It's similar to doing the following in C:
static inline uint64_t rdtsc(void)
{
uint32_t eax, edx;
asm volatile("rdtsc\n\t", "=a" (eax), "=d" (edx));
return (uint64_t)eax | (uint64_t)edx << 32;
}
And:
uint64_t msr;
asm volatile ( "rdtsc\n\t"
: "=a" (msr));
This one, will just give you the contents of eax into msr.
EDIT:
1) "\n\t" is for the generated assembly to look clearer and error-free, so that you don't end up with things like movl $1, %eaxmovl $2, %ebx
2) Is rdx at the end equal to 0? The left shift does this, it removes the bits that are already in rdx.
3) Is a actually eax and d - edx? Is this hard-coded? Yes, there is a table that describes what characters represents what register, e.g. "D" would be rdi, "c" would be ecx, ...
rdtsc returns timestamp in a pair of 32-bit registers (EDX and EAX). First snippet combines them into single 64-bit register (RDX) which is mapped to msr variable.
Second snippet is the wrong one. I'm not sure about what will happen: either it won't be compiled at all, or only part of msr variable will be updated.

How do you store data in the memory without using a variable within Assembly (NASM)?

I have search every webpage for an answer but I cant seem to find it. I have been learning the net-wide assembly syntax for around 2 months, and I'm trying to find a way to store data in the memory.
I know that sys_break:
mov,eax 45
reserves memory, and I have a functional macro which reserves 16kb of memory:
%macro reserve_mem
mov eax,45
xor ebx,ebx
int 80h
add eax,16384
mov ebx,eax
mov eax,45
int 80h
%endmacro
I also know that when you reserve bytes (resb), words ect. in the .bss section, small parts of memory are allocated to that uninitialised data.
In addition, there is virtual memory which can be accessed with an address like 0x0000, and this is then mapped into its actual memory location.
However, my problem is that I am trying to store data in the memory, but everything I try ends in a segmentation fault(core dumped) which is the programm trying to access memory it doesn't have access to. I have tried code such as below.
mov [0x0000],eax
Thankyou for the help.
You seem to misunderstand the concept of virtual memory. It's similar to telephone number in that you cannot call someone at every combination of some 10 digits. You are supopsed to call at only those numbers listed in the telephone book. Otherwise you'll hear "Sorry, this number is currently out of service". Likewise, only those virtual addresses listed in the page table of each process (always automatically and transparently maintained by OS) are valid for the process to access. SEGV is OS's way of saying "Sorry, this virtual address is currently out of service."
In your code, you dereferenced 0x0000 but it is one of the least possible values for a vaild virtual address. You ended up in doing so because you had thrown away the valid virtual address returned by the brk(2) syscall (read man 2 brk carefully because the raw syscall behaves diffrently from both of glibc brk and sbrk.) Your code would translate to C in this way (though nowadays glibc malloc(3) often relies on mmap(2) rather than brk(2)):
void *p = malloc(16384);
int eax = ...;
(void *)0 = eax;
This is obviously wrong and you must do something like this:
void *p = malloc(16384);
int *p0 = (int *)p + 0;
int *p1 = (int *)p + 1;
int eax = ...;
int ebx = ...; /* it's all up to you which register to use */
*p0 = eax;
*p1 = ebx;
which should translates to NASM like this:
reserve_mem ; IIRC eax now points to the last
mov ecx, eax ; byte of the newly allocated chunk
sub ecx, 16383 ; set p0 (== p)
mov edx, ecx
add edx, 4 ; set p1; 4 is for sizeof(int)
; ... set whatever value to eax ...
; ... set whatever value to ebx ...
mov [ecx], eax ; *p0 = eax;
mov [edx], ebx ; *p1 = ebx;
My knowledge on assembly programming is rusting and the above codes may contain many errors... but the concept part should not be so wrong.

Resources