I have written code for the 8086 microprocessor that takes a string from the keyboard and displays it, as follows:
Title Get the string from keyboard and display it
.model small
.stack 100h
.data
str1 db 'Enter String ','$'
str2 db 50 dup('$')
str3 db 0dh, 0ah, '$'
.code
main proc
mov ax,@data
mov ds,ax
mov ah,09h ; for displaying Enter String
lea dx,str1
int 21h
mov ah,0ah ; for taking i/p from keyboard
lea dx,str2
int 21h
mov ah,09h ; for displaying in new line
lea dx,str3
int 21h
mov ah,09h ; for displaying what you have entered
lea dx,str2+2
int 21h
int 21h
mov ah,4ch
int 21h
main endp
end main
I don't understand why we have to give the effective address of the string as str2+2 to print the entered string back. If I simply use lea dx, str2, no string is displayed.
Thanks in advance.
The first byte at STR2 should contain the maximum number of characters to read. The second byte should contain the number of characters already present. Because you initialized STR2 with 50 '$' characters and the ASCII value of '$' is 36 you effectively asked DOS to allow an input of 36 characters that are already there!
It is better to code: str2 db 50,0,50 dup (0)
At the conclusion of this 'Buffered STDIN Input' the second byte will contain the number of characters that were read. I hope you see now why you need lea dx,str2+2. That's the address where the characters are.
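Please note that with a first byte of 50, input is limited to 49 characters, because DOS appends a terminating carriage return (0Dh), which is not counted in the second byte. DOS does not append a '$' itself, so if the buffer is not pre-filled with '$' you must store one after the text before printing it with function 09h.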
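The buffer layout described in this answer can be sketched as a small model. This is only an illustration in Python, not DOS code: `buffered_input` is a hypothetical helper, and it assumes DOS stores a carriage return (0Dh) after the typed characters and does not count it in the second byte.

```python
CR = 0x0D  # carriage return that DOS stores after the typed characters

def buffered_input(buffer: bytearray, typed: bytes) -> None:
    """Model of INT 21h / AH=0Ah filling `buffer` in place (hypothetical helper)."""
    max_chars = buffer[0]               # byte 0: capacity, including the CR
    kept = typed[:max_chars - 1]        # at most capacity - 1 characters are accepted
    buffer[1] = len(kept)               # byte 1: count, CR not included
    buffer[2:2 + len(kept)] = kept      # bytes 2..: the characters themselves
    buffer[2 + len(kept)] = CR          # terminator written after the text

# Equivalent of: str2 db 50,0,50 dup (0)
str2 = bytearray([50, 0]) + bytearray(50)
buffered_input(str2, b"hello")

text = bytes(str2[2:2 + str2[1]])       # what `lea dx,str2+2` points at
print(str2[0], str2[1], text)
```

The text starts at offset 2 because the first two bytes are bookkeeping, which is why printing from `str2` itself shows nothing useful.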
I'm trying to get a simple Hello world program in NASM to run.
I want to print to the console without using C-Libraries, interfacing directly with WinAPI.
I am using the Visual Studio provided LINK.EXE for linking.
Here's my code so far:
section .data
message: db 'Hello world!',10 ; 'Hello world!' plus a linefeed character
messageLen: db $-message ; Length of the 'Hello world!' string
global _start
extern GetStdHandle
extern WriteConsoleW
extern ExitProcess
section .text
_start:
; DWORD bytes;
mov rbp, rsp
sub rsp, byte 8
; hStdOut = GetStdHandle(STD_OUTPUT_HANDLE)
mov ecx, -11
call GetStdHandle
; WriteFile(hstdOut, message, length(message), &bytes, 0);
mov rcx, rax
mov rdx, message
mov r8, messageLen
lea r9, [rsp-4]
push 0
call WriteConsoleW
; ExitProcess(0)
mov rcx, 0
call ExitProcess
ret
Which I assemble and link like this:
nasm -f win64 .\ASM.ASM
link /entry:_start /nodefaultlib /subsystem:console .\ASM.obj "C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x64\kernel32.lib" "C:\Program Files (x86)\Windows Kits\10\Lib\10.0.18362.0\um\x64\user32.lib"
However when I run the resulting .exe file, I get nothing.
Some things I tried so far are
Using the decorated names (like _GetStdHandle@4), which resulted in the linker complaining about unresolved references
Not trying to print anything and calling Sleep, which resulted in the process sleeping indefinitely
Exiting with a different return code, which once again did nothing
What am I doing wrong?
EDIT: Fixed calling convention
There are three problems with your revised code. The first is:
message: db 'Hello world!',10 ; 'Hello world!' plus a linefeed character
messageLen: db $-message ; Length of the 'Hello world!' string
You defined messageLen to be a byte containing the length of the message and storing that value at the address of messageLen. You then do this:
mov r8, messageLen
That would move the address of label messageLen to r8. What you really should have done is define messageLen as an assembly time constant like this:
messageLen equ $-message ; Length of the 'Hello world!' string
The second problem is that you define the string as a sequence of single-byte characters:
message: db 'Hello world!',10 ; 'Hello world!' plus a linefeed character
There is nothing wrong with this, but to print them you need to use the ANSI version of WriteConsole, which is WriteConsoleA. WriteConsoleW interprets the buffer as Unicode text (UTF-16 on Windows 2000 and later, UCS-2 on NT4 and earlier versions of Windows), so your single-byte characters are not displayed as intended.
The third problem concerns the mandatory 32 bytes of shadow space that must be reserved on the stack, below any stack-based parameters, before making a function call. You also need to make sure the stack pointer (RSP) is 16-byte aligned at the point of the call. These requirements are described in the Microsoft 64-bit calling convention.
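To see why a single-byte string looks wrong when it is treated as 16-bit characters, here is a short Python sketch. It only models the reinterpretation of the bytes; it does not call the console API.

```python
# 13 single-byte characters, as produced by the `db` line above
message = "Hello world!\n".encode("ascii")

# A wide-character API reads the buffer as 16-bit code units, so each pair
# of ASCII bytes is fused into one unrelated character.
even = message[: len(message) // 2 * 2]   # drop the odd trailing byte
as_utf16 = even.decode("utf-16-le")

print(len(message), len(as_utf16))
print([hex(ord(ch)) for ch in as_utf16])
```

The first two bytes, 'H' (48h) and 'e' (65h), become the single code unit 6548h, and so on for every pair.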
Code that would take this into account would look like this:
section .data
message: db 'Hello world!',10 ; 'Hello world!' plus a linefeed character
messageLen equ $-message ; Length of the 'Hello world!' string
global _start
extern GetStdHandle
extern WriteConsoleA
extern ExitProcess
section .text
_start:
; At _start the stack is 8 bytes misaligned because the caller's return
; address has been pushed onto the stack.
; 8 bytes of temporary storage for `bytes`.
; allocate 32 bytes of stack for shadow space.
; 8 bytes for the 5th parameter of WriteConsole.
; An additional 8 bytes for padding to make RSP 16 byte aligned.
sub rsp, 8+8+8+32
; At this point RSP is aligned on a 16 byte boundary and all necessary
; space has been allocated.
; hStdOut = GetStdHandle(STD_OUTPUT_HANDLE)
mov ecx, -11
call GetStdHandle
; WriteConsoleA(hStdOut, message, length(message), &bytes, 0);
mov rcx, rax
mov rdx, message
mov r8, messageLen
lea r9, [rsp+40] ; Address for `bytes`
; RSP+0 through RSP+31 are the 32 bytes of shadow space
mov qword [rsp+32], 0 ; First stack parameter of WriteConsoleA function
call WriteConsoleA
; ExitProcess(0)
; mov rcx, 0
; call ExitProcess
; alternatively you can exit by setting RAX to 0
; and doing a ret
add rsp, 8+8+32+8 ; Restore the stack pointer.
xor eax, eax ; RAX = return value = 0
ret
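The `8+8+8+32` figure used in this answer can be checked with a little arithmetic. The sketch below is in Python and `frame_size` is a hypothetical helper; it assumes only what the answer states: 32 bytes of shadow space, 8-byte stack slots, and RSP being 8 bytes off a 16-byte boundary on entry.

```python
SHADOW = 32   # bytes the caller reserves for the callee's register arguments
SLOT = 8      # size of one stack slot on x64

def frame_size(local_bytes: int, stack_args: int) -> int:
    """Bytes to subtract from RSP at function entry (hypothetical helper)."""
    needed = SHADOW + stack_args * SLOT + local_bytes
    # On entry RSP is 8 bytes off a 16-byte boundary because of the pushed
    # return address, so (needed + padding + 8) must be a multiple of 16.
    padding = (-(needed + SLOT)) % 16
    return needed + padding

size = frame_size(local_bytes=8, stack_args=1)
print(size)   # 56, matching 8+8+8+32
```

With one local (`bytes`) and one stack parameter, 48 bytes are needed and 8 bytes of padding restore the alignment.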
Supposing we have got a text file sample.txt:
one
two
...
Now we want to remove the first line:
two
...
A quick way to do that is to use input redirection, set /P and findstr (see footnote 1; I know there are other ways using more or for /F, but let us forget about them for now):
@echo off
< "sample.txt" (
set /P =""
findstr "^"
)
The output is going to be as expected.
However, why is the output empty when I replace the input redirection < by type and a pipe | :
@echo off
type "sample.txt" | (
set /P =""
findstr "^"
)
When I replace set /P ="" by pause > nul, the output is what I expect -- the input file is output but with the first character of the first line missing (as it is consumed by pause). But why does set /P seem to consume everything instead of only the first line like it does with the redirection < approach? Is that a bug?
To me it looks like set /P fails to adequately initialise the reading pointer to the piped data.
I observed that strange behaviour on Windows 7 and on Windows 10.
It becomes even weirder: when calling the script containing the pipe multiple times, for instance by a loop like for /L %I in (1,1,1000) do @pipe.bat, and the input file contains about fifteen lines or more, sometimes (a few times out of a thousand) a fragment of the input file is returned; that fragment is exactly the same each time; it seems that there are always 80 bytes missing at the beginning.
1) findstr hangs in case the last line is not terminated by a line-break, so let us assume such is there.
When retrieving data, set /p tries to fill a 1023-character buffer (if that many are available) with data from stdin. Once this read operation has ended, the buffer is searched for the first end of line, and once it has been found (or the end of the buffer has been reached), the SetFilePointer API is called to reposition the input stream pointer just after the end of the line that was read. This way the next read operation starts retrieving data after that line.
This works flawlessly when a disk file is associated with the input stream, but as Microsoft states in the SetFilePointer documentation
The hFile parameter must refer to a file stored on a seeking device;
for example, a disk volume. Calling the SetFilePointer function with a
handle to a non-seeking device such as a pipe or a communications
device is not supported, even though the SetFilePointer function may
not return an error. The behavior of the SetFilePointer function in
this case is undefined.
What happens is that, while no error is generated, the call to reposition the read pointer fails when stdin is associated with a pipe: the pointer is not moved back, and the 1023 bytes (or however many bytes were available) stay consumed.
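The read-then-reposition behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not the cmd.exe implementation: `set_p_read_line` and `PipeLike` are hypothetical names, only a `\r\n` line ending is handled, and the pipe is imitated by a stream whose seek silently does nothing.

```python
import io

BUF = 1023  # set /p requests up to 1023 characters per read

class PipeLike(io.BytesIO):
    """Stand-in for a pipe: readable, but repositioning has no effect."""
    def seekable(self):
        return False

    def seek(self, offset, whence=0):
        # Imitates the "may not return an error" case: nothing moves.
        return self.tell()

def set_p_read_line(stream):
    """Read a chunk, keep the first line, then try to rewind past it."""
    start = stream.tell()
    chunk = stream.read(BUF)
    end = chunk.find(b"\r\n")
    line = chunk if end < 0 else chunk[:end]
    consumed = len(chunk) if end < 0 else end + 2
    stream.seek(start + consumed)   # effective only on a seekable stream
    return line

data = b"one\r\ntwo\r\nthree\r\n"

disk = io.BytesIO(data)
print(set_p_read_line(disk), disk.read())   # the rest is left for findstr

pipe = PipeLike(data)
print(set_p_read_line(pipe), pipe.read())   # nothing is left
```

On the seekable stream the second reader still sees `two` and `three`; on the pipe-like stream the whole chunk has already been consumed.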
edited in response to Aacini request
The set command is processed by the eSet function, which calls SetWork to determine which type of set command will be executed.
As it is a set /p, the SetPromptUser function is called, and from this function the ReadBufFromInput function is called:
add esp, 0Ch
lea eax, [ebp+var_80C]
push eax ; int
push 3FFh ; int
lea eax, [ebp+Value]
push eax ; int
xor esi, esi
push 0FFFFFFF6h ; nStdHandle
mov word ptr [ebp+Value], si
call edi ; GetStdHandle(x) ; GetStdHandle(x)
push eax ; hFile
call _ReadBufFromInput@16 ; ReadBufFromInput(x,x,x,x)
it requests 3FFh (1023) characters from standard input handle (0FFFFFFF6h = -10 = STD_INPUT_HANDLE)
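The two constants in the listing can be decoded by hand. A minimal Python check of the arithmetic (`as_signed32` is a hypothetical helper name):

```python
def as_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

print(as_signed32(0xFFFFFFF6))   # -10, i.e. STD_INPUT_HANDLE
print(0x3FF)                     # 1023 characters requested
```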
ReadBufFromInput uses the GetFileType API to determine if it should read from the console or from a file
; Attributes: bp-based frame
; int __stdcall ReadBufFromInput(HANDLE hFile, int, int, int)
_ReadBufFromInput@16 proc near
hFile= dword ptr 8
; FUNCTION CHUNK AT .text:4AD10D3D SIZE 00000006 BYTES
mov edi, edi
push ebp
mov ebp, esp
push [ebp+hFile] ; hFile
call ds:__imp__GetFileType@4 ; GetFileType(x)
and eax, 0FFFF7FFFh
cmp eax, 2
jz loc_4AD10D3D
and, as in this case it is a pipe (GetFileType returns 3) the code jumps to the ReadBufFromFile function
; Attributes: bp-based frame
; int __stdcall ReadBufFromFile(HANDLE hFile, LPWSTR lpWideCharStr, DWORD cchWideChar, LPDWORD lpNumberOfBytesRead)
_ReadBufFromFile@16 proc near
var_C= dword ptr -0Ch
cchMultiByte= dword ptr -8
NumberOfBytesRead= dword ptr -4
hFile= dword ptr 8
lpWideCharStr= dword ptr 0Ch
cchWideChar= dword ptr 10h
lpNumberOfBytesRead= dword ptr 14h
This function will call the ReadFile API function to retrieve the indicated number of characters
push ebx ; lpOverlapped
push [ebp+lpNumberOfBytesRead] ; lpNumberOfBytesRead
mov [ebp+var_C], eax
push [ebp+cchWideChar] ; nNumberOfBytesToRead
push edi ; lpBuffer
push [ebp+hFile] ; hFile
call ds:__imp__ReadFile@20 ; ReadFile(x,x,x,x,x)
The returned buffer is iterated over in search of an end of line, and once it is found, the pointer in the input stream is moved to just after the position found
.text:4AD06A15 loc_4AD06A15:
.text:4AD06A15 cmp [ebp+NumberOfBytesRead], 3
.text:4AD06A19 jl short loc_4AD06A2D
.text:4AD06A1B mov al, [esi]
.text:4AD06A1D cmp al, 0Ah
.text:4AD06A1F jz loc_4AD06BCF
.text:4AD06A25
.text:4AD06A25 loc_4AD06A25:
.text:4AD06A25 cmp al, 0Dh
.text:4AD06A27 jz loc_4AD06D14
.text:4AD06A2D
.text:4AD06A2D loc_4AD06A2D:
.text:4AD06A2D movzx eax, byte ptr [esi]
.text:4AD06A30 cmp byte ptr _DbcsLeadCharTable[eax], bl
.text:4AD06A36 jnz loc_4AD12018
.text:4AD06A3C dec [ebp+NumberOfBytesRead]
.text:4AD06A3F inc esi
.text:4AD06A40
.text:4AD06A40 loc_4AD06A40:
.text:4AD06A40 cmp [ebp+NumberOfBytesRead], ebx
.text:4AD06A43 jg short loc_4AD06A15
.text:4AD06BCF loc_4AD06BCF:
.text:4AD06BCF cmp byte ptr [esi+1], 0Dh
.text:4AD06BD3 jnz loc_4AD06A25
.text:4AD06BD9 jmp loc_4AD06D1E
.text:4AD06D14 loc_4AD06D14:
.text:4AD06D14 cmp byte ptr [esi+1], 0Ah
.text:4AD06D18 jnz loc_4AD06A2D
.text:4AD06D1E
.text:4AD06D1E loc_4AD06D1E:
.text:4AD06D1E mov eax, [ebp+var_C]
.text:4AD06D21 mov [esi+2], bl
.text:4AD06D24 sub esi, edi
.text:4AD06D26 inc esi
.text:4AD06D27 inc esi
.text:4AD06D28 push ebx ; dwMoveMethod
.text:4AD06D29 push ebx ; lpDistanceToMoveHigh
.text:4AD06D2A mov [ebp+cchMultiByte], esi
.text:4AD06D2D add esi, eax
.text:4AD06D2F push esi ; lDistanceToMove
.text:4AD06D30 push [ebp+hFile] ; hFile
.text:4AD06D33 call ds:__imp__SetFilePointer@16 ; SetFilePointer(x,x,x,x)
A short summary of a long discussion at dostips (already mentioned by Aacini):
set /p problems with pipes.
Reading with set /p from a redirect always reads to the end of the line and removes the \r\n characters.
Reading with set /p from a pipe, reads up to 1023 bytes from the pipe buffer.
It doesn't stop at any \r or \n characters but it drops all content after a \n.
After closing the pipe on the left side, a set /p on the right side will read empty lines.
See full code here.
I have filled a buffer (malloc'd) with an fread call and it is a success. I am now trying to iterate over the buffer and commence parsing the input. I'm trying to start really simple by walking the buffer and output each char to the screen. But my loop is just outputting the entire input. Here is the loop portion of the code:
mov ecx, 0
mov ebx, buffer
.readByte:
push DWORD [ebx + 1 * ecx]
push DWORD ecx
push DWORD char
call _printf
add esp, 12
inc ecx
cmp ecx, [fsz]
jge .endRead
jmp .readByte
The contents of the source file that is read in (s1.txt) is:
1 + 2;
My goal is to simply output:
1
+
2
;
Since you used %s format, which indicates a string, and that without a length specifier, why did you expect it to print just a single character? You should try %c format and something like movzx eax, byte [ebx + ecx]; push eax to pass the argument. A %.1s format specifier could also work and then you can keep your argument passing. Don't forget to add a newline too, if you want that. You could also just use putchar of course.
Oh, and ecx is a caller-saved register, as such any function you call may destroy its value. So if you want to keep using that, you need to save and restore it yourself.
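The difference between printing a string and printing one character per iteration can be modelled outside assembly. The sketch below is in Python; the buffer contents, the `fsz` value of 6 and the `"%d: %c"` format are assumptions for illustration, since the question does not show the actual format string.

```python
buffer = b"1 + 2;\n"      # stand-in for the bytes fread placed in the buffer
fsz = 6                   # number of bytes to walk, as in [fsz]

lines = []
for index in range(fsz):
    byte = buffer[index]                      # one byte, like movzx eax, byte [ebx + ecx]
    lines.append("%d: %c" % (index, byte))    # %c consumes exactly one character

print("\n".join(lines))
```

The loop counter here is an ordinary Python variable, so nothing can clobber it; in the assembly version the counter in `ecx` has to be saved around the `printf` call, as noted above.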
I want to create a macro in FASM which can directly print a string (in DOS), like this:
prints 'hey there!!!!'
I have written such code:
format MZ
use16
stack 0x100
entry _TEXT@16:_start
;
macro prints str
{
call @f
db str, 0x24
@@:
pop dx
mov ah, 9
int 0x21
}
segment _DATA@16 use16
msg db 'hi!', 0xd, 0xa, 0x24
segment _TEXT@16 use16
_start:
push _DATA@16
pop ds
prints 'hi there))) !!!!'
prints 'me'
mov ax, 0x4c00
int 0x21
ret
The problem is: when I leave my _DATA@16 segment empty (without any variables), all is fine.
But when I define new variable in that segment some raw extra symbols begin to appear like this: http://board.flatassembler.net/files/err_758.png
So can you help me? where is my mistake?
Maybe I have chosen the wrong way to achieve that thing I want?
Help please....
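As far as I understand, it is because int 21h function 09h expects DX to hold an offset within the segment in DS (here _DATA@16), whereas the macro pops an offset that is relative to the code segment _TEXT@16. So the easiest way is to use only one segment in the program, or just to build a .com file. Here is a sample: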
use16
org 0x100
macro prints [str*]
{
pusha
if str in <0xd, 0xa, 9>\
| str eqtype ''
call @f
db str, 0x24
@@:
pop dx
else
mov dx, str
end if
mov ah, 9
int 0x21
popa
}
_start:
prints 0xd, 0xa, 9
prints 'hi!', 0xd, 0xa
mov ax, msg
prints ax, 0xd, 0xa
prints msg
int 0x20
ret
msg db 'hey there!', 0x24
It can accept strings directly, addresses of strings in registers and variables.
It can also handle 3 special characters - 0xd (CR), 0xa (LF) and 9 (TAB).
If I find the way to display string directly in multi-segment programs, I will post the answer.
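The segment mismatch described in this answer can be illustrated with real-mode address arithmetic. This is a Python sketch under stated assumptions: the segment values 1000h and 1001h and the offset 0008h are made up, chosen so that a non-empty data segment places the code segment one paragraph (16 bytes) higher.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode physical address of segment:offset."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical load addresses, not taken from a real program.
DATA_SEG, TEXT_SEG = 0x1000, 0x1001
string_offset = 0x0008          # offset popped into DX, relative to CS

where_string_is = linear(TEXT_SEG, string_offset)
where_dos_looks = linear(DATA_SEG, string_offset)   # AH=09h reads DS:DX
print(hex(where_string_is), hex(where_dos_looks))
```

The two addresses differ by 16 bytes, so DOS prints whatever precedes the string until it meets a `$`. If the data segment is empty, both segments plausibly start at the same paragraph, which would explain why the original program appeared to work in that case.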
I am currently trying to append a null terminator to a user-entered string:
.386
.model flat, stdcall
WriteFile PROTO STDCALL:DWORD, :PTR, :DWORD, :PTR DWORD, :PTR OVERLAPPED
ReadFile PROTO STDCALL:DWORD, :PTR, :DWORD, :PTR DWORD, :PTR OVERLAPPED
GetStdHandle PROTO STDCALL:DWORD
.data
buff DB 100h DUP(?)
stdInHandle DWORD 0
bytesRead DWORD ?
.code
start:
;read string from stdin
INVOKE GetStdHandle, -10
MOV stdInHandle, eax
INVOKE ReadFile, stdInHandle, BYTE PTR[buff], 100, ADDR bytesRead, 0
;append null terminator on CR,LF
MOV eax, bytesRead
MOV edx, BYTE PTR[buff]
SUB eax, 2
AND BYTE PTR [eax+edx], 0
RET
END start
It refuses to assemble at MOV edx, BYTE PTR[buff] and gives me an error:
error: Invalid combination of opcode and operands (or wrong CPU setting).
So I'm assuming I cannot MOV the value of BYTE PTR[buff] into register edx. So I can't even begin to test if this method of trying to apply a NULL terminator to a string will even work.
My question is, what is wrong with the above code (should I use a different register instead of edx?)
What is the best way to apply a NULL terminator to the string?
You can't move a byte value into a dword sized register. You either need to use a byte sized register such as dl, or zero-extend it with movzx. As you are working with bytes, I suggest you go with the first option.
When I had to write string routines without using anything from good ole Irvine, I got the length of the string, added 1 to it to make room for the null terminator, and then stored 0h at the end of the string, at the position the pointer had reached when the counter ran out.
MOV EAX, SIZEOF lpSourceString + 1 ; Get the string length of string, add 1 to include null-terminator
INVOKE allocMem, EAX ; Allocate memory for a target to copy to
LEA ESI, [lpSourceString] ; put source address in ESI
MOV EDI, EAX ; copy the dest address to another register we can increment
MOV ECX, SIZEOF lpSourceString ; Set up loop counter
We have the size of the string. Now we can add the null terminator to it. To do that, we need to make sure that we have a pointer looking at the end of the string. So if we have a method that returns a string in EAX, EAX needs to point to the start of the string (so we leave the pointer returned by allocMem unmodified and increment a copy in EDI instead). Let's say that we are putting characters in a string:
nextByte: ; Jump label, get the next byte in the string until ECX is 0
MOV DL, [ESI] ; Get the next character in the string
MOV [EDI], DL ; Store the byte at the position of EDI
INC ESI ; Move to next char in source
INC EDI ; INCrement EDI by 1
loop nextByte ; Re-loop to get next byte
MOV byte ptr[EDI], 0h ; Add null-terminator to end of string
; EAX holds a pointer to the start of the dynamically-allocated
; 0-terminated copy of lpSourceString
MOV requires the byte ptr size specifier because neither the [EDI] memory operand nor the 0 immediate operand would imply a size for the operation. The assembler wouldn't know if you meant a byte, word, or dword store.
I have this in my MASM, but I use a String_length stdcall method I had written due to a class requirement.
This is so common that the MASM32 runtime supplies this functionality as part of its runtime. All you need to do is include the relevant code:
include \masm32\include\masm32rt.inc
Then use the StripLF function as so:
invoke StripLF, addr buff
To fix your current problem (if you want to do it manually) , you need to move the address of buff to edx instead.
mov edx, offset buff
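The manual approach (address of the buffer plus `bytesRead - 2`) can be modelled as follows. This is a Python sketch of the indexing only; the typed text is made up, and it assumes the input ends with CR, LF as console input normally does.

```python
buff = bytearray(0x100)                 # buff DB 100h DUP(?)
typed = b"hello\r\n"                    # console input ends with CR, LF
buff[:len(typed)] = typed
bytes_read = len(typed)                 # value ReadFile stores in bytesRead

# Same effect as: mov edx, offset buff / mov eax, bytesRead / sub eax, 2
# followed by a byte store of 0 at [eax+edx]
buff[bytes_read - 2] = 0                # overwrite the CR with the terminator

text = bytes(buff[:buff.index(0)])
print(text)                             # b'hello'
```

If fewer than two bytes are read (for example at end of file on redirected input), `bytes_read - 2` is negative, so real code should check the count before storing the terminator.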