Why is fscanf reading garbage? - gcc

#include <stdio.h>
#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#define PATH "F:\\c\\projects\\Banking Management System\\data\\"
#define F_ACCT "accounts.txt"
#define FILENAME(file) PATH file
#define F_ACCT_FPRINTF "%05d%-8s%-30s%d%d%d%-20s%-20s%-20s%c%-15.2lf\n"
#define F_ACCT_FSCANF "%05d%8s%30[^\n]%d%d%d%20[^\n]%20[^\n]%20[^\n]%c%lf\n"
typedef struct Date
{
int dd;
int mm;
int ccyy;
} Date;
typedef struct Account
{
int id;
char acct_no[8];
char name[30];
Date birthday;
char telephone_no[20];
char mobile_no[20];
char tfn[20];
char acct_type; // 'S' - Saving | 'C' - Current | 'F' - Fixed | 'R' - Recurring
double acct_bal;
} Account;
int main(int argc, char *argv[])
{
Account *ac_t=malloc(sizeof(Account));
if (ac_t==NULL)
{
free(ac_t);
perror("Fatal error: ");
exit(EXIT_FAILURE);
}
FILE *fp=fopen(FILENAME(F_ACCT),"a+"); // Save option selected by the user
if (!fp) // fopen() returns NULL on failure
{
free(ac_t);
perror("ERROR:");
exit(EXIT_FAILURE);
}
// fscanf() returns the number of fields successfully converted
int tmp = fscanf(fp,F_ACCT_FSCANF,\
&ac_t->id,\
ac_t->acct_no,\
ac_t->name,\
&ac_t->birthday.dd,\
&ac_t->birthday.mm,\
&ac_t->birthday.ccyy,\
ac_t->telephone_no,\
ac_t->mobile_no,\
ac_t->tfn,\
&ac_t->acct_type,\
&ac_t->acct_bal);
printf("\ntmp=%d", tmp);
printf("\n[%d]",ac_t->id);
printf("\n[%s]",ac_t->acct_no);
printf("\n[%s]",ac_t->name);
printf("\n[%d]",ac_t->birthday.dd);
printf("\n[%d]",ac_t->birthday.mm);
printf("\n[%d]",ac_t->birthday.ccyy);
printf("\n[%s]",ac_t->telephone_no);
printf("\n[%s]",ac_t->mobile_no);
printf("\n[%s]",ac_t->tfn);
printf("\n[%c]",ac_t->acct_type);
printf("\n[%lf]",ac_t->acct_bal);
system("pause");
fclose(fp);
free(ac_t);
return 0;
}
=========================================================================================
Input file (accounts.txt)
=========================
000011000 Anil Dhar 27111960(02) 8883 2827 0408 942 407 111222333 S 100.21
Note: The record was created successfully using fprintf() as per F_ACCT_FPRINTF.
**Problem**
=======
fscanf is reading garbage values like this:
ac_t->id 1
t_acct_no
name Anil Dhar
birthday.dd 27
birthday.mm 11
birthday.ccyy 1960
telephone_no (02) 8883 2827 0408 942 407 111222333 Sogram Files\Intel\ip¬tαK4
mobile_no 0408 942 407 111222333 Sogram Files\Intel\ip¬tαK4
tfn 111222333 Sogram Files\Intel\ip¬tαK4
t_acct_type
acct_bal 74895632819821970000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000
All my string variables like name, telephone_no, mobile_no, tfn could contain spaces.
The record is not delimited with anything. My fscanf() is not populating the fields properly wherever I am reading the string variables.
What could have gone wrong????

%20[^\n] reads up to 20 characters and stops early only at a newline, so it swallows the spaces and the fields that follow. Read up to a delimiter instead, a space for example (this only works if the fields themselves contain no spaces). Note that all conversion specifiers except %[, %c and %n skip leading whitespace (spaces, tabs and newlines) automatically, and that any whitespace character in the format string matches zero or more whitespace characters in the input. See the scanf documentation. I would also sprinkle spaces into the format to make it more readable. Do not add a trailing \n. Always check the return value of scanf, because scanning may fail. Finally, read at most one character fewer than the size of each buffer, or make each buffer one byte larger, to leave room for the terminating zero byte.
"%d %7[^ ] %29[^ ] %d %d %d %19[^ ] %19[^ ] %19[^ ] %c %lf"

%s reads whitespace delimited strings, while %[^\n] reads up to the next newline. It appears that you want a specific number of characters, so you want %c. So you should have something more like:
#define F_ACCT_FSCANF "%5d%8c%30c%2d%2d%4d%20c%20c%20c%c%lf"
I'm not sure if this is exactly correct, because your printing format is somewhat ambiguous (using %d rather than %02d means you don't know how many digits you'll get). It also will go off the rails badly if your input file had its spacing modified in any way, so you might want to use fgets+sscanf rather than fscanf, as that will at least allow you to resynchronize after any corrupted line.
One thing to be careful of -- %8c will read exactly 8 characters into the buffer provided as an argument with NO terminating NUL -- if you want NUL-terminated strings, you'll need to add the terminator manually, which also means each buffer must be one byte larger than its field width.

Related

Non-ASCII character cast to int

I would like to cast a non-ASCII character (for example 'ą') to int to get its numeric value in UTF-8. When I do something like this:
#include <iostream>
using namespace std;
int main()
{
cout << static_cast<int>('ą')<<endl;
return 0;
}
I get -71, which is not its proper value in UTF-8. I heard that this might be because 'ą' is stored in 2 bytes and one of them is cut off when the variable is initialized. Is there a solution for this?

Unexpected value appears on stack when attempting buffer overflow

I am trying to learn more about cyber security, in this case about buffer overflows. I have a simple program whose control flow I want to change:
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
void win()
{
printf("code flow successfully changed\n");
}
int main(int argc, char **argv)
{
volatile int (*fp)();
char buffer[64];
fp = 0;
gets(buffer);
if(fp) {
printf("calling function pointer, jumping to 0x%08x\n", fp);
fp();
}
}
Using some tools, I determined that the function pointer (fp) is overwritten once 72 characters have been written into the buffer. The function win() is located at 0xe5894855, so after 72 characters I need to supply that value for the program to jump to the desired function.
However, I am facing this issue:
Feeding the output of Python 3's print("A"*18*4 + "UH" + "\x89" + "\xe5") into the C program should put the desired value 0xe5894855 in the section marked in red. Instead, I get the highlighted malformed bytes from somewhere: 89 gains an extra C2 in front of it, and an incorrect e5 value overflows into the next part of the stack. (Those parts of the stack are zero initially and only change once the overflow is attempted.)
Why is this happening? Am I passing the hex values to the C program incorrectly?
Edit: I still have not figured out why passing the bytes through Python did not work, but I found a different method using Perl: perl -e 'print "A"x4x18 . "\x55\x48\x89\xe5"', which did work. The address I needed to jump to was also incorrect (which I also fixed).

Read a utf8 file to a std::string without BOM

I am trying to read UTF-8 content into a char*. My file does not have a BOM, so the code is straightforward (the file contains Unicode punctuation):
char* fileData = "\u2010\u2020";
I cannot see how a single unsigned char (range 0 to 255) can contain a character whose value is in the range 0 to 65535, so I must be missing something.
...
std::ifstream fs8("../test_utf8.txt");
if (fs8.is_open())
{
unsigned line_count = 1;
std::string line;
while ( getline(fs8, line))
{
std::cout << line_count++ << '\t' << line << '\n';
}
}
...
So how can I read a UTF-8 file into a char* (or even a std::string)?
Well, you ARE reading the file correctly into std::string, and std::string can hold UTF-8. It is probably your console* that cannot display non-ASCII characters.
Basically, when a character's code point is greater than 127 (it does not fit in 7 bits), UTF-8 represents that character with several chars.
How, and how many chars? This is what encoding is all about.
UTF-32, for example, stores every character, ASCII or not, as 4 bytes, hence the "32" (each byte is 8 bits, 4*8 = 32).
Without any additional information about which OS you are using, we can't advise on how your program can display the file's lines.
*Or, more exactly, the standard output, which will probably be implemented as console text.

Listing all files in a folder, only first character of file name gets printed [duplicate]

I'm trying to access all images in a designated folder, get their names, and then pass them for further processing (getting their pixel values, to be precise, but this isn't relevant now). The following test code should list the name of every image found, however, for some reason it only lists the first letter for each image.
#include <stdio.h>
#include <string.h>
#include <windows.h>
int main(int argc, char* argv[])
{
WIN32_FIND_DATA search_data;
memset(&search_data, 0, sizeof(WIN32_FIND_DATA));
HANDLE handle = FindFirstFile(L"images\\*.jpg", &search_data);
while(handle != INVALID_HANDLE_VALUE)
{
printf("Found file: %s\r\n", search_data.cFileName);
if(FindNextFile(handle, &search_data) == FALSE)
break;
}
return 0;
}
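Your program is compiled for Unicode, so cFileName is a wide (UTF-16) string, but printf's %s expects a narrow char string. For ASCII file names, the second byte of each UTF-16 character is zero, so printf stops after the first character. Change the %s to %S (which in Microsoft's printf means a wide string).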

using MultiByteToWideChar

The following code prints the desired output but it prints garbage at the end of the string. There is something wrong with the last call to MultiByteToWideChar but I can't figure out what. Please help??
#include "stdafx.h"
#include<Windows.h>
#include <iostream>
using namespace std;
#include<tchar.h>
int main( int, char *[] )
{
TCHAR szPath[MAX_PATH];
if(!GetModuleFileName(NULL,szPath,MAX_PATH))
{cout<<"Unable to get module path"; exit(0);}
char ansiStr[MAX_PATH];
if(!WideCharToMultiByte(CP_ACP,WC_COMPOSITECHECK,szPath,-1,
ansiStr,MAX_PATH,NULL,NULL))
{cout<<"Unicode to ANSI failed\n";
cout<<GetLastError();exit(1);}
string s(ansiStr);
size_t pos = 0;
while(1)
{
pos = s.find('\\',pos);
if(pos == string::npos)
break;
s.insert(pos,1,'\\');
pos+=2;
}
if(!MultiByteToWideChar(CP_ACP,MB_PRECOMPOSED,s.c_str(),s.size(),szPath,MAX_PATH))
{cout<<"ANSI to Unicode failed"; exit(2);}
wprintf(L"%s",szPath);
}
MSDN has this to say about the cbMultiByte parameter:
If this parameter is -1, the function processes the entire input
string, including the terminating null character. Therefore, the
resulting Unicode string has a terminating null character, and the
length returned by the function includes this character.
If this parameter is set to a positive integer, the function processes
exactly the specified number of bytes. If the provided size does not
include a terminating null character, the resulting Unicode string is
not null-terminated, and the returned length does not include this
character.
So if you want the output string to be null-terminated, either include the terminator in the length you pass in (for example, s.size() + 1), or terminate the buffer yourself using the return value.

Resources