Same .txt files, different sizes? - utf-8

I have a program that reads from a .txt file.
I run it from the command prompt, passing the name of the text file to read from.
Example: program.exe myfile.txt
The problem is that sometimes it works, sometimes it doesn't.
The original file is 130KB and doesn't work.
If I copy/paste the contents, the file is 65KB and works.
If I copy/paste the file and rename it, it's 130KB and doesn't work.
Any ideas?
After more testing it shows that this is what makes it not work:
int main(int argc, char *argv[])
{
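char *infile1 = NULL; /* Must start as NULL for the check in the loop below. */
char *outfile = NULL;
char tmp[1024] = { 0x0 };
FILE *in;
int i, opt = 0, optarg1 = 0;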
for (i = 1; i < argc; i++) /* Skip argv[0] (program name). */
{
if (strcmp(argv[i], "-sec") == 0) /* Process optional arguments. */
{
opt = 1; /* This is used as a boolean value. */
/*
* The last argument is argv[argc-1]. Make sure there are
* enough arguments.
*/
if (i + 1 <= argc - 1) /* There are enough arguments in argv. */
{
/*
* Increment 'i' twice so that you don't check these
* arguments the next time through the loop.
*/
i++;
optarg1 = atoi(argv[i]); /* Convert string to int. */
}
}
else /* not -sec */
{
if (infile1 == NULL) {
infile1 = argv[i];
}
else {
if (outfile == NULL) {
outfile = argv[i];
}
}
}
}
in = fopen(infile1, "r");
if (in == NULL)
{
fprintf(stderr, "Unable to open file %s: %s\n", infile1, strerror(errno));
exit(1);
}
while (fgets(tmp, sizeof(tmp), in) != 0)
{
fprintf(stderr, "string is %s.", tmp);
//Rest of code
}
}
Whether it works or not, the code inside the while loop gets executed.
When it works tmp actually has a value.
When it doesn't work tmp has no value.
EDIT:
Thanks to sneftel, we know what the problem is.
To use fgetws() instead of fgets(), I need tmp to be a wchar_t* instead of a char*.
Type casting does not seem to work.
I tried changing the declaration of tmp to
wchar_t tmp[1024] = { 0x0 };
but I realized that tmp is a parameter in strtok() used elsewhere in my code.
Here is what I tried in that function:
//tmp is passed as the first parameter in parse()
void parse(wchar_t *record, char *delim, char arr[][MAXFLDSIZE], int *fldcnt)
{
if (*record != NULL)
{
char*p = strtok((char*)record, delim);
int fld = 0;
while (p) {
strcpy(arr[fld], p);
fld++;
p = strtok('\0', delim);
}
*fldcnt = fld;
}
else
{
fprintf(stderr, "string is null");
}
}
But typecasting to char* in strtok doesn't work either.
Now I'm looking for a way to just convert the file from UTF-16 to UTF-8 so tmp can be of type char*.
I found this, which looks like it could be useful, but the example takes its UTF-16 input from the user. How can that input be taken from the file instead?
http://www.cplusplus.com/reference/locale/codecvt/out/

It sounds an awful lot like the original file is UTF-16 encoded. When you copy/paste its contents in your text editor, you then save the result as a new text file in the editor's default encoding (ASCII or UTF-8). Since an ASCII-range character takes 2 bytes in a UTF-16-encoded file but only 1 byte in a UTF-8-encoded file, the file size is roughly halved when you save it out.
UTF-16 is fine, but you'll need to use Unicode-aware functions (that is, not fgets) to work with it. Every ASCII-range character in UTF-16 carries a zero byte, and byte-oriented functions such as fgets and strtok treat that zero byte as the end of the string. If you don't want to deal with all that Unicode jazz right now, and you don't actually have any non-ASCII characters to deal with in the file, just do the manual conversion (either with your copy/paste or with a command-line utility) before running your program.
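If you would rather do the conversion inside the program, here is a minimal sketch of converting UTF-16 bytes to UTF-8 by hand. It assumes the file is little-endian UTF-16 (the usual case on Windows), optionally starting with the FF FE byte order mark. The function names utf16le_to_utf8 and append_utf8 are made up for this example and are not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
static void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert raw UTF-16LE bytes (optionally starting with the FF FE BOM) to UTF-8.
// Unpaired surrogates become U+FFFD; a trailing odd byte is ignored.
std::string utf16le_to_utf8(const std::string& bytes)
{
    // Read the 16-bit code unit stored at byte offset `pos` (low byte first).
    auto unit = [&bytes](std::size_t pos) -> std::uint32_t {
        return static_cast<unsigned char>(bytes[pos]) |
               (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[pos + 1])) << 8);
    };

    std::string out;
    std::size_t i = 0;
    if (bytes.size() >= 2 && unit(0) == 0xFEFF) {
        i = 2;  // skip the byte order mark
    }
    while (i + 1 < bytes.size()) {
        std::uint32_t cp = unit(i);
        i += 2;
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate: needs a low surrogate next
            std::uint32_t lo = (i + 1 < bytes.size()) ? unit(i) : 0;
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {  // low surrogate with no partner
            cp = 0xFFFD;
        }
        append_utf8(out, cp);
    }
    return out;
}
```

To use it, open the file in binary mode ("rb"), read the raw bytes into a std::string, and convert once. The result is an ordinary char string, so the existing strtok-based parsing can stay as it is.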

Related

How to read hex and mhx extension in CAPL Script

I have:
ReadBinFile () //Reads 2048 byte from bin file
{
transferlength = fileGetBinaryBlock(buffer, 2048 , fileHandle);
}
Now I want to read files with the .hex and .mhx extensions, but I could not find a built-in function for this. What are the options for doing this in a CAPL script?
*.hex or *.mhx files can be read with the help of the OpenFileRead function.
The piece of code below reads a .hex file.
void readhexfile(void)
{
dword i;
char LineBuffer[0xFF];
int ByteCount;
char CountAscii[5];
int Data;
char Ascii[5];
dword readaccess = 0;
dword bufferpointer = 0;
byte buffer[1*1024*1024]; //1MB hex file size
if ((-1) != strstr_regex(FILENAME, ".hex")) //FILENAME-> Sysvariable
{
readaccess = OpenFileRead (FILENAME,0);
/* --> identified as IntelHEX-Input */
if (readaccess != 0)
{
/* read line until cr+lf */
while (fileGetString(LineBuffer, elcount(LineBuffer), readaccess) != 0)
{
// check for Record Type 00 (Data Record)
if (LineBuffer[0] == ':'
&& LineBuffer[7] == '0'
&& LineBuffer[8] == '0'
)
{
// extract ByteCount parameter
strncpy(CountAscii, "0x", elcount(CountAscii));
substr_cpy_off(CountAscii, 2, LineBuffer, 1, 2, elcount(CountAscii));
ByteCount = atol(CountAscii);
// extract Data parameter
for (i = 0; i < ByteCount; i++)
{
strncpy(Ascii, "0x", elcount(Ascii));
substr_cpy_off(Ascii, 2, LineBuffer, 9+i*2, 2, elcount(Ascii));
Data = atol(Ascii);
buffer[bufferpointer++] = Data;
};
}
}
fileClose(readaccess);
}
}
}
A similar approach can be used for reading the .mhx file format.
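For comparison, here is the same record layout handled outside CAPL, as a hedged C++ sketch. Unlike the CAPL snippet above, it also verifies the record checksum. Like that snippet, it ignores the address field and the extended-address record types. The names append_data_record and hex_digit are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert one hex digit to its value, or return -1 if it is not a hex digit.
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse one Intel HEX line. For a valid data record (type 00) the payload bytes
// are appended to `out` and true is returned; any other line returns false.
bool append_data_record(const std::string& line, std::vector<std::uint8_t>& out)
{
    if (line.size() < 11 || line[0] != ':') return false;

    // Decode the hex pairs after ':' into raw bytes.
    std::vector<std::uint8_t> bytes;
    for (std::size_t i = 1; i + 1 < line.size(); i += 2) {
        int hi = hex_digit(line[i]);
        int lo = hex_digit(line[i + 1]);
        if (hi < 0 || lo < 0) break;  // stop at CR/LF or other trailing characters
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }

    // Layout: byte count, address high, address low, record type, data..., checksum
    if (bytes.size() < 5 || bytes.size() != static_cast<std::size_t>(bytes[0]) + 5) {
        return false;
    }
    unsigned sum = 0;
    for (std::uint8_t b : bytes) sum += b;
    if ((sum & 0xFF) != 0) return false;  // all bytes, checksum included, sum to 0 mod 256
    if (bytes[3] != 0x00) return false;   // not a data record

    out.insert(out.end(), bytes.begin() + 4, bytes.end() - 1);
    return true;
}
```

Calling it once per line of the file and collecting the bytes in one vector gives the same flat buffer that the CAPL loop builds.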
You can easily convert a *.hex/*.mhx file into a binary using srecord:
srec_cat file.hex -intel -o file.bin -binary
Then you can read the resulting binary file with fileGetBinaryBlock().
If you need to process *.hex files dynamically, you can call srec_cat directly from your CAPL code using sysExec().
dword openFileRead(char filename[], dword mode);
This function opens the file named filename for the read access.
If mode=0 the file is opened in ASCII mode;
if mode=1 the file is opened in binary mode.
To open a hex file:
dword HexFileHandle;
HexFileHandle = openFileRead(HEX_File_Path, 0);
The return value is the file handle that must be used for read operations.
If an error occurs, the return value is 0.

not able to read a text file in vs2010 using c .. i am new to vs please help me

Hi, this is my code. I kept my text file in exactly the same place as the .exe, but it still doesn't work.
int main(int argc, _TCHAR* argv[])
{
int result = 0;
char str[60]; /* line buffer used by fgets below */
FILE *fp;
fp = fopen("sample.txt","r"); // read mode
if( fp == NULL )
{
perror("Error while opening the file");
return 1; /* do not use fp when the open failed */
}
if( fgets (str, 60, fp)!=NULL )
{
/* writing content to stdout */
puts(str);
}
fclose(fp);
}
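Try this. I mostly work in C and C++, and I use this code to perform file operations:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char filename[64];
    char acline[80];
    FILE *p;
    printf("Enter the name of the file you wish to see (without .txt)\n");
    /* fgets instead of gets: bounded read that leaves room for ".txt" */
    if (fgets(filename, sizeof(filename) - 4, stdin) == NULL)
        return EXIT_FAILURE;
    filename[strcspn(filename, "\r\n")] = '\0'; /* strip the trailing newline */
    strcat(filename, ".txt");
    puts(filename);
    p = fopen(filename, "r");
    if (p == NULL)
    {
        printf("%s file is missing\n", filename);
        system("pause");
        return EXIT_FAILURE; /* do not read from a NULL FILE pointer */
    }
    while (fgets(acline, sizeof(acline), p) != NULL) /* stops at end of file or on error */
    {
        fputs(acline, stdout);
    }
    fclose(p);
    printf("\n File end\n");
    system("pause");
    return 0;
}
Note that the loop tests the result of fgets() directly instead of using while (!feof(p)). feof() only reports end of file after a read has already failed, so a while (!feof(p)) loop prints the last line twice.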

WinAPI C++ client detect write on anonymous pipe before reading

I am writing a C++ (Windows) client console application which reads from an anonymous pipe on STDIN. I would like to be able to use my program as follows:
echo input text here | my_app.exe
and do something in the app with the text that is piped in
OR
my_app.exe
and then use some default text inside of the app instead of the input from the pipe.
I currently have code that successfully reads from the pipe on STDIN given the first situation:
#include <Windows.h>
#include <iostream>
#include <string>
#define BUFSIZE 4096
int main(int argc, const char *argv[]) {
char char_buffer[BUFSIZE];
DWORD bytes_read;
HANDLE stdin_handle;
BOOL continue_reading;
unsigned int required_size;
bool read_successful = true;
stdin_handle = GetStdHandle(STD_INPUT_HANDLE);
if (stdin_handle == INVALID_HANDLE_VALUE) {
std::cout << "Error: invalid handle value!\n\n";
} else {
continue_reading = true;
while (continue_reading) {
continue_reading = ReadFile(stdin_handle, char_buffer, BUFSIZE,
&bytes_read, NULL);
if (continue_reading) {
if (bytes_read != 0) {
// Output what we have read so far
for (unsigned int i = 0; i < bytes_read; i++) {
std::cout << char_buffer[i];
}
} else {
continue_reading = false;
}
}
}
}
return 0;
}
I know that my only option with anonymous pipes is to do a blocking read with ReadFile. If I understand correctly, given how I am invoking it, ReadFile will continue to read from the buffer on STDIN until it detects an end of write operation on the other end of the pipe (perhaps it reads some sort of "end of write" token?). I would like to know if there is some sort of "beginning write" token that will be in the buffer if something is being piped in, which I can check on STDIN BEFORE I call ReadFile. If this were the case, I could just skip calling ReadFile and use some default text.
If there is not a way to do this, I can always pass in a command line argument that denotes that I should not check the pipe and just use the default text (or the other way around), but I would much prefer to do it the way that I specified.
Look at PeekNamedPipe(). Despite its name, it works for both named and anonymous pipes.
int main(int argc, const char *argv[])
{
char char_buffer[BUFSIZE];
DWORD bytes_read;
DWORD bytes_avail = BUFSIZE; // fallback read size when stdin is not a pipe or the peek fails
DWORD dw;
HANDLE stdin_handle;
bool is_pipe;
stdin_handle = GetStdHandle(STD_INPUT_HANDLE);
is_pipe = !GetConsoleMode(stdin_handle, &dw);
if (stdin_handle == INVALID_HANDLE_VALUE) {
std::cout << "Error: invalid handle value!\n\n";
} else {
while (1) {
if (is_pipe) {
if (PeekNamedPipe(stdin_handle, NULL, 0, NULL, &bytes_avail, NULL)) {
if (bytes_avail == 0) {
Sleep(100);
continue;
}
}
}
if (!ReadFile(stdin_handle, char_buffer, min(bytes_avail, BUFSIZE), &bytes_read, NULL)) {
break;
}
if (bytes_read == 0) {
break;
}
// Output what we have read so far
for (unsigned int i = 0; i < bytes_read; i++) {
std::cout << char_buffer[i];
}
}
}
return 0;
}
It looks like what you're really trying to do here is to determine whether you've got console input (where you use default value) vs pipe input (where you use input from the pipe).
I suggest testing that directly instead of trying to check whether there's input ready. The catch with sniffing for data in the pipe is that if the source app is slow to generate output, your app might make an incorrect assumption just because input isn't available yet. (It's also possible that, due to typeahead, a user could have typed characters that are ready to be read from console STDIN before your app gets around to checking whether input is available.)
Also, keep in mind that it might be useful to allow your app to be used with file redirection, not just pipes - eg:
myapp.exe < some_input_file
The classic way to do this "interactive mode vs. redirected input" test on Unix is isatty(), and luckily there's an equivalent in the Windows CRT: _isatty(). Alternatively, call GetFileType() on GetStdHandle(STD_INPUT_HANDLE) and check for FILE_TYPE_CHAR, or use GetConsoleMode() as Remy does, which only succeeds on a real console handle.
This can also be done without overlapped I/O by using a second thread that makes the synchronous ReadFile call. The main thread then waits an arbitrary amount of time and acts as above.
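A minimal sketch of the isatty approach, written so it builds on both Windows (_isatty from io.h) and Unix (isatty from unistd.h). The helper input_or_default and its parameters are hypothetical names for this example. It takes the "interactive" flag as an argument so the decision logic can be exercised without a real console.

```cpp
#include <iterator>
#include <sstream>   // std::istream; std::istringstream is handy for trying the helper
#include <string>
#ifdef _WIN32
#include <cstdio>    // stdin, _fileno
#include <io.h>      // _isatty
#else
#include <unistd.h>  // isatty, STDIN_FILENO
#endif

// True when stdin is an interactive console rather than a pipe or a redirected file.
bool stdin_is_interactive()
{
#ifdef _WIN32
    return _isatty(_fileno(stdin)) != 0;
#else
    return isatty(STDIN_FILENO) != 0;
#endif
}

// Hypothetical helper: return `fallback` in interactive mode,
// otherwise read everything available from `in`.
std::string input_or_default(bool interactive, std::istream& in, const std::string& fallback)
{
    if (interactive) {
        return fallback;
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

In the real program the call would look like input_or_default(stdin_is_interactive(), std::cin, "some default text"). That handles echo text | my_app.exe, my_app.exe < some_input_file, and a bare my_app.exe with the same code path.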
Hope this helps...

Converting Unicodestring to Char[]

I've got a form with a Listbox which contains lines of four words.
When I click on one line, these words should be seen in four different textboxes.
So far, I've got everything working, yet I have a problem with chars converting.
The string from the listbox is a UnicodeString but the strtok uses a char[].
The compiler tells me it "Cannot Convert UnicodeString to Char[]". This is the code I am using for this:
{
int a;
UnicodeString b;
char * pch;
int c;
a=DatabaseList->ItemIndex; //databaselist is the listbox
b=DatabaseList->Items->Strings[a];
char str[] = b; //This is the part that fails, telling its unicode and not char[].
pch = strtok (str," ");
c=1;
while (pch!=NULL)
{
if (c==1)
{
ServerAddress->Text=pch;
} else if (c==2)
{
DatabaseName->Text=pch;
} else if (c==3)
{
Username->Text=pch;
} else if (c==4)
{
Password->Text=pch;
}
pch = strtok (NULL, " ");
c=c+1;
}
}
I know my code doesn't look nice, pretty bad actually. I'm just learning some programming in C++.
How can I convert this?
strtok actually modifies your char array, so you will need to construct an array of characters you are allowed to modify. Referencing directly into the UnicodeString string will not work.
// first convert to AnsiString instead of Unicode.
AnsiString ansiB(b);
// allocate enough memory for your char array (and the null terminator)
char* str = new char[ansiB.Length()+1];
// copy the contents of the AnsiString into your char array
strcpy(str, ansiB.c_str());
// the rest of your code goes here
// remember to delete your char array when done
delete[] str;
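The same idea, giving strtok a private writable copy, can be sketched in portable C++ without manual new/delete. This assumes the list box line has already been converted to a narrow string (for example from the AnsiString above). The function name split_fields is made up for this example.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Split `text` on any character in `delims`, skipping empty fields (strtok semantics).
std::vector<std::string> split_fields(const std::string& text, const char* delims)
{
    // strtok overwrites delimiters with '\0', so hand it a writable copy.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    std::vector<std::string> fields;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        fields.push_back(tok);
    }
    return fields;  // buf is released automatically here
}
```

With the four words in a vector, the if/else chain on the counter c could become direct indexing (fields[0] for the server address, and so on), after checking that the vector really has four entries.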
This works for me and saves me converting to AnsiString
// Using a static buffer
#define MAX_SIZE 256
UnicodeString ustring = "Convert me";
char mbstring[MAX_SIZE];
wcstombs(mbstring,ustring.c_str(),MAX_SIZE);
// Using dynamic buffer
char *dmbstring;
dmbstring = new char[ustring.Length() + 1];
wcstombs(dmbstring,ustring.c_str(),ustring.Length() + 1);
// use dmbstring
delete[] dmbstring; // memory from new[] must be released with delete[]

Alternative to fgets()?

Description:
Obtain output from an executable
Note:
Will not compile, due to fgets() declaration
Question:
What is the best alternative to fgets, as fgets requires char *?
Is there a better alternative?
Illustration:
void Q_analysis (const char *data)
{
string buffer;
size_t found;
found = buffer.find_first_of (*data);
FILE *condorData = _popen ("condor_q", "r");
while (fgets (buffer.c_str(), buffer.max_size(), condorData) != NULL)
{
if (found == string::npos)
{
Sleep(2000);
} else {
break;
}
}
return;
}
For strings you would normally use the std::getline function (see cppreference).
However, in your case you should read into a char[], e.g.:
string s;
char buffer[ 4096 ];
fgets(buffer, sizeof( buffer ), condorData);
s.assign( buffer, strlen( buffer ));
or your code:
void Q_analysis( const char *data )
{
char buffer[ 4096 ];
FILE *condorData = _popen ("condor_q", "r");
while( fgets( buffer, sizeof( buffer ), condorData ) != NULL )
{
if( strstr( buffer, data ) == NULL )
{
Sleep(2000);
}
else
{
break;
}
}
}
Instead of declaring your buffer as a string, declare it as something like:
char buffer[MY_MAX_SIZE];
call fgets with that, and then build the string from the buffer if you need in that form instead of going the other way.
The reason what you're doing doesn't work is that c_str() gives you the buffer contents as a C-style string for reading, not a writable pointer into the guts of the buffer. It is, by design, read-only.
-- MarkusQ
You're right that you can't read directly into a std::string because its c_str and data methods both return const pointers. You could read into a std::vector<char> instead.
You could also use the getline function. But it requires an iostream object, not a C FILE pointer. You can get from one to the other, though, in a vendor-specific way. See "A Handy Guide To Handling Handles" for a diagram and some suggestions on how to get from one file type to another. Call fileno on your FILE* to get a numeric file descriptor, and then use fstream::attach to associate it with an fstream object. Then you can use getline.
Try the boost library - I believe it has a function to create an fstream from a FILE*
or you could use fileno() to get a standard C file handle from the FILE, then use fstream::attach to attach a stream to that file. From there you can use getline(), etc. Something like this:
FILE *condorData = _popen ("condor_q", "r");
std::ifstream stream; // attach() below is a vendor-specific extension, not standard C++
stream.attach(_fileno(condorData));
I haven't tested it all too well, but the below appears to do the job:
//! read a line of text from a FILE* to a std::string, returns false on 'no data'
bool stringfgets(FILE* fp, std::string& line)
{
char buffer[1024];
line.clear();
do {
if(!fgets(buffer, sizeof(buffer), fp))
return !line.empty();
line.append(buffer);
} while(!strchr(buffer, '\n'));
return true;
}
Be aware however that this will happily read a 100G line of text, so care must be taken that this is not a DoS-vector from untrusted source files or sockets.
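One way to close that hole is to cap the line length. The following is a sketch of that variation, not a drop-in replacement. The name read_line_capped and the max_len parameter are assumptions for this example.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Read one line from fp into `line`, refusing lines longer than max_len bytes.
// Returns false at end of file with no data, or when the cap is exceeded.
bool read_line_capped(std::FILE* fp, std::string& line, std::size_t max_len)
{
    char buffer[1024];
    line.clear();
    while (std::fgets(buffer, sizeof(buffer), fp)) {
        line.append(buffer);
        if (line.size() > max_len) {
            return false;  // too long: stop instead of growing without bound
        }
        if (line.back() == '\n') {
            return true;   // a complete line, newline included
        }
    }
    return !line.empty();  // last line of the file without a trailing newline
}
```

When the cap is hit, the rest of the oversized line may still be unread in the stream, so a caller that wants to continue has to skip ahead to the next newline or abandon the input.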

Resources