Problem with writing out large temp files

Problem with writing out large temp files - visual-studio-2010

I have a set of large image files that I'm using as temporary swap files on Windows in Visual Studio 2010. I'm writing and reading the files out as necessary.
Problem is, even though each of the files are the same size, I'm getting different file sizes.
So, I can do:
template <typename T>
std::string PlaceFileOnDisk(T* inImage, const int& inSize)
TCHAR lpTempPathBuffer[MAX_PATH];
TCHAR szTempFileName[MAX_PATH];
DWORD dwRetVal = GetTempPath(MAX_PATH, lpTempPathBuffer);
UINT uRetVal = GetTempFileName(lpTempPathBuffer, TEXT("IMAGE"), 0, szTempFileName);
FILE* fp;
fopen_s(&fp, szTempFileName, "w+");
fwrite(inImage, sizeof(T), inSize, fp);
fclose(fp);
std::string theRealTempFileName(szTempFileName);
return theRealTempFileName;
}
but that results in files between 53 and 65 mb in size (the image is 4713 * 5908 * sizeof (unsigned short).
I figured that maybe that 'fwrite' might not be stable for large files, so I broke things up into:
template <typename T>
std::string PlaceFileOnDisk(T* inImage, const int& inYSize, const int& inXSize)
TCHAR lpTempPathBuffer[MAX_PATH];
TCHAR szTempFileName[MAX_PATH];
DWORD dwRetVal = GetTempPath(MAX_PATH, lpTempPathBuffer);
UINT uRetVal = GetTempFileName(lpTempPathBuffer, TEXT("IMAGE"), 0, szTempFileName);
int y;
FILE* fp;
for (y = 0; y < inYSize; y++){
fopen_s(&fp, szTempFileName, "a");
fwrite(&(inImage[y*inXSize]), sizeof(T), inXSize, fp);
fclose(fp);
}
std::string theRealTempFileName(szTempFileName);
return theRealTempFileName;
}
Same thing: the files that are saved to disk are variable sized, not the expected size.
What's going on? Why are they not the same?
The read in function:
template <typename T>
T* RecoverFileFromDisk(const std::string& inFileName, const int& inSize){
T* theBuffer = NULL;
FILE* fp;
try {
theBuffer = new T[inYSize*inXSize];
fopen_s(&fp, inFileName.c_str(), "r");
fread(theBuffer, sizeof(T), inSize, fp);
fclose(fp);
}
catch(...){
if (theBuffer != NULL){
delete [] theBuffer;
theBuffer = NULL;
}
}
return theBuffer;
}
This function may be suffering from similar problems, but I'm not getting that far, because I can't get past the writing function.
I did try to use the read/write information on this page:
http://msdn.microsoft.com/en-us/library/aa363875%28v=vs.85%29.aspx
But the suggestions there just didn't work at all, so I went with the file functions with which I'm more familiar. That's where I got the temp file naming conventions, though.

Are you able to open the image after it's written? It sounds like you're having trouble writing it, too?
You're just asking about why the file sizes are different for the same size picture? What about how the sizes of the initial files compare to each other? It may have something to do with the how the initial image files are compressed.
I'm not sure what you're doing with the files, but have you considered a more basic "copy" function?

Related

How to write memory of a dll?

I'm trying to replace a specific string in memory belongs to a dll. Here's the code.
I can read it, and it gives me correct result, but when writing VC++ shows 'Access violation writing location'.
HMODULE HMODULE1 = LoadLibrary(L"my.dll");
std::string x1(8, '\0');
std::string x2 = "CIFCDMEY";
auto startPos = (void*)((char*)(HMODULE1)+0x1158A0 + 9);
// Correct, I can read the memory
memcpy_s((void*)x1.data(), x1.size(), startPos, x1.size());
// Access violation writing location
memcpy_s(startPos, x2.size(), x2.data(), x2.size());
auto handle = GetCurrentProcess();
SIZE_T num;
auto ret = WriteProcessMemory(handle, startPos, x2.data(), x2.size(), &num);
auto lastError1 = GetLastError();
LPVOID lpMessageBuffer1 = NULL;
size_t size1 = ::FormatMessage(
FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
NULL,
lastError1,
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
(LPTSTR)&lpMessageBuffer1,
0,
NULL);
std::wstring errorMessage1;
if (size1 > 0) {
// errorMessage1: Invalid access to memory location.
errorMessage1 = std::wstring((LPCTSTR)lpMessageBuffer1, size1);
}
In 'Watch' window, the value of variable startPos is my.dll!0x0f2a58a9 (load symbols for additional information).
I know people use 'WriteProcessMemory' to write memory of a process, how about a dll?

If the target memory page does not have write permissions you need to take them using VirtualProtect()
I use a simple wrapper function for all my patching, it uses the PAGE_EXECUTE_READWRITE memory protection constant because if you are modifying a code page this will avoid crashes when the instruction pointer lands in the same memory page and you only used PAGE_READWRITE
void Patch(char* dst, char* src, const intptr_t size)
{
DWORD oldprotect;
VirtualProtect(dst, size, PAGE_EXECUTE_READWRITE, &oldprotect);
memcpy(dst, src, size);
VirtualProtect(dst, size, oldprotect, &oldprotect);
}

As one MPI process executes MPI_Barrier(), other processes hang

I have an MPI program for having multiple processes read from a file that contains list of file names and based on the file names read - it reads the corresponding file and counts the frequency of words.
If one of the processes completes this and returns - to block executing MPI_Barrier(), the other processes also hang. On debugging, it could be seen that the readFile() function is not entered by the processes currently in process_files() Unable to figure out why this happens. Please find the code below:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"
void process_files(char*, int* , int, hashtable_t* );
void initialize_word(char *c,int size)
{
int i;
for(i=0;i<size;i++)
c[i]=0;
return;
}
char* readFilesList(MPI_File fh, char* file,int rank, int nprocs, char* block, const int overlap, int* length)
{
char *text;
int blockstart,blockend;
MPI_Offset size;
MPI_Offset blocksize;
MPI_Offset begin;
MPI_Offset end;
MPI_Status status;
MPI_File_open(MPI_COMM_WORLD,file,MPI_MODE_RDONLY,MPI_INFO_NULL,&fh);
MPI_File_get_size(fh,&size);
/*Block size calculation*/
blocksize = size/nprocs;
begin = rank*blocksize;
end = begin+blocksize-1;
end+=overlap;
if(rank==nprocs-1)
end = size;
blocksize = end-begin+1;
text = (char*)malloc((blocksize+1)*sizeof(char));
MPI_File_read_at_all(fh,begin,text,blocksize,MPI_CHAR, &status);
text[blocksize+1]=0;
blockstart = 0;
blockend = blocksize;
if(rank!=0)
{
while(text[blockstart]!='\n' && blockstart!=blockend) blockstart++;
blockstart++;
}
if(rank!=nprocs-1)
{
blockend-=overlap;
while(text[blockend]!='\n'&& blockend!=blocksize) blockend++;
}
blocksize = blockend-blockstart;
block = (char*)malloc((blocksize+1)*sizeof(char));
block = memcpy(block, text + blockstart, blocksize);
block[blocksize]=0;
*length = strlen(block);
MPI_File_close(&fh);
return block;
}
void calculate_term_frequencies(char* file, char* text, hashtable_t *hashtable,int rank)
{
printf("Start File %s, rank %d \n\n ",file,rank);
fflush(stdout);
if(strlen(text)!=0||strlen(file)!=0)
{
int i,j;
char w[100];
i=0,j=0;
while(text[i]!=0)
{
if((text[i]>=65&&text[i]<=90)||(text[i]>=97&&text[i]<=122))
{
w[j]=text[i];
j++; i++;
}
else
{
w[j] = 0;
if(j!=0)
{
//ht_set( hashtable, strcat(strcat(w,"#"),file),1);
}
j=0;
i++;
initialize_word(w,100);
}
}
}
return;
}
void readFile(char* filename, hashtable_t *hashtable,int rank)
{
MPI_Status stat;
MPI_Offset size;
MPI_File fx;
char* textFromFile=0;
printf("Start File %d, rank %d \n\n ",strlen(filename),rank);
fflush(stdout);
if(strlen(filename)!=0)
{
MPI_File_open(MPI_COMM_WORLD,filename,MPI_MODE_RDONLY,MPI_INFO_NULL,&fx);
MPI_File_get_size(fx,&size);
printf("Start File %s, rank %d \n\n ",filename,rank);
fflush(stdout);
textFromFile = (char*)malloc((size+1)*sizeof(char));
MPI_File_read_at_all(fx,0,textFromFile,size,MPI_CHAR, &stat);
textFromFile[size]=0;
calculate_term_frequencies(filename, textFromFile, hashtable,rank);
MPI_File_close(&fx);
}
printf("Done File %s, rank %d \n\n ",filename,rank);
fflush(stdout);
return;
}
void process_files(char* block, int* length, int rank,hashtable_t *hashtable)
{
char s[2];
s[0] = '\n';
s[1] = 0;
char *file;
if(*length!=0)
{
/* get the first file */
file = strtok(block, s);
/* walk through other tokens */
while( file != NULL )
{
readFile(file,hashtable,rank);
file = strtok(NULL, s);
}
}
return;
}
void execute_process(MPI_File fh, char* file, int rank, int nprocs, char* block, const int overlap, int * length, hashtable_t *hashtable)
{
block = readFilesList(fh,file,rank,nprocs,block,overlap,length);
process_files(block,length,rank,hashtable);
}
int main(int argc, char *argv[]){
/*Initialization*/
MPI_Init(&argc, &argv);
MPI_File fh=0;
int rank,nprocs,namelen;
char *block=0;
const int overlap = 70;
char* file = "filepaths.txt";
int *length = (int*)malloc(sizeof(int));
hashtable_t *hashtable = ht_create( 65536 );
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Get_processor_name(processor_name, &namelen);
printf("Rank %d is on processor %s\n",rank,processor_name);
fflush(stdout);
execute_process(fh,file,rank,nprocs,block,overlap,length,hashtable);
printf("Rank %d returned after processing\n",rank);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Finalize();
return 0;
}
The filepaths.txt is a file that contain the absolute file names of normal text files:
eg:
/home/mpiuser/mpi/MPI_Codes/code/test1.txt
/home/mpiuser/mpi/MPI_Codes/code/test2.txt
/home/mpiuser/mpi/MPI_Codes/code/test3.txt

Your readFilesList function is pretty confusing, and I believe it doesn't do what you want it to do, but maybe I just do not understand it correctly. I believe it is supposed to collect a bunch of filenames out of the list file for each process. A different set for each process. It does not do that, but this is not the problem, even if this would do what you want it to, the subsequent MPI IO would not work.
When reading files, you use MPI_File_read_all with MPI_COMM_WORLD as communicator. This requires all processes to participate in reading this file. Now, if each process should read a different file, this obviously is not going to work.
So there are several issues with your implementation, though I can not really explain your described behavior, I would rather first start off and try to fix them, before debugging in detail, what might go wrong.
I am under the impression, you want to have an algorithm along these lines:
Read a list of file names
Distribute that list of files equally to all processes
Have each process work on its own set of files
Do something with the data from this processing
And I would suggest to try this with the following approach:
Read the list on a single process (no MPI IO)
Scatter the list of files to all processes, such that all get around the same amount of work
Have each process work on its list of files independently and in serial (serial file access and processing)
Some data reduction with MPI, as needed
I believe, this would be the best (easiest and fastest) strategy in your scenario. Note, that no MPI IO is involved here at all. I don't think doing some complicated distributed reading of the file list in the first step would result in any advantage here, and in the actual processing it would actually be harmful. The more independent your processes are, the better your scalability usually.

Some warnings being treated as errors while making a modified ver of ext2 kernel module under ubuntu

I have succeeded in making a modified version of ext2 (so called myext2.ko) and tested it for mount and umount, and something else; the problem occurs when I add the following code into my fs/myext2/file.c and tried to implement a simple "encryption" func, that is, negating the last bit of the read-in string :
ssize_t my_new_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos)
{
struct iovec iov; //changed
struct kiocb kiocb;
struct iov_iter iter;
ssize_t ret;
//inserted by adward - begin
size_t i;
char buff[len];
for (i=0;i<len;i++){
buff[i] = buf[i] ^ 1;
}
iov.iov_base = (void __user *)buff;
iov.iov_len = len;
printk("Inside my_new_sync_write");
//inserted by adward - end
init_sync_kiocb(&ki_nbytesocb, filp);
kiocb.ki_pos = *ppos;
kiocb.ki_nbytes = len;
iov_iter_init(&iter, WRITE, &iov, 1, len);
ret = filp->f_op->write_iter(&kiocb, &iter);
if (-EIOCBQUEUED == ret)
ret = wait_on_sync_kiocb(&kiocb);
*ppos = kiocb.ki_pos;
return ret;
}
ssize_t my_new_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos)
{
struct iovec iov = { .iov_base = buf, .iov_len = len };
struct kiocb kiocb;
struct iov_iter iter;
ssize_t ret;
//inserted by adward - begin
size_t i;
//inserted by adward - end
init_sync_kiocb(&kiocb, filp);
kiocb.ki_pos = *ppos;
kiocb.ki_nbytes = len;
iov_iter_init(&iter, READ, &iov, 1, len);
ret = filp->f_op->read_iter(&kiocb, &iter);
if (-EIOCBQUEUED == ret)
ret = wait_on_sync_kiocb(&kiocb);
*ppos = kiocb.ki_pos;
//inserted by adward - begin
for (i=0;i<len;i++){
buf[i] ^= 1;
}
printk("inside my_new_sync_read");
//inserted by adward - end
return ret;
}
The prototype of the above two functions are actually in fs/read_write.c , using by almost all file system types in the kernel code ver 3.17.6; I just copied them into fs/myext2/file.c and make some minor change as commented, so that I can do some test without having to change any Makefile.
But the moment I paste them into my file.c, "sudo make" gives the error message as following:
/home/adward/linux-3.17.6/fs/myext2/file.c:64:15: error: storage size of ‘kiocb’ isn’t known
struct kiocb kiocb;
^
/home/adward/linux-3.17.6/fs/myext2/file.c:65:18: error: storage size of ‘iter’ isn’t known
struct iov_iter iter;
^
and cc1: some warnings being treated as errors
even if I haven't refered to them by changing the func pointers in file_operations in the same source code file, or say, I haven't used them!
P.S.
My file_operation struct now looks like:
const struct file_operations myext2_file_operations = {
.llseek = generic_file_llseek,
.read = new_sync_read, //want to replace with my_new_sync_read
.write = new_sync_write, //want to replace with my_new_sync_write
...
}
Has anyone who have done something similar and crashed into some problems like this one? Please notify me if I have done something remarkable wrong, thanks.

Met the same error before. U should add <linux/aio.h> as ext2 uses asynchronous IO for reading/writing files.
Hope that helps :)

UTF8ToUTF16 failing

I have the following code which is just three sets of functions for converting UTF8 to UTF16 and vice-versa. It converts using 3 different techniques..
However, all of them fail:
std::ostream& operator << (std::ostream& os, const std::string &data)
{
SetConsoleOutputCP(CP_UTF8);
DWORD slen = data.size();
WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE), data.c_str(), data.size(), &slen, nullptr);
return os;
}
std::wostream& operator <<(std::wostream& os, const std::wstring &data)
{
DWORD slen = data.size();
WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), data.c_str(), slen, &slen, nullptr);
return os;
}
std::wstring AUTF8ToUTF16(const std::string &data)
{
return std::wstring_convert<std::codecvt_utf8<wchar_t>>().from_bytes(data);
}
std::string AUTF16ToUTF8(const std::wstring &data)
{
return std::wstring_convert<std::codecvt_utf8<wchar_t>>().to_bytes(data);
}
std::wstring BUTF8ToUTF16(const std::string& utf8)
{
std::wstring utf16;
int len = MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), -1, NULL, 0);
if (len > 1)
{
utf16.resize(len - 1);
wchar_t* ptr = &utf16[0];
MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), -1, ptr, len);
}
return utf16;
}
std::string BUTF16ToUTF8(const std::wstring& utf16)
{
std::string utf8;
int len = WideCharToMultiByte(CP_UTF8, 0, utf16.c_str(), -1, NULL, 0, 0, 0);
if (len > 1)
{
utf8.resize(len - 1);
char* ptr = &utf8[0];
WideCharToMultiByte(CP_UTF8, 0, utf16.c_str(), -1, ptr, len, 0, 0);
}
return utf8;
}
std::string CUTF16ToUTF8(const std::wstring &data)
{
std::string result;
result.resize(std::wcstombs(nullptr, &data[0], data.size()));
std::wcstombs(&result[0], &data[0], data.size());
return result;
}
std::wstring CUTF8ToUTF16(const std::string &data)
{
std::wstring result;
result.resize(std::mbstowcs(nullptr, &data[0], data.size()));
std::mbstowcs(&result[0], &data[0], data.size());
return result;
}
int main()
{
std::string str = "консоли";
MessageBoxA(nullptr, str.c_str(), str.c_str(), 0); //Works Fine!
std::wstring wstr = AUTF8ToUTF16(str); //Crash!
MessageBoxW(nullptr, wstr.c_str(), wstr.c_str(), 0); //Fail - Crash + Display nothing..
wstr = BUTF8ToUTF16(str);
MessageBoxW(nullptr, wstr.c_str(), wstr.c_str(), 0); //Fail - Random chars..
wstr = CUTF8ToUTF16(str);
MessageBoxW(nullptr, wstr.c_str(), wstr.c_str(), 0); //Fail - Question marks..
std::cin.get();
}
The only thing that works above is the MessageBoxA. I don't understand why because I'm told that Windows converts everything to UTF16 anyway so why can't I convert it myself?
Why does none of my conversions work?
Is there a reason my code does not work?

The root problem why all of your approaches fail is that they require the std::string to be UTF-8 encoded but std::string str = "консоли" is not UTF-8 encoded unless you save the .cpp file as UTF-8 and configure your compiler's default codepage to UTF-8. In most C++11 compilers, you can use the u8 prefix to force the string to use UTF-8:
std::string str = u8"консоли";
However, VS 2013 does not support that feature yet:
Support For C++11 Features
Unicode string literals
2010 No
2012 No
2013 No
Windows itself does not support UTF-8 in most API functions that take a char* as input (an exception is MultiByteToWideChar() when using CP_UTF8). When you call an A function, it calls the corresponding W function internally, converting any char* data to/from UTF-16 using Windows' default codepage (CP_ACP). So you get random results when you use non CP_ACP data with functions that are expecting it. As such, MessageBoxA() will work correctly only if your .cpp file and compiler are using the same codepage as CP_ACP so the unprefixed char* data matches what MessageBoxA() is expecting.
I don't know why AUTF8ToUTF16() is crashing, probably a bug in your compiler's STL implementation when processing bad data.
BUTF8ToUTF16() is not handling this case in the documentation: "If the input byte/char sequences are invalid, returns U+FFFD for UTF encodings." Also, your implementation is not optimal. Use length() instead of -1 on inputs to avoid dealing with null terminator issues.
CUTF8ToUTF16() is not doing any error handling or validations. However converting non-valid input to question marks or U+FFFD is very common in most libraries.

Problems With 64bit Posix Write In Mac OS X? (2gb+ Dataset in HDF5)

I'm having some issues with HDF5 on Mac os x (10.7). After some testing, I've confirmed that POSIX write seems to have issues with buffer sizes exceeding 2gb. I've written a test program to demonstrate the issue:
#define _FILE_OFFSET_BITS 64
#include <iostream>
#include <unistd.h>
#include <fcntl.h>
void writePosix(const int64_t arraySize, const char* name) {
int fd = open(name, O_WRONLY | O_CREAT);
if (fd != -1) {
double *array = new double [arraySize];
double start = 0.0;
for (int64_t i=0;i<arraySize;++i) {
array[i] = start;
start += 0.001;
}
ssize_t result = write(fd, array, (int64_t)(sizeof(double))*arraySize);
printf("results for array size %lld = %ld\n", arraySize, result);
close(fd);
} else {
printf("file error");
}
}
int main(int argc, char *argv[]) {
writePosix(268435455, "/Users/tpav/testfolder/lessthan2gb");
writePosix(268435456, "/Users/tpav/testfolder/equal2gb");
}
Output:
results for array size 268435455 = 2147483640
results for array size 268435456 = -1
As you can see, I've even tried defining the file offsets. Is there anything I can do about this or should I start looking for a workaround in the way I write 2gb+ chunks?

In the HDF5 virtual file drivers, we break I/O operations that are too large for the call into multiple smaller I/O calls. The Mac implementation of POSIX I/O takes a size_t argument so our code assumed that the max I/O size would be the max value that can fit in a variable of type ssize_t (the return type of read/write). Sadly, this is not the case.
Note that this only applies to single I/O operations. You can create files that go above the 2GB/4GB barrier, you just can't write >2GB in a single call.
This should be fixed in HDF5 1.8.10 patch 1, due out in late January 2013.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio