Are MapViewOfFile memory mappings reused? - winapi

If I create 2 separate mappings of the same file in the same process, will the pointers be shared?
In other words:
LPCTSTR filename = //...
HANDLE file1 = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
HANDLE fileMapping1 = CreateFileMapping(file1, NULL, PAGE_READONLY, 0, 0, 0);
void* pointer1 = MapViewOfFile(fileMapping1, FILE_MAP_READ, 0, 0, 0);
CloseHandle(fileMapping1);
CloseHandle(file1);
HANDLE file2 = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
HANDLE fileMapping2 = CreateFileMapping(file2, NULL, PAGE_READONLY, 0, 0, 0);
void* pointer2 = MapViewOfFile(fileMapping2, FILE_MAP_READ, 0, 0, 0);
CloseHandle(fileMapping2);
CloseHandle(file2);
Will pointer1 ever be equal to pointer2?
The reason I am asking is that I have several threads that need to search in a large (300+MB) file, and I want to use memory mapping for that. However, the process needs to be able to run on an old 32-bit XP machine, so if each thread allocated its own copy in virtual memory, then I could run out of memory.

MSDN has documented it between the lines:
As mentioned above, you can have multiple views of the same
memory-mapped file, and they can overlap. But what about mapping two
identical views of the same memory-mapped file? After learning how to
unmap a view of a file, you could come to the conclusion that it would
not be possible to have two identical views in a single process
because their base address would be the same, and you wouldn't be able
to distinguish between them. This is not true. Remember that the base
address returned by either the MapViewOfFile or the MapViewOfFileEx
function is not the base address of the file view. Rather, it is the
base address in your process where the view begins. So mapping two
identical views of the same memory-mapped file will produce two views
having different base addresses, but nonetheless identical views of
the same portion of the memory-mapped file.
Also:
The point of this little exercise is to emphasize that every view of a
single memory-mapped file object is always mapped to a unique range of
addresses in the process. The base address will be different for each
view. For that reason the base address of a mapped view is all that is
required to unmap the view.

Will pointer1 ever be equal to pointer2?
The pointers might be equal if MapViewOfFile happens to choose the same address for the mapping. You don't control this with MapViewOfFile, but you do have some control over it with MapViewOfFileEx (its last argument, lpBaseAddress).
Each separate MapViewOfFile call can create a new view over the same physical data, so the OS does not need to map the views at the same address even when the two mappings exist simultaneously; the data stays coherent either way. It is easy to see this by modifying your code slightly:
HANDLE file1 = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
HANDLE fileMapping1 = CreateFileMapping(file1, NULL, PAGE_READWRITE, 0, 0, 0);
void* pointer1 = MapViewOfFile(fileMapping1, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, 0);
//CloseHandle(fileMapping1);
//CloseHandle(file1);
HANDLE file2 = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL);
HANDLE fileMapping2 = CreateFileMapping(file2, NULL, PAGE_READWRITE, 0, 0, 0);
void* pointer2 = MapViewOfFile(fileMapping2, FILE_MAP_READ | FILE_MAP_WRITE, 0, 0, 0);
INT& n1 = *((INT*) pointer1);
INT& n2 = *((INT*) pointer2);
ATLASSERT(&n1 != &n2); // The pointers are not equal even though they point
                       // to the same data!
INT n3 = 0;
n1 = 2;
n3 += n2;
n1 = 3;
n3 += n2;
ATLASSERT(n3 == 5); // That's 2+3 we wrote through n1 and read through n2
//CloseHandle(fileMapping2);
//CloseHandle(file2);
That is, pointer equality is not something you should expect or rely on, especially if your mapping is large and the remapping does not take place immediately.

MapViewOfFile finds a hole in your process's address space that is big enough for the entire file. I would not expect it to return the same pointer even if you passed the same file mapping object twice. For different mapping objects and different file handles, I would definitely expect the pointers to be different.
Behind the scenes, Windows should be using the same 'section' object, so both ranges of virtual address space should be mapped to the same physical memory. This is the same as two processes mapping the same file.
To use the same memory range from both threads, one thread will have to map the file and store the pointer in a shared location. The other thread will have to retrieve that pointer from the shared location. You're likely to need reference counting to decide when to unmap the file (which you do by calling UnmapViewOfFile - closing the file mapping handle will not release that address space).
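As a rough illustration of that idea, here is a minimal sketch (MapSharedView, g_view and g_size are made-up names for this example, not part of any API): map the file once, publish the pointer, and let every thread search through the same view.
#include <windows.h>

static void* g_view = NULL;  // set once, then read by all worker threads
static DWORD g_size = 0;

BOOL MapSharedView(LPCTSTR filename)
{
    HANDLE file = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, NULL,
                             OPEN_EXISTING, 0, NULL);
    if (file == INVALID_HANDLE_VALUE) return FALSE;
    HANDLE mapping = CreateFileMapping(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mapping) {
        g_view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0); // whole file
        g_size = GetFileSize(file, NULL);
        CloseHandle(mapping); // the view keeps the mapping object alive
    }
    CloseHandle(file);
    return g_view != NULL;
}
// Each worker thread reads through g_view/g_size; call UnmapViewOfFile(g_view)
// exactly once, after the last thread is done with it.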

The same physical memory will be used, but the two pointers will likely not be the same. In any case, you have no guarantee that they will be the same, even if they incidentally are when you test. Read as: you cannot ever rely on the assumption that this will be the case.
You are creating two mappings on two different file handles. Incidentally they refer to the same file (which is why the same physical memory will be used), but they are still two different mappings that do not logically relate to each other in any way.
Yes, it may sound illogical and unreasonable (maybe even impossible) to have the same physical memory at two different addresses. However, this is a perfectly legitimate thing.

Related

Create a font resource from byte array on Win32

I have a byte array that contains the contents of a read in font file. I'd like WinAPI (No Gdi+) to create a font resource from it, so I could use it for rendering text.
I only know about AddFontResourceExW, which loads a font resource from a file, and AddFontMemResourceEx, which sounded like what I'd need, but it seems to me that it's still some resource-system thing and the data would have to be pre-associated with the program.
Can I somehow convert my loaded in byte-array into a font resource? (Possibly without writing it to a file and then calling AddFontResourceExW)
When you load a font from a resource script into memory, you use code like the following (you didn't add a language tag, so I'm using C/C++ code - let me know if that's a problem):
HANDLE H_myfont = INVALID_HANDLE_VALUE;
HINSTANCE hResInstance = ::GetModuleHandle(nullptr);
HRSRC ares = FindResource(hResInstance, MAKEINTRESOURCE(IDF_MYID), L"BINARY");
if (ares) {
    HGLOBAL amem = LoadResource(hResInstance, ares);
    if (amem != nullptr) {
        void *adata = LockResource(amem);
        DWORD nFonts = 0, len = SizeofResource(hResInstance, ares);
        H_myfont = AddFontMemResourceEx(adata, len, nullptr, &nFonts);
    }
}
The key line here is void *adata = LockResource(amem); - this converts the font resource loaded as an HGLOBAL into 'accessible memory' (documentation). Now, assuming your byte array is in the correct format (see below), you could probably just pass a pointer to it (as void*) in the call to AddFontMemResourceEx. (You can use your known array size in place of calling SizeofResource.)
I would suggest code something like this:
void *my_font_data = (void*)(font_byte_array); // Your byte array data
DWORD nFonts = 0, len = sizeof(font_byte_array);
H_myfont = AddFontMemResourceEx(my_font_data, len, nullptr, &nFonts);
which (hopefully) will give you a loaded and useable font resource.
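Once loaded, you should be able to select it by its family name like any installed font. A minimal sketch, assuming the .ttf data declares the family name "MyFont" (a placeholder) and that hdc is your device context:
// Sketch only: "MyFont" stands for whatever family name the font data declares.
HFONT hFont = CreateFontW(32, 0, 0, 0, FW_NORMAL, FALSE, FALSE, FALSE,
                          DEFAULT_CHARSET, OUT_DEFAULT_PRECIS, CLIP_DEFAULT_PRECIS,
                          CLEARTYPE_QUALITY, DEFAULT_PITCH | FF_DONTCARE, L"MyFont");
HFONT hOld = (HFONT)SelectObject(hdc, hFont); // hdc: your device context
TextOutW(hdc, 10, 10, L"Hello", 5);
SelectObject(hdc, hOld);
DeleteObject(hFont);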
When you're done with the font (which, once loaded, can be used just like any system-installed font), you can release it with:
RemoveFontMemResourceEx(H_myfont);
As I don't have your byte array, I can't (obviously) test this idea. However, if you do try it, please let us know if it works. (If it doesn't, there may be some other, relatively straightforward, steps that need to be added.)
NOTE: Although I can't say 100% what exact format a "font resource" is expected to be in, the fact that the code given above works (for me) with a resource defined in the .rc script as BINARY from a normal .ttf file suggests that, if your byte array follows the format of a Windows font file, it should work. This is how I have included a font as an embedded resource:
IDF_MYFONT BINARY L"..\\Resource\\MyFont.ttf"

Is it possible to poll a kqueue's file descriptor with `select()`?

When you create a kqueue with kqueue() you get back a file descriptor. But it appears that this file descriptor cannot be meaningfully polled with select(). I understand that the standard way to poll/read from a kqueue() is with kevent(...) but I'm trying to integrate with some legacy code that polls file descriptors using select().
The goal here was to be able to fire a "user event" that can be detected by this select-based polling mechanism (even if the event eventually needs to be "consumed" using kevent() later). This looked like the kind of thing EVFILT_USER was born to do, but a quick experiment indicates that select() doesn't report the kqueue's fd as being ready to read when an event is added (and triggered) in the kqueue, it just times out (or blocks forever). (But an equivalent kevent() call does see/return the event.)
Am I doing something wrong? Or is it just not possible to poll a kqueue's fd with select()?
The paper describing kqueue/kevent says (sect. 6.5):
Since an ordinary file descriptor references the kqueue, it can take
part in any operations that can normally be performed on a descriptor.
The application may select(), poll(), close(), or even create a kevent
referencing a kqueue;
This is indeed the case for FreeBSD; I've checked this with the following code:
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/select.h>

struct kevent e;
fd_set fdset;
int kq = kqueue();
EV_SET(&e, 1, EVFILT_USER, EV_ADD, 0, 0, NULL);
kevent(kq, &e, 1, 0, 0, 0); // register USER event filter
EV_SET(&e, 1, EVFILT_USER, EV_ADD, NOTE_TRIGGER, 0, NULL);
kevent(kq, &e, 1, 0, 0, 0); // trigger USER event
FD_ZERO(&fdset);
FD_SET(kq, &fdset);
select(FD_SETSIZE, &fdset, 0, 0, 0); // wait for activity on kq
int res = kevent(kq, 0, 0, &e, 1, 0); // get the event

glTexSubImage2D with GL_PIXEL_UNPACK_BUFFER gives GL_INVALID_OPERATION

Currently I am attempting to use PBOs to get video data to textures. I'm not sure if what I'm trying to do is even possible, or a good way to do it if it IS possible... I have 3 textures with the GL_RED format (one for each channel, not using Alpha currently). All three of these will be filled out in a single call to an external library.
Here's binding the buffer, etc:
void LockTexture(const TextureID& id, void** ppbData)
{
    Texture& tex = textures.getArray()[id];
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, tex.glBufID);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, tex.width * tex.height, NULL, GL_STREAM_DRAW);
    *ppbData = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
}
This is done for the 3 textures, and the buffers are then filled by the external library. Then I attempt to push them to the textures, like so:
void UnlockTexture(const TextureID& id)
{
    Texture& tex = textures.getArray()[id];
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    glBindTexture(tex.glTarget, tex.glTexID);
    glCheckForErrors(); // <--- NO ERROR
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, tex.width, tex.height, GL_RED, GL_UNSIGNED_BYTE, 0);
    glCheckForErrors(); // <--- ERROR
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    glBindTexture(tex.glTarget, 0);
}
Going through the list of reasons the error could be generated, this is what I know:
texture array has been defined
type is correct
data param (offset) is good at 0
not executed between glBegin/glEnd
This one I'm not sure about:
error is generated if a non-zero buffer object name is bound to the GL_PIXEL_UNPACK_BUFFER target and the data would be unpacked from the buffer object such that the memory reads required would exceed the data store size.
This one seems like it could be an issue, but I'd have no idea how else to handle this:
error is generated if a non-zero buffer object name is bound to the GL_PIXEL_UNPACK_BUFFER target and the buffer object's data store is currently mapped.
Am I correct in saying that this glUnmapBuffer is unmapping the last-mapped buffer, so the correct buffer is still mapped?
GL version is 3.2
I would greatly appreciate any help on this one, thanks!
glUnmapBuffer(target) will unmap the buffer which is currently bound to target. From the code you posted, it is unclear whether the binding at that point is still the same as at the time you did the map call. Your wording suggests that you map all three buffers right after each other; when you then try to unmap them, you only unmap the last one mapped, because you forgot to rebind the other ones first. That would produce exactly this error for the first two of your textures.
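A minimal sketch of the fix, keeping the same Texture fields as in the question, is to rebind each texture's own PBO right before unmapping it:
void UnlockTexture(const TextureID& id)
{
    Texture& tex = textures.getArray()[id];
    // Rebind this texture's own PBO so glUnmapBuffer acts on the right buffer.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, tex.glBufID);
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    glBindTexture(tex.glTarget, tex.glTexID);
    // The bound PBO is no longer mapped, so the upload may read from it.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, tex.width, tex.height,
                    GL_RED, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
    glBindTexture(tex.glTarget, 0);
}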

How can I insert a single byte to be sent prior to an I2C data package?

I am developing an application in Atmel Studio 6 using the xMega32a4u. I'm using the TWI libraries provided by Atmel. Everything is going well for the most part.
Here is my issue: In order to update an OLED display I am using (SSD1306 controller, 128x32), the entire contents of the display RAM must be written immediately following the I2C START command, slave address, and control byte so the display knows to enter the data into the display RAM. If the control byte does not immediately precede the display RAM package, nothing works.
I am using a Saleae logic analyzer to verify that the bus is doing what it should.
Here is the function I am using to write the display:
void OLED_buffer(){ // Used to write contents of display buffer to OLED
    uint8_t data_array[513];
    data_array[0] = SSD1306_DATA_BYTE;
    for (int i = 0; i < 512; ++i){
        data_array[i+1] = buffer[i];
    }
    OLED_command(SSD1306_SETLOWCOLUMN | 0x00);
    OLED_command(SSD1306_SETHIGHCOLUMN | 0x00);
    OLED_command(SSD1306_SETSTARTLINE | 0x00);
    twi_package_t buffer_send = {
        .chip = OLED_BUS_ADDRESS,
        .buffer = data_array,
        .length = 513
    };
    twi_master_write(&TWIC, &buffer_send);
}
Clearly, this is very inefficient as each call to this function recreates the entire array "buffer" into a new array "data_array," one element at a time. The point of this is to insert the control byte (SSD1306_DATA_BYTE = 0x40) into the array so that the entire "package" is sent at once, and the control byte is in the right place. I could make the original "buffer" array one element larger and add the control byte as the first element, to skip this process but that makes the size 513 rather than 512, and might mess with some of the text/graphical functions that manipulate this array and depend on it being the correct size.
Now, I thought I could write the code like this:
void OLED_buffer(){ // Used to write contents of display buffer to OLED
    uint8_t data_byte = SSD1306_DATA_BYTE;
    OLED_command(SSD1306_SETLOWCOLUMN | 0x00);
    OLED_command(SSD1306_SETHIGHCOLUMN | 0x00);
    OLED_command(SSD1306_SETSTARTLINE | 0x00);
    twi_package_t data_control_byte = {
        .chip = OLED_BUS_ADDRESS,
        .buffer = &data_byte,
        .length = 1
    };
    twi_master_write(&TWIC, &data_control_byte);
    twi_package_t buffer_send = {
        .chip = OLED_BUS_ADDRESS,
        .buffer = buffer,
        .length = 512
    };
    twi_master_write(&TWIC, &buffer_send);
}
That doesn't work. The first "twi_master_write" command sends a START, address, control, STOP. Then the next such command sends a START, address, data buffer, STOP. Because the control byte is missing from the latter transaction, this does not work. All I need is to insert a 0x40 byte between the address byte and the buffer array when it is sent over the I2C bus. twi_master_write is a function that is provided in the Atmel TWI libraries. I've tried to examine the libraries to figure out its inner workings, but I can't make sense of it.
Surely, instead of figuring out how to recreate a twi_write function to work the way I need, there is an easier way to add this preceding control byte? Ideally one that is not so wasteful of clock cycles as my first code example? Realistically the display still updates very fast, more than enough for my needs, but that does not change the fact this is inefficient code.
I appreciate any advice you all may have. Thanks in advance!
How about having buffer and data_array point to the same uint8_t[513] array, with buffer starting at its second element? Then you can continue to use buffer as you do today, but also use data_array directly, without first having to copy all the elements from buffer.
uint8_t data_array[513];
uint8_t *buffer = &data_array[1];
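With that layout, OLED_buffer no longer needs the copy loop. A minimal sketch, reusing the names from the question:
uint8_t data_array[513];
uint8_t *buffer = &data_array[1]; // text/graphics code keeps using buffer[0..511]

void OLED_buffer(){ // Used to write contents of display buffer to OLED
    data_array[0] = SSD1306_DATA_BYTE; // control byte immediately precedes the data
    OLED_command(SSD1306_SETLOWCOLUMN | 0x00);
    OLED_command(SSD1306_SETHIGHCOLUMN | 0x00);
    OLED_command(SSD1306_SETSTARTLINE | 0x00);
    twi_package_t buffer_send = {
        .chip = OLED_BUS_ADDRESS,
        .buffer = data_array, // control byte + 512 data bytes in one transaction
        .length = 513
    };
    twi_master_write(&TWIC, &buffer_send);
}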

What is the difference between creating a buffer object with clCreateBuffer + CL_MEM_COPY_HOST_PTR vs. clCreateBuffer + clEnqueueWriteBuffer?

I have seen both versions in tutorials, but I could not find out what their advantages and disadvantages are. Which one is the proper one?
cl_mem input = clCreateBuffer(context,CL_MEM_READ_ONLY,sizeof(float) * DATA_SIZE, NULL, NULL);
clEnqueueWriteBuffer(command_queue, input, CL_TRUE, 0, sizeof(float) * DATA_SIZE, inputdata, 0, NULL, NULL);
vs.
cl_mem input = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float) * DATA_SIZE, inputdata, NULL);
Thanks.
[Update]
I added CL_MEM_COPY_HOST_PTR to the second example to make it correct.
While working with OpenCL I found a very important difference between
cl_mem CT = clCreateImage3D(Context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, Volume_format, X, Y, Z, rowPitch, slicePitch, sourceData, &error);
and
cl_mem CT = clCreateImage3D(Context, CL_MEM_READ_ONLY , Volume_format, X, Y, Z, 0, 0, 0, &error);
error = clEnqueueWriteImage(CommandQue, CT, CL_TRUE, origin, region, rowPitch, slicePitch, sourceData, 0, 0, 0);
With the first approach OpenCL does not copy the host data straight to the GPU. First it allocates a second temporary buffer on the host, which can cause problems if you load big things like a CT volume to the GPU: for a short time the required memory is twice the CT size. Also, the data is not copied inside this call; it is copied when the argument is set on the kernel function that uses the 3D image object.
The second approach copies the data directly to the GPU, with no additional allocations done by OpenCL. I think this is probably the same for normal buffer objects.
I assume that inputdata is not NULL.
In that case the second approach should not work at all, since the specification says that clCreateBuffer returns NULL and an error if:
CL_INVALID_HOST_PTR if host_ptr is NULL and CL_MEM_USE_HOST_PTR or CL_MEM_COPY_HOST_PTR are set in flags or if host_ptr is not NULL but CL_MEM_COPY_HOST_PTR or CL_MEM_USE_HOST_PTR are not set in flags.
so you mean either
clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);
or
clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);
The first one should be more or less the same as the first approach you showed, while the second one won't actually copy the data, but instead use the supplied memory location for buffer storage (caching portions or all of it in device memory). Which of those two is better depends on the usage scenario obviously.
Personally, I prefer the two-step approach of first allocating the buffer and afterwards filling it with clEnqueueWriteBuffer, since I find it easier to see what happens (of course one step might be faster, or it might not; that's just a guess).
The nice aspect of the first approach is that clEnqueueWriteBuffer allows you to attach an event to the copy of the buffer. So, if you want to measure the time it takes to copy data to the GPU using the queue's profiling options, you will be able to do so with the first approach, but not with the second one.
The second approach is more compact, easier to read, and requires fewer lines of code.
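As a rough illustration of that profiling point (a sketch only; it assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE):
cl_event copy_event;
cl_ulong t_start, t_end;

clEnqueueWriteBuffer(command_queue, input, CL_TRUE, 0,
                     sizeof(float) * DATA_SIZE, inputdata,
                     0, NULL, &copy_event); /* attach an event to the copy */
clGetEventProfilingInfo(copy_event, CL_PROFILING_COMMAND_START,
                        sizeof(t_start), &t_start, NULL);
clGetEventProfilingInfo(copy_event, CL_PROFILING_COMMAND_END,
                        sizeof(t_end), &t_end, NULL);
/* (t_end - t_start) is the copy duration in nanoseconds */
clReleaseEvent(copy_event);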
One major difference that I've run into:
cl_mem input = clCreateBuffer(context,CL_MEM_READ_ONLY,sizeof(float) * DATA_SIZE, NULL, NULL);
clEnqueueWriteBuffer(command_queue, input, CL_TRUE, 0, sizeof(float) * DATA_SIZE, inputdata, 0, NULL, NULL);
This first set of commands will create an empty buffer and enqueue a command in your command queue to fill the buffer.
cl_mem input = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float) * DATA_SIZE, inputdata, NULL)
This second command will create the buffer and fill it immediately. Note that there's no command queue in this argument list, so it uses the contents of inputdata as it is right now.
If you've already been running CL code and your source pointer is dependent upon a previous command in the command queue completing (e.g. an enqueued read of a prior output buffer), you definitely want to use the 1st method. If you try to create and fill the buffer in a single command, you'll end up with a race condition in which the buffer contents will not properly wait on the completion of your prior buffer read.
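A hedged sketch of that ordering (output and read_done are illustrative names): make the write wait explicitly on the earlier read through its event wait list.
cl_event read_done;

/* read back the results of an earlier command into host memory */
clEnqueueReadBuffer(command_queue, output, CL_FALSE, 0,
                    sizeof(float) * DATA_SIZE, inputdata,
                    0, NULL, &read_done);

/* the write starts only once the read has filled inputdata */
cl_mem input = clCreateBuffer(context, CL_MEM_READ_ONLY,
                              sizeof(float) * DATA_SIZE, NULL, NULL);
clEnqueueWriteBuffer(command_queue, input, CL_TRUE, 0,
                     sizeof(float) * DATA_SIZE, inputdata,
                     1, &read_done, NULL);
clReleaseEvent(read_done);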
Well the main difference between these two is that the first one allocates memory on the device and then copies data to that memory. The second one only allocates.
Or did you mean clCreateBuffer(context,CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,sizeof(float) * DATA_SIZE, inputdata, NULL);?
