In all the examples I've seen for GDCM on how to write image data, they always consider the image volume as a single whole, cohesive buffer. The basic structure is along the lines
#include "gdcmImage.h"
#include "gdcmImageWriter.h"
#include "gdcmFileDerivation.h"
#include "gdcmUIDGenerator.h"
int write_image(...)
{
size_t width = ..., height = ..., depth = ...;
auto im = new gdcm::Image;
std::vector<...> buffer;
auto p = buffer.data();
im->SetNumberOfDimensions(3);
im->SetDimension(0, width);
im->SetDimension(1, height);
im->SetDimension(1, depth);
im->GetPixelFormat().SetSamplesPerPixel(...);
im->SetPhotometricInterpretation( gdcm::PhotometricInterpretation::... );
unsigned long l = im->GetBufferLength();
if( l != width * height * depth * sizeof(...) ){ return SOME_ERROR; }
gdcm::DataElement pixeldata( gdcm::Tag(0x7fe0,0x0010) );
pixeldata.SetByteValue( buffer.data(), buffer.size()*sizeof(*buffer.data()) );
im->SetDataElement( pixeldata );
gdcm::UIDGenerator uid;
auto file = new gdcm::File;
gdcm::FileDerivation fd;
const char UID[] = ...;
fd.AddReference( ReferencedSOPClassUID, uid.Generate() );
fd.SetFile( *file );
// If all Code Value are ok the filter will execute properly
if( !fd.Derive() ){ return SOME_ERROR; }
gdcm::ImageWriter w;
w.SetImage( *im );
w.SetFile( fd.GetFile() );
// Set the filename:
w.SetFileName( "some_image.dcm" );
if( !w.Write() ){ return SOME_ERROR; }
return 0;
}
The problem I'm facing with this approach is, that the amount of image data I need to store easily exceeds the available system memory, if an additional copy is being made; specifically these are volumes of 4096×4096×2048 voxels of 12 bits each, so about 48GiB of data in memory.
However the approach of using gdcm::DataElement and gdcm::Image::SetDataElement will obviously create a full copy of the data in buffer, which is troublesome. For one, the data as produced by my imaging system does not reside in memory as a cohesive, singular block of values; it is split into slices. And the total amount of data fits into the memory of the systems being used only once.
It is trivial for me, to read in the data slice by slice, which would cut down the memory requirements significantly. However I'm at a loss, how that'd be done with GDCM.
Did you check gdcm::FileStreamer:
http://gdcm.sourceforge.net/3.0/html/classgdcm_1_1FileStreamer.xhtml
See typical setup at:
https://github.com/malaterre/GDCM/blob/master/Examples/Csharp/FileStreaming.cs
The example show how to create an out of memory private element, but you can do the same with public DataElement.
A more complex example to read where Pixel Data is written in chunks is at:
https://github.com/malaterre/GDCM/blob/master/Examples/Csharp/FileChangeTS.cs#L126-L154
Related
When performing shm-related development on MacOS, the searched processes are shown in the following code (verification is indeed correct).
However, there is a new problem that cannot be solved. It is found that when ftruncat adjusts the memory size for shm_fd, it is allocated according to the multiple of the page size.
But in this case, when the shared memory file is opened by other processes, the actual data size cannot be obtained correctly. The obtained file size is an integer multiple of the page, which will cause an error when appending data.
// write data_size = 12
char *data = "....";
long data_size = 12;
shmFD = shm_open(...);
ftruncate(shmFD, data_size); // Actually the size actually allocated is not 12, but 4096
shmAddr = (char *)mmap(NULL, data_size, ... , shmFD, 0);
memcpy(shmAddr, data, data_size);
// read
...
fstat(shmFD, &sb)
long context_len_in_shm = sb.st_size;
// get wrong shm size -> context_len_in_shm = 4096
Temporarily use the following structure to record data into shm. The first operation before writing or reading is to get the value of the data_len field, and then determine the length of the data to be read and written from the back. Hope for a more concise way, just like the use of lseek() under Linux.
shm mem map :
----shm mem----
struct {
long data_len;
data[1];
data[2];
...
data[data_len];
}
---------------
long *shm_mem = (long *)shmAddr;
long data_size = shm_mem[0]; // Before reading, you need to determine whether the shm file is empty and whether the pointer is valid. It is omitted here.
char *shm_data = (char *)&(shm_mem[1]);
char *buffer = (char *)malloc(data_size);
memcpy(buffer, shm_data, data_size);
For our student project I've been tinkering with an OBJ-loader in order to import models into our application.
It loads without issues, and drawing it kind of works without index (the model is obviously not represented correctly because I'm not using an index buffer)
However, drawing with DeviceContext->DrawIndexed shows nothing on screen.
Without indexed drawing
With indexed drawing
Buffer creation method:
void ObjectLoader::CreateBuffers()
{
//Index buffer
D3D11_BUFFER_DESC iBufferDesc;
memset(&iBufferDesc, 0, sizeof(iBufferDesc));
iBufferDesc.BindFlags = D3D11_BIND_INDEX_BUFFER;
iBufferDesc.Usage = D3D11_USAGE_DEFAULT;
iBufferDesc.ByteWidth = sizeof(DWORD);
D3D11_SUBRESOURCE_DATA indexData;
indexData.pSysMem = &ind;
pDevice->CreateBuffer(&iBufferDesc, &indexData, &pIndexBuffer);
//Vertex buffer
D3D11_BUFFER_DESC bufferDesc;
memset(&bufferDesc, 0, sizeof(bufferDesc));
bufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER;
bufferDesc.Usage = D3D11_USAGE_DEFAULT;
bufferDesc.ByteWidth = sizeof(TriangleVertex) * this->NumberOfVerts();
D3D11_SUBRESOURCE_DATA data;
data.pSysMem = tva;
pDevice->CreateBuffer(&bufferDesc, &data, &pVertexBuffer);
}
Draw method:
void ObjectLoader::Draw()
{
if (pDevice == nullptr)
return;
UINT32 vertexSize = sizeof(float) * 5;
UINT32 offset = 0;
pDeviceContext->IASetVertexBuffers(0, 1, &pVertexBuffer, &vertexSize, &offset);
pDeviceContext->IASetIndexBuffer(this->pIndexBuffer, DXGI_FORMAT_R32_UINT, 0);
pDeviceContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
pDeviceContext->DrawIndexed(vIndex.size(),0 , 0);
//pDeviceContext->Draw(this->NumberOfVerts(), 0);
}
What the hell am I missing? I've looked at several books on indexed drawing and it seems pretty straight-forward. At first I thought the winding order was reversed but I checked this by simply reversing the index array; same result.
If you need more code let me know, but I feel this should suffice.
Thanks in advance!
Edit: OT: I never figured out how to get my code to be properly formatted so I apologize for that, feel free to share how that's done.
I want to allocate memory for some elements of a structure, which are pointers to other small structs.How do I allocate and de-allocate memory in best way?
Ex:
typedef struct _SOME_STRUCT {
PDATATYPE1 PDatatype1;
PDATATYPE2 PDatatype2;
PDATATYPE3 PDatatype3;
.......
PDATATYPE12 PDatatype12;
} SOME_STRUCT, *PSOME_STRUCT;
I want to allocate memory for PDatatype1,3,4,6,7,9,11.Can I allocate memory with single malloc? or what is the best way to allocate memory for only these elements and how to free the whole memory allocated?
There is a trick that allows a single malloc, but that also has to weighed against doing a more standard multiple malloc approach.
If [and only if], once the DatatypeN elements of SOME_STRUCT are allocated, they do not need to be reallocated in any way, nor does any other code do a free on any of them, you can do the following [the assumption that PDATATYPEn points to DATATYPEn]:
PSOME_STRUCT
alloc_some_struct(void)
{
size_t siz;
void *vptr;
PSOME_STRUCT sptr;
// NOTE: this optimizes down to a single assignment
siz = 0;
siz += sizeof(DATATYPE1);
siz += sizeof(DATATYPE2);
siz += sizeof(DATATYPE3);
...
siz += sizeof(DATATYPE12);
sptr = malloc(sizeof(SOME_STRUCT) + siz);
vptr = sptr;
vptr += sizeof(SOME_STRUCT);
sptr->Pdatatype1 = vptr;
// either initialize the struct pointed to by sptr->Pdatatype1 here or
// caller should do it -- likewise for the others ...
vptr += sizeof(DATATYPE1);
sptr->Pdatatype2 = vptr;
vptr += sizeof(DATATYPE2);
sptr->Pdatatype3 = vptr;
vptr += sizeof(DATATYPE3);
...
sptr->Pdatatype12 = vptr;
vptr += sizeof(DATATYPE12);
return sptr;
}
Then, the when you're done, just do free(sptr).
The sizeof above should be sufficient to provide proper alignment for the sub-structs. If not, you'll have to replace them with a macro (e.g. SIZEOF) that provides the necessary alignment. (e.g.) for 8 byte alignment, something like:
#define SIZEOF(_siz) (((_siz) + 7) & ~0x07)
Note: While it is possible to do all this, and it is more common for things like variable length string structs like:
struct mystring {
int my_strlen;
char my_strbuf[0];
};
struct mystring {
int my_strlen;
char *my_strbuf;
};
It is debatable whether it's worth the [potential] fragility (i.e. somebody forgets and does the realloc/free on the individual elements). The cleaner way would be to embed the actual structs rather than the pointers to them if the single malloc is a high priority for you.
Otherwise, just do the the [more] standard way and do the 12 individual malloc calls and, later, the 12 free calls.
Still, it is a viable technique, particularly on small memory constrained systems.
Here is the [more] usual way involving per-element allocations:
PSOME_STRUCT
alloc_some_struct(void)
{
void *vptr;
PSOME_STRUCT sptr;
sptr = malloc(sizeof(SOME_STRUCT));
// either initialize the struct pointed to by sptr->Pdatatype1 here or
// caller should do it -- likewise for the others ...
sptr->Pdatatype1 = malloc(sizeof(DATATYPE1));
sptr->Pdatatype2 = malloc(sizeof(DATATYPE2));
sptr->Pdatatype3 = malloc(sizeof(DATATYPE3));
...
sptr->Pdatatype12 = malloc(sizeof(DATATYPE12));
return sptr;
}
void
free_some_struct(PSOME_STRUCT sptr)
{
free(sptr->Pdatatype1);
free(sptr->Pdatatype2);
free(sptr->Pdatatype3);
...
free(sptr->Pdatatype12);
free(sptr);
}
If your structure contains the others structures as elements instead of pointers, you can allocate memory for the combined structure in one shot:
typedef struct _SOME_STRUCT {
DATATYPE1 Datatype1;
DATATYPE2 Datatype2;
DATATYPE3 Datatype3;
.......
DATATYPE12 Datatype12;
} SOME_STRUCT, *PSOME_STRUCT;
PSOME_STRUCT p = (PSOME_STRUCT)malloc(sizeof(SOME_STRUCT));
// Or without malloc:
PSOME_STRUCT p = new SOME_STRUCT();
I want to copy a double pointer object to the host and compute over it on the GPU Device. When doing cudaMemcpy of the object to device it throws SEGFAULT.
BMP Input;
Input.ReadFromFile( fileName );
WIDTH = Input.TellWidth();
HEIGHT = Input.TellHeight();
RGBApixel** imageData = new RGBApixel* [HEIGHT];
for (int i = 0; i < HEIGHT; i++)
imageData[i] = new RGBApixel [WIDTH];
for(int j=0;j<Input.TellHeight();j++){
for(int i=0;i<Input.TellWidth();i++){
imageData[j][i] = Input.GetPixel(i,j);
}
}
long long imageSize = WIDTH*HEIGHT*sizeof(RGBApixel *);
RGBApixel** dev_imgdata,dev_imgdata_out;
//Allocating cudaMemory
cudaMalloc( (void **) &dev_imgdata, imageSize );
cudaMalloc( (void **) &dev_imgdata_out, imageSize );
Now the below line throws segfault
cudaMemcpy(dev_imgdata,imageData,imageSize,cudaMemcpyHostToDevice);
When declaring RGBApixel** imageData = new RGBApixel* [HEIGHT]; you have absolutely no guarantee that imageData will occupy a contiguous block of memory.
cudaMemcpy copies contiguous blocks of memory into the device RAM. Your statement tries to copy the start addresses of each matrix row but not the actual data. Also when using cudaMalloc, you need to properly allocate for each line, exactly as you did for the host buffer.
What you need to do is to declare imageData as just a RGMAPixel* - basically put the matrix in a single vector and use proper indexing and it will work.
You can also copy each line at a time but that's not a very good practice since every memory access will require an extra indirection and you will mess the caching efficiency.
Also, make sure that when you compile your program, you use -arch sm_20 to enable extra options for your graphic card ( if it has Capability 2.0). Without it I believe you can't use double and the result is unpredictable (or the double is diminished to float)
I want to convert a UINT16 monochrome image to a 8 bits image, in C++.
I have that image in a
char *buffer;
I'd like to give the new converted buffer to a QImage (Qt).
I'm trying with freeImagePlus
fipImage fimage;
if (fimage.loadfromMemory(...) == false)
//error
loadfromMemory needs a fipMemoryIO adress:
loadfromMemory(fipMemoryIO &memIO, int flag = 0)
So I do
fipImage fimage;
BYTE *buf = (BYTE*)malloc(gimage.GetBufferLength() * sizeof(BYTE));
// 'buf' is empty, I have to fill it with 'buffer' content
// how can I do it?
fipMemoryIO memIO(buf, gimage.GetBufferLength());
fimage.loadFromMemory(memIO);
if (fimage.convertTo8Bits() == true)
cout << "Good";
Then I would do something like
fimage.saveToMemory(...
or
fimage.saveToHandle(...
I don't understand what is a FREE_IMAGE_FORMAT, which is the first argument to any of those two functions. I can't find information of those types in the freeImage documentation.
Then I'd finish with
imageQt = new QImage(destiny, dimX, dimY, QImage::Format_Indexed8);
How can I fill 'buf' with the content of the initial buffer?
And get the data from the fipImage to a uchar* data for a QImage?
Thanks.
The conversion is simple to do in plain old C++, no need for external libraries unless they are significantly faster and you care about such a speedup. Below is how I'd do the conversion, at least as a first cut. The data is converted inside of the input buffer, since the output is smaller than the input.
QImage from16Bit(void * buffer, int width, int height) {
int size = width*height*2; // length of data in buffer, in bytes
quint8 * output = reinterpret_cast<quint8*>(buffer);
const quint16 * input = reinterpret_cast<const quint16*>(buffer);
if (!size) return QImage;
do {
*output++ = *input++ >> 8;
} while (size -= 2);
return QImage(output, width, height, QImage::Format_Indexed8);
}