I am having trouble detecting digits in an image with the Windows UWP OcrEngine from C++/CX.
I need to detect the number in the following image.
I tried the built-in Windows 10 UWP OcrEngine with the following code:
...
cv::Mat croppedImage = imread("digit.png");
WriteableBitmap^ bit1 = ref new WriteableBitmap(croppedImage.cols, croppedImage.rows);
SoftwareBitmap^ bit2 = SoftwareBitmap::CreateCopyFromBuffer(bit1->PixelBuffer, BitmapPixelFormat::Bgra8, bit1->PixelWidth, bit1->PixelHeight);
Windows::Globalization::Language^ l = ref new Windows::Globalization::Language("de");
OcrEngine^ ocrEngine = OcrEngine::TryCreateFromLanguage(l);
IAsyncOperation<OcrResult^>^ ao = ocrEngine->RecognizeAsync(bit2);
task_completion_event<Platform::String^> purchaseCompleted;
auto deviceEnumTask = create_task(ao);
deviceEnumTask.then([this](OcrResult^ result)
{
    App1::MainPage::findNumber(result->Text);
});
...
void App1::MainPage::findNumber(Platform::String^ text)
{
    // Do something with the string
}
My problem is that the string passed to findNumber is always null. I tried different pictures as input, but the result is always the same: NULL.
Is there an easier way to extract the digits in these images in C++/CX?
What could be the problem? Converting the image?
The problem was the conversion of the WriteableBitmap to a SoftwareBitmap:
WriteableBitmap^ bit1 = ref new WriteableBitmap(croppedImage.cols, croppedImage.rows);
// Get access to the pixels
// (requires <wrl/client.h> for ComPtr and <robuffer.h> for IBufferByteAccess)
IBuffer^ buffer = bit1->PixelBuffer;
unsigned char* dstPixels;
// Obtain IBufferByteAccess
ComPtr<IBufferByteAccess> pBufferByteAccess;
ComPtr<IInspectable> pBuffer((IInspectable*)buffer);
pBuffer.As(&pBufferByteAccess);
// Get pointer to pixel bytes
pBufferByteAccess->Buffer(&dstPixels);
// Copy the cv::Mat pixel data into the WriteableBitmap's buffer
memcpy(dstPixels, croppedImage.data, croppedImage.step.buf[1] * croppedImage.cols * croppedImage.rows);
SoftwareBitmap^ bit2 = ref new SoftwareBitmap(BitmapPixelFormat::Bgra8, croppedImage.cols, croppedImage.rows);
bit2->CopyFromBuffer(bit1->PixelBuffer);
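Note (my assumption, not from the original post): for the memcpy above to produce valid Bgra8 data, croppedImage has to be a 4-channel BGRA Mat, but cv::imread returns a 3-channel BGR image by default. A conversion along these lines is likely needed first:
// Expand the 3-channel BGR Mat that imread returns into 4-channel BGRA,
// so each pixel is 4 bytes, matching BitmapPixelFormat::Bgra8.
cv::Mat bgra;
cv::cvtColor(croppedImage, bgra, cv::COLOR_BGR2BGRA);
// ...then memcpy from bgra.data instead of croppedImage.data.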
I'm trying to create an NV12 resource as the source for a video encoder in DX12. While I intend to eventually populate the resource from the GPU, what I'm trying to do now is take an ffmpeg AVFrame I already have (in AV_PIX_FMT_YUV420P format) and create a texture in DXGI_FORMAT_NV12 format using that data.
I understand that the NV12 format (https://learn.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#nv12) has U and V interleaved while AV_PIX_FMT_YUV420P doesn't.
My main question is what the D3D12_RESOURCE_DESC looks like for an NV12 texture: do I tell it I need more than one array/mip level to make it planar? Or do I just give it a single memory address with both planes laid out as per the NV12 format, and it figures out the subresources for me based on the format?
I understand that to read the data I define two SRVs, one for Y mapped to the red channel and a second for U and V, but it's how I initialise it that's confusing me.
Just create the resource as normal, and then when you query the layout description, it will be planar.
D3D12_RESOURCE_DESC desc = {};
desc.Dimension = D3D12_RESOURCE_DIMENSION_TEXTURE2D;
desc.Format = DXGI_FORMAT_NV12;
desc.MipLevels = 1;
desc.DepthOrArraySize = 1;
desc.Width = 1024;
desc.Height = 720;
desc.SampleDesc.Count = 1;
const CD3DX12_HEAP_PROPERTIES defaultHeapProperties(D3D12_HEAP_TYPE_DEFAULT);
ComPtr<ID3D12Resource> res;
HRESULT hr = device->CreateCommittedResource(
&defaultHeapProperties,
D3D12_HEAP_FLAG_NONE,
&desc,
D3D12_RESOURCE_STATE_COMMON,
nullptr,
IID_PPV_ARGS(res.GetAddressOf()));
if (FAILED(hr))
{
    // error
}
D3D12_FEATURE_DATA_FORMAT_INFO formatInfo = { DXGI_FORMAT_NV12, 0 };
if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_FORMAT_INFO, &formatInfo, sizeof(formatInfo))))
{
    formatInfo = {};
}
D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint[2] = {};
// NumSubresources is 2, so the row-count and row-size parameters must point to arrays of two
UINT numRows[2];
UINT64 rowBytes[2], totalBytes;
device->GetCopyableFootprints(&desc, 0, 2, 0, footprint, numRows, rowBytes, &totalBytes);
The formatInfo.PlaneCount is 2, which is why you have to ask for two subresources.
footprint[0].Format is DXGI_FORMAT_R8_TYPELESS with 1024x720 size. The footprint[0].Offset is likely 0.
footprint[1].Format is DXGI_FORMAT_R8G8_TYPELESS with 512x360 size. The footprint[1].Offset is something other than 0.
In Direct3D 12 Video the layouts are very simple to understand. In Direct3D 11 Video, it was all implicitly defined so it was a bit of a mess. That said, DDS files were defined as non-planar data, so you may want to examine how these are handled in DirectXTex.
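To address the initialisation part of the question, here is a rough sketch (my own addition, not part of the answer above) of copying the AVFrame's YUV420P planes into the NV12 texture. It assumes an AVFrame* frame, an upload-heap buffer uploadBuffer of totalBytes bytes mapped at a BYTE* uploadPtr, a command list cmdList, the d3dx12.h helpers, and the footprint/numRows/rowBytes arrays queried above.
// Plane 0: copy Y row by row, honouring the aligned RowPitch.
BYTE* dst = uploadPtr + footprint[0].Offset;
for (UINT y = 0; y < numRows[0]; ++y)
{
    memcpy(dst + y * footprint[0].Footprint.RowPitch,
           frame->data[0] + y * frame->linesize[0],
           (size_t)rowBytes[0]);
}
// Plane 1: interleave the separate U and V planes into UVUV... (R8G8).
dst = uploadPtr + footprint[1].Offset;
for (UINT y = 0; y < numRows[1]; ++y)
{
    BYTE* row = dst + y * footprint[1].Footprint.RowPitch;
    const BYTE* u = frame->data[1] + y * frame->linesize[1];
    const BYTE* v = frame->data[2] + y * frame->linesize[2];
    for (UINT x = 0; x < footprint[1].Footprint.Width; ++x)
    {
        row[2 * x]     = u[x];
        row[2 * x + 1] = v[x];
    }
}
// One CopyTextureRegion per plane (subresources 0 and 1).
for (UINT plane = 0; plane < 2; ++plane)
{
    CD3DX12_TEXTURE_COPY_LOCATION dstLoc(res.Get(), plane);
    CD3DX12_TEXTURE_COPY_LOCATION srcLoc(uploadBuffer.Get(), footprint[plane]);
    cmdList->CopyTextureRegion(&dstLoc, 0, 0, 0, &srcLoc, nullptr);
}
Depending on how you schedule this, you may also need a barrier transitioning res to D3D12_RESOURCE_STATE_COPY_DEST before recording the copies.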
Is there a way to create a scoreboard in Processing that is saved after the sketch is closed and reopened? And is there a way to make this work on Android?
Here is a little sample using saveStrings():
// random scores
int[] scores = {1, 20, 40, 60, 30, 25};

void setup() {
  // convert to strings and save (a bare filename is saved into the sketch folder)
  String[] s = str(scores);
  saveStrings("sco.txt", s);
  // load back into a different array...
  // sketchPath() resolves the absolute path to the saved file
  String path = sketchPath("sco.txt");
  int[] loadedScores = int(loadStrings(path));
  // ensure they are there...
  println(loadedScores);
}
Can somebody help me with the following code snippet to capture part of or the whole desktop on OS X? I would like to specify the upper-left corner coordinates (x, y) and the width (w) and height (h) of the rectangle that defines the capture.
It's for a C# MonoMac application on OS X.
This is what I've done:
int windowNumber = 2;
System.Drawing.RectangleF bounds = new RectangleF(0,146,320,157);
CGImage screenImage = MonoMac.CoreGraphics.CGImage.ScreenImage(windowNumber,bounds);
MonoMac.Foundation.NSData bitmapData = screenImage.DataProvider.CopyData();
It looks like I have the bitmap data in 'bitmapData', but I'm not sure how to convert the NSData instance 'bitmapData' to an actual Bitmap; i.e.:
Bitmap screenCapture = ????
The documentation is really sparse, and I've googled for examples without luck. So I'm hoping there's a kind MonoMac expert out there who can point me in the right direction. An example would be nice :o)
Thank you in advance!
This will give you the bytes of your capture in a .NET byte[], from which you can create a Bitmap or Image or whatever you want. It might not be exactly what you are looking for, but it should point you in the right direction.
int windowNumber = 2;
System.Drawing.RectangleF bounds = new RectangleF(0, 146, 320, 157);
CGImage screenImage = MonoMac.CoreGraphics.CGImage.ScreenImage(windowNumber,bounds);
byte[] imageBytes;
using (NSBitmapImageRep imageRep = new NSBitmapImageRep(screenImage))
{
    NSDictionary properties = NSDictionary.FromObjectAndKey(new NSNumber(1.0), new NSString("NSImageCompressionFactor"));
    // note: despite the compression-factor key, this encodes to PNG
    using (NSData pngData = imageRep.RepresentationUsingTypeProperties(NSBitmapImageFileType.Png, properties))
    using (var ms = new MemoryStream())
    {
        pngData.AsStream().CopyTo(ms);
        imageBytes = ms.ToArray();
    }
}
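From there, getting the Bitmap the question asks for is straightforward (my addition; note that GDI+ requires the stream to stay open for as long as the Bitmap is in use):
using (var ms = new MemoryStream(imageBytes))
using (Bitmap screenCapture = new Bitmap(ms))
{
    // use screenCapture here, or Clone() it if it must outlive the stream
}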
I've got a Windows 8 program that uses an image picker and uploads the selected image to a server.
The server provides an API which requires the image to be converted to a base64 string, and the image must be less than 7 MB.
I'm using the code below:
FileOpenPicker openPicker = new FileOpenPicker();
openPicker.ViewMode = PickerViewMode.Thumbnail;
openPicker.SuggestedStartLocation = PickerLocationId.PicturesLibrary;
openPicker.FileTypeFilter.Add(".jpg");
openPicker.FileTypeFilter.Add(".jpeg");
openPicker.FileTypeFilter.Add(".png");
StorageFile file = await openPicker.PickSingleFileAsync();
if (file != null)
{
    // Application now has read/write access to the picked file
    bitmap = new BitmapImage();
    byte[] buf;
    using (var stream = await file.OpenStreamForReadAsync())
    {
        buf = ReadToEnd(stream); // helper that reads the whole stream into a byte[]
    }
    using (var stream = await file.OpenAsync(FileAccessMode.Read))
    {
        base64String = Convert.ToBase64String(buf);
        bitmap.SetSource(stream);
    }
}
And the bitmap goes to the server.
But there is a problem: the bitmap's size is much bigger than the JPG's, for example, and none of the small JPGs make it to the server because their bitmap version is larger than 7 MB.
Can I convert an image to a base64 string without converting it to a bitmap?
In this code, you read the image (already encoded as JPEG) and convert it to a base64 string; there is no bitmap conversion involved.
You cannot reduce the size of the base64 string without reducing the size of the image itself.
To do so, you can use a BitmapDecoder/BitmapEncoder pair and resize the image to smaller dimensions, for example:
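(A rough sketch of that resize, my addition rather than the original answer's code; it uses Windows.Graphics.Imaging and Windows.Storage.Streams, assumes the file and base64String variables from the question, and needs System.Runtime.InteropServices.WindowsRuntime for AsBuffer().)
using (IRandomAccessStream input = await file.OpenAsync(FileAccessMode.Read))
using (var output = new InMemoryRandomAccessStream())
{
    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(input);
    // re-encode in the original format while scaling down
    BitmapEncoder encoder = await BitmapEncoder.CreateForTranscodingAsync(output, decoder);
    encoder.BitmapTransform.ScaledWidth = decoder.PixelWidth / 2;   // pick a ratio that
    encoder.BitmapTransform.ScaledHeight = decoder.PixelHeight / 2; // gets under 7 MB
    await encoder.FlushAsync();

    output.Seek(0);
    var bytes = new byte[output.Size];
    await output.ReadAsync(bytes.AsBuffer(), (uint)output.Size, InputStreamOptions.None);
    base64String = Convert.ToBase64String(bytes);
}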
Regards
How can I convert a single-channel IplImage (grayscale, depth = 8) into a Bitmap?
The following code runs, but displays the image in 256 colors, not grayscale (the colors are very different from the original).
btmap = gcnew Bitmap(
    cvImg->width,
    cvImg->height,
    cvImg->widthStep,
    System::Drawing::Imaging::PixelFormat::Format8bppIndexed,
    (System::IntPtr)cvImg->imageData);
I believe my problem lies in the PixelFormat. I've tried scaling the image to 16 bit and setting the pixel format to Format16bppGrayScale, but this crashes the form when loading the image.
The destination is a PictureBox in a C# form. Thanks.
You need to create a ColorPalette instance, fill it with a grayscale palette, and assign it to the btmap->Palette property.
Edit: Actually, constructing a ColorPalette directly is a bit tricky; it is easier to take the palette from btmap->Palette, overwrite its color entries, and assign it back. Set the entries to RGB(0,0,0), RGB(1,1,1) ... RGB(255,255,255). Something like this:
ColorPalette^ palette = btmap->Palette; // note: the getter returns a copy
array<Color>^ entries = palette->Entries;
for (int i = 0; i < 256; ++i)
{
    entries[i] = Color::FromArgb(i, i, i);
}
btmap->Palette = palette; // assign the modified copy back
int intStride = (AfterHist.width * AfterHist.nChannels + 3) & -4; // round the stride up to a multiple of 4
Bitmap BMP = new Bitmap(AfterHist.width,
    AfterHist.height, intStride,
    PixelFormat.Format24bppRgb, AfterHist.imageData);
This is the correct way to create a Bitmap from an IplImage.