I am working on a project: an Android app that uses the camera to capture a photo of a ticket and performs OCR on only part of it. I have no previous experience in image processing, but I suspect there must be some clever way to do this, because Android applications have tight RAM limits.
I don't have enough reputation points to post images, so I am giving URLs to them instead.
Below is the image before any processing:
My aim is to automatically detect the dashed (---) lines and crop the photo so that the final image looks like this one:
What's more, it's important to stay open source and do it without sending the photo to some external image processing service.
You can try using the Hough transform to find the lines. OpenCV has an implementation that is open source and works on Android.
HoughLinesP is a very efficient variant of the Hough transform for finding line segments.
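If it helps, here is a minimal sketch of the idea in Python (the OpenCV Java bindings on Android expose the same calls). The Canny thresholds, the near-horizontal filter, and the assumption that the separators span at least half the image width are my guesses, so tune them for your tickets:

    import cv2
    import numpy as np

    # Edge-detect the ticket photo; the Canny thresholds are a starting guess.
    img = cv2.imread("ticket.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)

    # Probabilistic Hough transform: a generous maxLineGap lets the broken
    # pieces of a dashed line merge into a single detected segment.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=img.shape[1] // 2, maxLineGap=20)

    # Collect the y positions of near-horizontal lines.
    ys = []
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            if abs(int(y2) - int(y1)) < 5:
                ys.append(int(y1))

    # Crop between the topmost and bottommost separator found.
    if len(ys) >= 2:
        cv2.imwrite("cropped.jpg", img[min(ys):max(ys), :])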
Olena is definitely the way to go! It's a generic image processing library, but the interesting part is a module called Scribo.
Scribo will do document analysis on the picture to extract text and/or image regions, and optionally send the text regions to Tesseract for recognition.
Whether it is feasible on Android is something I can't tell. I've tried it on OS X and Linux systems and it shows great potential.
I have 1500 pictures that need the address where they were taken shown in the corner of the picture. The pictures are geo-tagged.
I need help extracting the GPS data and converting that to an address.
Then I need to take that address and draw it onto the picture in the bottom right corner. Can anyone help or point me in the right direction, please?
You're going to need two things. First, you need an application that will extract the EXIF data you are interested in. You should be able to write this yourself, as it is fairly simple to do. You will need the JPEG standard, and just enough of it to identify the markers, specifically the APPn markers. You are also going to need the EXIF and (possibly the) TIFF standards to figure out how to extract the data you need from the EXIF APPn marker.
Writing the information to the corner of the image is the tough part. There are probably command line applications that will already let you do that. If worst comes to worst, there are various language APIs that will allow you to read a JPEG stream into a buffer, draw text to the buffer, then write the buffer back to a JPEG stream.
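To give you an idea of how simple the marker scan is, here is a rough Python sketch that walks the JPEG segment structure and returns the raw EXIF payload (the APP1 segment starting with "Exif\0\0"). Decoding the TIFF/EXIF structure inside that payload is the part you would still need the standards for:

    import struct

    def find_exif_segment(path):
        """Return the raw EXIF payload of a JPEG, or None if absent."""
        with open(path, "rb") as f:
            if f.read(2) != b"\xff\xd8":           # SOI marker
                raise ValueError("not a JPEG file")
            while True:
                marker = f.read(2)
                if len(marker) < 2 or marker[0] != 0xFF:
                    return None                    # ran off the end
                # Segment length is big-endian and includes its own 2 bytes.
                (length,) = struct.unpack(">H", f.read(2))
                data = f.read(length - 2)
                if marker == b"\xff\xe1" and data.startswith(b"Exif\x00\x00"):
                    return data[6:]                # TIFF header starts here
                if marker == b"\xff\xda":          # SOS: compressed data follows
                    return None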
You will most likely need to use a programming language for this; I think Python would be suitable, as it's easy to get started with and has the libraries needed for your task.
For example, in order to extract the location (coordinates) from the JPEG files you can use pyexiv2.
To transform those coordinates into addresses you need a geocoding service such as Google's Geocoding API; you can use their Python library directly or write your own client using something like requests.
Now that you have the address data, you can overlay it onto the images using the Pillow library.
If you're looking for some code to get started, let me shamelessly plug my own project, photomap; you can find code to read GPS information from images here: https://github.com/iticus/photomap/blob/master/handlers.py#L170
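To tie the steps together, here is a rough sketch of the whole pipeline using Pillow for both the EXIF reading and the overlay (pyexiv2 would do the reading just as well). The geocoding request and the font path are placeholders, so substitute whatever service and font you actually use:

    import requests
    from PIL import Image, ImageDraw, ImageFont

    def to_degrees(values, ref):
        """Convert EXIF (deg, min, sec) rationals to signed decimal degrees."""
        d, m, s = (float(v) for v in values)
        deg = d + m / 60 + s / 3600
        return -deg if ref in ("S", "W") else deg

    def read_coordinates(path):
        gps = Image.open(path).getexif().get_ifd(0x8825)  # GPS IFD
        lat = to_degrees(gps[2], gps[1])   # GPSLatitude / GPSLatitudeRef
        lon = to_degrees(gps[4], gps[3])   # GPSLongitude / GPSLongitudeRef
        return lat, lon

    def reverse_geocode(lat, lon):
        # Google's reverse geocoding endpoint; you need your own API key.
        resp = requests.get("https://maps.googleapis.com/maps/api/geocode/json",
                            params={"latlng": f"{lat},{lon}", "key": "YOUR_KEY"})
        return resp.json()["results"][0]["formatted_address"]

    def stamp_address(path, out_path):
        img = Image.open(path)
        address = reverse_geocode(*read_coordinates(path))
        draw = ImageDraw.Draw(img)
        font = ImageFont.truetype("DejaVuSans.ttf", 36)  # any TTF you have
        w, h = img.size
        # anchor="rd" pins the text's bottom-right corner to the given point.
        draw.text((w - 20, h - 20), address, font=font, fill="white", anchor="rd")
        img.save(out_path)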
The question may seem weird, but I need to ask it, since I am seeing quite interesting output when I compare text-as-image and graphics-as-image content.
I am in the process of identifying a tool or algorithm to compare two PDFs and generate output that highlights the differences between them.
Some PDFs have text in image form (legacy text on paper, scanned and converted to PDF).
We are migrating those legacy PDFs, and at the end we compare the legacy output with the converted PDF output.
I am evaluating a couple of tools, such as Adobe DC Pro, i-net PDFC, and Power PDF, for comparing two PDFs.
While evaluating, I can see that graphic images are being compared (not accurately, either) on both sides of the PDFs, whereas text-as-image content is completely ignored; the results are unanimously the same in all the tools.
But I am more interested in text as image, since we deal mostly with legacy text PDFs.
Attached below is a graphic image comparison result, where the tool was able to capture the differences between the images.
But when I compare a text image, the differences are not highlighted in the tool.
What I understand from this is that text is not compared as image graphics, and the tools are skipping the comparison entirely. I would like clarification on whether my assumption is correct.
Secondly, I would like to know how to compare text images in PDFs so as to generate the differences.
I'm working for the company that is the author of i-net PDFC, so I'll answer your first question as well:
Your assumption is correct. i-net PDFC is able to compare images and shapes, but it cannot detect whether some content has completely changed its meaning, e.g. a line shape that is used to draw a letter or, in your case, an image that has to be recognized as text. Recognizing ASCII art as an image won't work, for the same reason. Such cases will always be detected as differences even though their visual appearance is similar.
On your second question: using an OCR conversion tool for one or both documents is a common solution to this problem. A simple image comparison of the compared pages is unlikely to work due to the different font styles and line wrappings in the converted file.
Please note that most OCR applications will use the rendered page images for the recognition. This may lead to incorrect recognition results even if there are no images in the PDF file.
i-net Software is aware of this general issue and an OCR module is currently in development. It'll provide an option to apply the recognition solely to the images in the PDF files.
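As a rough illustration of that OCR route (not how i-net PDFC works internally), here is a Python sketch that rasterizes each page, runs Tesseract on it, and diffs the recognized text. It assumes pdf2image (which needs Poppler) and pytesseract (which needs Tesseract) are installed; as noted above, OCR noise and re-wrapped lines will still show up as spurious differences:

    import difflib
    import pytesseract
    from pdf2image import convert_from_path

    def ocr_pages(pdf_path):
        """Rasterize each page of a PDF and return its recognized text."""
        pages = convert_from_path(pdf_path, dpi=300)
        return [pytesseract.image_to_string(page) for page in pages]

    legacy = ocr_pages("legacy.pdf")
    converted = ocr_pages("converted.pdf")

    # Compare page by page and print a unified diff of the recognized text.
    for page_no, (old, new) in enumerate(zip(legacy, converted), start=1):
        diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                         lineterm=""))
        if diff:
            print(f"--- page {page_no} ---")
            print("\n".join(diff))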
I am learning to use OpenGL ES 2.0 by using MoSync to write cross platform C code. I have already managed to draw basic shapes such as a triangle, square and circle so the next stage is to draw some text to the screen. After reading various books, tutorials and forum posts I realise I have to create a texture atlas bitmap.
I have an image file containing the glyphs I want to use, i.e. 0-9 and a-z. Before I can bind it to a texture object I first need to load the image into OpenGL. Various tutorials use UIImage or BitmapFactory to load the image, but I cannot use these as MoSync does not contain their header files. Could anyone suggest a way to load my image file into OpenGL?
To use MoSync on the Android platform you will probably have to build a native library for MoSync and your OpenGL ES code in C++. Most OpenGL ES projects on Android are done in native code for many reasons, which are detailed in this article:
http://software.intel.com/en-us/articles/porting-opengl-games-to-android-on-intel-atom-processors-part-1/
I ended up using maOpenGLTexImage(MAHandle image), which works exactly like glTexImage2D() but takes an image resource instead and figures out pixel formats, etc.
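Once the atlas is uploaded (via maOpenGLTexImage or anything else), what remains is mapping each character to its cell in the grid. Here is a minimal sketch of that lookup, written in Python for brevity, assuming a 16x16 grid of equal cells laid out in ASCII order starting at the space character; the real layout depends on how your atlas image was generated:

    def glyph_uv(ch, cols=16, rows=16, first_char=" "):
        """Return (u0, v0, u1, v1) texture coordinates for one glyph cell."""
        index = ord(ch) - ord(first_char)
        col, row = index % cols, index // cols
        w, h = 1.0 / cols, 1.0 / rows
        # Depending on how the image was uploaded you may need to flip v.
        return (col * w, row * h, col * w + w, row * h + h)

    # UVs for the quad that draws the letter 'A':
    print(glyph_uv("A"))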
I am developing an application for viewing images.
I used Apple's PhotoScroller example to implement this application.
In my application I want to be able to draw on the image.
I had the idea of putting a UIView with a transparent background on top and drawing the lines via touch events. This solution turned out to be very slow because the images are very large, around 3700x2000 pixels.
I also tried a solution based on Apple's GLPaint example, which uses OpenGL, but it has a size limitation of 2048x2048 pixels.
Anyone have any idea or example of how I implement this?
I think you should try tiling your image.
One option is using CATiledLayer. Have a look at this short tutorial.
Or you could try using CGContextDrawTiledImage to get your stuff done. Possibly this post from SO could help you get started.
I don't know anything about either library but I have to choose one of them.
Which one would you recommend?
I'm using Perl. I need to generate images for a weather site. Each image is generated for a location and should contain the temperature and a weather-condition image. I guess this is a piece of cake for both libs, but I want to know which one is more powerful. I've read that libGD is not able to rotate text. Maybe there are some other drawbacks? Which one generates images faster? Whose API is easier to use?
According to this source, you should use GD:
GD and other modules built on top of that (like GD::Graph) are more aimed at producing "new" images like charts.
And you can read "Develop your own weather maps and alerts with Perl and GD", which is what you're looking for.
If you have some time, try them both, play a little, and decide.
I find both straightforward to use, but ImageMagick gives you a lot more power than GD. Here are two Magick examples from my posts:
How can I use IO::Scalar with Image::Magick::Read()
How can I resize an image to fit area with Image::Magick?
to give you examples of the API.
I have used GD to create a visualization.
See the script giss-timeline-graphs.pl on that page.
ImageMagick is more robust; however, libGD should be able to cover most image generation tasks as well. You should look at the Perl APIs for both libraries to see which is more convenient for you.