Image optimization for OCR

I'm using Tesseract v3.02.
I have the following image
I would like to extract the text from it using Tesseract.
When writing this command:
tesseract cropped.png tess -psm 7
The result I get is "suackea I 30 10193020 NL 3 e 1 64 :23 23% 34% 120". While the end is ok, the beginning is incorrect. The expected result is:
"Strackea III €0.10/€0.20 NL 6 6 1 €4 €23 23% 34% 120"
I have tried some ImageMagick transformations before running Tesseract, to get black text on a white background:
convert cropped.png -fuzz 28000 -fill black -opaque white cropped.png
convert cropped.png -fuzz 25000 -fill white -opaque "rgb(118,118,118)" cropped.png
(Note the quotes around rgb(...): without them the shell tries to interpret the parentheses.)
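A more conventional pre-processing chain for OCR is grayscale, invert, upscale, then binarise. This is a sketch, not a verified fix: the resize factor and threshold are guesses you would need to tune for this screenshot, and the filenames are placeholders.

```shell
#!/bin/bash
# Sketch of a typical OCR pre-processing chain (assumes ImageMagick's
# `convert` is installed; the 300% and 50% values are tuning guesses).
in=cropped.png
out=clean.png
args=(
  -colorspace Gray   # drop colour information
  -negate            # light-on-dark becomes dark-on-light
  -resize 300%       # upscale small screen fonts for Tesseract
  -threshold 50%     # binarise to pure black/white
)
echo "convert $in ${args[*]} $out"
# Uncomment to actually run it:
# convert "$in" "${args[@]}" "$out"
```

Building the options in an array makes it easy to comment individual steps in or out while experimenting.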
Running the same command on the resulting image:
tesseract cropped.png tess -psm 7
still gives the same result.
What transformation or other command-line tool would you use to recognize the text correctly?
The font is Microsoft Sans Serif.

Related

Graphicsmagick Unrecognized option (-annotate)

I am trying to add some text to an image with the following command:
convert - -pointsize 37 -font fonts/source-sans-pro-regular.ttf -fill "#FFFFFF80" -gravity SouthEast -annotate +48+48 "some text"
but I am getting:
convert convert: Unrecognized option (-annotate).
Version of graphicsmagick is:
GraphicsMagick 1.4 snapshot-20210721 Q16
What could the problem be?
I see two issues here.
Firstly, GraphicsMagick commands start with gm, i.e.
gm convert ...
gm mogrify ...
gm identify ...
Secondly, the gm convert command does not have an -annotate option; see the list of supported options in the GraphicsMagick documentation.
Maybe you meant to install ImageMagick which would:
work without gm prefix, as you expected, and
accept the -annotate option you hoped to use.
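If you must stay with GraphicsMagick, its -draw primitive can place text instead of -annotate. This is a sketch under assumptions: that your gm build reads TrueType fonts via -font, and with the alpha suffix dropped from the fill colour since 8-digit hex alpha support varies; input.png and output.png are placeholders.

```shell
#!/bin/bash
# Sketch: drawing text with GraphicsMagick's -draw primitive in place of
# ImageMagick's -annotate. Coordinates in -draw are absolute x,y, not a
# gravity offset, so you must compute the position yourself.
cmd=(gm convert input.png
     -pointsize 37
     -font fonts/source-sans-pro-regular.ttf
     -fill "#FFFFFF"
     -draw 'text 48,48 "some text"'
     output.png)
echo "${cmd[@]}"
# Uncomment to run for real:
# "${cmd[@]}"
```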

Filling gaps in Chinese characters caused by line removal, for OCR

Hello friends,
I am having a hard time OCRing the image above because of the gaps left by line removal. Could anyone kindly guide me on how to fill the gaps in the Chinese characters using ImageMagick?
Cool question! There are many ways of approaching this but unfortunately I can't tell which ones work! So I'll give you some code and you can experiment by changing it around.
For the moment, I tried simply removing any lines that have white pixels in them, but you could look at the lines above and below, or do something else.
#!/bin/bash -xv
# Get lines containing white pixels
convert chinese.gif -colorspace gray -threshold 80% DEBUG-white-lines.png
# Develop that idea and get the line numbers in an array
wl=( $(convert chinese.gif -colorspace gray -threshold 80% -resize 1x\! -threshold 20% txt: | awk -F '[,:]' '/FFFFFF/{print $2}') )
# White lines are:
echo "${wl[@]}"
# Build a string of a whole load of "chop" commands to apply in one go, rather than applying one-at-a-time and saving/re-loading
# As we chop each line, the remaining lines move up, changing their offset by one line - UGHH. Apply a correction!
chop=""
correction=0
for line in "${wl[@]}" ; do
((y=line-correction))
chop="$chop -chop 0x1+0+$y "
((correction=correction+1))
done
echo $chop
convert chinese.gif $chop result.png
Here's the image DEBUG-white-lines.png:
The white lines are identified as:
44 74 134 164 194 254 284 314 374 404
The final command run is:
convert chinese.gif -chop 0x1+0+44 -chop 0x1+0+73 -chop 0x1+0+132 -chop 0x1+0+161 -chop 0x1+0+190 -chop 0x1+0+249 -chop 0x1+0+278 -chop 0x1+0+307 -chop 0x1+0+366 -chop 0x1+0+395 result.png
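The offset correction can be checked in isolation with pure bash, using the white-line numbers found above: each removed row shifts the remaining rows up by one, so the i-th line number must be reduced by i.

```shell
#!/bin/bash
# Verify the chop-offset correction: subtract the running count of
# already-removed rows from each detected white-line y-coordinate.
wl=(44 74 134 164 194 254 284 314 374 404)   # white lines found above
corrected=()
correction=0
for line in "${wl[@]}"; do
  corrected+=( $((line - correction)) )
  correction=$((correction + 1))
done
echo "${corrected[@]}"
# → 44 73 132 161 190 249 278 307 366 395
```

These are exactly the offsets that appear in the final -chop command.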
If I understand this correctly, you want to find a way of removing the white lines and then still run the image through OCR?
The best way would be by eye: connect the dots, so to speak, so that the strokes of the characters line up.
A programmatic way would be to remove the white line and then duplicate the line above (or below) and shift it into place.
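Another idea to experiment with: instead of chopping the white rows out, bridge them with a morphological close. This is a sketch, not a verified fix: -morphology is ImageMagick-only, and the 1x3 kernel height is a guess (taller gaps would need a taller kernel).

```shell
#!/bin/bash
# Sketch: close thin horizontal white gaps in the strokes with a narrow,
# tall rectangle kernel (1 px wide, 3 px tall).
img=chinese.gif
cmd=(convert "$img" -morphology Close "Rectangle:1x3" filled.png)
echo "${cmd[@]}"
# Uncomment to run for real:
# "${cmd[@]}"
```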
康 家 月 而 视 , 喝 道
" 你 想 做 什 么 !"
秦 微 微 一 笑 , 轻 声 道
不 知 道 看 着 些 亲 死 眼 前 ,
前 辈 会 不 会 有 痛 的 感 觉 。"
说 , 伸 手 一 指 , 一 位 少 妇
身 形 一 顿 , 小 出 现 了 一 个 血 洞
倒 地 身 广 。
康 家 相 又 惊 又 , 痛 声 道
I don't read Chinese, but this is what it machine-translates to:
Kang Jia month and watch, drink
"What do you want to do !"
Qin Weiwei smiled, softly
I don't know. look at some dead eyes. ,
Predecessors will not feel pain ."
And said, stretch out a finger , a young woman.
In The Shape of a meal, a small blood hole appeared
Down to the ground wide.
The Kang family was shocked and sore

add text to image with convert

I am stuck on some image editing using bash commands.
I have 414 images labelled 1file.png 2file.png ... 414file.png. To each of these I would like to add the number, which is already present in the name. The goal is to create a .gif which will show the counts of images in one corner counting up. (Creating the .gif using convert * result.gif works)
The problem is that it does not draw the number in the image but literally prints $i. Does somebody know how to expand the loop variable inside the -draw argument?
I have tried:
for i in {0..9}; do
convert -pointsize 80 -fill black -draw 'text 1650 400 "$i"' "$i"file.png "$i"file.png;
done
Many thanks in advance.
Problem solved — the single quotes around the -draw argument were preventing $i from expanding:
for i in {0..1}; do
convert -pointsize 80 -fill black -draw 'text 1650 400 '\"$i\"'' "$i"file.png "$i"file.png;
done
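The quoting in that fix can be simplified: build the -draw string in double quotes so the shell expands $i, with inner single quotes around the text for ImageMagick. A sketch (loop shortened to three frames for illustration):

```shell
#!/bin/bash
# Sketch: double quotes let $i expand in the shell; the inner single
# quotes are what -draw sees around the text argument.
for i in {1..3}; do            # use {1..414} for the full set
  draw="text 1650 400 '$i'"
  echo convert -pointsize 80 -fill black -draw "$draw" "${i}file.png" "${i}file.png"
done
```

Drop the echo to run the commands for real.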

Creating new eng.tessdata file for custom font in Tesseract giving error

I converted the PDF file into a .tiff, which is pretty straightforward:
convert -depth 4 -density 300 -background white +matte eng.arial.pdf eng.arial.tiff
Then generated box files from the .tiff:
tesseract eng.arial.tiff eng.arial batch.nochop makebox
Then fed the .tiff into Tesseract for training:
tesseract eng.arial.tiff eng.arial.box nobatch box.train.stderr
Then extracted the character set used:
unicharset_extractor *.box
But I am getting this error -
unicharset_extractor:./.libs/lt-unicharset_extractor.c:233: FATAL: couldn't find unicharset_extractor.
It is also happening with mftraining and combine_tessdata.
UPDATE
I ran unicharset_extractor on a single box file and it still doesn't work.
It is not only this command: mftraining, cntraining and combine_tessdata fail the same way.

Determine bit depth of bmp file on os x

How can I determine the bit depth of a bmp file on Mac OS X? In particular, I want to check if a bmp file is a true 24 bit file, or if it is being saved as a greyscale (i.e. 8 bit) image. I have a black-and-white image which I think I have forced to be 24 bit (using convert -type TrueColor), but Imagemagick gives conflicting results:
> identify -verbose hiBW24.bmp
...
Type: Grayscale
Base type: Grayscale
Endianess: Undefined
Colorspace: Gray
> identify -debug coder hiBW24.bmp
...
Bits per pixel: 24
A number of other command-line utilities are no help, it seems:
> file hi.bmp
hi.bmp: data
> exiv2 hiBW24.bmp
File name : hiBW24.bmp
File size : 286338 Bytes
MIME type : image/x-ms-bmp
Image size : 200 x 477
hiBW24.bmp: No Exif data found in the file
> mediainfo -f hi.bmp
...[nothing useful]
If you want a command-line utility, try sips (do not forget to read the manpage with man sips). Example:
Terminal input:
sips -g all /Users/hg/Pictures/2012/03/14/QRCodeA.bmp
Output:
/Users/hg/Pictures/2012/03/14/QRCodeA.bmp
pixelWidth: 150
pixelHeight: 143
typeIdentifier: com.microsoft.bmp
format: bmp
formatOptions: default
dpiWidth: 96.000
dpiHeight: 96.000
samplesPerPixel: 3
bitsPerSample: 8
hasAlpha: no
space: RGB
I think the result contains the values you are after.
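The bit depth follows directly from two of those fields: samplesPerPixel times bitsPerSample.

```shell
#!/bin/bash
# Bit depth from the sips output above: 3 samples/pixel x 8 bits/sample.
samplesPerPixel=3
bitsPerSample=8
echo $((samplesPerPixel * bitsPerSample))   # → 24, i.e. a true-colour file
```

A greyscale 8-bit file would instead report samplesPerPixel: 1.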
Another way is to open the image with Preview.app and then open the info panel.
One of the most informative programs (though not easy to use) is exiftool by Phil Harvey (http://www.sno.phy.queensu.ca/~phil/exiftool/), which also works very well on Mac OS X for a lot of file formats, but it may be overkill for your purpose.
I did this to investigate:
# create a black-to-white gradient and save as a BMP, then `identify` it to a file `unlim`
convert -size 256x256 gradient:black-white a.bmp
identify -verbose a.bmp > unlim
# create another black-to-white gradient but force 256 colours, then `identify` to a second file `256`
convert -size 256x256 gradient:black-white -colors 256 a.bmp
identify -verbose a.bmp > 256
# Now look at difference
opendiff unlim 256
And the difference is that the -colors 256 image has a palette in the header and Class: PseudoClass, whereas the other has Class: DirectClass.
