How to convert an image to text in CodeIgniter - codeigniter

Hi, I want to ask how I can convert an image to text using OCR.
if (isset($_FILES['image'])) {
    $file_name = $_FILES['image']['name'];
    $file_tmp = $_FILES['image']['tmp_name'];
    move_uploaded_file($file_tmp, "image/" . $file_name);
    echo "<h3>Image Upload Success</h3>";
    echo '<img src="' . $file_name . '" style="width:70%">';
    // Run Tesseract on the uploaded image; the result is written to out.txt
    shell_exec('"C:\\Program Files\\Tesseract-OCR\\tesseract" "D:\\xampp\\htdocs\\ci3\\image\\' . $file_name . '" out');
    echo "<br><h3>OCR after reading</h3><br><pre>";
    $myfile = fopen("out.txt", "r") or die("Unable to open file!");
    echo fread($myfile, 4045);
    fclose($myfile);
    echo "</pre>";
}
I wrote this code, but it does not convert the text in the image properly. Is there any solution? Please let me know.
I expected it to work on a vertical image and read the text, but my output reads like this:
The Registration Directorate at the Ministry of Industry, Commerce and Tourism
certifies that the merchant's below details have been registered in accordance with
Decree law No. (27) for the year 2015 of the Commercial Registration.
22/04/2023 GliaiuY! ~,6 Registration 22/04/2007
Date
eroup HORIZON TELECOM SERVICES COMPANY WLL
Name
Commercial
Name
Registration
Type
CR
Status
HORIZON TELECOM SERVICES COMPANY WLL
With Limited Liability Company
ACTIVE
Area 4élaicl!
ABU SAYBA/ a2 5!
P.O.BOX #.ye Road & »b
7325
Block asx
473
Commercial
Address
Activities
Sale and installation of telecommunications equipment and parts
ere! Bylo!
Registration Directorate
QF. 409 Issue 0
* This CR does not permit its holder to practice investment activities on behalf of others.
igiwltzads
(alka,
KINGDOM OF BAHRAIN pues.
Ministry of Industry, ©
Commerce and Tourism R
Solenitl J) 15 bol gd
Commercial Registration Certificate
ell ad oi hath dala yb yleill s Ae licall 8 51} 52 apsaill 6 pla) agts
GSM Dasa GLE; 2015 Aid (27) aby cy silds p pes pall Cady alld g oLisi atltly Aba uell
Judll 6 Registration == 4908 - 1
aad CYL) Glesal 6 5 jl st 4S 8 Ac gore! pul
aed VLA) Las) Oy jolt ASS ole usd
Ba gdare Aud gious IS AS ph andl ¢ 93
dads aud Le
Flat/Shop No. J=«/4a4
11
Building +
608 Gola ol gual
Woke abby SYLSIYI Glace af jlo
wsdl Ul gal pletion! LU 49) jo: 4ualial jin Y all lhe *
Issued Date: 20/04/2022 Page 1 of 1
}
Z/
Please post this certificate at a visible place.
Tel: +973 80001700 - www. sijilat.bh - www.moic.gov.bh
boat! S12 Sol GIS Bolg S! che 5 oe
But I need each column to be read separately so that the text comes out in a proper format.
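One thing that may be worth experimenting with is Tesseract's page segmentation mode and language selection, since the certificate appears to mix English and Arabic columns. The sketch below is in Python rather than PHP and only builds the command line. The `--psm` and `-l` options exist in Tesseract 4 and later; the helper name `build_tesseract_command`, the image path, and the particular values (`4`, `ara+eng`) are assumptions to try out, and `ara` requires the Arabic language data to be installed.

```python
def build_tesseract_command(image_path, output_base, languages="eng", psm=3,
                            exe="tesseract"):
    """Return the argument list for one Tesseract run (hypothetical helper).

    psm 3 is Tesseract's default (fully automatic page segmentation);
    psm 4 assumes a single column of text of variable sizes;
    psm 6 assumes a single uniform block of text.
    """
    return [exe, image_path, output_base, "-l", languages, "--psm", str(psm)]


# Assumed example values: an uploaded certificate with English and Arabic text.
cmd = build_tesseract_command("image/certificate.png", "out",
                              languages="ara+eng", psm=4)
print(" ".join(cmd))
```

The same arguments could be appended to the string passed to `shell_exec` in the PHP code. Whether a different mode actually separates the columns depends on the image, so it is a setting to test rather than a guaranteed fix.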

Related

Removing categories with patsy and statsmodels

I am using statsmodels and patsy for building a logistic regression model. I'll use pseudocode here. Let's assume I have a dataframe containing a categorical variable, say Country, with 200 levels. I have reasons to believe some of them would be predictive, so I build a model as in
formula = 'outcome ~ C(Country)'
patsy splits Country into its levels and the model is built using all countries. I then see that the coefficient for GB is high, so I want to remove only GB. Can I do something like this in patsy:
formula = 'outcome ~ C(Country) - C(Country)[GB]'
I tried and it did not change anything.
I don't know if there is a way to subset a Category with patsy formula, but you can do it in the DataFrame.
For example
import numpy as np
import pandas as pd
import statsmodels.api as sm
# sample data
size = 100
np.random.seed(1)
countries = ['IT', 'UK', 'US', 'FR', 'ES']
df = pd.DataFrame({
'outcome': np.random.random(size),
'Country': np.random.choice(countries, size)
})
df['Country'] = df.Country.astype('category')
print(df.Country)
0 ES
1 IT
2 UK
3 US
4 UK
..
95 FR
96 UK
97 ES
98 UK
99 US
Name: Country, Length: 100, dtype: category
Categories (5, object): ['ES', 'FR', 'IT', 'UK', 'US']
Let us suppose we want to remove Category "US"
# create a deep copy excluding 'US'
_df = df[df.Country!='US'].copy(deep=True)
print(_df.Country)
0 ES
1 IT
2 UK
4 UK
5 ES
..
94 UK
95 FR
96 UK
97 ES
98 UK
Name: Country, Length: 83, dtype: category
Categories (5, object): ['ES', 'FR', 'IT', 'UK', 'US']
Even if there are no more elements with category "US" in the DataFrame, the category is still there. If we use this DataFrame in a statsmodels model, we'd get a singular matrix error, so we need to remove unused categories
# remove unused category 'US'
_df['Country'] = _df.Country.cat.remove_unused_categories()
print(_df.Country)
0 ES
1 IT
2 UK
4 UK
5 ES
..
94 UK
95 FR
96 UK
97 ES
98 UK
Name: Country, Length: 83, dtype: category
Categories (4, object): ['ES', 'FR', 'IT', 'UK']
and now we can fit a model
mod = sm.Logit.from_formula('outcome ~ Country', data=_df)
fit = mod.fit()
print(fit.summary())
Optimization terminated successfully.
Current function value: 0.684054
Iterations 4
Logit Regression Results
==============================================================================
Dep. Variable: outcome No. Observations: 83
Model: Logit Df Residuals: 79
Method: MLE Df Model: 3
Date: Sun, 16 May 2021 Pseudo R-squ.: 0.01179
Time: 22:43:37 Log-Likelihood: -56.776
converged: True LL-Null: -57.454
Covariance Type: nonrobust LLR p-value: 0.7160
=================================================================================
coef std err z P>|z| [0.025 0.975]
---------------------------------------------------------------------------------
Intercept -0.1493 0.438 -0.341 0.733 -1.007 0.708
Country[T.FR] 0.4129 0.614 0.673 0.501 -0.790 1.616
Country[T.IT] -0.1223 0.607 -0.201 0.840 -1.312 1.068
Country[T.UK] 0.1027 0.653 0.157 0.875 -1.178 1.383
=================================================================================
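The remark about the singular matrix can be made concrete with a small, library-free sketch. It builds treatment-coded dummy columns (one 0/1 column per level except the reference level) and shows that a level with no rows produces an all-zero column, which leaves the design matrix rank deficient. The helper `dummy_columns` and the toy rows are illustrative assumptions, not patsy's actual implementation.

```python
def dummy_columns(values, levels):
    """Treatment-code `values`: one 0/1 column per level except the first (reference)."""
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels[1:]}


levels = ["ES", "FR", "IT", "UK", "US"]
rows = ["ES", "IT", "UK", "UK", "FR"]  # the 'US' rows have already been filtered out

# 'US' is still a declared level, so it gets a column, and that column is all zeros.
with_unused = dummy_columns(rows, levels)
empty = [lvl for lvl, col in with_unused.items() if not any(col)]
print(empty)

# Dropping unused levels first (the job of remove_unused_categories) avoids that.
used_levels = [lvl for lvl in levels if lvl in rows]
without_unused = dummy_columns(rows, used_levels)
print(sorted(without_unused))
```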

Apache PIG - How to get the Flop 10 data records?

I have data records like this:
Name customerID revenue(Mio) premium
Michael James 078932832 2.7 y
Susan Miller 024383490 3.9 n
John Cooper 021023023 2.1 y
How do I get, for each value of the premium flag, the 10 records with the lowest revenue (the "Flop 10")?
The result should be given as:
Nr Name customerID revenue(Mio) premium
1 John Cooper 021023023 2.1 y
2 Michael James 078932832 2.7 y
3 Andrew Murs 044834399 3.0 y
. ... ..... ... .
10 th entry with flag y
1 Susan Miller 024383490 3.9 n
. ... ..... ... .
10 th entry with flag n
As you see the list is ordered ascending (beginning with the lowest revenue).
I guess you should use split
Considering A is your load statement
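A = LOAD 'data' AS (Name:chararray, customerID:chararray, revenue:double, premium:chararray);
SPLIT A INTO PRE IF premium == 'y', NONPRE IF premium == 'n';
C = ORDER PRE BY revenue ASC;
D = ORDER NONPRE BY revenue ASC;
-- keep only the 10 lowest-revenue records of each group
C10 = LIMIT C 10;
D10 = LIMIT D 10;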
Disclaimer: Be careful while using split as null records get dropped. I have not compiled this code.
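The intended result (the ten lowest-revenue records per premium flag, numbered from 1) can be sketched in plain Python to check the logic before translating it to Pig. The function name `flop_n` and the tuple layout are assumptions made for illustration; the fourth sample row is taken from the expected output in the question.

```python
from collections import defaultdict


def flop_n(records, n=10):
    """Return {flag: [(nr, name, customer_id, revenue), ...]} with the n lowest revenues per flag."""
    groups = defaultdict(list)
    for name, customer_id, revenue, premium in records:
        groups[premium].append((name, customer_id, revenue))
    result = {}
    for flag, rows in groups.items():
        lowest = sorted(rows, key=lambda r: r[2])[:n]  # ascending by revenue
        result[flag] = [(nr, *row) for nr, row in enumerate(lowest, start=1)]
    return result


records = [
    ("Michael James", "078932832", 2.7, "y"),
    ("Susan Miller", "024383490", 3.9, "n"),
    ("John Cooper", "021023023", 2.1, "y"),
    ("Andrew Murs", "044834399", 3.0, "y"),
]
for flag, rows in flop_n(records).items():
    for row in rows:
        print(*row, flag)
```

Unlike a split on `'y'` and `'n'`, this grouping keeps any other flag value as its own group; records with a missing flag would need an explicit decision either way.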

bash awk get numbers in two digits

I want to correct wrong metadata or add missing metadata for the 75 CDs I have ripped from disc.
I got the track info from AllMusic and stripped it down to almost usable "CSV" data.
Number";"1";"Piece";"Nocturne for piano No. 2 in E flat major, Op. 9/2, CT. 109";"Componist";"Frédéric Chopin
MainPiece";"";"Piece";"Symphony No. 9 in E minor ("From the New World"), B. 178 (Op. 95) (first published as No. 5)
Number";"2";"Piece";"Largo";"Componist";"Antonin Dvorák
Number";"3";"Piece";"La plus que lente, waltz for piano (or orchestra), L. 121";"Componist";"Claude Debussy
Number";"4";"Piece";"Waldesrauschen (Forest Murmurs), for piano (Zwei Konzertetuden No. 1), S. 145/1 (LW A218/1)";"Componist";"Franz Liszt
MainPiece";"";"Piece";"Oboe Concerto, for oboe, strings & continuo in D minor, Op. 8/9, RV 454
Number";"5";"Piece";"Allegro";"Componist";"Antonio Vivaldi
Number";"6";"Piece";"Largo";"Componist";"Antonio Vivaldi
Number";"7";"Piece";"Allegro";"Componist";"Antonio Vivaldi
MainPiece";"";"Piece";"Cello Concerto in A major, G. 475
Number";"8";"Piece";"1. Allegro";"Componist";"Luigi Boccherini
Number";"9";"Piece";"2. Adagio";"Componist";"Luigi Boccherini
Number";"10";"Piece";"3. Rondò - Allegro";"Componist";"Luigi Boccherini
MainPiece";"";"Piece";"Serenade No. 12 for winds in C minor ("Nacht Musique"), K. 388 (K. 384a)
Number";"11";"Piece";"Allegro";"Componist";"Wolfgang Amadeus Mozart
Number";"12";"Piece";"Liebesträume, notturno for piano No. 3 in A flat major ("O Lieb, so lang du lieben kannst"), S. 541/3 (LW A103/3)";"Componist";"Franz Liszt
MainPiece";"";"Piece";"Phantasiestücke (4) for violin, cello & piano in A minor, Op. 88
Number";"13";"Piece";"Romanze";"Componist";"Robert Schumann
MainPiece";"";"Piece";"Sinfonia Concertante for violin, cello, oboe, bassoon & orchestra, H. 1/105
Number";"14";"Piece";"Andante";"Componist";"Franz Joseph Haydn
I would like to use awk to turn this into a script that sets the metadata:
eyeD3 -n 01 -a composer -t mainpiece piece 01*.mp3
and also to rename the files:
mv 01*.mp3 01 [composer] mainpiece piece.mp3
The mainpiece / piece part is manual work, but I would like to rewrite 1 as 01.
I found something with printf ("%2d" ,$1,$2), but this complains about .mp3.
Does anyone have suggestions for me?
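For the two-digit number, the format string is the key part: `%02d` pads with zeros to width two, whereas `%2d` pads with spaces (awk's `printf` accepts `%02d` as well). Below is a minimal Python sketch that parses the sample lines and prints one `eyeD3` command per track. The function names are hypothetical, the `-n`, `-a` and `-t` options are used exactly as in the command shown in the question, and the fields are assumed to be separated by `";"` as in the sample. The main piece is left out because the question describes it as a manual step.

```python
import shlex


def parse_fields(line):
    """Split a line such as  Number";"1";"Piece";"Largo  into a key -> value dict."""
    parts = line.strip().split('";"')
    return dict(zip(parts[0::2], parts[1::2]))


def eyed3_command(line):
    """Build an eyeD3 command for a 'Number' line; return None for other lines."""
    fields = parse_fields(line)
    if not fields.get("Number"):
        return None
    track = "%02d" % int(fields["Number"])  # 1 becomes "01", 10 stays "10"
    args = ["eyeD3", "-n", track,
            "-a", fields.get("Componist", ""),
            "-t", fields.get("Piece", "")]
    # Quote each argument; the glob is appended unquoted so the shell expands it.
    return " ".join(shlex.quote(a) for a in args) + " " + track + "*.mp3"


lines = [
    'Number";"2";"Piece";"Largo";"Componist";"Antonin Dvorák',
    'MainPiece";"";"Piece";"Cello Concerto in A major, G. 475',
    'Number";"10";"Piece";"3. Rondò - Allegro";"Componist";"Luigi Boccherini',
]
for line in lines:
    cmd = eyed3_command(line)
    if cmd:
        print(cmd)
```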

using variables in gsub

I have a variable address which for now is a long string containing some unnecessary info, e.g.: "Aboriginal Relations 11th Floor Commerce Place 10155 102 Street Edmonton AB T5J 4G8 Phone 780 427-9658 Fax 780 644-4939 Email gerry.kushlyk#gov.ab.ca"
Aboriginal Relations is in a variable called title, and I'm trying to call address.gsub!(title,''), but it's returning the original string.
I've also tried address.gsub!(/#{title}/,'') and address.gsub!("#{title}",'') but those won't work either. Any ideas?
Sorry, the typo occurred when I typed it into Stack Overflow. Here's the code and the output, copied and pasted:
(this is within a loop, so there will be multiple outputs)
p title
address.gsub!(title,'')
p address
output
"Aboriginal Relations "
"Aboriginal Relations 11th Floor Commerce Place 10155 102 Street Edmonton AB T5J 4G8 Phone 780 427-9658 Fax 780 644-4939 Email gerry.kushlyk#gov.ab.ca"
"Aboriginal Tourism Advisory Council "
"Aboriginal Tourism Advisory Council 5th Floor Terrace Building 9515 107 Street Edmonton AB T5K 2C3 Phone 780 427-9687 Fax 780 422-7235 Email foip.fintprccs#gov.ab.ca"
"Acadia Foundation "
"Acadia Foundation PO Box 96 Oyen AB T0J 2J0 Phone 403 664-3384 Fax 403 664-3316 Email acadiafoundation#telus.net"
"Access Advisory Council "
"Access Advisory Council 12th Floor Centre West Building 10035 108 Street Edmonton AB T5J 3E1 Phone 780 427-2805 Fax 780 422-3204 Email barb.joyner#gov.ab.ca"
"ACCM Benevolent Association "
"ACCM Benevolent Association Suite 100 9403 95 Avenue Edmonton AB T6C 4M7 Phone 780 468-4648 Fax 780 468-4648 Email accmmanor#shaw.ca"
"Acme Municipal Library "
"Acme Municipal Library PO Box 326 Acme AB T0M 0A0 Phone 403 546-3845 Fax 403 546-2248 Email aamlibrary#marigold.ab.ca"
Likewise, if I try address.match(/#{title}/) I get nil.
I'm assuming you're using Ruby 1.9 or higher.
It's possible that the trailing whitespace is a non-breaking space:
p "Relations\u00a0" # looks like a trailing space, but strip won't remove it
To get rid of it:
"Relations\u00a0".gsub!(/^\u00a0|\u00a0$/, '') # => "Relations"
A more generic solution for all Unicode whitespace:
"Relations\u00a0".gsub!(/^[[:space:]]|[[:space:]]$/, '') # => "Relations"
To see what the character is in your case:
title[-1].ord # => 160 (example only)
'%x' % title[-1].ord # => "a0" (hex equivalent; example only)
title = title[0..-2] seemed to solve it. For some reason strip and chomp wouldn't work.
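The non-breaking-space explanation is easy to reproduce. The sketch below uses Python rather than Ruby, with an assumed title that ends in U+00A0 and an address that contains an ordinary space at the same position. Note that Python's `str.strip()` and `\s` already treat U+00A0 as whitespace, unlike the Ruby `strip` behaviour reported above.

```python
import re

title = "Aboriginal Relations\u00a0"  # assumed: ends with a non-breaking space
address = "Aboriginal Relations 11th Floor Commerce Place"

print(ord(title[-1]))                          # code point of the last character
print(address.replace(title, "") == address)   # True means nothing was removed


def normalise(text):
    """Collapse every run of Unicode whitespace to one ASCII space, then trim."""
    return re.sub(r"\s+", " ", text).strip()


clean_title = normalise(title)
remainder = normalise(address).replace(clean_title, "", 1).strip()
print(remainder)
```

Normalising both strings before comparing avoids relying on the last character being the only odd one, which is what `title[0..-2]` assumes.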

Preview label written in Eltron Programming Language EPL?

I have code produced by our proprietary system in the Eltron Programming Language (EPL).
This is sent to Eltron/Zebra label printers to be printed.
Is there some kind of software that would allow me to interpret this code to do some form of 'print preview'?
I am considering developing a way to convert this into an image or even PostScript/PDF, but I am struggling with how to do the barcodes (the lines starting with B are barcodes).
N
Q296,24
R132,0
S2
D9
ZB
A3,2,0,3,1,1,N,"RB10SS5"
B3,22,0,2C,2,4,35,N,"391369840"
A3,60,0,3,1,1,N,"391369840"
A3,80,0,3,1,1,N,"Testing"
A3,100,0,4,1,1,N,"Serology"
A3,130,0,1,1,1,N,"SSTORE"
A185,16,0,1,1,1,N,"17 Mar"
A185,35,0,1,1,1,N,"SEROL"
A185,51,0,1,1,1,N,"0.50"
B400,208,0,2C,2,4,40,N,"391369840"
A400,254,0,2,1,1,N,"391369840"
P1
Zebra doesn't provide software to do this for EPL labels, but if you have a ZPL/EPL printer and convert the labels to ZPL, you can use the printer's web page to view the label once it's on the printer.
I8,A,001
Q200,024
q448
rN
S4
D7
ZT
JF
O
R24,0
f100
N
B380,179,2,1,2,6,40,N,"<parcel_name>"
A355,107,2,4,1,1,N,"<model> <color> <clarity> Cut:<cut>"
A355,80,2,4,1,1,N,"Dpt:<depth> Tbl:<table_size>"
A355,54,2,4,1,1,N,"Pol:<polish> Sym:<symmetry> Fl:<flouresence>"
A355,27,2,4,1,1,N,"<gemology_institute_name> <certificate_number>"
A355,134,2,4,1,1,N,"<parcel_name_formatted> <carat>"
P1
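For the home-grown preview idea raised in the question, a first step could be to parse the A (text) and B (barcode) lines into structured fields that a drawing routine can consume. The sketch below is in Python. The field names reflect my reading of the EPL2 command layout and should be checked against the printer manual, and `parse_epl` is a hypothetical helper. Rendering the bars themselves is not covered.

```python
import csv

# Assumed field order for the A and B commands; verify against the EPL2 manual.
TEXT_FIELDS = ["x", "y", "rotation", "font", "h_mult", "v_mult", "reverse", "data"]
BARCODE_FIELDS = ["x", "y", "rotation", "symbology", "narrow", "wide",
                  "height", "human_readable", "data"]


def parse_epl(source):
    """Return a list of dicts for the A (text) and B (barcode) lines; skip the rest."""
    items = []
    for line in source.splitlines():
        line = line.strip()
        if not line or line[0] not in "AB":
            continue
        names = TEXT_FIELDS if line[0] == "A" else BARCODE_FIELDS
        values = next(csv.reader([line[1:]]))  # handles the quoted data field
        if len(values) != len(names):
            continue  # not the shape this sketch expects
        item = dict(zip(names, values), kind="text" if line[0] == "A" else "barcode")
        item["x"], item["y"] = int(item["x"]), int(item["y"])
        items.append(item)
    return items


label = '''N
Q296,24
A3,2,0,3,1,1,N,"RB10SS5"
B3,22,0,2C,2,4,35,N,"391369840"
P1
'''
for item in parse_epl(label):
    print(item["kind"], item["x"], item["y"], item["data"])
```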
