Detecting text with Google Vision API ignores some characters

I'm using the Google Vision API to recognize text in an image. It usually works fine, but if the text contains closing curly braces }, they are almost always ignored. There is no problem with the opening curly braces { in the same image.
For example the following input (in image format):
{
line1
{
line2
}
line3
}
line4
Produces the following result:
{
line1
{
line2
line3
line4
I'm using DOCUMENT_TEXT_DETECTION.
Any ideas?

Related

How to replace a whole line (between 2 words) using sed?

Suppose I have text as:
This is a sample text.
I have 2 sentences.
text is present there.
I need to replace all the text between the two 'text' words. The required output is:
This is a sample text.
I have new sentences.
text is present there.
I tried using the command below, but it's not working:
sed -i 's/text.*?text/text\
\nI have new sentence/g' file.txt
With your shown samples, please try the following. sed doesn't support lazy matching in regex, but with awk's RS (set to the empty string here, i.e. paragraph mode) you can do the substitution across lines. Create a variable val that holds the new value; a simple substitution in awk will then do the rest to produce your expected output.
awk -v val="your_new_line_Value" -v RS="" '
{
sub(/text\.\n*[^\n]*\n*text/,"text.\n"val"\ntext")
}
1
' Input_file
The above code prints the output to the terminal. Once you are happy with the results and want to save the output into Input_file itself, try the following code.
awk -v val="your_new_line_Value" -v RS="" '
{
sub(/text\.\n*[^\n]*\n*text/,"text.\n"val"\ntext")
}
1
' Input_file > temp && mv temp Input_file
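For comparison, the lazy match that sed lacks can be sketched in Python, whose re module does support `.*?`. This is only an illustration under assumptions: the function name replace_between and its parameters are made up here, and it assumes the first marker ends its line with a dot while the second marker starts a later line, as in the sample.

```python
import re


def replace_between(text: str, marker: str, new_line: str) -> str:
    """Replace everything between a line ending in '<marker>.' and the
    next line that starts with '<marker>' (first occurrence only)."""
    pattern = re.compile(
        r"(" + re.escape(marker) + r"\.\n)"  # group 1: end of the first marker line
        r".*?"                               # lazy: consume as little as possible
        r"(^" + re.escape(marker) + r")",    # group 2: marker at the start of a line
        flags=re.DOTALL | re.MULTILINE,
    )
    # A function is used as the replacement so that backslashes in
    # new_line are never interpreted as group references.
    return pattern.sub(
        lambda m: m.group(1) + new_line + "\n" + m.group(2), text, count=1
    )


sample = (
    "This is a sample text.\n"
    "I have 2 sentences.\n"
    "text is present there.\n"
)
result = replace_between(sample, "text", "I have new sentences.")
print(result, end="")
```

re.DOTALL lets `.` cross newlines and re.MULTILINE lets `^` match at the start of every line, which together play the role that RS="" plays in the awk version.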
You have already solved your problem using awk, but in case anyone else will be looking for a sed solution in the future, here's a sed script that does what you needed. Granted, the script is using some advanced sed features, but that's the fun part of it :)
replace.sed
#!/usr/bin/env sed -nEf
# This pattern determines the start marker for the range of lines where we
# want to perform the substitution. In our case the pattern is any line that
# ends with "text." — the `$` symbol meaning end-of-line.
/text\.$/ {
# [p]rint the start-marker line.
p
# Next, we'll read lines (using `n`) in a loop, so mark this point in
# the script as the beginning of the loop using a label called `loop`.
:loop
# Read the next line.
n
# If the last read line doesn't match the pattern for the end marker,
# just continue looping by [b]ranching to the `:loop` label.
/^text/! {
b loop
}
# If the last read line matches the end marker pattern, then just insert
# the text we want and print the last read line. The net effect is that
# all the previous read lines will be replaced by the inserted text.
/^text/ {
# Insert the replacement text
i\
I have a new sentence.
# [print] the end-marker line
p
}
# [b]ranch to the end of the script (no label given), so that we don't hit
# the [p]rint command below.
b
}
# Print all other lines.
p
Usage
$ cat lines.txt
foo
This is a sample text.
I have many sentences.
I have many sentences.
I have many sentences.
I have many sentences.
text is present there.
bar
$
$ ./replace.sed lines.txt
foo
This is a sample text.
I have a new sentence.
text is present there.
bar
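The control flow of the sed script (print the start marker, skip lines until the end marker, then insert the replacement) can also be written as a small line-by-line state machine. The following Python sketch is an assumption-laden illustration, not a translation of sed semantics: the name replace_range is hypothetical, and the markers are hard-coded to "ends with text." and "starts with text" as in the script.

```python
from typing import Iterable, Iterator


def replace_range(lines: Iterable[str], replacement: str) -> Iterator[str]:
    """Yield lines, replacing everything between the start and end markers."""
    skipping = False
    for line in lines:
        if skipping:
            if line.startswith("text"):   # end marker reached
                yield replacement         # emit the inserted text once
                yield line                # then the end-marker line itself
                skipping = False
            # any other line inside the range is dropped
        else:
            yield line
            if line.endswith("text."):    # start marker: begin skipping
                skipping = True


lines_in = [
    "foo",
    "This is a sample text.",
    "I have many sentences.",
    "I have many sentences.",
    "I have many sentences.",
    "I have many sentences.",
    "text is present there.",
    "bar",
]
lines_out = list(replace_range(lines_in, "I have a new sentence."))
print("\n".join(lines_out))
```

As in the sed script, lines after a start marker are silently dropped if no end marker ever follows.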
Substitute
sed -i 's/I have 2 sentences\./I have new sentences./g' file.txt
sed -i 's/[A-Z]\s[a-z].*/I have new sentences./g' file.txt
Insert
sed -i -e '2iI have new sentences.' -e '2d' file.txt
I need to replace whole text between two 'text' words.
If I understand correctly, the first text. (with a dot) is at the end of the first line and the second text is at the beginning of the third line. With awk you can get the required output by accumulating the lines in the variable s:
awk -v s='\nI have new sentences.\n' '/text.?$/ {s=$0 s;next} /^text/ {s=s $0;print s;s=""}' file
This is a sample text.
I have new sentences.
text is present there.

How to match and delete some if statements from a file based on pattern matching

I have following code
if (temp==1) {
some text
}
some more text
abcdef
if (temp==1) {
some text
}
if (temp2==1) {
some text
}
I need a script or command that deletes all the if (temp==1) statements.
Required output:
some more text
abcdef
if (temp2==1) {
some text
}
What I can already achieve is the following:
grep -zPo "if\ \(temp==1\) (\{([^{}]++)*\})" filename
and i get the following output
if (temp==1) {
some text
}
if (temp==1) {
some text
}
The perl command gives the same result:
perl -l -0777 -ne "print $& while /if \(temp==1\) (\{([^{}]++|(?1))*\})/g" filename
Now I need to delete the matched lines from the file.
So every if (temp2==1) block must be retained and every if (temp==1) block must be deleted.
How can I do this?
What you're asking to do is impossible in general without a parser for whatever language that code is written in, but you can produce the output you want from that specific input using any awk in any OS on any UNIX box with:
awk '/if \(temp==1/{f=1} !f; /}/{f=0}' file
if that's all you want.
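The awk one-liner is a flag-based filter: set a flag on the opening line, print only while the flag is clear, and clear it on the first closing brace. A minimal Python sketch of the same idea follows; the function name drop_blocks and the start parameter are hypothetical, and like the awk version it assumes the blocks are not nested.

```python
from typing import List


def drop_blocks(lines: List[str], start: str = "if (temp==1") -> List[str]:
    """Drop every block that begins on a line containing `start` and ends
    on the next line containing a closing brace."""
    kept_lines = []
    skipping = False
    for line in lines:
        if start in line:        # same role as /if \(temp==1/{f=1}
            skipping = True
        if not skipping:         # same role as !f
            kept_lines.append(line)
        if "}" in line:          # same role as /}/{f=0}
            skipping = False
    return kept_lines


source = [
    "if (temp==1) {",
    "some text",
    "}",
    "some more text",
    "abcdef",
    "if (temp==1) {",
    "some text",
    "}",
    "if (temp2==1) {",
    "some text",
    "}",
]
kept = drop_blocks(source)
print("\n".join(kept))
```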
You probably can use sed to do this:
$ sed '/temp==1/,/}/d' inputfile
some more text
abcdef
if (temp2==1) {
some text
}
The above deletes (with d) all lines between and including the two patterns, /temp==1/ and /}/.
Note: this will not work with nested blocks, as the OP points out in a comment. Following the OP's comment, one could do the following:
$ sed '/temp==1/,/}/d;/}/,/}/d' 1.txt
This removes additional texts and patterns that are between two }s.
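When blocks can be nested, a range between two patterns is no longer enough and the brace depth has to be tracked. The Python sketch below shows one way to do that. It is a heuristic under stated assumptions, not a parser: the name drop_nested_blocks is hypothetical, the opening brace is assumed to sit on the same line as the if, and braces inside strings or comments would be miscounted.

```python
from typing import List


def drop_nested_blocks(lines: List[str], start: str = "if (temp==1)") -> List[str]:
    """Drop each block opened by a line containing `start`, including any
    blocks nested inside it, by tracking the brace depth."""
    kept_lines = []
    depth = 0  # greater than 0 while inside a block that is being deleted
    for line in lines:
        if depth == 0 and start in line:
            # Start deleting; the depth is whatever this line leaves open.
            depth = line.count("{") - line.count("}")
            continue
        if depth > 0:
            depth += line.count("{") - line.count("}")
            continue  # still inside the deleted block, or its closing line
        kept_lines.append(line)
    return kept_lines


nested_source = [
    "if (temp==1) {",
    "  if (inner) {",
    "    some text",
    "  }",
    "  trailing",
    "}",
    "keep me",
    "if (temp2==1) {",
    "  some text",
    "}",
]
kept_nested = drop_nested_blocks(nested_source)
print("\n".join(kept_nested))
```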

Matching parentheses over multiple lines (with awk?)

I want to filter out footnotes from a LaTeX document using a bash script. It may look like either of these examples:
Some text with a short footnote.\footnote{Some \textbf{explanation}.}
Some text with a longer footnote.%
\footnote{Lorem ipsum dolor
sit amet, etc. etc. etc. \emph{along \emph{multiple} lines}
but all lines increased indent from the start.}
The remains should be:
Some text with a short footnote.
Some text with a longer footnote.%
I don't care about extra whitespace.
Since matching parentheses cannot be done with regular expressions, I presume I cannot use sed for this. Is it possible with awk or some other tool?
With GNU awk for multi-char RS and null FS splitting the record into chars:
$ cat tst.awk
BEGIN { RS="[\\\\]footnote"; ORS=""; FS="" }
NR>1 {
    braceCnt=0
    for (charPos=1; charPos<=NF; charPos++) {
        if ($charPos == "{") { ++braceCnt }
        if ($charPos == "}") { --braceCnt }
        if (braceCnt == 0) { break }
    }
    $0 = substr($0,charPos+1)
}
{ print }
$ awk -f tst.awk file
Some text with a short footnote.
Some text with a longer footnote.%
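The brace-counting idea in the awk script does not depend on GNU awk. As a rough sketch, the same scan can be written in Python: find each \footnote{, walk forward while counting braces, and cut the text out once the count returns to zero. The function name strip_footnotes is hypothetical, and escaped braces such as \{ are not treated specially.

```python
def strip_footnotes(text: str, command: str = r"\footnote") -> str:
    """Remove every `command{...}` group, honouring nested braces."""
    pieces = []
    pos = 0
    while True:
        start = text.find(command + "{", pos)
        if start == -1:
            pieces.append(text[pos:])      # no more footnotes: keep the rest
            break
        pieces.append(text[pos:start])     # keep the text before the footnote
        depth = 0
        i = start + len(command)           # index of the opening brace
        while i < len(text):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:             # matching close brace found
                    break
            i += 1
        pos = i + 1                        # resume right after the footnote
    return "".join(pieces)


sample = (
    "Some text with a short footnote.\\footnote{Some \\textbf{explanation}.}\n"
    "Some text with a longer footnote.%\n"
    "\\footnote{Lorem ipsum dolor\n"
    "    sit amet, etc. etc. etc. \\emph{along \\emph{multiple} lines}\n"
    "    but all lines increased indent from the start.}\n"
)
cleaned = strip_footnotes(sample)
print(cleaned, end="")
```

If a footnote is never closed, this sketch discards everything from that \footnote to the end of the input.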
Using a recursive regex in command-line perl, you can match balanced braces like this:
perl -00pe 's/%?\s*\\footnote({(?:[^{}]*|(?-1))*})//g' file
Some text with a short footnote.
Some text with a longer footnote.

How to call a chef function that adds an indented line that ends with a backslash?

I have the following function:
#------------------------------------------------------------------------------
# This function just executes one-liner sed command
# to add "line2" after the line that contains "line1"
# in file "file_name"
#------------------------------------------------------------------------------
def add_line2_after_line1_in_file(line1, line2, file_name, comment)
  bash comment do
    stripped_line2 = line2.strip
    user 'root'
    code <<-EOC
      sed -i '/#{line1}/a #{line2}' #{file_name}
    EOC
    # File.exist? replaces the deprecated File.exists?
    only_if { File.exist?(file_name) && File.readlines(file_name).grep(/#{stripped_line2}/).empty? }
  end
end
How can I call the function to produce this?
Before calling the function
This is a line
Indented line1 \
Indented line2 \
Line3
After calling the function
This is a line
Indented line1 \
Indented line2 \
Added line \
Line3
I tried this to no avail
add_line2_after_line1_in_file("line1", " line2 \\", "file", "comment")
Maybe a better function that does the same thing?
Chef comes with a Chef::Util::FileEdit Ruby class that can do this type of replacement for you.
ruby_block "Add line to file [c:/path/to/your/file]" do
  block do
    file = Chef::Util::FileEdit.new("c:/path/to/your/file")
    file.insert_line_after_match(/ line2 \\/, " line3 \\" )
    file.write_file
  end
  action :run
end
It's generally easier to get along in Chef by doing things in Ruby. You get to leave behind issues like shell escaping and quoting which are no fun.
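Language aside, the insert-after-match logic itself is small, and the escaping problem disappears once the trailing backslash is treated as literal text rather than as part of a shell command. The Python sketch below illustrates the logic only; it is not Chef's implementation. The function name is borrowed from the FileEdit method for readability, and the "already present" guard mirrors the only_if check in the question.

```python
import re
from typing import List


def insert_line_after_match(lines: List[str], pattern: str, new_line: str) -> List[str]:
    """Return a copy of `lines` with `new_line` added after each line that
    matches `pattern`; do nothing if `new_line` is already present."""
    if new_line in lines:          # idempotence guard
        return list(lines)
    result = []
    for line in lines:
        result.append(line)
        if re.search(pattern, line):
            result.append(new_line)
    return result


original = [
    "This is a line",
    "  Indented line1 \\",
    "  Indented line2 \\",
    "Line3",
]
# re.escape makes the trailing backslash a literal character in the pattern.
updated = insert_line_after_match(original, re.escape("line2 \\"), "  Added line \\")
print("\n".join(updated))
```

Calling the function a second time on the result leaves it unchanged, which is the behaviour a Chef resource needs on repeated runs.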
If you have the time, try setting this up as a lightweight resource and provider for each of the methods you use, so that you can reuse the resource easily in any cookbook. Something like:
file_insert_after_line "c:/path/to/your/file" do
  search /sometext/
  insert_text "new line of text"
end

Output each line of text (+ the first line) to its own .txt file VBS-Script

I need a VBS script that writes the first line of data.txt, together with one other line, to a separate output file for each remaining line.
Example of what should happen with my text.txt file:
line1 + line2 to 1.txt
line1 + line3 to 2.txt
line1 + line4 to 3.txt
Thanks in advance,
Best regards,
joe
It's been a long time since I used VBS, but something like this:
Set fso = CreateObject("Scripting.FileSystemObject")
Set src = fso.OpenTextFile("test.txt", 1)
lines = Split(src.ReadAll, vbCrLf)
For i = 1 To UBound(lines)
    Set dst = fso.CreateTextFile(i & ".txt", True)
    dst.WriteLine lines(0)
    dst.WriteLine lines(i)
    dst.Close
Next
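The same loop (keep the first line as a header, then write it plus one further line to a numbered file) is sketched below in Python for readers without a Windows Script Host at hand. The function name split_with_header is hypothetical, and the demo writes into a temporary directory rather than next to test.txt.

```python
import tempfile
from pathlib import Path
from typing import List


def split_with_header(source: Path, out_dir: Path) -> List[Path]:
    """Write '<first line> + <line i+1>' to out_dir/<i>.txt for each
    remaining line of `source`, and return the files written."""
    lines = source.read_text().splitlines()
    if not lines:
        return []
    header, rest = lines[0], lines[1:]
    written = []
    for index, line in enumerate(rest, start=1):
        target = out_dir / f"{index}.txt"
        target.write_text(f"{header}\n{line}\n")
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    src = tmp_dir / "test.txt"
    src.write_text("line1\nline2\nline3\nline4\n")
    files = split_with_header(src, tmp_dir)
    contents = {path.name: path.read_text() for path in files}

for name, body in contents.items():
    print(name, repr(body))
```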
You can write to files using the FileSystemObject. This page shows some sample code for opening and writing to a file: Working with Files
