Using Bash, how can I retrieve text between parentheses?

I've searched this site for this very question, but the answers either use regex (which I have no clue how to use in Bash) or address a problem that isn't similar enough for me to adapt the examples.
Basically, I have an HTML file that has the info I need in a set of parentheses.
For example:
Merry Christmas (english)
Feliz Navidad (spanish)
I'm trying to take the data from the HTML and either put it into a string or echo it out to a file for comparison.
Any suggestions?

Your question is vague, so it's hard to tell exactly what your requirements are, but the following command will find and print every parenthesized piece of text in the file, parentheses included:
grep -oP '\([^)]+\)' input.html
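A minimal sketch of the command above on the two sample lines from the question, assuming GNU grep built with PCRE support (-P). The temporary file and the variable names are made up for illustration. The second pattern is a variation, not part of the original answer: \K discards the opening parenthesis from the reported match, and the lookahead (?=\)) requires the closing one without including it.

```shell
#!/usr/bin/env bash
# Hypothetical sample input standing in for the asker's HTML file.
sample=$(mktemp)
printf '%s\n' 'Merry Christmas (english)' 'Feliz Navidad (spanish)' > "$sample"

# The command from the answer: matches include the parentheses.
with_parens=$(grep -oP '\([^)]+\)' "$sample")

# Variation: report only the text inside the parentheses.
inner=$(grep -oP '\(\K[^)]+(?=\))' "$sample")

printf '%s\n' "$with_parens" "$inner"
rm -f "$sample"
```

Capturing the output with $( ... ) puts the matches into a string, one match per line, which covers the "put it into a string" part of the question.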

Related

Format PART of a specific word in InDesign

There's a unique, made-up word in a book I am editing. I need to italicize the first three letters of this word every time it occurs.
So far I have determined that GREP styles are my best shot at automatically formatting this word, but I have not been able to create a GREP string that works. Any help would be welcome!
Edit:
I managed to get a working GREP query, but this only works for me in the Find/Change dialog. I believe that these GREP strings need to be written a little differently depending on where they are used in the program...
By the way, the specific word I am looking for is youniverse. I need the "you" part to always be italicized.
My current working Find/Change GREP query is:
you(?=niverse)
This is a basic way to get the result I am looking for. Ideally this would be a GREP Style in my main paragraph style so I could procedurally apply this style every time the word occurs
If you need to match the first "Dan" of DankFarrik use this in your GREP search and apply appropriate character style:
(Dan|DankFarrik)
You can also try this one
((Dan)(?=kFarrik))

How to read out value from website in bash?

I want to read out and later process a value from a website (Facebook Ads) from a bash script that runs daily. Unfortunately I need to be logged in to get this value:
So far I've figured out how to log into this website on Firefox and save the html file where the value could theoretically be read out:
The only unique identifier in this file is the first instance of "Gesamtausgaben". Is there any way with this information to cut out everything besides "100,10" ?
I'd also be happy for a different kind of way to get this value. And no, I don't have any API access.
I appreciate all ideas.
Thanks,
Patrick
How to Parse HTML (Badly) with PCRE
You can't reliably parse HTML with just regular expressions, so you'll need an XML/HTML or XPATH parser to do this properly. That said, if you have a PCRE-compatible grep then the following will likely work provided the HTML is minified and the class isn't re-used on your page.
$ pcregrep -o 'span class=".*_3df[ij].*>\K[^<]+' foo.html
100,10 €
If your target HTML spreads across multiple lines, or if you have multiple spans with the same classes assigned, then you'll have to do some work to refine the regular expression and differentiate between which matches are important to you. Context lines or subsequent matches may be helpful, but your mileage will definitely vary.
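As a rough sketch of the same approach using the one anchor the question mentions, the first occurrence of "Gesamtausgaben": the HTML line below is invented for illustration (the real markup is not shown in the question), and GNU grep with -P is assumed in place of pcregrep.

```shell
#!/usr/bin/env bash
# Invented, minified stand-in for the saved page; the real markup will differ.
page=$(mktemp)
printf '%s\n' '<div>Gesamtausgaben</div><span class="_3dfi _3dfj">100,10 EUR</span>' > "$page"

# After the label, skip lazily to the next <span ...>, drop everything matched
# so far with \K, and keep only digits, dots, and commas.
# head -n 1 keeps only the first match, in case the label occurs again later.
amount=$(grep -oP 'Gesamtausgaben.*?<span[^>]*>\K[0-9.,]+' "$page" | head -n 1)

printf '%s\n' "$amount"
rm -f "$page"
```

This only works while the label and the value sit on the same line; for multi-line markup the caveats above apply unchanged.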

Filter out HTML code with grep

I am working on a project using a bash shell script. The idea is to grep a wget retrieved page, in order to pick up a certain paragraph on the web page. The area I would like to copy, usually starts with a
<p><b>
but the paragraph also contains other bits of HTML code, such as anchor tags, that I don't want to be in the output of the grep.
I have tried
cat page.html| grep "<p><b>" >grep.txt
and then I grep the output file, which now contains the paragraph I want
cat grep.txt|grep -v '<p>|<b>|<a>' >grep.txt
but then all it does is clear everything from the file and not read anything. How can I get it to exclude only the HTML code?
I am also trying to follow the links in the paragraph that I grep, in order to do the same thing with those pages. This only needs to go 2 levels deep: the main page, and then whatever subpage(s) stem from the first paragraph of the main page. I know this is a difficult idea; hopefully I explained it well enough to get some help. Any ideas or help are appreciated.
Do you have to do this in bash? It seems to me that Python would lend itself to this problem, in particular a library called Beautiful Soup.
I've used this for parsing HTML in the past and it's the easiest tool I could find. It has good documentation for dealing with html.
Perhaps you could write a standalone Python script that parses the HTML and then echoes the string you're after. The Python script could then be called from inside your bash script if you have some bash functions you want to perform on the string.
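If the job has to stay in the shell, here is a rough sketch, not a real HTML parser. It assumes the wanted paragraph sits on a single line, uses an invented sample page, and keeps results in variables rather than writing back into the file being read (redirecting a pipeline's output to its own input file, as in cat grep.txt | ... > grep.txt, truncates the file before it is read). Note also that grep -v drops whole matching lines; removing only the tags takes a substitution such as sed's.

```shell
#!/usr/bin/env bash
# Invented sample page; real pages are rarely this tidy.
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<p>Intro line</p>
<p><b>Title</b> some <a href="sub1.html">linked</a> text</p>
</body></html>
EOF

# Keep only the first line containing the <p><b> marker.
para=$(grep -m 1 '<p><b>' "$page")

# Strip every <...> tag from that line, leaving the plain text.
text=$(printf '%s\n' "$para" | sed 's/<[^>]*>//g')

# Collect the link targets from the same paragraph for a second pass.
links=$(printf '%s\n' "$para" | grep -o 'href="[^"]*"' | cut -d'"' -f2)

printf '%s\n' "$text" "$links"
rm -f "$page"
```

Each entry in links could then be fetched with wget and run through the same steps to cover the second level.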
I know this is 7 years old but just posting solution I have with bash
https://api.jquery.com/jquery.grep/

How to read text between two particular text in unix shell scripting

I want to read text between two particular words from a text file in unix shell scripting.
For example in the following:
"My name is Sasuke Uchiha."
I want to get Sasuke.
This is one of the many ways it can be done:
To capture text between "is" and "Uchiha":
sed -n 's/^.*is \(.*\) Uchiha.*/\1/p' inFile
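A small sketch of the same idea on the sample sentence. The variable names are invented; the sed pattern keeps the space before "Uchiha" outside the capture group so the result has no trailing space. The second variant uses only bash parameter expansion and is an alternative, not part of the answer above.

```shell
#!/usr/bin/env bash
line='My name is Sasuke Uchiha.'

# sed: capture whatever lies between "is " and " Uchiha".
name_sed=$(printf '%s\n' "$line" | sed -n 's/^.*is \(.*\) Uchiha.*/\1/p')

# Pure bash: strip everything up to the first "is ",
# then everything from the last " Uchiha" onward.
tmp=${line#*is }
name_bash=${tmp% Uchiha*}

printf '%s\n' "$name_sed" "$name_bash"
```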
I'm tempted to add a "let me google that for you" link, but it seems like you're having a hard enough time as is.
What's the best way to find a string/regex match in files recursively? (UNIX)
Take a look at that. It's similar to what you're looking for. Regex is the go to tool for matching strings and such. And Grep is the easiest way to use it from shell in unix.
Take a look at this as well: http://www.robelle.com/smugbook/regexpr.html

Great tools to find and replace in files? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
I'm switching from a Windows PHP-specific editor to VIM, on the philosophy of "use one editor for everything and learn it really well."
However, one feature I liked in my PHP editor was its "find and replace" capability. I could approach things two ways:
Just find. Search all files in a project for a string, see all the occurrences listed, and click to dive into that file at that line.
Blindly replace all occurrences of "foo" with "bar".
And of course I could use the GUI to say what types of files, whether to look in subfolders, whether it was case sensitive, etc.
I'm trying to approximate this ability now, and trying to piece it together with bash is pretty tedious. Doable, but tedious.
Does anybody know any great tools for things like this, for Linux and/or Windows? (I would really prefer a GUI if possible.) Or failing that, a bash script that does the job well? (If it would list file names and line numbers and show code snippets, that would be great.)
Try sed. For example:
sed -i -e 's/foo/bar/g' myfile.txt
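A hedged sketch of that command on a throwaway file. The directory, file name, and contents are invented; the -i.bak form (suffix attached directly to -i) keeps a backup copy and is accepted by both GNU and BSD sed.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'foo one\nno match\nfoo and foo\n' > "$workdir/myfile.txt"

# Replace every "foo" with "bar" in place, keeping the original as myfile.txt.bak.
sed -i.bak -e 's/foo/bar/g' "$workdir/myfile.txt"

after=$(cat "$workdir/myfile.txt")
before=$(cat "$workdir/myfile.txt.bak")
printf '%s\n' "$after"
rm -rf "$workdir"
```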
Vim has multi-file search built in using the command :vimgrep (or :grep to use an external grep program - this is the only option prior to Vim 7).
:vimgrep will search through files for a regex and load a list of matches into a buffer - you can then either navigate the list of results visually in the buffer or with the :cnext and :cprev commands. It also supports searching through directory trees with the ** wildcard. e.g.
:vimgrep "^Foo.*Bar" **/*.txt
to search for lines starting with Foo and containing Bar in any .txt file under the current directory.
:vimgrep uses the 'quickfix' buffer to store its results. There is also :lvimgrep which uses a local buffer that is specific to the window you are using.
Vim does not support multi-file replace out of the box, but there are plugins that will do that too on vim.org.
I don't get why you can't do this with VIM.
Just Find
/Foo
Highlights all instances of Foo in the file and you can do what you want.
Blindly Replace
:% s/Foo/Bar/g
Obviously this is just the tip of the iceberg. You have lots of flexibility of the scope of your search and full regex support for your term. It might not work exactly like your former editor, but I think your original 'use one editor' idea is a valid one.
Notepad++ allows me to search and replace in an entire folder (and subfolders), with regex support.
You can use perl in command prompt to replace text in files.
perl -p -i".backup" -e "s/foo/bar/g" test.txt
Since you are looking for a GUI tool, I generally use the following 2 tools. Both have great functionality, including wildcard matching, regex, file type filters, etc. Both display useful information about each hit, such as the file name and line.
Visual Studio: fast yet powerful. I use it if the number of files is huge (say, tens of thousands...)
pspad: lightweight. A good feature of find/replace in pspad is that it organizes hits from different files in a tree hierarchy, which is very clear.
There are a number of tools that you can use to make things easier. Firstly, to search all the files in the project from vim you can use :grep like so:
:grep -r 'Function1' myproject/
This essentially runs a grep and lets you quickly jump from/to locations where it has been found.
Ctags is a tool that finds declarations in your code and then allows vim to jump to these declarations. To do this, run ctags and then place your cursor over a function call and then use Ctrl-]. Here is a link with some more ctags information:
http://www.davedevelopment.co.uk/2006/03/13/vim-ctags-and-php-5/
I don't know if it is an option for you, but if you load all your files into vim with
vim *.php
then you can
:set hidden
:argdo %s/foo/bar/ge => will execute the substitute command in all opened buffers (the e flag keeps a file with no match from stopping the run)
:wall => will write all opened buffers
Or instead of loading all your files into vim try :help vimgrep and a combination of :help argdo and :help argadd
For Windows, I think that grepWin is hard to beat -- a GUI to a powerful and flexible grep tool for Windows. It searches, and replaces, knows about regular expressions, that sort of stuff.
Look into sed: a powerful command-line tool that should accomplish most of what you're looking for. It supports regex, so your find/replace is quite easy.
(man sed)
Notepad++ has support for syntax highlighting in many languages and supports find and replace across all open files with regex and basic \n \r \t support.
The command grep -rn "search terms" * will search for the specified terms in all files (including those in sub-directories) and will return matching lines including file name and line number. Armed with this info, it is easy to jump to a particular file/line in VIM.
As was mentioned before, sed is extremely powerful for doing find-and-replace.
You can run both of these tools from inside VIM as well.
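A sketch combining the two tools on an invented project tree: grep -rn for the "just find" case, then grep -rl feeding sed for the "blindly replace" case. The -Z and -0 options (NUL-separated file names, safe for names with spaces) are assumed to be available, as in GNU grep and xargs; all paths and file contents are made up.

```shell
#!/usr/bin/env bash
# Hypothetical throwaway project tree for the demonstration.
proj=$(mktemp -d)
mkdir -p "$proj/sub"
printf 'echo "foo";\n' > "$proj/a.php"
printf 'line one\n$x = "foo";\n' > "$proj/sub/b.php"
printf 'nothing here\n' > "$proj/c.php"

# Just find: file name, line number, and matching line for every hit.
# sort makes the order stable, since grep -r follows directory order.
hits=$(cd "$proj" && grep -rn 'foo' . | sort)
printf '%s\n' "$hits"

# Blind replace: list the files containing the string, then edit each in place,
# leaving a .bak copy of every changed file.
grep -rlZ 'foo' "$proj" | xargs -0 sed -i.bak 's/foo/bar/g'

# Count files (ignoring the backups) that still contain the old string.
remaining=$(grep -rl --exclude='*.bak' 'foo' "$proj" | wc -l)
rm -rf "$proj"
```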
Some developers I currently work with swear by Textpad. It has a UI and also supports regexes: everything you're looking for and more.
A very useful search tool is ack. (Ubuntu refers to it as "ack-grep" in the repositories and man pages.)
The short version of what it does is a combination of find and grep that's more powerful and intelligent than that pair.
