How to find a word and copy the following word with the shell (Ubuntu)? - shell

Is there a possibility to find a word in a file and then copy the following word?
Example:
abc="def"
bla="no_need"
line_i_need="information_i_need"
still_no_use="blablabla"
So the third line is exactly the line I need! Is it possible to find this word with shell commands?
Thanks for your support.

Using awk with a custom field separator, it is much simpler:
awk -F '[="]+' '$1=="line_i_need"{print $2}' file
information_i_need
-F '[="]+' sets field separator as 1 or more of = or "

Use grep:
grep line_i_need file_name
It will print:
line_i_need="information_i_need"

This finds the line with grep and cuts out the second field, using " as the separator:
grep line_i_need file_name | cut -d '"' -f2
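If line_i_need could also occur as part of a longer key (say, another_line_i_need), anchoring the pattern to the start of the line is safer; a minimal variant of the same idea:
grep '^line_i_need=' file_name | cut -d '"' -f2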

Related

I need to delete two " " with a sed command

I need to delete the " characters in a file:
"CITFFUSKD-E0"
I have tried sed 's/\"//', but the result is:
CITFFUSKD-E0"
How can I delete both?
Also, I need to delete everything after the first word, but the input can be any of these:
"CITFFUSKD-E0"
"CITFFUSKD_E0"
"CITFFUSKD E0"
The result I want is:
CITFFUSKD
You may use
sed 's/"//g' file | sed 's/[^[:alnum:]].*//' > newfile
Or contract the two sed commands into one sed call, as @Wiimm suggests:
sed 's/"//g;s/[^[:alnum:]].*//' file > newfile
If you want to replace inline, see sed edit file in place.
Explanation:
sed 's/"//g' file - removes all " chars from the file
sed 's/[^[:alnum:]].*//' > newfile - removes all chars on each line starting from the first non-alphanumeric char and saves the result into newfile.
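A quick check against the sample inputs (printf just supplies the lines for the demo):
printf '%s\n' '"CITFFUSKD-E0"' '"CITFFUSKD_E0"' '"CITFFUSKD E0"' | sed 's/"//g;s/[^[:alnum:]].*//'
CITFFUSKD
CITFFUSKD
CITFFUSKD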
Could you please try the following:
awk 'match($0,/[a-zA-Z]+[^a-zA-Z]*/){val=substr($0,RSTART,RLENGTH);gsub(/[^a-zA-Z]+/,"",val);print val}' Input_file
delete everything after the first word
sed 's/^"\([[:alpha:]]*\)[^[:alpha:]]*.*/\1/'
Match the first ". Then match a sequence of alphabetic characters. Then match up to the next non-alphabetic character ([^[:alpha:]]). Then match the rest. Substitute it all with \1 - a backreference to the part inside \( ... \), i.e. the first word.
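A quick run over one of the sample lines (printf just feeds the command):
printf '"CITFFUSKD E0"\n' | sed 's/^"\([[:alpha:]]*\)[^[:alpha:]]*.*/\1/'
CITFFUSKD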
I need to delete two " " with a sed command
Remove all " characters:
sed 's/"//g'
Extract the string between ":
sed 's/"\([^"]*\)"/\1/'
Remove everything except alphanumeric characters (numbers + a-z + A-Z, i.e. [0-9a-zA-Z]):
sed 's/[^[:alnum:]]//g'
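Each of these can be tried on one of the sample lines (printf just supplies the input):
printf '"CITFFUSKD-E0"\n' | sed 's/"//g'
CITFFUSKD-E0
printf '"CITFFUSKD-E0"\n' | sed 's/"\([^"]*\)"/\1/'
CITFFUSKD-E0
printf '"CITFFUSKD-E0"\n' | sed 's/[^[:alnum:]]//g'
CITFFUSKDE0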
This should do all in one go, remove the ", print the first part:
awk -F\" '{split($2,a,"-| |_");print a[1]}' file
CITFFUSKD
CITFFUSKD
CITFFUSKD
When you have just one line, you can use (\w is a GNU grep extension):
grep -Eo "(\w)*" file | head -1
For normal files (starting with a double quote on each line), try this:
tr -c '[:alnum:]\n' '"' < file | cut -d'"' -f2
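For example (\n is kept in the set so line boundaries survive the translation):
printf '"CITFFUSKD-E0"\n' | tr -c '[:alnum:]\n' '"' | cut -d'"' -f2
CITFFUSKD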
Many legitimate ways to solve this.
I favor using what you know about your data to simplify solutions -- this is usually an option. If everything in your file follows the same pattern, you can simply extract the first set of capitalized letters encountered:
sed 's/"\([A-Z]\+\).*$/\1/' file
This strips the first character and the last four from the line, and NR==1 prints only the first line (it assumes the unwanted tail is always exactly four characters, like -E0"):
awk '{gsub(/^.|....$/,"")}NR==1' file
CITFFUSKD

Extract only a part of data from a file

My input is test.txt which contains data in this format:
'X'=>'ABCDEF',
'X'=>'XYZ',
'X'=>'GHIJKLMN',
I want to get something like:
'ABCDEF',
'XYZ',
'GHIJKLMN',
How do I go about this in bash?
Thanks!
If the input never contains the character > elsewhere than in the "fat arrow", you can use cut:
cut -f2 -d\> file
-d specifies the delimiter, here > (backslash needed to prevent the shell from interpreting it as the redirection operator)
-f specifies which field to extract
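For example, feeding the sample lines directly (printf is used here just for the demo):
printf '%s\n' "'X'=>'ABCDEF'," "'X'=>'XYZ'," | cut -f2 -d\>
'ABCDEF',
'XYZ',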
Here's a solution using sed:
curl -sL https://git.io/fjeX4 | sed 's/^.*>//'
sed is passed a single command: s/^.*>//. The pattern ^.*> is a regex that matches any characters (.*) from the beginning of the line (^) up to the last '>'. The replacement is an empty string, so essentially sed is just deleting all the characters on the line up to the last >. As with the other solutions, this solution assumes that there is only one '>' on the line.
If the data is really uniform, then you could just run cut (on example input):
$ curl -sL https://git.io/fjeX4 | cut -d '>' -f 2
'ABCDEF',
'XYZ',
'GHIJKLMN',
You can see flag explanations on explainshell.
With awk, it would look similar:
$ curl -sL https://git.io/fjeX4 | awk -F '>' '{ print $2 }'
'ABCDEF',
'XYZ',
'GHIJKLMN',
Using awk
awk 'BEGIN{FS="=>"}{print $2}' file
'ABCDEF',
'XYZ',
'GHIJKLMN',
FS in awk stands for field separator. The code inside BEGIN is executed only at the beginning, i.e. before processing the first record. $2 prints the second field.
A more idiomatic way of writing the above would be:
awk 'BEGIN{FS="=>"}$2{print $2}' file
'ABCDEF',
'XYZ',
'GHIJKLMN',
The default action in awk is to print the whole record, so here the action {print $2} spells out what to print instead; the bare $2 pattern also skips any line whose second field is empty.
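Equivalently, FS can be set from the command line with the -F option instead of a BEGIN block; a minimal sketch of the same one-liner:
awk -F'=>' '$2{print $2}' file
'ABCDEF',
'XYZ',
'GHIJKLMN',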

Strip only domain name out of input url string

Did a bit of searching already but cannot seem to find an elegant way of doing this. I'd like to be able to search through a list like the one below and end up with a plain text output file containing only the domain name, with no http:// and nothing after the /.
So a list like this:
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
I want to end up with plain text output file like this.
7wind.ru
aldersgatencsc.org
amunow.org
Given:
$ echo "$txt"
http://7wind.ru/file/Behind+the+dune/
http://aldersgatencsc.org/open.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=mz34ligqc4&utm_content=bgi71kl5oy
http://amunow.org/test.php?utm_source=5r2ke0ow6k&utm_medium=qqod2h9a88&utm_campaign=2d1hl1v8c5&utm_term=dhxg1r4l76&utm_content=tr71txtklp
You can use cut:
$ echo "$txt" | cut -d'/' -f3
7wind.ru
aldersgatencsc.org
amunow.org
Or, if your content is in a file:
$ cut -d'/' -f3 file
7wind.ru
aldersgatencsc.org
amunow.org
Then redirect that to the file you want:
$ cut -d'/' -f3 file >new_file
awk -F/ '{ print $3 }' file > newfile
Print the 3rd field, delimited by /.
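A quick check on one of the sample URLs:
echo 'http://7wind.ru/file/Behind+the+dune/' | awk -F/ '{ print $3 }'
7wind.ru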
$ sed -r 's#.*//([^/]*)/.*#\1#' Input_file
7wind.ru
aldersgatencsc.org
amunow.org
Try the following awk solutions.
Solution 1:
awk '{sub(/.*\/\//,"");sub(/\/.*/,"");print}' Input_file
Solution 2:
awk '{match($0,/\/.[^/]*/);print substr($0,RSTART+2,RLENGTH-2)}' Input_file
This works by stripping the protocol and :// first, then anything after and including the next slash.
sed "s|.*://||; s|/.*||" url-list.txt
Add -i to change the file directly.
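For example, to rewrite url-list.txt itself (GNU sed syntax; BSD/macOS sed wants -i ''):
sed -i "s|.*://||; s|/.*||" url-list.txt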
Try this regexp:
((http|https):\/\/)?([a-zA-Z0-9.-]+)(\/)?
Take the first match, 3rd group. The character class includes digits and hyphens so hosts like 7wind.ru match. But it may match invalid URLs too, so be careful!

printing first word in every line of a txt file unix bash

So I'm trying to print the first word in each line of a txt file. The words are separated by one blank.
cut -c 1 txt file
That's the code I have so far, but it only prints the first character of each line.
Thanks
To print a whole word, you want -f 1, not -c 1. And since the default field delimiter is TAB rather than SPACE, you need to use the -d option.
cut -d' ' -f1 filename
Printing the last two words is not possible with cut, AFAIK, because it can only count fields from the beginning of the line. Use awk instead:
awk '{print $(NF-1), $NF;}' filename
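For example:
echo 'foo bar baz qux' | awk '{print $(NF-1), $NF;}'
baz qux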
You can try:
awk '{print $1}' your_file
read word _ < file
echo "$word"
What's nice about this solution is that it doesn't read beyond the first line of the file. Even awk, with its very clean, terse syntax, has to be told explicitly to stop reading past the first line. read just reads one line at a time. Plus it's a bash builtin (and a builtin in many shells), so no new process is needed.
If you want to print the first word in each line:
while read word _; do printf '%s\n' "$word"; done < file
But if you need every line of a large file, awk or cut will win out.
You can use:
cut -d\  -f1 file
Where:
-d is the delimiter (here using \ for a space)
-f is the field selector
Notice that there is a space after the \.
-c is for characters; you want -f for fields, and -d to specify a separator of space instead of the default tab:
cut -d " " -f 1 file

Display all fields except the last

I have a file as shown below:
1.2.3.4.ask
sanma.nam.sam
c.d.b.test
I want to remove the last field from each line; the delimiter is . and the number of fields is not constant.
Can anybody help me with an awk or sed solution? I can't use perl here.
Both these sed and awk solutions work independently of the number of fields.
Using sed:
$ sed -r 's/(.*)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
Note: -r is the flag for extended regexp; on some versions of sed it is -E, so check man sed. If your version of sed doesn't have such a flag, just escape the brackets:
sed 's/\(.*\)\..*/\1/' file
1.2.3.4
sanma.nam
c.d.b
The sed solution is doing a greedy match up to the last . and capturing everything before it, it replaces the whole line with only the matched part (n-1 fields). Use the -i option if you want the changes to be stored back to the files.
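A quick illustration of the greedy match:
echo '1.2.3.4.ask' | sed 's/\(.*\)\..*/\1/'
1.2.3.4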
Using awk:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file
1.2.3.4
sanma.nam
c.d.b
The awk solution simply prints the first n-1 fields; to store the changes back to the file, use redirection:
$ awk 'BEGIN{FS=OFS="."}{NF--; print}' file > tmp && mv tmp file
Reverse, cut, reverse back.
rev file | cut -d. -f2- | rev >newfile
Or, replace from last dot to end with nothing:
sed 's/\.[^.]*$//' file >newfile
The regex [^.] matches one character which is not dot (or newline). You need to exclude the dot because the repetition operator * is "greedy"; it will select the leftmost, longest possible match.
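To see the difference greediness makes, compare removing from the first dot with removing from the last:
echo 'sanma.nam.sam' | sed 's/\..*$//'
sanma
echo 'sanma.nam.sam' | sed 's/\.[^.]*$//'
sanma.nam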
With cut on the reversed string:
cat your_file | rev | cut -d "." -f 2- | rev
If you want to keep the trailing ".", use the one below:
awk '{gsub(/[^.]*$/,"");print}' your_file
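A quick check:
printf '1.2.3.4.ask\n' | awk '{gsub(/[^.]*$/,"");print}'
1.2.3.4.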
