Filter out a word from a line - ksh

I would like to extract a word (a number) from the line below using a very simple one-liner.
The string I need is 260.796, so I just want to get 260.796 out of the line below:
xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es

First decide how you identify the part of the string you want.
When you want the substring after the second '=' up to the next space, use
echo "xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es" | cut -d "=" -f3 | cut -d " " -f1
When you want the value after enealpha=, use
echo "xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es" | sed 's/.*enealpha=//;s/ .*//'
or
echo "xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es" | sed -r 's/.*enealpha=([^ ]*).*/\1/'
When spaces and equals signs always appear as in the example, you can treat both of them as field separators:
echo "xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es" | awk -F " |=" '{print $5}'
When you want something else, look for the right sed/awk/cut command/combination.
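Since the question mentions ksh, the same extraction can also be sketched with the shell's own parameter expansion, which avoids starting any external command. This is only a minimal sketch; the variable names `line` and `value` are illustrative and not from the original post.

```shell
line='xcxalpha=0.000 es enealpha=260.796 es emalpha=29.107 es'

# Remove the shortest prefix that ends in "enealpha="
value=${line#*enealpha=}
# Remove everything from the first remaining space to the end
value=${value%% *}

echo "$value"   # prints 260.796
```

Like the sed variants, this keys on the name enealpha rather than on a field position, so it keeps working if the order of the fields changes.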

Related

I only want to display the matching users up until the comma

I want to print the color along with the users who match, not all the users.
So I want to find bob and riley and print only the data associated with them, not the other users. If there are no matches, I would like the whole line to be skipped and nothing displayed.
How can I achieve this?
awk '{ FS=":"; print $1 " " $4 }' /test|while read color person
do
if [[ `echo ${users}|egrep -i "bob|riley"` ]]
then
printf " ${color} - ${person}\n\n"
fi
done
I used FS because the file has 4 fields separated by ":".
Input looks something like this:
red: :car:todd,riley,bob,greg
green: :jeep:todd,riley,bob,greg,thomas
black: :truck:jamie,jack,bob,travis,riley
Output currently:
red - todd,riley,bob,greg
green - todd,riley,bob,greg,thomas
black - jamie,jack,bob,travis,riley
Desired output
red - bob,riley
green - bob,riley
black - bob,riley
Doesn't have to be sorted like this but it would be helpful
When you have GNU grep available, you can use option -o (--only-matching)
echo ${person} | egrep -o -i -e "bob|riley"
will show for the first line
riley
bob
Now you can combine the results with tr
echo ${person} | egrep -o -i -e "bob|riley" | tr '\n' ,
gives
riley,bob,
And finally strip the trailing comma
echo ${person} | egrep -o -i -e "bob|riley" | tr '\n' , | sed -e 's/,$//'
which results in
riley,bob
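Put together, the steps above could replace the body of the loop from the question roughly as follows. This is a sketch under assumptions: `filter_users` is a hypothetical function name, the input is read from stdin in the four-field format shown in the question, and `grep -E` stands in for `egrep`. Lines without a match print nothing.

```shell
# Hypothetical helper: reads "color: :vehicle:user1,user2,..." lines on stdin
filter_users() {
    # Split each line on ":"; the last variable receives the user list
    while IFS=: read -r color skip1 skip2 people; do
        # Keep only the matching names, then join them with commas
        matched=$(printf '%s\n' "$people" |
            grep -o -i -E 'bob|riley' | tr '\n' , | sed -e 's/,$//')
        # Skip the line entirely when nobody matched
        if [ -n "$matched" ]; then
            printf '%s - %s\n' "$color" "$matched"
        fi
    done
}

printf '%s\n' 'red: :car:todd,riley,bob,greg' | filter_users
```

Note that `-o` matches substrings, so a user called bobby would be reported as bob; splitting on commas and matching whole names would avoid that if it matters.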

Extract string between quotes in a script

my text-
(
"en-US"
)
what i need -
en-US
Currently I'm able to get it by piping it through
... | tr -d '[:space:]' | sed s/'("'// | sed s/'("'// | sed s/'")'//
I wonder if there is a simple way to extract the string between the quotes rather than chopping off the useless parts one by one.
... | grep -oP '(?<=").*(?=")'
Explanation:
-o: Only output matching string
-P: Use Perl style RegEx
(?<="): Lookbehind, so only match text that is preceded by a double quote
.*: Match any characters
(?="): Lookahead, so only match text that is followed by a double quote
With sed
echo '(
"en-US"
)' | sed -rn 's/.*"(.*)".*/\1/p'
with 2 commands
echo '(
"en-US"
)' | tr -d "\n" | cut -d '"' -f2
Could you please try the following, where var is a bash variable holding the sample value shown.
echo "$var" | awk 'match($0,/".*"/){print substr($0,RSTART+1,RLENGTH-2)}'
Explanation: Following is only for explanation purposes.
echo "$var" | ##Using echo to print variable named var and using |(pipe) to send its output to awk command as an Input.
awk ' ##Starting awk program from here.
match($0,/".*"/){ ##match() looks for the text from the first " to the last " on the line; on success it sets the built-in variables RSTART (start index of the match) and RLENGTH (length of the match).
print substr($0,RSTART+1,RLENGTH-2) ##Print the matched text without its surrounding quotes: start one character after RSTART and take RLENGTH-2 characters.
}' ##Closing awk command here.
Consider using
... | grep -o '"[^"]\{1,\}"' | sed -e 's/^"//' -e 's/"$//'
grep will extract all substrings between quotes (excluding empty ones), the sed later will remove the quotes on both ends.
What about this one?
... | grep '"' | cut -d '"' -f 2
It works as long as there is only one quoted value per line.
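When the text is already in a shell variable, a sketch with plain parameter expansion also works and needs no external tools. The variable names `var` and `inner` are illustrative, and the sketch assumes the value contains exactly one quoted string.

```shell
var='(
"en-US"
)'

# Drop everything up to and including the first double quote
inner=${var#*\"}
# Drop the last double quote and everything after it
inner=${inner%\"*}

echo "$inner"   # prints en-US
```

Shell patterns let `*` match newlines, so the multi-line value needs no special handling here.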

Extract Data Between string with Double Double quotes in Shell scripting

I need to extract data from a large file that uses doubled double quotes into a text file.
The set of columns is fixed, but a column is missing from a row when its data isn't available: acct_address and phne_nm are missing in the 1st row, phne_nm is missing in the 2nd row, and acct_address is missing in the 3rd row.
Data in File
<acc_details acct_no=""00000"" acct_nm=""John""/>
<acc_details acct_no=""00001"" acct_address=""109 BIRHN WAY "" acct_nm=""BARNS WY""/>
<acc_details acct_no=""00002"" acct_nm=""BILL BAR"" phne_nm=""123456""/>
Expected Result
acct_no,acct_address,acct_nm,phne_nm
00000,,John,
00001,109 BIRHN WAY,BARNS WY,
00002,,BILL BAR,123456
This might not be the most elegant solution, but it should be applicable for most cases. It can be improved upon.
echo "acct_no,acct_address,acct_nm,phne_nm" > res
while IFS= read -r line ; do
# [^"]* stops at the next quote; grep -E has no lazy .*? quantifier
acct_no=$(echo "$line" | grep -Eoh 'acct_no=""[^"]*""' | cut -d\" -f3)
acct_nm=$(echo "$line" | grep -Eoh 'acct_nm=""[^"]*""' | cut -d\" -f3)
acct_address=$(echo "$line" | grep -Eoh 'acct_address=""[^"]*""' | cut -d\" -f3)
phne_nm=$(echo "$line" | grep -Eoh 'phne_nm=""[^"]*""' | cut -d\" -f3)
# left unquoted on purpose so that whitespace around the values is trimmed
echo $acct_no,$acct_address,$acct_nm,$phne_nm >> res
done < file
grep and cut can be used to isolate parts of the lines with the matching attribute patterns. Just be warned that any double quote inside the attribute values might cause this code to fail.
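One way to cut down the repetition is to move the per-attribute extraction into a small function. The following is only a sketch: `get_attr` and `to_csv` are hypothetical names, the attribute list is hard-coded to the four columns from the question, and values containing a double quote are still not handled.

```shell
# Hypothetical helper: print the value of attribute $1 found in line $2
get_attr() {
    # Capture the text between the doubled quotes, then trim trailing whitespace
    printf '%s\n' "$2" |
        sed -n "s/.*[[:space:]]$1=\"\"\([^\"]*\)\"\".*/\1/p" |
        sed 's/[[:space:]]*$//'
}

# Hypothetical helper: read acc_details lines on stdin, write CSV to stdout
to_csv() {
    echo "acct_no,acct_address,acct_nm,phne_nm"
    while IFS= read -r line; do
        # A missing attribute yields an empty string, i.e. an empty column
        printf '%s,%s,%s,%s\n' \
            "$(get_attr acct_no "$line")" \
            "$(get_attr acct_address "$line")" \
            "$(get_attr acct_nm "$line")" \
            "$(get_attr phne_nm "$line")"
    done
}
```

It would be called as `to_csv < file > res`. For a really large file a single awk or sed pass would likely be faster, since this sketch starts several processes per line.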

Cut is working unexpected? [duplicate]

This question already has answers here:
How to make the 'cut' command treat same sequental delimiters as one?
(5 answers)
Closed 8 years ago.
I want to cut a line into fields with a custom delimiter:
echo "field1   field2 field3" | cut -d ' ' -f2
I expect field2 to show in result, but I get nothing. How can I split it with cut?
The problem is that there's more than one space between the fields. cut(1) doesn't collapse consecutive delimiters: every single space starts a new field, so a run of spaces produces empty fields.
POSIX has this to say about it (emphasis mine):
-f list
Cut based on a list of fields, assumed to be separated in the file by a delimiter character (see -d). Each selected field shall be output. Output fields shall be separated by a single occurrence of the field delimiter character. Lines with no field delimiters shall be passed through intact, unless -s is specified. It shall not be an error to select fields not present in the input line.
Solutions:
If you're on FreeBSD, use -w; this will split on whitespace (while very useful, this is non-standard and not portable to GNU cut)
Use -f 4 (the run of three spaces between field1 and field2 creates the empty fields 2 and 3)
Use awk(1): echo "field1   field2 field3" | awk '{print $2}'
Use tr(1) to squeeze each run of spaces into a single space: echo "field1   field2 field3" | tr -s ' ' | cut -d ' ' -f2
You can use awk:
echo "field1   field2 field3" | awk '{print $2}'
field2
since cut doesn't merge repeated delimiters.
Or else use tr to squeeze multiple spaces into one first:
echo "field1   field2 field3" | tr -s ' ' | cut -d ' ' -f2
field2
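For a single line that is already in a variable, the shell's own word splitting is another option, because `read` treats a run of whitespace as one separator. A minimal sketch; the variable names `line`, `first`, `second` and `rest` are illustrative.

```shell
line='field1   field2 field3'

# read splits on runs of spaces and tabs (the default IFS),
# so the three spaces count as a single separator
read -r first second rest <<EOF
$line
EOF

echo "$second"   # prints field2
```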

How can I get the index of a given occurrence of a character that is repeated several times in a text line, using a shell (bash) script

I have a Text string like below
"/path/to/log/file/LOG_FILE.log.2013-10-02-15:2013-10-02 15:46:57.809 INFO - TTT005|Receive|0000293|N~0000284~YOS~TTT005~ ~000~YC~|YOS TYOS-YCUPDT1-H 20131002154657669284YCARR TTT005 Y0TD04 |1|0150520106050|001|051052020603|003|015030010101502702060510520101|000||000|| "
Here "|" is repeated several times within the string and I need to get the index of 4th occurrence of "|" character using shell-script (BASH) command. I tried to find a way using grep command's options.
Thanks.
Using awk you can do:
awk -F '|' '{print index($0, $5)-1}' file
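This prints the character position of the fourth pipe for each line of the file. It relies on the text of the fifth field not occurring earlier in the line, because index() returns the position of the first occurrence.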
grep can print the byte-offset; when used with -o it prints the byte-offset of the matching part.
$ string="/path/to/log/file/LOG_FILE.log.2013-10-02-15:2013-10-02 15:46:57.809 INFO - TTT005|Receive|0000293|N~0000284~YOS~TTT005~ ~000~YC~|YOS TYOS-YCUPDT1-H 20131002154657669284YCARR TTT005 Y0TD04 |1|0150520106050|001|051052020603|003|015030010101502702060510520101|000||000||"
$ grep -ob "[^|]*" <<< "${string}" | sed '5!d' | cut -d: -f1
132
Alternatively, without using grep:
$ newstring=$(echo "${string}" | cut -d\| -f5-)
$ echo $(( ${#string} - ${#newstring} ))
132
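The subtraction idea can be generalised to any occurrence count using only parameter expansion. The sketch below is an illustration, not part of the original answers: `nth_index` is a hypothetical function name, it expects a single-character separator, and it is shown on a short made-up string rather than the log line.

```shell
# Hypothetical helper: print the 1-based position of the Nth occurrence
# of character $2 in string $1; returns 1 if there are fewer occurrences
nth_index() {
    str=$1 ch=$2 n=$3
    rest=$str
    i=0
    while [ "$i" -lt "$n" ]; do
        case $rest in
            # Strip everything up to and including the next occurrence
            *"$ch"*) rest=${rest#*"$ch"} ;;
            *) return 1 ;;
        esac
        i=$((i + 1))
    done
    # What was stripped ends exactly at the Nth occurrence
    echo $(( ${#str} - ${#rest} ))
}

nth_index 'a|bb|ccc|dddd|e' '|' 4   # prints 14
```

The result is 1-based, like the cut variant above; subtract 1 for a 0-based offset.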

Resources