Unix shell command for removing each word between a space and a comma - shell

I have a string variable:
columns="name string,age int,address string,dob timestamp"
I want to remove the data types, i.e. the word that follows each space, up to the next comma. The output should be:
name,age,address,dob

Assuming the bash shell with the extglob option enabled (shopt -s extglob); see the pattern matching section of the manual
$ columns='name string,age int,address string,dob timestamp'
$ echo "${columns// +([^,])/}"
name,age,address,dob
With sed
$ echo "$columns" | sed 's/ [^,]*//g'
name,age,address,dob
With awk to process fields separated by ,
$ echo "$columns" | awk -F, -v OFS="," '{for(i=1; i<=NF; i++){split($i,n," "); $i=n[1]}} 1'
name,age,address,dob
If all columns contain two words separated by space, one can use space or comma as delimiter and filter out unwanted fields
$ echo "$columns" | awk -F' |,' -v OFS=',' '{print $1,$3,$5,$7}'
name,age,address,dob
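If neither extglob nor an external tool is available, the same result can be sketched with plain POSIX parameter expansion. The function name strip_types is made up for this illustration, and the sketch assumes every field is non-empty and that the column name is the first word of each field:

```shell
# strip_types is a hypothetical helper, not part of the answers above.
# It keeps the first word of each comma-separated field.
strip_types() {
    rest=$1
    out=
    while [ -n "$rest" ]; do
        field=${rest%%,*}                # text up to the first comma
        out=${out:+$out,}${field%% *}    # keep only the word before the first space
        case $rest in
            *,*) rest=${rest#*,} ;;      # drop the field just handled
            *)   rest= ;;                # that was the last field
        esac
    done
    printf '%s\n' "$out"
}

columns='name string,age int,address string,dob timestamp'
strip_types "$columns"
```

This avoids spawning a process per call, which may matter inside a tight loop; for one-off use the sed version is shorter.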

Related

AWK -F with print all but last record

/Home/in/test_file.txt
echo /Home/in/test_file.txt | awk -F'/' '{ print $2,$3 }'
Gives the result as:
Home in
But I need /Home/in/ as the result. I have to get everything except test_file.txt.
How to achieve this?
$ echo '/Home/in/test_file.txt' | awk '{sub("/[^/]+$","")} 1'
/Home/in
$ echo '/Home/in/test_file.txt' | awk '{sub("[^/]+$","")} 1'
/Home/in/
$ echo '/Home/in/test_file.txt' | sed 's:/[^/]*$::'
/Home/in
$ echo '/Home/in/test_file.txt' | sed 's:[^/]*$::'
/Home/in/
$ dirname '/Home/in/test_file.txt'
/Home/in
Your attempt awk -F'/' '{ print $2,$3 }' didn't do what you wanted as -F'/' is telling awk to split the input into fields at every / and then print $2,$3 is telling awk to print the 2nd and 3rd fields separated by a blank char (the default value for OFS). You could do:
$ echo '/Home/in/test_file.txt' | awk 'BEGIN{FS=OFS="/"} { print "",$2,$3,"" }'
/Home/in/
to get the expected output, but it would be the wrong approach: it removes the field you don't want, AND removes the input separators, AND then adds new output separators that happen to have the same value as the input separators, rather than simply removing the field you don't want as the other solutions above do.
echo /Home/in/test_file.txt | awk -F'/[^/]*$' '{ print $1 }'
...will print everything up to, but not including, the last slash (/Home/in).
There are several ways to achieve this:
Using dirname:
$ dirname /home/in/test_file.txt
/home/in
Using Shell substitution:
$ var="/home/in/test_file.txt"
$ echo "${var%/*}"
/home/in
Using sed: (See Ed Morton)
Using AWK:
$ echo "/home/in/test_file.txt" | awk -F'/' '{OFS=FS;$NF=""}1'
/home/in/
Remark: all these work since you can't have a filename with a forward slash (Is it possible to use "/" in a filename?)
Note: all but dirname will fail if you have just a bare file name without a path. dirname foo returns . (the current directory), while the shell substitution returns foo unchanged and the awk command prints an empty line.
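To make the shell substitution behave like dirname for the bare-file-name case, a small wrapper along these lines might do. The name parent_dir is hypothetical, and the sketch does not handle paths with a trailing slash the way dirname does:

```shell
# parent_dir is a hypothetical helper that mimics dirname for simple paths.
parent_dir() {
    case $1 in
        */*)
            d=${1%/*}                 # strip the last slash and what follows
            printf '%s\n' "${d:-/}"   # "/file" leaves an empty string, so print "/"
            ;;
        *)
            printf '.\n'              # no slash at all: current directory
            ;;
    esac
}

parent_dir /Home/in/test_file.txt
parent_dir test_file.txt
```

When dirname is available it remains the simpler choice; the wrapper only saves a fork.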
awk behaves as it should.
When you define slash / as a separator, the fields in your expression become the content between the separators.
If you need the separator to be printed as well, you need to do it explicitly, like:
echo /Home/in/test_file.txt | awk -F'/' '{ printf "%s/%s/",$2,$3 }'
Alternatively, replace the last field with an empty string and put the slash back in as the (built-in) output field separator (OFS):
echo /Home/in/test_file.txt | awk -F'/' -v OFS='/' '{$NF="";print}'

Unable to get second column using awk

I have a file that contains three columns separated by four spaces
1234    567    q
1902    190    r
I'm trying to get the second column by searching for the first column string
i=`grep $str $file | awk -F "[ ]" '{print $2 }'`
j=`grep $str $file | awk -F "[ ]" '{print $3 }'`
echo second_col=$i
echo third_col=$j
I modified the file and used tab and comma as separators but I'm still unable to print the second or third column values for a particular string.
What am I doing wrong?
I'm trying to get the second column by searching for the first column string
If you don't have spaces in your columns then you can just use awk for this:
awk -v str="$str" '$1 ~ str { print $2 }' "$file"
By default, awk splits fields on runs of whitespace. Note that $1 ~ str is a regex match; use $1 == str for an exact match.
In case you have spaces in your column value then use:
awk -F ' {4}' -v str="$str" '$1 ~ str { print $2 }' "$file"
' {4}' is a regex that makes exactly four spaces the input field separator.
Reference: Effective AWK Programming
If you have a broken awk, try this solution with sed (the \s and \S escapes are GNU extensions):
sed -nE 's/^1234\s+(\S+).*/\1/p' "$file"
This finds the pattern at the beginning of the line and prints the next non-space field. If your fields include spaces, this approach is not going to work.
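The awk answer can also be wrapped so that the grep is not needed and the first column is compared exactly rather than as a regex. This is only a sketch: lookup_col and its argument order are invented here, and it assumes the column values themselves contain no whitespace:

```shell
# lookup_col KEY COLUMN FILE  (hypothetical helper)
# Prints column COLUMN of every line whose first column equals KEY.
lookup_col() {
    awk -v key="$1" -v col="$2" '$1 == key { print $col }' "$3"
}

# Demo on a throwaway file with four spaces between columns.
demo=$(mktemp)
printf '1234    567    q\n1902    190    r\n' > "$demo"
second_col=$(lookup_col 1234 2 "$demo")
third_col=$(lookup_col 1234 3 "$demo")
echo "second_col=$second_col"
echo "third_col=$third_col"
rm -f "$demo"
```

Because awk's default splitting treats any run of blanks as one separator, the same call works whether the file uses one space, four spaces, or tabs.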

How to: In bash print a value from a key/value pair

I need to print only the 900 in this line: auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900
However, this line will not always be in this order.
I need to find out how to print the value after the =.
I will need to do this for unlock_time and fail_interval
I have been searching all night for something that will work exactly for me and cannot find it. I have been toying around with sed and awk and have not nailed this down yet.
Let's define your string:
s='auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900'
Using awk:
$ printf %s "$s" | awk -F= '$1=="fail_interval"{print $2}' RS=' '
900
Or:
$ printf %s "$s" | awk -F= '$1=="unlock_time"{print $2}' RS=' '
604800
How it works
Awk divides its input into records. We tell it to use a space as the record separator. Each record is divided into fields. We tell awk to use = as the field separator. In more detail:
printf %s "$s"
This prints the string. printf is safer than echo in cases where the string might begin with -.
-F=
This tells awk to use = as the field separator.
$1=="fail_interval" {print $2}
If the first field is fail_interval, then we tell awk to print the second field.
RS=' '
This tells awk to use a space as the record separator.
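Since both unlock_time and fail_interval are needed, the awk command above could be wrapped in a function that takes the key as a parameter. The name get_value is an assumption made for this sketch, and it expects keys without backslashes, because awk interprets escape sequences in -v assignments:

```shell
# get_value KEY LINE  (hypothetical helper)
# Splits LINE into space-separated records and prints the value of KEY=value.
get_value() {
    printf '%s' "$2" | awk -F= -v key="$1" '$1 == key { print $2 }' RS=' '
}

s='auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900'
get_value unlock_time "$s"
get_value fail_interval "$s"
```

The position of the key on the line does not matter, since every word is examined as its own record.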
You may use sed for this
Command
echo "...stuff.... unlock_time=604800 fail_interval=900" | sed -E '
s/^.*unlock_time=([[:digit:]]*).*fail_interval=([[:digit:]]*).*$/\1 \2/'
Output
604800 900
Notes
The (...) in sed defines a capture group.
[[:digit:]]* (or [0-9]*) matches any number of digits.
\1 and \2 in the replacement refer to the captured groups, in order.
This pattern assumes unlock_time appears before fail_interval on the line.
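Because the question says the options may appear in any order, one possible variation is to extract one key per sed call. The helper name sed_value is invented for this sketch; it assumes the key contains no regex metacharacters or slashes and is not the tail of another key name:

```shell
# sed_value KEY LINE  (hypothetical helper)
# Prints the digits that follow KEY= wherever it appears on the line.
sed_value() {
    printf '%s\n' "$2" | sed -n "s/.*$1=\([0-9]*\).*/\1/p"
}

line='auth required pam_faillock.so preauth silent fail_interval=900 deny=3 unlock_time=604800'
sed_value unlock_time "$line"
sed_value fail_interval "$line"
```

With -n and the p flag, nothing is printed when the key is absent, which makes the result easy to test for emptiness.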
Given an input variable:
input="auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900"
With GNU grep:
$ grep -oP 'fail_interval=\K([0-9]*)' <<< "$input"
900
$ grep -oP 'unlock_time=\K([0-9]*)' <<< "$input"
604800
Try using this.
unlock_time=$(echo "auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900" | awk -F'unlock_time=' '{print $2}' | awk '{print $1}')
echo "$unlock_time"
fail_interval=$(echo "auth required pam_faillock.so preauth silent deny=3 unlock_time=604800 fail_interval=900" | awk -F'fail_interval=' '{print $2}' | awk '{print $1}')
echo "$fail_interval"

awk and matching a literal backslash in a string

I'm trying to get the first substring cut along the delimiter of a literal "\n" (actual '\' and 'n' not a new line).
I am able to split the string at the '\' using an octal:
echo "1234\n5678" | awk -F'\134' '{print $1}'
But I cannot figure out how to split with the octal as part of a larger string. For example, the following fails:
echo "1234\n5678" | awk -F'\134'n '{print $1}'
I can do a string replace on the "\n" with sed and then split on that, but shouldn't I be able to do this simply with awk?
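First, you don't have to use \134; you can just use -F '\\'. For your question, you can use
echo "1234\n5678" | awk -F '\\\\n' '{print $1}'
Each \ has to be escaped with another \, and twice over: once when awk processes the string assigned to FS, and once more when that string is compiled as a regular expression. Hence four backslashes for one literal backslash.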
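Two ways to sidestep the regex escaping altogether, offered as a sketch rather than as what the answer intended: shell parameter expansion, and awk's index() function, which searches for a plain string. printf is used instead of echo because some shells' echo turns \n into a real newline:

```shell
s='1234\n5678'   # a literal backslash followed by n, not a newline

# Shell only: remove the longest suffix that starts with a literal \n.
first=${s%%\\n*}
printf '%s\n' "$first"

# awk without a regex: "\\n" is the two-character string backslash + n.
first_awk=$(printf '%s\n' "$s" |
    awk '{ i = index($0, "\\n"); print (i ? substr($0, 1, i - 1) : $0) }')
printf '%s\n' "$first_awk"
```

In the awk version only one level of escaping applies, because the string is never compiled as a regular expression.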

How to replace the nth column/field in a comma-separated string using sed/awk?

assume I have a string
"1,2,3,4"
Now I want to replace, e.g. the 3rd field of the string by some different value.
"1,2,NEW,4"
I managed to do this with the following command:
echo "1,2,3,4" | awk -F, -v OFS=, '{$3="NEW"; print }'
Now the index for the column to be replaced should be passed as a variable. So in this case
index=3
How can I pass this to awk? Because this won't work:
echo "1,2,3,4" | awk -F, -v OFS=, '{$index="NEW"; print }'
echo "1,2,3,4" | awk -F, -v OFS=, '{$($index)="NEW"; print }'
echo "1,2,3,4" | awk -F, -v OFS=, '{\$$index="NEW"; print }'
Thanks for your help!
This might work for you:
index=3
echo "1,2,3,4" | awk -F, -v OFS=, -v INDEX=$index '{$INDEX="NEW"; print }'
or:
index=3
echo "1,2,3,4" | sed 's/[^,]*/NEW/'$index
Have the shell interpolate the index in the awk program:
echo "1,2,3,4" | awk -F, -v OFS=, '{$'$index'="NEW"; print }'
Note how the originally single quoted awk program is split in three parts, a single quoted beginning '{$', the interpolated index value, followed by the single quoted remainder of the program.
Here's a seductive way to break the awkwardness:
$ echo "1,2,3,4" | sed 's/,/\n/g' | sed -e $index's/.*/NEW/'
This is easily extendable to multiple indexes just by adding another -e $newindex's/.*/NEWNEW/'
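# This should be faster than awk or sed.
str="1,2,3,4"
IFS=',' read -r -a f <<< "$str"      # set IFS only for this read
f[2]='NEW'                           # arrays are zero-based: index 2 is the 3rd field
(IFS=','; printf '%s\n' "${f[*]}")   # join with commas inside a subshell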
With any POSIX awk (not only gawk) a variable can follow $ as a field reference, so no split-and-rejoin loop is needed. The attempts in the question fail because the shell variable index is not visible inside the single-quoted awk program; it has to be passed in with -v. Note that index is also the name of a built-in awk function, so the awk variable needs a different name (such as INDEX above).
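Putting the -v approach into a reusable form might look like the following. The function name replace_field and the awk variable names n and val are chosen for this sketch only; the replacement value is assumed to contain no backslashes, since awk processes escape sequences in -v assignments:

```shell
# replace_field N VALUE LINE  (hypothetical helper)
# Replaces the Nth comma-separated field of LINE with VALUE.
replace_field() {
    printf '%s\n' "$3" |
        awk -F, -v OFS=, -v n="$1" -v val="$2" '{ $n = val; print }'
}

index=3
replace_field "$index" NEW '1,2,3,4'
```

Passing the number with -v keeps the awk program in a single pair of quotes, so nothing from the shell variable is ever parsed as awk code.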
