using awk to cut a specific part - bash

I am dealing with many lines containing paths, for example:
posterita/posterita/web/jsp/pos/posReport1.jsp
build/web/view.jsp
uPortal-webapp/src/main/webapp/WEB-INF/flows/user-manager/selectUserAction.jsp
config/auth.php
database/seeds/DatabaseSeeder.php
admin/modules/announcement/functions.inc.php
What I need to do, using bash, is print the first directory of each path.
I want the output to look like this:
posterita
build
uPortal-webapp
config
database
admin
The same method should work for the second and third directory. For the second directory, for example:
posterita
web
src
seeds
modules
How can I use awk, sed or cut to do this?

$ awk -F'/' -v n=1 'NF>n{print $n}' file
posterita
build
uPortal-webapp
config
database
admin
$ awk -F'/' -v n=2 'NF>n{print $n}' file
posterita
web
src
seeds
modules

awk -F '/' '{print $1}' file
Change $1 to $2 to print the 2nd piece, etc.
The -F flag allows you to choose the delimiting character(s).
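The question also asks about cut. As a rough sketch, the awk answer can be wrapped in a small function and compared with cut on two of the sample paths from the question. The name nth_dir is a hypothetical helper, not something from the answers. Note that cut has no equivalent of the NF>n guard, so for a path like config/auth.php it prints the file name when asked for field 2.

```shell
# nth_dir N: read paths on stdin and print the N-th directory component.
# Lines whose N-th component is the file name itself are skipped (NF > n).
# (nth_dir is a hypothetical helper name used only for this sketch.)
nth_dir() {
    awk -F'/' -v n="$1" 'NF > n { print $n }'
}

paths='build/web/view.jsp
config/auth.php'

first_dirs=$(printf '%s\n' "$paths" | nth_dir 1)    # build, config
second_dirs=$(printf '%s\n' "$paths" | nth_dir 2)   # web only
cut_second=$(printf '%s\n' "$paths" | cut -d/ -f2)  # web, auth.php

printf '%s\n' "$first_dirs" "$second_dirs" "$cut_second"
```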

Related

How can I use a variable as input in a awk command?

I have this variable:
a='/08/OPT/imaginary/N/08_i_N.out'
I want to use "/" as a field separator.
Then, I want to extract the first pattern.
I have tried:
awk -F/ '{print $1}' "$a"
But I get:
awk: cannot open /08/OPT/imaginary/N/08_i_N.out (No such file or directory)
I do not want the file, only to work on the path of that file.
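awk treats "$a" as the name of a file to read, hence the error. Pass the value on standard input instead, the same way as with any other command. Either of these works (there are other alternatives, e.g. here-documents or awk variables):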
printf '%s\n' "$a" | command
command <<<"$a"
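Applied to the variable from the question, this might look as follows. It is only a sketch: the variable names from_stdin and from_var are made up for illustration. Because the path starts with a slash, the first field is empty and the first directory name is in the second field.

```shell
a='/08/OPT/imaginary/N/08_i_N.out'

# Feed the string to awk on standard input instead of naming it as a file.
# The path begins with "/", so $1 is the empty string before that slash
# and the first directory name is $2.
from_stdin=$(printf '%s\n' "$a" | awk -F/ '{ print $2 }')

# Alternative: pass the string in as an awk variable and split it there.
# (-v interprets backslash escapes, which is harmless for this path.)
from_var=$(awk -v p="$a" 'BEGIN { split(p, parts, "/"); print parts[2] }')

echo "$from_stdin $from_var"   # 08 08
```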

Command to remove all but select columns for each file in unix directory

I have a directory with many files in it and want to edit each file to only contain a select few columns.
I have the following code which will only print the first column
for i in /directory_path/*.txt; do awk -F "\t" '{ print $1 }' "$i"; done
but if I try to edit each file in place by adding > "$i" as below, then I lose all the information in my files
for i in /directory_path/*.txt; do awk -F "\t" '{ print $1 }' "$i" > "$i"; done
However I want to be able to remove all but a select few columns in each file for example 1 and 3.
Given:
cat file
1 2 3
4 5 6
You can do in place editing with sed:
sed -i.bak -E 's/^([^[:space:]]*).*/\1/' file
cat file
1
4
If you want freedom to work with multiple columns and have in place editing, use GNU awk that also supports in place editing:
gawk -i inplace '{print $1, $3}' file
cat file
1 3
4 6
If you only have POSIX awk, or want to use cut, you generally do this:
Modify the file with awk, cut, sed, etc.
Redirect the output to a temp file.
Rename the temp file back to the original file name, but only if the previous step succeeded.
Like so:
awk '{print $1, $3}' file >tmp_file && mv tmp_file file
Or with cut:
cut -d ' ' -f 1,3 file >tmp_file && mv tmp_file file
To do a loop on files in a directory, you would do:
for fn in /directory_path/*.txt; do
awk -F '\t' '{ print $1 }' "$fn" >tmp_file
mv tmp_file "$fn"
done
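For the columns 1 and 3 that the question asks about, the loop could be sketched like this. It assumes tab-separated files, as in the question's -F "\t", and that a sibling file named "$fn.tmp" is free to use. The function name keep_cols_1_3 is hypothetical.

```shell
# keep_cols_1_3 FILE...: rewrite each tab-separated file so that only
# columns 1 and 3 remain. The original is replaced only if awk succeeds.
# (keep_cols_1_3 and the ".tmp" suffix are assumptions for this sketch.)
keep_cols_1_3() {
    for fn in "$@"; do
        tmp="$fn.tmp"
        if awk -F '\t' -v OFS='\t' '{ print $1, $3 }' "$fn" >"$tmp"; then
            mv "$tmp" "$fn"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}

# Usage:
#   keep_cols_1_3 /directory_path/*.txt
```

Setting OFS as well as -F keeps the output tab-separated; without it, awk would join the two columns with a single space.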
Just to add a little more to @dawg's perfectly working answer, based on my use case.
I was dealing with CSVs, and standard CSV allows a , inside a value as long as the value is in double quotes. For example, the row below is a valid CSV row:
col1,col2,col2
1,abc,"abc, inc"
But the command above treated the , between the double quotes as a delimiter too.
Also, the output delimiter wasn't specified in the command.
These are the modifications I had to make to handle those two problems:
for fn in /home/ubuntu/dir/*.csv; do
gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS = "," } { print $1, $2 }' "$fn" >tmp_file
mv tmp_file "$fn"
done
OFS sets the delimiter used in the output file.
FPAT, a GNU awk (gawk) feature, handles the case of a , between quotation marks. It is set in a BEGIN block because an assignment to FPAT only affects records read after it, so setting it in the main action would leave the first line split on plain commas.
The regex and the explanation behind it are in the GNU awk manual, section 4.7, Defining Fields by Content.
I was led to that solution through this answer.
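To see what FPAT does with the sample row, here is a minimal sketch. It assumes GNU awk is installed as gawk, and csv_fields_1_3 is a hypothetical name chosen for illustration.

```shell
# csv_fields_1_3: print the first and third CSV fields of each input line.
# Requires GNU awk (gawk); FPAT is not available in POSIX awk.
# (csv_fields_1_3 is a hypothetical name used only for this sketch.)
csv_fields_1_3() {
    gawk 'BEGIN { FPAT = "([^,]*)|(\"[^\"]+\")"; OFS = "," }
          { print $1, $3 }' "$@"
}

# Usage:
#   printf '%s\n' '1,abc,"abc, inc"' | csv_fields_1_3
# prints:
#   1,"abc, inc"
```

The quoted field keeps its surrounding double quotes, so the output is still a valid CSV row.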

Get only part of file using sed or awk

I have a file which contains text as follows:
Directory /home/user/ "test_user"
bunch of code
another bunch of code
How can I get from this file only the /home/user/ part?
I've managed to use awk -F '"' 'NR==1{print $1}' file.txt to get rid of the rest of the file, and I'm getting output like this:
Directory /home/user/
How can I change this command to get only /home/user/ part? I'd like to make it as simple as possible. Unfortunately, I can't modify this file to add/change the content.
This should be the fastest option, which is noticeable if your file is large:
awk '{print $2; exit}' file
It prints the second field of the first line and exits without reading the rest of the file.
With awk it should be:
awk 'NR==1{print $2}' file.txt
Setting the field delimiter to " was the mistake, since it splits the line into these fields:
$1 = 'Directory /home/user/'
$2 = 'test_user'
$3 = '' (empty)
The default field separator, which splits on runs of whitespace (roughly [[:space:]]+), splits it like this:
$1 = 'Directory'
$2 = '/home/user/'
$3 = '"test_user"'
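The two splits can be checked directly on the first line of the file. This is just an illustrative sketch; the variable names are made up, and the brackets are only there to make field boundaries visible. Note that with " as the separator, the first field also keeps the space before the quote.

```shell
line='Directory /home/user/ "test_user"'

# Split on the double quote: $1 keeps everything before the first quote.
by_quote=$(printf '%s\n' "$line" | awk -F'"' '{ printf "[%s][%s][%s]", $1, $2, $3 }')

# Default splitting on runs of blanks: $2 is the path.
by_blank=$(printf '%s\n' "$line" | awk '{ printf "[%s][%s][%s]", $1, $2, $3 }')

echo "$by_quote"   # [Directory /home/user/ ][test_user][]
echo "$by_blank"   # [Directory][/home/user/]["test_user"]
```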
As an alternative, you can use head and cut:
$ head -n 1 file | cut -d' ' -f2
Not sure why you are using the -F" as that changes the delimiter. If you remove that, then $2 will get you what you want.
awk 'NR==1{print $2}' file.txt
You can also use awk to execute the print when the line contains /home/user instead of counting records:
awk '/\/home\/user\//{print $2}' file.txt
In this case, if the line were buried in the file, or if you had multiple instances, you would get the name for every occurrence wherever it was.
Adding some grep
grep Directory file.txt|awk '{print $2}'

How do I pass a stored value as the column number parameter to edit in awk?

I have a .dat file with | separator and I want to change the value of the column which is defined by a number passed as argument and stored in a var. My code is
awk -v var="$value" -F'|' '{ FS = OFS = "|" } $1=="$id" {$"\{$var}"=8}1' myfile.dat > tmp && mv tmp myfiletemp.dat
This changes the whole line to 8, obviously doesn't work. I was wondering what is the right way to write this part
{$"\{$var}"=8}1
For example, if I want to change the fourth column to 8 and I have value=4, how do I get {$4=8}?
The other answer is mostly correct, but just wanted to add a couple of notes, in case it wasn't totally clear.
Referring to a variable with a $ in front of it turns it into a reference to that column. So i=3; print $i; print i prints the third column and then the number 3.
Putting all your variables in the command line will avoid any problems with trying to include bash variables inside your single-quoted awk code, which won't work.
You can let awk do the output to the specific file instead of relying on bash to redirect output and move files.
The -F option on the command line specifies FS for you, so no need to redeclare it in your code.
Here's how I would do this:
#!/bin/bash
column=4
value=8
id=1
awk -v col="$column" -v val="$value" -v id="$id" -F"|" '
BEGIN {OFS="|"}
$1 == id { $col = val }
{ print > "myfiletemp.dat" }
' myfile.dat
You can refer to the awk variable directly by its name. Here is a slight rewrite of your script with the correct reference to the column-number variable:
awk -F'|' -v var="$value" 'BEGIN{OFS=FS} $1=="$id"{$var=8}1'
This should work as long as $value is a number. Note that "$id" inside the single-quoted awk program is the literal string $id, not the shell variable. If id is another bash variable, pass it in the same way, as an awk variable:
awk -F'|' -v var="$value" -v id="$id" 'BEGIN{OFS=FS} $1==id{$var=8}1'
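As a quick check of that last command, here it is run on two made-up rows (the sample data is an assumption for illustration; the real myfile.dat is not shown in the question):

```shell
value=4   # column to change
id=1      # only rows whose first field equals this are changed

result=$(printf '%s\n' '1|a|b|c' '2|d|e|f' |
    awk -F'|' -v var="$value" -v id="$id" 'BEGIN{OFS=FS} $1==id{$var=8}1')

printf '%s\n' "$result"
# 1|a|b|8
# 2|d|e|f
```

Only the row whose first field matches id is modified; the trailing 1 prints every row, changed or not.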
Not only can you use a number in a variable by putting a $ in front of it, you can also put a $ in front of an expression!
$ date | tee /dev/stderr | awk '{print $(2+2)}'
Mon Aug 3 12:47:39 CDT 2020
12:47:39
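Two more common uses of an expression after $, sketched on a made-up pipe-separated row (the data and variable names are assumptions for illustration):

```shell
# $(NF-1) is the next-to-last field; $(n+1) is the field after column n.
next_to_last=$(printf '%s\n' 'a|b|c|d' | awk -F'|' '{ print $(NF-1) }')
after_n=$(printf '%s\n' 'a|b|c|d' | awk -F'|' -v n=1 '{ print $(n+1) }')

echo "$next_to_last $after_n"   # c b
```

The parentheses matter: $NF-1 would be parsed as ($NF)-1, that is, the last field minus one.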

Shell command to retrieve specific value using pattern

I have a file which contains data like below.
appid=TestApp
version=1.0.1
We want to parse the file and capture the value assigned to appid field.
I have tried with awk command as below
awk '/appid=/{print $1}' filename.txt
However it outputs the whole line
appid=TestApp
but we require only
TestApp
Please let me know how I can achieve this using awk/grep/sed shell commands.
You need to change the field separator:
awk -F'=' '$1 ~ /appid/ {print $2}' filename.txt
or with an exact match
awk -F'=' '$1 == "appid" {print $2}' filename.txt
outputs
TestApp
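If the value itself may contain an = sign, printing $2 would cut it short. One possible variation, as a sketch only, takes everything after the first = instead. The function name get_prop is hypothetical.

```shell
# get_prop KEY FILE: print the value of the first KEY=value line in FILE.
# The value is everything after the first "=", so it may itself contain "=".
# (get_prop is a hypothetical helper name used only for this sketch.)
get_prop() {
    awk -F'=' -v key="$1" '
        $1 == key { print substr($0, length(key) + 2); exit }
    ' "$2"
}

# Usage:
#   get_prop appid filename.txt    # prints TestApp for the sample file
```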
There are about 20 different ways to do this, but when you have name = value statements in a file it's usually a good idea to simply build an array of those assignments and then print whatever you care about using its name, e.g.:
$ cat file
appid=TestApp
version=1.0.1
$
$ awk -F= '{a[$1]=$2} END{print a["appid"]}' file
TestApp
$ awk -F= '{a[$1]=$2} END{print a["version"]}' file
1.0.1
$ awk -F= '{a[$1]=$2} END{for (i in a) print i,"=",a[i]}' file
appid = TestApp
version = 1.0.1
If you are already in the shell, and the file is trusted and contains only valid shell assignments, then simply sourcing it will give you what you want:
. ./filename.txt
echo "$appid"

Resources