I have a file file.txt with the following structure:
./a/b/c/sdsd.c
./sdf/sdf/wer/saf/poi.c
./asd/wer/asdf/kljl.c
./wer/asdfo/wer/asf/asdf/hj.c
How can I get only the C file names from these paths?
i.e., my output should be
sdsd.c
poi.c
kljl.c
hj.c
You can do this simply using awk.
Set the field separator FS="/"; $NF will then print the last field of every record.
awk 'BEGIN{FS="/"} {print $NF}' file.txt
or
awk -F/ '{print $NF}' file.txt
Or you can do it with cut and the Unix command rev, like this:
rev file.txt | cut -d '/' -f1 | rev
You can use basename command:
basename /a/b/c/sdsd.c
will give you sdsd.c
For a list of files in file.txt, this will do:
while IFS= read -r line; do basename "$line"; done < file.txt
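If your basename is from GNU coreutils, the loop can be avoided entirely with its -a option, which accepts multiple paths at once (a sketch assuming GNU basename and GNU xargs):

```shell
# GNU basename -a takes many paths in one call; GNU xargs -d '\n'
# feeds it one path per input line, so spaces in paths survive
printf '%s\n' ./a/b/c/sdsd.c ./sdf/sdf/wer/saf/poi.c |
  xargs -d '\n' basename -a
```

On BSD/macOS these options differ, so the read loop above remains the portable choice.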
Using sed:
$ sed 's|.*/||g' file
sdsd.c
poi.c
kljl.c
hj.c
The simplest one ($NF is the last column of the current line):
awk -F/ '{print $NF}' file.txt
or using bash & parameter expansion:
while IFS= read -r file; do echo "${file##*/}"; done < file.txt
or bash with basename:
while IFS= read -r file; do basename "$file"; done < file.txt
OUTPUT
sdsd.c
poi.c
kljl.c
hj.c
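The `${file##*/}` expansion above works because `##` deletes the longest prefix matching the pattern, here everything up to and including the final `/`:

```shell
# ## strips the longest prefix matching '*/', i.e. the directory part
file=./wer/asdfo/wer/asf/asdf/hj.c
echo "${file##*/}"   # hj.c
```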
Perl solution ($F[-1] is the last autosplit field):
perl -F/ -ane 'print $F[-1]' your_file
You can also use sed:
sed 's/.*[/]//g' your_file
Related
I have these files in mydir:
APPLE_STORE_iphone12.csv
APPLE_STORE_iphonex.csv
APPLE_STORE_ipad.csv
APPLE_STORE_imac.csv
I need to rename the files, inserting a location after the matching pattern "APPLE_STORE_" and adding a timestamp before the extension.
Required O/P
APPLE_STORE_NY_iphone12_20210107140443.csv
APPLE_STORE_NY_iphonex_20210107140443.csv
APPLE_STORE_NY_ipad_20210107140443.csv
APPLE_STORE_NY_imac_20210107140443.csv
Here is what I tried:
filelist=/mydir/APPLE_STORE_*.csv
dtstamp=`date +%Y%m%d%H%M%S`
location='NY'
for file in ${filelist}
do
filebase=${file%.csv}
mv ${file} ${filebase}_${location}_${dtstamp}.csv
done
This gives me names like APPLE_STORE_imac_NY_20210107140443.csv instead.
Another (maybe not so elegant) way is to first split the filename into its parts explicitly, using awk with the separator "_", and then build it up again as needed. Your script could then look like:
#!/bin/bash
filelist=./APPLE_STORE_*.csv
dtstamp=`date +%Y%m%d%H%M%S`
location='NY'
for file in ${filelist}
do
filebase=${file%.csv}
part1=$(echo "${filebase}" | awk -F '_' '{print $1}')
part2=$(echo "${filebase}" | awk -F '_' '{print $2}')
part3=$(echo "${filebase}" | awk -F '_' '{print $3}')
mv "${file}" "${part1}_${part2}_${location}_${part3}_${dtstamp}.csv"
done
I tested it successfully.
You are so close.
destfile="$(echo "$file" | sed -e "s/^APPLE_STORE/APPLE_STORE_${location}/" -e "s/\.csv$/_${dtstamp}.csv/")"
mv "$file" "$destfile"
...or something like that.
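The same rename can also be sketched with only bash parameter expansions, no sed needed (the filename below is one hypothetical example; in the real script dtstamp would come from date):

```shell
location=NY
dtstamp=20210107140443             # in the real script: dtstamp=$(date +%Y%m%d%H%M%S)
file=APPLE_STORE_imac.csv          # example input name
base=${file#APPLE_STORE_}          # -> imac.csv  (strip the fixed prefix)
base=${base%.csv}                  # -> imac      (strip the extension)
echo "APPLE_STORE_${location}_${base}_${dtstamp}.csv"
# -> APPLE_STORE_NY_imac_20210107140443.csv
```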
I'm using the following command:
grep -F "searchterm" source.csv >> output.csv
to search for matching terms in source.csv. Each line in the source file is like so:
value1,value2,value3|value4,value5
How do I insert only the fields value1,value2,value3 into the output file?
You can simply use awk, which goes through the file line by line; you set the separator and print the part of the string you want to keep:
awk -F"|" '{print $1}' input.csv > output.csv
You can do it with a simple while read loop:
while read -r line; do echo "${line%|*}"; done < file.csv >> newfile.csv
or in a subshell, so you truncate the newfile each time:
( while read -r line; do echo "${line%|*}"; done < file.csv ) > newfile.csv
or with sed:
sed -e 's/[|].*$//' file.csv > newfile.csv
This perl solution is similar to the awk solution:
perl -F'\|' -lane 'print $F[0]' input.csv > output.csv
The | field separator character needs to be escaped with a backslash.
-a puts perl into autosplit mode, which populates the fields array F
I have headers like
>XX|6226515|new|xx_000000.1| XXXXXXX
in a text file which I am trying to shorten to
>XX6226515
using awk. I tried
awk -F"|" '/>/{$0=">"$1}1' input.txt > output.txt
but it yields the following instead
>XX|6226515|new|
awk -F"|" '{print $1$2}' input.txt > output.txt
Output:
>XX6226515
sed solution:
sed -e 's/|//' -e 's/|.*//'
The first substitution removes the first vertical bar, the second one removes the second one and anything after it.
$ awk -F'|' '$0=$1$2' <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
>XX6226515
This cut can also do it:
cut -d"|" --output-delimiter="" -f-2
See output:
$ echo ">XX|6226515|new|xx_000000.1| XXXXXXX" | cut -d"|" --output-delimiter="" -f-2
>XX6226515
-d"|" sets | as field delimiter.
--output-delimiter="" indicates that the output delimiter has to be empty.
-f-2 indicates that it has to print all fields up to the 2nd (inclusive).
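Note that --output-delimiter is a GNU cut extension. On systems without it, a similar effect can be sketched with tr, since after cut only the single | between the two kept fields remains:

```shell
# Portable sketch: keep fields 1-2, then delete the one remaining '|'
echo ">XX|6226515|new|xx_000000.1| XXXXXXX" | cut -d '|' -f 1,2 | tr -d '|'
# -> >XX6226515
```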
Also with just bash:
while IFS="|" read a b _
do
echo "$a$b"
done <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
See output:
$ while IFS="|" read a b _; do echo "$a$b"; done <<< ">XX|6226515|new|xx_000000.1| XXXXXXX"
>XX6226515
$ cat file
11 asasaw121
12 saasks122
13 sasjaks22
$ cat no
while read line
do
var=$(awk '{print $1}' $line)
echo $var
done<file
$ cat yes
while read line
do
var=$(echo $line | awk '{print $1}')
echo $var
done<file
$ sh no
awk: can't open file 11
source line number 1
awk: can't open file 12
source line number 1
awk: can't open file 13
source line number 1
$ sh yes
11
12
13
Why doesn't the first one work? What does awk expect to find in $1 in it? I think understanding this will help me avoid numerous scripting problems.
awk expects file names as arguments, not data.
In the following, $line is a string, not a file name:
var=$(awk '{print $1}' $line)
You could say (note the double quotes around the variable):
var=$(awk '{print $1}' <<<"$line")
Why doesn't the first one work?
Because of this line:
var=$(awk '{print $1}' $line)
Which assumes $line is a file.
You can make it:
var=$(echo "$line" | awk '{print $1}')
OR
var=$(awk '{print $1}' <<< "$line")
awk '{print $1}' $line
^^ awk expects a file path, or a list of file paths, here
what it is getting from you instead is the content of a line from the file
What you want to do is pipe the line into awk as you do in your second example.
You got the answers to your specific questions but I'm not sure it's clear that you would never actually do any of the above.
To print the first field from a file you'd either do this:
while read -r first rest
do
printf "%s\n" "$first"
done < file
or this:
awk '{print $1}' file
or this:
cut -d ' ' -f1 <file
The shell loop would NOT be recommended.
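One concrete difference between the cut and awk options above: cut treats every single space as a delimiter, while awk splits on runs of whitespace, so they disagree on padded input (a small sketch):

```shell
# With leading spaces, cut's "field 1" is the empty string before
# the first space; awk skips leading whitespace and copes fine
printf '  11 asasaw121\n' | cut -d ' ' -f1    # prints an empty line
printf '  11 asasaw121\n' | awk '{print $1}'  # prints 11
```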
How to print all columns but last 2?
e.g.
input: FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008
output: FB_SYS_0032_I03_LTO3_idaen02r_02
delimiter: _ (underscore)
For your example, this awk one-liner should do:
awk -F'_' -v OFS='_' 'NF-=2' file
test:
kent$ awk -F'_' -v OFS='_' 'NF-=2' <<< "FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008"
FB_SYS_0032_I03_LTO3_idaen02r_02
Just use an RE that describes the last 2 fields:
awk '{sub(/_[^_]*_[^_]*$/,"")}1'
or:
sed 's/_[^_]*_[^_]*$//'
e.g.:
$ echo FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008 | awk '{sub(/_[^_]*_[^_]*$/,"")}1'
FB_SYS_0032_I03_LTO3_idaen02r_02
$ echo FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008 | sed 's/_[^_]*_[^_]*$//'
FB_SYS_0032_I03_LTO3_idaen02r_02
The above will work with any modern awk and any sed on any system.
Use this awk command (it prints fields 1 through NF-2, re-joined with underscores):
awk -F "_" '{for (i=1; i<=NF-2; i++) {printf ("%s", $i); if (i<NF-2) printf "_"} print ""}'
FB_SYS_0032_I03_LTO3_idaen02r_02
Using sed with extended regexes (-r; use -E on BSD/macOS):
sed -r 's/(_[^_]*){2}$//'
For example,
$ echo 1_2_3_4_5 | sed -r 's/(_[^_]*){2}$//'
1_2_3
$ echo 1_2_3_4 | sed -r 's/(_[^_]*){2}$//'
1_2
$ echo FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008 | sed -r 's/(_[^_]*){2}$//'
FB_SYS_0032_I03_LTO3_idaen02r_02
Probably this is the simplest way:
$ input="FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008"
$ echo "${input%_*_*}"
FB_SYS_0032_I03_LTO3_idaen02r_02
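Since % removes the shortest suffix matching the pattern, the idea scales with the number of trailing fields to drop, one `_*` per field (same input as above):

```shell
input="FB_SYS_0032_I03_LTO3_idaen02r_02_20130820_181008"
echo "${input%_*}"      # drops the last _-separated field
echo "${input%_*_*}"    # drops the last two
```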