How to save the name of the file if it is being treated in the script - bash

I have 88 folders, each of which contains the file "pair.'numbers'." (pair.3472, pair.7829 and so on). I need to treat the files with awk to extract the second column, but I need to save the numbers. If I try:
#!/bin/bash
for i in {1..88}; do
awk '{print $2}' ~/Documents/attempt.$i/pair* > ~/Results/pred.pair*
done
It doesn't save the numbers, but gives only one file: pred.pair*
Thanks for any tips.

You don't need a loop (and see https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for why that's a Good Thing):
awk '
FNR==1 { close(out); out=FILENAME; sub(/\/Documents.*\//,"/Results/pred.",out) }
{ print $2 > out }
' ~/Documents/attempt.{1..88}/pair*

#!/bin/bash
for i in {1..88}; do
awk '{fname=FILENAME; sub(".*/", "", fname); print $2 > (ENVIRON["HOME"] "/Results/pred." fname)}' ~/Documents/attempt.$i/pair*
done
Use awk's built-in variable FILENAME. We strip the directory part to get the basename fname, then redirect the $2 value to "$HOME/Results/pred." fname. The path is built from ENVIRON["HOME"] because awk does not expand ~ inside a string.

There are several ways to do it: awk has a FILENAME variable and you can redirect the output from within your awk script to a manipulated string which is based on FILENAME.
Or you can do it with bash
for i in {1..88}; do
to_be_processed_fname=$(ls ~/Documents/attempt.$i/pair*)
extension="${to_be_processed_fname/*./}"
awk '{print $2}' "${to_be_processed_fname}" > "$HOME/Results/pred.${extension}"
done
Of course, the above fails if a directory contains more than one pair* file, but I'm leaving that to you.
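A minimal bash sketch of one way around the single-file limitation: loop over the files themselves instead of over the directory numbers, and take the basename with ${f##*/}. The temporary directory and the sample file contents are made up for the demo; in real use the glob would be ~/Documents/attempt.*/pair.*. Note that this names the output pred.pair.NNNN, and that two directories holding a file with the same name would overwrite each other's result.

```shell
# Demo tree under a temporary directory (hypothetical paths and data).
base=$(mktemp -d)
mkdir -p "$base/Documents/attempt.1" "$base/Documents/attempt.2" "$base/Results"
printf 'a 1\nb 2\n' > "$base/Documents/attempt.1/pair.3472"
printf 'c 3\nd 4\n' > "$base/Documents/attempt.2/pair.7829"
printf 'e 5\n'      > "$base/Documents/attempt.2/pair.1111"

for f in "$base"/Documents/attempt.*/pair.*; do
    [ -e "$f" ] || continue      # skip the literal pattern if nothing matched
    name=${f##*/}                # strip the directory part, e.g. pair.3472
    awk '{ print $2 }' "$f" > "$base/Results/pred.$name"
done

ls "$base/Results"
```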

Related

Why awk if conditional matching is wrong

In my project, I have two files.
The content of userid is:
6534
4524
4522
6635
The content of userpwinfo.txt is:
nsgg315_RJ:x:4520:100::/home-gg/users/nsgg315_RJ:/bin/bash
nsgg316_ZJY:x:4521:100::/home-gg/users/nsgg316_ZJY:/bin/bash
nsgg317_CPA:x:4522:100::/home-gg/users/nsgg317_CPA:/bin/bash
nsgg318_ZRL:x:4523:100::/home-gg/users/nsgg318_ZRL:/bin/bash
nsgg319_YYM:x:4524:100::/home-gg/users/nsgg319_YYM:/bin/bash
Now I want to print the usernames whose IDs are in userid. I wrote a bash script like this:
for i in $(cat userid)
do
#username=`awk -F: '{if($3=="$i") print $1}' /root/userpwinfo.txt`
#username=`awk -F: '$3=="$i" {print $1}' /root/userpwinfo.txt`
#username=`awk -F: '{if($3~/$i/) print $1}' /root/userpwinfo.txt`
username=`awk -F: '{if($3==$i) print $1}' /root/userpwinfo.txt`
echo $username
done
Unfortunately, it prints nothing. The correct result should be:
nsgg319_YYM
nsgg317_CPA
I have tried in command line:
awk -F: '{if($3==4524) print $1}' /root/userpwinfo.txt
That works.
Maybe if($3==$i) is wrong inside the shell script. Who can help me?
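Your $i is a shell variable, but it sits inside the single quotes, so the shell never expands it and awk interprets it instead. To awk, i is an uninitialized variable, so $i means $0 (the whole line), which does not equal $3 here.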
Try this:
username=`awk -F: '{if($3=='$i') print $1}' /root/userpwinfo.txt`
Note that the $i is between ' marks, meaning it's outside of the block that will be interpreted by awk, meaning it should be interpreted by the shell.
Also note that if you have an empty line in the input file, your awk command would be if($3==) which is invalid and will yield an error.
I'd also like to point out that an awk program is made of pattern { action } pairs. You shouldn't need to write an if inside the action unless you want something unusual, so your command would be more appropriately written as:
username=`awk -F: '($3=='$i'){print $1}' /root/userpwinfo.txt`
Note that even this is not a very good solution, but you already have much to think about with only these changes. When you're more familiar with awk or getting more professional, come back and check the comments. ;)
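One common alternative to splicing $i into the program text is awk's -v option, which hands the shell value to awk as an awk variable. The sketch below is only an illustration: the temporary files are stand-ins for userid and /root/userpwinfo.txt, and the awk variable name id is arbitrary. An empty line in userid simply matches nothing instead of producing a syntax error.

```shell
# Stand-in copies of the two input files (sample data from the question).
tmp=$(mktemp -d)
cat > "$tmp/userpwinfo.txt" <<'EOF'
nsgg317_CPA:x:4522:100::/home-gg/users/nsgg317_CPA:/bin/bash
nsgg318_ZRL:x:4523:100::/home-gg/users/nsgg318_ZRL:/bin/bash
nsgg319_YYM:x:4524:100::/home-gg/users/nsgg319_YYM:/bin/bash
EOF
printf '6534\n4524\n4522\n6635\n' > "$tmp/userid"

result=$(
    while IFS= read -r id; do
        # -v copies the shell value into the awk variable "id".
        awk -F: -v id="$id" '$3 == id { print $1 }' "$tmp/userpwinfo.txt"
    done < "$tmp/userid"
)
printf '%s\n' "$result"
```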
If all you need is the usernames, you could process both files in a single awk call:
$ cat userpwinfo.txt
nsgg315_RJ:x:4520:100::/home-gg/users/nsgg315_RJ:/bin/bash
nsgg316_ZJY:x:4521:100::/home-gg/users/nsgg316_ZJY:/bin/bash
nsgg317_CPA:x:4522:100::/home-gg/users/nsgg317_CPA:/bin/bash
nsgg318_ZRL:x:4523:100::/home-gg/users/nsgg318_ZRL:/bin/bash
nsgg319_YYM:x:4524:100::/home-gg/users/nsgg319_YYM:/bin/bash
$ cat userid.txt
6534
4524
4522
6635
$ awk -F":" ' { if( NR==FNR ) { a[$3]=$1; next } ; if(a[$1]) print a[$1] }' userpwinfo.txt userid.txt
nsgg319_YYM
nsgg317_CPA

How to send parameters on AWK to replace a parameter. Unix Korn Shell

I'm trying to substitute a placeholder in a message that awk looks up in a file.
Is this possible? This is what I have so far:
DisplayMessage()
{
##Parameter 1 = Message ID.
MessageFile="/dev/fs/C/Users/salasfri/Desktop/Messages.txt"
Message=$(awk '$1 ~ /^'$MessageID'$/ {$1=""; print $0}' $MessageFile)
}
The function looks for a line like this in the file "MessageFile":
0005 The file ${1} was not tranmitted.
It searches for 0005 and gets the message "The file ${1} was not tranmitted."
I want to replace ${1} with the name of the file.
Is this possible with awk? Any ideas?
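This should do it:
awk '$1~/^'$MessageID'$/ {$1=""; sub(/\$\{1\}/,FILENAME); print}' "$MessageFile"
but you may prefer
awk -v mid="${MessageID}" '$1==mid {$1=""; sub(/\$\{1\}/,FILENAME); print}' "$MessageFile"
since you're looking for an exact match, not a pattern match. It is also better to use awk variables than the quote dance. The braces are escaped because an unescaped {1} is a repetition count in awk implementations that support interval expressions, so the pattern would match only the $. Note that FILENAME is the file awk is currently reading (the message file here); to insert a different name, pass it in with another -v variable.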
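As a rough sketch, the lookup and the substitution can be combined so that the name to insert arrives as a second function argument. That second argument, the awk variable names mid, name and tok, and the temporary message file are all assumptions made for the illustration. The placeholder is located with index(), a literal search, so neither the braces nor an & in the name need escaping (backslashes in a -v value are still processed as escape sequences). Appending "" forces a string comparison; otherwise numeric-looking IDs such as 0005 and 5 would compare equal.

```shell
# Temporary stand-in for Messages.txt (second line copied from the question).
MessageFile=$(mktemp)
printf '%s\n' '0004 Transfer started.' \
              '0005 The file ${1} was not tranmitted.' > "$MessageFile"

# DisplayMessage ID NAME   (NAME is a hypothetical second parameter)
DisplayMessage() {
    awk -v mid="$1" -v name="$2" '
        ($1 "") == (mid "") {            # compare as strings, not numbers
            sub(/^[^ ]+ +/, "")          # drop the ID and the blanks after it
            tok = "${1}"
            i = index($0, tok)           # literal search, no regex involved
            if (i) $0 = substr($0, 1, i - 1) name substr($0, i + length(tok))
            print
        }' "$MessageFile"
}

Message=$(DisplayMessage 0005 report.csv)
printf '%s\n' "$Message"
```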

awk: assigning a shell variable in awk script

I have a situation in awk where I need to convert an input format into another format and later use the number of records processed separately. Is there any way I can use a shell variable to get the value of NR in the END section? Something like:
cat file1 | awk 'some processing END{SHELL_VARIABLE=NR}' > file2
Then later use SHELL_VARIABLE outside awk.
I do not want to process the file and then do a wc -l separately as the files are huge.
One way: Use the redirection inside your awk command and print your result in the END block. And use command substitution to read the result in a shell variable:
my_var=$(awk '{ some processing; print "your output" >> "file2" } END { print NR }' file1)
No subprocess can affect the parent's environment variables. What you can do is have awk write output to the file directly, then have it print the value you want to stdout and capture it. Or if you prefer, you could reverse that and have awk just print it to a file and read it back afterwards.
Incidentally, you have a UUOC (useless use of cat): awk can read file1 directly.
rows=$(awk '{ ...; print > "file2"} END {print NR}' file1)
Or
awk '... END{print NR > "rows"}' file1 >file2
rows=$(<rows)
rm rows
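A small runnable sketch of the first variant, with awk writing the converted records to a file itself and leaving only the record count on stdout. The sample input, the "::" to "," conversion and the awk variable name out are invented for the demo.

```shell
tmp=$(mktemp -d)
printf 'a::b\nc::d\ne::f\n' > "$tmp/file1"

rows=$(awk -F'::' -v out="$tmp/file2" '
    { print $1 "," $2 > out }     # converted records go to the file
    END { print NR }              # only the count reaches stdout
' "$tmp/file1")

echo "processed $rows records"
```

The input is read once, so no separate wc -l pass is needed.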

Bash - extract file name and extension from a string

Here is grep command:
grep "%SWFPATH%/plugins/" filename
And its output:
set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
set(hotspot[hs_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
url="%SWFPATH%/plugins/textfield.swf"
I'd like to generate a file containing the names of the all files in the 'plugins/' directory, that are mentioned in a certain file.
Basically I need to extract the file name and the extension from every line.
I can manage to delete any duplicates but I can't figure out how to extract the information that I need.
This would be the content of the file that I would like to get:
textfield.swf
scrollarea.swf
scrollarea.js
Thanks!!!
PS: The thread "Extract filename and extension in bash (14 answers)" explains how to get the filename and extension from a variable. What I'm trying to do is extract them from a file, which is completely different.
Using awk:
grep "%SWFPATH%/plugins/" filename | \
awk '{ match($0, /plugins\/([^\/[:space:]]+)\.([[:alnum:]]+)/,submatch);
print "filename:"submatch[1];
print "extension:"submatch[2];
}'
Some explanation:
The match function takes each line processed by awk ($0) and looks for a match to the regex. Submatches (the parts of the string that match the parenthesized parts of the regex) are saved in the array submatch; this three-argument form of match is a GNU awk (gawk) extension. print is as straightforward as it looks: it just prints its arguments.
For this specific problem
awk '/\/plugins\// {sub(/.*\//, ""); sub(/(\);|")?$/, "");
arr[$0] = $0} END {for (i in arr) print arr[i]}' filename
Use awk to simply extract the filename and then sed to clean up the trailing )"; characters.
awk -F/ '{print $NF}' a | sed -e 's/);//' -e 's/"$//'
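For awks without the three-argument match, a sketch using the standard two-argument form with RSTART and RLENGTH is shown below; it also drops duplicates while keeping first-seen order. It assumes file names contain only letters, digits, dots, underscores and hyphens, and the input is a trimmed copy of the sample lines plus one made-up unrelated line.

```shell
input=$(mktemp)
cat > "$input" <<'EOF'
set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
some unrelated line
EOF

names=$(awk '
    match($0, /\/plugins\/[A-Za-z0-9_.-]+/) {
        # Skip the 9 characters of "/plugins/" at the start of the match.
        name = substr($0, RSTART + 9, RLENGTH - 9)
        if (!seen[name]++) print name      # print each name only once
    }
' "$input")
printf '%s\n' "$names"
```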

Explode to Array

I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns I want and append them to a new file
It works but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array. Using command line arguments doesn't seem like the way to go. ANY COMMENTS ARE WELCOME.
# Takes :: separated file as 1st parameters
SOURCE=$1
# create csv target file
TARGET=${SOURCE/dat/csv}
touch $TARGET
# Quote the header: an unquoted # starts a comment, so nothing would be written
echo "#userId,itemId" > "$TARGET"
IFS=","
while read LINE
do
# Replaces all matches of :: with a ,
CSV_LINE=${LINE//::/,}
set -- $CSV_LINE
echo "$1,$2" >> $TARGET
done < $SOURCE
Instead of set, you can use an array:
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
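A self-contained bash sketch of the array approach that sets IFS only for the read itself, so the rest of the script keeps the default word splitting. The sample records, the array name fields and the temporary files are assumptions for the demo; read -a and the <<< here-string are bash features.

```shell
src=$(mktemp)
printf '1::10::5::978300760\n2::20::3::978302109\n' > "$src"
target="$src.csv"

echo '#userId,itemId' > "$target"
while IFS= read -r line; do
    # Turn every "::" into "," and split on "," into the array "fields".
    IFS=, read -r -a fields <<< "${line//::/,}"
    echo "${fields[0]},${fields[1]}" >> "$target"
done < "$src"

cat "$target"
```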
The following would print columns 1 and 2 from infile.dat. Replace $1, $2 with
a comma-separated list of the numbered columns you do want.
awk 'BEGIN { FS = "::"; OFS = "," } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a 1 liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of awk and sed:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, -v OFS=, '{print $1, $2}'
# Or, to avoid a UUOC award (and prolong the life of your keyboard by 3 characters):
sed -e 's/::/,/g' inputfile | awk -F, -v OFS=, '{print $1, $2}'
awk is indeed the right tool for the job here, it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$

Resources