Assign the value of an awk for-loop variable to a bash variable

Content of the tempfile:
123 sam moore IT_Team
235 Rob Xavir Management
What I'm trying to do is get input from the user, search for it in the tempfile, and have the search output the column number.
The code I have for that:
#!/bin/bash
set -x;
read -p "Enter :" sword6;
awk 'BEGIN{IGNORECASE = 1 }
{
for(i=1;i<=NF;i++) {
if( $i ~ "'$sword6'$" )
print i;
}
} ' /root/scripts/pscripts/tempprint.txt;
This gives exactly the column number.
Output
Enter : sam
2
What I need is for the value of the i variable to be assigned to a bash variable, so I can use it as needed in the script.
Any help with this is highly appreciated.
I searched for an existing answer but wasn't able to find one; if there is one, please let me know.

First of all, you should pass your shell variable (e.g. sword6) to awk this way:
awk -v word="$sword6" '{... if ($i ~ word) ...}' ...
To assign the output of another command to a shell variable:
shellVar=$(awk '......')
Then you can continue using $shellVar in your script.
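Putting the two together, a minimal sketch for this question (col is my name for the result variable, not from the original):
read -p "Enter :" sword6
col=$(awk -v word="$sword6" '
    { for (i=1; i<=NF; i++) if ($i ~ word "$") print i }
' /root/scripts/pscripts/tempprint.txt)
echo "Matched column: $col"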
Regarding your awk code:
if the user input contains special characters (e.g. .*), your script may fail;
if a column matches the user input multiple times, you may get duplicated output;
if your file has multiple columns matching the user input, you may want to handle that. A sketch addressing the first two points follows.
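For instance, a minimal sketch that sidesteps regex metacharacters by comparing fields literally (and case-insensitively), printing at most one column per line:
awk -v word="$sword6" '
{
    for (i=1; i<=NF; i++)
        if (tolower($i) == tolower(word)) {   # literal comparison, no regex involved
            print i
            next                              # at most one match per line
        }
}' /root/scripts/pscripts/tempprint.txt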

You just need to capture the output of awk. As an aside, I would pass sword6 as an awk variable, not inject it via string interpolation.
i=$(awk -v w="$sword6" '
BEGIN { IGNORECASE = 1 }
{ for (i=1;i<=NF;i++) {
if ($i ~ w"$") { print i; }
}
}' /root/scripts/pscripts/tempprint.txt)
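You can then branch on the result, e.g. (hypothetical follow-up):
if [ -n "$i" ]; then
    echo "found in column $i"
else
    echo "no match for $sword6" >&2
fi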

The following script may help you with this too.
cat script.ksh
echo "Please enter the user name:"
read var
awk -v val="$var" '{for(i=1;i<=NF;i++){if(tolower($i)==tolower(val)){print i,$i}}}' Input_file

If tempprint.txt is big, you can skip the field loop on lines that cannot match by testing the whole line first:
awk -v w="$sword6" '
BEGIN { IGNORECASE = 1 }
$0 ~ ("\\<" w "\\>") {
    for (i=1; i<=NF; i++)
        if ($i == w) print i
}' tempprint.txt
Note this relies on GNU awk: both IGNORECASE and the \< \> word-boundary operators are gawk extensions.

Related

How to process file content differently for each line using shell script?

I have a file which has this data -
view:
schema1.view1:/some-path/view1.sql
schema2.view2:/some-path/view2.sql
tables:
schema1.table1:/some-path/table1.sql
schema2.table2:/some-path/table2.sql
end:
I have to read the file and store the contents in different variables.
viewData=$(sed '/view/,/tables/!d;/tables/q' $file|sed '$d')
tableData=$(sed '/tables/,/end/!d;/end/q' $file|sed '$d')
echo $viewData
view:
schema1.view1:/some-path/view1.sql
schema2.view2:/some-path/view2.sql
echo $tableData
tables:
schema1.table1:/some-path/table1.sql
schema2.table2:/some-path/table2.sql
dataArray=("$viewData" "$tableData")
I need to use a for loop over dataArray so that I get all the components in 4 different variables.
Let's say for $viewData, the loop should be able to print like this -
objType=view
schema=schema1
view=view1
fileLoc=some-path/view1.sql
objType=view
schema=schema2
view=view2
fileLoc=some-path/view2.sql
I have tried sed and cut commands but that is not working properly. And I need to do this using shell script only.
Any help will be appreciated. Thanks!
Remark: if you added a space character between the : and / in the input, you would be able to use YAML-aware tools to parse it robustly.
Given your sample input, you can use this awk for generating the expected blocks:
awk '
match($0,/[^[:space:]]+:/) {
key = substr($0,RSTART,RLENGTH-1)   # text before the colon
val = substr($0,RSTART+RLENGTH)     # text after the colon
if (i = index(key,".")) {           # a schema.object key
print "objType=" type
print "schema=" substr(key,1,i-1)
print "view=" substr(key,i+1)
print "fileLoc=" val
printf "%c", 10                     # emit a blank line between blocks
} else                              # a section header like view: or tables:
type = key
}
' data.txt
Output:
objType=view
schema=schema1
view=view1
fileLoc=/some-path/view1.sql
objType=view
schema=schema2
view=view2
fileLoc=/some-path/view2.sql
objType=tables
schema=schema1
view=table1
fileLoc=/some-path/table1.sql
objType=tables
schema=schema2
view=table2
fileLoc=/some-path/table2.sql
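If you then need those key=value lines back in shell variables, a sketch (assuming bash for the process substitution, and that the values contain no further = characters) that reads the stream block by block:
while IFS='=' read -r name value; do
    case $name in
        objType) objType=$value ;;
        schema)  schema=$value ;;
        view)    view=$value ;;
        fileLoc) fileLoc=$value
                 echo "$objType: $schema.$view -> $fileLoc" ;;  # block is complete here
    esac
done < <(awk '...' data.txt)   # the awk script shown above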

Bash script does nothing when I run it, seems to keep waiting

I've written my first script, one in which I want to know if 2 files have the same values in a specific column.
Both files are WEKA machine-learning prediction outputs for different algorithms, hence they have to be in the same format, but the prediction column would be different.
Here's the code I've written based on the tutorial presented in https://linuxconfig.org/bash-scripting-tutorial-for-beginners:
#!/bin/bash
lineasdel1=$(wc -l $1 | awk '{print $1}')
lineasdel2=$(wc -l $2 | awk '{print $1}')
if [ "$lineasdel1" != "$lineasdel2" ]; then
echo "Files $1 and $2 have different number of lines, unable to perform"
exit 1
fi
function quitalineasraras {
awk '$1!="==="&&NF>0'
}
function acomodo {
awk '{gsub(/^ +| +$/, ""); gsub(/ +0/, " W 0"); gsub(/ +1$/, " W 1"); gsub(/ +/, "\t") gsub(/\+\tW/, "+"); print}'
}
function procesodel1 {
quitalineasraras "$1" | acomodo
}
function procesodel2 {
quitalineasraras "$2" | acomodo
}
el1procesado=$(procesodel1)
el2procesado=$(procesodel2)
function pegar {
paste <(echo "$el1procesado") <(echo "$el2procesado")
}
function contarintersec {
awk 'BEGIN {FS="\t"} $3==$8 {n++} END {print n}'
}
unido=$(pegar)
interseccion=$(contarintersec $unido)
echo "Estos 2 archivos tienen $interseccion coincidencias."
I ran the code of each function individually in the terminal and verified that it works (I'm using Linux Mint 19.2). The script's permissions have also been changed to make it executable. The paste command is also supposed to work with that variable syntax.
But when I run it via:
./script.sh file1 file2
if both files have the same number of lines, and I press enter, no output is obtained; instead, the terminal opens an empty line with cursor waiting for something. In order to write another command, I've got to press CTRL+C.
If both files have different number of lines the error message prints successfully, so I think the problem has something to do with the functions, with the fact that awk has different syntax for some chores, or with turning the output of functions into variables.
I know that I'm missing something, but can't come up with what could be.
Any help will be appreciated.
function quitalineasraras {
awk '$1!="==="&&NF>0'
}
function procesodel1 {
quitalineasraras "$1" | acomodo
}
el1procesado=$(procesodel1)
The positional parameters $1 are set for each function separately. The "$1" inside procesodel1 expands to empty, so quitalineasraras is passed one empty argument "".
The awk inside quitalineasraras is given only the script, without a filename, so it reads from standard input, i.e. it waits for input.
That awk, waiting on standard input, is what makes your script seem to hang.
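A minimal sketch of the fix is to pass the filenames through explicitly (procesodel2 analogous); note that contarintersec reads standard input too and has the same pitfall:
function quitalineasraras {
    awk '$1!="==="&&NF>0' "$1"          # awk now gets a real filename
}
function procesodel1 {
    quitalineasraras "$1" | acomodo
}
el1procesado=$(procesodel1 "$1")        # forward the script's first argument
el2procesado=$(procesodel2 "$2")
interseccion=$(pegar | contarintersec)  # feed contarintersec on stdin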

How to collate multiple files in AWK?

I am trying to collate a series of .csv log files that are named by date (e.g., 2019-02-24.csv). There are a bunch of them, so I'm trying to script the process. I've crafted an AWK script that combines individual files:
awk ' FNR==1 { while (/"_time",PIN,FULLNAME,OFFICE,Acronym,Name/) getline; } 1 { print } ' 2019-01-01.csv >> usage_history.csv
But I am failing when I try to string the AWK commands together with a control loop in BASH:
for i in {01..28}; do echo "awk ' FNR==1 { while (/\"_time\",PIN,FULLNAME,OFFCODE,Acronym,Name/) getline; } 1 { print } ' 2019-01-$i.csv >> user_history.csv"; done
When I run this, it prints out the correct commands to the command line, but the awk scripts are not executed (they only get printed). If I run it without echo, I get errors telling me that the file doesn't exist; though all files are present:
bash: awk ' FNR==1 { while (/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/) getline; } 1 { print } ' 2019-01-01.csv >> user_history.csv: No such file or directory
What am I missing in my loop?
Here is a condensed sample of the command and the error messages:
$ for i in {01..02}; do "awk ' FNR==1 { while (/\"_time\",PIN,FULLNAME,OFFCODE,Acronym,Name/) getline; } 1 { print } ' 2019-01-$i.csv >> user_history.csv"; done
bash: awk ' FNR==1 { while (/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/) getline; } 1 { print } ' 2019-01-01.csv >> user_history.csv: No such file or directory
bash: awk ' FNR==1 { while (/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/) getline; } 1 { print } ' 2019-01-02.csv >> user_history.csv: No such file or directory
Could you please try the following:
awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' 2019-01-[0-9]*.csv >> user_history.csv
Here are the points behind this approach:
1- Using a for loop and calling awk in it each time would be overkill. awk can read multiple files itself, so we should make use of that.
2- As for the getline part you tried in your code: if we want to skip lines containing a string, we can simply negate the match with !/string_to_be_skipped/, so awk prints only the lines NOT containing that string.
3- When passing multiple files to a single awk command I used 2019-01-[0-9]*.csv, because you haven't said whether a file is created for every day. If we enumerated the names in a loop style and a specific file were missing, we would get an error. For example, say I use the following awk command after intentionally removing the file named 2019-01-02.csv:
awk '........' 2019-01-{01..29}.csv
awk: cannot open 2019-01-02.csv (No such file or directory)
So to avoid that kind of situation I used 2019-01-[0-9]*.csv, which matches only files that actually exist, rather than running a loop that complains some file is missing.
Try this:
for i in {01..28}; do awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' 2019-01-$i.csv >>user_history.csv; done
The commands after do should not be quoted.
What you were doing essentially amounts to ignoring the header lines.
The {print} after 1 is unnecessary: a lone 1 already implies {print}, since the 1 merely provides a true condition.
When there's only an expression and no action block, the action defaults to {print}.
And a bare regexp is equivalent to $0 ~ /regex/; here I negated it.
If there's no other command inside the loop, you can replace the loop with a single awk command:
awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' 2019-01-{01..28}.csv >>user_history.csv
But this one will throw an error and stop executing when one of the files doesn't exist.
Another way is:
awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' 2019-01-[0-3][0-9].csv >>user_history.csv
This one matches only the filenames that exist, instead of looping over generated names.
It won't stop executing or throw an error, so if a file is missing you won't know, and it will match extra files if they exist; for example it will read 2019-01-34.csv if it exists.
So if you want the warnings (warnings won't affect the results) but don't want the command to stop, use the for loop version above; see the sketch after the pitfalls below.
Pitfalls:
[0-3][1-9] won't match 10, 20, or 30, but will match 32 to 39.
[0-9]* can match longer numbers too, and glob expansion uses string order, so e.g. 20 to 29 sort before 3.
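If you want the per-file warnings without aborting, a minimal sketch of such a loop (it only adds an existence check around the original command):
for i in {01..28}; do
    f=2019-01-$i.csv
    if [ -e "$f" ]; then
        awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' "$f" >> user_history.csv
    else
        echo "warning: $f is missing" >&2   # warn, but keep going
    fi
done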
Thanks to @Tiw and @RavinderSingh13 for their guidance. Here is the final awk script that works well for my case, where I have daily files from multiple days, months, and years (only 2018 and 2019 in this case):
awk '!/"_time",PIN,FULLNAME,OFFCODE,Acronym,Name/' 201[8-9]-[0-1][0-9]-[0-3][0-9].csv >> user_history.csv

Using a variable in an awk Bourne shell script

I'm trying to write a script that will read the country I type in and then display the people who are from that country. It works fine when I write the code with the country I want in, but when I try to use a variable nothing happens. I've been working on this for hours, so could someone please help me. This is what I have so far.
echo Please choose a country
echo
read country
awk -F: -v theCountry="$country" '
BEGIN {
recordCount = 0
printf "\t\tPlayers from that country\n\n"
print theCountry
}
/theCountry/{
print $1
recordCount++
}
END {
}' playerFile
Where I enter theCountry is where it's messing up.
Change:
/theCountry/{
to:
$0 ~ theCountry {
and read http://www.gnu.org/software/gawk/manual/gawk.html#Computed-Regexps
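Applied to your script, the relevant part becomes (a sketch of just the change):
awk -F: -v theCountry="$country" '
BEGIN { printf "\t\tPlayers from that country\n\n" }
$0 ~ theCountry { print $1 }   # dynamic regexp: matches against the variable contents
' playerFile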

AWK split for multiple delimiter lines

I'm trying to split a file using an AWK one-liner, but the code below that I came up with is not working properly.
awk '
BEGIN { idx=0; file="original_file.split." }
/^REC_DELIMITER.(HIGH|TOP)$/ { idx++ }
/^REC_DELIMITER.TOP$/,/^REC_DELIMITER.(HIGH|TOP)$/ { print > file sprintf("%03d", idx) }
' original_file
Test file is "original_file":
REC_DELIMITER.TOP
lineA1
lineA2
lineA3
REC_DELIMITER.HIGH
lineB1
lineB2
lineB3
REC_DELIMITER.TOP
lineC1
lineC2
lineC3
REC_DELIMITER.HIGH
lineD1
lineD2
lineD3
AWK code above is for REC_DELIMITER.TOP and it is giving me these files:
original_file.split.001:
REC_DELIMITER.TOP
original_file.split.003:
REC_DELIMITER.TOP
however, I'm trying to get this:
original_file.split.001:
REC_DELIMITER.TOP
lineA1
lineA2
lineA3
original_file.split.003:
REC_DELIMITER.TOP
lineC1
lineC2
lineC3
There will be other record delimiters, and when needed, we can run for them like REC_DELIMITER.HIGH, this way getting files like below:
original_file.split.002:
REC_DELIMITER.HIGH
lineB1
lineB2
lineB3
original_file.split.004:
REC_DELIMITER.HIGH
lineD1
lineD2
lineD3
Any help is very much appreciated; I have been trying to get this working for the past few days, and the AWK code above is the best I was able to come up with. I now need help from AWK masters. :)
Thank you!
You can try something like this:
awk '
/REC_DELIMITER\.TOP/ {
a=1                                          # entering a TOP block
b=0
file = sprintf (FILENAME".split.%03d",++n)   # new output file for this block
}
/REC_DELIMITER\.HIGH/ {
b=1                                          # entering a HIGH block
a=0
file = sprintf (FILENAME".split.%03d",++n)
}
a {
print $0 > file                              # line belongs to a TOP block
}
b {
print $0 > file                              # line belongs to a HIGH block
}' file
You need something like this (untested):
awk -v dtype="TOP" '
BEGIN { dbase = "^REC_DELIMITER\\."; delim = dbase dtype "$" }
$0 ~ dbase { inBlock=0 }
$0 ~ delim { inBlock=1; idx++ }
inBlock { print > sprintf("original_file.split.%03d", idx) }
' original_file
awk -vRS=REC_DELIMITER '/^.TOP\n/{print RS $0 > sprintf("original_file.split.%03d",n)};!++n' original_file
(Give or take an extra newline at the end.)
Generally, when input is supposed to be treated as a series of multi-line records with a special line as delimiter, the most direct approach is to set RS (and often ORS) to that delimiter.
Normally you'd want to add newlines to its beginning and/or end, but this case is a little special so it's easier without them.
Edited to add: You need GNU Awk for this. Standard Awk considers only the first character of RS.
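To illustrate the technique on this input, a small sketch that assumes GNU awk (which accepts a full regex as RS):
gawk '
BEGIN { RS = "REC_DELIMITER\\.[A-Z]+\n" }   # each delimiter line separates records
NR > 1 { printf "record %d:\n%s", NR-1, $0 }
' original_file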
I made some changes so the different delimiters go to their own files, even when they occur later in the file. Make a file like splitter.awk with the contents below, then chmod +x it and run it with ./splitter.awk original_file
#!/usr/bin/awk -f
BEGIN {
idx=0;
file="original_file.split.";
out=""
}
{
if($0 ~ /^REC_DELIMITER.(TOP|HIGH)/){
if (!cnt[$0]) {
cnt[$0] = ++idx;   # first sighting of this delimiter: assign it a new file index
}
out=cnt[$0];       # route subsequent lines to that delimiter's file
}
print > file sprintf("%03d", out)
}
I'm not very used to AWK; however, plasticide's answer pointed me in the right direction, and I finally got an AWK script working as required.
In the code below, the first if sets echo to 0 if any delimiter is found. The second if sets echo back to 1 if a wanted delimiter is found, so only the wanted blocks are split out of the file.
I know the regex could be something like /^REC_(DELIMITER\.(TOP|HIGH|LOW)|NO_CATEGORY)$/, but since the regex is created dynamically by a shell script that reads the list of delimiters from a specific file, it will look more like the AWK below.
awk 'BEGIN {
idx=0; echo=1; file="original_file.split."
}
{
#All the delimiters to consider in the given file
if($0 ~ /^(REC_DELIMITER.TOP|REC_DELIMITER.HIGH|REC_DELIMITER.LOW|REC_NO_CATEGORY)$/) {
echo=0
}
#Delimiters that should actually be pulled
if($0 ~ /^(REC_DELIMITER.HIGH|REC_DELIMITER.LOW)$/) {
idx++; echo=1
}
#Print to the file if the line belongs to a wanted delimiter block
if(echo) {
print > file idx
}
}' original_file
Thank you all. I really appreciate it very much.
