Bash can't write file from sh script

I have a bash script that basically does some math with awk and then exports the results to a file on the machine, but it doesn't create the file, or even modify the file if I create it myself. It works fine on my Mac, but I can't seem to get it working on Ubuntu. Below is my code.
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 >"/home/skyler/Documents/SkyMine/arp_nr.var" }'

Try separating the file writing from awk and capturing the result into a new file, for example with the tee command.
Option 1. Using tee command.
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 }' | tee "/home/skyler/Documents/SkyMine/arp_nr.var"
Option 2. Using the output (STDOUT) redirection
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 }' > "/home/skyler/Documents/SkyMine/arp_nr.var"
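If the file still isn't created with either option, a quick sanity check (just a sketch, reusing the directory path from the question) is to confirm the target directory exists and that awk actually prints anything at all:
dir="/home/skyler/Documents/SkyMine"
[ -d "$dir" ] || echo "directory $dir does not exist" >&2   # redirection cannot create a missing directory
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 }'   # prints nothing when a != b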
PS: You should post an example of your data file; it would make it easier to understand the problem and help you.

Related

Using awk with gedit external tools

I have used gedit on Ubuntu a lot and am now transitioning to macOS. I noticed that some of the plugins are missing from the macOS version. For instance, there is (AFAIK) no out-of-the-box option to comment/uncomment code. However, it is possible to define an external tool to do basically whatever you want.
I just want to comment the text selection in R/Python style (prepending a # to each line of the input). I went to Tools -> Manage External Tools and defined a "Comment Code" tool this way:
#!/bin/bash
awk '{print "#" $0}'
and set Input as "Current Selection" and output as "Replace Current Selection".
It works if you select some text; however, if you don't select anything, it stalls forever, since (from what I understand) awk is waiting for input.
How can I avoid this problem? Of course I don't need an awk solution specifically; whatever works is fine. I'm not much of an expert with shell tools like awk or sed, so very likely I'm missing something simple.
See https://www.gnu.org/software/gawk/manual/html_node/Read-Timeout.html for how to implement a read timeout with GNU awk.
With input from stdin (enter 7 then Enter/Return then Control-D):
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
7
#7
or with input from a pipe:
$ seq 2 | gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
#1
#2
or from a file:
$ seq 2 > file
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' file
#1
#2
but if you don't provide any input and wait 5 seconds:
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
gawk: cmd. line:1: fatal: error reading input file `-': Connection timed out
$
You can always redirect stderr the usual way if you don't want to see the timeout message:
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>/dev/null
$
but of course that would mask ALL error messages, so you might want to do this instead to mask just that one message:
{ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>&1 >&3 |
grep -v 'fatal: error reading input file'; } 3>&1
e.g.:
$ { gawk 'BEGIN{print "other error" >"/dev/stderr";
PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>&1 >&3 |
grep -v 'fatal: error reading input file'; } 3>&1
other error
$
Obviously change the 5000 to however many milliseconds you want the script to wait for input.
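Putting it together for the gedit external tool, a minimal sketch of the "Comment Code" script with a timeout (assuming GNU awk is available as gawk and that a 2-second wait is acceptable; both are assumptions, not part of the original tool definition):
#!/bin/bash
# Comment the current selection; give up after 2000 ms if gedit sends no input
gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=2000} {print "#" $0}' 2>/dev/null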

File name substitution using awk and for loop

Hi, I am trying to write dynamic filenames using variable substitution and I am unable to figure out what I am missing here.
for i in `cat justPid.csv`
do
awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > "$i"file.txt
done
I have also tried the version below and many other combinations, but it won't create multiple files named after $i.
for i in `cat justPid.csv`
do
awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > ${i}_file.txt
done
Any suggestions?
Edit:
My original intent is to split a 27 GB file into manageable chunks based on PID (an identifier in the file) so that it can be loaded into RStudio for analysis. I am working on my laptop and not on a server, hence the need to break it into small files.
Also, I am using the ("new") Ubuntu bash shell on Windows.
The smaller test files I am working on look like what Jithin has posted. I will try out the suggestions and will update this post!
$cat justPid.csv
aaaa
bbbb
cccc
$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
I am not quite sure this is what you are looking for, but here is an attempt.
input files
$cat justPid.csv
aaaa
bbbb
cccc
$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
script using for loop
for i in $(cat justPid.csv)
do
awk -v var=${i} -F, '$1==var' uniqPid.csv > ${i}_file.txt
done
script using while loop
while read -r i
do
awk -v var=${i} -F, '$1==var' uniqPid.csv > ${i}_file.txt
done < justPid.csv
Output
$ cat aaaa_file.txt
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
$ cat bbbb_file.txt
bbbb,1234567890
$ cat cccc_file.txt
cccc,1234567890
Note: it is not advised to use a for loop to read lines; see Use a while loop and the read command, and Don't Read Lines With For.
Without sample input/output it's just an untested guess, but I THINK all you need is either:
awk -F, '{print > ($1"_file.txt")}' uniqPid.csv
or maybe:
awk -F, 'NR==FNR{a[$1];next} $1 in a{print > ($1"_file.txt")}' justPid.csv uniqPid.csv
So far I don't see any reason for a loop at all. You might need to close the output files as you go but we can address that if/when you provide sample input/output and tell us whether or not you have GNU awk.
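If the 27 GB file contains many distinct PIDs, a plain (non-GNU) awk may also hit the limit on simultaneously open files; a hedged variant of the first one-liner that appends to and closes each output file as it goes (slower, but it keeps only one file open at a time):
awk -F, '{out = $1 "_file.txt"; print >> out; close(out)}' uniqPid.csv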

Multiple argument bash script for awk processing

Is there a way to process 2 different files passed as arguments to a bash script which uses awk?
Script signature:
./statistical_sig.sh path_to_reviews_folder hotel_1 hotel_2
I tried the following but only the first argument got processed.
hotel1="$2";
hotel2="$3";
dos2unix -U $hotel1 | dos2unix -U $hotel2 | echo "$hotel1" "$hotel2" | xargs | awk -v hotel1="$hotel1" -v hotel2="$hotel2" { .. code ..}
You don't need all these pipes to run awk.
Either use something like this, if you plan to read some other files with awk and use hotel1 and hotel2 somehow inside your awk code:
awk -v hotel1="$(dos2unix -U "$hotel1")" -v hotel2="$(dos2unix -U "$hotel2")" '{ awk code ..}' file1 file2
Or use this, if you plan to read and process the contents of the files hotel1 and hotel2:
awk '{ awk code ..}' <(dos2unix -U "$hotel1") <(dos2unix -U "$hotel2")
Alternatively, you can modify your code like this, but it is less efficient:
hotel1=$(dos2unix "$hotel1") && hotel2=$(dos2unix "$hotel2") && echo "$hotel1 $hotel2" | awk '{your code here}'
If you explain your question better, showing the awk code and what you are trying to achieve, you will get better advice.
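For illustration only, a sketch of the second form (it keeps the answer's dos2unix -U calls as-is; the per-hotel line counting is a made-up placeholder for the real awk code):
#!/bin/bash
hotel1="$2"
hotel2="$3"
# FNR==NR is true only while the first input is being read, so the two files can be told apart
awk -F',' 'FNR==NR {n1++; next} {n2++} END {print "hotel1:", n1+0, "hotel2:", n2+0}' \
    <(dos2unix -U "$hotel1") <(dos2unix -U "$hotel2")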

Self-contained awk script: Saving to file, calling file

For a lab, I wrote a shell script that used awk to do some stuff. Rereading the lab's directions, it seems that I was supposed to write a self-contained awk script. I'm working on translating my bash script into awk, and I'm having a problem right now:
I want to save the output of an awk command to a new file, and then I want to use that output as input for another awk command.
In my bash script, I have this:
awk '/Blocked SPAM/' maillog > spamlog
cat spamlog | awk '{print $0}' RS=' '
It takes all the lines from maillog that contain the string "Blocked SPAM" and saves them to a new file titled spamlog. Then it opens spamlog and replaces every space character with a newline.
For my awk script, maillog is the file that is passed to the script from shell. My attempt at writing analogous code:
/Blocked SPAM/ > spamlog
-f spamlog {print $0} RS=' '
I don't really know what I'm doing with my awk script since I'm having trouble finding useful resources for self-contained awk scripts.
awk '/Blocked SPAM/{ print > "spamlog"; gsub( " ","\n"); print }' maillog
Personally, I prefer to invoke that directly from a shell script, but you can easily make it an awk script by writing:
#!/usr/bin/awk -f
/Blocked SPAM/{ print > "spamlog"; gsub( " ","\n"); print }
Invoke that script with 'maillog' as an argument.
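A hypothetical invocation, assuming the script is saved as spamsplit.awk and made executable (the filename is invented for the example):
$ chmod +x spamsplit.awk
$ ./spamsplit.awk maillog
The matching lines are written to the file spamlog, and the space-split copy is printed to stdout, so you can redirect it wherever you need.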

AWK: redirecting script output from script to another file with dynamic name

I know I can redirect awk's print output to another file from within a script, like this:
awk '{print $0 >> "anotherfile" }' 2procfile
(I know that's a dummy example, but it's just an example...)
But what I need is to redirect the output to another file which has a dynamic name, like this:
awk -v MYVAR="somedinamicdata" '{print $0 >> "MYVAR-SomeStaticText" }' 2procfile
And the output should be redirected to somedinamicdata-SomeStaticText.
I know I can do it via:
awk '{print $0 }' 2procfile >> "$MYVAR-somedinamicdata"
But the problem is that it's a bigger awk script, and I have to output to several files depending on certain conditions (and this awk script is called from another bash script, which passes some dynamic variables via the -v switch... and so on).
Is it possible anyhow?
Thanks in advance.
I think
awk -v MYVAR="somedinamicdata" '{print $0 >> (MYVAR "-SomeStaticText") }' 2procfile
should do it. String concatenation in awk is done by simply writing one expression after another.
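Since the question mentions writing to several files depending on conditions, here is a small sketch of the same concatenation idea with branches; the field number and thresholds are invented purely for illustration:
awk -v prefix="somedinamicdata" '
    $3 >  100 { print > (prefix "-big.txt") }     # hypothetical condition
    $3 <= 100 { print > (prefix "-small.txt") }
' 2procfile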
