Using awk with gedit external tools - macOS

I have used gedit a lot on Ubuntu and am now transitioning to macOS. I noticed that some of the plugins are missing in the macOS version. For instance, there is (AFAIK) no out-of-the-box option to comment/uncomment code. However, it is possible to define an external tool to do basically whatever you want.
I just want to comment the text selection in R/Python style (prepending a # to each line of the input). I went to Tools -> Manage External Tools and defined a "Comment Code" tool this way:
#!/bin/bash
awk '{print "#" $0}'
and set Input to "Current Selection" and Output to "Replace Current Selection".
It works if you select some text; however, if you don't select anything, it stalls forever, since (from what I understand) awk is waiting for input.
How can I avoid this problem? Of course it doesn't have to be an awk (or similar) solution; whatever works is fine. I'm not much of an expert with shell tools like awk or sed, so very likely I'm missing something very simple.

See https://www.gnu.org/software/gawk/manual/html_node/Read-Timeout.html for how to implement a read timeout with GNU awk.
With input from stdin (enter 7 then Enter/Return then Control-D):
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
7
#7
or with input from a pipe:
$ seq 2 | gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
#1
#2
or from a file:
$ seq 2 > file
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' file
#1
#2
but if you don't provide any input and wait 5 seconds:
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}'
gawk: cmd. line:1: fatal: error reading input file `-': Connection timed out
$
You can always redirect stderr the usual way if you don't want to see the timeout message:
$ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>/dev/null
$
but of course that would mask ALL error messages, so you might want to do this instead to mask just that one message:
{ gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>&1 >&3 |
grep -v 'fatal: error reading input file'; } 3>&1
e.g.:
$ { gawk 'BEGIN{print "other error" >"/dev/stderr";
PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>&1 >&3 |
grep -v 'fatal: error reading input file'; } 3>&1
other error
$
Obviously change the 5000 to however many milliseconds you want the script to wait for input.
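Putting that together, the whole "Comment Code" tool body could look something like this sketch, assuming GNU awk is installed on the Mac (e.g. via Homebrew) and is on the PATH that gedit uses:
#!/bin/bash
# Comment each line of the selection; give up after 5 seconds if there is no input.
gawk 'BEGIN{PROCINFO["-", "READ_TIMEOUT"]=5000} {print "#" $0}' 2>/dev/null
With nothing selected the tool now exits after the timeout instead of stalling, and the 2>/dev/null hides the timeout message as discussed above (at the cost of hiding other errors too).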

Related

Bash can't write file from sh script

I have a bash script that basically does some math with awk and then exports the results to a file on the machine, but it doesn't create the file, or even modify the file if I create it myself. It works fine on my Mac, but I can't seem to get it working on Ubuntu. Below is my code.
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 >"/home/skyler/Documents/SkyMine/arp_nr.var" }'
Try moving the file writing out of the awk script and sending the result to the new file with the tee command or a shell redirection instead.
Option 1. Using the tee command.
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 }' | tee "/home/skyler/Documents/SkyMine/arp_nr.var"
Option 2. Using the output (STDOUT) redirection
awk -v a="$topnum" -v b="${allArray[2]}" 'BEGIN { if (a==b) print 2 }' > "/home/skyler/Documents/SkyMine/arp_nr.var"
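If the file still does not show up with either option, it might also be worth checking (just a guess; nothing in the question confirms this) that the target directory exists and is writable before the awk command runs, e.g.:
# hypothetical pre-flight check; adjust the path to match your setup
dir="/home/skyler/Documents/SkyMine"
[ -d "$dir" ] && [ -w "$dir" ] || echo "cannot write to $dir" >&2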
PS: You should post an example of your data file; it would make it easier to understand the problem and help you.

File name substitution using awk and for loop

Hi, I am trying to write dynamic filenames using variable substitution and I am unable to figure out what I am missing here.
for i in `cat justPid.csv`
do
awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > "$i"file.txt
done
I have also tried the version below and many other combinations, but it won't print multiple file names based on $i.
for i in `cat justPid.csv`
do
awk -v var="$i" -F"," '{if ($1==var) {print $0 }}' uniqPid.csv > ${i}_file.txt
done
Any suggestions?
Edit:
My original intent is to split a 27 GB file into manageable chunks based on PID (an identifier in the file) so that it can be loaded into RStudio for analysis. I am working on my laptop and not on a server, hence the need to break it into small files.
Also, I am using the ("new") Ubuntu bash shell on Windows.
The smaller test files I am working on look like what Jithin has posted. I will try out the suggestions and update this post!
$cat justPid.csv
aaaa
bbbb
cccc
$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
I am not quite sure whether this is what you are looking for, but here is an attempt.
input files
$cat justPid.csv
aaaa
bbbb
cccc
$cat uniqPid.csv
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
bbbb,1234567890
cccc,1234567890
dddd,cccccccccc
ffff,1234567890
script using for loop
for i in $(cat justPid.csv)
do
awk -v var=${i} -F, '$1==var' uniqPid.csv > ${i}_file.txt
done
script using while loop
while read -r i
do
awk -v var=${i} -F, '$1==var' uniqPid.csv > ${i}_file.txt
done < justPid.csv
Output
$ cat aaaa_file.txt
aaaa,1234567890
aaaa,aaaaaaaaaa
aaaa,bbbbbbbbbb
$ cat bbbb_file.txt
bbbb,1234567890
$ cat cccc_file.txt
cccc,1234567890
Note: it is not advised to read lines with a for loop; see Use a while loop and the read command, and Don't Read Lines With For.
Without sample input/output it's just an untested guess, but I THINK all you need is either:
awk -F, '{print > ($1"_file.txt")}' uniqPid.csv
or maybe:
awk -F, 'NR==FNR{a[$1];next} $1 in a{print > ($1"_file.txt")}' justPid.csv uniqPid.csv
So far I don't see any reason for a loop at all. You might need to close the output files as you go but we can address that if/when you provide sample input/output and tell us whether or not you have GNU awk.
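If you do hit a "too many open files" error with a non-GNU awk, a minimal sketch that closes each output file after writing (at the cost of reopening it in append mode for every line) would be:
# ">>" appends, so delete any output files left over from earlier runs first
awk -F, '{print >> ($1"_file.txt"); close($1"_file.txt")}' uniqPid.csv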

Awk double-slash record separator

I am trying to separate the records of a file based on the string "//".
What I've tried is:
awk -v RS="//" '{ print "******************************************\n\n"$0 }' myFile.gb
Where the "******" etc. is just a trace to show me that the record has been split.
However, the file also contains single / characters (by themselves), and my ****** trace is being printed at those as well, meaning that awk is also interpreting them as my record separator.
How can I get awk to split records only on //?
UPDATE: I am running on Unix (the one that comes with OS X)
I found a temporary solution:
sed s/"\/\/"/"*"/g | awk -v RS="*" ...
But there must be a better way, especially with the massive files I am working with.
On a Mac, awk version 20070501 does not support multi-character RS. Here's an illustration using such an awk, and a comparison (on the same machine) with gawk:
$ /usr/bin/awk --version
awk version 20070501
$ /usr/bin/awk -v RS="//" '{print NR ":" $0}' <<< x//y//z
1:x
2:
3:y
4:
5:z
$ gawk -v RS="//" '{print NR ":" $0}' <<< x//y//z
1:x
2:y
3:z
If you cannot find a suitable awk, then pick a better character than *. For example, if tabs are acceptable, and if your shell supports $'...', then you could use this incantation of sed:
sed $'s,//,\t,g'
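For example, the whole pipeline from the question might then look like this sketch (assuming a literal tab never occurs in myFile.gb):
sed $'s,//,\t,g' myFile.gb | awk -v RS=$'\t' '{ print "******************************************\n\n" $0 }'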

Unix Output of command to text file

I'm reading from a file called IMSI.txt using the following command:
$awk 'NR>2' IMSI.txt | awk '{print $NF}'
I need the output of this command to go to a new file called NEW.txt
So I did this:
$awk 'NR>2' IMSI.txt | awk '{print $NF}' > NEW.txt
This worked fine, but when I open the file, all the output from the command is on the same line.
The newlines are being neglected.
As an example, if I get this output in the console:
222
111
333
I open the text file and I get:
222111333
How can I fix that?
Thank you for your help :)
PS: I am using Cygwin on Windows
I am guessing your (Windows-y) editor would like to see carriage return + linefeed at the end of lines, not just the linefeed that awk outputs. Change your print to this:
print $NF "\r"
so it looks like this altogether:
awk 'NR>2 {print $NF "\r"}' IMSI.txt
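With the redirection from the question added back, the complete command would presumably be:
awk 'NR>2 {print $NF "\r"}' IMSI.txt > NEW.txt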
Simply set your ORS to "\r\n", which makes awk generate DOS line endings for every output record. I believe this is the most natural solution:
awk -v ORS="\r\n" '{print $NF}' > NEW.txt
Tested on a virtual XP system with Cygwin.
From Awk's manual:
ORS The output record separator, by default a newline.
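Combined with the NR>2 filter and the input file from the question, the full command would presumably be:
awk -v ORS="\r\n" 'NR>2 {print $NF}' IMSI.txt > NEW.txt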

tail -f, awk and output to file >

I am attempting to filter a log file and am running into issues. What I have so far is the following, which does not work:
tail -f /var/log/squid/accesscustom.log | awk '/username/;/user-name/ {print $1; fflush("")}' | awk '!x[$0]++' > /var/log/squid/accesscustom-filtered.log
The goal is to take a file that contains
ipaddress1 username
ipaddress7
ipaddress2 user-name
ipaddress1 username
ipaddress5
ipaddress3 username
ipaddress4 user-name
and save to accesscustom-filtered.log
ipaddress1
ipaddress2
ipaddress3
ipaddress4
It works without the output redirection to accesscustom-filtered.log, but something in the > isn't working right and the file ends up empty.
Edit: Changed the original example to be correct
Use tee:
tail -f /var/log/squid/accesscustom.log | awk '/username|user-name/ {print $1}' | tee /var/log/squid/accesscustom-filtered.log
See also: Writing “tail -f” output to another file and Turn off buffering in pipe
Note: awk doesn't buffer like grep in the superuser example, so you shouldn't need to do anything special with your awk command. (more info)
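If you also want the duplicate suppression from the original attempt, a sketch of the full pipeline (keeping fflush() so each match reaches the file immediately while tailing) could be:
tail -f /var/log/squid/accesscustom.log |
  awk '/username|user-name/ {print $1; fflush()}' |
  awk '!x[$0]++ {print; fflush()}' |
  tee /var/log/squid/accesscustom-filtered.log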
