How do I fix these errors in my awk programs - bash

I have this awk code:
BEGIN {
valid_name[0] = "CEO"
valid_name[1] = "maffu"
valid_name[2] = "gerry"
valid_name[3] = "bob"
valid_name[4] = "cath"
valid_name[5] = "tom.tom.the.son.of.the.piper"
valid_name[6] = "Insuro_Corp"
valid_name[7] = "who-pays-the-piper"
valid_name[8] = "a_hat_at_a_party"
valid_name[9] = "do_dot_the_eyes"
valid_name[10]= "Kim_dot_COM"
valid_domain[0] = "InsuroCorp"
valid_domain[1] = "cs.otago"
valid_domain[2] = "gmail"
valid_domain[3] = "enron"
valid_domain[4] = "research.techies"
valid_domain[5] = "1st.national"
valid_extension[0] = "co.nz"
valid_extension[1] = "com.au"
valid_extension[2] = "co.uk"
valid_extension[3] = "co.us"
valid_extension[4] = "co.ca"
valid_extension[5] = "com"
valid_numeric[0] = "[139.80.81.50]"
valid_numeric[1] = "[127.0.0.0]"
valid_numeric[2] = "[139.80.32.68]"
valid_numeric[3] = "[255.255.25.255]"
invalid_name[0] = "-foo"
invalid_name[1] = "f--d"
invalid_name[2] = "_at_"
invalid_name[3] = "Top$"
invalid_name[4] = "tom/tom"
invalid_name[5] = ".com.au"
invalid_name[6] = "white space"
invalid_name[7] = " white-space"
invalid_domain[0] = "Insuro-Corp"
invalid_domain[1] = "cs_otago"
invalid_domain[3] = "100%"
invalid_domain[4] = "AT&T"
invalid_extension[0] = "ac.nz"
invalid_extension[1] = "edu.au"
invalid_extension[2] = "tv"
invalid_extension[3] = "com.us"
invalid_extension[4] = "edu"
invalid_numeric[0] = "139.80.81.50"
invalid_numeric[1] = "[1..2]"
invalid_numeric[2] = "[139-80-81-50]"
invalid_numeric[3] = "[1][2][3]"
}
function generate_invalid_e_mail_address() {
at = rand() < 0.3 ? "_at_" : rand() < 0.1 ? "" : "#"
dot = dot = rand() < 0.3 ? "_dot_" : rand() < 0.1 ? "" : "."
if (rand() < 0.5) {
name = valid_name[int(rand()*11)]
if (rand() < 0.3) {
numeric = invalid_numeric[int(rand()*4)]
print name at numeric
} else {
if (rand() < 0.5) {
domain = valid_domain[int(rand()*6)]
extension = invalid_extension[int(rand()*5)]
} else {
domain = invalid_domain[int(rand()*5)]
extension = valid_extension[int(rand()*6)]
}
print name at domain dot extension
}
} else {
name = invalid_name[int(rand()*8)]
if (rand() < 0.3) {
numeric = valid_numeric[int(rand()*4)]
print name at numeric
} else {
domain = valid_domain[int(rand()*5)]
extension = valid_extension[int(rand()*6)]
print name at domain dot extension
}
}
}
BEGIN {
print "maffu#cs.otago.ac.nz"
print "bob.gmail.com"
for (i = 0; i < 518; i++) generate_invalid_e_mail_address()
}
This program should generate email address test cases and put
them in a file called 'bad.data' with the command:
awk -f bad.awk >bad.data
Instead bad.data is created as an empty file because of these
errors:
awk: bad.awk:70: extension = invalid_extension[int(rand()*5)]
awk: bad.awk:70: ^ syntax error
awk: bad.awk:73: extension = valid_extension[int(rand()*6)]
awk: bad.awk:73: ^ syntax error
awk: bad.awk:76: print name at domain dot extension
awk: bad.awk:76: ^ unexpected newline or end of string
awk: bad.awk:84: extension = valid_extension[int(rand()*6)]
awk: bad.awk:84: ^ syntax error
awk: bad.awk:86: print name at domain dot extension
awk: bad.awk:86: ^ unexpected newline or end of string
This is the first awk code I have seen. How do I fix it?

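It looks like extension is a reserved word in your awk (gawk, for one, has used it as the name of a built-in function), so it cannot be used as a variable name. Rename the variable everywhere it appears, for example to extension1: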
extension1 = invalid_extension[int(rand()*5)];
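A minimal sketch of the rename, assuming the only problem is the variable name. The function name pick_extension is hypothetical, and the random index is replaced by a fixed one so the output is predictable:

```shell
# Sketch only: pick_extension is a hypothetical wrapper for illustration.
pick_extension() {
    awk 'BEGIN {
        invalid_extension[0] = "ac.nz"
        domain = "cs.otago"
        dot = "."
        # was: extension = invalid_extension[...]
        extension1 = invalid_extension[0]
        print domain dot extension1
    }'
}

pick_extension
```

The same rename would be applied to every line of bad.awk that assigns or prints the variable.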

Related

do shell script with awk fails in AppleScript, but same awk command works in Terminal

Making a shell script work in AppleScript.
The following works in Terminal:
awk -F\" '/kMDItemTextContent/{print $2}' /Users/john/Desktop/PDFTags/mdimport.txt
But that same line, with do shell script, throws an error in Script Editor.
do shell script "awk -F\" '/kMDItemTextContent/{print $2}' /Users/john/Desktop/PDFTags/mdimport.txt"
This is from the Replies in Script Editor:
tell current application do shell script "awk -F\" '/kMDItemTextContent/{print $2}' /Users/john/Desktop/PDFTags/mdimport.txt"
--> error "sh: -c: line 0: unexpected EOF while looking for matching `\"' sh: -c: line 1: syntax error: unexpected end of file" number 2
For reference, here are the contents of mdimport.txt:
Imported '/Users/John/Desktop/PDFTags/Untitled.txt' of type 'public.plain-text' with plugIn /System/Library/Spotlight/RichText.mdimporter. 32 attributes returned {
":EA:_kMDItemUserTags" = (
);
":EA:kMDItemLastUsedDate" = "2020-11-07 08:17:13 +0000";
":MD:DeviceId" = 16777227;
":MD:kMDItemPath" = "/Users/john/Desktop/PDFTags/Untitled.txt";
"_kMDItemContentChangeDate" = "2020-11-05 23:24:57 +0000";
"_kMDItemCreationDate" = "2020-11-05 22:34:36 +0000";
"_kMDItemCreatorCode" = 0;
"_kMDItemDisplayNameWithExtensions" = {
"" = "Untitled.txt";
};
"_kMDItemFileName" = "Untitled.txt";
"_kMDItemFinderFlags" = 16;
"_kMDItemFinderLabel" = 0;
"_kMDItemFromImporter" = 1;
"_kMDItemIsExtensionHidden" = 1;
"_kMDItemIsFromImporter" = 1;
"_kMDItemLocked" = 1;
"_kMDItemOwnerGroupID" = 20;
"_kMDItemOwnerUserID" = 501;
"_kMDItemTextEncodingHint" = 134217984;
"_kMDItemTypeCode" = 0;
"com_apple_metadata_modtime" = "626311497.290109";
kMDItemAlternateNames = (
"Untitled.txt"
);
kMDItemContentCreationDate = "2020-11-05 22:34:36 +0000";
kMDItemContentModificationDate = "2020-11-05 23:24:57 +0000";
kMDItemContentType = "public.plain-text";
kMDItemContentTypeTree = (
"public.plain-text",
"public.text",
"public.data",
"public.item",
"public.content"
);
kMDItemDateAdded = "2020-11-07 08:56:45 +0000";
kMDItemDisplayName = {
"" = Untitled;
};
kMDItemDocumentIdentifier = 27837;
kMDItemKind = {
"" = NSStringPboardType;
ar = "\U0645\U0633\U062a\U0646\U062f \U0646\U0635\U064a \U0639\U0627\U062f\U064a";
ca = "Document de text sense format";
cs = "Prost\U00fd textov\U00fd dokument";
da = "Alm. tekstdokument";
de = "Reines Textdokument";
el = "\U0388\U03b3\U03b3\U03c1\U03b1\U03c6\U03bf \U03b1\U03c0\U03bb\U03bf\U03cd \U03ba\U03b5\U03b9\U03bc\U03ad\U03bd\U03bf\U03c5";
en = "Plain Text Document";
"en_AU" = "Plain Text Document";
"en_GB" = "Plain Text Document";
es = "Documento de texto sin formato";
"es_419" = "Documento de texto sin formato";
fi = "Pelkk\U00e4 teksti -dokumentti";
fr = "Document format texte";
"fr_CA" = "Document format texte";
he = "\U05de\U05e1\U05de\U05da \U05de\U05dc\U05dc \U05e4\U05e9\U05d5\U05d8";
hi = "\U092a\U094d\U0932\U0947\U0928 \U091f\U0947\U0915\U094d\U0938\U094d\U091f \U0926\U0938\U094d\U0924\U093e\U0935\U0947\U091c\U093c";
hr = "Dokument obi\U010dnog teksta";
hu = "Sima sz\U00f6veges dokumentum";
id = "Dokumen Teks Biasa";
it = "Documento di solo testo";
ja = "\U6a19\U6e96\U30c6\U30ad\U30b9\U30c8\U66f8\U985e";
ko = "\Uc77c\Ubc18 \Ud14d\Uc2a4\Ud2b8 \Ubb38\Uc11c";
ms = "Dokumen Teks Biasa";
nl = "Platte-tekstdocument";
no = "Dokument med ren tekst";
pl = "dokument tekstowy (zwyk\U0142y)";
pt = "Documento de Texto Simples";
"pt_PT" = "Documento de texto simples";
ro = "Document text simplu";
ru = "\U0414\U043e\U043a\U0443\U043c\U0435\U043d\U0442 \U043f\U0440\U043e\U0441\U0442\U043e\U0433\U043e \U0442\U0435\U043a\U0441\U0442\U0430";
sk = "Dokument s\U00a0oby\U010dajn\U00fdm textom";
sv = "Rent textdokument";
th = "\U0e40\U0e2d\U0e01\U0e2a\U0e32\U0e23\U0e02\U0e49\U0e2d\U0e04\U0e27\U0e32\U0e21\U0e18\U0e23\U0e23\U0e21\U0e14\U0e32";
tr = "D\U00fcz Metin Belgesi";
uk = "\U0414\U043e\U043a\U0443\U043c\U0435\U043d\U0442 \U043f\U0440\U043e\U0441\U0442\U043e\U0433\U043e \U0442\U0435\U043a\U0441\U0442\U0443";
vi = "T\U00e0i li\U1ec7u v\U0103n b\U1ea3n thu\U1ea7n t\U00fay";
"zh_CN" = "\U7eaf\U6587\U672c\U6587\U7a3f";
"zh_HK" = "\U7d14\U6587\U5b57\U6587\U4ef6";
"zh_TW" = "\U7d14\U6587\U5b57\U6587\U4ef6";
};
kMDItemLogicalSize = 345;
kMDItemPhysicalSize = 4096;
kMDItemTextContent = "Here are the text file contents that was used to test the routine.\n\nHash Tag Test Document\n\n#HashTag1 this is the first hash tag.\n#HashTag2 this is the second hash tag.\n\nThe following hash tag is inside and at the end of a paragraph:
#HashTag3\n\nThe next hash tag #HashTag4 is in the middle of a paragraph.\n\n#HashTag5\n#asdfasdfasdfasdfasfdasdfasd"; }
Looking at a portion of the error message:
unexpected EOF while looking for matching `\"'
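It is referring to the field separator assigned by -F. Inside the AppleScript string, \" turns into a bare " by the time the shell sees the command, so sh reads -F" as the start of a double-quoted string that never ends.
In this case you need to both single-quote the double quote (for the shell) and escape it (for AppleScript):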
do shell script "awk -F'\"' '/kMDItemTextContent/{print $2}' /Users/john/Desktop/PDFTags/mdimport.txt"
Result:
"Here are the text file contents that was used to test the
routine.\n\nHash Tag Test Document\n\n#HashTag1 this is the first
hash tag.\n#HashTag2 this is the second hash tag.\n\nThe following
hash tag is inside and at the end of a paragraph:"
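To check the quoting outside AppleScript, here is a small sketch of the command as the shell ends up seeing it. The function name extract_text_content and the one-line sample file are assumptions made for illustration; the real mdimport.txt is much longer:

```shell
# Sketch only: extract_text_content is a hypothetical helper name.
extract_text_content() {
    # -F'"' makes a single double-quote character the field separator.
    awk -F'"' '/kMDItemTextContent/{print $2}' "$1"
}

# Hypothetical one-line stand-in for mdimport.txt.
sample=$(mktemp)
printf '%s\n' 'kMDItemTextContent = "Hash Tag Test Document";' > "$sample"
extract_text_content "$sample"
rm -f "$sample"
```

Run from a terminal, this should print the text between the first pair of double quotes on the matching line.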

Split and format text output with separators Xamarin Forms

I have an entry and label I want to format my text to my label like this:
"email#gmail.com", "email2#gmail.com", "email3#gmail.com"
this is what I enter in my entry field:
email#gmail.com /space/ email2#gmail.com /space/ email3#gmail.com or
email#gmail.com,email2#gmail.com,email3#gmail.com
The separator is a space or comma. How can I format my output to the one above?
Good question!
string entry = Entry.Text;
// Split on whichever separator the input uses (ToList needs "using System.Linq;").
List<string> arrayfromEntry = new List<string>();
if (entry.Contains(" ")){
arrayfromEntry = entry.Split(new char[] { ' ' }).ToList();
}
else{
arrayfromEntry = entry.Split(new char[] { ',' }).ToList();
}
// Wrap each address in double quotes.
for (int i = 0; i < arrayfromEntry.Count; i++){
arrayfromEntry[i] = "\"" + arrayfromEntry[i] + "\"";
}
// string.Join puts the separator only between items, so nothing has to be trimmed afterwards.
string f = string.Join(", ", arrayfromEntry);
textToLabel = f;
Where Entry.Text is the text from your entry and textToLabel changes the text of your label, this should work.
Based on @jamesfdearborn's answer, but using StringBuilder instead:
string entry = "aaaa#ttttt.com,bbbb#ttttttyyy.com,tttt#errrer.com,yyyyyy#rrrttr.com,uuuuu#yuyuy.com";
var inputSeparator = ','; // comma is the separator in this case; you can change it
var outputSeparator = ',';
var arrayfromEntry = entry.Split(inputSeparator).ToList();
var sb = new StringBuilder();
for (int i = 0; i < arrayfromEntry.Count; i++)
{
sb.AppendFormat("\"{0}\"{1}", arrayfromEntry[i], outputSeparator);
}
sb.Remove(sb.Length - 1, 1); // drop the trailing separator
var result = sb.ToString(); // result here
//output
//"aaaa#ttttt.com","bbbb#ttttttyyy.com","tttt#errrer.com","yyyyyy#rrrttr.com","uuuuu#yuyuy.com"
You can change the input or the output separator.

Having an issue with YACC grammar

The code I wrote seems to not be able to detect a function. I tried many edits but nothing seems to be working.
program : function-decl | decl | function-def
;
decl : kind var-list SEMICOLON
{
tok_type = "variable";
}
;
kind : KW_INT {integer = true; floatType = false;}
| KW_FLOAT {integer = false; floatType = true;}
;
var-list : ID varmany
{
tok_type = "variable";
t.check_token (tok_type, $1, line_no, bodyCheck, parameter);
}
;
varmany : /*empty*/ | varmany COMMA ID
{
tok_type = "variable";
t.check_token (tok_type, $3, line_no, bodyCheck, parameter );
}
;
function-decl : kind ID LPAR kind RPAR SEMICOLON
{
current_func = $2;
declaration = true;
parameter= false;
tok_type= "function";
t.check_token (tok_type, current_func, line_no, bodyCheck, parameter );
current_func ="";
}
;
function-def : kind ID LPAR kind ID RPAR body
{
current_func = $2;
paramName = $5;
declaration = false;
parameter= false;
tok_type= "function";
t.check_token (tok_type, current_func, line_no, bodyCheck, parameter );
tok_type = "variable";
parameter=true;
t.check_token(tok_type, paramName, line_no, bodyCheck, parameter);
current_func ="";
}
;
For example, for text input :
int main (int DUMMY) {
int x,y,z;
float p;
int main (int x){x = y;}
p = -z * (x/345+y*1.0) + - 300;
p = -z * (x/345+y*1.0) + -300;
while (p>=-(x+y)*3.45/6-z)
z = z + 3;
}
I get these error messages:
Local int variable y declared in line 3.
Local int variable z declared in line 3.
Local int variable x declared in line 3.
Local float variable p declared in line 5.
Local int variable main declared in line 6.
syntax error on line 6, matched: (
Local int variable x declared in line 6.
syntax error on line 6, matched: )
Your function-decl rule insists on exactly one parameter and does not allow for its name.

awk syntax error: awk: line 29: syntax error at or near :

I have written an awk script and I keep getting the following error:
awk: line 29: syntax error at or near :
I do not understand why I keep getting this error.
The script is below. It is large, but the error seems to be only in the top section; the rest is included for completeness. The line where the error is reported is flagged with a comment.
#!/bin/sh
tshark -V -r $1 > .pcap_out1_ver.txt
tshark -r $1 > .pcap_out_summ.txt
awk -F ":" '
BEGIN {
#Packet types and subtypes.
frame_id[0] = "Association Request";
frame_id[1] = "Association Response";
frame_id[2] = "Reassociation Request";
frame_id[3] = "Reassociation Response";
frame_id[4] = "Probe Request";
frame_id[5] = "Probe Response";
frame_id[6] = "Reserved";
frame_id[7] = "Reserved";
frame_id[8] = "Beacon";
frame_id[9] = "ATIM";
frame_id[10] = "Disassociation";
frame_id[11] = "Authentication";
frame_id[12] = "Deauthentication";
frame_id[13] = "Action";
for(x=14; x<24; ++x) {
frame_id[x] = "Reserved";
}
frame_id[24] = "Block Ack Request";
frame_id[25] = "Block Ack";
frame_id[26] = "PS-Poll";
frame_id[27] = "RTS"; #******Error here****
frame_id[28] = "CTS";
frame_id[29] = "ACK";
frame_id[30] = "CF-end";
frame_id[31] = "CF-end + CF-ack";
frame_id[32] = "Data";
frame_id[33] = "Data + CF-ack";
frame_id[34] = "Data + CF-poll";
frame_id[35] = "Data + CF-ack +CF-poll";
frame_id[36] = "Null";
frame_id[37] = "CF-ack";
frame_id[38] = "CF-poll";
frame_id[39] = "CF-ack + CF-poll";
frame_id[40] = "QoS data";
frame_id[41] = "QoS data + CF-ack";
frame_id[42] = "QoS data + CF-poll";
frame_id[43] = "QoS data + CF-ack + CF-poll";
frame_id[44] = "QoS Null";
frame_id[45] = "Reserved";
frame_id[46] = "QoS + CF-poll (no data)";
frame_id[47] = "Qos + CF-ack (no data)";
packet_type[0] = "Management";
packet_type[1] = "Control";
packet_type[2] = "Data";
#Variables for storing stats.
captured_length = 0;
for(x=0; x<50; ++x) {
count[x]=0;
traffic[x]=0;
}
#Counter for Epoch Time. Avg data rates.
next_mark=0;
j=0;
first_epoch_time = 0;
cur_epoch_time = 0;
#Counter for rssi values.
k=0;
}
{
gsub(/^[ \t]+/, "", $1);
if($1=="Frame Control") {
gsub(/^[ \t]+/, "", $2);
intRep = sprintf("%d", "0x" substr($2, 4, 2));
traffic[intRep] += captured_length;
count[intRep] += 1;
} else if($1=="Capture Length") {
gsub(/^[ \t]+/, "", $2);
gsub(/ [^\0]*/,"", $2);
captured_length = $2;
} else if($1=="Epoch Time") {
gsub(/^[ \t]+/, "", $2);
gsub(/ [^\0]*/, "", $2);
if(next_mark<$2) {
if(next_mark==0) {
next_mark = $2+60;
first_epoch_time = $2;
} else {
next_mark += 60;
j++;
}
#initialization of array element before using.
traffic_per_min[j] = 0;
count_per_min[j] = 0;
data_rate[j] = 0;
}
cur_epoch_time = $2;
traffic_per_min[j] += captured_length;
count_per_min[j] += 1;
} else if($1=="SSI signal") {
gsub(/^[ \t]+/, "", $2);
print "ssi signal"
if( substr($2, 0, 1) == "-") {
rssi_v[k] = $2;
rssi_t[k] = cur_epoch_time;
print rssi_v[k];
print rssi_t[k];
k++;
}
} else if($1=="Data Rate") {
gsub(/^[ \t]+/, "", $2);
gsub(/ [^\0]*/, "", $2);
data_rate_avg[j] += $2;
data_rate[k] = $2;
}
}
END {
# print "Packet Subtype" "No. of Packets" "Amount of traffic"
for(x=0; x<48; ++x) {
if(count[x] != 0) {
print frame_id[x]":"count[x]":"traffic[x];
}
}
print "-----"
for(x=0; x<=j; ++x) {
print x traffic_per_min[x]/count_per_min[x];
}
}
' .pcap_out1_ver.txt > .parsed.txt
awk -F " \t" '
BEGIN {
for(x=0; x<6; ++x)
count[x] = 0;
protocol[0] = "HTTP";
protocol[1] = "ARP";
protocol[2] = "SMTP";
protocol[3] = "DNS";
protocol[4] = "FTP";
protocol[5] = "DHCP";
}
{
if($5==protocol[0]){
count[0] += 1;
} else if($5==protocol[1]) {
count[1] += 1;
} else if($5==protocol[2]) {
count[2] += 1;
} else if($5==protocol[3]) {
count[3] += 1;
} else if($5==protocol[4]) {
count[4] += 1;
} else if($5==protocol[5]) {
count[5] += 1;
}
}
END {
for(x=0; x<6; ++x) {
print protocol[x]:count[x]
}
}
' .pcap_out_summ.txt > .app_net.txt
You have this line in the END block:
print protocol[x]:count[x]
It should be replaced with:
print protocol[x]":"count[x]
Besides your syntax error, could I make a suggestion or two about your awk scripts:
Get rid of all those null statements (spurious trailing semicolons).
You don't seem to be grasping the power of awk's associative arrays. Take your second script, for example. It could be rewritten as just:
awk -F " \t" '
BEGIN { n=split("HTTP ARP SMTP DNS FTP DHCP",protocol,/ /) }
{ count[$5]++ }
END { for(x=1;x<=n;++x) print protocol[x]":"count[protocol[x]]+0 }
' .pcap_out_summ.txt > .app_net.txt
You might want to take a look at the book Effective Awk Programming, Third Edition By Arnold Robbins (http://www.oreilly.com/catalog/awkprog3/).
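As a rough sketch of the associative-array approach, the following can be run without a capture file. The function name count_protocols and the three sample lines are made up for illustration, and the sketch uses awk's default whitespace splitting rather than -F " \t". Note that split() fills the array starting at index 1, so the loop runs from 1 to n:

```shell
# Sketch only: count_protocols and the sample input are hypothetical.
count_protocols() {
    awk '
    BEGIN { n = split("HTTP ARP SMTP DNS FTP DHCP", protocol, " ") }
    { count[$5]++ }    # the protocol name itself is the array key
    END {
        # "+ 0" forces a numeric 0 for protocols that were never seen
        for (x = 1; x <= n; ++x) print protocol[x] ":" (count[protocol[x]] + 0)
    }'
}

printf '%s\n' \
    '1 0.0 10.0.0.1 10.0.0.2 HTTP 74' \
    '2 0.1 10.0.0.2 10.0.0.1 DNS 60' \
    '3 0.2 10.0.0.1 10.0.0.2 HTTP 74' | count_protocols
```

With this input the HTTP line should show a count of 2, DNS a count of 1, and the remaining protocols 0.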
As awk tells you, this line of your second awk script is wrong:
print protocol[x]:count[x]
You probably meant to print a colon:
print protocol[x] ":" count[x]

String: replacing spaces by a number

I would like to replace every run of blank spaces in a string with a Fixnum (the number of blank spaces in that run).
Let me give an example:
s = "hello,   how          are  you ?"
omg(s) # => "hello,3how10are2you1?"
Do you see a way (sexy if possible) to update a string like this?
Thank you Rubists :)
gsub can be fed a block for the "replace with" param, the result of the block is inserted into place where the match was found. The argument to the block is the matched string. So to implement this we capture as much whitespace as we can ( /\s+/ ) and feed that into the block each time a section is found, returning that string's length, which gets put back where the whitespace was originally found.
Code:
s = "hello,   how          are  you ?"
res = s.gsub(/\s+/) { |m| m.length }
puts res
# => hello,3how10are2you1?
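For comparison with the awk questions on this page, the same idea (find each run of spaces, emit its length in its place) can be sketched in POSIX awk using match(), RSTART and RLENGTH. This is an illustrative alternative, not part of the Ruby answer; the function name omg is borrowed from the question, and printf field widths are used to build the test string so the number of spaces in each run is explicit:

```shell
# Sketch only: replaces every run of spaces with the length of that run.
omg() {
    awk '{
        rest = $0
        out = ""
        while (match(rest, / +/)) {
            # text before the run, then the run length in place of the run
            out = out substr(rest, 1, RSTART - 1) RLENGTH
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

# %3s, %10s, %2s and %1s with empty arguments produce runs of 3, 10, 2 and 1 spaces.
printf 'hello,%3show%10sare%2syou%1s?\n' '' '' '' '' | omg
```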
It is also possible to do this by splitting the string into an array of characters. A JavaScript example:
var s = "hello,   how          are  you ?";
function omg( str ) {
var strArr = str.split('');
var count = 0;
var finalStr = '';
for( var i = 0; i < strArr.length; i++ ) {
if( strArr[i] == ' ' ) {
count++;
}
else
{
if( count > 0 ) {
finalStr += '' + count;
count = 0;
}
finalStr += strArr[i];
}
}
return finalStr
}
alert( omg( s ) ); //"hello,3how10are2you1?"
Lol, this seems the best it can be for javascript
