issue with two nested loops in bash - bash

I am trying to loop over two series of variables and make them with R statistical software to get the correlation results.
I got confused why the bigger loop (on mica_headers) doesn't take place while the inner loop finishes each time.
#!/bin/sh
#set -e
micaHeaderList="tot_instruction ILP32 ILP64 ILP128 ILP256 total_ins_count_for_hpc_alignment totInstruction mem-read mem-write control-flow arithmetic floating-point stack shift string sse other nop InstrFootprint64 InstrFootprint4k DataFootprint64 DataFootprint4k mem_access memReuseDist0-2 memReuseDist2-4 memReuseDist4-8 memReuseDist8-16 memReuseDist16-32 memReuseDist32-64 memReuseDist64-128 memReuseDist128-256 memReuseDist256-512 memReuseDist512-1k memReuseDist1k-2k memReuseDist2k-4k memReuseDist4k-8k memReuseDist8k-16k memReuseDist16k-32k memReuseDist32k-64k memReuseDist64k-128k memReuseDist128k-256k memReuseDist256k-512k memReuseDist512k-00 GAg_mispred_cnt_4bits PAg_mispred_cnt_4bits GAs_mispred_cnt_4bits PAs_mispred_cnt_4bits GAg_mispred_cnt_8bits PAg_mispred_cnt_8bits GAs_mispred_cnt_8bits PAs_mispred_cnt_8bits GAg_mispred_cnt_12bits PAg_mispred_cnt_12bits GAs_mispred_cnt_12bits PAs_mispred_cnt_12bits total_brCount total_transactionCount total_takenCount total_num_ops instr_reg_cnt total_reg_use_cnt total_reg_age reg_age_cnt_1 reg_age_cnt_2 reg_age_cnt_4 reg_age_cnt_8 reg_age_cnt_16 reg_age_cnt_32 reg_age_cnt_64 mem_read_cnt mem_read_local_stride_0 mem_read_local_stride_8 mem_read_local_stride_64 mem_read_local_stride_512 mem_read_local_stride_4096 mem_read_local_stride_32768 mem_read_local_stride_262144 mem_read_global_stride_0 mem_read_global_stride_8 mem_read_global_stride_64 mem_read_global_stride_512 mem_read_global_stride_4096 mem_read_global_stride_32768 mem_read_global_stride_262144 mem_write_cnt mem_write_local_stride_0 mem_write_local_stride_8 mem_write_local_stride_64 mem_write_local_stride_512 mem_write_local_stride_4096 mem_write_local_stride_32768 mem_write_local_stride_262144 mem_write_global_stride_0 mem_write_global_stride_8 mem_write_global_stride_64 mem_write_global_stride_512 mem_write_global_stride_4096 mem_write_global_stride_32768 mem_write_global_stride_262144"
mhToBeReplaced="ILP32"
compilerOptionList="funsafe_math_optimizations fno_guess_branch_probability fno_ivopts fno_tree_loop_optimize fno_inline_functions funroll_all_loops fno_omit_frame_pointer falign_jumps fselective_scheduling fno_inline_small_functions fno_tree_pre ftracer fno_move_loop_invariants"
coToBeReplaced="fno_guess_branch_probability"
for mica_header in $micaHeaderList
do
for compiler_option in $compilerOptionList
do
echo "Calculating $compiler_option correlation for $mica_header"
sed -i "s/$coToBeReplaced/$compiler_option/g" r.scr
coToBeReplaced=$compiler_option
make
done
sed -i "s/$mhToBeReplaced/$mica_header/g" r.scr
mhToBeReplaced=$mica_header
done
Edited: Generally, in order to avoid inconsistency between the values of iterations and source file, how could I be able to link these two together from different files. i.e. micaHeaderList with the headers inside the ALL.scv file ?

In the script there is a mistake which echoes a value of $mica_header which will be substituted only for make calls in the next outer loop run. This is caused by the second sed being run only after make. It should have been called before the inner loop. Could not this cause the confusion?
Also instead of replacing one value of a parameter for another it is better to create a template of r.src file with placeholders for the parameters to be changed. This way we can be sure that for some values of the parameters there could not be conflicts with the search patterns and that a typo will not prevent substitution.
Below is an improved script with considerably simpler code. The part inside if test "$Test" = 1 is just a testing code activated by Test=1. For this script just create a template for r.scr named r.scr.template and put the placeholders %%MicaHeader%% and %%CompilerOption%% instead of the parameters which should be replaced. For the first test there is a shorter list of parameters.
#!/bin/sh
micaHeaderList="tot_instruction ILP32 ILP64"
mhToBeReplaced=%%MicaHeader%%
compilerOptionList="funsafe_math_optimizations fno_guess_branch_probability fno_ivopts fno_tree_loop_optimize"
coToBeReplaced=%%CompilerOption%%
TemplateFile=r.scr.template
OutputFile=r.scr
Test=1
if test "$Test" = 1
then
TemplateFile="${TemplateFile}-test"
echo "$coToBeReplaced $mhToBeReplaced" >"$TemplateFile"
make ()
{
echo -n ">>> make: "
cat "$OutputFile"
}
fi
for mica_header in $micaHeaderList
do
for compiler_option in $compilerOptionList
do
echo "Calculating $compiler_option correlation for $mica_header"
sed "s/$mhToBeReplaced/$mica_header/g;s/$coToBeReplaced/$compiler_option/g" "$TemplateFile" >"$OutputFile"
make
done
done
If the problem still persists:
Check the generated r.src files.
Check if the make script really uses the substituted parameters from r.src.

Related

How to combine lines from separate text files into one new text tile using Shell Script

I just want to preface this by saying I am an absolute noob to shell script, so stick with me here.
Essentially, I am trying to combine lines of text from separate test files (one line from HC_01_MNoFraming.txt added at the end of the corresponding line from HC_01_Gist.txt) into a single line of text on a newly created text file called HC_01_MGNoFraming.txt. This combination of the lines of text would occur for each of the four lines of text in each text file. Below is an example:
HC_01_Gist.txt
12.0130754383 33.0026698754 117.01282238700001 182.01823507400002 201.005570202 220.010026843 352.023495725 478.012859369 518.0072172580001
12.012680624100001 56.0118834624 144.026174161 167.018345335 228.002317522 247.00666698400002 356.027611312 434.014129075 478.013307259
56.01133045 142.00709709999998 207.0121417 331.01635039999996 350.0040858 369.0084907 390.01512310000004 409.0028586 430.0265601 512.0175128999999 556.0159299
12.012615221199999 176.01120546400003 199.00347731 243.019245908 430.00998814900004 472.02324678400004 495.01541966900004
HC_01_MNoFraming
262.006565969 283.013202477 375.015624781 541.016055549
33.019710919299996 497.017719727 537.0287543219999
121.0167582 226.0165682
75.01583263170001 113.024949998 262.023691934 386.010783941 537.011983108
HC_01_MGNoFraming.txt
12.0130754383 33.0026698754 117.01282238700001 182.01823507400002 201.005570202 220.010026843 352.023495725 478.012859369 518.0072172580001 262.006565969 283.013202477 375.015624781 541.016055549
12.012680624100001 56.0118834624 144.026174161 167.018345335 228.002317522 247.00666698400002 356.027611312 434.014129075 478.013307259 33.019710919299996 497.017719727 537.0287543219999
56.01133045 142.00709709999998 207.0121417 331.01635039999996 350.0040858 369.0084907 390.01512310000004 409.0028586 430.0265601 512.0175128999999 556.0159299 121.0167582 226.0165682
12.012615221199999 176.01120546400003 199.00347731 243.019245908 430.00998814900004 472.02324678400004 495.01541966900004 75.01583263170001 113.024949998 262.023691934 386.010783941 537.011983108
I tried several things but the simplest solution I could find online was just using the paste command like this:
#(A) First Timing File = Timing File you Want to Be in Front
A=HC_01_Gist.txt
#(B) Second Timing File Timing File you Want to Be in Back
B=HC_01_MNoFraming.txt
#(C) Output Timing File Name C=HC_01_MGNoFraming.txt
#------------------------------------------------------------
paste $B $A
paste $B $A >$C
Looking at the terminal output, this is what this code gives:
262.006512.0130754383 33.0026698754 117.01282238700001 182.01823507400002 201.005570202 220.010026843 352.023495725 478.012859369 518.0072172580001
33.0197112.012680624100001 56.0118834624 144.026174161 167.018345335 228.002317522 247.00666698400002 356.027611312 434.014129075 478.013307259
121.016756.01133045 142.00709709999998 207.0121417 331.01635039999996 350.0040858 369.0084907 390.01512310000004 409.0028586 430.0265601 512.0175128999999 556.0159299
75.0158312.012615221199999 176.01120546400003 199.00347731 243.019245908 430.00998814900004 472.02324678400004 495.01541966900004
It almost looks like in terminal that the paste command starts working but then after the first data point of HC_01_MNoFraming, the paste jumbles up the text and then resumes with the second data point of HC_01_Gist.txt.
Looking at the HC_01_MGNoFraming.txt file that the code creates:
262.006565969 283.013202477 375.015624781 541.016055549
12.0130754383 33.0026698754 117.01282238700001 182.01823507400002 201.005570202 220.010026843 352.023495725 478.012859369 518.0072172580001
33.019710919299996 497.017719727 537.0287543219999
12.012680624100001 56.0118834624 144.026174161 167.018345335 228.002317522 247.00666698400002 356.027611312 434.014129075 478.013307259
121.0167582 226.0165682
56.01133045 142.00709709999998 207.0121417 331.01635039999996 350.0040858 369.0084907 390.01512310000004 409.0028586 430.0265601 512.0175128999999 556.0159299
75.01583263170001 113.024949998 262.023691934 386.010783941 537.011983108
12.012615221199999 176.01120546400003 199.00347731 243.019245908 430.00998814900004 472.02324678400004 495.01541966900004
I'm wondering about two things:
1)How can I fix the output .txt file to have the data points in-line 2) how can I ensure that the data points for HC_01_Gist.txt go before the data points for HC_01_MNoFraming.
Thanks and I'd be more than happy to clarify my problem and answer any questions you have.

The %procid% sometimes blank in rsyslog template

I'm trying to configure rsyslog to output in RFC5424 format. This means that the PROCID must be output in the syslog header. If there's no header, it should output a single dash (-) in its place. However, some of the events output have it just blank, and some have an actual value.
This is rsyslogd 5.8.10 running on Amazon Linux.
Here are the config lines:
$template CustomFormat,"<%PRI%>1 %timegenerated:1:23:date-rfc3339%-00:00 %HOSTNAME% %app-name% b%procid%b %msgid% %STRUCTURED-DATA%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n"
$ActionFileDefaultTemplate CustomFormat
Note that I put a "b" on each side of %procid% to make it more visible (this part is not RFC5424-compliant). Here are two lines of sample output.
<87>1 2019-06-19T20:03:01.929-00:00 ip-10-90-0-15 crond b29408b - - pam_unix(crond:account): expired password for user root (password aged)
<85>1 2019-06-19T20:17:18.150-00:00 ip-10-90-0-15 sudo bb - - ssm-user : TTY=pts/0 ; PWD=/ ; USER=root ; COMMAND=/bin/vi /etc/rsyslog.conf
The first line is correct, but the second example should have "b-b" instead of "bb". What should I do to make the blank %procid% show up as a dash? It works fine for the %msgid% and %STRUCTURED-DATA%.
Is there a better way to get RFC5424 output? (I have to use -00:00 instead of Z.)
There may be a better way, but one thing you can try is to use a Rainer script variable in the template instead of the property, and set this variable to "-" if the procid is empty. For example,
$template CustomFormat,"<%PRI%>1 ... b%$.myprocid%b ..."
$ActionFileDefaultTemplate CustomFormat
if ($procid == "") then {
set $.myprocid = "-";
} else {
set $.myprocid = $procid;
}
*.* ./outputfile
Just make sure the if statement is before any action statements. Note, you cannot change the procid property itself with set.

Reading and Writing File Contents From HTTP Post Data In Bash

Problem Statement:
I am trying to upload a file through an HTML form using an HTTP post request and then write it to a file called configuration.xml on my local server. I can only use the stock capabilities of the server, so, as much as I'd love to, I can't use cURL, PHP, Perl, or anything that I'd have to install on the server. What I have tried doing is having the HTML form open a CGI file as the form action and all this CGI file does is run the Bash script with the proper HTML formatting. I would run the Bash script directly from the HTML form, but my research led me to believe that this isn't possible without having to edit .htaccess or other hacky alternatives, which are not roads I want to go down. (If this can be done in a reasonable fashion, please enlighten me!) Regardless, I am able to successfully run the Bash script. I know this because I put a "touch configuration.xml" command in the script and it creates it every time. My script is also able to tell that it is an HTTP Post, as shown by the echoed text in the browser, but then I can't seem to be able to properly read any data from the file. I tried echoing the data as well as redirecting the read data to a file, but nothing appeared in the browser and nothing wrote to the file I specified. This very well may be me not knowing Bash scripting well enough or something silly like that, but I really don't know how to proceed from here.
Code:
UploadToServer.html:
<form action="run_script.cgi" method="POST" enctype="multipart/form-data">
<input type="file" name="file" />
<input type="submit" name="submit" value="Submit">
</form>
run_script.c:
Note: I compile this to a CGI file with the following command: gcc run_script.c -o run_script.cgi
#include <stdlib.h>
#include <stdio.h>
int main() {
system("./test.sh &");
printf("Content-Type: text/html\r\n\r\n");
printf(""); // print blank line for proper HTML header formatting
printf("<html>\n");
printf("</HTML>\n");
}
test.sh:
The non-commented code in the second if statement is from here. The commented code is from here.
#!/bin/bash
touch configuration.xml
if [[ $REQUEST_METHOD = 'POST' ]]; then
echo "this is a post!"
if [ "$CONTENT_LENGTH" -gt 0 ]; then
echo "entered second if statement!"
# read -n $CONTENT_LENGTH POST_DATA <&0
# echo "$CONTENT_LENGTH"
while read line
do eval "echo ${line}"
done
fi
fi
I also tried the approach in the third code block of this post, but didn't get any output. I also looked through this post, but it doesn't seem to grab all the data from the file like I need to. I also tried the approach of just using a CGI file like suggested in this post (_http://blog.purplepixie.org/2013/08/cc-cgi-file-upload/), but, once again, no output. I've been looking through the Apache error log as I try new things and no errors come up.
Anybody have any ideas on what I could possibly be doing wrong? Is there a different approach worth looking into? Any suggestions are greatly appreciated!
I figured out how to do it, with some help from my friends. I ended up doing it all in a CGI script and foregoing the Bash component. While this isn't what I asked for in my original question, it gets the job done for me, which is really what the question was asking.
The following is the C code I'm now using to successfully write the file on the server:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void print_empty_html_page();
int main() {
char * req_method_str = getenv("REQUEST_METHOD");
if (req_method_str != NULL) {
if (strcmp(req_method_str, "POST") == 0) {
// process POST arguments
char * len_str = getenv("CONTENT_LENGTH");
if (len_str != NULL) {
int len = atoi(len_str);
if (len > 0) {
FILE * fp;
fp = fopen("file.xml", "w");
char * postdata = malloc((len + 1) * sizeof(char));
fread(postdata, sizeof(char), len, stdin);
postdata[len] = '\0';
fprintf(fp, "%s\n", postdata);
free(postdata);
fclose(fp);
}
system("sed -e '/Content/d' -e '/[-][-][*][*][*][*][*]/d' -e '/^[s]*$/d' -e '/WebKitFormBoundary/d' -e '/Submit/d' < file.xml > file_trimmed.xml");
system("rm file.xml");
}
}
}
print_empty_html_page();
return 0;
}
void print_empty_html_page() {
// Send the content type, letting the browser know this is HTML
printf("Content-type: text/html\r\n\r\n");
// Header information that prevents browser from caching
printf(
"<META HTTP-EQUIV=\"CACHE-CONTROL\" CONTENT=\"NO-CACHE, NO-STORE\">\r\n\r\n");
// Top of the page
printf("<html>\n");
printf("<BODY>\n");
// Finish up the page
printf("</BODY></html>\n");
}
Note: This method writes the entire HTTP POST to the file 'file.xml'. The system call to 'sed' is to remove the tags from the HTTP POST that don't correspond to the actual data in the file that was uploaded. If you need to check for additional unwanted lines, just add another -e '/<line_with_expression_to_delete>/d' in that sed call, where line_with_expression_to_delete is the expression you want to match and then delete all lines containing that expression. I couldn't figure out how to delete all the blank lines in the newly uploaded file, even though '/^[s]*$/d' should do that, according to my research. Gonna have to look into that more...
Also note: This method only works for uploading text files. It does not work for other file types, such as JPEGs or OGGs.
Hopefully this helps some other people with the same problem. Let me know if you have any questions.

Convert markdown italics and boldface to latex

I want to be able to convert markdown italics and boldface to latex versions on the fly (i.e., give a text string(s) return a text string(s)). I thought easy. Wrong! (Which it still may be). See the sill buisness and error I tried at the bottom.
What I have (note the starting asterisk that's been escaped as in markdown):
x <- "\\*note: I *like* chocolate **milk** too ***much***!"
What I would like:
"*note: I \\emph{like} chocolate \\textbf{milk} too \\textbf{\\emph{much}}!"
I'm not attached to regex but would prefer a base solution (though not essential).
Silly business:
helper <- function(ins, outs, x) {
gsub(paste0(ins[1], ".+?", ins[2]), paste0(outs[1], ".+?", outs[2]), x)
}
helper(rep("***", 2), c("\\textbf{\\emph{", "}}"), x)
Error in gsub(paste0(ins[1], ".+?", ins[2]), paste0(outs[1], ".+?", outs[2]), :
invalid regular expression '***.+?***', reason 'Invalid use of repetition operators'
I have this toy that Ananda Mahto helped me make if it's helpful. You could access it from reports via wheresPandoc <- reports:::wheresPandoc
EDIT Per Ben's comments I tried:
action <- paste0(" echo ", x, " | ", wheresPandoc(), " -t latex ")
system(action)
*note: I *like* chocolate **milk** too ***much***! | C:\PROGRA~2\Pandoc\bin\pandoc.exe -t latex
EDIT2 Per Dason's comments I tried:
out <- paste("echo", shQuote(x), "|", wheresPandoc(), " -t latex"); system(out)
system(out, intern = T)
> system(out, intern = T)
\*note: I *like* chocolate **milk** too ***much***! | C:\PROGRA~2\Pandoc\bin\pandoc.exe -t latex
The lack of pipes on Windows made this tricky, but you can get around it using input to provide the stdin:
> x = system("pandoc -t latex", intern=TRUE, input="\\*note: I *like* chocolate **milk** too ***much***!")
> x
[1] "*note: I \\emph{like} chocolate \\textbf{milk} too \\textbf{\\emph{much}}!"
Noting I am working on windows and from ?system
This means that redirection, pipes, DOS internal commands, ... cannot be used
and the note from ?system2
Note
system2 is a more portable and flexible interface than system,
introduced in R 2.12.0. It allows redirection of output without
needing to invoke a shell on Windows, a portable way to set
environment variables for the execution of command, and finer control
over the redirection of stdout and stderr. Conversely, system (and
shell on Windows) allows the invocation of arbitrary command lines.
Using system2
system2('pandoc', '-t latex', input = '**em**', stdout = TRUE)

command line arguments in bash to Rscript

I have a bash script that creates a csv file and an R file that creates graphs from that.
At the end of the bash script I call Rscript Graphs.R 10
The response I get is as follows:
Error in is.vector(X) : subscript out of bounds
Calls: print ... <Anonymous> -> lapply -> FUN -> lapply -> is.vector
Execution halted
The first few lines of my Graphs.R are:
#!/bin/Rscript
args <- commandArgs(TRUE)
CorrAns = args[1]
No idea what I am doing wrong? The advice on the net appears to me to say that this should work. Its very hard to make sense of commandArgs
With the following in args.R
print(commandArgs(TRUE)[1])
and the following in args.sh
Rscript args.R 10
I get the following output from bash args.sh
[1] "10"
and no error. If necessary, convert to a numberic type using as.numeric(commandArgs(TRUE)[1]).
Just a guess, perhaps you need to convert CorrAns from character to numeric, since Value section of ?CommandArgs says:
A character vector containing the name
of the executable and the
user-supplied command line arguments.
UPDATE: It could be as easy as:
#!/bin/Rscript
args <- commandArgs(TRUE)
(CorrAns = args[1])
(CorrAns = as.numeric(args[1]))
Reading the docs, it seems you might need to remove the TRUE from the call to commandArgs() as you don't call the script with --args. Either that, or you need to call Rscript Graphs.R --args 10.
Usage
commandArgs(trailingOnly = FALSE)
Arguments
trailingOnly logical. Should only
arguments after --args be returned?
Rscript args.R 10 where 10 is the numeric value we want to pass to the R script.
print(as.numeric(commandArgs(TRUE)[1]) prints out the value which can then be assigned to a variable.

Resources