Issue with scripts not running in a basic MapReduce in Bash

I am trying to create a basic MapReduce function in bash (I am very new at it). I have two scripts at the moment, job_master.sh and map_function.sh. I am trying to run the map function from the job master to cut fields from a data file: if a file named after the product doesn't exist yet, the product name should be appended to a keys file, otherwise it should be appended to the file of that name. Nothing happens when I run the job_master script, or when I run the map_function script on its own with a file as an argument. It was working before I added the if statement to the map_function.
I have included both scripts below if anyone is able to spot why they are not running. I tried adding echo statements to test, and it never enters the loop in the job_master script and does nothing at all in the map_function script.
MAP_FUNCTION
#!/bin/bash
while IFS="," read -r date prod remainder; do
if [ ! -e "$prod" ];
then
echo $prod >> keys
else
echo $prod >> $prod
fi
done
JOB_MASTER
#!/bin/bash
files=$(ls | egrep 'sales_a*')
for elem in $files ; do
    ./map_function.sh $elem
done

The map function script is waiting for input. Look at this:
while IFS="," read -r date prod remainder; do
# ...
done
Where will read get its input? It's waiting on stdin, but you're not passing it anything.
On the other hand, I see that you're calling the map function script like this:
./map_function.sh $elem
That is, you are passing a command line argument. But the map function script doesn't use that argument. It seems you want to redirect the content of $elem to the stdin of the script.
Write like this:
for elem in sales_a*; do
    ./map_function.sh < "$elem"
done
This also fixes some other issues you had in the script. files=$(ls | egrep 'sales_a*') is buggy: in a regex, sales_a* matches any name containing sales_ followed by zero or more a characters, and parsing the output of ls is an inappropriate way to iterate over files anyway; the glob sales_a* does the job directly.
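For completeness, a corrected map_function.sh might look like the sketch below. It keeps your original if/else logic but also quotes the expansions, which the original left unquoted (unquoted $prod breaks on product names containing spaces or glob characters):
#!/bin/bash
# Reads CSV records from stdin: date,product,rest-of-record
while IFS="," read -r date prod remainder; do
    if [ ! -e "$prod" ]; then
        # No file for this product yet: record it as a new key
        echo "$prod" >> keys
    else
        # A file named after the product already exists: append to it
        echo "$prod" >> "$prod"
    fi
done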

Related

In a single script, how do I pass the output of an executed shell script, into another function?

I am trying to send an output of an executed shell script, to a log file.
However I want to put a timestamp at the start of the line for every output, so I created a function to do that.
But how do I pass the results of the executed shell script, into the function?
#This is a sample of the executed file testrun.sh
#!/bin/bash
echo "Script Executed."
#Actual script being run
#!/bin/bash
testlog="/home/usr/testlog.log"
log_to_file() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $1" >> $testlog
}
sh /home/usr/testrun.sh >> log_to_file
If I were to log it normally, I would just do
sh /home/usr/testrun.sh >> $testlog
But how do I pass in the output of testrun.sh, into the function log_to_file, so that I can log the output to the file with the timestamp?
You can of course do a
log_to_file "$(sh /home/usr/testrun.sh)"
If your testrun.sh produces more than one line of output, though, only the first line gets the timestamp as a prefix.
Use a while read loop to get each line into a variable that you can pass to log_to_file.
/home/usr/testrun.sh | while read -r line; do
    log_to_file "$line"
done
You could also use the ts command (from the moreutils package) instead of your function
/home/usr/testrun.sh | ts >> "$testlog"
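If you would rather keep a shell function, another option (a sketch of mine, not part of the original answer) is to let log_to_file read stdin itself and timestamp each line as it arrives:
testlog="/home/usr/testlog.log"
log_to_file() {
    # Prefix every incoming line with a timestamp and append it to the log
    while IFS= read -r line; do
        echo "$(date '+%Y-%m-%d %H:%M:%S') $line"
    done >> "$testlog"
}
sh /home/usr/testrun.sh | log_to_file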

This loop will only ever run once. Bad quoting or missing glob/expansion?

I'm working on a bash script that receives a list of processes and does a bunch of things with them; however, when I try to analyze them with a loop, the loop only ever runs once.
Here is some code:
#! /bin/bash
ls /proc | grep ^9 > processes.txt
cat processes.txt
for line in $processes.txt
do
echo "$line"
done
PS: I'm pretty new to bash
$ does parameter expansion; it does not expand a file name to the contents of the file. Since the variable processes is never set, $processes.txt expands to the single word .txt, which is why the loop body runs exactly once.
Use a while read loop instead.
while IFS= read -r line; do
echo "$line"
done < processes.txt
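You can see that expansion for yourself (a quick illustration, with the variable deliberately left unset):
unset processes
for line in $processes.txt; do
    echo "got: $line"    # runs exactly once and prints: got: .txt
done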

Creating an associative array from 2 inputs

I have two files with the following contents:
beam.csv
T11
G14
M10
frequency.csv
115000
120000
344444
I want to associate each beam with a frequency, i.e. build an associative array such that NAME[beam]=frequency:
declare -A NAME
while read -r beam && read -r -u 3 freq; do
    NAME[$beam]=$freq
done < beam.csv 3< frequency.csv
But it doesn't work at all.
When I run echo "${!NAME[@]}" "${NAME[@]}" there is no output, and when I try echo ${NAME[T11]} I don't get any output either.
Your code works fine. Probably you are just unaware that when you execute it like this:
thisisyourshellprompt$ ./yourscript
it actually executes in a child shell process; therefore the NAME variable is local to that child shell where the script runs, not to the interactive shell where you have just typed the command.
When the script is done, it returns, and you are back to your shell, where NAME has never been defined.
But the code works, as you can verify by putting the echo in the script, as in
declare -A NAME
while read -r beam && read -r -u 3 freq; do
    NAME[$beam]=$freq
done < beam.csv 3< frequency.csv
echo "${!NAME[@]}" "${NAME[@]}"

How to redirect stdin to file in bash

Consider this very simple bash script:
#!/bin/bash
cat > /tmp/file
It redirects whatever you pipe into it to a file, e.g.
echo "hello" | script.sh
and "hello" will be in the file /tmp/file. This works... but it seems like there should be a native bash way of doing this without using "cat". But I can't figure it out.
NOTE:
It must be in a script. I want the script to operate on the file contents afterwards.
It must be in a file, the steps afterward in my case involve a tool that only reads from a file.
I already have a pretty good way of doing this - it's just that it seems like a hack. Is there a native way? Something like "/tmp/file < 0" or "0> /tmp/file". I thought bash would have native syntax to do this...
You could simply do
cp /dev/stdin myfile.txt
Terminate your input with Ctrl+D and, voilà! You have your file created with the text from stdin.
echo "$(</dev/stdin)" > /tmp/file
Terminate your input with Enter followed by Ctrl+D.
I don't think there is a builtin that reads from stdin until EOF, but you can do this:
#!/bin/bash
exec > /tmp/file
while IFS= read -r line; do
    printf '%s\n' "$line"
done
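For example, if the script above is saved as save_stdin.sh (a hypothetical name used here for illustration):
printf 'hello\nworld\n' | ./save_stdin.sh
cat /tmp/file    # shows hello and world, one per line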
Another way of doing it, using pure bash:
#!/bin/bash
IFS= read -t 0.01 -r -d '' indata
[[ -n $indata ]] && printf "%s" "$indata" >/tmp/file
IFS= and -d '' cause all of the stdin data to be read into the variable indata.
The reason for using -t 0.01: when this script is called with no input pipe, read times out after a negligible 0.01-second delay. If any data is available on input, it is read into indata and then redirected to /tmp/file.
Another option: dd of=/tmp/myfile.txt
Note: This is not a built-in, however, it might help other people looking for a simple solution.
Why don't you just
GENERATE INPUT | (
    # do whatever you like to the input here
)
But sometimes, especially when you need to consume the whole input first and then operate on the modified output, you should still use a temporary file:
TMPFILE="/tmp/fileA-$$"
GENERATE INPUT | (
# modify input
) > "$TMPFILE"
(
# do something with the input from TMPFILE
) < "$TMPFILE"
rm "$TMPFILE"
If you don't want the program to end after reaching EOF, this might be helpful.
#!/bin/bash
exec < <(tail -F /tmp/a)
cat -

How to save entire output of bash script to file

I am trying to get the entire output of a bash script to save to a file. I currently take one argument (an IP address); the beginning of the code looks like this:
#!/bin/bash
USAGE="Usage: $0 [<IP address>]"
if [ "$#" == "0" ]; then
echo "$USAGE"
exit 1
fi
ip_addr=$1
What I'd like to do is add another argument called "output" that the entire output of the script will be saved to. I'm aware I could just run myscript.sh | tee textfile.txt, but I'd like to make the script a little easier for others to run.
After the usage message, add the following line:
exec > "$2"
This will redirect standard output for the rest of the script to the file named in the 2nd argument.
Then run using
myscript 192.0.2.42 output.txt
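If you want the output file to be optional, a variant like this (my sketch; the answer above assumes the second argument is always supplied) only redirects when a second argument is given:
#!/bin/bash
USAGE="Usage: $0 <IP address> [<output file>]"
if [ "$#" -eq 0 ]; then
    echo "$USAGE"
    exit 1
fi
ip_addr=$1
# Send all further stdout to the output file, if one was named
if [ -n "$2" ]; then
    exec > "$2"
fi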
