Using a pipe in Hadoop

I am using ProcessBuilder to run an executable. It works fine.
Now I am in a scenario where I have to pipe the output of the first executable into the second one, e.g.:
exe1 arg1 arg2 | exe2 arg3
and then get its InputStream and print it to stdout. So I am writing a small script.sh for that, which contains:
exe1 arg1 arg2 | exe2 arg3
The following works fine in Java:
ProcessBuilder pb = new ProcessBuilder();
pb.command("/bin/sh", "/home/biadmin/Desktop/script.sh");
Process p = pb.start();
InputStream in = p.getInputStream();
// output successfully printed to stdout:
int c;
while ((c = in.read()) != -1) {
    System.out.write(c);
}
System.out.flush();
But when I do the same thing in the Hadoop environment, I don't get anything in the input stream. I need this same thing to work in Hadoop. Any suggestions/advice appreciated.
Thanks.
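One thing worth checking, though this is only an assumption on my part (the absolute paths below are placeholders): under Hadoop the task JVM may run with a different PATH than your login shell, and any errors from exe1/exe2 go to stderr, which is not part of the InputStream being read. A more talkative script.sh would at least make the failure visible in that stream:
#!/bin/sh
# hypothetical debugging variant of script.sh: absolute paths, and both commands'
# stderr folded into the script's stdout so errors reach the Java InputStream
( /full/path/to/exe1 arg1 arg2 | /full/path/to/exe2 arg3 ) 2>&1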

Related

Bash execute multiple command stored in variable without eval

I have a script wrapper.sh that takes a string as an argument.
wrapper.sh
#!/usr/bin/env bash
node ./index.js $1
Now if I pass the argument hello it runs fine, but if I pass hello&pwd then the full string is passed as one argument to the Node.js file, instead of just hello being passed to Node.js and pwd running separately.
Example
./wrapper.sh "hello"
# nodejs gets argument hello : Expected
./wrapper.sh "hello&pwd"
# nodejs gets argument hello&pwd : Not Expected
# Required: only hello in nodejs, while pwd runs separately
I have tried a lot of solutions online but none seem to work except eval and bash -c, which I don't want to use because the script doesn't wait for these commands to finish.
Edit
wrapper.sh is executed by third-party software, and the content of the script is dynamically configured by the user, so there's not much in my hands. My module's job is just to set up the script properly so that it can be executed by the third-party software.
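For reference, a minimal reproduction of the behaviour above; demo.sh is a hypothetical stand-in for wrapper.sh, with echo in place of node ./index.js:
#!/usr/bin/env bash
# $1 is expanded only after the command line has been parsed, so the & inside it
# is an ordinary character in a single word, never a control operator.
echo "argument received: $1"
Running ./demo.sh "hello&pwd" prints argument received: hello&pwd, which is exactly the unwanted behaviour described above.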

s3cmd put command: Upload: Command not found

So I want to put a file onto s3. Here is the cmd:
/usr/bin/s3cmd --rr --access_key="$access_key" --secret_key="$secret_key" put "$FILEPATH/$ZIPPED_FILE" "$s3_path/$ZIPPED_FILE"
And this works perfectly, except that in my bash shell it prints out this message: upload:: command not found. Has anyone encountered this?
This almost certainly means you're running the stdout of s3cmd as a command itself. For instance, that could happen if you were to run:
# BAD: runs the command, then runs its output as another command
`/usr/bin/s3cmd --rr --access_key="$access_key" --secret_key="$secret_key" put "$FILEPATH/$ZIPPED_FILE" "$s3_path/$ZIPPED_FILE"`
To fix that, just take the backticks out, and write:
# GOOD: just invokes your command, with its output written to stdout
/usr/bin/s3cmd --rr --access_key="$access_key" --secret_key="$secret_key" put "$FILEPATH/$ZIPPED_FILE" "$s3_path/$ZIPPED_FILE"
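You can reproduce the symptom without s3cmd at all: the backticks capture the tool's progress line (the one starting with upload:) and then try to run it as a command. The path below is just an illustration:
`echo 'upload: s3://bucket/file (1234 bytes)'`
# bash: upload:: command not found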

Script not working as Command line

I've created a simple bash script that does the following:
#!/usr/bin/env bash
cf ssh "$1"
When I run the command from the CLI, like cf ssh myapp, it runs as expected, but when I run the script like
. myscript.sh myapp
I get the error: App not found
I don't understand what the difference is; I've provided the app name when I invoke the script. What could be missing here?
Update
When I run the script with the following it works. Any idea why the "$1" version is not working?
#!/usr/bin/env bash
cf ssh myapp
When you do this:
. myscript.sh myapp
You don't run the script; you source the file named in the first argument. Sourcing means reading the file, so it's as if the lines in the file were typed on the command line. In your case, what happens is this:
myscript.sh is treated as the file to source, and the myapp argument is ignored.
This line is treated as a comment and skipped.
#!/usr/bin/env bash
This line:
cf ssh "$1"
is read as it stands. "$1" takes the value of $1 in the calling shell. Possibly (most likely, in your case) it's blank.
Now you should know why it works as expected when you source this version of your script:
#!/usr/bin/env bash
cf ssh myapp
There's no $1 to resolve, so everything goes smoothly.
To run the script and be able to pass arguments to it, you need to make the file executable and then execute it (as opposed to sourcing it). You can execute the script, for example, this way:
./script.bash arg1 arg2
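To make the difference visible, consider a made-up probe.sh (not the real cf script) that only reports what it sees:
#!/usr/bin/env bash
echo "got \$1='$1' in shell PID $$"
Then:
chmod +x probe.sh
./probe.sh myapp   # executed: runs in its own child shell and prints got $1='myapp' ...
. ./probe.sh       # sourced: runs inside the current shell (same PID), and $1 is whatever the calling shell's $1 happens to be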

Makefile pass arguments to a C program

I have googled, but it seems not to be possible: is there a way to have a run target in a makefile simply pass any given arguments down to the program?
argset1:
	./a.out arg1 arg2

argset2:
	./a.out arg3 arg4

arg1 = 'default1'
arg2 = 'default2'

custom:
	./a.out $(arg1) $(arg2)
You can do:
make argset1
make argset2
make custom arg1=1234 arg2=3321

Error "exporting" shell functions from within a Java process

I have a script which looks like this:
#!/bin/bash
function func1() {
    echo "HELLO!"
}
export -f func1

function func2() {
    echo "HI!!"
    func1
}
export -f func2
I locally start a hadoop tasktracker and my /usr/lib/hadoop/conf/hadoop-env.sh looks something like this:
# .. a few configuration params
source my_shell_file.sh
my_function
When I start the tasktracker, everything is fine. It prints out a couple of echo statements that I have inside my_function. When I start a Hadoop job with a 'mapper.py' file, it works normally. It even picks up the configuration params present in my_function. The problem occurs when I declare my mapper as:
-mapper 'bash -c "func1 ; python mapper.py"'
It then throws this error:
/bin/bash: func2: line 1: syntax error: unexpected end of file
/bin/bash: error importing function definition for `func2'
/bin/bash: func1: line 1: syntax error: unexpected end of file
/bin/bash: error importing function definition for `func1'
I'm not sure what is happening here. I tried 'sh -c' instead of 'bash -c' and I get the same issue.
EDIT: The shell script works fine when I "source" it on the console. It recognizes the functions defined in the shell file on the console.
EDIT2: Added the EXACT contents of the shell file.
This is a bug in Hadoop.
Bash functions are passed as regular environment variables so that exporting works across processes:
foo='() { echo "hello world"; }' bash -c 'foo'
The environment variables generated by bash will contain multiple lines, which is normally fine.
Hadoop Streaming, however, has a badly written Environment class that tries to reimplement System.getenv() by naively parsing the output of env.
Since it doesn't handle multi-line variables, it destroys your functions.
I tried to submit a bug report, but I didn't have a user for their bug tracker and I didn't want to subscribe to their mailing list. If you'd like this fixed, I suggest you submit your own report.
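To see why that matters, here is a minimal illustration in plain bash (no Hadoop involved): an exported function becomes a multi-line environment variable, so any parser that treats each line of env output as NAME=value will choke on the continuation lines. Output shown approximately; the variable name and spacing depend on the bash version:
func1() {
    echo "HELLO!"
}
export -f func1
env | grep -A1 func1
# func1=() {  echo "HELLO!"     <- or BASH_FUNC_func1%%=... on newer bash
# }                             <- this bare line is what a line-by-line parser trips over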
