Airflow BashOperator: Not able to see the output

I am a newbie to Airflow and trying to create a simple task that executes a bash file (whose job is just to create a directory). I have given the full path of the bash file to be executed (with a space at the end) in bash_command. However, upon triggering the DAG from the UI, I see no errors in the log, but no folder is created with the name specified in the bash file.
Can someone please help me fix the issue?

When BashOperator executes, Airflow will create a temporary directory as the working directory and executes the bash command. When the execution finishes, the temporary directory will be deleted.
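Roughly speaking, the effect is similar to the following shell sketch (a simplified illustration of the behaviour described above, not Airflow's actual implementation):

workdir=$(mktemp -d)            # temporary working directory, similar to what BashOperator sets up
(cd "$workdir" && mkdir test)   # "test" ends up inside $workdir ...
rm -rf "$workdir"               # ... and is removed together with it when the run is over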
To keep the directory created from the bash command, you can either
specify an absolute path outside of the working directory, or
change your working directory to a place outside of the temporary directory.
I am creating a test directory in the Airflow home directory.
# import path for Airflow 2.x; older versions use airflow.operators.bash_operator
from airflow.operators.bash import BashOperator

p = BashOperator(
    task_id='create_dir',
    bash_command='pwd; mkdir $AIRFLOW_HOME/test; ls -al',
)

Related

Executing a bash script from anywhere on Windows

I am on Windows.
I have a script file named basics.sh and here is what it contains:
cd opt-out-exam/abduvosid_malikov/IT
mkdir made_by_my_script
cd made_by_my_script
echo "Hello World" > hello.txt
so basically, the basics.sh script file is responsible to:
go to folder opt-out-exam/abduvosid_malikov/IT
make a directory made_by_my_script
create hello.txt file with content Hello World
Right now, to execute this basics.sh script, I am going to the IT folder and writing this command in the terminal:
./basics.sh
In order to execute this basics.sh script, is it compulsory for me to go to the IT folder,
OR
is it possible to execute this script file even if I am in another folder (let's say the current working directory is opt-out-exam)?
The first line is a change-directory command followed by a relative path, not an absolute one. In such cases, it matters where you run the script from. (An absolute path would start with the filesystem root, i.e. /.)
If you run this script from a directory (I wouldn't call it a folder in this context) where the relative path opt-out-exam/abduvosid_malikov/IT does not exist, it won't cd into it. But it will still make the new directory without any problem, and it will also create the file and write a line into it.
So only the first line will fail if it's run somewhere else.
UPD: As Gordon Davisson pointed out, this means you should check whether the directory change actually took place.
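A minimal sketch of basics.sh with that check added (the || exit aborts the script if the cd fails, so the directory and file are not created in the wrong place):

#!/bin/bash
# abort immediately if the directory change fails
cd opt-out-exam/abduvosid_malikov/IT || exit 1
mkdir made_by_my_script
cd made_by_my_script
echo "Hello World" > hello.txt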

Shell script to create directory from any random directory

I am currently in /Desktop on my Ubuntu system and I want to create a directory named vip inside /Documents/subd. Please note that Documents and Desktop are at the same level. But the crux of this question is that I have to write a shell script such that it can create the required directory from any directory of the system, no matter where it is situated.
I have tried concatenating $home with the required directory path, but it is not working.
mkdir $home."/Documents/subd/vip"
I need to run this inside /Desktop or any other directory.
Please guide me!!
This should do the job:
mkdir "$HOME/Documents/subd/vip"
You just had some minor errors in your command: the environment variable is $HOME (uppercase; shell variable names are case-sensitive), and the shell concatenates strings by simply writing them next to each other, so the . is not needed.
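If some of the intermediate directories might not exist yet, mkdir -p creates them along the way:

mkdir -p "$HOME/Documents/subd/vip"   # -p creates any missing parent directories and does not fail if the directory already exists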

Can't run .sh or .py file directly from BashOperator

I am trying to run a Python script from a bash command in Airflow. I even tried running the .sh file from the bash command, which should trigger the Python file, but it keeps giving me the error "No such file or directory". I just can't figure out the path to give for the file.
I have tried specifying different paths to reach the directory and also changed the permissions of the files to 777, but still no change.
t1 = BashOperator(
    task_id='requestpage',
    bash_command="/home/username/runfile.sh ",
    dag=dag,
)
Above is the defined task, but it never executes.

How to execute EMR step that loads more scripts from s3?

I want to execute a shell script as a step on EMR that loads a tarball, unzips it and runs the script inside. I chose this setup to stay as vendor-agnostic as possible.
My script is
#!/bin/sh
aws s3 cp s3://path_to_my_bucket/name_of.tar.gz .
tar -xzf name_of.tar.gz
. main_script.sh
Where main_script.sh is part of the tarball along with a number of other packages, scripts and config files.
If I run this script as the Hadoop user on the master node, everything works as intended. When it is added as a step via command-runner.jar, I get errors no matter what I try.
What I tried so far (and the errors):
running the script as above (file not found "main_script.sh")
hardcoding the path to be the Hadoop user's home directory (permission denied on main_script.sh)
dynamically getting the path where the script lives (using this) and giving this path as an argument for the tar -C option, then invoking main_script.sh explicitly from this path (another permission denied on main_script.sh)
What is the proper way of loading a bash script into the master node and executing it?
As a bonus, I am wondering why command-runner.jar is set up so differently from the Spark step, which runs as the Hadoop user in the Hadoop user's directory.
You can use script-runner.jar with the region-specific JAR location:
JAR location : s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
Arguments : s3://your_bucket/your_shell_script.sh
Refer to the link below for more info:
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-script.html
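For example, the step could be added with the AWS CLI roughly like this (the cluster ID, bucket and script name are placeholders):

aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps 'Type=CUSTOM_JAR,Name=run_shell_script,ActionOnFailure=CONTINUE,Jar=s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[s3://your_bucket/your_shell_script.sh]'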

How to call nested script in sqlplus

I have the following script hierarchy.
Scripts/master.sql
Scripts/GB/gb.sql
Scripts/GB/user1/insert.sql
master.sql contains a simple @ script call to run gb.sql,
e.g.
@GB/gb.sql
gb.sql contains the following:
@user1/insert.sql
The problem is that if I run master.sql from the Scripts directory, I get the error below:
unable to find insert.sql
Whereas if I execute gb.sql from the GB directory, it runs successfully.
Can you please help me?
SQL*Plus directories are always relative to the original working directory. Your scripts will need to repeat the full path from the working directory each time.
Change gb.sql to:
@GB/user1/insert.sql
The @@ can be used to reference files in the same directory as the running file, but @@ does not work with subdirectories.
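For example, if everything is always launched from the Scripts directory (the connect string below is a placeholder), then every @ path in every script is resolved relative to Scripts:

cd Scripts
sqlplus user/password @master.sql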
