I have a shell script that should run a Pentaho transformation job, but it fails with the following error:
/data/data-integration/spoon.sh: 1: /data/data-integration/spoon.sh: ldconfig: not found
Here's the shell script, which sits in:
/home/tureprw01/
and the script itself:
#!/bin/sh
NOW=$(date +"%Y%m%d_%H%M%S")
/data/data-integration/pan.sh -file=/data/reporting_scripts/op/PL_Op.ExtlDC.kjb >> /home/tureprw01/logs/PL_Op.ExtDC/$NOW.log
I'm completely green when it comes to Java, but I need to make this work somehow.
Using command-line execution for Pan / Kitchen is simple; this documentation should help you create the batch/sh command and make it work.
I do see you are creating a variable on the command line, though. Personally, I don't know whether the batch/sh variable is passed down correctly to the PDI parameters; you'd have to test that yourself, or define the variable within the PDI structure rather than as a named parameter.
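If you do test passing the shell variable down, the documented route is the -param option, which sets a PDI named parameter that the job can read as ${NAME}. A minimal sketch (the parameter name RUN_TS is made up for this example and must be declared as a named parameter in the job; note that Pan runs .ktr transformations while Kitchen runs .kjb jobs, so kitchen.sh is the matching launcher for the file above):
#!/bin/sh
NOW=$(date +"%Y%m%d_%H%M%S")
# -param:NAME=value sets the named parameter NAME, readable inside PDI as ${NAME};
# RUN_TS is an illustrative name and must be declared in the job's settings
/data/data-integration/kitchen.sh -file=/data/reporting_scripts/op/PL_Op.ExtlDC.kjb "-param:RUN_TS=$NOW" >> /home/tureprw01/logs/PL_Op.ExtDC/$NOW.log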
Use this:
#!/bin/sh
NOW=$(date +"%Y%m%d_%H%M%S")
cd /data/reporting_scripts/op/
/data/data-integration/spoon.sh -main org.pentaho.di.kitchen.Kitchen -initialDir /data/data-integration -file=/data/reporting_scripts/op/PL_Op.ExtlDC.kjb
#!/bin/bash
# this form runs a job; to run a transformation instead, change
# "org.pentaho.di.kitchen.Kitchen" to "org.pentaho.di.pan.Pan" and point -file at a .ktr file
export PENTAHO_JAVA_HOME=/root/app/jdk1.8.0_91
export JAVA_HOME=/root/app/jdk1.8.0_91
cd /{kjb path}/;
/{spoon path}/spoon.sh -main org.pentaho.di.kitchen.Kitchen -initialDir /{kjb path}/ -file=/{kjb path}/{kjb file}.kjb -repo=/{kjb path}/{resource file}.xml -logfile=/{log file}.log -dir=/{kjb path}
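Filled in with the paths from the question, that template might look like this (the JDK location is an assumption; point it at whatever JDK is actually installed, and the trailing 2>&1 is added so Java errors land in the same log):
#!/bin/sh
# JDK location below is a placeholder; use your actual install path
export PENTAHO_JAVA_HOME=/opt/jdk1.8.0_91
export JAVA_HOME=/opt/jdk1.8.0_91
NOW=$(date +"%Y%m%d_%H%M%S")
cd /data/reporting_scripts/op/
/data/data-integration/spoon.sh -main org.pentaho.di.kitchen.Kitchen -initialDir /data/data-integration -file=/data/reporting_scripts/op/PL_Op.ExtlDC.kjb >> /home/tureprw01/logs/PL_Op.ExtDC/$NOW.log 2>&1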
Related
I have several multi-purpose shell scripts stored in .sh files. My intention is to build a few Airflow DAGs on Cloud Composer that will leverage these scripts. The DAGs would be made mostly of BashOperators that call the scripts with specific arguments.
Here's a simple example, greeter.sh:
#!/bin/bash
echo "Hello, $1!"
I can run it locally like this:
bash greeter.sh world
> Hello, world!
Let's write a simple DAG:
# import and define default_args
dag = DAG('bash_test',
          description='Running a local bash script',
          default_args=default_args,
          schedule_interval='0,30 5-23 * * *',
          catchup=False,
          max_active_runs=1)

bash_task = BashOperator(
    task_id='run_command',
    bash_command=f"bash greeter.sh world",
    dag=dag
)
But where should I put the script greeter.sh? I tried putting it both in the dags/ folder and in the data/ folder, at the first level or nested within a dependencies/ directory. I also tried writing the address as ./greeter.sh. No luck: the file is never found.
I also tried using sh in place of bash and I get a different error: sh: 0: Can't open greeter.sh. But this error also appears when the file is not there at all, so it's the same issue. The same happens with any attempt to run chmod +rx first.
How can I make my file available to Airflow?
The comments on this question revealed the answer.
The path to the DAGs folder is stored in the DAGS_FOLDER environment variable.
To get the right address for a script stored in dags_folder/:
import os
DAGS_FOLDER = os.environ["DAGS_FOLDER"]
file = f"{DAGS_FOLDER}/greeter.sh"
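Putting it together, the task from the question becomes something like this (a sketch; dag and default_args as defined in the question, and the import path matches Airflow 1.x, which Cloud Composer ran at the time):
import os
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x import path

DAGS_FOLDER = os.environ["DAGS_FOLDER"]

bash_task = BashOperator(
    task_id='run_command',
    # absolute path, so it resolves no matter which working directory the worker uses
    bash_command=f"bash {DAGS_FOLDER}/greeter.sh world",
    dag=dag
)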
I have to run several hundred simulations and scan the output file for a certain variable. In order to run the program, I need to write
$SIMPLESIM/simplesim-3.0/sim-outorder -config ../../config/tmp.cfg bzip2_base.i386-m32-gcc42-nn dryer.jpg
to the terminal, where tmp.cfg is the config file I will be modifying for each simulation. Running this produces an output file whose name I set via tmp.cfg. This obviously works when I literally type it into the terminal; however, running this command from a bash script gives me the error
simplesim-3.0/sim-outorder no such file or directory
I believe it has to do with the $ symbol? Thanks for any help.
Before you call any command, its path must either be in the $PATH variable or you have to give the complete path to invoke it.
So define SIMPLESIM in the script, e.g. SIMPLESIM=/usr/bin (this /usr/bin is for reference only; to find the real path, run echo $SIMPLESIM in your terminal and note what it prints),
and then call the command: $SIMPLESIM/simplesim-3.0/sim-outorder -config ../../config/tmp.cfg bzip2_base.i386-m32-gcc42-nn dryer.jpg
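For example, at the top of the script (the path below is a stand-in for whatever echo $SIMPLESIM prints in your terminal):
#!/bin/bash
# stand-in path; replace with the output of: echo $SIMPLESIM
export SIMPLESIM=/home/user/tools
"$SIMPLESIM"/simplesim-3.0/sim-outorder -config ../../config/tmp.cfg bzip2_base.i386-m32-gcc42-nn dryer.jpg
Note that the relative ../../config/tmp.cfg also depends on the directory the script is started from, so run the script from the same directory you used interactively.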
I hope this is fairly simple but I'm struggling to get this to work.
I have a java package which I want to execute using a shell script command...
/jdk1.7.0/bin/java .path.to.classname.ClassToExecute >> /var/log/output.log
...so essentially...
./SCRIPT_NAME
...should run the above from the command line.
The problem is there is a classpath update needed every time first from the command line to enable the session to see a particular JAR...
export CLASSPATH=$CLASSPATH:/path/to/jar/file/lib/JAR_NAME.jar:.
If I don't put this line in first, the shell script fails, throwing NoClassDefFoundError for classes in the JAR I need to add manually.
Can anyone tell me where I need to put this classpath update so that it's ALWAYS available to the script, and also to cron, as ultimately I want to call it from cron?
Thanks,
ForestSDMC
Your shell script should look like this.
#!/bin/bash
export CLASSPATH=$CLASSPATH:/path/to/jar/file/lib/JAR_NAME.jar:.
/jdk1.7.0/bin/java .path.to.classname.ClassToExecute >> /var/log/output.log
You also need to change the permissions of the script so that it is executable:
chmod 700 SCRIPT_NAME
700 = only the owner can run the script
770 = the owner and members of the group can run the script
777 = everyone with access to the server can run the script
I noticed that you want to run this from cron. You need to source your .profile, either from the crontab entry or from within the script.
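For example, a crontab entry along these lines (the schedule and script location are illustrative):
# m h dom mon dow  command
0 2 * * * . $HOME/.profile; /home/forestsdmc/SCRIPT_NAME
Alternatively, add . $HOME/.profile as the second line of the script itself, right after the shebang.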
Just found the answer, and it works fine, so hopefully others will find this useful...
You can dynamically generate the classpath variable within the shell script and then pass it to the java command line execution, like this:
THE_CLASSPATH=
# collect every jar in the lib directory onto the classpath
for i in /path/to/the/JARS/lib/*.jar
do
  THE_CLASSPATH=${THE_CLASSPATH}:${i}
done
/usr/bin/java -cp ".:${THE_CLASSPATH}" path.to.the.class.ClassName >> /var/log/logfile.log
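As a side note, on Java 6 and later you can usually skip the loop entirely: -cp accepts a directory wildcard that the JVM itself expands to every .jar in that directory (quote it so the shell doesn't expand it first):
# the JVM, not the shell, expands lib/* to all jars in that directory
/usr/bin/java -cp ".:/path/to/the/JARS/lib/*" path.to.the.class.ClassName >> /var/log/logfile.log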
I am using a scheduler to run a unix script which starts up my application. The script is on the PATH of the user the scheduler runs as, so it can be run from any directory.
My application's log files are created relative to where the script is run from. Unfortunately, the scheduler does not run the script from the folder I had hoped, so the log files do not go to the correct folder.
Is there any way to make the script run as if it had been started from a specified folder, e.g. ./ScriptName.sh Working_Folder | Run_Folder?
Note: I cannot change the script
If your scheduler runs your tasks using a shell (which it probably does), you can use { cd /log/dir ; script; } directly as the command.
If not, you need to use a wrapper script as stated by @Gilles, but I would do:
#!/bin/sh
cd /log/dir
exec /path/to/script "$@"
to save a little memory: the exec makes sure that only the script's interpreter stays in memory, instead of both (sh and the script's interpreter).
If you can't change the script, you'll have to make the scheduler run a different command, not the script directly. For example, make the scheduler run a simple wrapper script.
#!/bin/sh
cd /desired/directory/for/log/files
/path/to/script "$@"
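Save the wrapper next to the real script (the name wrapper.sh is illustrative), make it executable, and point the scheduler at the wrapper instead of the original script:
chmod +x /path/to/wrapper.sh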
This keeps happening to me all the time:
1) I write a script (ruby, shell, etc).
2) I run it; it works.
3) I put it in the crontab so it runs in a few minutes, to confirm it runs from there.
4) It doesn't. No error, no trace. Back to step 2 or 3, a thousand times.
When my ruby script fails under crontab, I can't really tell why it fails, because when I pipe the output like this:
ruby script.rb >& /path/to/output
I sort of get the output of the script, but I don't get any of its errors, and I don't get the errors coming from bash (like when ruby is not found or the file isn't there).
I have no idea what environment variables are set and whether or not that's the problem. It turns out that to run a ruby script from crontab you have to export a ton of environment variables.
Is there a way for me to just have crontab run a script as if I ran it myself from my terminal?
When debugging, I have to reset the timer and go back to waiting, which is very time consuming.
How can I test things in crontab better, or avoid these problems altogether?
"Is there a way for me to just have crontab run a script as if I ran it myself from my terminal?"
Yes:
bash -li -c /path/to/script
From the man page:
[vindaloo:pgl]:~/p/test $ man bash | grep -A2 -m1 -- -i
-i If the -i option is present, the shell is interactive.
-l Make bash act as if it had been invoked as a login shell (see
INVOCATION below).
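In a crontab entry that looks like, for example (schedule illustrative):
*/10 * * * * bash -li -c '/path/to/script.rb'
Be aware that -i makes bash read your interactive startup files, so the job's behaviour now depends on your dotfiles.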
G'day,
One of the basic problems with cron is that it gives you a minimal environment. In fact, you only get four env. vars set, and they are:
SHELL - set to /bin/sh
LOGNAME - set to your userid as found in /etc/passwd
HOME - set to your home dir. as found in /etc/passwd
PATH - set to "/usr/bin:/bin"
That's it.
However, what you can do is take a snapshot of the environment you want and save it to a file.
Then make your cron job run a trivial shell script that sources this env. file and then executes your Ruby script.
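A minimal sketch of that pattern, assuming bash and illustrative file names. First, a one-off from your normal interactive shell, which saves properly quoted export statements:
export -p > "$HOME/cron_env.sh"
Then the trivial wrapper that the cron job actually runs:
#!/bin/bash
# restore the captured environment, then hand off to the real script
. "$HOME/cron_env.sh"
exec ruby /path/to/script.rb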
BTW Having a wrapper source a common env. file is an excellent way to enforce a consistent environment for multiple cronjobs. This also enforces the DRY principle because it gives you just one point to update things as required, instead of having to search through a bunch of scripts and search for a specific string if, say, a logging location is changed or a different utility is now being used, e.g. gnutar instead of vanilla tar.
Actually, this technique is used very successfully with The Build Monkey which is used to implement Continuous Integration for a major software project that is common to several major world airlines. 3,500kSLOC being checked out and built several times a day and over 8,000 regression tests run once a day.
HTH
'Avahappy,
Run a 'set' command from inside the ruby script, fire it from crontab, and you'll see exactly what's set and what's not.
To find out the environment in which cron runs jobs, add this cron job:
{ echo "\nenv\n" && env|sort ; echo "\nset\n" && set; } | /usr/bin/mailx -s 'my env' you@example.com
Or send the output to a file instead of email.
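For a file, something like this works (the output path is illustrative):
* * * * * { echo env; env | sort; echo; echo set; set; } > /tmp/cron_env.txt 2>&1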
You could write a wrapper script, called for example rbcron, which looks something like:
#!/bin/bash
RUBY=ruby
export VAR1=foo
export VAR2=bar
export VAR3=baz
# "$@" passes each argument through intact; 2>&1 folds ruby's stderr into stdout
$RUBY "$@" 2>&1
This redirects ruby's standard error to standard output. Then you run rbcron in your cron job, and the standard output contains ruby's output and errors, plus any "bash" errors coming from rbcron itself. In your cron entry, redirect with > /path/to/output 2>&1 (the file redirection must come before 2>&1, otherwise stderr will not follow stdout into the file).
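For example (schedule and locations illustrative):
*/5 * * * * /usr/local/bin/rbcron /path/to/script.rb > /path/to/output 2>&1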
If you really want to run it as yourself, you may want to invoke ruby from a shell script that sources your .profile/.bashrc etc. That way it'll pull in your environment.
However, the downside is that it's not isolated from your environment, and if you change that, you may find your cron jobs suddenly stop working.