cron tab job not getting executed - hadoop

I have Hadoop installed on a CentOS system. I have a shell script which merges all the small HDFS files generated under a particular hour's folder into one single file at another HDFS location.
The script works perfectly when invoked manually.
I then set the script to run as a cron job at 01:30 AM every day.
I typed crontab -e and pasted this:
30 1 * * * /home/hadoop/tmp/cron-merge-files.sh > /home/hadoop/tmp/cron-merge-files.txt
But the merge operation does not happen. I can see in /var/log/cron that the entry fires at 01:30 AM, but I can't see the merged files on HDFS. When I simply execute the shell script by hand, it works perfectly and does everything written inside it.
Jul 8 01:30:01 ip-10-1-3-111 CROND[2265463]: (hadoopuser) CMD (/home/hadoop/tmp/cron-merge-files.sh > /home/hadoop/tmp/cron-merge-files.txt)
The content of /home/hadoop/tmp/cron-merge-files.txt is just the output of a single echo statement written inside a loop. The loop is supposed to run 24 times, and it prints that line 24 times.
I am not sure what is happening.

I got the solution for this problem from another forum. The environment variables were not getting picked up when the same script was run through crontab. Sourcing .bash_profile in my script fixed it.
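A minimal sketch of that fix, assuming /home/hadoop/.bash_profile is where JAVA_HOME, HADOOP_HOME and PATH are exported (the hdfs command in the comment is only illustrative; the real merge logic stays whatever your script already does):

#!/bin/bash
# Pull in the interactive-shell environment so the hadoop/hdfs binaries
# are on PATH when cron runs this script with its minimal environment.
source /home/hadoop/.bash_profile

# ... existing merge logic follows, e.g. something along the lines of:
# hdfs dfs -cat /input/hour=01/part-* | hdfs dfs -put - /merged/hour-01.txt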

Related

How do I execute this cron job from a Mac?

I have a bash script that I can execute with cd ~/Documents/Code/shhh/ && ./testy from any directory on my computer, and it successfully pushes to GitHub, which is what I want.
I'm trying to schedule a cron job to do this daily, so I ran crontab -e, which opens a nano editor, and put in 30 20 * * * cd ~/Documents/Code/shhh/ && ./testy to run daily at 10:30pm, then hit Control-O, Enter and Control-X. But it still didn't execute. When I type crontab -l it shows my command, and I get a You have new mail. message when I open a new window. Still, my command doesn't execute, even though it works when I run it from any other directory.
I think my crontab job is at /var/at/tmp, so I ran 30 20 * * * cd ../../../Users/squirrel/Documents/Code/shhh/ && ./testy, but still nothing, even though it does work when I type it out myself from that directory. Side note: I can't enter the tmp folder even after using sudo.
OK, when I type mail I see a lot of messages, and inside I get this error:
---------------Checking Status of 2---------------
[master 0c1fff8] hardyharhar
1 file changed, 1 insertion(+), 1 deletion(-)
fatal: could not read Username for 'https://github.com': Device not configured
When you open a file with nano it opens in insert mode by default (unlike vi, where you have to explicitly enter insert mode by pressing the i key). When you press Ctrl+O it asks whether you want to save changes to the open file, e.g. File Name to Write: Input_file. If you press Enter it saves the file and returns you to the editing screen. You can then press Ctrl+X to leave the file. Maybe you got stuck after saving and just needed to exit; try that once?
crontab -e does not edit the crontab file "live" at all -- changes are saved to the active file only after you save changes and exit the editor.
It also sounds like you may be using incorrect directory paths. The command in a crontab entry will generally be executed starting from the user's home directory. So if your home directory is /Users/squirrel, the command cd ../parent_directory/ will try to move to /Users/parent_directory. I suspect this is not what you want.
Finally, note that cron jobs run with a very minimal environment, without running most of your usual shell setup files (e.g. .bashrc). Most notably, if your script uses any commands that aren't in /bin or /usr/bin, you'll either need to use explicit full paths to them, or change the PATH variable to include the directories they're in.
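As a minimal sketch of both suggestions, assuming git and any other tools the script needs live under /usr/local/bin (check with which git; the log file path is only there for debugging and is an assumption too):

# Give cron a usable PATH for every entry in this crontab
PATH=/usr/local/bin:/usr/bin:/bin

# Use an absolute path rather than relying on cron's starting directory,
# and capture output so failures are visible somewhere other than local mail
30 20 * * * cd /Users/squirrel/Documents/Code/shhh && ./testy >> /tmp/testy.log 2>&1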

Execute bash commands in .txt file via cronjob

Set-up
I have 3 .txt files containing commands to be executed each day.
The berlin_run.txt file executes, among other things, the 2 other .txt files. The file is the following:
#!/bin/bash
cd /path/to/folder/containing/berlin_run.txt
PATH=$PATH:/usr/local/bin
export PATH
./spider_apartments_run.txt
./spider_rooms_run.txt
python berlin_apartments_ads.py;python berlin_rooms.py
When I cd to /path/to/folder/containing/berlin_run.txt in my macOS Terminal and execute ./berlin_run.txt, everything works fine.
It is my understanding that ./ opens berlin_run.txt, and that #!/bin/bash ensures that the subsequent lines in berlin_run.txt are automatically executed upon opening.
Problem
I want to automate the execution of berlin_run.txt.
I have written the following cronjob,
10 13 * * * /path/to/folder/containing/berlin_run.txt
It is my understanding that this cronjob should open the berlin_run.txt each day at 13:10. Assuming that is correct, #!/bin/bash should execute all the subsequent lines. But nothing seems to happen.
Where and what am I doing wrong here?
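For illustration, a minimal sketch of the same cron entry with an explicit PATH and output capture, assuming berlin_run.txt is executable and python lives in /usr/local/bin (both assumptions; check with ls -l berlin_run.txt and which python):

# Give cron the PATH the script expects and log output for debugging
PATH=/usr/local/bin:/usr/bin:/bin
10 13 * * * /path/to/folder/containing/berlin_run.txt >> /tmp/berlin_run.log 2>&1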

monitor for file then copy to another directory

I'm relatively new to unix scripting, so apologies for the newbie question.
I need to create a script which will permanently run in the background, and monitor for a file to arrive in an FTP landing directory, then copy it to a different directory, and lastly remove the file from the original directory.
The script is running on a Ubuntu box.
The landing directory is /home/vpntest
The file needs to be copied as /etc/ppp/chap-secrets
So far, I've got this
#/home/vpntest/FTP_file_copy.sh
if [ -f vvpn_azure.txt ]; then
cp vvpn_azure.txt /etc/ppp/chap-secrets
rm vvpn_azure.txt
fi
I can run this as root, and it works, but only as a one-off (I need it to run permanently in the background, and trigger each time a new file is received in the landing zone).
If I don't run as root, I get issues with permissions (even if I run it from within the directory /home/vpntest.)
Any help would be much appreciated.
Updated: crontab correction and extra info
One way to have a check-and-move process running in the background with root permissions is the "polling" approach: run your script from the root user's crontab.
Steps:
Revise your /home/vpntest/FTP_file_copy.sh:
#!/bin/bash
new_file=/home/vpntest/vvpn_azure.txt
if [ -f "$new_file" ]; then
mv "$new_file" /etc/ppp/chap-secrets
fi
Log out. Log in as root user.
Add a cron task to run the script:
crontab -e
If this is a new machine and it is your first time running crontab, you may first be prompted to choose an editor; just pick one and continue into the editor.
The format is m h dom mon dow command, so if checking every 5 minutes is sufficiently frequent, do:
*/5 * * * * /home/vpntest/FTP_file_copy.sh
Save and close to apply.
It will now automatically run the script every 5 minutes in the background, helping you to move the file if found.
Explanation
Root user, because you mentioned it only worked for you as root.
So we set this in the root user's crontab to have sufficient permissions.
man 5 crontab informs us:
Steps are also permitted after an asterisk, so if you want to say
'every two hours', just use '*/2'.
Thus we write */5 in the first column, which is the minutes column, to mean "every 5 minutes".
FTP_file_copy.sh:
- uses absolute paths, so it can run from anywhere
- is rearranged so the single variable new_file can be reused
- quotes the value being checked inside the [ ] test, which is good practice
- uses mv to overwrite the destination while removing the file from the source directory

Bash cron job doesn't execute like it does on command line

I've written a bash script that executes a python script to write a file to a directory, then sends that file to Amazon S3. When I execute the script from the command line it executes perfectly, but when I run it with cron, the file writes to the directory, but never gets sent to S3. I must be doing something wrong with cron.
Here is the bash script:
#!/bin/bash
#python script that exports file to home directory
python some_script.py
#export file created by python to S3
s3cmd put /home/bitnami/myfile.csv s3://location/to/put/file/myfile.csv
Like I said before, manually executing works fine using ./bash_script.sh. When I set up the cron job, the file writes to the directory, but never gets sent to S3.
my cron job is:
18 * * * * /home/bitnami/bash_script.sh
Am I using cron incorrectly? Please help.
The cron entry looks OK; however, the relative path to the .py file will not be found.
You will have to add a path or home directory, like:
location=/home/bitnami/
python $location/some_script.py
s3cmd also needs to be referenced by its full path:
/bin/s3cmd
Alternatively, you might also need to load your user environment first, before executing the script, so that s3cmd can find the username/password/SSH key it needs.
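Putting those suggestions together, a minimal sketch of the revised script (the .bashrc and s3cmd locations are assumptions; check the latter with which s3cmd on your machine):

#!/bin/bash
# Load the user environment so credentials and PATH match an interactive shell
source /home/bitnami/.bashrc

# Use absolute paths, since cron starts with a minimal environment
location=/home/bitnami
python "$location/some_script.py"

# Call s3cmd by full path; /usr/bin/s3cmd is assumed here
/usr/bin/s3cmd put "$location/myfile.csv" s3://location/to/put/file/myfile.csv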

SQL script not executing in bash

I am running SQL scripts from bash. One of the scripts runs fine, but the other one fails. Can you please advise what the cause might be?
#!/bin/bash
sqlplus -S user/password@database << EOF
whenever sqlerror exit sql.sqlcode;
set echo off
set heading off
@MyScript1
@MyScript2
exit;
EOF
Error:
SP2-0310: unable to open file "MyScript2.sql"
In Unix the access level for both is:
-rwxrwxrwx MyScript1.sql
-rwxrwxrwx MyScript2.sql
The error does give an indication that it is not able to access the file MyScript2.sql. But what I am curious about is how come it can access MyScript1.sql which is present in the same folder, but not MyScript2.sql?
Also, if I run the scripts directly in Unix (using SQL*Plus) from the folder where the files are present, everything works fine, but if I run the same from a different folder it doesn't. The example below explains it better:
/Folder/having/the/files
both MyScript1.sql and MyScript2.sql run fine
/Some/random/folder
MyScript1.sql runs fine, but MyScript2.sql errors out
You said:
if I run the file just in unix (using SQL*Plus) from the folder where the
files are present it works fine. But if I run the same from a
different folder it doesn't.
If you run the bash script from a different folder to the one where you have the SQL files, how do you expect SQL*Plus to know where to find them? The question becomes not 'why can't it see MyScript2.sql?' but 'why can it see MyScript1.sql?'. The obvious answer is that it can't, or at least can't see the version of the file you think it's seeing.
From the SQL*Plus documentation:
SQL*Plus searches for SQL scripts, including login.sql, in the current
directory and then in the directories specified by SQLPATH, and in the
subdirectories of SQLPATH directories.
So if you haven't given the full path to the SQL file, it will search in the current working directory - where you are sitting when you execute the bash script, not the directory the bash script is in, i.e. what pwd shows - and then in $SQLPATH if it is set.
That suggests you have a copy of MyScript1.sql in one of those places, or possibly a soft link to your real file. If I had to guess, I'd speculate that you originally wrote MyScript1.sql in the same directory as the script, then copied it to another directory before writing MyScript2.sql. In any case, the MyScript1.sql you're running might be out of date, or is likely to become so in the future.
The short answer is to give the full path to the SQL files, either as part of the @ command, or by changing to that directory in the bash script before launching SQL*Plus.
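A minimal sketch of that fix, assuming the SQL files live in /Folder/having/the/files (the connect string is unchanged from the question):

#!/bin/bash
# Either cd to the directory holding the scripts before launching SQL*Plus ...
# cd /Folder/having/the/files
sqlplus -S user/password@database << EOF
whenever sqlerror exit sql.sqlcode;
set echo off
set heading off
-- ... or reference the scripts by full path so the working directory no longer matters
@/Folder/having/the/files/MyScript1.sql
@/Folder/having/the/files/MyScript2.sql
exit;
EOF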
