I have the following script that pushes files to remote location:
#!/usr/bin/bash
HOST1='a.b.c.d'
USER1='load'
PASSWD1='load'
DATE=`date +%Y%m%d%H%M`
DATE2=`date +%Y%m%d%H`
DATE3=`date +%Y%m%d`
FTPLOGFILE=/logs/Done.$DATE2.log
D_FOLDER='/dir/load01/input'
PUTFILE='file*un'
ls $PUTFILE | while read file
do
echo "${file} transfered at $DATE" >> /logs/$DATE3.log
done
ftp -n -v $HOST1 <<SCRIPT >> ${FTPLOGFILE} 2>&1
quote USER $USER1
quote PASS $PASSWD1
cd $D_FOLDER
ascii
prompt off
mput /data/file*un
quit
SCRIPT
mv *un test/
ls test/*un | awk '{print("mv "$1" "$1)}' | sed 's/\.un/\.processed/2' |sh
rm *unl
I am getting this error output:
200 PORT command successful.
553 /data/file1.un: A file or directory in the path name does not exist.
200 PORT command successful.
Some improvements:
#!/usr/bin/bash
HOST1='a.b.c.d'
USER1='load'
PASSWD1='load'
read Y m d H M <<<$(date "+%Y %m %d %H %M") # only one call to date
DATE="$Y$m$d$H$M"
DATE2="$Y$m$d$H"
DATE3="$Y$m$d"
FTPLOGFILE=/logs/Done.$DATE2.log
D_FOLDER='/dir/load01/input'
PUTFILE='file*un'
for file in $PUTFILE # no need for ls
do
echo "${file} transfered at $DATE"
done >> /logs/$DATE3.log # output can be done all at once at the end of the loop.
ftp -n -v $HOST1 <<SCRIPT >> ${FTPLOGFILE} 2>&1
quote USER $USER1
quote PASS $PASSWD1
cd $D_FOLDER
ascii
prompt off
mput /data/file*un
quit
SCRIPT
mv *un test/
for f in test/*un # no need for ls and awk
do
mv "$f" "${f/%.un/.processed}"
done
rm *unl
I recommend using lower case or mixed case variables to reduce the chance of name collisions with shell variables.
Are all those directories really directly off the root directory?
Ftp to the remote site and execute the ftp commands by hand. When the error occurs, look around to see what the cause is. (Use "help" if you don't know the ftp command line.)
Probably the /data directory does not exist. Has anyone reorganized the upload directory recently, or maybe moved the root directory of the ftp server?
The problem with scripting an FTP session is that ftp believes it has executed correctly even when it only reports errors to stdout. Consequently, it's devilishly hard to pick up errors, since it will only return a failing exit status on something catastrophic. If you need anything more than the most simple of command lists, you should really be using something like expect or a Java or Perl program that can easily test the result of each action.
That said, you can run the ftp as a coprocess, or set it up so that it runs in the background with its stdin and stdout attached to named pipes, or some structure like that where you can read and parse the output from one command before deciding what to pass in for the next one.
A read loop that cycles on a case statement which tests for known responses and behaves accordingly is a passably acceptable all-bash version. If you always terminate every command block with something like an image command that returns a fixed and known value, you can scan for known errors, check for the return from that sentinel command in the case statement, and when you get the sentinel reply, loop back and read the next input. This makes for a largish and fairly complicated shell script, though.
Also, when you match (for example) a 5[0-9][0-9] *) return, you need to test that it isn't actually something like "553 bytes sent", because ftp screws you that way too.
Apologies for the length of the answer without including a code example - I just wanted to mention some ideas and caveats that wouldn't fit readably in a comment.
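To make the coprocess idea a bit more concrete, here is a rough all-bash sketch. It assumes bash 4+ for coproc; the host, credentials, directory and file name are just the placeholders from the question, and pwd (reply code 257) stands in for the image command as the sentinel because its reply code is easier to tell apart from the other responses:
#!/bin/bash
# Run ftp as a coprocess so each reply can be parsed before sending the next command.
coproc FTP { ftp -n -v a.b.c.d 2>&1; }

send() { printf '%s\n' "$1" >&"${FTP[1]}"; }

send "quote USER load"
send "quote PASS load"
send "cd /dir/load01/input"
send "put file1.un"
send "pwd"          # sentinel: its 257 reply marks the end of this command block

status=0
while read -r line <&"${FTP[0]}"; do
    case $line in
        *\ bytes\ *)    ;;              # transfer reports like "553 bytes sent" - not errors
        5[0-9][0-9]\ *) status=1 ;;     # a real 5xx error reply
        257\ *)         break ;;        # sentinel reached, stop reading this block
    esac
done
send "quit"
exit $status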
The script below downloads a file using curl. Inside the loop I'm trying to save the file and also to store the saved file name in a variable and then print it.
My script downloads and saves the file, but it can't echo the saved file name:
for link in $url2; do
cd /var/script/twitter/html_files/ && file1=$({ curl -O $link ; cd -; })
echo $file1
done
Script explanation:
$url2 contains one or more URLs
curl -O writes the output to a file named like the remote file
Your code has several problems. Assuming $url2 is a list of valid URLs which do not require shell quoting, you can make curl print the output file name directly.
cd /var/script/twitter/html_files
for link in $url2; do
curl -s -w '%{filename_effective}\n' -O "$link"
done
Without the -w formatstring option, the output of curl does not normally contain the output file name in a conveniently machine-readable format (or actually at all). I also added an -s option to disable the download status output it prints by default.
There is no point in doing cd to the same directory over and over again inside the loop, or in capturing the output into a variable which you only use once, just to print to standard output the string which curl would otherwise print there by itself.
Finally, the cd - does not seem to do anything useful here; even if it did something useful per se, you are doing it in a subshell, which doesn't change the current working directory of the script which contains the $(cd -) command substitution.
If your task is to temporarily switch to that directory, then switch back to where you started, just cd once. You can use cd - in Bash but a slightly more robust and portable solution is to run the fetch in a subshell.
( cd directory;
do things ...
)
# you are now back to where you were before the cd
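A concrete version of that pattern, using the directory and loop from the question, might look like this:
(
    cd /var/script/twitter/html_files || exit 1
    for link in $url2; do
        curl -s -w '%{filename_effective}\n' -O "$link"
    done
)
# back in the original directory here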
If you genuinely need the variable, you can trivially use
for link in $url2; do
file1=$(curl -s -w '%{filename_effective}' -O "$link")
echo "$file1"
done
but obviously the variable will only contain the result from the last iteration after the loop (in the code after done). (The format string doesn't need the final \n here because the command substitution will trim off any trailing newline anyway.)
I have some pseudocode below and would like to know if it would work / is the best method to tackle the problem before I begin developing the code.
I need to dynamically search through a directory on one server and find out if it exists on another server or not. The path will be different so I use basename and save it as a temporary variable.
for $FILE in $CURRENT_DIRECTORY
$TEMP=$(basename "$FILE" )
if [ssh user@other_serverip find . -name '$TEMP']; then
//write code here
fi
Would this if statement return true if the file existed on the other server?
Here is a functioning, cleaner implementation of your logic:
for FILE in *; do
if ssh user@other_serverip test -e "$FILE"; then
# write code here
fi
done
(There won't be a path on files when the code is composed this way, so you don't need basename.) test -e "$FILE" will silently exit 0 (true) if the file exists and 1 (false) if the file does not, though ssh will also exit with a false code if the connection fails.
However, that is a very expensive way to solve your issue. It will fail if your current directory has too many files in it and it runs ssh once per file.
You're better off getting a list of the remote files first and then checking against it:
#!/bin/sh
if [ "$1" != "--xargs" ]; then # this is an internal flag
(
ssh user@other_serverip find . -maxdepth 1 # remote file list
find . -maxdepth 1 # local file list
) |awk '++seen[$0]==2' |xargs -d "\n" sh "$0" --xargs # keep duplicates
else
shift # remove the --xargs marker
for FILE in "$@"; do
# write code here using "$FILE" (with quotes)
done
fi
This does two things. First, since the internal --xargs is not given when you run the script, it connects to the remote server and gets a list of all files in the home directory there. These will be listed as ./.bashrc for example. Then the same list is generated locally, and the results are passed to awk.
The awk command builds an associative array (a hash) from each item it sees, incrementing it and then checking the total against the number two. It prints the second instance of any line it sees. Those are then passed on to xargs, which is instructed to use \n (a line break) as its delimiter rather than any space character.
Note: this code will break if you have any files that have a line break in their name. Don't do that.
xargs then re-invokes this script; this time it takes the else clause and we loop through each file. If you have too many files, be aware that there may be more than one instance of this script (see man xargs).
This code requires GNU xargs. If you're on BSD or some other system that doesn't support xargs -d "\n", you can use perl -pe 's/\n/\0/' |xargs -0 instead.
It would return true if ssh exits successfully.
Have you tried command substitution and parsing find's output instead?
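That could look something like the sketch below. The user/host placeholder is from the question; -maxdepth 1 and the remote quoting are my assumptions, and file names containing spaces or quotes would need more careful handling:
for FILE in *; do
    # capture find's output; non-empty output means the name exists remotely
    match=$(ssh user@other_serverip "find . -maxdepth 1 -name '$FILE'")
    if [ -n "$match" ]; then
        # write code here
        echo "$FILE exists on the remote server as $match"
    fi
done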
I have a list of 9000 URLs in a file.txt that I want to scrape, keeping the same dir structure as written in the URL list.
Each URL has the form http://domain.com/$dir/$sub1/$ID/img_$ID.jpg where $dir and $sub1 are integer numbers from 0 to 9.
I tried running
wget -i file.txt
but it puts every img_$ID.jpg in the local dir where I am, so I get all the files in the root, losing the $dir/$sub1/$ID folder structure.
I thought I would have to write a script which does
mkdir -p $dir/$sub1/$ID
wget -P $dir/$ # Correcting a typo in the message: I left the full path incomplete; it is the same as the previous mkdir command => "wget -P $dir/$sub1/$ID"
for each line in file.txt, but I have no idea where to start.
I think a simple shell loop with a bit of string processing should work for you:
while read line; do
line2=${line%/*} # removing filename
line3=${line2#*//} # removing "http://"
path=${line3#*/} # removing "domain.com/"
mkdir -p $path
wget -P$path $line
done <file.txt
Notice that the wget command is not as you described (wget -P $dir/$), but rather the one that seems more correct (wget -P $dir/$sub1/$ID). If you insist on your version, please clarify what you mean by the trailing $.
Also, for the purpose of debugging you might want to verify the processing before you run the actual script (the one that copies the files); you can do something like this:
while read line; do
echo $line
line2=${line%/*} # removing filename
echo $line2
line3=${line2#*//} # removing "http://"
echo $line3
path=${line3#*/} # removing "domain.com/"
echo $path
done <file.txt
You'll see all the string-processing steps and can make sure the resulting path is correct.
Thank you in advance for any help; this is coursework, so further reading/pointers are greatly appreciated.
I asked a question the other day relating to my own delete/trash/restore scripts and I have completed delete and trash as well as giving delete a backup text file for Restore to use later on.
However, instead of giving me errors, the Restore script just kinda stops in the console. When I type ~/Restore -n at the # prompt, the cursor skips to the next line without the usual # and I have to close it manually. Likewise without the -n option. The -n option should ask for a new location to restore to, and without it the script should restore the file to its original location.
I'll post my script, see what y'all think.
#!/bin/bash
if [ "$1" == "-n" ]
then cd ~/rubbish
restore= grep $2 ~/store
filename= basename "$restore"
echo "Type the files new location"
read location
location1 = "readlink -f $location"
mv -i $filename "$location1" /$filename
else cd ~/rubbish
restore= grep $2 ~/store
filename= basename "$restore"
mv -i $filename "$location1" $location
fi
So, ~/rubbish is my own created directory to act as a recycle bin, and ~/store is my text file to which each deleted file's readlink details are appended on deletion. I can post the whole 3 scripts if necessary?
Many thanks!
If you call ~/Restore -n it will go to the if part and do a grep $2 ~/store. Since there is no parameter $2 it will result in grep ~/store, which tells grep to search for "~/store" in the input coming from standard input.
That's why your script stops and waits for input.
You can either test for a second parameter or enclose $2 in double quotes to make sure grep gets the correct number of parameters. Better yet, do both: 1. test for a second parameter and 2. enclose $2 in double quotes.
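In the -n branch, for instance, the test for the second parameter could be as simple as this (the usage message is just an illustration):
if [ -z "$2" ]; then
    echo "Usage: $0 -n filename" >&2
    exit 1
fi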
Some more points:
Don't put spaces around =
Enclose commands in backticks (`) if you want to capture the output
And don't put a space between the directory and the filename
So, you should presumably write
restore=`grep "$2" ~/store`
filename=`basename "$restore"`
echo "Type the files new location"
read location
location1=`readlink -f "$location"`
mv -i $filename "$location1/$filename"
I suggest you look at bash info and follow the "Books and Resources".
I wrote one of these quite some time ago which I still use today. I don't have a restore script because I wrote it so that you could open your desktop trash can, right click and select "Restore". In other words it follows the Linux "trash info" standard.
http://wiki.linuxquestions.org/wiki/Scripting#KDE4_Command_Line_Trash_Can
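For reference, the core of that standard is simple enough to sketch in a few lines of bash; the file name and original path below are just examples, and a real script also has to handle name collisions and URL-encoding of the path:
# Move the file into the trash and record where it came from, following the
# freedesktop.org trash layout that desktop trash cans read.
name=report.txt
mv "/home/user/Documents/$name" ~/.local/share/Trash/files/
cat > ~/.local/share/Trash/info/"$name.trashinfo" <<EOF
[Trash Info]
Path=/home/user/Documents/$name
DeletionDate=$(date +%Y-%m-%dT%H:%M:%S)
EOF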
I've been handed a project that consists of several dozen (probably over 100, I haven't counted) bash scripts. Most of the scripts make at least one call to another one of the scripts. I'd like to get the equivalent of a call graph where the nodes are the scripts instead of functions.
Is there any existing software to do this?
If not, does anybody have clever ideas for how to do this?
Best plan I could come up with was to enumerate the scripts and check to see if the basenames are unique (they span multiple directories). If there are duplicate basenames, then cry, because the script paths are usually held in variables, so you may not be able to disambiguate. If they are unique, then grep for the names in the scripts and use those results to build up a graph. Use some tool (suggestions?) to visualize the graph.
Suggestions?
Wrap the shell itself with your own implementation, log who called your wrapper, and exec the original shell.
Yes, you have to actually run the scripts in order to identify which scripts are really used. Otherwise you would need a tool with the same knowledge as the shell engine itself to handle all the variable expansion, PATH lookups, etc. -- I have never heard of such a tool.
To visualize the call graph, use GraphViz's dot format.
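A minimal sketch of such a wrapper, assuming the scripts' shebang lines (or your PATH) can be pointed at it and that /tmp/shell-calls.log is an acceptable log location:
#!/bin/bash
# Log when we were called, by whom (the parent process), and with what arguments,
# then hand control over to the real shell.
echo "$(date '+%F %T') parent=$(ps -o args= -p $PPID) args=$*" >> /tmp/shell-calls.log
exec /bin/bash "$@"
Each log line then pairs a caller with the script it invoked, which is easy to turn into dot edges of the form "caller" -> "callee"; for feeding to dot.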
Here's how I wound up doing it (disclaimer: a lot of this is hack-ish, so you may want to clean up if you're going to use it long-term)...
Assumptions:
- Current directory contains all scripts/binaries in question.
- Files for building the graph go in subdir call_graph.
Created the script call_graph/make_tgf.sh:
#!/bin/bash
# Run from dir with scripts and subdir call_graph
# Parameters:
# $1 = sources (default is call_graph/sources.txt)
# $2 = targets (default is call_graph/targets.txt)
SOURCES=$1
if [ "$SOURCES" == "" ]; then SOURCES=call_graph/sources.txt; fi
TARGETS=$2
if [ "$TARGETS" == "" ]; then TARGETS=call_graph/targets.txt; fi
if [ ! -d call_graph ]; then echo "Run from parent dir of call_graph" >&2; exit 1; fi
(
# cat call_graph/targets.txt
for file in `cat $SOURCES `
do
for target in `grep -v -E '^ *#' $file | grep -o -F -w -f $TARGETS | grep -v -w $file | sort | uniq`
do echo $file $target
done
done
)
Then, I ran the following (I wound up doing the scripts-only version):
cat /dev/null | tee call_graph/sources.txt > call_graph/targets.txt
for file in *
do
if [ -d "$file" ]; then continue; fi
echo $file >> call_graph/targets.txt
if file $file | grep text >/dev/null; then echo $file >> call_graph/sources.txt; fi
done
# For scripts only:
bash call_graph/make_tgf.sh call_graph/sources.txt call_graph/sources.txt > call_graph/scripts.tgf
# For scripts + binaries (binaries will be leaf nodes):
bash call_graph/make_tgf.sh > call_graph/scripts_and_bin.tgf
I then opened the resulting tgf file in yEd, and had yEd do the layout (Layout -> Hierarchical). I saved as graphml to separate the manually-editable file from the automatically-generated one.
I found that there were certain nodes that were not helpful to have in the graph, such as utility scripts/binaries that were called all over the place. So, I removed these from the sources/targets files and regenerated as necessary until I liked the node set.
Hope this helps somebody...
Insert a line at the beginning of each shell script, after the #! line, which logs a timestamp, the full pathname of the script, and the argument list.
Over time, you can mine this log to identify likely candidates, i.e. two lines logged very close together have a high probability of the first script calling the second.
This also allows you to focus on the scripts which are still actually in use.
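For example, the inserted line might log epoch seconds plus the resolved script path, which makes the later mining step a simple sort and awk; the log path and the two-second window below are assumptions:
# Line to add to each script right after the #! line:
echo "$(date +%s) $(readlink -f "$0") $*" >> /tmp/script-usage.log

# Later, flag consecutive entries logged within two seconds of each other
# as likely caller -> callee pairs.
sort -n /tmp/script-usage.log |
awk 'NR > 1 && $1 - prev_t <= 2 { print prev_s " -> " $2 }
     { prev_t = $1; prev_s = $2 }'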
You could use an ed script
1a
log blah blah blah
.
wq
and run it like so:
find / -perm +x -exec ed {} \; <edscript
Make sure you test the find command with -print instead of the exec clause. And / is probably not the path that you want to use. If you have to include bin directories then you will probably need to switch to grep in order to identify the pathnames to include, then when you have a file full of the right names, use xargs instead of find to run the script.
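A rough sketch of that grep/xargs variant, with placeholder directories and assuming the same edscript file as above:
# Collect shell scripts by their shebang line, then run the ed script on each one;
# the per-file sh -c keeps the edscript redirection separate for every invocation.
# (File names containing spaces or quotes would need more careful handling.)
grep -rl '^#!.*sh' /usr/local/bin /opt/scripts > script-list.txt
xargs -I{} sh -c 'ed "{}" < edscript' < script-list.txt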