Execute rm with string on file and delete line - bash

I have a file (log.txt) with multiple lines:
Uploaded 1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM at 1.9 MB/s, total 3.9 MB
Uploaded 14v58hwKP457ZF32rwIaUFH216yrp9fAB at 317.3 KB/s, total 2.1 MB
Each line in log.txt represents a file that needs to be deleted.
I want to delete the file and then delete the respective line.
Example:
rm 1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM
and, after deleting the file listed in log.txt, remove its line, leaving only the others:
Uploaded 14v58hwKP457ZF32rwIaUFH216yrp9fAB at 317.3 KB/s, total 2.1 MB

Try this:
#!/bin/bash
logfile="log.txt"
logfilecopy=$( mktemp )
cp "$logfile" "$logfilecopy"
while IFS= read -r line
do
    filename=$( echo "$line" | sed 's/Uploaded \(.*\) at .*/\1/' )
    if [[ -f "$filename" ]]
    then
        tempfile=$( mktemp )
        # -Fx: treat the line as a literal string and require a whole-line match
        rm -f "$filename" && grep -vFx "$line" "$logfile" >"$tempfile" && mv "$tempfile" "$logfile"
    fi
done < "$logfilecopy"
# Cleanup
rm -f "$logfilecopy"
It does the following:
keep a copy of the original log file.
read each line of this copy using while and read.
for each line, extract the filename. This is done with sed, since a filename could contain spaces; cut would therefore not work as required (see the quick check below).
if the file exists, delete it, write the log file minus that line to a temporary file, then move the temporary file over the log file.
that last step chains the commands with && so that each one only runs if the previous one succeeded: if the rm fails, the log entry is not deleted.
finally, delete the copy of the original log file.
You can add echo statements and/or -x to the #!/bin/bash line to debug if required.
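To see what the sed expression extracts, you can run it on a sample line (a quick check with a made-up filename, not part of the script):
echo "Uploaded some file with spaces.bin at 1.2 MB/s, total 3 MB" | sed 's/Uploaded \(.*\) at .*/\1/'
# prints: some file with spaces.bin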

The following code reads log.txt line by line, captures the filename with a bash ERE and tries to delete that file. When the regex does not match or the deletion fails, it outputs the original line.
#!/bin/bash
tmpfile=$( mktemp ) || exit 1
while IFS='' read -r line
do
    [[ $line =~ ^Uploaded\ (.*)\ at ]] &&
        rm -- "${BASH_REMATCH[1]}" ||
        echo "$line"
done < log.txt > "$tmpfile" &&
mv "$tmpfile" log.txt
Remark: the while loop's final exit status is true unless there is a problem reading log.txt or writing "$tmpfile", so chaining the mv with && ensures that you won't overwrite the original log file by accident.
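For reference, you can check what the ERE in the loop captures against one of the sample lines:
line='Uploaded 1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM at 1.9 MB/s, total 3.9 MB'
[[ $line =~ ^Uploaded\ (.*)\ at ]] && echo "${BASH_REMATCH[1]}"
# prints: 1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM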

Another approach using bash4+ and GNU tools.
#!/usr/bin/env bash
##: Save the file names in an array named files using mapfile aka readarray,
##: using process substitution and GNU grep with the -P (PCRE) flag.
mapfile -t files < <(grep -Po '(?<=Uploaded ).*(?= at)' log.txt)
##: Loop through the files ("${files[@]}") and check whether each one exists (-e).
##: If it does, save it in an array named existing_file.
##: Add an additional test if need be, see "help test".
for f in "${files[@]}"; do
    [[ -e $f ]] && existing_file+=("$f")
done
##: Format the array existing_file into a syntax that is accepted
##: by GNU sed, e.g. "/file1|file2|file3|file4/d", and save it
##: in a variable named to_delete.
to_delete=$(IFS='|'; printf '%s' "/${existing_file[*]}/d")
##: Delete/remove the existing files.
##: Not sure if ARG_MAX will come up.
echo rm -v -- "${existing_file[@]}"
##: Remove the deleted files (the lines that contain the file names)
##: from log.txt using GNU sed.
echo sed -E -i "$to_delete" log.txt
Remove all the echos once you're satisfied with the output.
This is not exactly what you asked for and it is not perfect, but it just might be what you need.
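For illustration, with two surviving entries in existing_file (values taken from the sample log), the generated sed expression would expand like this:
existing_file=(1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM 14v58hwKP457ZF32rwIaUFH216yrp9fAB)
to_delete=$(IFS='|'; printf '%s' "/${existing_file[*]}/d")
echo "$to_delete"
# /1Y3JxCDpjsId_f8C7YAGAjvHHk-y-QVQM|14v58hwKP457ZF32rwIaUFH216yrp9fAB/d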

Related

Parse CSV to find names corresponding to code, then copying folders with matching code to folders with corresponding name

I'm trying to automate the packaging of files and contents from various sources using a bash script.
I have a main directory which contains pdf files, a csv file, and various folders with additional contents. The folders are named with the location code they pertain to, e.g. 190, 191, etc.
A typical row in my csv file looks like this: form_letters_Part1.pdf,PX_A31_smith.adam.pdf,190,
Where the first column is the original pdf name, the second is what it will be renamed to, and the third column is the location code the person belongs to.
The first part of my script renames the pdf files from the cover letters format to the PX_A31... format, and then creates a directory for each file and moves them into it.
#!/usr/bin/tcsh bash
sed 's/"//g' rename_list_lab.csv | while IFS=, read orig new num; do
mv "$orig" "$new"
done
echo 'Rename Done.'
for file in *.pdf; do
mkdir "${file%.*}"
mv "$file" "${file%.*}"
done
echo 'Directory creation done.'
What needs to happen next is the folders with the location-specific contents get copied into those new directories just created, corresponding to the location code from the csv file.
So I tried this after the above echo 'Directory Creation Done.' line:
echo 'Directory Creation Done.'
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
for folder in *; do
if [[ -d .* = "$num" ]]; then
cp -R "$folder" "${file%.*}"
fi
done
echo 'Code Folder Contents Sort Done.'
However this results in a syntax error:
syntax error in conditional expression
syntax error near `='
` if [[ -d .* = "$num" ]]; then'
EDIT: To clarify the second part if statement, the intended logic of the statement is as follows: For the items in the current directory, if it is a directory, and the name of the directory matches the location code from the csv, that directory should be copied to any directories which have that same corresponding location code in the csv.
In other words, if the newly created directory from the first part is PX_A31_smith.adam whose location code in the csv line above is 190, then the folder called 190 should be copied into the directory PX_A31_smith.adam.
If three other people also have the 190 code in the csv, the 190 directory should also be copied to those as well.
EDIT 2: I resolved the syntax error, and also realized I had an unterminated do statement. Having fixed those, I still seem to be having trouble with the evaluation of the if statement. Updated script below:
#!/usr/bin/tcsh bash
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
mv "$orig" "$new"
done
echo '1 Done.'
for file in *.pdf; do
mkdir "${file%.*}"
mv "$file" "${file%.*}"
done
echo '2 done.'
sed 's/"//g' rename_list.csv | while IFS=, read orig new num; do
for folder in * ; do
if [[ .* = "$num" ]]; then
cp -R "$folder" "${file%.*}"
else echo "No matches found."
fi
done
done
echo '3 Done.'
I'm not really sure if this answers your question, but I think it will at least set you on the right track. Structurally, I just combined all of the loops into one. This removes some possible logic errors that would not be flagged as syntax errors, such as the use of $file in the second part: that variable belongs to the loop in the first part, so by the time the second loop runs it no longer refers to the file you are processing and may simply expand to an empty string.
#!/usr/bin/bash
#^Fixed shebang line.
sed 's/"//g' rename_list.csv | while IFS=, read -r orig new num; do
    if [[ -f $orig ]]; then          # If the file we want to rename is indeed a file.
        mkdir "${new%.*}"            # Make the directory from the file name you want.
        mv "$orig" "${new%.*}/$new"  # Rename while moving the file into the new directory.
        if [[ -d $num ]]; then       # If the number directory exists.
            cp -R "$num" "${new%.*}" # Fixed this based on your edit.
        else
            # Here you can handle what to do if the number directory does not exist.
            echo "$num is not a directory."
        fi
    else
        # Here you can handle what to do if the file does not exist.
        echo "The file $orig does not exist."
    fi
done
Edited based on your clarification
Note: This is pretty lacking as far as error checking goes. Remember, any of these commands could fail, which would lead to unwanted behavior. You can check the exit status of the last issued command with if [[ $? != 0 ]] (0 being success), or do something like mkdir somedir || exit 2 to exit on failure.
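For example, a minimal sketch of that exit-on-failure pattern, reusing the variable names from the script above:
mkdir "${new%.*}" || { echo "Could not create ${new%.*}" >&2; exit 2; }
mv "$orig" "${new%.*}/$new" || { echo "Could not move $orig" >&2; exit 2; }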

Bash loop to read line by line and create folder by line index

I am trying to do the following in a bash file:
mybash.sh myinputfile.txt myloopfile.csv
Create a loop that reads myloopfile.csv line by line, and for each line creates a folder named with the line number prefixed by "folder", e.g. folder1, folder2...
and then inside this folder$i create a folder called input and another called output.
write the content of line $i into a file called myline.txt and put this file inside folder$i/input.
and copy the file myinputfile.txt that I pass as a parameter to the bash script into folder$i/input as well.
and then run my personal script that takes two arguments:
python myscript.py -i ./folder$i/input -o ./folder$i/output
and done!
myfile.csv
101,1001,10012,100121
102,101213,11122.1,12.15
103,122.15,155.2,1515.54
104,154.4,4551.1,454
what I currently have:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
mkdir ./folder$line
echo "$line" >> Folder$line/input/myline.txt;
cp myinputfile.txt Folder$line/input
python myscript.py -i ./folder$i/input -o ./folder$i/output
done < "$1"
The problem is that I can't get the line index to use as a suffix for the folder name, so I currently get the line content instead. I also don't know how to read the two files from the arguments that I pass to mybash.sh.
Use a counter to keep track of line numbers:
#!/bin/bash
input_file=$1
csv_file=$2
count=1
while IFS= read -r line || [[ -n "$line" ]]; do
    input_dir="./folder$count/input"
    output_dir="./folder$count/output"
    mkdir -p "$input_dir" "$output_dir"
    printf '%s\n' "$line" > "$input_dir/myline.txt"
    cp "$input_file" "$input_dir"
    python myscript.py -i "$input_dir" -o "$output_dir"
    ((count++))
done < "$csv_file"

what is the purpose of the command rsync -rvzh

I'm trying to understand what these two commands are doing:
config=$(date +%s)
rsync -rvzh $1 /var/lib/tomcat7/webapps/ROOT/DataMining/target > /var/lib/tomcat7/webapps/ROOT/DataMining/$config
These lines appear in a bigger script, script.sh, which looks like this:
#! /bin/bash
config=$(date +%s)
rsync -rvzh $1 /var/lib/tomcat7/webapps/ROOT/DataMining/target > /var/lib/tomcat7/webapps/ROOT/DataMining/$config
countC=0
countS=`wc -l /var/lib/tomcat7/webapps/ROOT/DataMining/$config | sed 's/^\([0-9]*\).*$/\1/'`
let countS--
let countS--
let countS--
while read LINEC #read line
do
    if [ "$countC" -gt 0 ]; then
        if [ "$countC" -lt "$countS" ]; then
            FILENAME="/var/lib/tomcat7/webapps/ROOT/DataMining/target/"$LINEC
            count=0
            countW=0
            while read LINE
            do
                for word in $LINE;
                do
                    echo "INSERT INTO data_mining.data (word, line, numWordLine, file) VALUES ('$word', '$count', '$countW', '$FILENAME');" >> /var/lib/tomcat7/webapps/ROOT/DataMining/query
                    mysql -u root -Alaba1515< /var/lib/tomcat7/webapps/ROOT/DataMining/query
                    echo > /var/lib/tomcat7/webapps/ROOT/DataMining/query
                    let countW++
                done
                countW=0
                let count++
            done < $FILENAME
            count=0
            rm -f /var/lib/tomcat7/webapps/ROOT/DataMining/query
            rm -f /var/lib/tomcat7/webapps/ROOT/DataMining/$config
        fi
    fi
    let countC++
done < /var/lib/tomcat7/webapps/ROOT/DataMining/$config #finish while
I was able to find lots of documentation about rsync and what it does, but I don't understand what the rest of the command does. Any help please?
The first command assigns the current time (in seconds since epoch) to the shell variable config. For example:
$ config=$(date +%s)
$ echo $config
1446506996
rsync is a file copying utility. The second command thus makes a backup copy of the directory passed as argument 1 (referred to as $1). The backup copy is placed in /var/lib/tomcat7/webapps/ROOT/DataMining/target. A log of what was copied is saved in /var/lib/tomcat7/webapps/ROOT/DataMining/$config:
rsync -rvzh $1 /var/lib/tomcat7/webapps/ROOT/DataMining/target > /var/lib/tomcat7/webapps/ROOT/DataMining/$config
The rsync options mean:
-r tells rsync to copy files, recursing into subdirectories.
-v tells it to be verbose so that it shows what is copied.
-z tells it to compress files during their transfer from one location to the other.
-h tells it to show any numbers in the output in human-readable format.
Note that because $1 is not inside double-quotes, this script will fail if the name of directory $1 contains whitespace.
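If the directory name may contain whitespace, quoting the variable avoids that; a minimal fix would be:
rsync -rvzh "$1" /var/lib/tomcat7/webapps/ROOT/DataMining/target > "/var/lib/tomcat7/webapps/ROOT/DataMining/$config"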

bash call script with variable

What I want to achieve is the following :
I want the subtitles for my TV Show downloaded automatically.
The script "getSubtitle.sh" is ran as soon as the show is downloaded, but it can happen that no subtitle are released yet.
So what I am doing to counter this :
Creating a file each time "getSubtitle.sh" is ran. It contain the location of the script with its arguments, for example :
/Users/theo/logSubtitle/getSubtitle.sh "The Walking Dead - 5x10 - Them.mp4" "The.Walking.Dead.S05E10.480p.HDTV.H264.mp4" "/Volumes/Window HD/Série/The Walking Dead"
If a subtitle has been found, this file will contain only this line, if no subtitle has been found, this file will have 2 lines (the first one being "no subtitle downloaded", and the second one being the path to the script as explained above)
Now, once I get this, I'm planning to run a cron everyday that will do the following :
Remove all file that have only 1 line (Subtitle found), and execute the script again for the remaining file. Here is the full script :
cd ~/logSubtitle/waiting/
for f in *
do nbligne=$(wc -l $f | cut -c 8)
if [ "$nbligne" = "1" ]
then
rm $f
else
command=$(sed -n "2 p" $f)
sh $command 3>&1 1>&2 2>&3 | grep down > $f ; echo $command >> $f
fi
done
This is unfortunately not working; I have the feeling that the script is not being called.
When I replace $command with the actual line from the text file, it works.
I am sure that $command matches the line because of the "echo $command >> $f" at the end of my script.
So I really don't get what I am missing here, any ideas?
Thanks.
I'm not sure what you're trying to achieve with the cut -c 8 part in wc -l $f | cut -c 8. cut -c 8 will select the 8th character of the output of wc -l.
A suggestion: to check whether your file contains one or two lines (and since you'll need the content of the second line, if any, anyway), use mapfile. This will slurp the file into an array, one line per element. You can use the option -n 2 to read at most 2 lines. This will be much more efficient, safer and nicer than your solution:
mapfile -t -n 2 ary < file
Then:
if ((${#ary[@]}==1)); then
    printf 'File contains one line only: %s\n' "${ary[0]}"
elif ((${#ary[@]}==2)); then
    printf 'File contains (at least) two lines:\n'
    printf '  %s\n' "${ary[@]}"
else
    printf >&2 'Error, no lines found in file\n'
fi
Another suggestion: use more quotes!
With this, a better way to write your script:
#!/bin/bash
dir=$HOME/logSubtitle/waiting/
shopt -s nullglob
for f in "$dir"/*; do
    mapfile -t -n 2 ary < "$f"
    if ((${#ary[@]}==1)); then
        rm -- "$f" || printf >&2 "Error, can't remove file %s\n" "$f"
    elif ((${#ary[@]}==2)); then
        { sh -c "${ary[1]}" 3>&1 1>&2 2>&3 | grep down; echo "${ary[1]}"; } > "$f"
    else
        printf >&2 'Error, file %s contains no lines\n' "$f"
    fi
done
After the done keyword you can even add the redirection 2>> logfile to a log file if you wish. Make sure the cron job is run with your user: check crontab -l and, if needed, edit it with crontab -e.
Use eval instead of sh. The reason it works with eval and not sh is the number of parsing passes the line goes through. With sh $command, the content of $command is only word-split, so the embedded double quotes are passed along as literal characters and the script receives mangled arguments; eval re-parses the string first, honoring the quoting, and then executes the result.
Briefly explained.
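A quick illustration of the difference, using an abbreviated, hypothetical command string like the one stored in the log file:
command='/Users/theo/logSubtitle/getSubtitle.sh "The Walking Dead - 5x10 - Them.mp4"'
sh $command      # word-splits on spaces; the quotes stay literal, so the script sees several mangled arguments
eval "$command"  # re-parses the string, so the script sees one argument: The Walking Dead - 5x10 - Them.mp4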

grep spacing error

Hi guys, I have a problem with grep. I don't know if there is another search command to use in a shell script.
I'm trying to back up a folder, AhmetsFiles, which is stored on my flash disk, and at the same time I have to group the files by their extensions and save them into an [extensionName] folder.
An example: /media/FlashDisk/AhmetsFiles/lecture.pdf must be stored in /home/$(whoami)/Desktop/backups/pdf
The problem is I can't copy a file whose name contains spaces (lecture 2.pptx).
After this introduction, here is my code.
filename="/media/FlashDisk/extensions"
count=0
exec 3<&0
exec 0< $filename
mkdir "/home/$(whoami)/Desktop/backups"
while read extension
do
cd "/home/$(whoami)/Desktop/backups"
rm -rf "$extension"
mkdir "$extension"
cd "/media/FlashDisk/AhmetsFiles"
files=( `ls | grep -i "$extension"` )
fCount=( `ls | grep -c -i "$extension"` )
for (( i=0 ; $i<$fCount ; i++ ))
do
cp -f "/media/FlashDisk/AhmetsFiles/${files[$i]}" "/home/$(whoami)/Desktop/backups/$extension"
done
let count++
done
exec 0<&3
exit 0
Your looping is way more complicated than it needs to be; there is no need for ls or grep, or for the files and fCount variables:
for file in *.$extension
do
    cp -f "/media/FlashDisk/AhmetsFiles/$file" "$HOME/Desktop/backups/$extension"
done
This works correctly with spaces.
I'm assuming that you actually wanted to interpret $extension as a file extension, not some random string in the middle of the filename like your original code does.
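A minimal sketch of how the whole loop could look with this change (assuming the extensions file lists one extension per line, and using nullglob so that extensions with no matching files are simply skipped):
shopt -s nullglob
while read -r extension
do
    dest="$HOME/Desktop/backups/$extension"
    mkdir -p "$dest"
    for file in /media/FlashDisk/AhmetsFiles/*."$extension"
    do
        cp -f "$file" "$dest"
    done
done < /media/FlashDisk/extensions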
Why don't you
grep -i "$extension" | while IFS=: read x ; do
cp ..
done
instead?
Also, I believe you may prefer something like grep -i ".$extension$" instead (anchor it to the end of line).
On the other hand, the simplest way is probably
cp -f /media/FlashDisk/AhmetsFiles/*.$extension "$HOME/Desktop/backups/$extension/"
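One caveat with that one-liner: if nothing matches *.$extension, the unexpanded pattern itself is passed to cp, which then fails with a "No such file or directory" error. A small guard (a sketch, assuming nullglob is not set):
files=( /media/FlashDisk/AhmetsFiles/*."$extension" )
if [[ -e ${files[0]} ]]; then
    cp -f "${files[@]}" "$HOME/Desktop/backups/$extension/"
fi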
