Simple bash script to read from one file having double quotes in content - bash

Well, I am really pissed off :(
I have a file called test.txt, and here are its contents:
"/var/lib/backup.log"
"/var/lib/backup2.log"
The double quotes are part of the file: each path has one at the beginning and one at the end, and I cannot remove them from the file.
I am trying to write a script that deletes the files listed in test.txt,
like this:
for del in `cat test.txt` ; do
rm -f $del
done
but it does not work as expected :(
it gives this error:
rm: cannot access "/var/lib/backup.log": No such file or directory
rm: cannot access "/var/lib/backup2.log": No such file or directory

The loop below removes only the quote character at the beginning and at the end of each line it reads, which is better than blindly removing all quote characters (since they can appear in filenames, of course).
And, regarding your initial code: PLEASE ALWAYS QUOTE your variable expansions until you really know when you can leave the quotes off.
while read -r; do
fname=${REPLY#\"}
fname=${fname%\"}
echo rm -f "$fname"
done < test.txt
The echo makes this a dry run; remove it once the printed commands look right.
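To see why stripping only the first and last quote is safer than deleting every quote, here is a minimal sketch. The function name strip_outer_quotes and the sample strings are made up for illustration, and nothing here touches the file system.

```bash
# Hypothetical helper: remove one leading and one trailing double quote, if present.
strip_outer_quotes() {
  local s=$1
  s=${s#\"}   # drop a leading "
  s=${s%\"}   # drop a trailing "
  printf '%s\n' "$s"
}

strip_outer_quotes '"/var/lib/backup.log"'      # prints /var/lib/backup.log
strip_outer_quotes '"/tmp/my "odd" name.log"'   # prints /tmp/my "odd" name.log
```

The quotes inside the second name survive, which would not be the case with a blanket delete of every " character.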

The following one-liner should do it, as long as the paths contain no whitespace:
rm $(tr -d '"' < test.txt)
Here, tr -d deletes every " from its input, which is read from the file named test.txt. rm is then given the resulting paths.
The following Perl one-liner does the same:
perl -nle 's{"}{}g;unlink' test.txt
It removes every " from each line read from test.txt; unlink then deletes the file named by what is left.
Or,
sed 's! !\\ !g' < test.txt | sed 's/"//g' | xargs rm
This escapes spaces, removes every ", and passes the resulting names to rm.

It's easy to rustle up a quick Perl script
#!/usr/bin/perl
while (<STDIN>) {
chomp;
s/"//g;
unlink $_;
}
and run it thus:
./script.pl < test.txt
Although you've specified bash in the above, I'm not sure if you really want a bash-only solution.
Note that this will handle whitespace in file names, etc.

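I guess the eval command will do the job for you, provided you trust the contents of test.txt (eval runs whatever the lines contain as shell code):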
for del in `cat test.txt` ; do
eval rm -f $del
done

Related

Remove middle of filenames

I have a list of filenames like this in bash
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
And I want them to look like this
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz
I do not have the perl rename command, and sed 's/_Other*160418./_/' *.gz is not doing anything. I've tried other rename scripts on here, but either nothing happens or my shell starts printing huge amounts of code to the console and freezes.
This post (Removing Middle of Filename) is similar; however, the answers given do not explain what the specific parts of the command do, so I could not apply them to my problem.
Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:
for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done
Remove the echo before mv to perform actual renaming.
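The substitution can be tried on a plain string before any file is renamed. The sketch below uses one of the names from the question; the variable names new and greedy are only for illustration. It also shows a pitfall: bash replaces the longest match of the pattern, so a pattern that is too loose swallows more than intended.

```bash
f='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# ${var/pattern/replacement} replaces the first, longest match of the glob pattern.
new=${f/_Other_*-TTAGGA_R_160418./_}
printf '%s\n' "$new"      # prints UTSHoS10_R1.fq.gz

# Too loose: "*" runs up to the LAST dot, so most of the name is lost.
greedy=${f/_Other_*./_}
printf '%s\n' "$greedy"   # prints UTSHoS10_gz
```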
You can do something like this in the directory which contains the files to be renamed:
for file_name in *.gz
do
new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name");
mv "$file_name" "$new_file_name";
done
The pattern (_[^.]*\.) starts matching from the FIRST _ till the FIRST . (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.
Example:
AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done
AMD$ ls
UTSHoS10_R1.fq.gz UTSHoS10_R2.fq.gz UTSHoS11_R2.fq.gz UTSHoS12_R1.fq.gz UTSHoS12_R2.fq.gz
Pure Bash, using substring operation and assuming that all file names have the same length:
for file in UTS*.gz; do
echo mv -i "$file" "${file:0:9}${file:38:8}"
done
Outputs:
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz
Once verified, remove echo from the line inside the loop and run again.
Going with your sed command, this can work as a bash one-liner:
for name in UTSH*fq.gz; do newname=$(echo "$name" | sed 's/_Other.*160418\./_/'); echo mv "$name" "$newname"; done
Notes:
I've adjusted your sed command: it had an * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the dot needs escaping.
To see if it works without actually renaming the files, I've left the echo command in. Just remove it to make the command functional.
It doesn't have to be a one-liner, obviously. But sometimes, that makes editing and browsing your command-line history easier.
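The difference between the glob-style pattern and the regular expression can be checked on a single name without touching any files. This is a small sketch; the variable names unchanged and fixed are made up for the demonstration.

```bash
name='UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'

# As a regex, "r*" means "zero or more r", so this pattern never matches the name.
unchanged=$(printf '%s\n' "$name" | sed 's/_Other*160418./_/')

# ".*" means any run of characters; "\." is a literal dot.
fixed=$(printf '%s\n' "$name" | sed 's/_Other.*160418\./_/')

printf '%s\n%s\n' "$unchanged" "$fixed"
```

The first line printed is the original name, the second is UTSHoS10_R1.fq.gz.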

How could I append '\' in front of the space within a file name?

I am working on a program that transfers files using sftp:
sftp -oBatchMode=no -b ${BATCH_FILE} user@$123.123.123.123:/home << EOF
bye
EOF
One of my requirements is that I must use a BATCH_FILE with sftp. The batch file is generated by the following script:
files=$(ls -1 ${SRC_PATH}/*.txt)
echo "$files" > ${TEMP_FILE}
while read file
do
if [ -s "${file}" ]
then
echo ${file} >> "${PARSE_FILE}" ## line 1
fi
done < ${TEMP_FILE}
awk '$0="put "$0' ${PARSE_FILE} > ${BATCH_FILE}
Somehow my program is not able to handle files with spaces in their names. I tried replacing line 1 with the following code, but it failed; the output shows filename\.txt.
newfile=`echo $file | tr ' ' '\\ '`
echo ${newfile} >> "${PARSE_FILE}"
To handle file names with spaces, how can I put a \ in front of each space in a file name?
THE PROBLEM
The problem is that tr SET1 SET2 will replace the Nth character in SET1 with the Nth character in SET2, which means that you are effectively replacing every space by \, instead of adding a backslash before every space.
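A quick way to see the one-to-one mapping is to run both tools on the same string. This is only an illustrative sketch; the sample name and the variables with_tr and with_sed are assumptions, not part of the original script.

```bash
name='file with spaces.txt'

# tr maps characters one-to-one, so each space BECOMES a backslash.
with_tr=$(printf '%s\n' "$name" | tr ' ' '\\')

# sed can substitute a two-character string (backslash + space) for each space.
with_sed=$(printf '%s\n' "$name" | sed 's/ /\\ /g')

printf '%s\n%s\n' "$with_tr" "$with_sed"
```

The first line printed is file\with\spaces.txt and the second is file\ with\ spaces.txt.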
PROPOSED SOLUTION
Instead of manually escaping the spaces, wrap any variable that might contain spaces in double quotes wherever you use it, and let the shell handle the trouble for you.
See the example below:
$ echo $FILENAME
file with spaces.txt
$ ls $FILENAME
ls: cannot access file: No such file or directory
ls: cannot access with: No such file or directory
ls: cannot access spaces.txt: No such file or directory
$ ls "$FILENAME"
file with spaces.txt
But I really wanna replace stuff..
Well, if you really want a command to change every ' ' (space) into '\ ' (backslash, space), you could use sed with a basic replace pattern, as below:
$ echo "file with spaces.txt" | sed 's, ,\\ ,g'
file\ with\ spaces.txt
I haven't looked too closely at what you're trying to do there, but I do know that bash can handle filenames with spaces in them if you double-quote them. Why not try quoting every filename variable and see if that works? You're quoting some of them but not all yet.
Like try these: "${newfile}" or just "$newfile" "$file" "$tempfile" etc...
You can further simplify your code if you're using Bash:
function generate_batch_file {
for FILE in "${SRC_PATH}"/*.txt; do
[[ -s $FILE ]] && echo "put ${FILE// /\\ }"
done
}
sftp -oBatchMode=no -b <(generate_batch_file) user@$123.123.123.123:/home <<< "bye"
You can also try renaming the file to something that works (for example, a name without spaces) and renaming it back after the transfer is done.

bash removing part of a file name

I have the following files in the following format:
$ ls CombinedReports_LLL-*'('*.csv
CombinedReports_LLL-20140211144020(Untitled_1).csv
CombinedReports_LLL-20140211144020(Untitled_11).csv
CombinedReports_LLL-20140211144020(Untitled_110).csv
CombinedReports_LLL-20140211144020(Untitled_111).csv
CombinedReports_LLL-20140211144020(Untitled_12).csv
CombinedReports_LLL-20140211144020(Untitled_13).csv
CombinedReports_LLL-20140211144020(Untitled_14).csv
CombinedReports_LLL-20140211144020(Untitled_15).csv
CombinedReports_LLL-20140211144020(Untitled_16).csv
CombinedReports_LLL-20140211144020(Untitled_17).csv
CombinedReports_LLL-20140211144020(Untitled_18).csv
CombinedReports_LLL-20140211144020(Untitled_19).csv
I would like this part removed:
20140211144020 (this is the timestamp the reports were run so this will vary)
and end up with something like:
CombinedReports_LLL-(Untitled_1).csv
CombinedReports_LLL-(Untitled_11).csv
CombinedReports_LLL-(Untitled_110).csv
CombinedReports_LLL-(Untitled_111).csv
CombinedReports_LLL-(Untitled_12).csv
CombinedReports_LLL-(Untitled_13).csv
CombinedReports_LLL-(Untitled_14).csv
CombinedReports_LLL-(Untitled_15).csv
CombinedReports_LLL-(Untitled_16).csv
CombinedReports_LLL-(Untitled_17).csv
CombinedReports_LLL-(Untitled_18).csv
CombinedReports_LLL-(Untitled_19).csv
I was thinking simply along the lines of the mv command, maybe something like this:
$ ls CombinedReports_LLL-*'('*.csv
but maybe a sed command or other would be better
rename is part of the perl package. It renames files according to perl-style regular expressions. To remove the dates from your file names:
rename 's/[0-9]{14}//' CombinedReports_LLL-*.csv
If rename is not available, sed+shell can be used:
for fname in Combined*.csv ; do mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" ; done
The above loops over each of your files. For each file, it performs a mv command: mv "$fname" "$(echo "$fname" | sed -r 's/[0-9]{14}//')" where, in this case, sed is able to use the same regular expression as the rename command above. s/[0-9]{14}// tells sed to look for 14 digits in a row and replace them with an empty string.
Without using any other tools like rename or sed, sticking strictly to bash alone:
for f in CombinedReports_LLL-*.csv
do
newName=${f/LLL-*\(/LLL-(}
mv -i "$f" "$newName"
done
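Before running a rename loop like this on real reports, it can be tried in a scratch directory. The sketch below assumes mktemp is available and invents two file names in the question's format; the variable tmp is only for the demonstration.

```bash
tmp=$(mktemp -d)   # scratch directory so no real files are touched
cd "$tmp" || exit 1
touch 'CombinedReports_LLL-20140211144020(Untitled_1).csv' \
      'CombinedReports_LLL-20140211144020(Untitled_11).csv'

for f in CombinedReports_LLL-*.csv; do
  newName=${f/LLL-*\(/LLL-(}   # drop everything between "LLL-" and "("
  mv -- "$f" "$newName"
done

ls   # lists CombinedReports_LLL-(Untitled_1).csv and CombinedReports_LLL-(Untitled_11).csv
```

Because the pattern matches whatever sits between LLL- and the opening parenthesis, it does not depend on the timestamp having a fixed length.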
for f in CombinedReports_LLL-* ; do
b=${f:0:20}${f:34:500}
mv "$f" "$b"
done
You can try it line by line in the shell:
f="CombinedReports_LLL-20140211144020(Untitled_11).csv"
b=${f:0:20}${f:34:500}
echo "$b"
You can use the rename utility for this. It uses syntax much like sed to change filenames. The following example (from the rename man-page) shows how to remove the trailing '.bak' extension from a list of backup files in the local directory:
rename 's/\.bak$//' *.bak
I'm using the advice given in the top response and have put the following line into a shell script:
ls *.nii | xargs rename 's/[f_]{2}//' f_0*.nii
In terminal, this line works perfectly, but in my script it will not execute and reads * as a literal part of the file name.

Trying to write a script to clean <script.aa=([].slice+'hjkbghkj') from multiple htm files, recursively

I am trying to modify a bash script to remove a glob of malicious code from a large number of files.
The community will benefit from this, so here it is:
#!/bin/bash
grep -r -l 'var createDocumentFragm' /home/user/Desktop/infected_site/* > /home/user/Desktop/filelist.txt
for i in $(cat /home/user/Desktop/filelist.txt)
do
cp -f $i $i.bak
done
for i in $(cat /home/user/Desktop/filelist.txt)
do
$i | sed 's/createDocumentFragm.*//g' > $i.awk
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
This is where the script bombs out with this message:
+ for i in '$(cat /home/user/Desktop/filelist.txt)'
+ sed 's/createDocumentFragm.*//g'
+ /home/user/Desktop/infected_site/index.htm
I get 2 errors and the script stops.
/home/user/Desktop/infected_site/index.htm: line 1: syntax error near unexpected token `<'
/home/user/Desktop/infected_site/index.htm: line 1: `<html><head><script>(function (){ '
I have the first 2 parts done.
The files containing createDocumentfragm have been enumerated in a text file correctly.
The files listed in filelist.txt have been duplicated in their original location with .bak appended, i.e. infected_site/some_directory/infected_file.htm and infected_file.htm.bak,
effectively making sure we have a backup.
All I need to do now is write an awk command that will use the list of files in filelist.txt, use the entire glob of malicious text as a pattern, and remove it from the files. Using just the uppercase script tag as the starting point and the lowercase script tag as the end is too generic and could delete legitimate text.
I suspect this may help me, but I don't know how to use it correctly.
http://backreference.org/2010/03/13/safely-escape-variables-in-awk/
Once I have this part figured out, and after you have verified that the files weren't mangled you can do this to clean out the bak files:
for i in $(cat /home/user/Desktop/filelist.txt)
do
rm -f $i.bak
done
Several things:
You have:
$i | sed 's/var createDocumentFragm.*//g' > $i.awk
You probably meant this (keeping your use of cat, which we'll talk about in a moment):
cat $i | sed 's/var createDocumentFragm.*//g' > $i.awk
You're treating each file in your file list as if it were a command and not a file.
Now, about your use of cat. If you're using cat for almost anything but concatenating multiple files together, you are probably doing something not quite right. For example, you could have done this:
sed 's/var createDocumentFragm.*//g' "$i" > "$i.awk"
Note that I don't have to print out my file to STDOUT, then pipe that into sed. The sed command can take the file name directly.
I'm also a bit confused about the awk statement. Exactly what file are you using awk on? Your awk statement has no input file, so it reads STDIN and prints its output to STDOUT (the screen). Is the sed statement supposed to feed into the awk statement?
You also want to avoid for loops over a list of files. That is very inefficient, and can cause problems with the command line getting overloaded. Not a big issue today, but can affect you when you least suspect it. What happens is that your $(cat /home/user/Desktop/filelist.txt) must execute first before the for loop can even start.
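Besides efficiency, a for loop over $(cat ...) also splits every line on whitespace, which is often the more visible problem. The sketch below counts how many items each kind of loop sees; the temporary list and the two paths in it are invented stand-ins for filelist.txt.

```bash
list=$(mktemp)   # stand-in for filelist.txt
printf '%s\n' '/tmp/site/index.htm' '/tmp/site/about us.htm' > "$list"

# Word splitting breaks the second path into two items.
for_count=0
for i in $(cat "$list"); do for_count=$((for_count + 1)); done

# read consumes one whole line at a time; IFS= and -r keep it verbatim.
read_count=0
while IFS= read -r file; do read_count=$((read_count + 1)); done < "$list"

printf 'for: %s items, while read: %s items\n' "$for_count" "$read_count"
rm -f "$list"
```

With the two sample lines, the for loop sees 3 items and the while loop sees 2.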
A little rewriting of your program:
cd ~/Desktop
grep -r -l 'var createDocumentFragm' infected_site/* > filelist.txt
while read file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" > "$file.awk"
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
done < filelist.txt
We can use one loop, and we made it a while loop. I could even feed the grep into that while loop:
grep -r -l 'var createDocumentFragm' infected_site/* | while read file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" > "$file.awk"
awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p'
done
and then I don't even have to create a temporary file.
Let me know what's going on with the awk. I suspect you wanted something like this:
grep -r -l 'var createDocumentFragm' infected_site/* | while read file
do
cp -f "$file" "$file.bak"
sed 's/var createDocumentFragm.*//g' "$file" \
| awk '/<\/SCRIPT>/{p=1;print}/<\/script>/{p=0}!p' > "$file.awk"
done
Also note I put quotes around file names. This helps prevent problems if file name has a space in it.

sed command to fix filenames in a directory

I ran a script which generated about 10k files in a directory. I just discovered that there is a bug in the script which causes some filenames to have a carriage return (presumably a '\n' character).
I want to run a sed command to remove the carriage return from the filenames.
Does anyone know which parameters to pass to sed to clean up the filenames in the manner described?
I am running Linux (Ubuntu)
I don't know how sed would do this, but this Python script should do the trick. It isn't sed, but I find Python a lot easier to use for things like this:
#!/usr/bin/env python
import os
files = os.listdir('.')
for file in files:
    # Strip carriage returns and newlines from the name.
    new_name = file.replace('\r', '').replace('\n', '')
    os.rename(file, new_name)
    print('Processed ' + new_name)
It strips any occurrences of both \r and \n from all of the filenames in a given directory.
To run it, save it somewhere, cd into your target directory (with the files to be processed), and run python /path/to/the/file.py.
Also, if you plan on doing more batch renaming, consider Métamorphose. It's a really nice and powerful GUI for this stuff. And, it's free!
Good luck!
Actually, try this: cd into the directory, type in python, and then just paste this in:
exec("import os\nfor file in os.listdir('.'):\n os.rename(file, file.replace('\\r', '').replace('\\n', ''))\n print('Processed ' + file.replace('\\r', '').replace('\\n', ''))")
It's a one-line version of the previous script, and you don't have to save it.
Version 2, with space replacement powers:
#!/usr/bin/env python
import os
for file in os.listdir('.'):
    # Strip carriage returns and newlines, and turn spaces into underscores.
    new_name = file.replace('\r', '').replace('\n', '').replace(' ', '_')
    os.rename(file, new_name)
    print('Processed ' + new_name)
And here's the one-liner:
exec("import os\nfor file in os.listdir('.'):\n os.rename(file, file.replace('\\r', '').replace('\\n', '').replace(' ', '_'))\n print('Processed ' + file.replace('\\r', '').replace('\\n', ''))")
If there are no spaces in your filenames, you can do:
for f in *$'\n'; do mv "$f" $f; done
It won't work if the newlines are embedded, but it will work for trailing newlines.
If you must use sed:
for f in *$'\n'; do mv "$f" "$(echo "$f" | sed '/^$/d')"; done
Using the rename Perl script:
rename 's/\n//g' *$'\n'
or the util-linux-ng utility:
rename $'\n' '' *$'\n'
If the character is a return instead of a newline, change the \n or ^$ to \r in any places they appear above.
The reason you aren't getting any pure-sed answers is that fundamentally sed edits file contents, not file names; thus the answers that use sed all do something like echo the filename into a pipe (pseudo file), edit that with sed, then use mv to turn that back into a filename.
Since sed is out, here's a pure-bash version to add to the Perl, Python, etc scripts you have so far:
killpattern=$'[\r\n]' # remove both carriage returns and linefeeds
for f in *; do
if [[ "$f" == *$killpattern* ]]; then
mv "$f" "${f//$killpattern/}"
fi
done
...but since ${var//pattern/replacement} isn't available in plain sh (along with [[...]]), here's a version using sh-only syntax, and tr to do the character replacement:
for f in *; do
new="$(printf %s "$f" | tr -d "\r\n")"
if [ "$f" != "$new" ]; then
mv "$f" "$new"
fi
done
EDIT: If you really want it with sed, take a look at this:
http://www.linuxquestions.org/questions/programming-9/merge-lines-in-a-file-using-sed-191121/
Something along these lines should work similar to the perl below:
for i in *; do echo mv "$i" `echo "$i"|sed ':a;N;s/\n//;ta'`; done
With perl, try something along these lines:
for i in *; do mv "$i" `echo "$i"|perl -pe 's/\n//g'`; done
This will rename all files in the current folder by removing all newline characters from them. If you need to go recursive, you can use find instead - be aware of the escaping in that case, though.
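For the recursive case, one possible sketch in bash is shown below. It assumes a find that supports -print0 (GNU and BSD find do) and a directory argument that contains a slash or is a path like "."; the function name strip_newlines_under and the scratch tree are invented for the demonstration.

```bash
# Hypothetical helper: remove newline characters from every name under directory $1.
strip_newlines_under() {
  local nl=$'\n' path dir base
  # -depth lists a directory's contents before the directory itself,
  # so renaming a parent never invalidates a path still waiting in the pipe.
  find "$1" -depth -name "*${nl}*" -print0 |
  while IFS= read -r -d '' path; do
    dir=${path%/*}      # everything before the last slash
    base=${path##*/}    # the final path component
    mv -- "$path" "$dir/${base//$nl/}"
  done
}

# Try it on a scratch tree first.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/report"$'\n'".txt"
strip_newlines_under "$demo"
ls "$demo/sub"   # lists report.txt
```

Passing the names NUL-terminated sidesteps the escaping problem, because a newline inside a name can no longer be mistaken for a separator.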
In fact there is a way to use sed:
carr=$'\n' # the newline character
for i in * # every file in the current dir
do
if [[ $i == *"$carr"* ]] # filenames containing a newline
then
mv "$i" "$(printf '%s' "$i" | sed ':a;N;$!ba;s/\n//g')" # move! (GNU sed)
fi
done
This actually works.

Resources