Batch editing files 'stuck at weird place' - bash

I'm trying to learn how to batch edit files and extract information from them. I've begun with trying to create some trial files and editing their names. I tried to search but couldn't find the problem I'm in anywhere.
If it's already answered, I'd be happy to be directed to that link.
So, I wrote the following code:
#!/bin/bash
mkdir -p ./trialscript
echo $1
i=1
while [ $i -le $1 ]
do
touch ./trialscript/testfile$i.dat
i=$(($i+1))
done
for f in ./trialscript/*.dat
do
echo $f
mv "$f" "$fhello.dat"
done
This doesn't seem to work, and I think it's because the echo output is like:
4
./trialscript/testfile1.dat
./trialscript/testfile2.dat
./trialscript/testfile3.dat
./trialscript/testfile4.dat
I just need the filename in the 'f' and not the complete path and then just rename it.
Can someone suggest what is wrong in my code, and what's the correct way to do what I'm doing?

If you want to move the file, you have to use the path, too, otherwise mv wouldn't be able to find it.
The target specification for the mv command is more problematic, though. You're using
"$fhello.dat"
which, in fact, means "content of the $fhello variable plus the string .dat". How should the poor shell know where the seam is? Use
"${f}hello.dat"
to disambiguate.
Also, to extract parts of strings, see Parameter expansion in man bash. You can use ${f%/*} to only get the path, or ${f##*/} to only get the filename.
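As a minimal sketch of the expansions mentioned above, here they are applied to one of the paths from the question. The variable names dir, name and stem are only illustrative, and inserting "hello" before the extension is an assumption about the intended result:

```bash
#!/bin/bash
f=./trialscript/testfile1.dat   # sample value, as printed by the loop in the question

dir=${f%/*}        # remove the shortest suffix matching "/*"  -> directory part
name=${f##*/}      # remove the longest prefix matching "*/"   -> file name only
stem=${name%.dat}  # remove the extension from the file name

echo "$dir"
echo "$name"
# Dry run: prints the mv command instead of executing it.
echo mv "$f" "$dir/${stem}hello.dat"
```

Remove the echo in front of mv once the printed command looks right.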

Related

Use inotifywait to change filename and further loop through sql loader

Objective: The moment multiple .csv files are uploaded to the folder, the code should check each filename; if the filename is appropriate, the file should then be used by sqlloader to upload its data to the database. Once a file is uploaded, the code should delete the processed file. The next time, the same process repeats.
I have some parts of the code working but some are creating problem, especially related to inotifywait. Please help.
In first loop, I am trying to monitor the /uploads folder, the moment it finds the .csv file, it checks if the filename has space. If yes, it wants to change the space to underscore in the filename. I have been trying to find a way to find "space, () or ," in the filename but only could do the 'space' part change. This is giving me an error that file cannot be moved, no such file or directory.
Second loop works separately but not when incorporated with first loop as there are errors which I have not been able to debug. If I run second loop separately, it is working correctly. But if there is a way to optimize the code better in one loop, I would be happy to know. Thanks!
Example: folder name: /../../upload
filenames: abc_123.csv (code should not make any change) , pqr(12 Apr).csv (code should change it to pqr_12_Apr.csv), May 12.csv (code should change it to May_12.csv) etc.
Once these 3 files have proper naming, it should be ready to be uploaded through sql loader and once files are processed, they get deleted.
My code is:
#!bin/bash
inotifywait -mqe create /../../upload | while read file; do
if [[ $file = '* *'.csv]]; then
mv "$file" ${file// /_}
fi
done
for file in /../..upload/*.csv
do
sqlcommand="sqlldr user/pwd control="/../xxx.ctl" data=$file silent=feedback, header"
$sqlcommand
rm $file
done
Thank you!
I have modified your script to this,
#!/usr/bin/env bash
while IFS= read -r file; do
filename=${file#* CREATE }
pathname=${file%/*}
if [[ $pathname/$filename = *\ *.csv ]]; then
echo mv -v "$pathname/$filename" "$pathname/${filename// /_}"
fi
done < <(inotifywait -mqe create /../../upload)
Remove the echo if you think the output is correct.
I just don't know how you can integrate the other parts of your script with that, probably create a separate script or remove the -m (which you don't want to do most probably). Well you could use a named pipe if mkfifo is available.
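EDIT: as per the OP's message, add another parameter expansion to strip the parentheses as well. The bracket pattern matches each parenthesis on its own, whereas \(\) would only match the two characters next to each other.
Add the code below right after the if [[ ... ]]; then
newfilename=${filename//[()]/}
Then change "${filename// /_}" to "${newfilename// /_}"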
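A possible way to get exactly the names listed in the question (pqr(12 Apr).csv becoming pqr_12_Apr.csv) is to chain several substitutions. This is only a sketch: sanitize_name is a hypothetical helper, and treating "(" like a space while dropping ")" and "," is an assumption read off the examples in the question.

```bash
#!/bin/bash
# Hypothetical helper: print a cleaned-up version of the file name in $1.
sanitize_name() {
    local name=$1
    name=${name// /_}    # every space becomes an underscore
    name=${name//\(/_}   # every "(" becomes an underscore
    name=${name//\)/}    # every ")" is dropped
    name=${name//,/}     # every "," is dropped
    printf '%s\n' "$name"
}

# Sample names taken from the question.
for sample in "abc_123.csv" "pqr(12 Apr).csv" "May 12.csv"; do
    printf '%s -> %s\n' "$sample" "$(sanitize_name "$sample")"
done
```

Inside the read loop, the result could be used as the mv target in place of "${filename// /_}".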

Using brace expansion to move files on the command line

I have a question concerning why this doesn't work. Probably, it's a simple answer, but I just can't seem to figure it out.
I want to move a couple of files I have. They all have the same filename (let's say file1) but they are all in different directories (lets say /tmp/dir1,dir2 and dir3). If I were to move these individually I could do something along the lines of:
mv /tmp/dir1/file1 /tmp
That works. However, I have multiple directories and they're all going to end up in the same spot....AND I don't want to overwrite. So, I tried something like this:
mv /tmp/{dir1,dir2,dir3}/file1 /tmp/file1.{a,b,c}
When I try this I get:
/tmp/file1.c is not a directory
Just to clarify...this also works:
mv /tmp/dir1/file1 /tmp/file1.c
Pretty sure this has to do with brace expansion but not certain why.
Thanks
Just do echo to understand how the shell expands:
$ echo mv /tmp/{dir1,dir2,dir3}/file1 /tmp/file1.{a,b,c}
mv /tmp/dir1/file1 /tmp/dir2/file1 /tmp/dir3/file1 /tmp/file1.a /tmp/file1.b /tmp/file1.c
Now you can see that your command is not what you want, because in a mv command, the destination (directory or file) is the last argument.
That's unfortunately not how the shell expansion works.
You'll have to probably use an associative array.
#!/bin/bash
declare -A MAP=( [dir1]=a [dir2]=b [dir3]=c )
for ext in "${!MAP[@]}"; do
echo mv "/tmp/$ext/file1" "/tmp/file1.${MAP[$ext]}"
done
You get the following output when you run it:
mv /tmp/dir2/file1 /tmp/file1.b
mv /tmp/dir3/file1 /tmp/file1.c
mv /tmp/dir1/file1 /tmp/file1.a
Like with many other languages key ordering is not guaranteed.
${!MAP[@]} returns an array of all the keys, while ${MAP[@]} returns an array of all the values.
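If the order of the moves matters, two ordinary indexed arrays walked by index keep it predictable. This is a sketch of an alternative, not part of the answer above; plan_moves, dirs and suffixes are illustrative names.

```bash
#!/bin/bash
dirs=(dir1 dir2 dir3)
suffixes=(a b c)

# Print one mv command per directory, pairing dirs[i] with suffixes[i].
plan_moves() {
    local i
    for i in "${!dirs[@]}"; do      # indices of an indexed array come out in order
        echo mv "/tmp/${dirs[i]}/file1" "/tmp/file1.${suffixes[i]}"
    done
}

plan_moves
```

As with the associative array version, this only prints the commands; drop the echo to actually move the files.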
Your syntax of /tmp/{dir1,dir2,dir3}/file1 expands to /tmp/dir1/file1 /tmp/dir2/file1 /tmp/dir3/file1. This is similar to the way the * expansion works. The shell does not execute your command with each possible combination, it simply executes the command but expands your one value to as many as are required.
Perhaps instead of a/b/c you could differentiate them with the actual number of the dir they came from?
$: for d in 1 2 3
do echo mv /tmp/dir$d/file1 /tmp/file1.$d
done
mv /tmp/dir1/file1 /tmp/file1.1
mv /tmp/dir2/file1 /tmp/file1.2
mv /tmp/dir3/file1 /tmp/file1.3
When happy with it, take out the echo.
A relevant point - brace expansion is not a wildcard. It has nothing to do with what's on disk. It just creates strings.
So, if you create a bunch of files named with single letters or digits, echo ? will wildcard and list them all, but only the ones actually present. If there are files for vowels but not consonants, only the vowels will show. But -
if you say echo {foo,bar,nope} it will output foo bar nope regardless of whether or not any or all of those exist as files or directories, etc.
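The difference can be checked in a scratch directory. The sketch below assumes bash with default globbing options; the directory comes from mktemp and the file names a and e are arbitrary.

```bash
#!/bin/bash
demo_dir=$(mktemp -d)               # empty scratch directory
touch "$demo_dir/a" "$demo_dir/e"   # only these two files exist

globbed=$(cd "$demo_dir" && echo ?)        # wildcard: matches files that exist
braced=$(cd "$demo_dir" && echo {a,b,e})   # brace expansion: just builds strings

echo "glob:  $globbed"
echo "brace: $braced"

rm -r "$demo_dir"
```

The glob line lists only a and e, while the brace line also contains b even though no such file was created.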

Wildcard on mv folder destination

I'm writing a small piece of code that checks for .mov files in a specific folder over 4gb and writes it to a log.txt file by name (without an extension). I'm then reading the names into a while loop line by line which signals some archiving and copying commands.
Consider a file named abcdefg.mov (new) and a corresponding folder somewhere else named abcdefg_20180525 (<-*underscore timestamp) that also contains a file named abcdefg.mov (old).
When reading in the filename from the log.txt, I strip the extension to store the variable "abcdefg" ($in1) and I'm using that variable to locate a folder elsewhere that contains that matching string at the beginning.
My problem is with how the mv command seems to support a wild card in the "source" string, but not in the "destination" string.
For example i can write;
mv -f /Volumes/Myshare/SourceVideo/$in1*/$in1.mov /Volumes/Myshare/Archive
However a wildcard on the destination doesn't work in the same way. For example;
mv -f /Volumes/Myshare/Processed/$in1.mov Volumes/Myshare/SourceVideo/$in1*/$in1.mov
Is there an easy fix here that doesn't involve using another method?
Cheers for any help.
mv accepts a single destination path. Suppose that $in1 is abcdefg, and that $in1* expands to abcdefg_20180525 and abcdefg_20180526. Then the command
mv -f /dir1/$in1.mov /dir2/$in1*/$in1.mov
will, if both directories already contain abcdefg.mov, be equivalent to the single command:
mv -f /dir1/abcdefg.mov /dir2/abcdefg_20180525/abcdefg.mov /dir2/abcdefg_20180526/abcdefg.mov
mv treats only the last argument as the destination and everything before it as a source. With more than one source the destination has to be a directory, so here mv stops with a "not a directory" error instead of putting the new file where you wanted it.
You should create a precise list and do a precise copy instead of using wild cards.
This is what I would probably do: generate a list of results in a file with full path information, then read those results in another function. I could have used arrays, but I wanted to keep it simple. At the bottom of this script, one function call scans for files with the extension set in EXT (mp4, case insensitive) and writes the results to a file in /tmp. A second function then reads the results from that file and performs some operation on each one (mv etc.). Note: if functions are confusing, you can just remove the function name { } wrappers and the calls, and it becomes a normal script again. Functions are really handy, learn to love them!
#!/usr/bin/env bash
readonly SIZE_CHECK_LIMIT_MB="10M"
readonly FOLDER="/tmp"
readonly DESTINATION_FOLDER="/tmp/archive"
readonly SAVE_LIST_FILE="/tmp/$(basename $0)-save-list.txt"
readonly EXT="mp4"
readonly CASE="-iname" #change to -name for exact ext type upper/lower
function find_files_too_large() {
> ${SAVE_LIST_FILE}
find "${FOLDER}" -maxdepth 1 -type f "${CASE}" "*.${EXT}" -size +${SIZE_CHECK_LIMIT_MB} -print0 | while IFS= read -r -d $'\0' line ; do
echo "FOUND => $line"
echo "$line" >> ${SAVE_LIST_FILE}
done
}
function archive_large_files() {
local read_file="${SAVE_LIST_FILE}"
local write_folder="$DESTINATION_FOLDER"
if [ ! -s "${read_file}" ] || [ ! -f "${read_file}" ] ;then
echo "No work to be done ... "
return
fi
while IFS= read -r line ;do
echo "mv $line $write_folder" ;sleep 1
done < "${read_file}"
}
# MAIN (this is where the script starts) We just call two functions.
find_files_too_large
archive_large_files
It might be easier, I think, to change the filenames to the folder name initially. So abcdefg.mov would be abcdefg_timestamp.mov. I can always strip the timestamp from the filename easily enough after it's copied to the right location. I was hoping I had a small syntax issue, but I think there is no easy way of doing what I thought I could...
I think you have a basic misunderstanding of how wildcards work here. The mv command doesn't support wildcards at all; the shell expands all wildcards into lists of matching files before they get passed to the mv command as arguments. Furthermore, the mv command doesn't know if the list of arguments it got came from wildcards or not, and the shell doesn't know anything about what the command is going to do with them. For instance, if you run the command grep *, the grep command just gets a list of names of files in the current directory as arguments, and will treat the first of them as a regex pattern ('cause that's what the first argument to grep is) to search the rest of the files for. If you ran mv * (note: don't do this!), it will interpret all but the last filename as sources, and the last one as a destination.
I think there's another source of confusion as well: when the shell expands a string containing a wildcard, it tries to match the entire thing to existing files and/or directories. So when you use Volumes/Myshare/SourceVideo/$in1*/$in1.mov, it looks for an already-existing file in a matching directory; AIUI the file isn't there yet, so there's no match. What it does in that case is pass the raw (unexpanded) wildcard-containing string to mv as an argument, which looks for that exact name, doesn't find it, and gives you an error.
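Both points can be seen with a small function that just reports the arguments it receives. This is a sketch assuming bash with its default options (no nullglob or failglob); show_args and the scratch file names are made up for the demonstration.

```bash
#!/bin/bash
# Print the argument count followed by each argument in brackets.
show_args() {
    printf '%d:' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

demo=$(mktemp -d)
touch "$demo/one.mov" "$demo/two.mov"

matched=$(show_args "$demo"/*.mov)     # pattern matches: one argument per file
unmatched=$(show_args "$demo"/*.avi)   # no match: the pattern itself is passed on

echo "$matched"
echo "$unmatched"

rm -r "$demo"
```

The function never sees a wildcard in the first call, only the two expanded names; in the second call it receives the literal pattern as its single argument.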
(BTW, should there be a "/" at the front of that pattern? I assume so below.)
If I understand the situation correctly, you might be able to use this:
mv -f /Volumes/Myshare/Processed/$in1.mov /Volumes/Myshare/SourceVideo/$in1*/
Since the filename isn't supplied in the second string, it doesn't look for existing files by that name, just directories with the right prefix; mv will automatically retain the filename from the source.
However, I'll echo @Sergio's warning about chaos from multiple matches. In this case, it won't overwrite files (well, it might, but for other reasons), but if it gets multiple matching target directories it'll move all but the last one into the last one (along with the file you meant to move). You say you're 100% certain this won't be a problem, but in my experience that means that there's at least a 50% chance that something you'd never have thought of will go ahead and make it happen anyway. For instance, is it possible that $in1 could wind up empty, or contain a space, or...?
Speaking of spaces, I'd also recommend double-quoting all variable references. You want the variables inside double-quotes, but the wildcards outside them (or they won't be expanded), like this:
mv -f "/Volumes/Myshare/Processed/$in1.mov" "/Volumes/Myshare/SourceVideo/$in1"*/
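To guard against the multiple-match and empty-variable cases raised above, the matches can be collected in an array and counted before anything is moved. A hedged sketch: pick_target_dir is a hypothetical helper, and the paths in the usage lines are the ones from the question.

```bash
#!/bin/bash
# Print the single directory under $1 whose name starts with $2.
# Return non-zero if the name is empty or the match is not unique.
pick_target_dir() {
    local base=$1 name=$2
    [[ -n $name ]] || return 1
    local matches=( "$base/$name"*/ )   # unmatched pattern stays as one literal string
    if (( ${#matches[@]} == 1 )) && [[ -d ${matches[0]} ]]; then
        printf '%s\n' "${matches[0]}"
    else
        echo "expected exactly one directory for '$name' under $base" >&2
        return 1
    fi
}

in1="abcdefg"   # sample value; in the real script this comes from log.txt
if target=$(pick_target_dir /Volumes/Myshare/SourceVideo "$in1"); then
    # Dry run: prints the command instead of moving anything.
    echo mv -f "/Volumes/Myshare/Processed/$in1.mov" "$target"
fi
```

The mv is only reached when exactly one directory matches, so a stray second folder or an empty $in1 results in a message rather than a misplaced file.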

Remove part of name of multiple files in Linux

I have several fastq.gz files in a directory. I want to delete parts of each file name.
Here are the file names:
RES_1448_001_S289_L001_R1_001.fastq.gz
RES_1448_001_S289_L001_R2_001.fastq.gz
RES_1448_012_S300_L001_R1_001.fastq.gz
RES_1448_012_S300_L001_R2_001.fastq.gz
I want to remove S and 3 digits after it. I expect this after removing
RES_1448_001_R1_001.fastq.gz
RES_1448_001_R2_001.fastq.gz
RES_1448_012_R1_001.fastq.gz
RES_1448_012_R2_001.fastq.gz
I asked a similar question before, but was advised to ask a new one to cover the precise requirements I have now.
Old question: Delete part of name of multiple files in Linux
Use rename.
rename 's/S\d{3}_//' *.fastq.gz
This bash regex would do the trick for you:
#!/bin/bash
for file in *.fastq.gz
do
if [[ $file =~ ^(.*)S([[:digit:]]{3})_L([[:digit:]]{3})_(.*)$ ]]
then
start="${BASH_REMATCH[1]}"
end="${BASH_REMATCH[4]}"
mv -- "$file" "${start}${end}"
fi
done
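The same regex idea can be tried on the sample names without touching any files. Note that the expected names in the question also lose the L001 field, so this sketch removes both the S### and L### parts; strip_sample_and_lane is an illustrative name.

```bash
#!/bin/bash
# Print $1 with the "S###_L###_" part removed; print it unchanged if absent.
strip_sample_and_lane() {
    local name=$1
    if [[ $name =~ ^(.*)S[0-9]{3}_L[0-9]{3}_(.*)$ ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
    else
        printf '%s\n' "$name"
    fi
}

# Sample names copied from the question; nothing is renamed here.
for file in RES_1448_001_S289_L001_R1_001.fastq.gz \
            RES_1448_012_S300_L001_R2_001.fastq.gz; do
    echo mv -- "$file" "$(strip_sample_and_lane "$file")"
done
```

Checking the printed mv commands first makes it easy to spot two files that would end up with the same new name.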

Renaming multiple files using a Shell Script

I have files named t1.txt, t2.txt, t3.txt ... t4.txt and I need a shell script to rename it like this:
file one: M.m.1.1.1.201108290000.ready
file two: M.m.1.1.1.201108290001.ready
etc, the sequence number in the last 4 digits changes.
I'd be grateful if someone helped me :)
Best Regards
This might be what you need:
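cd /home/me/Desktop/files/renam/
n=201108290000
prefix=M.m.1.1.1.
for file in *.txt; do
echo "$file"
# build the new name from the prefix and the running number
file_name=$prefix$n.ready
echo "$file_name"
n=$(( n + 1 ))
# quote both names so spaces or odd characters cannot split them
mv -- "$file" "$file_name"
done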
It's close to what you'd written yourself, you just missed some bash syntax. Note that you might want to change the initial value of n, otherwise for the files you mentioned t1.txt would become M.m.1.1.1.201108290000.ready. Depending on what your use is, that might be confusing.
I'd also advise you to avoid using the names of programs and builtins as variable names, such as seq in your case.
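One thing to watch with the *.txt loop is that the shell sorts names as text, so t10.txt would come before t2.txt. If the numbering has to follow the digits in the file names, counting explicitly is one option. A sketch under assumptions: ready_name is a hypothetical helper, and the range 1 to 4 is taken from the file names in the question.

```bash
#!/bin/bash
# Print the target name for the file at zero-based position $1.
ready_name() {
    local start=201108290000 index=$1
    printf 'M.m.1.1.1.%d.ready\n' "$(( start + index ))"
}

# Dry run over t1.txt .. t4.txt in numeric order.
for (( i = 1; i <= 4; i++ )); do
    echo mv -- "t$i.txt" "$(ready_name $(( i - 1 )))"
done
```

Here t1.txt is paired with the name ending in 0000; pass i instead of i - 1 if the sequence should start at 0001.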
