bash for loop, use input filename in the output filename


I want to use my input filename (baseline.YYYYMM.tar) in my output filename (baseline.YYYYMM.var1.tar). I can process the input files but don't know how to pass the output filename I need to my cdo application:
#!/bin/bash
prefix="basename"
fndate=$(ls | grep tar|cut -c 10-15)
var="var1"
extension=".tar"
outputfile=$prefix $fndate $var $extension
for f in $(find . -name "*.tar" -print) ; do
cdo selname,var1 $f $outputfile
done
thanks

(I agree with remarks about parsing ls.)
Did you forget dots?
outputfile=$prefix $fndate $var $extension
should be
outputfile=${prefix}.${fndate}.${var}${extension}

I'm not 100% sure I get what you're trying to do but if I'm right, you should be able to use something like this to get your output file name from the input:
var=var1
for f in *.tar; do
output=$(awk -v s="$var" 'BEGIN{FS=OFS="."}{print $1, $2, s, $3}' <<<"$f")
# use "$output" however you want, e.g.
echo "$output"
done
This uses awk to split up the input file name $f and insert the shell variable $var in the middle. <<<"$f" is bash syntax, equivalent to echo "$f" | at the start of the command.
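For comparison, the same name can be built with parameter expansion alone, which avoids starting awk once per file. This is only a sketch: output_name is a made-up helper name, and it assumes the extra part always goes immediately before the last extension, as in baseline.YYYYMM.tar -> baseline.YYYYMM.var1.tar.

```bash
# Hypothetical helper: baseline.YYYYMM.tar + var1 -> baseline.YYYYMM.var1.tar
output_name() {
    local input="${1##*/}"     # drop any leading directory such as ./
    local var="$2"
    local stem="${input%.*}"   # everything before the last dot
    local ext="${input##*.}"   # everything after the last dot
    printf '%s.%s.%s\n' "$stem" "$var" "$ext"
}

output_name ./baseline.201501.tar var1

# Applied to the loop from the question (commented out because it needs cdo):
# for f in ./*.tar; do
#     cdo selname,var1 "$f" "$(output_name "$f" var1)"
# done
```

The point is that the output name is computed inside the loop, once per input file, rather than once before the loop.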

Related

Pipe the output of basename to string substitution

I need the basename of a file that is given as an argument to a bash script. The basename should be stripped of its file extension.
Let's assume $1 = "/somefolder/andanotherfolder/myfile.txt", the desired output would be "myfile".
The current attempt creates an intermediate variable that I would like to avoid:
BASE=$(basename "$1")
NOEXT="${BASE%.*}"
My attempt to make this a one-liner would be piping the output of basename. However, I do not know how to pipe stdout to a string substitution.
EDIT: this needs to work for multiple file extensions with possibly differing lengths, hence the string substitution attempt as given above.

Why not Zoidberg ?
Ehhmm.. I meant why not remove the ext before going for basename ?
basename "${1%.*}"
Unless, of course, your directory paths contain dots; then you have to run basename first and remove the extension afterwards:
basename "$1" | awk 'BEGIN { FS = "." }; { print $1 }'
The awk solution removes everything from the first dot of the filename onward.
There's also a regular-expression-based solution that uses sed to remove only the extension after the last dot, if there is one:
basename "$1" | sed 's/\(.*\)\..*/\1/'
This could be tightened further if you're sure the extensions are alphanumeric and 3-4 characters long (e.g. mp3, mpeg, jpg, txt, json):
basename "$1" | sed 's/\(.*\)\.[[:alnum:]]\{3,4\}$/\1/'

How about this?
NEXT="$(basename -- "${1%.*}")"
Testing:
set -- '/somefolder/andanotherfolder/myfile.txt'
NEXT="$(basename -- "${1%.*}")"
echo "$NEXT"
myfile
Alternatively:
set -- "${1%.*}"; NEXT="${1##*/}"

NOEXT="${1##*/}"; NOEXT="${NOEXT%.*}"
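A short illustration of why the order of the two steps matters. stem is a hypothetical helper name and the paths are made-up examples. For /some.dir/README, basename "${1%.*}" would return some, because %.* cuts at the last dot anywhere in the path; stripping the directory first leaves the name intact.

```bash
# Hypothetical helper: two parameter expansions, no external commands.
stem() {
    local name="${1##*/}"        # strip everything up to the last slash
    printf '%s\n' "${name%.*}"   # strip the shortest trailing ".something", if any
}

stem /somefolder/andanotherfolder/myfile.txt   # myfile
stem /some.dir/archive.tar.gz                  # archive.tar
stem /some.dir/README                          # README
```

Only the last extension is removed, so a name like archive.tar.gz keeps its .tar part.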

How about:
$ [[ $var =~ [^/]*$ ]] && echo ${BASH_REMATCH%.*}
myfile

How can I save only a substring of file names from a directory without the file extension?

I have a directory that I'm reading from and I want to save only the date representation as a string.
I am close to getting it, although I know there is probably an easier way. Here is what I have so far:
#files are in the format of "THIS_20200420.csv" so I want only "20200420"
declare -a arr
declare -a arr2
FILES=test2/*.csv
for file in $FILES
do
arr=(${arr[*]} "${file##*/}")
done
for i in "${arr[@]}"
do
arr2+=$(echo $i | cut -c6-13)
done
for item in "${arr2[@]}"
do
echo $item
done
the output shows the array only having one element which is all the strings concatenated:
20200110202001202020021920200220202004202020042220200110202001202020021920200220202004202020042220200219202002202020042020200422
I'm bashing my head against my computer at this point.

arr=(
"THIS_20200420.csv"
"THIS_20200421.csv"
"THIS_20200422.csv"
"THIS_20200423.csv"
"THIS_20200424.csv"
"THIS_20200425.csv"
"THIS_20200426.csv"
"THIS_20200427.csv"
"THIS_20200428.csv"
"THIS_20200429.csv"
"THIS_20200430.csv" )
arr=( ${arr[@]//*_} )
arr=( ${arr[@]//.*} )
echo "arr: ${arr[@]}"
Explanation:
arr=( ${arr[@]//*_} ) matches, in each element, everything up to and including the last '_' and replaces it with the empty string.
arr=( ${arr[@]//.*} ) matches, in each element, everything from the first '.' onward and replaces it with the empty string.
For more information on parameter expansion, a good reference is TLDP's guide on parameter expansion.

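Try this (the pattern asks for exactly eight digits so that the 2 in the directory name test2 is not picked up):
declare -a arrayname=($(ls -1 test2/*.csv | grep -o '[0-9]\{8\}'))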
Demo:
$ ls -1 *csv
THIS_20200420.csv
THIS_20200421.csv
THIS_20200422.csv
THIS_20200423.csv
THIS_20200424.csv
THIS_20200425.csv
THIS_20200426.csv
THIS_20200427.csv
THIS_20200428.csv
THIS_20200429.csv
THIS_20200430.csv
$ declare -a arrayname=($(ls -1 *csv | grep -o '[0-9]*'))
$ echo ${arrayname[@]}
20200420 20200421 20200422 20200423 20200424 20200425 20200426 20200427 20200428 20200429 20200430
$ echo ${arrayname[2]}
20200422

You could achieve this using a loop with awk:
$ for file in *.csv; do echo $file | awk -F '[^[:alnum:]]' '{print $2}'; done
The -F '[^[:alnum:]]' tells awk to use non alphanumeric characters as the delimiter.
Another way to do this is to use bash shell parameter expansion to echo only the part of the filename you want. This obviously only works if your filenames have consistent formatting:
$ for file in *.csv; do echo "${file:5:8}"; done
I thought it would be nice to use bash parameter expansion to strip the unwanted prefix and suffix but you can't have nested expansion (afaict) so this is the best I could come up with:
$ for file in *.csv; do echo "$(tmp=${file%.csv}; echo ${tmp#THIS_})"; done

Meet Cut! A good friend of Linux Users
for file in ./*.csv; do echo $file | cut -d "_" -f 2 | cut -d "." -f 1 ; done
This one line should do the trick!

Use an array for the files assignment and parameter expansion.
#!/usr/bin/env bash
shopt -s nullglob
##: Save the files ending in .csv in an array so that the glob expands;
##: a plain variable assignment (FILES=test2/*.csv) does not expand it.
files=(test2/*.csv)
##: Keep only the file names, without the path (## removes the longest match of */).
files=("${files[@]##*/}")
##: Keep only the file names without the .csv extension.
files=("${files[@]%.csv}")
##: Keep only what follows the first _ (# removes the shortest match of *_).
files=("${files[@]#*_}")
printf '%s ' "${files[@]}"
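As for why the loop in the question ended up with one long string: with an array, name+=value (no parentheses) appends text to element 0, while name+=(value) appends a new element. A minimal sketch of the difference, using three made-up file names in place of the real test2/*.csv listing; joined and dates are illustrative variable names:

```bash
#!/usr/bin/env bash
# Sample names stand in for the files matched by test2/*.csv.
names=(THIS_20200420.csv THIS_20200421.csv THIS_20200422.csv)

joined=()
dates=()
for name in "${names[@]}"; do
    date_part=${name#*_}          # 20200420.csv
    date_part=${date_part%.csv}   # 20200420
    joined+=$date_part            # no parentheses: string append to element 0
    dates+=("$date_part")         # parentheses: append a new array element
done

echo "joined has ${#joined[@]} element(s), dates has ${#dates[@]}"
printf '%s\n' "${dates[@]}"
```

So the only change the original script strictly needs is arr2+=("$(echo "$i" | cut -c6-13)").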

Extract a line from a text file using grep?

I have a textfile called log.txt, and it logs the file name and the path it was gotten from. so something like this
2.txt
/home/test/etc/2.txt
basically the file name and its previous location. I want to use grep to grab the file directory save it as a variable and move the file back to its original location.
for var in "$@"
do
if grep "$var" log.txt
then
: # code if found
else
: # code if not found
fi
done
this just prints out to the console the 2.txt and its directory since the directory has 2.txt in it.
thanks.

Maybe flip the logic to make it more efficient?
f=''
while read -r prev
do case "$prev" in
*/*) [[ -e "$f" ]] && mv "$f" "$prev";; # a path line: move the remembered file back
*) f="$prev";; # a bare name: remember it
esac
done < log.txt
That walks through all the files in the log and if they exist locally, move them back. Should be functionally the same without a grep per file.
If the name is always the same as the last component of the path, why save it in the log at all?
If it is, then
while read prev
do f="${prev##*/}" # strip the path info
[[ -e "$f" ]] && mv "$f" "$prev"
done < <( grep / log.txt )

Having the file names on the same line would significantly simplify your script. But maybe try something like
# Convert from command-line arguments to lines
printf '%s\n' "$@" |
# Pair up with entries in file
awk 'NR==FNR { f[$0]; next }
FNR%2 { if ($0 in f) p=$0; else p=""; next }
p { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
Test it by replacing sh with cat and see what you get. If it looks correct, switch back.
Briefly, something similar could perhaps be pulled off with printf '%s\n' "$@" | grep -A 1 -Fxf - log.txt but you end up having to parse the output to pair up the output lines anyway.

Another solution:
for f in `grep -v "/" log.txt`; do
grep "/$f" log.txt | xargs -I{} cp $f {}
done

grep -q (for "quiet") stops the output
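To actually capture the logged path in a variable, one option is to match the name as a whole line and take the line that follows it. This is a sketch under the assumption that the log strictly alternates name and path lines, as in the question; logged_path is a made-up helper name, and -A is a grep extension that GNU and BSD grep both provide.

```bash
# Hypothetical helper: print the path logged on the line after the file name.
logged_path() {
    local name="$1" log="$2"
    # -F fixed string, -x whole-line match, -A 1 also prints the following line
    grep -A 1 -Fx -- "$name" "$log" | sed -n '2p'
}

# Demonstration on a throwaway log in the two-line format from the question.
log=$(mktemp)
printf '%s\n' 2.txt /home/test/etc/2.txt 3.txt /home/test/var/3.txt > "$log"

dest=$(logged_path 3.txt "$log")
echo "$dest"

# grep -q only sets the exit status, so nothing is printed by the test itself.
if grep -q -Fx -- 9.txt "$log"; then echo "found"; else echo "not found"; fi

# In the real script the move would then be:  mv -- "$name" "$dest"
rm -f "$log"
```

Because -x requires the whole line to equal the name, the path line (which merely contains 2.txt) no longer counts as a match.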

Append wc lines to filename

Title says it all. I've managed to get just the lines with this:
lines=$(wc file.txt | awk {'print $1'});
But I could use an assist appending this to the filename. Bonus points for showing me how to loop this over all the .txt files in the current directory.

find -name '*.txt' -execdir bash -c \
'mv -v "$0" "${0%.txt}_$(wc -l < "$0").txt"' {} \;
where
the bash command is executed for each (\;) matched file;
{} is replaced by the currently processed filename and passed as the first argument ($0) to the script;
${0%.txt} deletes shortest match of .txt from back of the string (see the official Bash-scripting guide);
wc -l < "$0" prints only the number of lines in the file (see answers to this question, for example)
Sample output:
'./file-a.txt' -> 'file-a_5.txt'
'./file with spaces.txt' -> 'file with spaces_8.txt'
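If find is not needed, roughly the same renaming can be sketched as a plain loop. name_with_count is a hypothetical helper name, the demonstration runs in a temporary directory, and the echo in front of mv keeps it a dry run.

```bash
# Hypothetical helper: file.txt with N lines -> file_N.txt
name_with_count() {
    local file="$1"
    local count
    count=$(wc -l < "$file")   # reading from stdin keeps the file name out of the output
    count=$((count))           # normalise: some wc implementations pad with spaces
    printf '%s_%s.txt\n' "${file%.txt}" "$count"
}

# Dry run on a scratch directory; remove the echo to really rename.
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/file with spaces.txt"
for f in "$dir"/*.txt; do
    echo mv -- "$f" "$(name_with_count "$f")"
done
rm -rf "$dir"
```

Quoting "$f" everywhere is what lets names with spaces through.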

You could use the rename command, which is actually a Perl script, as follows:
rename --dry-run 'my $fn=$_; open my $fh,"<$_"; while(<$fh>){}; $_=$fn; s/.txt$/-$..txt/' *txt
Sample Output
'tight_layout1.txt' would be renamed to 'tight_layout1-519.txt'
'tight_layout2.txt' would be renamed to 'tight_layout2-1122.txt'
'tight_layout3.txt' would be renamed to 'tight_layout3-921.txt'
'tight_layout4.txt' would be renamed to 'tight_layout4-1122.txt'
If you like what it says, remove the --dry-run and run again.
The script counts the lines in each file without using any external processes and then renames the files as you ask, also without using any external processes, so it is quite efficient.
Or, if you are happy to invoke an external process to count the lines, and avoid the Perl method above:
rename --dry-run 's/\.txt$/-`grep -ch "^" "$_"` . ".txt"/e' *txt

Use rename command
for file in *.txt; do
lines=$(wc ${file} | awk {'print $1'});
rename s/$/${lines}/ ${file}
done

#!/bin/bash
files=$(find . -maxdepth 1 -type f -name '*.txt' -printf '%f\n')
for file in $files; do
lines=$(wc $file | awk {'print $1'});
extension="${file##*.}"
filename="${file%.*}"
mv "$file" "${filename}${lines}.${extension}"
done
You can adjust maxdepth accordingly.

you can do like this as well:
for file in "path_to_file"/'your_filename_pattern'
do
lines=$(wc $file | awk {'print $1'})
mv $file $file'_'$lines
done
example:
for file in /oradata/SCRIPTS_EL/text*
do
lines=$(wc $file | awk {'print $1'})
mv $file $file'_'$lines
done

This would work, but there are definitely more elegant ways.
for i in *.txt; do
mv "$i" ${i/.txt/}_$(wc $i | awk {'print $1'})_.txt;
done
Result would put the line numbers nicely before the .txt.
Like:
file1_1_.txt
file2_25_.txt

You could use grep -c '^' to get the number of lines, instead of wc and awk:
for file in *.txt; do
[[ ! -f $file ]] && continue # skip over entries that are not regular files
#
# move file.txt to file.txt.N where N is the number of lines in file
#
# this naming convention has the advantage that if we run the loop again,
# we will not reprocess the files which were processed earlier
mv "$file" "$file".$(grep -c '^' "$file")
done

{ linecount[FILENAME] = FNR }
END {
linecount[FILENAME] = FNR
for (file in linecount) {
newname = gensub(/\.[^\.]*$/, "-"linecount[file]"&", 1, file)
q = "'"; qq = "'\"'\"'"; gsub(q, qq, newname)
print "mv -i -v '" gensub(q, qq, "g", file) "' '" newname "'"
}
close(c)
}
Save the above awk script in a file, say wcmv.awk, then run it like:
awk -f wcmv.awk *.txt
It will list the commands that need to be run to rename the files in the required way (except that it will ignore empty files). To actually execute them you can pipe the output to a shell for execution as follows.
awk -f wcmv.awk *.txt | sh
As with all irreversible batch operations, be careful and execute the commands only if they look okay.

awk '
BEGIN{ for ( i=1;i<ARGC;i++ ) Files[ARGV[i]]=0 }
{Files[FILENAME]++}
END{for (file in Files) {
# if( file !~ "_" Files[file] ".txt$") {
fileF=file;gsub( /\047/, "\047\"\047\"\047", fileF)
fileT=fileF;sub( /.txt$/, "_" Files[file] ".txt", fileT)
system( sprintf( "mv \047%s\047 \047%s\047", fileF, fileT))
# }
}
}' *.txt
Another way with awk, which makes a second pass easier to manage by giving more control over the names (for example, skipping a file whose name already contains the count from a previous run).
Following a good remark from @gniourf_gniourf:
file names containing spaces are handled
the once-tiny code is now heavy for such a small task

How do I use Bash to create a copy of a file with an extra suffix before the extension?

This title is a little confusing, so let me break it down. Basically I have a full directory of files with various names and extensions:
MainDirectory/
image_1.png
foobar.jpeg
myFile.txt
For an iPad app, I need to create copies of these with the suffix @2X appended to the end of all of these file names, before the extension - so I would end up with this:
MainDirectory/
image_1.png
image_1@2X.png
foobar.jpeg
foobar@2X.jpeg
myFile.txt
myFile@2X.txt
Instead of changing the file names one at a time by hand, I want to create a script to take care of it for me. I currently have the following, but it does not work as expected:
#!/bin/bash
FILE_DIR=.
#if there is an argument, use that as the files directory. Otherwise, use .
if [ $# -eq 1 ]
then
$FILE_DIR=$1
fi
for f in $FILE_DIR/*
do
echo "Processing $f"
filename=$(basename "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
newFileName=$(echo -n $filename; echo -n -@2X; echo -n $extension)
echo Creating $newFileName
cp $f newFileName
done
exit 0
I also want to keep this to pure bash, and not rely on os-specific calls. What am I doing wrong? What can I change or what code will work, in order to do what I need?

#!/bin/sh -e
cd "${1-.}"
for f in *; do
cp "$f" "${f%.*}@2x.${f##*.}"
done
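For reference, here is a sketch of the question's script with its three slips corrected: the assignment written as $FILE_DIR=$1, the undefined $fullfile, and cp being given the literal word newFileName. insert_suffix and copy_with_suffix are made-up names, the suffix is passed in as a parameter, and files without an extension simply get the suffix appended.

```bash
# Hypothetical helper: insert a suffix before the last extension of a bare file name.
insert_suffix() {
    local name="$1" suffix="$2"
    case "$name" in
        *.*) printf '%s%s.%s\n' "${name%.*}" "$suffix" "${name##*.}" ;;
        *)   printf '%s%s\n' "$name" "$suffix" ;;   # no extension: just append
    esac
}

# Copy every regular file in a directory (default: .) to its suffixed name.
copy_with_suffix() {
    local dir="${1:-.}" suffix="$2" f base   # plain assignment, no leading $
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories and an unmatched glob
        base="${f##*/}"                      # work on "$f", the loop variable
        cp -- "$f" "$dir/$(insert_suffix "$base" "$suffix")"
    done
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
touch "$demo/image_1.png" "$demo/foobar.jpeg" "$demo/myFile.txt"
copy_with_suffix "$demo" '@2X'
ls "$demo"
rm -rf "$demo"
```

The glob is expanded once before the loop starts, so the copies created along the way are not picked up and copied again in the same run.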

It's very easy to do that with awk in one line like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "@2X." $2 }' | sh
ls -1 prints one bare file name per line; awk then splits each name on the dot (.) and builds a cp command for it, and sh runs those commands.
I suggest first running the command without the final sh pipe, to check that the cp commands are correct. Like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "@2X." $2 }'
