Get parent directory name and filename without extension from full filepath in bash? - bash

I have a long file path, like
/path/to/file/dir1/dir2/file.txt.gz
I am interested in getting the file name without the last extension (i.e., file.txt), and the name of the parent directory (dir2, I don't want the full path to the parent directory, just its name).
How can I do this in bash?

Using BASH:
s='/path/to/file/dir1/dir2/file.txt.gz'
file="${s##*/}"
file="${file%.*}"
echo "$file"
file.txt
folder="${s%/*}"
folder="${folder##*/}"
echo "$folder"
dir2
Using awk:
awk -F '/' '{sub(/\.[^.]+$/, "", $NF); print $(NF-1), $NF}' <<< "$s"
dir2 file.txt
To read them into shell variables:
read folder file < <(awk -F '/' '{sub(/\.[^.]+$/, "", $NF);print $(NF-1), $NF}'<<<"$s")

The first part can be solved by basename(1):
$ basename /path/to/file/dir1/dir2/file.txt.gz
file.txt.gz
$
dirname(1) does the opposite, which is not quite what you want, but maybe you can use that as a starting point:
$ dirname /path/to/file/dir1/dir2/file.txt.gz
/path/to/file/dir1/dir2
$
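Combining the two gives just the name of the parent directory, for instance:
$ basename "$(dirname /path/to/file/dir1/dir2/file.txt.gz)"
dir2
$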
Of course, you can always use Perl:
$ perl -E 'do { @p = split m|/|; say $p[-2] } for @ARGV' /path/to/file/dir1/dir2/file.txt.gz
dir2
$

Related

Extend filename with word from file

I can change the filename for a file to the first word in the file.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1}')
done
But I would like to extend the filename instead, keeping the original name as part of the new one.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1 FILENAME}')
done
I have tried different variations of this, but none seem to work.
Is there an easy solution?
Kind regards Svend
Firstly, let's understand why you did not get the desired result:
head -1 -q $fname|awk '{print $1 FILENAME}'
You are piping the standard output of the head command into the awk command, so awk is reading standard input and FILENAME is therefore set to the empty string. Asking GNU AWK about FILENAME when it is consuming standard input does not make much sense, as only data goes through the pipe and there might be no input file at all, e.g.
seq 10 | awk '{print $1*10}'
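A quick way to see the difference (hypothetical file name; depending on the awk implementation, FILENAME may be empty or "-" when data arrives through a pipe):
awk '{print "FILENAME=[" FILENAME "]"}' notes.txt # prints FILENAME=[notes.txt]
cat notes.txt | awk '{print "FILENAME=[" FILENAME "]"}' # awk only sees a pipe, so FILENAME stays empty (or "-")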
Secondly, let's find a way to get the desired result: you have access to the filename and to the successfully extracted word, so you can concatenate them, that is
for fname in lrccas1
do
cp $fname "$(head -1 -q $fname|awk '{print $1}')$fname"
done
Thirdly, I must warn you that your command copies (cp) rather than renames (mv) the file, and it does not care whether the target name already exists - if it does, it will be overwritten.
You can do it in pure bash (or sh)
for fname in lrccas1
do
read -r word rest < "$fname" && cp "$fname" "$word$fname"
done
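If overwriting worries you, here is a small variation of that pure-bash loop that skips targets which already exist (only a sketch; cp -n would also work with GNU or BSD cp, but an explicit test is portable):
for fname in lrccas1
do
read -r word rest < "$fname" || continue # skip empty or unreadable files
target="$word$fname"
[ -e "$target" ] && { echo "skipping existing $target" >&2; continue; }
cp "$fname" "$target"
done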
This would do what your shell script appears to be trying to do:
awk 'FNR==1{close(out); out=$1 FILENAME} {print > out}' lrccas1
but you might want to consider something like this instead:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt
so your newly created files don't overwrite your existing ones and then to also remove the originals would be:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt &&
rm -f *.txt
That assumes your original files have some suffix like .txt or other way of identifying the original files, or you have all of your original files into some directory such as $HOME/old and can put the new files in a new directory such as $HOME/new:
cd "$HOME/old" &&
mkdir -p "$HOME/new" &&
awk -v newDir="$HOME/new" 'FNR==1{close(out); out=newDir "/" $1 FILENAME} {print > out}' * &&
echo rm -f *
remove the echo when done testing and happy with the result.
Try executing this (bash):
for fname in file_name
do
cp $fname "$(head -1 -q $fname|awk '{print $1}')$fname"
done

Append wc lines to filename

Title says it all. I've managed to get just the lines with this:
lines=$(wc file.txt | awk {'print $1'});
But I could use an assist appending this to the filename. Bonus points for showing me how to loop this over all the .txt files in the current directory.
find -name '*.txt' -execdir bash -c \
'mv -v "$0" "${0%.txt}_$(wc -l < "$0").txt"' {} \;
where
the bash command is executed for each (\;) matched file;
{} is replaced by the currently processed filename and passed as the first argument ($0) to the script;
${0%.txt} deletes shortest match of .txt from back of the string (see the official Bash-scripting guide);
wc -l < "$0" prints only the number of lines in the file (see answers to this question, for example)
Sample output:
'./file-a.txt' -> 'file-a_5.txt'
'./file with spaces.txt' -> 'file with spaces_8.txt'
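For reference, the two expansions used above can be tried on their own (hypothetical file name):
f='./file with spaces.txt'
echo "${f%.txt}" # ./file with spaces (shortest .txt suffix removed)
wc -l < "$f" # prints only the line count, with no file name attached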
You could use the rename command, which is actually a Perl script, as follows:
rename --dry-run 'my $fn=$_; open my $fh,"<$_"; while(<$fh>){}; $_=$fn; s/.txt$/-$..txt/' *txt
Sample Output
'tight_layout1.txt' would be renamed to 'tight_layout1-519.txt'
'tight_layout2.txt' would be renamed to 'tight_layout2-1122.txt'
'tight_layout3.txt' would be renamed to 'tight_layout3-921.txt'
'tight_layout4.txt' would be renamed to 'tight_layout4-1122.txt'
If you like what it says, remove the --dry-run and run again.
The script counts the lines in the file without using any external processes and then renames the files as you ask, also without using any external processes, so it is quite efficient.
Or, if you are happy to invoke an external process to count the lines, and avoid the Perl method above:
rename --dry-run 's/\.txt$/-`grep -ch "^" "$_"` . ".txt"/e' *txt
Use the rename command:
for file in *.txt; do
lines=$(wc ${file} | awk {'print $1'});
rename s/$/${lines}/ ${file}
done
#!/bin/bash
files=$(find . -maxdepth 1 -type f -name '*.txt' -printf '%f\n')
for file in $files; do
lines=$(wc $file | awk {'print $1'});
extension="${file##*.}"
filename="${file%.*}"
mv "$file" "${filename}${lines}.${extension}"
done
You can adjust maxdepth accordingly.
You can also do it like this:
for file in "path_to_file"/'your_filename_pattern'
do
lines=$(wc $file | awk {'print $1'})
mv $file $file'_'$lines
done
example:
for file in /oradata/SCRIPTS_EL/text*
do
lines=$(wc $file | awk {'print $1'})
mv $file $file'_'$lines
done
This would work, but there are definitely more elegant ways.
for i in *.txt; do
mv "$i" ${i/.txt/}_$(wc $i | awk {'print $1'})_.txt;
done
Result would put the line numbers nicely before the .txt.
Like:
file1_1_.txt
file2_25_.txt
You could use grep -c '^' to get the number of lines, instead of wc and awk:
for file in *.txt; do
[[ ! -f $file ]] && continue # skip over entries that are not regular files
#
# move file.txt to file.txt.N where N is the number of lines in file
#
# this naming convention has the advantage that if we run the loop again,
# we will not reprocess the files which were processed earlier
mv "$file" "$file".$(grep -c '^' "$file")
done
{ linecount[FILENAME] = FNR }
END {
linecount[FILENAME] = FNR
for (file in linecount) {
newname = gensub(/\.[^\.]*$/, "-"linecount[file]"&", 1, file)
q = "'"; qq = "'\"'\"'"; gsub(q, qq, newname)
print "mv -i -v '" gensub(q, qq, "g", file) "' '" newname "'"
}
}
Save the above awk script in a file, say wcmv.awk, then run it like:
awk -f wcmv.awk *.txt
It will list the commands that need to be run to rename the files in the required way (except that it will ignore empty files). To actually execute them you can pipe the output to a shell for execution as follows.
awk -f wcmv.awk *.txt | sh
Like it goes with all irreversible batch operations, be careful and execute commands only if they look okay.
awk '
BEGIN{ for ( i=1;i<ARGC;i++ ) Files[ARGV[i]]=0 }
{Files[FILENAME]++}
END{for (file in Files) {
# if( file !~ "_" Files[file] ".txt$") {
fileF=file;gsub( /\047/, "\047\"\047\"\047", fileF)
fileT=fileF;sub( /\.txt$/, "_" Files[file] ".txt", fileT)
system( sprintf( "mv \047%s\047 \047%s\047", fileF, fileT))
# }
}
}' *.txt
Another way with awk, which makes a second pass easier to manage by allowing more control over the name (for example, avoiding files that already have the count in their name from a previous cycle).
Following a good remark from @gniourf_gniourf:
file names with spaces inside are now possible
the once tiny code is now heavy for such a small task

copy files from mount point listed in a csv

I need to move over 100,000 images from one server to another via a mount point. I have a .csv with them listed and I'm looking to script it.
the csv looks like this
"images1\002_0001\thumb",53717902.jpg,/www/images/002_0001/thumb/
"images1\002_0001\thumb",53717901.jpg,/www/images/002_0001/thumb/
"images1\002_0001\thumb",53717900.jpg,/www/images/002_0001/thumb/
Comma separated, we have the source directory, the image name, and the destination.
I was thinking of using awk to create each as a variable
SOURCE=`awk -F ',' '{ print $1 }' test.csv`
IMGNAME=`awk -F ',' '{ print $2 }' test.csv`
DEST=`awk -F ',' '{ print $3 }' test.csv`
This is where I'm getting stuck. My loop:
while read line
do
cp $SOURCE${IMGNAME} $DEST
done <test.csv
This copied the first name it found into all the directories.
You could use what you have and move the variable extraction into the loop, referencing $line (a sketch of that variant follows the IFS version below), or you could use IFS, as suggested here:
while IFS=, read -r src filename dest
do
cp $src${filename} $dest
done <test.csv
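For completeness, the first alternative mentioned above (deriving each field from $line inside the loop) could look roughly like the sketch below; it spawns three awk processes per line and still ignores the quote/backslash cleanup discussed further down:
while read -r line
do
src=$(awk -F ',' '{print $1}' <<< "$line") # first field: source directory
img=$(awk -F ',' '{print $2}' <<< "$line") # second field: image name
dest=$(awk -F ',' '{print $3}' <<< "$line") # third field: destination directory
cp "$src$img" "$dest"
done <test.csv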
There are many ways to do it; here are some examples.
If you have no spaces in the directory strings, you can do it straight from the shell:
sed -E 's/"/cp /; s/",/\// ; s/,/ /;s/\\/\//g' test.csv | /bin/bash
It's better to check it before you run it - you're talking about a lot of files...
sed -E 's/"/cp /; s/",/\// ; s/,/ /;s/\\/\//g' test.csv | less
It can happen that you have spaces in the directory name, like My Windows Like Dir Name. In that case you need double quotes (the double quotes exist for exactly this reason, maybe...).
You can do it using only awk (still from the shell):
awk -F',' '{gsub(/"/, "", $1); gsub(/\\/, "/", $1); print "cp \""$1"/" $2"\" \"" $3"\""}' test.csv | /bin/bash
or, equivalently:
awk -F',' '{gsub(/"/, "", $1); gsub(/\\/, "/", $1); printf ("cp \"%s/%s\" \"%s\"\n",$1,$2,$3)}' test.csv | /bin/bash
Always check it in advance by leaving off the last pipe | /bin/bash, perhaps adding | head -n 10 to see only the first 10 lines.
The script can be written:
while IFS=, read -r SOURCE IMGNAME DEST
do
SOURCE="${SOURCE//\\//}" # Here you need to change "\" into "/"
SOURCE="${SOURCE//\"/}" # Here I like to kill the ""
cp "${SOURCE}/${IMGNAME}" "$DEST" # Here I put the "" back
done <test.csv
Note: I think you need to change the Windows-style "\" into the Unix-style "/"; that is why I added the substitution rules.
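To see what those substitutions do to one field from the csv, here is a small illustrative sketch:
s='"images1\002_0001\thumb"'
s="${s//\\//}" # now: "images1/002_0001/thumb" (backslashes became slashes)
s="${s//\"/}" # now: images1/002_0001/thumb (surrounding quotes removed)
echo "$s"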

How do I use Bash to create a copy of a file with an extra suffix before the extension?

This title is a little confusing, so let me break it down. Basically I have a full directory of files with various names and extensions:
MainDirectory/
image_1.png
foobar.jpeg
myFile.txt
For an iPad app, I need to create copies of these with the suffix @2X appended to the end of all of these file names, before the extension - so I would end up with this:
MainDirectory/
image_1.png
image_1@2X.png
foobar.jpeg
foobar@2X.jpeg
myFile.txt
myFile@2X.txt
Instead of changing the file names one at a time by hand, I want to create a script to take care of it for me. I currently have the following, but it does not work as expected:
#!/bin/bash
FILE_DIR=.
#if there is an argument, use that as the files directory. Otherwise, use .
if [ $# -eq 1 ]
then
$FILE_DIR=$1
fi
for f in $FILE_DIR/*
do
echo "Processing $f"
filename=$(basename "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
newFileName=$(echo -n $filename; echo -n -@2X; echo -n $extension)
echo Creating $newFileName
cp $f newFileName
done
exit 0
I also want to keep this to pure bash, and not rely on os-specific calls. What am I doing wrong? What can I change or what code will work, in order to do what I need?
#!/bin/sh -e
cd "${1-.}"
for f in *; do
cp "$f" "${f%.*}#2x.${f##*.}"
done
It's very easy to do that with awk in one line like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "@2X." $2 }' | sh
With ls -1 you get just the bare list of files; then you pipe it to awk, using the dot (.) as field separator, to build a shell command that creates a copy of each file.
I suggest running the command without the final sh pipe first, in order to check that the cp commands are correct, like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "@2X." $2 }'

Extract directory path and filename

I have a variable which has the directory path, along with the file name. I want to extract the filename alone from the Unix directory path and store it in a variable.
fspec="/exp/home1/abc.txt"
Use the basename command to extract the filename from the path:
[/tmp]$ export fspec=/exp/home1/abc.txt
[/tmp]$ fname=`basename $fspec`
[/tmp]$ echo $fname
abc.txt
bash to get file name
fspec="/exp/home1/abc.txt"
filename="${fspec##*/}" # get filename
dirname="${fspec%/*}" # get directory/path name
other ways
awk
$ echo $fspec | awk -F"/" '{print $NF}'
abc.txt
sed
$ echo $fspec | sed 's/.*\///'
abc.txt
using IFS
$ IFS="/"
$ set -- $fspec
$ eval echo \${${#}}
abc.txt
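What that does: with IFS set to "/", the unquoted $fspec is split into positional parameters by set --, and the eval echoes the last one. A slightly safer sketch that restores IFS afterwards:
old_ifs=$IFS
IFS="/"
set -- $fspec # splits into "", "exp", "home1", "abc.txt"
IFS=$old_ifs
eval echo \${$#} # $# is 4, so this expands to ${4}, i.e. abc.txt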
You can simply do:
base=$(basename "$fspec")
dirname "/usr/home/theconjuring/music/song.mp3"
will yield
/usr/home/theconjuring/music.
bash:
fspec="/exp/home1/abc.txt"
fname="${fspec##*/}"
echo $fspec | tr "/" "\n"|tail -1
Using bash "here string":
$ fspec="/exp/home1/abc.txt"
$ tr "/" "\n" <<< $fspec | tail -1
abc.txt
$ filename=$(tr "/" "\n" <<< $fspec | tail -1)
$ echo $filename
abc.txt
The benefit of the "here string" is that it avoids the need/overhead of running an echo command. In other words, the "here string" is internal to the shell. That is:
$ tr <<< $fspec
as opposed to:
$ echo $fspec | tr
