Append wc lines to filename - bash

Title says it all. I've managed to get just the lines with this:
lines=$(wc file.txt | awk '{print $1}');
But I could use an assist appending this to the filename. Bonus points for showing me how to loop this over all the .txt files in the current directory.

find . -name '*.txt' -execdir bash -c \
'mv -v "$0" "${0%.txt}_$(wc -l < "$0").txt"' {} \;
where:
- the bash command is executed once per matched file (\;);
- {} is replaced by the currently processed filename and passed as the first argument ($0) to the script;
- ${0%.txt} deletes the shortest match of .txt from the back of the string (see the official Bash-scripting guide, and the short demo below);
- wc -l < "$0" prints only the number of lines in the file.
Sample output:
'./file-a.txt' -> 'file-a_5.txt'
'./file with spaces.txt' -> 'file with spaces_8.txt'

You could use the rename command, which is actually a Perl script, as follows:
rename --dry-run 'my $fn=$_; open my $fh,"<$_"; while(<$fh>){}; $_=$fn; s/\.txt$/-$..txt/' *txt
Sample Output
'tight_layout1.txt' would be renamed to 'tight_layout1-519.txt'
'tight_layout2.txt' would be renamed to 'tight_layout2-1122.txt'
'tight_layout3.txt' would be renamed to 'tight_layout3-921.txt'
'tight_layout4.txt' would be renamed to 'tight_layout4-1122.txt'
If you like what it says, remove the --dry-run and run again.
The script counts the lines in the file without using any external processes, and then renames the files as you ask, also without any external processes, so it is quite efficient.
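For readability, here is the same expression spread out with comments; this assumes the Perl-based rename (sometimes packaged as file-rename or prename), not the util-linux one:
rename --dry-run 'my $fn = $_;         # save the original filename
                  open my $fh, "<$_";  # open the file itself
                  while (<$fh>) { }    # read to EOF, so $. now holds the line count
                  $_ = $fn;            # restore the filename
                  s/\.txt$/-$..txt/    # splice the count ($.) in before the extension
                 ' *txt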
Or, if you are happy to invoke an external process to count the lines, and avoid the Perl method above:
rename --dry-run 's/\.txt$/-`grep -ch "^" "$_"` . ".txt"/e' *txt

Use the rename command:
for file in *.txt; do
    lines=$(wc -l < "$file")
    rename "s/\$/${lines}/" "$file"
done

#!/bin/bash
files=$(find . -maxdepth 1 -type f -name '*.txt' -printf '%f\n')
for file in $files; do
    lines=$(wc -l < "$file")
    extension="${file##*.}"
    filename="${file%.*}"
    mv "$file" "${filename}${lines}.${extension}"
done
You can adjust maxdepth accordingly. Note that the unquoted $files is word-split, so this version breaks on filenames containing whitespace.
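If the filenames may contain whitespace, a safer sketch of the same loop reads NUL-delimited output from find instead:
find . -maxdepth 1 -type f -name '*.txt' -print0 |
while IFS= read -r -d '' file; do
    lines=$(wc -l < "$file")
    mv "$file" "${file%.txt}${lines}.txt"
done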

You can do it like this as well:
for file in "path_to_file"/'your_filename_pattern'
do
    lines=$(wc -l < "$file")
    mv "$file" "${file}_${lines}"
done
For example:
for file in /oradata/SCRIPTS_EL/text*
do
    lines=$(wc -l < "$file")
    mv "$file" "${file}_${lines}"
done

This would work, but there are definitely more elegant ways.
for i in *.txt; do
    mv "$i" "${i/.txt/}_$(wc -l < "$i")_.txt"
done
The result puts the line count nicely before the .txt, like:
file1_1_.txt
file2_25_.txt

You could use grep -c '^' to get the number of lines, instead of wc and awk:
for file in *.txt; do
    [[ ! -f $file ]] && continue  # skip over entries that are not regular files
    #
    # move file.txt to file.txt.N where N is the number of lines in file
    #
    # this naming convention has the advantage that if we run the loop again,
    # we will not reprocess the files which were processed earlier
    mv "$file" "$file".$(grep -c '^' "$file")
done
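As an aside, the two counts differ when the final line has no trailing newline, which is one reason to prefer grep -c '^' here:
$ printf 'a\nb\nc' > nofinal.txt   # three lines, no final newline
$ wc -l < nofinal.txt              # wc counts newline characters
2
$ grep -c '^' nofinal.txt          # grep counts lines
3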

{ linecount[FILENAME] = FNR }
END {
    linecount[FILENAME] = FNR
    for (file in linecount) {
        newname = gensub(/\.[^.]*$/, "-" linecount[file] "&", 1, file)
        q = "'"; qq = "'\"'\"'"; gsub(q, qq, newname)
        print "mv -i -v '" gensub(q, qq, "g", file) "' '" newname "'"
    }
}
Save the above awk script in a file, say wcmv.awk, then run it like:
awk -f wcmv.awk *.txt
It will list the commands that need to be run to rename the files in the required way (except that it will ignore empty files). To actually execute them you can pipe the output to a shell for execution as follows.
awk -f wcmv.awk *.txt | sh
Like it goes with all irreversible batch operations, be careful and execute commands only if they look okay.

awk '
BEGIN { for (i = 1; i < ARGC; i++) Files[ARGV[i]] = 0 }
{ Files[FILENAME]++ }
END {
    for (file in Files) {
        # if (file !~ "_" Files[file] ".txt$") {
        fileF = file; gsub(/\047/, "\047\"\047\"\047", fileF)
        fileT = fileF; sub(/\.txt$/, "_" Files[file] ".txt", fileT)
        system(sprintf("mv \047%s\047 \047%s\047", fileF, fileT))
        # }
    }
}' *.txt
This is another way with awk; it makes a second pass easier by allowing more control over the name (for example, skipping files that already have the count in their name from a previous run; see the commented-out test).
Following a good remark from @gniourf_gniourf:
- file names with spaces inside are now handled;
- the once-tiny code has become heavy for such a small task.
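For reference, \047 is the octal escape for a single quote, and the gsub wraps each embedded quote so it survives inside the single-quoted mv command. A quick demonstration of the effect:
$ awk 'BEGIN { s = "it'\''s.txt"; gsub(/\047/, "\047\"\047\"\047", s); print s }'
it'"'"'s.txt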

Related

Extend filename with word from file

I can change the filename for a file to the first word in the file.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1}')
done
But I would like to extend the existing filename instead.
for fname in lrccas1
do
cp $fname $(head -1 -q $fname|awk '{print $1 FILENAME}')
done
I have tried different variations of this, but none seem to work.
Is there an easy solution?
Kind regards Svend
Firstly, let's understand why you did not get the desired result:
head -1 -q $fname|awk '{print $1 FILENAME}'
You are piping the standard output of the head command into the awk command, so awk is reading standard input, and FILENAME is therefore set to the empty string. Asking GNU AWK for FILENAME when it is consuming standard input does not make much sense, as only data goes through the pipe and there might be no input file at all, e.g.
seq 10 | awk '{print $1*10}'
Secondly, let's find a way to get the desired result: you have access to the filename and have successfully extracted the word, so you can concatenate them, that is:
for fname in lrccas1
do
    cp "$fname" "$(head -1 -q "$fname" | awk '{print $1}')$fname"
done
Thirdly, I must warn you that your command copies (cp) rather than renames (mv) the file, and it does not check whether the target name already exists; if it does, it will be overwritten.
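If clobbering is a concern, one option (supported by both GNU and BSD cp) is the no-clobber flag, sketched here with the question's names:
cp -n "$fname" "$(head -1 -q "$fname" | awk '{print $1}')$fname"   # -n: never overwrite an existing target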
You can do it in pure bash (or sh):
for fname in lrccas1
do
    read -r word rest < "$fname" && cp "$fname" "$word$fname"
done
This would do what your shell script appears to be trying to do:
awk 'FNR==1{close(out); out=$1 FILENAME} {print > out}' lrccas1
but you might want to consider something like this instead:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt
so your newly created files don't overwrite your existing ones and then to also remove the originals would be:
awk 'FNR==1{close(out); out=$1 FILENAME "_new"} {print > out}' *.txt &&
rm -f *.txt
That assumes your original files have some suffix like .txt or some other way of identifying them, or that you have all of your original files in some directory such as $HOME/old and can put the new files in a new directory such as $HOME/new:
cd "$HOME/old" &&
mkdir -p "$HOME/new" &&
awk -v newDir="$HOME/new" 'FNR==1{close(out); out=newDir "/" $1 FILENAME} {print > out}' * &&
echo rm -f *
Remove the echo when you are done testing and happy with the result.

Remove half the lines of files recursively using sed and wc

I wonder if there is a way to remove half the lines of a file using wc and sed.
I can do this:
sed -i '50,$d' myfile.txt
Which removes lines from 50 to the end of file. I can also do this:
wc -l myfile.txt
Which returns the number of lines in the file.
But what I really want to do is something like this:
wc -l myfile.txt | sed -i '{wc -l result}/2,$d' myfile.txt
How can I tell sed to remove the lines starting from the wc -l result divided by 2?
How can I do this recursively?
You can make use of the head command:
head -n -"$(($(wc -l < file)/2))" file
awk is also possible, and with an exit statement it could be faster:
awk -v t="$(wc -l < file)" 'NR>=t/2{exit}7' file
You can capture the result with awk/head ... > newFile, or use cmd > tmp && mv tmp file to get an "in-place" change.
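For example, a minimal sketch of the tmp-and-mv variant (tmp is an arbitrary name):
head -n -"$(($(wc -l < file)/2))" file > tmp && mv tmp file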
I guess you were close. Just use a command substitution with arithmetic expansion to get the value of the starting line:
startline="$(( $(wc -l <myfile.txt) / 2 ))"
sed -i "$startline"',$d' myfile.txt
Or a oneliner:
sed -i "$(( $(wc -l <myfile.txt) / 2 ))"',$d' myfile.txt
In some sense, reading the file twice (or, as noted by Kent, once and a half) is unavoidable. Perhaps it will be slightly more efficient if you use just a single process.
awk 'NR==FNR { total=FNR; next }
FNR>total/2 { exit } 1' myfile.txt myfile.txt
Doing this recursively with Awk is slightly painful. If you have GNU Awk, you can use the -i inplace option, but I'm not sure of its semantics when you read the same file twice. Perhaps just fall back to a temporary output file.
find . -type f -exec sh -c 'awk "NR==FNR { total=FNR; next }
FNR>total/2 { exit } 1" "$1" "$1" >tmp &&
mv tmp "$1"' _ {} \;

Extract a line from a text file using grep?

I have a text file called log.txt, and it logs the file name and the path it was gotten from, so something like this:
2.txt
/home/test/etc/2.txt
basically the file name and its previous location. I want to use grep to grab the file's directory, save it as a variable, and move the file back to its original location.
for var in "$@"
do
    if grep "$var" log.txt
    then
        : # code if found
    else
        : # code if not found
    fi
done
This just prints 2.txt and its directory to the console, since the directory string contains 2.txt.
thanks.
Maybe flip the logic to make it more efficient?
f=''
while read -r prev
do case "$prev" in
     */*) [[ -e "$f" ]] && mv "$f" "$prev";;  # a path line: move the remembered file back
     *)   f="$prev";;                         # a bare name line: remember it
   esac
done < log.txt
That walks through all the files in the log and, if they exist locally, moves them back. It should be functionally the same, without a grep per file.
If the name is always just the basename of the path, then why save it in the log at all? If it is, then:
while read -r prev
do f="${prev##*/}"  # strip the path info
   [[ -e "$f" ]] && mv "$f" "$prev"
done < <( grep / log.txt )
Having the file names on the same line would significantly simplify your script. But maybe try something like
# Convert from command-line arguments to lines
printf '%s\n' "$@" |
# Pair up with entries in file
awk 'NR==FNR { f[$0]; next }
FNR%2 { if ($0 in f) p=$0; else p=""; next }
p { print "mv \"" p "\" \"" $0 "\"" }' - log.txt |
sh
Test it by replacing sh with cat and see what you get. If it looks correct, switch back.
Briefly, something similar could perhaps be pulled off with printf '%s\n' "$@" | grep -A 1 -Fxf - log.txt, but you end up having to parse the output to pair up the lines anyway.
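With the question's sample log it would print the matched pair, name line first:
$ printf '%s\n' 2.txt | grep -A 1 -Fxf - log.txt
2.txt
/home/test/etc/2.txt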
Another solution:
for f in `grep -v "/" log.txt`; do
    grep "/$f" log.txt | xargs -I{} cp "$f" {}
done
grep -q (for "quiet") suppresses the output.
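That is, the test in the question's loop becomes something like:
if grep -q "$var" log.txt
then
    echo "found: $var"       # code if found
else
    echo "not found: $var"   # code if not found
fi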

How do I use Bash to create a copy of a file with an extra suffix before the extension?

This title is a little confusing, so let me break it down. Basically I have a full directory of files with various names and extensions:
MainDirectory/
image_1.png
foobar.jpeg
myFile.txt
For an iPad app, I need to create copies of these with the suffix #2X appended to the end of all of these file names, before the extension - so I would end up with this:
MainDirectory/
image_1.png
image_1#2X.png
foobar.jpeg
foobar#2X.jpeg
myFile.txt
myFile#2X.txt
Instead of changing the file names one at a time by hand, I want to create a script to take care of it for me. I currently have the following, but it does not work as expected:
#!/bin/bash
FILE_DIR=.
#if there is an argument, use that as the files directory. Otherwise, use .
if [ $# -eq 1 ]
then
$FILE_DIR=$1
fi
for f in $FILE_DIR/*
do
echo "Processing $f"
filename=$(basename "$fullfile")
extension="${filename##*.}"
filename="${filename%.*}"
newFileName=$(echo -n $filename; echo -n -#2X; echo -n $extension)
echo Creating $newFileName
cp $f newFileName
done
exit 0
I also want to keep this to pure bash, and not rely on os-specific calls. What am I doing wrong? What can I change or what code will work, in order to do what I need?
#!/bin/sh -e
cd "${1-.}"
for f in *; do
    cp "$f" "${f%.*}#2X.${f##*.}"
done
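One caveat: for a name with no dot (say, Makefile), both ${f%.*} and ${f##*.} expand to the whole name, producing Makefile#2X.Makefile. A sketch that restricts the glob to dotted names:
for f in *.*; do
    cp "$f" "${f%.*}#2X.${f##*.}"
done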
It's very easy to do that with awk in one line like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "#2X." $2 }' | sh
With ls -1 you get just the bare list of files; then you pipe it to awk, using the dot (.) as the field separator, and build a shell cp command for each file.
I suggest running the command without the final sh pipe first, in order to check that the generated cp commands are correct, like this:
ls -1 | awk -F "." ' { print "cp " $0 " " $1 "#2X." $2 }'

Bash Shell awk/xargs magic

I'm trying to learn a little awk-fu. I have a CSV where each line is of the format partial_file_name,file_path. My goal is to find the files (based on partial name) and move them to their respective new paths. I wanted to combine the forces of find, awk and mv to achieve this, but I'm stuck on the implementation. I wanted to use awk to separate the terms from the csv file so that I could do something like
find . -name '*$1*' -print | xargs mv {} $2{}
where $1 and $2 are the split terms from the csv file. Anyone have any ideas? -peace
I suggest doing this:
$ cat korv
foo.txt,/hello/
bar.jpg,/mullo/
$ awk -F, '{print $1 " " $2}' korv
foo.txt /hello/
bar.jpg /mullo/
-F sets the delimiter, so the above will split using ",". Next, add * to the filenames:
$ awk -F, '{print "*"$1"*" " " $2}' korv
*foo.txt* /hello/
*bar.jpg* /mullo/
**
This shows I have an empty line. We don't want this match, so we add a rule:
$ awk -F, '/[a-z]/{print "*"$1"*" " " $2}' korv
*foo.txt* /hello/
*bar.jpg* /mullo/
Looks good, so feed all of this to mv using command substitution (note that this expands into a single mv invocation, so it only behaves as intended when the expansion reduces to one source and one destination):
$ mv $(awk -F, '/[a-z]/{print "*"$1"*" " " $2}' korv)
$
Done.
You don't really need awk for this. There isn't really anything here which awk does better than the shell.
#!/bin/sh
while IFS=, read -r file target; do
    find . -name "$file" -print0 | xargs -0 -r -I{} mv {} "$target"
done <path_to_csv_file
If you have special characters in the file names, you may need to tweak the read.
What about using awk's system command:
awk '{ system("find . -name " $1 " -print | xargs -I {} mv {} " $2 "{}"); }'
Example arguments in the csv file: test.txt ./subdirectory/
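A hedged usage sketch, assuming the whitespace-separated pairs live in a file called pairs.txt (a placeholder name):
printf '%s %s\n' test.txt ./subdirectory/ > pairs.txt
awk '{ system("find . -name " $1 " -print | xargs -I {} mv {} " $2 "{}") }' pairs.txt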
find . -name "*err" -size "+10c" | awk -F.err '{print $1".job"}' | xargs -I {} qsub {}
