How to apply the same awk action to all the files in a folder? - bash

I wrote an awk command to delete all the lines ending in a colon from a file. But now I want to run this particular awk action on a whole folder of similar files.
awk '!/:$/' qs.txt > fin.txt
awk '{print $3 " " $4}' fin.txt > out.txt

You could wrap your awk command in a loop in your shell such as bash.
myfiles=mydirectory/*.txt
for file in $myfiles
do
b=$(basename "$file" .txt)
awk '!/:$/' "$b.txt" > "$b.out"
done
EDIT: improved quoting as commenters suggested
If you like it better, you can use "${file%.txt}" instead of $(basename "$file" .txt).
Aside: My own preference runs to basename just because man basename is easier for me than man -P 'less -p "^ Param"' bash (when that is the relevant heading on the particular system). Please accept this quirk of mine and let's not discuss info and http://linux.die.net/man/ and whatever.
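Both of the question's awk steps can also be folded into one pass inside such a loop; a minimal sketch, assuming the same mydirectory/*.txt layout and an illustrative .out suffix:

```shell
# For each file: drop lines ending in ":" and print fields 3 and 4
# of the surviving lines, in a single awk pass.
for file in mydirectory/*.txt; do
    awk '!/:$/ {print $3 " " $4}' "$file" > "${file%.txt}.out"
done
```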

You could use sed. Just run the command below in the directory where the files you want to change are stored.
sed -i '/:$/d' *.*
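A word of caution: -i rewrites the files in place with no backup. Both GNU and BSD sed accept a suffix after -i to keep one; a sketch, with .bak as an arbitrary suffix:

```shell
# Delete lines ending in ":" in place, keeping each original as file.bak
sed -i.bak '/:$/d' *.txt
```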

This will create new files with the same names in a separate directory.
mkdir NEWFILES
for file in `find . -name "*name_pattern*"`
do
awk '!/:$/' "$file" > fin.txt
awk '{print $3 " " $4}' fin.txt > "NEWFILES/$file"
done
After that you just need to
cp -fr NEWFILES/* .
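If the matched filenames may contain spaces, a null-delimited find loop is more robust; a sketch assuming GNU find and bash, with both awk steps merged and "*name_pattern*" kept as the placeholder from above:

```shell
# Null-delimited loop: safe for filenames with spaces.
mkdir -p NEWFILES
find . -maxdepth 1 -type f -name "*name_pattern*" -print0 |
while IFS= read -r -d '' file; do
    # Skip ":"-terminated lines and print fields 3 and 4 in one pass.
    awk '!/:$/ {print $3 " " $4}' "$file" > "NEWFILES/$(basename "$file")"
done
```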

Related

Combine multiple files into one including the file name

I have been looking around trying to combine multiple text files into one, including the name of the file.
My current file content is:
1111,2222,3333,4444
What I'm after is:
File1,1111,2222,3333,4444
File1,1111,2222,3333,4445
File1,1111,2222,3333,4446
File1,1111,2222,3333,4447
File2,1111,2222,3333,114444
File2,1111,2222,3333,114445
File2,1111,2222,3333,114446
I found multiple examples of how to combine them, but nothing that combines them while including the file name.
Could you please try the following, assuming your input file names have the .csv extension:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.csv > output_file
After seeing OP's comments: if the file extensions are .txt, then try:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.txt > output_file
Assuming all your files have a .txt extension and contain only one line as in the example, you can use the following code:
for f in *.txt; do echo "$f,$(cat "$f")"; done > output.log
where output.log is the output file.
Well, it works:
printf "%s\n" *.txt |
xargs -n1 -d $'\n' bash -c 'xargs -n1 -d $'\''\n'\'' printf "%s,%s\n" "$1" <"$1"' --
First output a newline-separated list of files.
Then for each file, xargs executes bash,
which in turn runs xargs for each line of the file,
executing printf "%s,%s\n" <filename> <line> for each line of input.
Tested in repl.
Solved using grep "" *.txt -I > $filename.
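Note that grep "" *.txt separates the filename from the line with a colon, not a comma. If the comma form from the example is needed, the first colon on each line can be rewritten; a sketch, assuming the data itself contains no colon before the separator:

```shell
# Prefix every line with its filename, then turn the first ":" into ","
grep "" *.txt | sed 's/:/,/'
```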

How to rename a CSV file from a value in the CSV file

I have 100 1-line CSV files. The files are currently labeled AAA.txt, AAB.txt, ABB.txt (after I used split -l 1 on them). The first field in each of these files is what I want to rename the file as, so instead of AAA, AAB and ABB it would be the first value.
Input CSV (filename AAA.txt)
1234ABC, stuff, stuff
Desired Output (filename 1234ABC.csv)
1234ABC, stuff, stuff
I don't want to edit the content of the CSV itself, just change the filename
something like this should work:
for f in ./* ; do new_name=$(head -1 "$f" | cut -d, -f1); cp "$f" dir/"$new_name"; done
The copies go into a new dir just in case something goes wrong, or you need the original file names.
starting with your original file before splitting
$ awk -F, '{print > ($1".csv")}' originalFile.csv
and do all in one shot.
This will store the whole input file in a file named after its first column (here, 1234ABC.csv):
awk -F, '{print $0 > $1".csv" }' aaa.txt
In a terminal, change directory, e.g. cd /path/to/directory, to where the files are, and then use the following compound command:
for f in *.txt; do echo mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"; done
Note: There is an intentional echo command there for you to test with; it will only print out the mv commands so you can see that the outcome is what you want. You can then run it again, removing just echo from the compound command, to actually rename the files via mv.

Rename files to new naming convention in bash

I have a directory of files with names formatted like
01-Peterson#2x.png
15-Consolidated#2x.png
03-Brady#2x.png
And I would like to format them like
PETERSON.png
CONSOLIDATED.png
BRADY.png
But my bash scripting skills are pretty weak right now. What is the best way to go about this?
Edit: my bash version is 3.2.57(1)-release
This will work for file names that contain spaces (including newlines), backslashes, or any other character, including globbing characters that could cause a false match on other files in the directory, and it won't remove your home file system given a particularly undesirable file name!
for old in *.png; do
new=$(
awk 'BEGIN {
base = sfx = ARGV[1]
sub(/^.*\./,"",sfx)
sub(/^[^-]+-/,"",base)
sub(/#[^#.]+\.[^.]+$/,"",base)
print toupper(base) "." sfx
exit
}' "$old"
) &&
mv -- "$old" "$new"
done
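The same renaming can be sketched with parameter expansion alone, under the assumption that all names match the NN-Name#Nx.png pattern; tr handles the uppercasing because the asker's bash 3.2 predates ${var^^}:

```shell
for old in *.png; do
    base=${old#*-}    # "01-Peterson#2x.png" -> "Peterson#2x.png"
    base=${base%%#*}  # "Peterson#2x.png"    -> "Peterson"
    new=$(printf '%s' "$base" | tr '[:lower:]' '[:upper:]').png
    echo mv -- "$old" "$new"  # drop echo once the output looks right
done
```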
If the pattern for all your files are like the one you posted, I'd say you can do something as simple as running this on your directory:
for file in *.png; do new_file=$(echo "$file" | awk -F"-" '{print $2}' | awk -F"#" '{n=split($2,a,"."); print toupper($1) "." a[2]}'); mv "$file" "$new_file"; done
If you fancy learning other solutions, like regexes, you can also do:
for file in *.png; do new_file=$(echo "$file" | sed "s/.*-//;s/#.*//" | tr '[:lower:]' '[:upper:]').png; mv "$file" "$new_file"; done
Testing it, it does for example:
mv 01-Peterson#2x.png PETERSON.png
mv 02-Bradley#2x.png BRADLEY.png
mv 03-Jacobs#2x.png JACOBS.png
mv 04-Matts#1x.png MATTS.png
mv 05-Jackson#4x.png JACKSON.png

redirect output of loop to current reading file

I have simple script that looks like
for file in `ls -rlt *.rules | awk '{print $9}'`
do
cat $file | awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file
done
How can I redirect the output of awk to the same file that it is reading, so the action is performed in place?
Before running the above script, the files have data like
123|test||
After running the script, the files should have data like
123|test|2017_04_05|2017_04_05
You cannot replace your files on the fly like this: the shell truncates the output file before awk starts reading it.
The way is to use temporary file, then replace the current:
for file in `ls -1 *.rules `
do
TMP_FILE=/tmp/${file}_$$
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file > ${TMP_FILE}
mv ${TMP_FILE} $file
done
I would modify Michael Vehrs otherwise good answer as follows:
ls -rt *.rules | while read file
do
TMP_FILE="/tmp/${file}_$$"
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) \
'!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" "$file" > "$TMP_FILE"
mv "$TMP_FILE" "$file"
done
Your question uses ls(1) to sort the files by time, oldest first. The above preserves that property. I removed the {} braces because they add nothing in a shell script if the variable name isn't being interpolated, and quotes to cope with filenames that include whitespace.
If time-order doesn't matter, I'd consider an inside-out solution: in awk, write to a temporary file instead of standard output, and then rename it with system in an END block. Then if something goes wrong your input is preserved.
First of all, it is silly to use a combination of ls -rlt and awk when the only thing you need is the file name. You don't even need ls because the shell glob is expanded by the shell, not ls. Simply use for file in *.rules. Since the date would seem to be the same for every file (unless you run the command at midnight), it is sufficient to calculate it in advance:
date=$(date +%Y"_"%m"_"%d)
for file in *.rules
do
TMP_FILE=$(mktemp ${file}_XXXXXX)
awk -F"|" -v DATE=${date} '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file > ${TMP_FILE}
mv ${TMP_FILE} $file
done
However, since awk also knows which file it is reading, you could do something like this:
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) \
'!$3{$3=DATE} !$4{$4=DATE} { print > FILENAME ".tmp" }' OFS="|" *.rules
rename .tmp "" *.rules.tmp
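For completeness: GNU awk 4.1+ can do the temp-file-and-rename dance itself through its inplace extension, so no shell loop is needed at all (a sketch; gawk-specific):

```shell
# Rewrite every *.rules file in place, filling empty fields 3 and 4
# with today's date. Requires gawk 4.1+ for -i inplace.
gawk -i inplace -F'|' -v OFS='|' -v DATE="$(date +%Y_%m_%d)" \
    '!$3 {$3=DATE} !$4 {$4=DATE} 1' *.rules
```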

awk Getting ALL line but last field with the delimiters

I have to make a one-liner that renames all files in the current directory
that end in ".hola" to ".txt".
For example:
sample.hola and name.hi.hola will be renamed to sample.txt and name.hi.txt respectively
I was thinking about something like:
ls -1 *.hola | awk '{NF="";print "$0.hola $0.txt"}' (*)
And then passing the stdin to xargs mv -T with a |
But the output of (*) for the example would be sample and name hi.
How do I get the output name.hi for name.hi.hola using awk?
Why would you want to involve awk in this?
$ for f in *.hola; do echo mv "$f" "${f%hola}txt"; done
mv name.hi.hola name.hi.txt
mv sample.hola sample.txt
Remove the echo when you're happy with the output.
Well, for your specific problem, I recommend the rename command. Depending on the version on your system, you can do either rename -s .hola .txt *.hola, or rename 's/\.hola$/.txt/' *.hola.
Also, you shouldn't use ls to get filenames. When you run ls *.hola, the shell expands *.hola to a list of all the filenames matching that pattern, and ls is just a glorified echo at that point. You can get the same result using e.g. printf '%s\n' *.hola without running any program outside the shell.
And your awk is missing any attempt to remove the .hola. If you have GNU awk, you can do something like this:
awk -F. -v OFS=. '{old=$0; NF-=1; new=$0".txt"; print old" "new}'
That won't work on BSD/MacOS awk. In that case you can do something like this:
awk -F. '{
old=$0; new=$1;
for (i=2;i<NF;++i) { new=new"."$i };
print old" "new".txt"; }'
Either way, I'm sure @EdMorton probably has a better awk-based solution.
How about this? Simple and straightforward:
for file in *.hola; do mv "$file" "${file/%hola/txt}"; done

Resources