awk: getting all of a line but the last field, with the delimiters - bash

I have to make a one-liner that renames all files in the current directory
that end in ".hola" to ".txt".
For example:
sample.hola and name.hi.hola will be renamed to sample.txt and name.hi.txt respectively
I was thinking about something like:
ls -1 *.hola | awk '{NF="";print "$0.hola $0.txt"}' (*)
And then passing the stdin to xargs mv -T with a |
But the output of (*) for the example would be sample and name hi.
How do I get the output name.hi for name.hi.hola using awk?

Why would you want to involve awk in this?
$ for f in *.hola; do echo mv "$f" "${f%hola}txt"; done
mv name.hi.hola name.hi.txt
mv sample.hola sample.txt
Remove the echo when you're happy with the output.
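The `${f%hola}txt` trick above can be traced on its own; this is just the parameter expansion in isolation, with the sample names from the question:

```shell
# "${f%hola}" strips the shortest trailing "hola"; appending "txt" gives the new name.
f="name.hi.hola"
renamed="${f%hola}txt"
echo "$renamed"   # name.hi.txt
```

Because the expansion only touches the literal suffix, any dots earlier in the name (name.hi) survive untouched.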

Well, for your specific problem, I recommend the rename command. Depending on the version on your system, you can do either rename -s .hola .txt *.hola, or rename 's/\.hola$/.txt/' *.hola.
Also, you shouldn't use ls to get filenames. When you run ls *.hola, the shell expands *.hola to a list of all the filenames matching that pattern, and ls is just a glorified echo at that point. You can get the same result using e.g. printf '%s\n' *.hola without running any program outside the shell.
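A quick sketch of that point, run in a throwaway directory (the filenames are the question's examples):

```shell
# The shell itself expands the glob; no external ls needed.
dir=$(mktemp -d)
cd "$dir"
touch sample.hola name.hi.hola
printf '%s\n' *.hola
# name.hi.hola
# sample.hola
```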
And your awk is missing any attempt to remove the .hola. If you have GNU awk, you can do something like this (note OFS=. so that $0 is rebuilt with dots, not spaces, when NF is decremented):
awk -F. -v OFS=. '{old=$0; NF-=1; new=$0".txt"; print old" "new}'
That won't work on BSD/MacOS awk. In that case you can do something like this:
awk -F. '{
old=$0; new=$1;
for (i=2;i<NF;++i) { new=new"."$i };
print old" "new".txt"; }'
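For reference, here is the portable loop variant run on the two sample names from the question, so you can see the pairs it emits:

```shell
# Portable (POSIX awk) variant: rebuild the name from all fields but the last.
printf 'sample.hola\nname.hi.hola\n' |
awk -F. '{
  old = $0; new = $1
  for (i = 2; i < NF; ++i) { new = new "." $i }
  print old " " new ".txt"
}'
# sample.hola sample.txt
# name.hi.hola name.hi.txt
```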
Either way, @EdMorton probably has a better awk-based solution.

How about this? Simple and straightforward:
for file in *.hola; do mv "$file" "${file/%hola/txt}"; done

Related

How to rename a CSV file from a value in the CSV file

I have 100 1-line CSV files. The files are currently labeled AAA.txt, AAB.txt, ABB.txt (after I used split -l 1 on them). The first field in each of these files is what I want to rename the file as, so instead of AAA, AAB and ABB it would be the first value.
Input CSV (filename AAA.txt)
1234ABC, stuff, stuff
Desired Output (filename 1234ABC.csv)
1234ABC, stuff, stuff
I don't want to edit the content of the CSV itself, just change the filename
Something like this should work:
for f in ./*.txt; do new_name=$(head -1 "$f" | cut -d, -f1); cp "$f" dir/"$new_name".csv; done
Move them into a new dir just in case something goes wrong, or in case you need the original file names.
Starting with your original file before splitting:
$ awk -F, '{print > ($1".csv")}' originalFile.csv
and do all in one shot.
This writes each line of the input file to a .csv file named after the line's first column:
awk -F, '{print $0 > ($1".csv") }' aaa.txt
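A minimal end-to-end sketch of the one-shot awk split, run in a scratch directory (the filenames and contents are made up for the demo):

```shell
# Each line lands in a file named after its first comma-separated field.
dir=$(mktemp -d)
cd "$dir"
printf '1234ABC, stuff, stuff\n5678DEF, more, more\n' > originalFile.csv
awk -F, '{print > ($1".csv")}' originalFile.csv
ls   # 1234ABC.csv  5678DEF.csv  originalFile.csv
```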
In a terminal, changed directory, e.g. cd /path/to/directory that the files are in and then use the following compound command:
for f in *.txt; do echo mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"; done
Note: There is an intentional echo command there for you to test with; it will only print out the mv command so you can see that the outcome is what you wish. You can then run it again, removing just echo from the compound command, to actually rename the files via mv.
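The rename loop can be exercised end to end in a scratch directory; the file and its contents below are invented for the demo, and the loop relies on each file being a single line (as in the question), since awk would otherwise print $1 once per line:

```shell
# Rename each one-line CSV after its first field.
dir=$(mktemp -d)
cd "$dir"
printf '1234ABC, stuff, stuff\n' > AAA.txt
for f in *.txt; do
  mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"
done
ls   # 1234ABC.csv
```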

Bash Shell Scripting assigning new variables for output of a grep search

EDIT 2:
I've decided to re-write this in order to better portray my outcome.
I'm currently using this code to output a list of files within various directories:
for file in /directoryX/*.txt
do
grep -rl "Annual Compensation" $file
done
The output shows all files that have a certain table I'm trying to extract in a layout like this:
txtfile1.txt
txtfile2.txt
txtfile3.txt
I have been using this awk command on each individual .txt file to extract the table and then send it to a .csv:
awk '/Annual Compensation/{f=1} f{print; if (/<\/TABLE>/) exit}' txtfile1.txt > txtfile1.csv
My goal is to find a command that will run my awk command against each file in the list all at once. Thank you to those that have provided suggestions already.
If I understand what you're asking, I think what you want to do is add a line after the grep, or instead of the grep, that says:
awk '/Annual Compensation/{f=1} f{print; if (/<\/TABLE>/) exit}' "$file" > "${file}_new.csv"
When you say ${file}_new.csv, it expands the file variable, then adds the string "_new.csv" to it. That's what you're shooting for, right?
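Tracing the expansion on one of the question's filenames makes the result concrete; the `${file%.txt}` variant shown second is my suggestion, not part of the original answer, for when you want to drop the old extension first:

```shell
file="txtfile1.txt"
echo "${file}_new.csv"        # txtfile1.txt_new.csv
echo "${file%.txt}_new.csv"   # txtfile1_new.csv
```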
Modifying your code:
for file in /directoryX/*.txt
do
files+=($(grep -rl "Annual Compensation" "$file"))
done
for f in "${files[@]}"; do
  awk '/Annual Compensation/{f=1} f{print; if (/<\/TABLE>/) exit}' "$f" > "$f"_new.csv
done
Alternative code:
files+=($(grep -rl "Annual Compensation" /directoryX/*))
for f in "${files[@]}"; do
  awk '/Annual Compensation/{f=1} f{print; if (/<\/TABLE>/) exit}' "$f" > "$f"_new.csv
done
In both cases, the grep results and awk results are not verified by me; it is just a copy-paste of your code.

How to apply the same awk action to all the files in a folder?

I had written an awk code for deleting all the lines ending in a colon from a file. But now I want to run this particular awk action on a whole folder containing similar files.
awk '!/:$/' qs.txt > fin.txt
awk '{print $3 " " $4}' fin.txt > out.txt
You could wrap your awk command in a loop in your shell such as bash.
myfiles=mydirectory/*.txt
for file in $myfiles
do
b=$(basename "$file" .txt)
awk '!/:$/' "$file" > "mydirectory/$b.out"
done
EDIT: improved quoting as commenters suggested
If you like it better, you can use "${file%.txt}" instead of $(basename "$file" .txt).
Aside: My own preference runs to basename just because man basename is easier for me than man -P 'less -p "^ Param"' bash (when that is the relevant heading on the particular system). Please accept this quirk of mine and let's not discuss info and http://linux.die.net/man/ and whatever.
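One difference worth noting between the two spellings: basename also strips the directory, while `${file%.txt}` keeps it. A quick check:

```shell
file="mydirectory/qs.txt"
basename "$file" .txt    # qs
echo "${file%.txt}"      # mydirectory/qs
```

So in the loop above, using `${file%.txt}` directly as an output path keeps the result next to the input, whereas the basename form needs the directory re-attached.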
You could use sed. Just run the below command on the directory in which the files you want to change was actually stored.
sed -i '/:$/d' *.*
This will create new files in an empty directory, with the same name.
mkdir NEWFILES
for file in `find . -name "*name_pattern*"`
do
awk '!/:$/' "$file" > fin.txt
awk '{print $3 " " $4}' fin.txt > NEWFILES/"$file"
done
done
After that you just need to
cp -fr NEWFILES/* .

awk execute same command on different files one by one

Hi, I have 30 txt files in a directory, each containing 4 columns.
How can I execute the same command on each file, one by one, and direct the output to a different file?
The command I am using is below, but it's being applied to all the files at once and giving a single output. All I want is to process each file one by one and direct each output to a new file.
start=$1
patterns=''
for i in $(seq -43 -14); do
patterns="$patterns /cygdrive/c/test/kpi/SIGTRAN_Load_$(exec date '+%Y%m%d' --date="-${i} days ${start}")*"; done
cat /cygdrive/c/test/kpi/*$patterns | sed -e "s/\t/,/g" -e "s/ /,/g"| awk -F, 'a[$3]<$4{a[$3]=$4} END {for (i in a){print i FS a[i]}}'| sed -e "s/ /0/g"| sort -t, -k1,2> /cygdrive/c/test/kpi/SIGTRAN_Load.csv
Something like this
for fileName in /path/to/files/foo*.txt
do
mangleFile "$fileName"
done
will mangle a list of files you give via globbing. If you want to generate the file name patterns as in your example, you can do it like this:
for i in $(seq -43 -14)
do
for fileName in /cygdrive/c/test/kpi/SIGTRAN_Load_"$(exec date '+%Y%m%d' --date="-${i} days ${start}")"*
do
mangleFile "$fileName"
done
done
This way the code stays much more readable, even if shorter solutions may exist.
The mangleFile of course then will be the awk call or whatever you would like to do with each file.
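One way to flesh out the placeholder: define mangleFile as a shell function wrapping whatever per-file command you need. The body below is an assumption for demonstration (it just reports a line count); substitute your real awk pipeline:

```shell
# Stand-in for the answer's mangleFile placeholder.
mangleFile() {
    # Example action: report the file's line count.
    awk 'END { print FILENAME ": " NR " lines" }' "$1"
}

dir=$(mktemp -d)
cd "$dir"
printf 'a\nb\n' > foo1.txt
for fileName in foo*.txt
do
    mangleFile "$fileName"
done
# foo1.txt: 2 lines
```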
Use the following idiom:
for file in *
do
./your_shell_script_containing_the_above.sh "$file" > some_unique_id
done
You need to run a loop on all the matching files:
for i in /cygdrive/c/test/kpi/*$patterns; do
tr '[:space:]\n' ',\n' < "$i" | awk -F, 'a[$3]<$4{a[$3]=$4} END {for (i in a){print i FS a[i]}}'| sed -e "s/ /0/g"| sort -t, -k1,2 > "/cygdrive/c/test/kpi/SIGTRAN_Load-$(basename "$i").csv"
done
PS: I haven't tried hard to refactor your piped commands, which can probably be shortened too.

String Manipulation in Bash

I am a newbie in Bash and I am doing some string manipulation.
I have the following file among other files in my directory:
jdk-6u20-solaris-i586.sh
I am doing the following to get jdk-6u20 in my script:
myvar=`ls -la | awk '{print $9}' | egrep "i586" | cut -c1-8`
echo $myvar
but now I want to convert jdk-6u20 to jdk1.6.0_20. I can't seem to figure out how to do it.
It must be as generic as possible. For example if I had jdk-6u25, I should be able to convert it at the same way to jdk1.6.0_25 so on and so forth
Any suggestions?
Depending on exactly how generic you want it, and how standard your inputs will be, you can probably use AWK to do everything. By using FS="regexp" to specify field separators, you can break down the original string by whatever tokens make the most sense, and put them back together in whatever order using printf.
For example, assuming both dashes and the letter 'u' are only used to separate fields:
myvar="jdk-6u20-solaris-i586.sh"
echo $myvar | awk 'BEGIN {FS="[-u]"}; {printf "%s1.%s.0_%s",$1,$2,$3}'
Flavour according to taste.
Using only Bash:
for file in jdk*i586*
do
file="${file%*-solaris*}"
file="${file/-/1.}"
file="${file/u/.0_}"
do_something_with "$file"
done
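The three expansions in that loop can be traced step by step on the question's filename:

```shell
file="jdk-6u20-solaris-i586.sh"
file="${file%*-solaris*}"   # jdk-6u20     (strip the trailing -solaris... part)
file="${file/-/1.}"         # jdk1.6u20    (first "-" becomes "1.")
file="${file/u/.0_}"        # jdk1.6.0_20  (first "u" becomes ".0_")
echo "$file"
```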
I think that sed is the command for you.
You can try this snippet:
for fname in *; do
newname=`echo "$fname" | sed 's,^jdk-\([0-9]\)u\([0-9][0-9]*\)-.*$,jdk1.\1.0_\2,'`
if [ "$fname" != "$newname" ]; then
echo "old $fname, new $newname"
fi
done
awk '{ if (match($9,"i586")) { gsub("jdk-6u20","jdk1.6.0_20"); print $9 } }'
The if(match()) supersedes the egrep bit if you want to use it. You could use substr($9,1,8) instead of cut as well.
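A self-contained version of the match()/gsub() idea; note I use $0 here instead of the $9 from the ls -la pipeline so the example stands alone:

```shell
echo "jdk-6u20-solaris-i586.sh" |
awk '{ if (match($0, "i586")) { gsub("jdk-6u20", "jdk1.6.0_20"); print } }'
# jdk1.6.0_20-solaris-i586.sh
```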
garph0 has a good idea with sed; you could do
myvar=`ls jdk*i586.sh | sed 's/jdk-\([0-9]\)u\([0-9]\+\).\+$/jdk1.\1.0_\2/'`
Your need for awk in there is an artifact of the -l switch on ls. For pattern substitution on lines of text, sed is the long-time champion:
ls | sed -n '/^jdk/s/jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\)$/jdk1.\1.0_\2/p'
This was written in "old-school" sed which should have greater portability across platforms. The expression says:
-n: don't print lines unless explicitly asked to
on lines beginning with 'jdk':
on a line that contains only "jdk-IntegerAuIntegerB"
change it to "jdk1.IntegerA.0_IntegerB"
and print it
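Feeding the sed a line of the exact "jdk-IntegerAuIntegerB" shape the bullets describe shows the substitution working (note the trailing $ means it will not match a full filename like jdk-6u20-solaris-i586.sh):

```shell
echo "jdk-6u20" |
sed -n '/^jdk/s/jdk-\([0-9][0-9]*\)u\([0-9][0-9]*\)$/jdk1.\1.0_\2/p'
# jdk1.6.0_20
```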
Your sample becomes even simpler as:
myvar=`echo *solaris-i586.sh | sed 's/-solaris-i586\.sh//'`
