Combine multiple files into one including the file name - bash

I have been looking around, trying to combine multiple text files into one while including the name of the file.
My current file content is:
1111,2222,3333,4444
What I'm after is:
File1,1111,2222,3333,4444
File1,1111,2222,3333,4445
File1,1111,2222,3333,4446
File1,1111,2222,3333,4447
File2,1111,2222,3333,114444
File2,1111,2222,3333,114445
File2,1111,2222,3333,114446
I found multiple examples of how to combine them all, but nothing that combines them while including the file name.

Could you please try the following, assuming that your input file names have a .csv extension:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.csv > output_file
After seeing OP's comments: if the file extensions are .txt, then try:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.txt > output_file
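Note that FILENAME includes the extension, so the output lines will start with File1.txt (or File1.csv) rather than File1. If the bare name is wanted, a hedged variant strips the extension first; this is a sketch, not confirmed against the OP's data:
awk 'BEGIN{OFS=","} {fn=FILENAME; sub(/\.[^.]*$/,"",fn); print fn,$0}' *.txt > output_file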

Assuming all your files have a .txt extension and contain only one line as in the example, you can use the following code:
for f in *.txt; do echo "$f,$(cat "$f")"; done > output.log
where output.log is the output file.
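If the files can hold more than one line, a hedged alternative (still assuming .txt extensions) prefixes every line of every file with its file name:
# A minimal sketch: read each file line by line and print "name,line".
for f in *.txt; do
  while IFS= read -r line; do
    printf '%s,%s\n' "$f" "$line"
  done < "$f"
done > output.log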

Well, it works:
printf "%s\n" *.txt |
xargs -n1 -d $'\n' bash -c 'xargs -n1 -d $'\''\n'\'' printf "%s,%s\n" "$1" <"$1"' --
First, output a newline-separated list of files.
Then, for each file, xargs executes bash.
Inside bash, another xargs runs once per line of the file,
and it executes printf "%s,%s\n" <filename> <line> for each line of input.
Tested in repl.
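For illustration, this is roughly what the inner command runs for a single hypothetical file named File1.txt: the inner xargs feeds each line of the file to printf as the second argument, with the file name as the first.
xargs -n1 -d '\n' printf "%s,%s\n" "File1.txt" < File1.txt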

Solved using grep "" *.txt -I > $filename.
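Note that grep separates the file name from the line with a colon rather than a comma; if the comma-separated form from the question is wanted, a hedged follow-up swaps the first colon on each line:
grep "" *.txt | sed 's/:/,/' > output_file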

Related

How to output an awk command into a specific format and save to a file?

I am trying to output a list of video file names in a specific format using awk. The output should be saved to a txt file.
For example, I have the following list of video files:
01-20191006184929.mkv
02-2019.mkv
and the desired output would be:
file 01-20191006184929.mkv
file 02-2019.mkv
My current script just outputs the file name. How do I add "file" in front of each line?
That's my current script:
ls | awk '/.mkv$/' > output.txt
You could possibly use awk, but this is easier using a plain shell command:
printf "file %s\n" *.mkv > output.txt
printf is a shell builtin that repeats the quoted format string once for each .mkv filename.
Note: Don't parse ls
We shouldn't parse ls output; you could try a for loop instead.
for file_name in *.mkv
do
echo "file $file_name"
done
And using awk:
$ awk 'BEGIN{for(i=1;i<ARGC;i++)print "file",ARGV[i]}' *.mkv
Output:
file 01-20191006184929.mkv
file 02-2019.mkv
Explained:
$ awk '
BEGIN { # no need to touch the files
for(i=1;i<ARGC;i++) # for all parameter filenames
print "file",ARGV[i] # print
}' *.mkv # filename set goes here
Just for fun:
$ stat -c "file %n" *mkv
$ find -name '*mkv' -printf "name %p\n"

How to rename a CSV file from a value in the CSV file

I have 100 1-line CSV files. The files are currently labeled AAA.txt, AAB.txt, ABB.txt (after I used split -l 1 on them). The first field in each of these files is what I want to rename the file as, so instead of AAA, AAB and ABB it would be the first value.
Input CSV (filename AAA.txt)
1234ABC, stuff, stuff
Desired Output (filename 1234ABC.csv)
1234ABC, stuff, stuff
I don't want to edit the content of the CSV itself, just change the filename
something like this should work:
for f in ./*; do new_name=$(head -1 "$f" | cut -d, -f1); cp "$f" dir/"$new_name"; done
move them into a new dir just in case something goes wrong, or you need the original file names.
Starting with your original file before splitting:
$ awk -F, '{print > ($1".csv")}' originalFile.csv
and do it all in one shot.
This will store the whole input file in a CSV named after column 1 of the input file:
awk -F, '{print $0 > $1".csv" }' aaa.txt
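If the split produced many one-line files, a single awk pass can handle them all; this is a sketch assuming every split file ends in .txt, and close() keeps awk from exhausting its open-file limit when there are hundreds of outputs:
awk -F, '{out = $1 ".csv"; print > out; close(out)}' *.txt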
In a terminal, change directory, e.g. cd /path/to/directory the files are in, and then use the following compound command:
for f in *.txt; do echo mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"; done
Note: There is an intentional echo command there for you to test with; it will only print out the mv command so you can see that it produces the outcome you wish. You can then run it again, removing just echo from the compound command, to actually rename the files via mv.
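For example, with the AAA.txt file from the question, the dry run would print something like:
mv -n AAA.txt 1234ABC.csv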

How to apply the same awk action to all the files in a folder?

I had written an awk script for deleting all the lines ending in a colon from a file. But now I want to run this particular awk action on a whole folder of similar files.
awk '!/:$/' qs.txt > fin.txt
awk '{print $3 " " $4}' fin.txt > out.txt
You could wrap your awk command in a loop in your shell such as bash.
myfiles=mydirectory/*.txt
for file in $myfiles
do
b=$(basename "$file" .txt)
awk '!/:$/' "$b.txt" > "$b.out"
done
EDIT: improved quoting as commenters suggested
If you like it better, you can use "${file%.txt}" instead of $(basename "$file" .txt).
Aside: My own preference runs to basename just because man basename is easier for me than man -P 'less -p "^ Param"' bash (when that is the relevant heading on the particular system). Please accept this quirk of mine and let's not discuss info and http://linux.die.net/man/ and whatever.
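As a minimal sketch of the same loop using parameter expansion instead of basename, and folding both awk steps into a single pass (assuming the second command is meant to run on the filtered output):
for file in mydirectory/*.txt
do
  awk '!/:$/ {print $3, $4}' "$file" > "${file%.txt}.out"
done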
You could use sed. Just run the command below in the directory in which the files you want to change are actually stored.
sed -i '/:$/d' *.*
This will create new files in an empty directory, with the same name.
mkdir NEWFILES
for file in $(find . -name "*name_pattern*")
do
awk '!/:$/' "$file" > fin.txt
awk '{print $3 " " $4}' fin.txt > NEWFILES/"$file"
done
After that you just need to
cp -fr NEWFILES/* .

Remove Lines in Multiple Text Files that Begin with a Certain Word

I have hundreds of text files in one directory. For all files, I want to delete all the lines that begin with HETATM. I would need a csh or bash code.
I would think you would use grep, but I'm not sure.
Use sed like this:
sed -i -e '/^HETATM/d' *.txt
to process all files in place.
-i means "in place".
-e means to execute the command that follows.
/^HETATM/ means "find lines starting with HETATM", and the following d means "delete".
Make a backup first!
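With GNU sed, a backup suffix can be passed directly to -i, so the originals are kept alongside the edited files (BSD/macOS sed wants the suffix as a separate argument):
sed -i.bak '/^HETATM/d' *.txt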
If you really want to do it with grep, you could do this:
#!/bin/bash
for f in *.txt
do
grep -v "^HETATM" "%f" > $$.tmp && mv $$.tmp "$f"
done
It makes a temporary file of the output from grep (in file $$.tmp) and only overwrites your original file if the command executes successfully.
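A slightly more careful variant of the same pattern, assuming mktemp is available, uses a unique temporary file and cleans it up if grep produces no output:
#!/bin/bash
for f in *.txt
do
  tmp=$(mktemp) || exit 1
  # Overwrite the original only if the filtered output was produced.
  if grep -v "^HETATM" "$f" > "$tmp"; then
    mv "$tmp" "$f"
  else
    rm -f "$tmp"
  fi
done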
Using the -v option of grep to get all the lines that do not match:
grep -v '^HETATM' input.txt > output.txt

awk execute same command on different files one by one

Hi, I have 30 txt files in a directory, each containing 4 columns.
How can I execute the same command on each file, one by one, and direct the output to a different file for each?
The command I am using is below, but it is being applied to all the files at once and gives a single output. All I want is to call each file one by one and direct the output to a new file.
start=$1
patterns=''
for i in $(seq -43 -14); do
patterns="$patterns /cygdrive/c/test/kpi/SIGTRAN_Load_$(exec date '+%Y%m%d' --date="-${i} days ${start}")*"; done
cat /cygdrive/c/test/kpi/*$patterns | sed -e "s/\t/,/g" -e "s/ /,/g"| awk -F, 'a[$3]<$4{a[$3]=$4} END {for (i in a){print i FS a[i]}}'| sed -e "s/ /0/g"| sort -t, -k1,2> /cygdrive/c/test/kpi/SIGTRAN_Load.csv
Something like this
for fileName in /path/to/files/foo*.txt
do
mangleFile "$fileName"
done
will mangle a list of files you give via globbing. If you want to generate the file name patterns as in your example, you can do it like this:
for i in $(seq -43 -14)
do
for fileName in /cygdrive/c/test/kpi/SIGTRAN_Load_"$(exec date '+%Y%m%d' --date="-${i} days ${start}")"*
do
mangleFile "$fileName"
done
done
This way the code stays much more readable, even if shorter solutions may exist.
The mangleFile will of course then be the awk call, or whatever you would like to do with each file.
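As a hedged sketch, mangleFile could be a small function wrapping the question's pipeline, writing one output per input file (the .csv output name here is an assumption):
mangleFile () {
  # Run the question's pipeline on a single file; "$1" is the input file.
  sed -e "s/\t/,/g" -e "s/ /,/g" "$1" |
    awk -F, 'a[$3]<$4{a[$3]=$4} END {for (i in a){print i FS a[i]}}' |
    sed -e "s/ /0/g" | sort -t, -k1,2 > "$1.csv"
}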
Use the following idiom:
for file in *
do
./your_shell_script_containing_the_above.sh "$file" > some_unique_id
done
You need to run a loop on all the matching files:
for i in /cygdrive/c/test/kpi/*$patterns; do
tr '[:space:]\n' ',\n' < "$i" | awk -F, 'a[$3]<$4{a[$3]=$4} END {for (i in a){print i FS a[i]}}'| sed -e "s/ /0/g"| sort -t, -k1,2 > "/cygdrive/c/test/kpi/SIGTRAN_Load-$(basename "$i").csv"
done
PS: I haven't tried to refactor your piped commands much; they can probably be shortened too.
