How to save a list of all files in a directory in a single text file and add prefixes and suffixes? - bash

I am trying to save a list of files in a directory into a single file using
ls > output.txt
Let's say we have in the directory:
a.txt
b.txt
c.txt
I want to modify the names of these files in the output.txt to be like:
1a.txt$
1b.txt$
1c.txt$

Another easy way is to use awk to change the content and save it back via a .tmp file.
This prints the content the way you want, adding "1" at the beginning and "$" at the end of each line (using $0, the whole line, so names containing spaces survive):
awk '{print "1" $0 "$"}' output.txt
You can then save it back to the original file by chaining the commands with && (if the first succeeds, the next runs). The .tmp step matters: redirecting output.txt onto itself would truncate it before awk could read it.
awk '{print "1" $0 "$"}' output.txt > output.txt.tmp && mv output.txt.tmp output.txt
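If you prefer sed, a roughly equivalent sketch (in s/$/$/ the first $ anchors the end of the line and the second is a literal dollar sign):
sed 's/^/1/; s/$/$/' output.txt > output.txt.tmp && mv output.txt.tmp output.txt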

#!/bin/sh -x
# rename every .txt file, prepending "1" to its name
for f in *.txt
do
    nf=$(echo "${f}" | sed 's#^#1#')    # new name: "1" + old name
    mv -v "${f}" "${nf}"
done
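If the goal is only to decorate the names inside output.txt, rather than rename the files on disk, a minimal sketch using printf (assuming the files match *.txt):
# write "1<name>$" lines to output.txt without touching the files themselves
for f in *.txt
do
    printf '1%s$\n' "$f"
done > output.txt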

Related

Copy file with filename based on grep output

I have a collection of files that all have a specific sequence in them. The files are named sequentially, and I want to copy over the first instance of each file that has a unique sequence.
For example,
1.txt Content: 1[Block]Alpha[/Block]1
2.txt Content: 2[Block]Beta[/Block]2
3.txt Content: 3[Block]Charlie[/Block]3
4.txt Content: 4[Block]Alpha[/Block]4
I want the output to be
Alpha.txt Content: 1[Block]Alpha[/Block]1
Beta.txt Content: 2[Block]Beta[/Block]2
Charlie.txt Content: 3[Block]Charlie[/Block]3
4.txt is skipped, since it contains 'Alpha', which a previous file already matched.
Currently, I have the following:
ls | sort -r | xargs grep -oE -m 1 '[Block].{0,40}[/Block]'
#which returns:
1.txt:[Block]Alpha[Block]
2.txt:[Block]Beta[Block]
3.txt:[Block]Charlie[Block]
4.txt:[Block]Alpha[Block]
I want to separate the filename to the left of the ':' and rename the file to either everything to the right of it (including the Block tags) plus .txt, or just Alpha.txt (for example).
cp has the -n flag for no overwriting, so as long as I do it in sequence I should have no issue there, but I am a bit lost on how to continue.
Here is a solution that uses one awk process to do the search and extract the filenames and the text between the block tags. On the first occurrence in each file, it checks whether the matched text has been used already; if not, it prints, then goes to the next file. The output is piped to xargs -n2 with the cp command.
#!/bin/bash
awk '/\[Block\].*\[\/Block\]/ {
    gsub(/^.*\[Block\]/,""); gsub(/\[\/Block\].*$/,"")   # keep only the text between the tags
    if (!a[$0]++) print FILENAME, $0 ".txt"              # first file per pattern wins
    nextfile                                             # one match per file is enough
}' *.txt | xargs -n2 echo cp -n --
Note: remove echo after you are done with testing.
Testing with your sample files:
> sh test.sh
cp -n -- 1.txt Alpha.txt
cp -n -- 2.txt Beta.txt
cp -n -- 3.txt Charlie.txt
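If you would rather not depend on xargs pairing the arguments, the same awk output can feed a while read loop instead (a sketch; like the xargs version, it assumes the file names and patterns contain no whitespace):
awk '/\[Block\].*\[\/Block\]/ {
    gsub(/^.*\[Block\]/,""); gsub(/\[\/Block\].*$/,"")
    if (!a[$0]++) print FILENAME, $0 ".txt"
    nextfile
}' *.txt |
while read -r src dst; do
    cp -n -- "$src" "$dst"    # -n: never overwrite an existing copy
done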
In your case, you want to rename the files in a directory using a pattern matched from the content of those files, and remove any file whose pattern duplicates another's?
I have tested this in the directory /tmp/test. In this directory I have 4 files (1.txt, 2.txt, 3.txt, 4.txt) and wrote a shell script to perform the requirement.
The shell script is as below:
#!/bin/bash
cd /tmp/test
files=$(ls)
pattern_list=""
for i in $files; do
    # strip the literal "Block" tags, then keep the run of letters
    pattern=$(sed "s/Block//g" "$i" | grep -o "[[:alpha:]][[:alpha:]]*")
    if ! echo "$pattern_list" | grep -qw "$pattern"; then
        echo "Rename $i to ${pattern}.txt"
        mv "$i" "${pattern}.txt"
        pattern_list+="$pattern "
    else
        rm "$i"
    fi
done
Brief explanation:
List all current files in /tmp/test
Read each file to capture the file name and pattern (Alpha, Beta, Charlie, ...)
Rename the file with the new pattern
Remove the file if its pattern is duplicated
The result is as below:
$ bash /tmp/myscript.sh
Rename 1.txt to Alpha.txt
Rename 2.txt to Beta.txt
Rename 3.txt to Charlie.txt
$ ls
Alpha.txt Beta.txt Charlie.txt
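On bash 4 or newer, an associative array is a sturdier way to track seen patterns than a space-separated string; a minimal sketch of the same logic:
#!/bin/bash
cd /tmp/test || exit 1
declare -A seen                 # patterns handled so far
for i in *.txt; do
    pattern=$(sed 's/Block//g' "$i" | grep -o '[[:alpha:]][[:alpha:]]*')
    if [[ -z ${seen[$pattern]} ]]; then
        seen[$pattern]=1
        echo "Rename $i to ${pattern}.txt"
        mv -- "$i" "${pattern}.txt"
    else
        rm -- "$i"              # duplicate pattern: drop the file
    fi
done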

How to Write A Second Column in Bash in an Existing txt file

I need to extract the ID name of a parent directory and put that in a tab-delimited text file. Then I need to extract the names of the contents of that folder and put them in the same row as that ID name I first extracted. Essentially, Column 1 should list the parent directory name, Column 2 should list the name of the first file in that directory, Column 3 should be the name of the next file, and so on and so forth.
/path/to/folder/ID/
pwd | xargs echo | awk -F "/" '{print $n; exit}' >> Text.txt
where 'n' is the location of the desired parent folder (in this case, ID). This works fine, and writes something like "ID001" to my Text.txt file.
I try the same little hack again, using my pwd as my input to xargs, listing out the contents of that folder, and writing the names to my Text.txt file:
pwd | xargs echo | awk -F "/" '{print $7; exit}' >> Text.txt | pwd | xargs echo | xargs ls | xargs echo >> Text.txt
But instead of
ID001 file1 file2
I get
file1 file2
ID001
Which is mostly to be expected, given the commands. I am confused as to why my file names are being appended to the first row and not to the last row. The only related article I could find was this for writing a specific column to a CSV, but it wasn't quite what I was looking for.
This find plus awk pipeline MAY be what you're trying to do:
$ ls tmp
a b
$ find tmp -print | awk '{sub("^[^/]+/",""); printf "%s%s", sep, $0; sep="\t"} END{print ""}'
tmp a b
YMMV if your file names contain tabs or newlines of course.
You probably want to do that as multiple commands, for ease of understanding.
You can put the commands in a bash script.
Example scenario
$ pwd
/Users/pa357856/test/tmp/foo
$ ls
file1.txt file2.txt
commands -
$ parentDIR=`pwd | xargs echo | awk -F "/" '{print $6}'`
$ filesList=`ls`
$ echo "$parentDIR" "$filesList" >> test.txt
Result -
$ cat test.txt
foo file1.txt file2.txt
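A sketch that avoids hard-coding the awk field number: basename keeps only the last path component (assuming the same layout as above):
$ parentDIR=$(basename "$PWD")    # "foo" in the example above
$ filesList=$(ls)
$ echo "$parentDIR" $filesList >> test.txt    # $filesList unquoted so the names land on one line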

Combine multiple files into one including the file name

I have been looking around trying to combine multiple text files into one, including the name of each file.
My current file content is:
1111,2222,3333,4444
What I'm after is:
File1,1111,2222,3333,4444
File1,1111,2222,3333,4445
File1,1111,2222,3333,4446
File1,1111,2222,3333,4447
File2,1111,2222,3333,114444
File2,1111,2222,3333,114445
File2,1111,2222,3333,114446
I found multiple examples of how to combine them all, but nothing that combines them including the file name.
Could you please try the following, considering that your Input_file names have .csv extensions:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.csv > output_file
After seeing OP's comments: if the file extensions are .txt, then try:
awk 'BEGIN{OFS=","} {print FILENAME,$0}' *.txt > output_file
Assuming all your files have a .txt extension and contain only one line as in the example, you can use the following code:
for f in *.txt; do echo "$f,$(cat "$f")"; done > output.log
where output.log is the output file.
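If some of the files turn out to have more than one line, a while read loop prefixes every line instead (a sketch):
for f in *.txt; do
    while IFS= read -r line; do
        printf '%s,%s\n' "$f" "$line"
    done < "$f"
done > output.log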
Well, it works:
printf "%s\n" *.txt |
xargs -n1 -d $'\n' bash -c 'xargs -n1 -d $'\''\n'\'' printf "%s,%s\n" "$1" <"$1"' --
First, output a newline-separated list of files.
Then, for each file, xargs executes bash.
Inside that bash, an inner xargs runs once per line of the file,
executing printf "%s,%s\n" <filename> <line> for each line of input.
Tested in repl.
Solved using grep "" *.txt -I > $filename.
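Note that grep's prefix separator is a colon, not a comma; if the comma-separated form is needed, the first colon on each line can be rewritten (a sketch, assuming the file names themselves contain no colons):
grep "" *.txt | sed 's/:/,/' > "$filename"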

How to rename a CSV file from a value in the CSV file

I have 100 1-line CSV files. The files are currently labeled AAA.txt, AAB.txt, ABB.txt (after I used split -l 1 on them). The first field in each of these files is what I want to rename the file as, so instead of AAA, AAB and ABB it would be the first value.
Input CSV (filename AAA.txt)
1234ABC, stuff, stuff
Desired Output (filename 1234ABC.csv)
1234ABC, stuff, stuff
I don't want to edit the content of the CSV itself, just change the filename.
something like this should work:
for f in ./*; do new_name=$(head -1 "$f" | cut -d, -f1); cp "$f" dir/"$new_name".csv; done
Move them into a new dir just in case something goes wrong, or in case you need the original file names.
Starting with your original file before splitting,
$ awk -F, '{print > ($1".csv")}' originalFile.csv
does it all in one shot.
This will store the whole input file into a file named after column 1 of that input file:
awk -F, '{print $0 > ($1".csv") }' aaa.txt
In a terminal, change directory to where the files are, e.g. cd /path/to/directory, and then use the following compound command:
for f in *.txt; do echo mv -n "$f" "$(awk -F, '{print $1}' "$f").csv"; done
Note: there is an intentional echo command there for you to test with; it only prints out the mv commands so you can see that the outcome is what you wish. You can then run it again, removing just echo from the compound command, to actually rename the files via mv.
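With the sample file above, the dry run prints something like:
mv -n AAA.txt 1234ABC.csv
Removing echo then performs that rename for real.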

How to remove the header (first line) of all files in a directory, keeping the file names the same, in Unix

Example: In my folder
C:\users\inputfiles contains
file1.txt
file2.txt
file3.txt, each with a header.
I need to remove the header in each file and keep the data in the same file (the file name should not change) using a shell script.
sed and tail will help you with this.
No output redirection is required if sed is used.
sed -i '1d' filename
If you're using tail, use an intermediate tmp file to store the contents, then move the tmp file back over the original file name.
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
tail is usually faster than sed for this.
Run this command for each of your files.
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
It should work.
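To run it over every .txt file in the directory, the same command drops into a loop (a sketch):
for FILE in *.txt; do
    tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE"
done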
I assume the headers all have the same number of lines: 2.
Give this a try to remove the lines before line 3 in each file with a .txt suffix in the current directory:
sed -n -i '3,$ p' *.txt
-i: modify each file directly
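Equivalently, you can delete the header lines instead of printing from line 3 onward (a sketch; -i works like this in GNU sed, while BSD/macOS sed needs -i ''):
sed -i '1,2d' *.txt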
