How to combine multiple files into one file - bash

I have multiple files in one directory that I want to combine into a single file using Bash. The output needs to contain each file name followed by its contents. For example:
$ cat "File 1"
store
$ cat "File 2"
bank
$ cat "File 3"
car
The desired output is a single file named master:
$ cat master
File 1
store
File 2
bank
File 3
car

for FILE in "File 1" "File 2" "File 3"; do
    echo "$FILE"
    cat "$FILE"
done > master

What you have asked for is what cat is meant for; it's short for concatenate, because it concatenates the contents of files together.
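For example, plain concatenation of the question's three files would just be:
cat "File 1" "File 2" "File 3" > master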
But it doesn't inject the filenames into the output. If you want the filenames there, your best bet is probably a loop:
for f in "File 1" "File 2" "File 3"; do
    printf '%s\n' "$f"
    cat "$f"
done > master

This will do the job:
for f in "File "{1..3}; do
    echo "$f"
    cat "$f"
done > master

With GNU sed:
sed -s '1F' *
From the GNU sed manual:
-s, --separate
    By default, sed will consider the files specified on the command line as a single continuous long stream. This GNU sed extension allows the user to consider them as separate files: range addresses (such as /abc/,/def/) are not allowed to span several files, line numbers are relative to the start of each file, $ refers to the last line of each file, and files invoked from the R command are rewound at the start of each file.
F
    Print out the file name of the current input file (with a trailing newline).
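Applied to the question's files (assuming a GNU sed new enough to provide the F command, 4.2.2 or later):
sed -s '1F' "File 1" "File 2" "File 3" > master
With -s making addresses per-file, the address 1 matches the first line of each input, F prints that file's name, and the default auto-print then emits each line.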

Related

Script that lists all file names in a folder, along with some text after each name, into a txt file

I need to create a text file that lists all the files in a folder, each followed by a comma and the number 15. For example:
My folder has video.mp4, video2.mp4, picture1.jpg, picture2.jpg, picture3.png
I need the text file to read as follows:
video.mp4,15
video2.mp4,15
picture1.jpg,15
picture2.jpg,15
picture3.png,15
No spaces, just filename.ext,15 on each line. I am using a Raspberry Pi. I am aware that the command ls > filename.txt would put all the file names into a file, but how would I get a ,15 after every line?
Thanks
bash one-liner:
for f in *; do echo "$f,15" >> filename.txt; done
To avoid opening the output file on each iteration you may redirect the entire output with > filename.txt:
for f in *; do echo "$f,15"; done > filename.txt
$ printf '%s,15\n' *
picture1.jpg,15
picture2.jpg,15
picture3.png,15
video.mp4,15
video2.mp4,15
This will work if those are the only files in the directory. The format specifier %s,15\n is applied to each of printf's arguments (the names in the current directory), so each name is output with ,15 and a newline appended.
If there are other files, you can also list the names explicitly; this works whether or not files by those names actually exist:
$ printf '%s,15\n' video.mp4 video2.mp4 picture1.jpg picture2.jpg "whatever this is"
video.mp4,15
video2.mp4,15
picture1.jpg,15
picture2.jpg,15
whatever this is,15
Or, on all MP4, PNG and JPEG files:
$ printf '%s,15\n' *.mp4 *.jpg *.png
video.mp4,15
video2.mp4,15
picture1.jpg,15
picture2.jpg,15
picture3.png,15
Then redirect this to a file with printf ...as above... >output.txt.
If you're using Bash, then this will not make use of any external utility, as printf is built into the shell.
You need to do something like this:
#!/bin/bash
for i in $(ls folder_name); do
    echo $i",15" >> filename.txt
done
It's possible to do this in one line; however, if you want to create a script, consider code readability in the long run.
Edit 1: better solution
As @CristianRamon-Cortes suggested in the comments below, you should not rely on the output of ls because of the problems explained in this discussion: why not parse ls. As such, here's how you should write the script instead:
#!/bin/bash
cd folder_name
for i in *; do
    echo "$i,15" >> filename.txt
done
You can skip the part cd folder_name if you are already in the folder.
Edit 2: Enhanced solution:
As suggested by @kusalananda, you'd better do the redirection after done to avoid opening the file in each iteration of the for loop, so the script will look like this:
#!/bin/bash
cd folder_name
for i in *; do
    echo "$i,15"
done > filename.txt
Just one command line, using two msr commands to recursively (-r) search for specific files:
msr -rp your-dir1,dir2,dirN -l -f "\.(mp4|jpg|png)$" -PAC | msr -t .+ -o '$0,15' -PIC > save-file.txt
If you want to sort by time, add --wt to the first command, like: msr --wt -l -rp your-dirs
Sort by size? Add --sz, but if you use both --sz and --wt, only the prior one is effective.
If you want to exclude some directory, add something like: --nd "^(test|garbage)$"
To remove trailing \r\n in save-file.txt: msr -p save-file.txt -S -t "\s+$" -o "" -R
See msr.exe / msr.gcc48 etc. in the tools directory of my open project https://github.com/qualiu/msr.
A solution without a loop:
ls | xargs -i echo {},15 > filename.txt
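If some file names contain newlines or other tricky characters, a sketch with GNU find (assumed here for its -printf action) avoids parsing ls entirely:
find . -maxdepth 1 -type f ! -name filename.txt -printf '%f,15\n' > filename.txt
The ! -name test keeps the output file itself out of the listing, since the shell creates it before find runs.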

bash to update filename in directory based on partial match to another

I am trying to use bash to rename/update the filename of a text file in /home/cmccabe/Desktop/percent based on a partial match of digits with another text file, /home/cmccabe/Desktop/analysis.txt. The match will always be in line 3, 4, or 5 of this file. I am not able to do this yet, but hopefully the bash below is a start. Thank you :).
text file in /home/cmccabe/Desktop/percent - there could be a maximum of 3 files in this directory
00-0000_fbn1_20xcoverage.txt
text file in /home/cmccabe/Desktop/analysis.txt
status: complete
id names:
00-0000_Last-First
01-0101_LastN-FirstN
02-0202_La-Fi
desired result in /home/cmccabe/Desktop/percent
00-0000_Last-First_fbn1_20xcoverage.txt
bash
for filename in /home/cmccabe/Desktop/percent/*.txt; do echo mv \"$filename\" \"${filename//[0-9]-[0-9]/}\"; done < /home/cmccabe/Desktop/analysis.txt
Using process substitution with a while loop. You can run this script under /home/cmccabe/Desktop/percent:
#!/bin/bash
# ^^^^ needed for associative array
# declare the associative array
declare -A mapArray
# Read the file from the 3rd line of the file and create a hash-map
# as mapArray[00-0000]=00-0000_Last-First and so on.
while IFS= read -r line; do
    mapArray["${line%_*}"]="$line"
done < <(tail -n +3 /home/cmccabe/Desktop/analysis.txt)
# Once the hash-map is constructed, rename the text file accordingly.
# echo the file and the name to be renamed before invoking the 'mv'
# command
for file in *.txt; do
    echo "$file" "${mapArray["${file%%_*}"]}_${file#*_}"
    # mv "$file" "${mapArray["${file%%_*}"]}_${file#*_}"
done
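The parameter expansions do the actual splitting here; a quick illustration using the sample names from the question:
f=00-0000_fbn1_20xcoverage.txt
echo "${f%%_*}"   # 00-0000 (longest suffix match removed: everything from the first _)
echo "${f#*_}"    # fbn1_20xcoverage.txt (shortest prefix match removed: through the first _)
line=00-0000_Last-First
echo "${line%_*}" # 00-0000 (shortest suffix match removed: from the last _)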
This is another similar bash approach:
while IFS="_" read -r id newname;do
#echo "id=$newid - newname=$newname" #for cross check
oldfilename=$(find . -name "${id}*.txt" -printf %f)
[ -n "$oldfilename" ] && echo mv \"$oldfilename\" \"${id}_${newname}_${oldfilename#*_}\";
done < <(tail -n+3 analysis)
We read the analysis file and split each line (e.g. 00-0000_Last-First) into two fields, using _ as the delimiter:
id=00-0000
newname=Last-First
Then, using this id read from the analysis file, we check (using find) whether a file exists whose name starts with the same id.
If such a file exists, its filename is returned in the variable $oldfilename.
If this variable is not empty, then we do the mv.
tail -n+3 is used to skip the first two header lines of analysis.txt.
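A quick way to check that split on a sample line:
$ IFS="_" read -r id newname <<< "00-0000_Last-First"; echo "$id | $newname"
00-0000 | Last-First
Since read assigns the remainder of the line to the last variable, a name containing further underscores would still land intact in $newname.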

Rename numbered files using names from a list in another file

I have a folder of books and a file with the real name of each one. I renamed the books so that I can easily see whether they are ordered, say "00.pdf", "01.pdf" and so on.
I want to know if there is a way, using the shell, to match each line of the file, say "names", with each book: that is, to match line i of the file with the book in position i in sort order.
<name-of-the-book-in-the-1-line> -> <book-in-the-1-position>
<name-of-the-book-in-the-2-line> -> <book-in-the-2-position>
.
.
.
<name-of-the-book-in-the-i-line> -> <book-in-the-i-position>
.
.
.
I'm doing this in Windows, using Total Commander, but I want to do it in Ubuntu, so I don't have to reboot.
I know about mv and rename, but I'm not as good as I want with regular expressions...
renamer.sh:
#!/bin/bash
for i in `ls -v | grep -Ev '(renamer.sh|names.txt)'`; do
    read name
    mv "$i" "$name.pdf"
    echo "$i" renamed to "$name.pdf"
done < names.txt
names.txt: (the line count must be exactly equal to the number of numbered files)
name of first book
second-great-book
...
explanation:
ls -v returns a naturally sorted file list
grep excludes this script's name and the input file so they are not renamed
we cycle through the found file names, read a value from names.txt, and rename each target file to that value
For testing purposes, you can comment out the mv command:
#mv "$i" "$name.pdf"
And now, simply run the script:
bash renamer.sh
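With 00.pdf and 01.pdf in the folder and the names.txt above, a run would print something like:
$ bash renamer.sh
00.pdf renamed to name of first book.pdf
01.pdf renamed to second-great-book.pdf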
This loops through names.txt, creates a filename based on a counter (padding to two digits with printf, assigning to a variable using -v), then renames using mv. ((++i)) increases the counter for the next filename.
#!/bin/bash
i=0
while IFS= read -r line; do
    printf -v fname "%02d.pdf" "$i"
    mv "$fname" "$line.pdf"
    ((++i))
done < names.txt
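A quick check of the printf -v zero-padding used above:
$ i=7; printf -v fname "%02d.pdf" "$i"; echo "$fname"
07.pdf
-v stores the formatted result in the variable fname instead of printing it, and %02d pads the counter to two digits.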

Building a csv file from multiple files

I have a folder with multiple txt files, each containing one or several lines. Each file's name is an email address, and each file contains one or more email addresses.
For example, I have 3 files on my folder :
distribution-list1#example.com.txt
distribution-list2#example.com.txt
distribution-list3#example.com.txt
Content of each file:
cat distribution-list1#example.com.txt
john#example.com
aurel#example.com
cat distribution-list2#example.com.txt
doe#example.com
cat distribution-list3#example.com.txt
jack#example.com
gilbert#example.com
jane#example.com
I would like to build only one file containing those data:
distribution-list1#example.com;john#example.com
distribution-list1#example.com;aurel#example.com
distribution-list2#example.com;doe#example.com
distribution-list3#example.com;jack#example.com
distribution-list3#example.com;gilbert#example.com
distribution-list3#example.com;jane#example.com
lists_merge.sh
#!/usr/bin/env bash
shopt -s nullglob
for fname in *.txt; do
    while IFS= read -r line; do
        printf "%s;%s\n" "$fname" "$line"
    done < "$fname"
done
output
$ ./lists_merge.sh
distribution-list1#example.com.txt;john#example.com
distribution-list1#example.com.txt;aurel#example.com
distribution-list2#example.com.txt;doe#example.com
distribution-list3#example.com.txt;jack#example.com
distribution-list3#example.com.txt;gilbert#example.com
distribution-list3#example.com.txt;jane#example.com
Note: the script is assumed to be in the same directory as the distribution-list text files, and no other text files are assumed to be in that directory. Unlike the desired output, the .txt suffix is kept in the first field here; strip it with ${fname%.txt}, as in the next answer.
You can use sed:
for emailfile in *.txt; do
email=${emailfile%.txt}
sed "s:^:$email;:" "$emailfile"
done
This will fail if an email ID has a colon (:), but I doubt you'd have such an example.
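For the first sample file, the suffix removal and the sed call work out like this:
$ emailfile=distribution-list1#example.com.txt
$ email=${emailfile%.txt}
$ sed "s:^:$email;:" "$emailfile"
distribution-list1#example.com;john#example.com
distribution-list1#example.com;aurel#example.com
The : delimiter for the s command is what makes a literal colon in the address a problem, as noted above.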

Merging, then splitting files

Using a for loop, I can merge all of the files in a directory that end with *.txt:
for filename in *.txt; do
cat "${filename}"
echo
done > output.txt
After doing this, I will run output.txt through various scripts, in which the text will be changed considerably. After that, I want to split the files, at the same places at which they were merged, into different files (output01.txt, output02.txt, etc.).
How can I split the files at the same place they were merged?
This cannot be based on line number, because the scripts will add \t in places.
I think a solution that might work is to place "#########" at the end of each of the initial *.txt files before merging them, but I don't know how to get BASH to split the files again at that mark.
Instead of that for loop for concatenating, you can just use cat *.txt.
Anyway, why don't you just perform the scripts on each file independently within the for loop?
If you really want to combine and re-segregate, you can use:
for filename in *.txt; do
    cat "${filename}"
    echo "#####"
done > output.txt
# Pass output.txt through whatever
awk 'BEGIN { fileno = 1; file = sprintf("output%02d.txt", fileno) }
    {
        if ($1 ~ /#####/) {
            fileno++
            file = sprintf("output%02d.txt", fileno)
            next
        }
        else print > file
    }' output.txt
The canonical answer would be:
tar c *.txt > output.txt
You could split/unmerge them exactly by doing
tar xf output.txt # in the current directory
tar x -C /tmp/splitfiles/ -f output.txt
Now if you really want to do stuff like that in a loop and extract to stdout/a pipe, you could:
while read -r fname
do
    # extract the named member to a pipe
    tar -xOf output.txt "$fname" | myprogram "$fname"
done < <(tar tf output.txt)
However, that would possibly not be very efficient. You could consider just doing
while read -r fname
do
    # handle extracted file
    myprogram "/tmp/splitfiles/$fname"
    unlink "/tmp/splitfiles/$fname" # drop the temp file
done < <(tar x -v -C /tmp/splitfiles/ -f output.txt)
This will be completely asynchronous (so if extraction or even the transmission of the archive is slow, the first files can already be processed while waiting for more data to arrive).
See also my other answer https://stackoverflow.com/a/8341221/85371 (look for the older answer part, since that question was changed to be very specific later)
As Fredrik wrote, you can use csplit to split your merged file.
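A minimal csplit sketch for the ##### marker used in the loop above (GNU csplit assumed for the long options):
csplit --prefix=output --suffix-format='%02d.txt' --elide-empty-files output.txt '/^#####$/' '{*}'
This writes output00.txt, output01.txt, and so on. Note that csplit splits before each matching line, so every piece after the first still starts with the marker line, which may need stripping afterwards.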
