I'm working on the command line on a Mac and have some text files. I want to remove certain lines from each file in a group, then cat the remaining text into a new merged file. My current attempt is the following:
for file in *.txt;
do echo $file >> tempfile.html;
echo ''>>tempfile.html;
cat $file>>tempfile.html;
find . -type f -name 'tempfile.html' -exec sed -i '' '3,10d' {} +;
find . -type f -name 'tempfile.html' -exec sed -i '' '/<ACROSS>/,$d' {} +;
# ----------------
# some other stuff
# ----------------
done;
I am extracting a section of text from a bunch of files and concatenating them all together, but I still need to know which file each selection came from. First I append the name of the file, then (supposedly) the selection of text from that file, then repeat the process for the next file.
Plus, I need to leave the original text files in place for other purposes.
So the concatenated file would be:
filename1.txt
text-selection
more_text
filename2.txt
even-more-text
text-text-test-test
The first sed is supposed to delete lines 3 through 10. The second is supposed to delete from the line containing <ACROSS> to the end of the file.
However, what happens is that the first deletes everything in the tempfile, and the second does nothing. (Each was tested separately.)
What am I doing wrong?
I must be missing something. Even what appears to be a very simple example does not work. My hope was that the following would delete lines 3-10 and save the rest of the file to test.txt.
sed '3,10d' nxd2019-01-06.txt > test.txt
Your invocation of find will attempt to run sed with as many files as possible per call. But note: addresses in sed do not address lines in each input file; by default they address the whole input of sed (which can consist of many input files).
Try this:
> a.txt cat <<EOF
1
2
EOF
> b.txt cat <<EOF
3
4
EOF
Now try this:
sed 1d a.txt b.txt
2
3
4
As you can see, sed removed the first line from a.txt, but not from b.txt.
The problem in your case is the second invocation of find. It will remove everything from the first occurrence of <ACROSS> until the last line in the last file found by find. This will effectively remove the content from all but the first tempfile.html.
Assuming the remaining logic in your script works, you should just change the find invocations to:
find . -type f -name 'tempfile.html' -exec sed -i '' '3,10d' {} \;
find . -type f -name 'tempfile.html' -exec sed -i '' '/<ACROSS>/,$d' {} \;
This would call sed once per input file.
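Separately from the find question, note that in the loop as posted both sed calls sit inside the for loop and edit tempfile.html itself, so on every pass they act on everything merged so far. One possible way around that, sketched here under assumptions, is to pipe each file's header and content through sed without -i and append the result to the merged file. The originals are never modified, and the line numbers still count the file name and blank line as lines 1 and 2, as in the question. The scratch directory, sample contents, and the name merged.html are invented for the demo.

```shell
# Hypothetical demo data in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 '<ACROSS>' a13 > one.txt
printf '%s\n' b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 '<ACROSS>' b14 > two.txt

: > merged.html                      # start with an empty merged file
for file in *.txt; do
    {
        echo "$file"                 # line 1: the file name
        echo                         # line 2: a blank line
        cat "$file"                  # line 3 onward: the file content
    } |
    # No -i here: sed filters this one file's stream only.
    # Drop lines 3-10, then everything from <ACROSS> to the end.
    sed -e '3,10d' -e '/<ACROSS>/,$d' >> merged.html
done
cat merged.html
```

Because sed sees one file per invocation, the `$` address and the line numbers cannot leak from one file into the next.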
Related
I have some folders and subfolders with .txt and other extensions (like .py, .html) and I want to concatenate them all into one .txt file.
I tried this:
find . -type f -exec cat {} + > test.txt
Input:
txt1.txt:
aaaaa
test.py
print("a")
htmltest1.html:
<head></head>
Output:
aaaaaprint("a")<head></head>
Desired output:
aaaaa
print("a")
<head></head>
So, how can I modify this bash command to get my desired output? I want to insert a newline after each printed file.
The problem is that the last lines of your files are not terminated with the newline character, which means they don't fulfill the POSIX definition of a text file, which may yield weird results like this.
Probably all graphical text editors I've used allow you not to put a terminating newline, and a lot of people won't put it, presumably because the editor makes it look like there's a redundant empty line at the end.
This may be the reason why some people couldn't reproduce your issue - presumably they created the sample files with well-behaving tools such as cat or vim or nano, or they did put the newline characters at the end.
So here's the issue:
user@host:~$ find . -type f -exec cat {} \;
aaaaaprint("a")<head></head>user@host:~$
To avoid these sorts of problems in the future, you should always hit <enter> after the last line of text in your file when using a graphical text editor. However, sometimes you have to work with files produced by other users, who might not know this, so
here is a quick and dirty workaround (concatenating with an additional file which only contains the newline character):
user@host:~$ echo '' > /tmp/newline.txt
user@host:~$ find . -type f -exec cat {} /tmp/newline.txt \;
aaaaa
print("a")
<head></head>
user@host:~$
Use -E parameter on cat so it prints a $ at end of lines.
Then use sed to strip those out, with a literal \$ symbol anchored with $ at the end.
find . -type f -exec cat -E {} + | sed s'/\$$//' > test.txt
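If some files end with a newline and others do not, appending one unconditionally leaves blank lines in the merged output. A small sketch that adds the newline only where it is missing, by looking at the last byte with tail -c 1. The scratch directory and the file contents are made up to mirror the question, and the sort is only there to make the demo order predictable.

```shell
workdir=$(mktemp -d)
printf 'aaaaa' > "$workdir/txt1.txt"                 # no trailing newline
printf 'print("a")\n' > "$workdir/test.py"           # proper trailing newline
printf '<head></head>' > "$workdir/htmltest1.html"   # no trailing newline

out=$(mktemp)                        # kept outside workdir so find skips it
find "$workdir" -type f | sort | while IFS= read -r f; do
    cat "$f"
    # $(...) strips a trailing newline, so this is non-empty only when
    # the last byte of the file is something other than a newline.
    if [ -n "$(tail -c 1 "$f")" ]; then
        echo
    fi
done > "$out"
cat "$out"
```

Reading paths line by line like this assumes no file name contains a newline.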
I have a question regarding the manipulation and creation of text files in the Ubuntu terminal. I have a directory that contains several thousand subdirectories. In each subdirectory, there is a file whose name ends in stats.txt. I want to write a piece of code that runs from the parent directory and creates a file with the name of each stats.txt file in the first column and the full 5th line of that same stats.txt file in the next column. The 5th line of the stats.txt file is a sentence of six words, not a single value.
For reference, I have successfully used the sed command in combination with find and cat to make a file containing the 5th line from each stats.txt file. I then used the ls command to save a list of all my subdirectories. I assumed both files would be in alphabetical order of the subdirectories, and thus easy to merge, but I was wrong. The find and cat functions, or at least my implementation of them, resulted in a file that appeared to be random in order (see below). No need to try to remedy this code, I'm open to all solutions.
# loop through subdirectories and save the 5th line of stats.txt as a different file.
for f in ~/*; do [ -d "$f" ] && cd "$f" && sed -n 5p *stats.txt > final.stats.txt; done
# find the final.stats.txt files and save them as a single file
find ./ -name 'final.stats.txt' -exec cat {} \; > compiled.stats.txt
Maybe something like this can help you get on track:
find . -name "*stats.txt" -exec awk 'FNR==5{print FILENAME, $0}' '{}' + > compiled.stats
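Since the question expected alphabetical order, it may help to know that find makes no promise about the order in which it reports files. A sketch that sorts the paths before handing them to awk and separates the two columns with a tab. It assumes GNU find, sort, and xargs (as shipped on Ubuntu) for -print0, -z, and -0; the directory layout and file contents are invented for the demo.

```shell
workdir=$(mktemp -d)
for d in beta alpha; do
    mkdir "$workdir/$d"
    printf '%s\n' l1 l2 l3 l4 "fifth line of $d has words" l6 \
        > "$workdir/$d/run.stats.txt"
done

compiled=$(mktemp)
(
    cd "$workdir" || exit 1
    # NUL-separated paths survive spaces in names; sort -z orders them.
    find . -name '*stats.txt' -print0 | sort -z |
        xargs -0 awk -v OFS='\t' 'FNR == 5 { print FILENAME, $0 }'
) > "$compiled"
cat "$compiled"
```

FNR restarts at 1 for every file, so the result is the same even if xargs splits a very long file list across several awk runs.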
I want to write a shell script to merge the contents of multiple files in the given directories.
DIR1 contains sample1.txt sample2.txt
sample1.txt contents :---this is sample1 file
sample2.txt contents :---this is sample2 file
DIR2 contains demo1.txt demo2.txt
demo1.txt contents :---this is demo1 file
I tried :
(find /home/DIR1 /home/DIR2 -type f | xargs -i cat {} ) > /result/final.txt
It worked!
this is sample2 file this is sample1 file this is demo1 file
However, the output appears on a single line. I need every file's output on its own line,
like this:
this is sample1 file
this is sample2 file
this is demo1 file
How can I achieve this?
Any help would be appreciated in advance.
Your files don't end with newlines, and therefore there are no newlines in the output file.
You should either make sure your input files end with newlines, or add them in the find command:
find /home/DIR1 /home/DIR2 -type f -exec cat {} \; -exec echo \;
As pointed out in the comments, the issue is likely that the last line of each file is not terminated by \n (the newline character).
One way of circumventing this issue is to print a newline after each file is concatenated. The following command should do that:
find DIR1/ DIR2/ -type f | xargs -I{} sh -c 'cat "$1"; echo' sh {} > result/final.txt
I have multiple files named as such --> 100.txt, 101.txt, 102.txt, etc.
The files are located within a directory. For every one of these files, I need to append the number before the extension in the file name to every line in the file.
So if the file content of 100.txt is:
blahblahblah
blahblahblah
...
I need the output to be:
blahblahblah 100
blahblahblah 100
...
I need to do this using sed.
My current code looks like this, but it is ugly and not very concise:
dir=$1
for file in "$dir"/*
do
    base=$(basename "$file")        # strip the directory part
    filename="${base%.*}"           # strip the extension
    sed "s/$/ $filename/" "$file"   # append the name to every line
done
Is it possible to do this in such a way?
find $dir/* -exec sed ... {} \;
The code you already have is essentially the simplest, shortest way of performing the task in bash. The only changes I would make are to pass -i to sed, assuming you are using GNU sed (otherwise you will need to redirect the output to a temporary file, remove the old file, and move the new file into its place), and to provide a default value in case $1 is empty.
dir="${1:-.}"
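Putting both suggestions together, here is a sketch of the loop with the default directory and the temporary-file route, which does not depend on how a particular sed implements -i. The function name append_number and the demo files are invented for illustration, and the sketch assumes file names contain no characters that are special in a sed replacement (such as & or /).

```shell
# Append each file's base name (minus extension) to every line of that file.
append_number() {
    dir="${1:-.}"                    # default to the current directory
    for file in "$dir"/*; do
        [ -f "$file" ] || continue   # skip directories and unmatched globs
        base=$(basename "$file")
        name="${base%.*}"
        tmp=$(mktemp) || return 1
        # Write to a temp file first, then move it over the original.
        sed "s/\$/ $name/" "$file" > "$tmp" && mv "$tmp" "$file"
    done
}

workdir=$(mktemp -d)
printf 'blahblahblah\nblahblahblah\n' > "$workdir/100.txt"
printf 'foo\n' > "$workdir/101.txt"
append_number "$workdir"
cat "$workdir/100.txt" "$workdir/101.txt"
```

One side effect to be aware of: the replaced file takes on the ownership and permissions of the temporary file.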
The following command line finds all files whose names start with a number followed by an extension and appends that number to the end of each non-empty line in the file. (I tested it with a couple of files.)
find <directory path> -type f -name '[0-9]*' -exec bash -c 'num=$(basename "$1" | sed "s/^\([0-9]\{1,\}\)\..*/\1/"); sed -i.bak "s/.$/& $num/" "$1"' _ {} \;
Note: command line using sed not tested in OS X
replace <directory path> with the path of your directory
I'd like to cat recursively several files with same name to another file. There's an earlier question "Recursive cat all the files into single file" which helped me to get started. However I'd like to achieve the same so that each file is preceded by the filename and path, different files preferably separated with a blank line or ----- or something like that. So the resulting file would read:
files/pipo1/foo.txt
flim
flam
floo
files/pipo2/foo.txt
plim
plam
ploo
Any way to achieve this in bash?
Of course! Instead of just catting the file, you chain actions to print the filename, cat the file, then add a line feed:
find . -name 'foo.txt' \
-print \
-exec cat {} \; \
-printf "\n"
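Note that -printf is a GNU find extension, so the command above may not work with the find shipped on BSD or macOS. A sketch of the same chain that swaps it for -exec echo, which should give the same blank separator line. The scratch directory mirrors the layout in the question and is otherwise invented.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/files/pipo1" "$workdir/files/pipo2"
printf 'flim\nflam\nfloo\n' > "$workdir/files/pipo1/foo.txt"
printf 'plim\nplam\nploo\n' > "$workdir/files/pipo2/foo.txt"

merged=$(mktemp)
(
    cd "$workdir" || exit 1
    # For each match: print the path, dump the file, then emit a blank line.
    find files -name 'foo.txt' \
        -print \
        -exec cat {} \; \
        -exec echo \;
) > "$merged"
cat "$merged"
```

The order in which the two files appear is whatever find reports; pipe through sort beforehand if a fixed order matters.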