Is there a simple one-line approach to combine the output of a unix command with a file? - shell

I want to combine the header of one text file with another text file. I know I can do it with
head -n1 file1.txt > header.txt
cat header.txt file2.txt > file2_withHeader.txt
but that generates an unnecessary intermediate file and it's two steps. Is there a way to make it one step without generating an intermediate file?
I was thinking of this
cat $(head -n1 file1.txt) file2.txt > file2_withHeader.txt
but it does not work because $(head -n1 file1.txt) is not a file so it cannot concatenate.
Thanks

All three of the other answers to this question are good, and since each takes a different approach, it'd be a good exercise to learn each one. LC-datascientist's and Diego's answers each spawn a subshell, so λuser's two-line approach, perhaps with && between the two commands, is the most ideal.
If you really want this in one command that doesn't launch a sub-shell, you can use awk (mawk):
awk 'BEGIN { getline < "file1.txt"; print } 1' file2.txt
gawk can do it even more elegantly thanks to the nextfile command:
gawk '1; NR == FNR { nextfile }' file1.txt file2.txt
These do the same thing: read the first line of file1.txt and print it, then print the entirety of file2.txt.
The mawk code uses getline to read a single line from a given input into $0 (in this case; see the man page); doing that in the BEGIN block means it happens before file2.txt is read. Note that the file name must be a quoted string. 1 is always true, so this second clause fires for every line, triggering the default action (print $0).
The gawk code prints first, then uses NR == FNR to determine that you're looking at the first file (the overall number of records (lines) and the current file's number of records is the same), in which case it's already time to move to the second file.
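As a quick sanity check of the mawk version (the sample file contents here are assumed for illustration):

```shell
# Create two small sample files
printf 'HEADER line\nrest of file1\n' > file1.txt
printf 'data1\ndata2\n' > file2.txt
# Prepend file1's first line to file2's contents (note the quoted file name)
awk 'BEGIN { getline < "file1.txt"; print } 1' file2.txt
# HEADER line
# data1
# data2
```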
For the first three lines, mawk needs a loop:
awk 'BEGIN { for (i=0;i<3;i++) { getline < "file1.txt"; print } } 1' file2.txt
and gawk merely needs another condition:
gawk '1; NR == FNR && NR >= 3 { nextfile }' file1.txt file2.txt

You can actually append with the redirections:
head -n1 file1.txt > result.txt
cat file2.txt >> result.txt

in bash you can also do
cat <(head -n 1 file1.txt) file2.txt > file2_withHeader.txt
which I think is what you were looking for.

Thank you, λuser. That's one option I can do, too.
Actually, I just figured out the one-liner that I was hoping for.
(head -n1 file1.txt; cat file2.txt) > file2_withHeader.txt
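The same grouping also works without spawning a subshell if you use braces instead of parentheses (the spaces around the braces and the trailing semicolon are required):

```shell
{ head -n1 file1.txt; cat file2.txt; } > file2_withHeader.txt
```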
Sorry, I realized this is the same question as others asking to combine two commands in unix, so this has actually been answered.

Related

Creating a script that checks to see if each word in a file

I am pretty new to Bash and scripting in general and could use some help. Each word in the first file is separated by \n while the second file could contain anything. If the string in the first file is not found in the second file, I want to output it. Pretty much "check if these words are in these words and tell me the ones that are not"
File1.txt contains something like:
dog
cat
fish
rat
file2.txt contains something like:
dog
bear
catfish
magic ->rat
I know I want to use grep (or do I?) and the command would be (to my best understanding):
$foo.sh file1.txt file2.txt
Now for the script...
I have no idea...
grep -iv $1 $2
Give this a try. It is straightforward and not optimized, but it does the trick (I think):
while read line ; do
    fgrep -q "$line" file2.txt || echo "$line"
done < file1.txt
There is a funny version below, with 4 parallel fgrep processes and the use of an additional result.txt file.
> result.txt
nb_parallel=4
while read line ; do
    while [ $(jobs | wc -l) -gt "$nb_parallel" ]; do sleep 1; done
    fgrep -q "$line" file2.txt || echo "$line" >> result.txt &
done < file1.txt
wait
cat result.txt
You can increase the value 4 in order to run more parallel fgrep processes, depending on the number of CPUs, cores and IOPS available.
With the -f flag you can tell grep to use a file.
grep -vf file2.txt file1.txt
To get a good match on complete lines, use
grep -vFxf file2.txt file1.txt
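With the sample files from the question, this prints the words of file1.txt that have no exact-line match in file2.txt:

```shell
printf 'dog\ncat\nfish\nrat\n' > file1.txt
printf 'dog\nbear\ncatfish\nmagic ->rat\n' > file2.txt
grep -vFxf file2.txt file1.txt
# cat
# fish
# rat
```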
As #anubhava commented, this will not match substrings. To fix that, we will use the result of grep -Fof file1.txt file2.txt (all the relevant keywords).
Combining these will give
grep -vFxf <(grep -Fof file1.txt file2.txt) file1.txt
Using awk you can do:
awk 'FNR==NR{a[$0]; next} {for (i in a) if (index(i, $0)) next} 1' file2 file1
rat
You can simply do the following (note that comm requires both inputs to be sorted):
comm -2 -3 <(sort file1.txt) <(sort file2.txt)
and also:
diff -u file1.txt file2.txt
I know you were looking for a script, but I don't think there is any reason for one; if you still want a script, you can just run the commands from it.
similar awk
$ awk 'NR==FNR{a[$0];next} {for(k in a) if(k~$0) next}1' file2 file1
rat

How to fill empty lines from one file with corresponding lines from another file, in BASH?

I have two files, file1.txt and file2.txt. Each has an identical number of lines, but some of the lines in file1.txt are empty. This is easiest to see when the content of the two files is displayed in parallel:
file1.txt file2.txt
cat bear
fish eagle
spider leopard
snail
catfish rainbow trout
snake
koala
rabbit fish
I need to assemble these files together, such that the empty lines in file1.txt are filled with the data found in the lines (of the same line number) from file2.txt. The result in file3.txt would look like this:
cat
fish
spider
snail
catfish
snake
koala
rabbit
The best I can do so far, is create a while read -r line loop, create a counter that counts how many times the while loop has looped, then use an if-conditional to check if $line is empty, then use cut to obtain the line number from file2.txt according to the number on the counter. This method seems really inefficient.
Sometimes file2.txt might contain some empty lines. If file1.txt has an empty line and file2.txt also has an empty line in the same place, the result is an empty line in file3.txt.
How can I fill the empty lines in one file with corresponding lines from another file?
paste file1.txt file2.txt | awk -F '\t' '$1 != "" { print $1 ; next } { print $2 }'
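A quick run on a shortened version of the sample data (the explicit $1 != "" test avoids treating a literal 0 in file1.txt as an empty line):

```shell
printf 'cat\nfish\n\ncatfish\n' > file1.txt
printf 'bear\neagle\nsnail\nrainbow trout\n' > file2.txt
paste file1.txt file2.txt | awk -F '\t' '$1 != "" { print $1; next } { print $2 }'
# cat
# fish
# snail
# catfish
```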
Here is the way to handle these files with awk:
awk 'FNR==NR {a[NR]=$0;next} {print (NF?$0:a[FNR])}' file2 file1
cat
fish
spider
snail
catfish
snake
koala
rabbit
First it stores every line of file2 in array a, using the record number as the index.
Then it prints file1, but it tests whether file1 contains data for each record.
If there is data for the record, it is used; if not, the corresponding line from file2 is printed.
One with getline (harmless in this case):
awk '{getline p<f; print NF?$0:p; p=x}' f=file2 file1
Just for fun:
paste file1.txt file2.txt | sed $'s/^\t//' | cut -f1
This deletes a TAB at the beginning of a line (the lines missing from file1) and then takes the first column.
(The $'\t' form makes the shell supply the literal TAB character; on OSX, \t doesn't work inside sed itself, so you can also type ctrl-V then Tab to insert it.)
a solution without awk:
paste -d"#" file1 file2 | sed 's/^#\(.*\)/\1/' | cut -d"#" -f1
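A quick run of that pipeline (this assumes neither file contains a # character):

```shell
printf 'cat\n\n' > file1
printf 'bear\nsnail\n' > file2
# Lines where file1 is empty start with "#"; sed strips that marker,
# and cut keeps only the first "#"-delimited field of the rest
paste -d"#" file1 file2 | sed 's/^#\(.*\)/\1/' | cut -d"#" -f1
# cat
# snail
```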
Here is a Bash only solution.
for i in 1 2; do
    while IFS= read -r line; do
        if [ $i -eq 1 ]; then
            arr1+=("$line")
        else
            arr2+=("$line")
        fi
    done < file${i}.txt
done
for r in "${!arr1[@]}"; do
    if [[ -n ${arr1[$r]} ]]; then
        echo "${arr1[$r]}"
    else
        echo "${arr2[$r]}"
    fi
done > file3.txt

How to increment number in a file

I have one file with the date like below,let say file name is file1.txt:
2013-12-29,1
Here I have to increment the number by 1, so it should be 1+1=2 like..
2013-12-29,2
I tried to use 'sed' to replace and must be with variables only.
oldnum=`cut -d ',' -f2 file1.txt`
newnum=`expr $oldnum + 1`
sed -i 's\$oldnum\$newnum\g' file1.txt
But I get an error from sed syntax, is there any way for this. Thanks in advance.
Sed needs forward slashes, not backslashes. There are multiple issues with your use of backslashes actually, but the quick fix should be (use double quotes too, as you see below):
oldnum=`cut -d ',' -f2 file1.txt`
newnum=`expr $oldnum + 1`
sed -i "s/$oldnum\$/$newnum/g" file1.txt
However, I question whether sed is really the right tool for the job here. A more complete tool, such as awk, perl, or python, might work better in the long run.
Note that I used a $ end-of-line match to ensure you didn't replace 2012 with 2022, which I don't think you wanted.
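Putting it together on the sample file (GNU sed assumed for -i; on BSD/macOS you would need sed -i ''):

```shell
printf '2013-12-29,1\n' > file1.txt
oldnum=$(cut -d ',' -f2 file1.txt)
newnum=$(expr $oldnum + 1)
# Anchor with $ so only the trailing number is replaced
sed -i "s/$oldnum\$/$newnum/g" file1.txt
cat file1.txt
# 2013-12-29,2
```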
Usually I would use awk to do jobs like this; the following code should work:
awk -F',' '{printf("%s,%d\n",$1,$2+1)}' file1.txt
Here is how to do it with awk
awk -F, '{$2=$2+1}1' OFS=, file1.txt
2013-12-29,2
or more simply (though this will fail if the value is -1, because the assignment's result 0 is then treated as a false pattern and the line is not printed)
awk -F, '$2=$2+1' OFS=, file1.txt
To write the change back to the file, save the output somewhere else (tmp in the example below) and then move it over the original name:
awk -F, '{$2=$2+1}1' OFS=, file1.txt >tmp && mv tmp file1.txt
Or using GNU awk (4.1+), you can skip the temp file with the inplace extension:
gawk -i inplace -F, '{$2=$2+1}1' OFS=, file1.txt
Another, single line, way would be
expr $(cat /tmp/file 2>/dev/null) + 1 >/tmp/file
this works even if the file doesn't exist or is empty: in both cases the file is (re)created with a value of 1
awk is the best for your problem, but you can also do the calculation in shell
In case you have more than one row, I am using a loop here
#!/bin/bash
while IFS=, read -r DATE NUM
do
    echo "$DATE,$((NUM+1))"
done < file1.txt
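For example, with two rows of sample input:

```shell
printf '2013-12-29,1\n2013-12-30,5\n' > file1.txt
# Setting IFS on the read itself keeps the change local to the loop
while IFS=, read -r DATE NUM
do
    echo "$DATE,$((NUM+1))"
done < file1.txt
# 2013-12-29,2
# 2013-12-30,6
```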
Bash one-liner option with bc. Sample:
$ echo 3 > test
$ echo 1 + $(<test) | bc > test
$ cat test
4
Also works:
bc <<< "1 + $(<test)" > test

How to copy a .c file to a numbered listing

I simply want to copy my .c file into a line-numbered listing file. Basically generate a .prn file from my .c file. I'm having a hard time finding the right bash command to do so.
Do you mean nl?
nl -ba filename.c
The -ba means to number all lines, not just non-empty ones.
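For example (the file names here are just placeholders):

```shell
# A tiny C file, including a blank line to show that -ba numbers it too
printf 'int main(void) {\n\n    return 0;\n}\n' > xyz.c
nl -ba xyz.c > xyz.prn
cat xyz.prn
```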
awk '{print FNR ":" $0}' file1 file2 ...
is one way.
FNR is the current record (line) number within the current file, so the numbering restarts for each input file.
You can change the ":" per your needs.
$0 means "the-whole-line-of-input"
Or you can do
cat -n file1 file2 ....
IHTH
On my linux system, I occasionally use pr -tn to prefix line numbers for listings. The -t option suppresses headers and footers; -n says to prefix line numbers. -n allows optional format and digit specifiers; see man page. Anyhow, to print file xyz.c to xyz.prn with line numbering, use:
pr -tn xyz.c > xyz.prn
Note, this is not as compact and handy as cat -n xyz.c > xyz.prn (using cat -n as suggested in a previous answer); but pr has numerous other options, and I most often use it when I want to both number the lines and put them into multiple columns or print multiple files side by side. E.g., for a 2-column numbered listing use:
pr -2 -tn xyz.c > xyz.prn
I think shellter has the right idea. However, if your require output written to files with prn extensions, here's one way:
awk '{ sub(/\.c$/, "", FILENAME); print FNR ":" $0 > (FILENAME ".prn") }' file1.c file2.c ...
To perform this on all files in the present working directory:
for i in *.c; do awk '{ sub(/\.c$/, "", FILENAME); print FNR ":" $0 > (FILENAME ".prn") }' "$i"; done

"while read LINE do" and grep problems

I have two files.
file1.txt:
Afghans
Africans
Alaskans
...
where file2.txt contains the output from a wget on a webpage, so it's a big sloppy mess, but does contain many of the words from the first list.
Bashscript:
cat file1.txt | while read LINE; do grep $LINE file2.txt; done
This did not work as expected. I wondered why, so I echoed out the $LINE variable inside the loop and added a sleep 1, so I could see what was happening:
cat file1.txt | while read LINE; do echo $LINE; sleep 1; grep $LINE file2.txt; done
The output looks in terminal looks something like this:
Afghans
Africans
Alaskans
Albanians
Americans
grep: Chinese: No such file or directory
: No such file or directory
Arabians
Arabs
Arabs/East Indians
: No such file or directory
Argentinans
Armenians
Asian
Asian Indians
: No such file or directory
file2.txt: Asian Naruto
...
So you can see it did finally find the word "Asian". But why does it say:
No such file or directory
?
Is there something weird going on or am I missing something here?
What about
grep -f file1.txt file2.txt
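For example, with a couple of words from the list (assuming file1.txt has clean Unix line endings):

```shell
printf 'Afghans\nAlaskans\n' > file1.txt
printf 'a page that mentions Alaskans somewhere\n' > file2.txt
grep -f file1.txt file2.txt
# a page that mentions Alaskans somewhere
```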
#OP, First, use dos2unix as advised. Then use awk
awk 'FNR==NR{a[$1];next}{ for(i=1;i<=NF;i++){ if($i in a) {print $i} } } ' file1 file2_wget
Note: using while loop and grep inside the loop is not efficient, since for every iteration, you need to invoke grep on the file2.
#OP, crude explanation:
For the meaning of FNR and NR, please refer to the gawk manual. FNR==NR{a[$1];next} reads the contents of file1 into array a. When FNR is not equal to NR (which means the 2nd file is being read), it checks whether each word in the file is in array a. If it is, it is printed out. (The for loop is used to iterate over each word on the line.)
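A tiny sanity check of that awk approach (sample contents assumed):

```shell
printf 'dog\ncat\n' > file1
printf 'the dog sat\n' > file2_wget
# Words from file1 found anywhere in file2_wget are printed
awk 'FNR==NR{a[$1];next}{ for(i=1;i<=NF;i++){ if($i in a) {print $i} } }' file1 file2_wget
# dog
```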
Use more quotes and use less cat
while IFS= read -r LINE; do
grep "$LINE" file2.txt
done < file1.txt
As well as the quoting issue, the file you've downloaded contains CRLF line endings which are throwing read off. Use dos2unix to convert file1.txt before iterating over it.
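If dos2unix isn't installed, tr can strip the carriage returns instead (the output file name here is just an example):

```shell
# Remove CR characters so "read" sees clean Unix line endings
tr -d '\r' < file1.txt > file1.unix.txt
```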
Although using awk is faster, grep produces a lot more detail with less effort. So, after running dos2unix, use:
grep -F -i -n -f <file_containing_pattern> <file_containing_data_blob>
You will get all the matches plus line numbers (case-insensitive)
At minimum this will suffice to find all the words from file_containing_pattern:
grep -F -f <file_containing_pattern> <file_containing_data_blob>
