combine two files and overwrite original file using cat - bash

I am trying to combine two files using the cat command, but I have run into a problem.
original.txt
============
foo
bar
foo
bar
The following is my script:
cat original.txt | wc -l > linecount.txt | cat linecount.txt original.txt > original.txt
This script returns an error saying "input file and output file is the same".
The expected result looks like this:
original.txt
============
4
foo
bar
foo
bar
Any idea?

You can probably use:
{ wc -l < original.txt; cat original.txt; } > linecount.txt &&
mv linecount.txt original.txt
Or using awk:
awk 'NR==FNR{++n; next} FNR==1{print n} 1' original.txt{,} > linecount.txt &&
mv linecount.txt original.txt
Or:
awk -v n=$(wc -l < original.txt) 'NR==1{print n} 1' original.txt > linecount.txt &&
mv linecount.txt original.txt
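The original command fails because the shell opens and truncates original.txt for writing before cat gets a chance to read it. The following is a minimal sketch of the temp-file approach from the first answer, assuming a scratch directory and using mktemp (not in the answer) so that an existing linecount.txt is not clobbered:

```bash
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' foo bar foo bar > original.txt

# Write the count plus the content to a temp file, then replace the original.
# Note: BSD/macOS wc pads the count with leading spaces.
tmp=$(mktemp original.txt.XXXXXX)
{ wc -l < original.txt; cat original.txt; } > "$tmp" && mv "$tmp" original.txt

cat original.txt
```

Creating the temp file next to the target keeps mv a rename on the same filesystem.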

You can use sponge from the moreutils package. I like it for that:
cat <(wc -l < orig.txt) orig.txt | sponge orig.txt
If you don't have sponge or cannot install it, you can implement it with awk as a bash function:
function sponge() {
awk -v o="${1}" '{b=NR>1?b""ORS""$0:$0}END{print b > o}'
}
Keep in mind that this will need to store the whole file in memory. Don't use it for very large files.

Related

How to get line by line from file and cut string before # on the bash

I have a file with the following content:
gnome-tweak-tool #(Cinnamon does not support)
gnome-system-monitor
gnome-calculator
gedit
nomacs gimp #(viewer and editor for image)
#aa
#bb
After combining each line with the "pacman" command,
I want output like this:
pacman -S gnome-tweak-tool
pacman -S gnome-system-monitor
pacman -S gnome-calculator
pacman -S gedit
pacman -S nomacs gimp
Can anybody help me?
Thanks!
This awk one-liner will do the job:
awk -F'#.*' '$1&&$0="pacman -S "$1' file
Note that pacman accepts multiple packages, like:
pacman -S package1 package2 package3...
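Since pacman accepts several packages at once, the list could also be collapsed into a single command line. This is only a sketch: the scratch file path is an assumption, and echo is used so the command is printed rather than run.

```bash
workdir=$(mktemp -d)
cat > "$workdir/file" <<'EOF'
gnome-tweak-tool #(Cinnamon does not support)
gnome-system-monitor
gnome-calculator
gedit
nomacs gimp #(viewer and editor for image)
#aa
#bb
EOF

# Strip comments, then let xargs join the remaining words onto one line.
cmd=$(sed 's/#.*//' "$workdir/file" | xargs echo pacman -S)
printf '%s\n' "$cmd"
```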
Try this. This will output to stdout:
awk -F# '{print $1}' file | awk 'NF' | sed -e 's/^/pacman -S /'
This will output to a new file:
awk -F# '{print $1}' file | awk 'NF' | sed -e 's/^/pacman -S /' > file2
Try this
sed -e 's/#.*$//' test.in | awk '{if (NF != 0) {printf("pacman -S %s\n", $0)} else {print $0}}'
Hope this helps
First, remove the comments (everything from # to the end of the line) and delete the lines that are left empty:
sed 's/#.*//g;/^$/d' file1.txt > tmp.txt
Now prepend pacman -S to each line:
awk '{print "pacman -S " $0;}' tmp.txt > file1.txt
Hope this will help you
sed only:
sed -n 's/#.*//; /./ s/^/pacman -S /p' inputfile
Explanation:
sed -n Do not print unless explicitly requested
s/#.*// Remove everything from '#' to the end of the line
/./ ... /p Only print lines with at least one character left
s/^/pacman -S / Insert "pacman -S " at the start of the line
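Note that a line such as "gnome-tweak-tool #(...)" keeps the space that preceded the comment. A hedged variant of the sed-only command that also trims that whitespace, run against the sample input in an assumed scratch directory:

```bash
workdir=$(mktemp -d)
cat > "$workdir/inputfile" <<'EOF'
gnome-tweak-tool #(Cinnamon does not support)
gnome-system-monitor
gnome-calculator
gedit
nomacs gimp #(viewer and editor for image)
#aa
#bb
EOF

# [[:space:]]* also removes blanks in front of the comment marker.
sed -n 's/[[:space:]]*#.*//; /./ s/^/pacman -S /p' "$workdir/inputfile" > "$workdir/out"
cat "$workdir/out"
```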
To export the list of explicitly installed native packages:
pacman -Qqen > pkglist.txt
To install:
pacman -S - < pkglist.txt
From ArchWiki: https://wiki.archlinux.org/index.php/Pacman/Tips_and_tricks#List_of_installed_packages

awk command has different behaviors when executing the exact same code. Why?

I have created a small shell script that receives a list of values such as "MY_VAR_NAME=var_value MY_VAR_NAME2=value2 ...", separated by spaces only. It should also accept empty values such as MY_VAR_NAME='' or MY_VAR_NAME= (nothing).
These values are then used to change values inside an environment variables file. For example, MY_VAR_NAME=var_value makes the script change the value of MY_VAR_NAME inside the .env file to var_value, without changing anything else in the file.
The env file has the following configuration:
NODE_ENV=development
APP_PATH=/media
BASE_URL=http://localhost:3000
ASSETS_PATH=http://localhost:3000
USE_CDN=false
APP_PORT=3000
WEBPACK_PORT=8080
IS_CONNECTED_TO_BACKEND=false
SHOULD_BUILD=false
USE_REDUX_TOOL=false
USE_LOG_OUTPUT_AS_JSON=false
ACCESS_KEY_ID=
SECRET_ACCESS_KEY=
BUCKET_NAME=
BASE_PATH=
MIX_PANEL_KEY=
RDSTATION_KEY=
RESOURCE_KEY=
SHOULD_ENABLE_INTERCOM=false
SHOULD_ENABLE_GTM=false
SHOULD_ENABLE_UTA=false
SHOULD_ENABLE_WOOTRIC=false
I have debugged my script and found that this is the point where it sometimes fails:
cat .envtemp | awk -v var_value="$VAR_VALUE" \
-v var_name="$VAR_NAME" \
-F '=' '$0 !~ var_name {print $0} $0 ~ var_name {print $1"="var_value}' | tee .envtemp
This piece of code sometimes writes the proper result to .envtemp, and sometimes writes nothing, leaving .envtemp empty.
The complete code I am using is the following:
function change_value(){
VAR_NAME=$1
VAR_VALUE=$2
cat .envtemp | awk -v var_value="$VAR_VALUE" \
-v var_name="$VAR_NAME" \
-F '=' '$0 !~ var_name {print $0} $0 ~ var_name {print $1"="var_value}' | tee .envtemp
ls -l -a .env*
}
function manage_env(){
for VAR in "$@"
do
var_name=`echo $VAR | awk -F '=' '{print $1}'`
var_value=`echo $VAR | awk -F '=' '{print $2}'`
change_value $var_name $var_value
done
}
function main(){
manage_env "$@"
cat .envtemp > .env
exit 0
}
main "$@"
Here is an example script for recreating the error. It does not happen every time, and when it happens, it is not always with the same input.
#!/bin/bash
ENV_MANAGER_INPUT="NODE_ENV=production BASE_URL=http://qa.arquivei.com.br ASSETS_PATH=https://d4m6agb781hapn.cloudfront.net USE_CDN=true WEBPACK_PORT= IS_CONNECTED_TO_BACKEND=true ACCESS_KEY_ID= SECRET_ACCESS_KEY= BUCKET_NAME=frontend-assets-dev BASE_PATH=qa"
cp .env.dist .env
#Removes comment lines. The script needs a .envtemp file.
cat .env.dist | grep -v '#' | grep -v '^$' > .envtemp
./jenkins_env_manager.sh ${ENV_MANAGER_INPUT}
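Have you tried using two files? In your pipeline, tee can truncate .envtemp before cat has finished reading it, which is why the file sometimes ends up empty: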
mv .envtemp .envtemp.tmp
cat .envtemp.tmp | awk ... | tee .envtemp
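A minimal sketch of change_value rewritten so that it never reads and writes .envtemp in the same pipeline. The sample file content, the temp-file name, and the exact match on the first field ($1 == name, instead of the regex match in the question) are assumptions made for illustration:

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'NODE_ENV=development' 'USE_CDN=false' 'WEBPACK_PORT=8080' > .envtemp

change_value() {
  local name=$1 value=$2 tmp
  tmp=$(mktemp .envtemp.XXXXXX) || return 1
  # Read from .envtemp, write to the temp file, and only then replace .envtemp.
  awk -F '=' -v name="$name" -v value="$value" \
    '$1 == name { print name "=" value; next } { print }' .envtemp > "$tmp" &&
    mv "$tmp" .envtemp
}

change_value NODE_ENV production
change_value WEBPACK_PORT ''
cat .envtemp
```

Matching on the whole first field also avoids changing USE_CDN when the name passed in is, say, CDN.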

bash pipe and printing with multiple filter

I was wondering if something like this exists:
tail -f file1 | grep "hello" > fileHello | grep "bye" > fileBye | grep "etc" > fileEtc
echo b1bla >> file1
echo b2hello >> file1
echo b3bye >> file1
echo b4hellobye >> file1
echo b5etc >> file1
echo b6byeetc >> file1
That would produce this result:
file1:
b1bla
b2hello
b3bye
b4hellobye
b5etc
b6byeetc
fileHello:
b2hello
b4hellobye
fileBye:
b3bye
b4hellobye
b6byeetc
fileEtc:
b5etc
b6byeetc
Thanks!
Use tee with process substitution:
tail -f file1 | tee >(exec grep "hello" > fileHello) >(exec grep "bye" > fileBye) | grep "etc" > fileEtc
This works, but be aware that piping tail -f is likely to cause some unexpected buffering issues.
tail -f file1 |
awk '/hello/ { print > "fileHello"}
/bye/ { print > "fileBye"}
/etc/ { print > "fileEtc"}'
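Because awk buffers output written to files, the three files may lag behind when the input comes from tail -f. A hedged sketch that flushes after every write; it feeds the sample lines through printf instead of tail -f so that it terminates, and the scratch directory is an assumption:

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same filters as above, with fflush() so each file is updated immediately.
printf '%s\n' b1bla b2hello b3bye b4hellobye b5etc b6byeetc |
awk '/hello/ { print > "fileHello"; fflush("fileHello") }
     /bye/   { print > "fileBye";   fflush("fileBye") }
     /etc/   { print > "fileEtc";   fflush("fileEtc") }'

cat fileHello fileBye fileEtc
```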

tab delimit a file in bash

I have two files. I would like to join them by column and convert them from tab-delimited to space-delimited.
What is needed on top of
paste fileA fileB
to make that work?
With awk:
awk 'FNR==NR{a[FNR]=$1; next} {print a[FNR]"\t"$2}' file1 file2
Example:
$ cat m
cat
dog
$ cat r
foo bar
bar foo
$ awk 'FNR==NR{a[FNR]=$1; next} {print a[FNR]"\t"$2}' m r
cat bar
dog foo
In pure bash, something like this should work (untested, but you should be able to fix any bugs):
exec 3<file1
exec 4<file2
while :; do
read -r -u 3 f1_w || exit
read -r -u 4 f2_w1 f2_w2 || exit 1
echo -e "${f1_w}\t${f2_w2}"
done
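If the goal is simply to keep every column and turn all tabs into spaces, as the question body describes, piping paste through tr may be enough. A minimal sketch with made-up file contents:

```bash
workdir=$(mktemp -d)
printf 'cat\ndog\n' > "$workdir/fileA"
printf 'foo\tbar\nbar\tfoo\n' > "$workdir/fileB"

# paste joins the files with tabs; tr then converts every tab,
# including those already inside fileB, to a single space.
paste "$workdir/fileA" "$workdir/fileB" | tr '\t' ' ' > "$workdir/joined"
cat "$workdir/joined"
```

Using paste -d ' ' alone would only change the delimiter between the two files and leave the tabs inside each file untouched.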

Add prefix to every line in text in bash

Suppose there is a text file a.txt e.g.
aaa
bbb
ccc
ddd
I need to add a prefix (e.g. myprefix_) to every line in the file:
myprefix_aaa
myprefix_bbb
myprefix_ccc
myprefix_ddd
I can do that with awk:
awk '{print "myprefix_" $0}' a.txt
Now I wonder if there is another way to do that in shell.
With sed:
$ sed 's/^/myprefix_/' a.txt
myprefix_aaa
myprefix_bbb
myprefix_ccc
myprefix_ddd
This replaces the beginning of every line (^) with myprefix_. Because ^ is a zero-width anchor, nothing is removed from the line; the prefix is simply inserted at the start.
You can make your awk version shorter with:
$ awk '$0="myprefix_"$0' a.txt
myprefix_aaa
myprefix_bbb
myprefix_ccc
myprefix_ddd
or passing the value:
$ prefix="myprefix_"
$ awk -v prefix="$prefix" '$0=prefix$0' a.txt
myprefix_aaa
myprefix_bbb
myprefix_ccc
myprefix_ddd
It can also be done with nl:
$ nl -s "prefix_" a.txt | cut -c7-
prefix_aaa
prefix_bbb
prefix_ccc
prefix_ddd
Finally: as John Zwinck explains, you can also do:
paste -d '' <(yes prefix_) a.txt | head -n $(wc -l < a.txt)
on OS X:
paste -d '\0' <(yes prefix_) a.txt | head -n $(wc -l < a.txt)
Pure bash:
while read line
do
echo "prefix_$line"
done < a.txt
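A plain read strips leading and trailing whitespace and interprets backslashes, so lines can come out altered. A slightly more defensive sketch of the same loop, with assumed sample input containing such lines:

```bash
workdir=$(mktemp -d)
printf '%s\n' '  aaa' 'b\tb' 'ccc' > "$workdir/a.txt"

# IFS= keeps surrounding whitespace, -r keeps backslashes literal,
# and printf avoids echo's handling of options and escapes.
while IFS= read -r line
do
  printf 'prefix_%s\n' "$line"
done < "$workdir/a.txt" > "$workdir/out.txt"

cat "$workdir/out.txt"
```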
For reference, here is a speed comparison of the awk, sed, and bash solutions to this question.
Generate an 800K input file in bash:
line="12345678901234567890123456789012345678901234567890123456789012345678901234567890"
rm a.txt
for i in {1..10000} ; do
echo $line >> a.txt
done
Then consider the bash script timeIt:
if [ -e b.txt ] ; then
rm b.txt
fi
echo "Bash:"
time bashtest
rm b.txt
echo
echo "Awk:"
time awktest
rm b.txt
echo
echo "Sed:"
time sedtest
where bashtest is:
while read line
do
echo "prefix_$line" >> b.txt
done < a.txt
awktest is:
awk '$0="myprefix_"$0' a.txt > b.txt
and sedtest is:
sed 's/^/myprefix_/' a.txt > b.txt
I got the following result on my machine:
Bash:
real 0m0.401s
user 0m0.340s
sys 0m0.048s
Awk:
real 0m0.009s
user 0m0.000s
sys 0m0.004s
Sed:
real 0m0.009s
user 0m0.000s
sys 0m0.004s
The bash solution is much slower: about 0.4 s, versus roughly 0.01 s each for awk and sed.
You can also use the xargs utility:
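cat file | xargs -d "\n" -I{} echo "myprefix_{}"
The -d option (GNU xargs) splits the input on newlines only, so blanks and quotes inside a line are preserved. -I{} places each line directly after the prefix, whereas -L1 echo myprefix_ would print a space between the prefix and the line.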
