awk & sed split file - bash

if I have a file test.txt:
example 1 content 2013-3-8:
hello java
example 2 content 2013-4-9:
hello c
How can I use awk or sed to separate test.txt into two files:
test1
hello java
test2
hello c
I use the command below:
awk '/example/{i++}{print > "test"i}' test.txt
but it keeps the header line (example xxx) in each output file. Can I add something to the awk command to drop that line?

You almost have it:
awk '/^example/ { i++; next } { print > ("test" i) }' test.txt
The next statement makes awk skip the remaining rules for the current line, so the header line is never printed.

You can use getline to skip the first line. The following should give the desired output:
awk '/example/{getline; i++}{print > "test"i}' test.txt

Some weird way of doing this with sed:
sh <<< $(sed '/example/{N;s/\n//;s/example \([0-9]*\).*:\(.*\)/echo "\2" >> test\1;/}' input)

This might work for you (GNU sed):
sed -ne '2~4w test1.txt' -e '4~4w test2.txt' test0.txt

You could try something like :
awk 'BEGIN {i=0; j=0} /example/{i++; j=0} (j != 0){print > "test"i} {j++}' test.txt

sed -n "
/example 1/ {N;s/^.*\n//
w test1.txt
}
/example 2/ {N;s/^.*\n//
w test2.txt
}" test.txt
If you define a delimiter between sections (a fixed size or a marker), each output file can hold more than one line of text.

To complete the response from Alok Singhal: if you hit the "too many open files" limit on Linux, you have to close each output file as you go.
awk '/^example/ {close("test" i); i++; next } { print >"test" i}'
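A minimal end-to-end sketch of the approach above, assuming the sample test.txt from the question. The scratch directory and the out1/out2 variable names are only for illustration, and the `if (i)` guard is an addition that skips the close before any file has been opened.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)

# Recreate the sample input from the question.
cat > "$workdir/test.txt" <<'EOF'
example 1 content 2013-3-8:
hello java
example 2 content 2013-4-9:
hello c
EOF

(
  cd "$workdir" || exit 1
  # On each header line: close the previous output file, bump the counter,
  # and skip the header itself. Every other line goes to the current file.
  awk '/^example/ { if (i) close("test" i); i++; next }
       { print > ("test" i) }' test.txt
)

out1=$(cat "$workdir/test1")
out2=$(cat "$workdir/test2")
echo "test1: $out1"
echo "test2: $out2"

rm -rf "$workdir"
```

With the sample input, test1 ends up holding "hello java" and test2 holding "hello c".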

Related

head and grep simultaneously

Is there a unix one liner to do this?
head -n 3 test.txt > out_dir/test.head.txt
grep hello test.txt > out_dir/test.tmp.txt
cat out_dir/test.head.txt out_dir/test.tmp.txt > out_dir/test.hello.txt
rm out_dir/test.head.txt out_dir/test.tmp.txt
I.e., I want to get the header and some grep lines from a given file, simultaneously.
Use awk:
awk 'NR<=3 || /hello/' test.txt > out_dir/test.hello.txt
You can say:
{ head -n 3 test.txt ; grep hello test.txt ; } > out_dir/test.hello.txt
Try using sed
sed -n '1,3p; /hello/p' test.txt > out_dir/test.hello.txt
The awk solution is the best, but I'll add a sed solution for completeness:
$ sed -n -e '1,3p' -e '4,$s/hello/hello/p' test.txt > $output_file
The -n tells sed not to print a line unless told to. Each -e introduces a command: '1,3p' prints the first three lines, and '4,$s/hello/hello/p' looks at every line from line 4 onward that contains the word hello and substitutes hello back in. The p on the end prints the lines the substitution operated on.
There should be a way of using 4,$g/HELLO/p, but I couldn't get it to work. It's been a long time since I really messed with sed.
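The form this answer was reaching for is probably a command block nested under an address range. A sketch, assuming a hypothetical six-line sample file in which hello also occurs inside the header:

```shell
#!/bin/sh
# Hypothetical sample: three header lines, then a mix of matching and
# non-matching lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
header one
hello in header
header three
skip me
hello world
say hello again
EOF

# 1,3p            print the first three lines unconditionally
# 4,${/hello/p;}  from line 4 to the end, print only lines matching /hello/
result=$(sed -n -e '1,3p' -e '4,${/hello/p;}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Because the pattern is only tested from line 4 onward, a header line that happens to contain hello is printed once rather than twice.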
Of course, I would go awk but here is an ed solution for the pre-vi nostalgics:
ed test.txt <<%
4,$ v/hello/d
w test.hello.txt
%

Using sed to insert file content

I'm trying to insert a file content before a given pattern
Here is my code:
sed -i "" "/pattern/ {
i\\
r $scriptPath/adapters/default/permissions.xml
}" "$manifestFile"
It adds the path instead of the content of the file.
Any ideas ?
In order to insert text before a pattern, you need to swap the pattern space into the hold space before reading in the file. For example:
sed "/pattern/ {
h
r $scriptPath/adapters/default/permissions.xml
g
N
}" "$manifestFile"
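A small worked example of the r plus N part of this trick, using hypothetical sample files (main.txt, insert.txt) in a scratch directory. It relies on sed writing out the file queued by r just before N fetches the next input line, so the inserted text lands ahead of the matching line. This is only a sketch and does not try to handle a match on the very last line of the file.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical input files for the demonstration.
printf '%s\n' 'some stuff' 'pattern' 'some other stuff' > "$workdir/main.txt"
printf '%s\n' 'inserted line' > "$workdir/insert.txt"

# r queues insert.txt for output; N then fetches the next input line,
# which flushes the queued file before the pattern space is printed.
result=$(sed "/pattern/ {
r $workdir/insert.txt
N
}" "$workdir/main.txt")
printf '%s\n' "$result"

rm -rf "$workdir"
```

One side effect worth knowing: the line pulled in by N is not itself tested against /pattern/, so two consecutive matching lines get only one insertion.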
Just remove i\\.
Example:
$ cat 1.txt
abc
pattern
def
$ echo hello > 2.txt
$ sed -i '/pattern/r 2.txt' 1.txt
$ cat 1.txt
abc
pattern
hello
def
I tried Todd's answer and it works great,
but I found that the "h" and "g" commands can be omitted.
Thanks to this faq (found from #vscharf's comments), Todd's answer can be reduced to this one-liner:
sed -i -e "/pattern/ {r $file" -e 'N}' "$manifestFile"
Edit:
If you need a here-doc version, please check this.
I got something like this using awk. Looks ugly but did the trick in my test:
command:
awk '
/pattern/ {
line = $0;
while ((getline < "insert.txt") > 0) {print};
close("insert.txt");
print line;
next
}
{print}' test.txt
test.txt:
$ cat test.txt
some stuff
pattern
some other stuff
insert.txt:
$ cat insert.txt
this is inserted file
this is inserted file
output:
some stuff
this is inserted file
this is inserted file
pattern
some other stuff
CodeGnome's solution doesn't work if the pattern is on the last line,
so I used three commands: insert a marker line, read the file in after the marker, then delete the marker line.
sed -i '/pattern/ i\
INSERTION_MARKER
' "$manifestFile"
sed -i "/INSERTION_MARKER/r $scriptPath/adapters/default/permissions.xml" "$manifestFile"
sed -i '/INSERTION_MARKER/d' "$manifestFile"

Explode to Array

I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns I want and append them to a new file
It works but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array. Using command line arguments doesn't seem like the way to go. ANY COMMENTS ARE WELCOME.
# Takes a :: separated file as the 1st parameter
SOURCE=$1
# Create the csv target file
TARGET=${SOURCE/dat/csv}
touch "$TARGET"
# Quote the header: an unquoted # would start a comment
echo "#userId,itemId" > "$TARGET"
IFS=","
while read -r LINE
do
# Replaces all matches of :: with a ,
CSV_LINE=${LINE//::/,}
# Unquoted on purpose: IFS="," splits the line into $1, $2, ...
set -- $CSV_LINE
echo "$1,$2" >> "$TARGET"
done < "$SOURCE"
Instead of set, you can use an array:
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
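If the goal is only to pick columns, read itself can do the splitting, which avoids both set -- and the intermediate array. A POSIX sh sketch, assuming hypothetical userId::itemId::rating sample rows; the variable names skip and rest are placeholders, not anything from the original script:

```shell
#!/bin/sh
# Hypothetical sample data in a userId::itemId::rating layout.
source_file=$(mktemp)
printf '%s\n' '1::101::5' '2::202::3' > "$source_file"

result=$(
  echo '#userId,itemId'
  # With IFS=: every single colon is a separator, so "::" yields an empty
  # field between the real ones; "skip" absorbs it and "rest" takes
  # whatever remains of the line.
  while IFS=: read -r user skip item rest
  do
    echo "$user,$item"
  done < "$source_file"
)
printf '%s\n' "$result"

rm -f "$source_file"
```

Setting IFS only on the read command keeps the change local to that command, so the rest of the script is not affected.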
The following prints columns 1 and 2 from infile.dat. Replace $1, $2 with
a comma-separated list of the numbered columns you do want.
awk 'BEGIN { FS = "::"; OFS = "," } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a 1 liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of awk and sed:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, -v OFS=, '{print $1, $2}'
# Or to avoid a UUOC award (and prolong the life of your keyboard by 3 characters)
sed -e 's/::/,/g' inputfile | awk -F, -v OFS=, '{print $1, $2}'
awk is indeed the right tool for the job here, it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$

Insert a file at the beginning of a file possibly using sed

I think we already have a similar post about using sed to add "text" at the beginning of a file
Say: sed -i '1i text' inputfile
But my question here is different: my text has many lines, so I put them in a file (file1), and I want to insert the contents of file1 at the beginning of file2.
How can I do that using sed, or some other approach?
thx
edit:
Sorry I'm myself complicating this question!
This is an idiot question because we can simply do by "cat"! :)
I'm an idiot
How about doing
cat file1 file2
(Well, this is not "inplace" editing, though, you probably need to use a temp file or a buffer.)
Notice that in some shells (zsh with its default MULTIOS option, for example), you will also be able to do
command < file1 < file2
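The temp-file route mentioned above can be sketched as follows. The file names file1 and file2 come from the question; the scratch directory and sample contents are assumptions made for the demonstration.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical contents: file1 is the text to prepend, file2 the target.
printf '%s\n' 'header 1' 'header 2' > "$workdir/file1"
printf '%s\n' 'body' > "$workdir/file2"

# Write the concatenation to a temporary file in the same directory,
# then rename it over file2 only if cat succeeded.
tmp=$(mktemp "$workdir/file2.XXXXXX")
cat "$workdir/file1" "$workdir/file2" > "$tmp" && mv "$tmp" "$workdir/file2"

result=$(cat "$workdir/file2")
printf '%s\n' "$result"

rm -rf "$workdir"
```

Redirecting straight back into file2 (cat file1 file2 > file2) would truncate file2 before cat reads it, which is why the intermediate file is needed.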
Using awk:
awk 'BEGIN { while ((getline tmp < "TEMPLATE" ) > 0) { print tmp }
close("TEMPLATE")}
{ print }' ORIGFILE > NEWFILE && mv NEWFILE ORIGFILE
Using vim:
vim -c "0read TEMPLATE" -c "wq" FILE
This might work for you (as an exercise as cat is the obvious choice):
sed '1{h;r file1'$'\n'';d};2{H;g}' file2

Join lines based on pattern

I have the following file:
test
1
My
2
Hi
3
I need a way to use cat, grep or awk to produce the following output:
test1
My2
Hi3
How can I achieve this in a single command? Something like
cat file.txt | grep ... | awk ...
Note that it's always a string followed by a number in the original text file.
sed 'N;s/\n//' file.txt
This should give the desired output when the content is in file.txt
paste -d "" - - < filename
This takes consecutive lines and pastes them together delimited by the empty string.
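A quick check of the paste approach on the question's sample data, fed through a pipe instead of a file. The sketch spells the empty delimiter as '\0', which is the form POSIX defines for paste; that is a substitution for the "" used above, not the answer's exact command.

```shell
#!/bin/sh
# Each "-" makes paste read one line from stdin, so two dashes pair up
# consecutive lines. '\0' stands for an empty delimiter.
result=$(printf '%s\n' test 1 My 2 Hi 3 | paste -d '\0' - -)
printf '%s\n' "$result"
```

Adding a third dash would join lines in groups of three instead of two.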
awk '{printf("%s", $0);} !(NR%2){printf("\n");}' file.txt
EDIT: I just noticed that your question requires the use of cat and grep. Both of those programs are unnecessary to achieve your stated aims. If you have some reason for including them that you haven't mentioned, try this (uselessly inefficient) version of the line I wrote immediately above:
cat file.txt | grep '^' | awk '{printf("%s", $0);} !(NR%2){printf("\n");}'
It is possible that this command uses features not present in the original awk program. You may need to invoke the new awk program, nawk instead.
If your input file always alternates one string and one number, and you only want the strings, all you have to do is take every other line.
If you only want the odd lines, you can do awk 'NR % 2' file.txt
If you want the evens, this becomes awk 'NR % 2==0' file.txt
Here is the answer:
awk 'BEGIN { lno = 0 } { if (lno % 2 == 1) {printf "%s\n", $0} else {printf "%s", $0}; ++lno }' file.txt

Resources