How can I use sed, awk, or bash to most succinctly convert file format A to B below?
A
1
blabla
2
another blabla
... (more omitted)
10
yet another blabla
...
100
final blabla
B
1 blabla
2 another blabla
...
10 yet another blabla
...
100 final blabla
There are many different ways; here is one using paste, where each - makes paste read a line from stdin, so two of them join consecutive pairs of lines:
$ cat ip.txt
1
blabla
2
another blabla
10
yet another blabla
100
final blabla
$ paste - - < ip.txt
1 blabla
2 another blabla
10 yet another blabla
100 final blabla
See "How to process a multi column text file to get another multi column text file?" for many more methods.
In one bash line:
while read -r line1; do if read -r line2; then echo "$line1" "$line2"; fi; done < file.txt
Use pr in Bash (-2 prints two columns, -a fills them across, -s\  sets a single-space separator, and -t suppresses headers):
$ pr -2 -a -s\ -t foo2
1 blabla
2 another blabla
10 yet another blabla
100 final blabla
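Since the question also asks about sed and awk, here are equivalent one-liners (a minimal sketch; both assume the lines always come in number/text pairs):
$ sed 'N;s/\n/ /' ip.txt
$ awk 'NR%2{printf "%s ",$0;next} 1' ip.txt
sed's N appends the next input line to the pattern space and the substitution turns the embedded newline into a space; the awk version prints odd-numbered lines without a newline and lets the 1 rule print the even-numbered lines after them.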
I have written a simple script that takes data from a text file (space-separated columns, 1.5 million rows) and writes the specified column to an output file. But this code takes more than an hour to execute. Can anyone help me optimize the runtime?
a=0
cat 1c_input.txt/$1 | while read p
do
    IFS=" "
    for i in $p
    do
        a=`expr $a + 1`
        if [ $a -eq $2 ]
        then
            echo "$i"
        fi
    done
    a=0
done >> ./1.c.$2.column.freq
some lines of sample input:
1 ib Jim 34
1 cr JoHn 24
1 ut MaRY 46
2 ti Jim 41
2 ye john 6
2 wf JoHn 22
3 ye jOE 42
3 hx jiM 21
some lines of sample output if the second argument entered is 3:
Jim
JoHn
MaRY
Jim
john
JoHn
jOE
jiM
I guess you are trying to print just one column; in that case, do something like:
#! /bin/bash
# -v c="$2" passes the shell's second argument in as the awk variable c;
# $c inside awk then refers to field number c.
awk -v c="$2" '{print $c}' 1c_input.txt/$1 >> ./1.c.$2.column.freq
The loop in the question is slow mainly because expr runs as an external process for every field of every line. If you just want something faster, use a utility like cut. To extract the third field from a single-space-delimited file bigfile, do:
cut -d ' ' -f 3 bigfile
To optimize the shell code in the question, using only builtin shell
commands, do something like:
while read -r a b c d; do echo "$c"; done < bigfile
If the field to be printed is a command-line parameter, there are several shell-builtin methods, but they're all based on that line.
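For example, a minimal sketch that takes the file and the field number as arguments, using only builtins (the script name column.sh is hypothetical):
#!/bin/bash
# Usage: ./column.sh bigfile 3
field=$2
while read -r -a cols; do
    echo "${cols[field-1]}"    # bash array indices are zero-based
done < "$1"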
I have a file delimited by pipes. I am not sure which bash tool would be most appropriate (I am thinking either awk or sed) to find, on each line, the number nearest to the one before the pipe.
My file looks like this:
2|1 1 4 5
8|1 2 2 3 10 14
5|1 50 100
and I would like to get the output:
1
10
1
Explanation: in the first row, the nearest number to 2 in {1 1 4 5} is 1. In the same way, for the second row the nearest to 8 in {1 2 2 3 10 14} is 10, and for the third row the nearest to 5 is 1.
$ awk -F'[| ]' '{
    sq=($2-$1)*($2-$1); a=2          # squared distance of the first candidate
    for(i=3;i<=NF;i++){
        sqi=($i-$1)*($i-$1)          # squared distance of field i
        if(sqi<=sq){sq=sqi;a=i}      # remember the closer field
    } print $a
}' file
1
10
1
Given:
$ echo "$ns"
2|1 1 4 5
8|1 2 2 3 10 14
5|1 50 100
It is easy in Ruby:
$ echo "$ns" | ruby -lne 'a=$_.split(/[| \t]/)
a.map!{|e| Integer(e)}
n=a.shift
p a.min_by {|e| (e-n).abs}'
1
10
1
It could be done similarly in gawk by defining a custom sort order based on the distance to the first value, sorting, and taking the first element, as sketched below.
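A minimal sketch of that idea (it assumes GNU awk, since PROCINFO["sorted_in"] with a user-defined comparison function is a gawk extension):
gawk -F'[| ]' '
function by_dist(i1, v1, i2, v2) {       # negative when v1 is closer to t
    return (v1 - t) * (v1 - t) - (v2 - t) * (v2 - t)
}
{
    t = $1
    delete vals
    for (i = 2; i <= NF; i++) vals[i] = $i
    PROCINFO["sorted_in"] = "by_dist"    # traverse values nearest-first
    for (i in vals) { print vals[i]; break }
}' file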
This is a way of doing it with awk:
awk -F"[ \t|]" '{
n=$2;m=($1-$2)*($1-$2)
for(i=3;i<=NF;i++){
d=($1-$i)*($1-$i)
if(d<m){n=$i;m=d}
} print n
}' input
I'm not a native speaker, so the best way to explain is to give an example of what I have to do.
name1: 15
name2: 20
name1: 8
name3: 30
This is a short example of the output I get when grepping from a file.
Now I'm not sure how to handle summing those numbers, so that the final result is:
name1: 23
name2: 20
name3: 30
There are several ways to solve this, and the only one I currently see involves arrays, which I was told is not the best approach in Bash.
Thank you for your help and sorry if the question has been asked before.
awk 'NF{a[$1]+=$NF} END{for(i in a)print i, a[i]}' File
This works for all non-empty lines: the NF pattern skips blank lines, a[$1] accumulates a total per name (the first field, colon included), and $NF is the number in the last field.
Example:
$ cat File
name1: 15
name2: 20
name1: 8
name3: 30
Output:
$ awk 'NF{a[$1]+=$NF} END{for(i in a)print i, a[i]}' File
name1: 23
name2: 20
name3: 30
Apologies for the previous answer; I didn't read your question properly (not enough coffee yet).
This might do what you want.
declare -A group_totals
while read -r group value ; do
    # ${...:-0} defaults missing keys to 0; the subscript needs $group, not the literal word "group"
    group_totals[$group]=$(( ${group_totals[$group]:-0} + value ))
done < <(grep command_here input_file)

for group in "${!group_totals[@]}" ; do
    # $group already ends in a colon (e.g. "name1:"), so don't add another
    echo "$group ${group_totals[$group]}"
done
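Note that bash does not guarantee any iteration order for associative arrays, so if the groups should come out in a stable order, pipe the loop's output through sort:
for group in "${!group_totals[@]}" ; do
    echo "$group ${group_totals[$group]}"
done | sort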
Is it even possible? I currently have a one-liner that counts the occurrences of each word in a file. If I output what I currently have, it looks like this:
3 abcdef
3 abcd
3 fec
2 abc
This is all done in one line without loops, and I was wondering if I could add a column with the length of each word. I was thinking I could use wc -m to count the characters, but I don't know if I can do that without a loop.
As seen in the title: no awk, sed, or perl. Just good old bash.
What I want:
3 abcdef 6
3 abcd 4
3 fec 3
2 abc 3
The last column is the length of each word.
while read -r num word; do
    printf '%s %s %s\n' "$num" "$word" "${#word}"    # ${#word} expands to the length of $word
done < file
You can also do something like this:
File
> cat test.txt
3 abcdef
3 abcd
3 fec
2 abc
Bash script
> cat test.txt.sh
#!/bin/bash
while read -r line; do
    items=($line)          # split the line into an array
    strlen=${#items[1]}    # get the 2nd item's length
    echo "$line" "$strlen" # print the line and the length
done < test.txt
Results
> bash test.txt.sh
3 abcdef 6
3 abcd 4
3 fec 3
2 abc 3
I need to filter data which resides in a txt file. The sample data is as follows:
======
Jhon
Doe
score -
------
======
Ann
Smith
score +
------
======
Will
Marrow
score -
------
And I need to extract only the sections where score + is defined, so the result should be:
======
Ann
Smith
score +
------
I would try this one:
$ grep -B3 -A1 "score +" myfile
It means... grep three lines Before and one line After "score +". (If more than one section matches, grep separates the context groups with a -- line.)
Sed can do it as follows:
sed -n '/^======/{:a;N;/\n------/!ba;/score +/p}' infile
======
Ann
Smith
score +
------
where -n prevents printing, and
/^======/ { # If the pattern space starts with "======"
:a # Label to branch to
N # Append next line to pattern space
/\n------/!ba # If we don't match "------", branch to :a
/score +/p # If we match "score +", print the pattern space
}
Things could be more properly anchored with /\n------$/, but there are spaces at the end of the lines, and I'm not sure whether those are real or copy-paste artefacts; in any case, this works for the example data.
Give this one-liner a try:
awk -v RS="==*" -F'\n' '{p=0;for(i=1;i<=NF;i++)if($i~/score \+/)p=1}p' file
with the given data, it outputs:
Ann
Smith
score +
------
The idea is to treat all lines delimited by ====... as one multi-line record, check whether the record contains the search pattern, and print it if so. (A regular-expression RS like this needs an awk with multi-character RS support, such as GNU awk.)
With GNU awk for multi-char RS:
$ awk -v RS='=+\n' '/score \+/' file
Ann
Smith
score +
------
Given:
$ echo "$txt"
======
Jhon
Doe
score -
------
======
Ann
Smith
score +
------
======
Will
Marrow
score -
------
You can create a toggle-type match in awk to print only the section that you want:
$ echo "$txt" | awk '/^=+/{f=1;s=$0;next} /^score \+/{f=2} f {s=s"\n"$0} /^-+$/ {if(f==2) {print s} f=0}'
======
Ann
Smith
score +
------
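Broken out rule by rule (the same program, just annotated):
/^=+/       { f=1; s=$0; next }         # new section: set the flag, start collecting
/^score \+/ { f=2 }                     # mark this section for printing
f           { s=s "\n" $0 }             # append the current line to the buffer
/^-+$/      { if(f==2) print s; f=0 }   # section end: print if marked, then reset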
Use Grep Context Flags
Assuming you have a truly fixed-format file, you can just use fgrep (or GNU or BSD grep with the speedy --fixed-strings flag) along with the --before-context and --after-context flags. For example:
$ fgrep -A1 -B3 'score +' /tmp/foo
======
Ann
Smith
score +
------
The flags will find your match, and include the three lines before and one line after each match. This gives you the output you're after, but with a lot less complexity than a sed or awk script. YMMV.