I have a text like this (in rows):
A
B
C
D
E
F
and I'd like to swap line B with line D, and line C with line E, obtaining (in rows):
A
D
E
B
C
F
Is there any simple way to do it with bash?
You can use the mapfile builtin to read the entire file into an array of lines, reorder that array however you want, and then write it back out to a file.
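A minimal sketch of that approach, assuming the six-line input lives in a file named `file` (the filename is just for illustration; mapfile needs bash 4+):

```shell
# Slurp the file into an array, one element per line.
mapfile -t lines < file
# Reorder by index: keep first and last, swap the B/C pair with D/E.
reordered=("${lines[0]}" "${lines[3]}" "${lines[4]}" "${lines[1]}" "${lines[2]}" "${lines[5]}")
# Write the reordered lines back out.
printf '%s\n' "${reordered[@]}" > file.new
```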
not sure how to ask this question but an example would surely clarify. Suppose I have this file:
$ cat intoThat
a b
a h
a l
a m
b c
b d
b m
c b
c d
c f
c g
c p
d h
d f
d p
and this list:
$ cat grepThis
a
b
c
d
Now I would like to grep grepThis against intoThat, so I would do this:
$ grep -wf grepThis intoThat
which will give an output like this:
**a b**
a h
a l
a m
**b c**
**b d**
b m
**c b**
**c d**
c f
c g
c p
d h
d f
d p
The asterisks highlight the lines I would like grep to return. These are the lines where both fields match, but how do I tell grep (or awk or whatever) to return only those lines?
Of course it is possible that some lines do not match any pattern, e.g. in the intoThat file I may have some other letters like g, h, l, s, t, etc...
With awk, you could do:
awk 'NR==FNR{ seen[$0]++; next } ($1 in seen && $2 in seen)' grepThis intoThat
a b
b c
b d
c b
c d
NR is set to 1 at the first record awk reads and increments for every subsequent record, across all input files, until all records/lines have been read.
FNR is also set to 1 at the first record, but it counts records within the current file and resets to 1 at the start of each new input file.
So NR == FNR is true only while the first input file is being read, and the block that follows it acts on the first file only.
seen is an associative awk array (you can use a different name if you want), keyed on the whole line $0, whose value counts how many times each line has occurred (this idiom is also commonly used to remove duplicate records in awk).
The next statement skips the rest of the commands, so they actually execute only for the subsequent file(s), not the first.
In the final (....) condition we simply check whether both columns $1 and $2 are present in the array; if so, the line goes to the output.
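As an aside, the seen[] idiom mentioned above is also the classic awk one-liner for removing duplicate lines while preserving their original order:

```shell
# Print a line only the first time it appears: seen[$0]++ evaluates to 0
# (false) on the first occurrence, so !seen[$0]++ is true exactly once per line.
awk '!seen[$0]++' file
```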
I have a big text file, each line containing a sentence.
I want to use grep (or something similar in batch) to find sentences where word b occurs after word a, either immediately or with some word(s) in between.
I don't want grep to return a sentence like this:
f g s b d a
because b is not after a but I want to return a sentence like
f g a d m s b f
because b is after a.
It is OK to return sentences where a is both after and before b:
s a s b s a s
I also don't want sentences with only a or b.
I just want the sentences where b is after a (something can be in the middle).
I can easily do it with Python but I want to use the beauty of bash.
Try this:
grep "a.*b" file
If I have something like:
value is between 1-1000
And if value is within 1-100, output A
within 101-200, output B
within 201-300, output C
within 301-400, output D
within 401-500, output E
else, output F
Can this be done more "efficiently" or better than having if statements for each one?
You could use a mapping between value and output:
outputs = [ A, B, C, D, E, F, F, F, F, F]
output = outputs[(int)((value - 1)/ 100)]
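The same lookup-table idea can be sketched in bash (variable names here are made up): the index (value - 1) / 100 uses integer division, and every slot past the 401-500 band holds F so larger values fall through to the default.

```shell
# Lookup table: index 0 covers 1-100 (A), index 1 covers 101-200 (B), ...,
# indices 5-9 all map to the default F.
outputs=(A B C D E F F F F F)
value=250                               # any value in 1-1000
echo "${outputs[(value - 1) / 100]}"    # prints C (the 201-300 band)
```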
First of all I'd like to thank your community, you have been helping me tremendously over the past couple of months, thanks to your detailed answers and your comments.
However, I came across a snag. I want to compare 2 files containing simulation data. These files are the result of a previous operation which consists in extracting the desired data from 2 output files.
So output-file1 -> sorteddata1
output-file2 -> sorteddata2
Sorteddata1 looks like that
0.200000e-4 a b c d e
0.400000e-4 f g h i j
0.560000e-4 k l m n o
.
.
.
Sorteddata2
2.000000E-5 A
3.600000E-5 B
5.600000E-5 C
.
.
.
And what I would like is this, sorteddata3:
0.200000e-4 a b c d e A
0.400000e-4 f g h i j
0.560000e-4 k l m n o C
.
.
.
So if the number in the first column is the same, add the corresponding value from sorteddata2 in the 7th column of sorteddata1.
I wanted to start from here:
Compare files with awk
But the number format in the first column of each file is different, so I don't get any matches. I really want to use awk for this (personal preference, I kind of like it).
The goal is to plot this using gnuplot, so hopefully a blank in the last column won't be a problem.
Any thoughts on this?
You can use sprintf to make the number stick to the same format:
sprintf(format, expression1, ...)
Return (without printing) the string that printf would have printed
out with the same arguments (see Printf).
Then the logic is the same as in the linked answer, adding an if/else to print either the current line on its own or together with the matched value from the other file.
awk 'NR==FNR {value=sprintf("%e", $1)
a[value]=$2
next
}
{value2=sprintf("%e", $1)
print $0, a[value2]
}' f2 f1
For your given input, it returns:
$ awk 'NR==FNR{value=sprintf("%e", $1); a[value]=$2; next} {value2=sprintf("%e", $1); if (value2 in a) {print $0, a[value2]} else {print}}' f2 f1
0.200000e-4 a b c d e A
0.400000e-4 f g h i j
0.560000e-4 k l m n o C
Note: in the comments you say that the E format gives you an "unterminated string" error. In that case you can replace the E with e in the number itself with sub("E","e",$1). All together:
awk 'NR==FNR{value=sprintf("%e", $1); a[value]=$2; next} {sub("E","e",$1); value2=sprintf("%e", $1); print $0, a[value2] }' f2 f1
I am working through a really complex, long multi-conditional statement to do this and was wondering if anyone knew of a simpler method. I have a multi-column/multi-row list that I am trying to parse. What I need to do is take the first row, which has the "*" in the 5th position, copy all of its entries into the blank spaces on the next few rows, and then discard the original top row. What complicates this a bit is that sometimes the next few rows may not have an empty space in all the other fields (see the bottom half of the original list). If that's the case, I want to take the extra entry (Q1 below) and put it at the end of the row, in a new column.
Original list:
A B C D ***** F G
E1
E2
E3
Q R S T ***** V W
U1
Q1 U2
Final output:
A B C D E1 F G
A B C D E2 F G
A B C D E3 F G
Q R S T U1 V W
Q R S T U2 V W Q1
Thanks in advance for help!
The concise/cryptic one liner:
awk '/[*]/{f=$0;p="[*]+";next}{r=$2?$2:$1;sub(p,r,f);p=r;print $2?f" "$1:f}' file
A B C D E1 F G
A B C D E2 F G
A B C D E3 F G
Q R S T U1 V W
Q R S T U2 V W Q1
Explanation:
/[*]+/ { # If line matches line with pattern to replace
line = $0 # Store line
pat="[*]+" # Store pattern
next # Skip to next line
}
{
if (NF==2) # If the current line has 2 fields
replace = $2 # We want to replace with the second
else # Else
replace = $1 # We want to replace with the first field
sub(pat,replace,line) # Do the substitution
pat=replace # Next time the pattern to replace will have changed
if (NF==2) # If the current line has 2 fields
print line,$1 # Print the line with the replacement and the 1st field
else # Else
print line # Just print the line with the replacement
}
To run the script save it to a file such as script.awk and run awk -f script.awk file.