How can I replace a number with one higher - bash

I have a bash script for macOS, which replaces the last character (a number) on the first line of a document with the number plus one. For example, if the number was 19, it would be replaced with 20 (I know that the code only replaces the last character). Here's the code that I currently have:
#For some reason the -e (not -E) is necessary here
RC01=$(sed -ne "1s/.*\(.\)$/\1/p" ./document)
RC02=${RC01}
let "RC02++"
sed -i '' -n "1s/${RC01}/${RC02}/" ./document
When the last line runs, if the document has a first line with anything on it, it clears the whole document.
What's going wrong, and how can I fix it?

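Your immediate problem is that sed -n suppresses all printing of output lines, so with -i the file is overwritten with nothing.
You can fix this by removing the -n from the final command (anchoring the match to the end of the line is also a good idea):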
RC01=$(sed -n "1s/.*\(.\)$/\1/p" ./document)
RC02=${RC01}
let "RC02++"
sed -i '' "1s/${RC01}\$/${RC02}/" ./document
This still has the problem that you only increment the very last digit; if your number has more than one digit, this will lead to weirdness like 18, 19, 110, 111 rather than 18, 19, 20, 21. To fix that, capture the whole trailing number rather than just its last character:
RC01=$(sed -n "s/\(.*[^0-9]\)\([0-9]*\)$/\2/p;q" ./document)
RC02=${RC01}
let "RC02++"
sed -i '' "1s/${RC01}\$/${RC02}/" ./document
The -i feature of sed is hard to beat for replacing a file in place. If it weren't for that, I would definitely recommend Awk instead. Here's a quick attempt.
awk 'NR == 1 { $NF = 1+$NF }1' ./document >./document.tmp &&
mv ./document.tmp ./document
Perl has the best of both worlds, with sugar on top;
perl -pi -e 's/(\d+)$/1+$1/e if $. == 1' ./document
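If neither Awk nor Perl appeals, the same increment can be sketched with shell parameter expansion alone. The function name bump_first_line is made up for this illustration; it reads the document on standard input, writes the result to standard output, and assumes the number is a non-negative decimal integer at the very end of the first line (leading zeros are dropped rather than preserved).

```shell
# Hypothetical helper (not from the answers above): increment the trailing
# integer on the first line of stdin and copy the rest through unchanged.
bump_first_line() {
  local first num prefix
  IFS= read -r first || [ -n "$first" ] || return 0   # empty input: nothing to do
  num=${first##*[!0-9]}          # trailing run of digits ("" if the line ends in a non-digit)
  if [ -n "$num" ]; then
    prefix=${first%"$num"}       # everything before that run
    # Strip leading zeros so the shell does not read the number as octal.
    while [ "${#num}" -gt 1 ] && [ "${num#0}" != "$num" ]; do num=${num#0}; done
    first=$prefix$((num + 1))
  fi
  printf '%s\n' "$first"
  cat                            # remaining lines, untouched
}

# Usage: bump_first_line < ./document > ./document.tmp && mv ./document.tmp ./document
```

Because the whole run of trailing digits is captured, 19 becomes 20 rather than 110.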


Unix command to cut and check a specific column in a file [duplicate]

I'm a novice using grep/egrep/awk and have not wrapped my head around regular expressions (bonus: a link to an introduction to regex for someone who has zero programming experience would be great).
My question revolves around matching a number range within a flat file. I have values which are ten digits. Telephone numbers...
I'm attempting to match a consecutive range of numbers, for example:
5551212041 through 5551212050 (41, 42, 43, 44, 45, 46, 47, 48, 49, and 50).
I have been using grep to match the first value like this.
grep 555121204[1-9]
The next step is a separate grep for the final value:
grep 5551212050
I believe that I have not found the right way to use a regex to allow one grep.
Try the grep command below, which uses the -P (Perl-compatible regex) option:
grep -P '55512120(?:4[1-9]|50)' file
OR
grep -E '555121204[1-9]|5551212050' file
This prints the lines containing a number in the range 5551212041 to 5551212050.
If you want to print only the number, add the -o option to the above grep command.
grep -oP '55512120(?:4[1-9]|50)' file
Example:
$ cat file
bar foo
5551212040 Don't match
5551212041 Match
5551212050 Match
foo bar
$ grep -P '55512120(?:4[1-9]|50)' file
5551212041 Match
5551212050 Match
For the general case, where the number range is not easy to express as a regex, Awk is probably better, as it has proper support for arithmetic.
awk '(($1 > 123) && ($1 < 1024)) || (($1 > 2048) && ($1 < 65536))' file
This prints the entire matching line; if you only want to print the second field, add { print $2 } etc.
You can learn enough Awk to figure this out on your own with a good tutorial and 30 minutes; see the Stack Overflow awk tag info page for pointers.
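Applied to the telephone-number range from the question, the arithmetic approach might look like the sketch below. The function name in_range is invented for this example, and it assumes the number is the first whitespace-separated field on each line.

```shell
# Hypothetical helper: print lines whose first field is a number in [LO, HI].
in_range() {
  awk -v lo="$1" -v hi="$2" '
    $1 ~ /^[0-9]+$/ && ($1 + 0) >= (lo + 0) && ($1 + 0) <= (hi + 0)
  '
}

# Usage: in_range 5551212041 5551212050 < file
```

Adding 0 forces a numeric comparison, so the bounds are compared as numbers rather than as strings.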

Convert multi-line csv to single line using Linux tools

I have a .csv file that contains double quoted multi-line fields. I need to convert the multi-line cell to a single line. It doesn't show in the sample data but I do not know which fields might be multi-line so any solution will need to check every field. I do know how many columns I'll have. The first line will also need to be skipped. I don't know how much data there will be, so performance isn't a consideration.
I need something that I can run from a bash script on Linux. Preferably using tools such as awk or sed and not actual programming languages.
The data will be processed further with Logstash but it doesn't handle double quoted multi-line fields hence the need to do some pre-processing.
I tried something like this and it kind of works on one row but fails on multiple rows.
sed -e :0 -e '/,.*,.*,.*,.*,/b' -e N -e '1n;N;N;N;s/\n/ /g' -e b0 file.csv
CSV example
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
The output I want is
First name,Last name,Address,ZIP
John,Doe,Country City Street,12345
Jane,Doe,Country City Street,67890
etc.
etc.
First my apologies for getting here 7 months late...
I came across a problem similar to yours today, with multiple fields with multi-line types. I was glad to find your question but at least for my case I have the complexity that, as more than one field is conflicting, quotes might open, close and open again on the same line... anyway, reading a lot and combining answers from different posts I came up with something like this:
First I count the quotes in a line, to do that, I take out everything but quotes and then use wc:
quotes=$(printf '%s' "$line" | tr -cd '"' | wc -c) # Counts the quotes
If you think of a single multi-line field, knowing if the quotes are 1 or 2 is enough. In a more generic scenario like mine I have to know if the number of quotes is odd or even to know if the line completes the record or expects more information.
To check for even or odd you can use the modulo operator (%). In general:
even % 2 = 0
odd % 2 = 1
For the first line:
Odd means that the line expects more information on the next line.
Even means the line is complete.
For the subsequent lines, I have to know the status of the previous one. For instance, in your sample text:
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
You can say line 1 (John,Doe,"Country) has 1 quote (odd), which means the status of the record is incomplete, or open.
When you go to line 2, there is no quote (even). Nevertheless, this does not mean the record is complete; you have to consider the previous status. So for the lines following the first one it will be:
Odd means that record status toggles (incomplete to complete).
Even means that record status remains as the previous line.
What I did was looping line by line while carrying the status of the last line to the next one:
incomplete=0
while IFS= read -r line; do
    quotes=$(printf '%s' "$line" | tr -cd '"' | wc -c) # Counts the quotes
    incomplete=$(( (quotes + incomplete) % 2 )) # Check if Odd or Even to decide status
    if [ "$incomplete" -eq 1 ]; then
        printf '%s ' "$line" # If line is incomplete join with next
    else
        printf '%s\n' "$line" # If line completes the record finish
    fi
done < file.csv > new.csv
Once this was executed, a file in your format generates a new.csv like this:
First name,Last name,Address,ZIP
John,Doe,"Country City Street",12345
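The same parity bookkeeping can be sketched in a single awk pass, which avoids starting tr and wc once per line. The function name join_quoted_lines is hypothetical; like the loop above, it only counts double quotes and does not otherwise parse the CSV.

```shell
# Hypothetical helper: join physical lines until the running quote count is even.
join_quoted_lines() {
  awk '{
    line = $0
    n = gsub(/"/, "&", line)            # gsub returns how many quotes it replaced
    incomplete = (incomplete + n) % 2   # 1 while a quoted field is still open
    printf "%s%s", $0, (incomplete ? " " : "\n")
  }' "$@"
}

# Usage: join_quoted_lines file.csv > new.csv
```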
I like one-liners as much as anyone; I wrote that script out just for the sake of clarity. You can, arguably, write it in one line like this:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l " || echo "$l";done >new.csv
I would appreciate it if you could go back to your example and see if this works for your case (which you most likely already solved). Hopefully this can still help someone else down the road...
Recovering the multi-line fields
Every need is different, in my case I wanted the records in one line to further process the csv to add some bash-extracted data, but I would like to keep the csv as it was. To accomplish that, instead of joining the lines with a space I used a code - likely unique - that I could then search and replace:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l ~newline~ " || echo "$l";done >new.csv
The code is ~newline~; this is totally arbitrary, of course.
Then, after doing my processing, I took the csv text file and replaced the coded newlines with real newlines:
sed -i 's/ ~newline~ /\n/g' new.csv
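Note that \n in the replacement text is understood by GNU sed but not by every sed (BSD sed on macOS, for instance, emits a literal n). A more portable way to restore the newlines, sketched here with awk and a hypothetical function name restore_newlines, assumes the same ~newline~ marker:

```shell
# Hypothetical helper: turn each " ~newline~ " marker back into a real newline.
restore_newlines() {
  awk '{ gsub(/ ~newline~ /, "\n"); print }' "$@"
}

# Usage: restore_newlines new.csv > restored.csv
```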
References:
Ternary operator: https://stackoverflow.com/a/3953666/6316852
Count char occurrences: https://stackoverflow.com/a/41119233/6316852
Other peculiar cases: https://www.linuxquestions.org/questions/programming-9/complex-bash-string-substitution-of-csv-file-with-multiline-data-937179/
TL;DR
Run this:
i=0;cat file.csv|while read l;do i=$((($(echo $l|tr -cd '"'|wc -c)+$i)%2));[[ $i = 1 ]] && echo -n "$l " || echo "$l";done >new.csv
... and collect results in new.csv
I hope it helps!
If Perl is an option, please try the following:
perl -e '
while (<>) {
$str .= $_;
}
while ($str =~ /("(("")|[^"])*")|((^|(?<=,))[^,]*((?=,)|$))/g) {
if (($el = $&) =~ /^".*"$/s) {
$el =~ s/^"//s; $el =~ s/"$//s;
$el =~ s/""/"/g;
$el =~ s/\s+(?!$)/ /g;
}
push(@ary, $el);
}
foreach (@ary) {
print /\n$/ ? "$_" : "$_,";
}' sample.csv
sample.csv:
First name,Last name,Address,ZIP
John,Doe,"Country
City
Street",12345
John,Doe,"Country
City
Street",67890
Result:
First name,Last name,Address,ZIP
John,Doe,Country City Street,12345
John,Doe,Country City Street,67890
This might work for you (GNU sed):
sed ':a;s/[^,]\+/&/4;tb;N;ba;:b;s/\n\+/ /g;s/"//g' file
Test each line to see that it contains the correct number of fields (in the example that was 4). If there are not enough fields, append the next line and repeat the test. Otherwise, replace the newline(s) by spaces and finally remove the "'s.
N.B. This may be fraught with problems such as ,'s between "'s and quoted "'s.
Try cat -v file.csv. When the file was made with Excel, you might have some luck: When the newlines in a field are a simple \n and the newline at the end is a \r\n (which will look like ^M), parsing is simple.
# delete all newlines and replace the ^M with a new newline.
tr -d "\n" < file.csv| tr "\r" "\n"
# Above two steps with one command
tr "\n\r" " \n" < file.csv
When you want a space between the joined line, you need an additional step.
tr "\n\r" " \n" < file.csv | sed '2,$ s/^ //'
EDIT: @sjaak commented that this didn't work in his case.
When your broken lines also have ^M you still can be a lucky (wo-)man.
When your broken field is always the first field in double quotes and you have GNU sed 4.2.2, you can join 2 lines when the first line has exactly one double quote.
sed -rz ':a;s/(\n|^)([^"]*)"([^"]*)\n/\1\2"\3 /;ta' file.csv
Explanation:
-z don't use \n as line endings
:a label for repeating the step after successful replacement
(\n|^) Search after a newline or the very first line
([^"]*) Substring without a "
ta Go back to label a and repeat
awk pattern matching works here.
The answer in one line:
awk '/,"/{ORS=" "};/",/{ORS="\n"}{print $0}' YourFile
if you'd like to drop quotes, you could use:
awk '/,"/{ORS=" "};/",/{ORS="\n"}{print $0}' YourFile | sed 's/"//gw NewFile'
but I prefer to keep it.
To explain the code:
/Pattern/ : matches lines containing the pattern.
ORS : the output record separator, printed after each record.
$0 : the whole of the current line.
's/OldPattern/NewPattern/' : substitutes the first OldPattern with NewPattern.
/g : applies the substitution to every OldPattern on the line.
/w : writes the result to NewFile.

UNIX Search in specific column for user specified code and output entire line

I'm working on a program that searches a medication list and returns a report as requested by the user. So I am trying to search this list for a code that the user inputs and then return the relevant information.
EX. (medcode) (dosage)
commA6314 ifosfamide 30
home5341209 urokinase 6314
When I search the file I only want it to return the line if it finds a match in columns 6-12 (6314 for the first line), but at the moment it returns both lines since the second line also contains 6314. All of the answers I saw used text processing utilities like awk, sed or perl, and one of the conditions of the program is not to use any of these utilities.
The programs expected output:
Enter medication code?
6314
See Generic name g/G or Dose d/D?
g
ifosfamide
What I am getting currently:
Enter medication code?
6314
See Generic name g/G or Dose d/D?
g
ifosfamide
urokinase
So it is also displaying information about the second medication, because 6314 is also contained in the dosage column.
Using bash
To match 6314 but only if it starts in column 6 using just bash, try:
$ while read -r line; do [[ "$line" =~ ^.{5}6314 ]] && echo "$line"; done <infile
commA6314 ifosfamide 30
This reads lines from the file one-by-one. The line is echoed to output only if it matches the regex ^.{5}6314 which requires that 6314 appear starting at the sixth character from the start of the line.
To print just the second word on the line, but only if the first word contains your number starting at position six:
$ while read -r code name extra; do [[ "$code" =~ ^.{5}6314 ]] && echo "$name"; done <infile
ifosfamide
Using grep
To match 6314 but only if it starts in column 6, try:
$ grep -E '^.{5}6314' infile
commA6314 ifosfamide 30
Here, ^ specifies the beginning of a line and .{5} matches any five characters. Thus ^.{5}6314 matches 6314 but only if it starts as the sixth character on the line.
Using awk
$ awk '"6314" == substr($0, 6, 4)' infile
commA6314 ifosfamide 30
Here, substr($0, 6, 4) selects four characters from the line starting at the sixth. If this equals 6314, then the line is printed.
Using sed
$ sed -En '/^.{5}6314/p' infile
commA6314 ifosfamide 30
-n tells sed not to print unless we explicitly ask it to. /^.{5}6314/p tells sed to print any line that, starting at the sixth character, matches 6314.
Try this, using just bash:
while read -r line; do
[[ ${line%% *} == *6314* ]] && echo "$line"
done < input_file
It searches only the medication code column.
Explanation:
${line%% *}
is a bash parameter expansion; it keeps only the first 'word', i.e. everything before the first space.
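Parameter expansion can also enforce the column position, still without awk, sed or perl. The sketch below uses a hypothetical function lookup_generic and assumes the layout from the question: three whitespace-separated fields, with the code starting at the sixth character of the first field.

```shell
# Hypothetical helper: lookup_generic CODE FILE
# Prints the generic name of every entry whose code starts at column 6.
lookup_generic() {
  local want=$1 file=$2 code name dose rest
  while read -r code name dose; do
    rest=${code#?????}                  # drop the first five characters
    [ "$rest" != "$code" ] || continue  # fewer than five characters: no code here
    case $rest in
      "$want"*) printf '%s\n' "$name" ;;
    esac
  done < "$file"
}

# Usage: lookup_generic 6314 infile
```

Because only the first field is examined, a 6314 that appears in the dose column is never considered.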

Unix one-liner to swap/transpose two lines in multiple text files?

I wish to swap or transpose pairs of lines according to their line-numbers (e.g., switching the positions of lines 10 and 15) in multiple text files using a UNIX tool such as sed or awk.
For example, I believe this sed command should swap lines 14 and 26 in a single file:
sed -n '14p' infile_name > outfile_name
sed -n '26p' infile_name >> outfile_name
How can this be extended to work on multiple files? Any one-liner solutions welcome.
If you want to edit a file, you can use ed, the standard editor. Your task is rather easy in ed:
printf '%s\n' 14m26 26-m14- w q | ed -s file
How does it work?
14m26 tells ed to take line #14 and move it after line #26
26-m14- tells ed to take the line before line #26 (which is your original line #26) and move it after the line preceding line #14 (which is where your line #14 originally was)
w tells ed to write the file
q tells ed to quit.
If your numbers are in a variable, you can do:
linea=14
lineb=26
{
printf '%dm%d\n' "$linea" "$lineb"
printf '%d-m%d-\n' "$lineb" "$linea"
printf '%s\n' w q
} | ed -s file
or something similar. Make sure that linea<lineb.
If you want robust in-place updating of your input files, use gniourf_gniourf's excellent ed-based answer.
If you have GNU sed and want in-place updating of multiple files at once, use @potong's excellent GNU sed-based answer (see below for a portable alternative, and the bottom for an explanation).
Note: ed truly updates the existing file, whereas sed's -i option creates a temporary file behind the scenes, which then replaces the original - while typically not an issue, this can have undesired side effects, most notably, replacing a symlink with a regular file (by contrast, file permissions are correctly preserved).
Below are POSIX-compliant shell functions that wrap both answers.
Stdin/stdout processing, based on @potong's excellent answer:
POSIX sed doesn't support -i for in-place updating.
It also doesn't support using \n inside a character class, so [^\n] must be replaced with a cumbersome workaround that positively defines all characters except \n that can occur on a line. This is achieved with a character class combining printable characters with all (ASCII) control characters other than \n included as literals (via a command substitution using printf).
Also note the need to split the sed script into two -e options, because POSIX sed requires that a branching command (b, in this case) be terminated with either an actual newline or continuation in a separate -e option.
# SYNOPSIS
# swapLines lineNum1 lineNum2
swapLines() {
[ "$1" -ge 1 ] || { printf "ARGUMENT ERROR: Line numbers must be decimal integers >= 1.\n" >&2; return 2; }
[ "$1" -le "$2" ] || { printf "ARGUMENT ERROR: The first line number ($1) must be <= the second ($2).\n" >&2; return 2; }
sed -e "$1"','"$2"'!b' -e ''"$1"'h;'"$1"'!H;'"$2"'!d;x;s/^\([[:print:]'"$(printf '\001\002\003\004\005\006\007\010\011\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\177')"']*\)\(.*\n\)\(.*\)/\3\2\1/'
}
Example:
$ printf 'line 1\nline 2\nline 3\n' | swapLines 1 3
line 3
line 2
line 1
In-place updating, based on gniourf_gniourf's excellent answer:
Small caveats:
While ed is a POSIX utility, it doesn't come preinstalled on all platforms, notably not on Debian and the Cygwin and MSYS Unix-emulation environments for Windows.
ed always reads the input file as a whole into memory.
# SYNOPSIS
# swapFileLines lineNum1 lineNum2 file
swapFileLines() {
[ "$1" -ge 1 ] || { printf "ARGUMENT ERROR: Line numbers must be decimal integers >= 1.\n" >&2; return 2; }
[ "$1" -le "$2" ] || { printf "ARGUMENT ERROR: The first line number ($1) must be <= the second ($2).\n" >&2; return 2; }
ed -s "$3" <<EOF
H
$1m$2
$2-m$1-
w
EOF
}
Example:
$ printf 'line 1\nline 2\nline 3\n' > file
$ swapFileLines 1 3 file
$ cat file
line 3
line 2
line 1
An explanation of @potong's GNU sed-based answer:
His command swaps lines 10 and 15:
sed -ri '10,15!b;10h;10!H;15!d;x;s/^([^\n]*)(.*\n)(.*)/\3\2\1/' f1 f2 fn
-r activates support for extended regular expressions; here, notably, it allows use of unescaped parentheses to form capture groups.
-i specifies that the files specified as operands (f1, f2, fn) be updated in place, without backup, since no optional suffix for a backup file is adjoined to the -i option.
10,15!b means that all lines that do not (!) fall into the range of lines 10 through 15 should branch (b) implicitly to the end of the script (given that no target-label name follows b), which means that the following commands are skipped for these lines. Effectively, they are simply printed as is.
10h copies (h) line number 10 (the start of the range) to the so-called hold space, which is an auxiliary buffer.
10!H appends (H) every line that is not line 10 - which in this case implies lines 11 through 15 - to the hold space.
15!d deletes (d) every line that is not line 15 (here, lines 10 through 14) and branches to the end of the script (skips remaining commands). By deleting these lines, they are not printed.
x, which is executed only for line 15 (the end of the range), replaces the so-called pattern space with the contents of the hold space, which at that point holds all lines in the range (10 through 15); the pattern space is the buffer on which sed commands operate, and whose contents are printed by default (unless -n was specified).
s/^([^\n]*)(.*\n)(.*)/\3\2\1/ then uses capture groups (parenthesized subexpressions of the regular expression that forms the first argument passed to function s) to partition the contents of the pattern space into the 1st line (^([^\n]*)), the middle lines ((.*\n)), and the last line ((.*)), and then, in the replacement string (the second argument passed to function s), uses backreferences to place the last line (\3) before the middle lines (\2), followed by the first line (\1), effectively swapping the first and last lines in the range. Finally, the modified pattern space is printed.
As you can see, only the range of lines spanning the two lines to swap is held in memory, whereas all other lines are passed through individually, which makes this approach memory-efficient.
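To see the mechanics on a small input, the same script can be parameterised and run against a stream. This is a sketch that assumes GNU sed (for [^\n]) and that the first line number is smaller than the second; the function name swap_stream is invented here.

```shell
# Hypothetical helper: swap_stream A B  (swap lines A and B of stdin, A < B).
# The single-quoted pieces keep "!" away from interactive history expansion.
swap_stream() {
  sed -E "$1,$2"'!b;'"$1"'h;'"$1"'!H;'"$2"'!d;x;s/^([^\n]*)(.*\n)(.*)/\3\2\1/'
}

# Usage: printf '%s\n' one two three four | swap_stream 2 4
```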
This might work for you (GNU sed):
sed -ri '10,15!b;10h;10!H;15!d;x;s/^([^\n]*)(.*\n)(.*)/\3\2\1/' f1 f2 fn
This stores a range of lines in the hold space and then swaps the first and last lines following the completion of the range.
The -i option edits each file (f1, f2 ... fn) in place.
With GNU awk:
awk '
FNR==NR {if(FNR==14) x=$0;if(FNR==26) y=$0;next}
FNR==14 {$0=y} FNR==26 {$0=x} {print}
' file file > file_with_swap
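Since the question asks about multiple files, the two-pass awk command can be wrapped in a loop. The sketch below is one way to do that; the function name swap_lines and the use of mktemp for the scratch file are choices made for this example, not part of the answer above. It assumes both line numbers exist in every file.

```shell
# Hypothetical helper: swap_lines A B FILE...
swap_lines() {
  local a=$1 b=$2 f tmp
  shift 2
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    awk -v a="$a" -v b="$b" '
      NR == FNR { if (FNR == a) x = $0; if (FNR == b) y = $0; next }  # pass 1: remember both lines
      FNR == a  { print y; next }                                     # pass 2: emit them swapped
      FNR == b  { print x; next }
      { print }
    ' "$f" "$f" > "$tmp" && cat "$tmp" > "$f"   # overwrite in place, keeping the original inode
    rm -f "$tmp"
  done
}

# Usage: swap_lines 14 26 *.txt
```

Writing the result back with cat rather than mv leaves symlinks and permissions of the original file alone, at the cost of a second copy.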
The use of the following helper script allows using the power of find ... -exec ./script '{}' l1 l2 \; to locate the target files and to swap lines l1 & l2 in each file in place. (it requires that there are no identical duplicate lines within the file that fall within the search range) The script uses sed to read the two swap lines from each file into an indexed array and passes the lines to sed to complete the swap by matching. The sed call uses its "matched first address" state to limit the second expression swap to the first occurrence. An example use of the helper script below to swap lines 5 & 15 in all matching files is:
find . -maxdepth 1 -type f -name "lnum*" -exec ../swaplines.sh '{}' 5 15 \;
For example, the find call above found files lnumorig.txt and lnumfile.txt in the present directory originally containing:
$ head -n20 lnumfile.txt.bak
1 A simple line of test in a text file.
2 A simple line of test in a text file.
3 A simple line of test in a text file.
4 A simple line of test in a text file.
5 A simple line of test in a text file.
6 A simple line of test in a text file.
<snip>
14 A simple line of test in a text file.
15 A simple line of test in a text file.
16 A simple line of test in a text file.
17 A simple line of test in a text file.
18 A simple line of test in a text file.
19 A simple line of test in a text file.
20 A simple line of test in a text file.
And swapped the lines 5 & 15 as intended:
$ head -n20 lnumfile.txt
1 A simple line of test in a text file.
2 A simple line of test in a text file.
3 A simple line of test in a text file.
4 A simple line of test in a text file.
15 A simple line of test in a text file.
6 A simple line of test in a text file.
<snip>
14 A simple line of test in a text file.
5 A simple line of test in a text file.
16 A simple line of test in a text file.
17 A simple line of test in a text file.
18 A simple line of test in a text file.
19 A simple line of test in a text file.
20 A simple line of test in a text file.
The helper script itself is:
#!/bin/bash
[ -z "$1" ] && { # validate required input (defaults set below)
printf "error: insufficient input calling '%s'. usage: file [line1 line2]\n" "${0//*\//}" 1>&2
exit 1
}
l1=${2:-10} # default/initialize line numbers to swap
l2=${3:-15}
while IFS=$'\n' read -r line; do # read lines to swap into indexed array
a+=( "$line" );
done <<<"$(sed -n $((l1))p "$1" && sed -n $((l2))p "$1")"
((${#a[@]} < 2)) && { # validate 2 lines read
printf "error: requested lines '%d & %d' not found in file '%s'\n" $l1 $l2 "$1"
exit 1
}
# swap lines in place with sed (remove .bak for no backups)
sed -i.bak -e "s/${a[1]}/${a[0]}/" -e "0,/${a[0]}/s/${a[0]}/${a[1]}/" "$1"
exit 0
Even though I didn't manage to get it all done in a one-liner I decided it was worth posting in case you can make some use of it or take ideas from it. Note: if you do make use of it, test to your satisfaction before turning it loose on your system. The script currently uses sed -i.bak ... to create backups of the files changed for testing purposes. You can remove the .bak when you are satisfied it meets your needs.
If you have no use for setting default lines to swap in the helper script itself, then I would change the first validation check to [ -z "$1" -o -z "$2" -o -z "$3" ] to ensure all required arguments are given when the script is called.
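Checking that the arguments are present does not guarantee they are usable as line numbers. A small POSIX-style check, sketched here with a hypothetical function name is_line_number, could reject anything that is not a positive decimal integer before sed ever sees it:

```shell
# Hypothetical helper: succeed only for a decimal integer >= 1.
is_line_number() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  [ "$1" -ge 1 ]               # digits only: require a value of at least 1
}

# Usage:
#   is_line_number "$2" && is_line_number "$3" ||
#     { echo "line numbers must be integers >= 1" >&2; exit 1; }
```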
While it does identify the lines to be swapped by number, it relies on the direct match of each line to accomplish the swap. This means that any identical duplicate lines up to the end of the swap range will cause an unintended match and failure to swap the intended lines. This is part of the limitation imposed by not storing each line within the range of lines to be swapped as discussed in the comments. It's a tradeoff. There are many, many ways to approach this, all will have their benefits and drawbacks. Let me know if you have any questions.
Brute Force Method
Per your comment, I revised the helper script to use the brute force copy/swap method that would eliminate the problem of any duplicate lines in the search range. This helper obtains the lines via sed as in the original, but then reads all lines from file to tmpfile swapping the appropriately numbered lines when encountered. After the tmpfile is filled, it is copied to the original file and tmpfile is removed.
#!/bin/bash
[ -z "$1" ] && { # validate required input (defaults set below)
printf "error: insufficient input calling '%s'. usage: file [line1 line2]\n" "${0//*\//}" 1>&2
exit 1
}
l1=${2:-10} # default/initialize line numbers to swap
l2=${3:-15}
while IFS=$'\n' read -r line; do # read lines to swap into indexed array
a+=( "$line" );
done <<<"$(sed -n $((l1))p "$1" && sed -n $((l2))p "$1")"
((${#a[@]} < 2)) && { # validate 2 lines read
printf "error: requested lines '%d & %d' not found in file '%s'\n" $l1 $l2 "$1"
exit 1
}
# create tmpfile, set trap, truncate
fn="$1"
rmtemp () { cp "$tmpfn" "$fn"; rm -f "$tmpfn"; }
trap rmtemp SIGTERM SIGINT EXIT
declare -i n=1
tmpfn="$(mktemp swap_XXX)"
:> "$tmpfn"
# swap lines in place with a tmpfile
while IFS=$'\n' read -r line; do
if ((n == l1)); then
printf "%s\n" "${a[1]}" >> "$tmpfn"
elif ((n == l2)); then
printf "%s\n" "${a[0]}" >> "$tmpfn"
else
printf "%s\n" "$line" >> "$tmpfn"
fi
((n++))
done < "$fn"
exit 0
If the line numbers to be swapped are fixed then you might want to try something like the sed command in the following example to have lines swapped in multiple files in-place:
#!/bin/bash
# prep test files
for f in a b c ; do
( for i in {1..30} ; do echo $f$i ; done ) > /tmp/$f
done
sed -i -s -e '14 {h;d}' -e '15 {N;N;N;N;N;N;N;N;N;N;G;x;d}' -e '26 G' /tmp/{a,b,c}
# -i: inplace editing
# -s: treat each input file separately
# 14 {h;d} # first swap line: hold ; suppress
# 15 {N;N;...;G;x;d} # lines between: collect, append held line; hold result; suppress
# 26 G # second swap line: append held lines (and output them all)
# dump test files
cat /tmp/{a,b,c}
(This is according to Etan Reisner's comment.)
If you want to swap two lines, you can send the text through sed twice. You could make it loop in one sed script if you really wanted to, but this works:
e.g.
test.txt: for a in {1..10}; do echo "this is line $a"; done >> test.txt
this is line 1
this is line 2
this is line 3
this is line 4
this is line 5
this is line 6
this is line 7
this is line 8
this is line 9
this is line 10
Then to swap lines 6 and 9:
sed ':a;6,8{6h;6!H;d;ba};9{p;x};' test.txt | sed '7{h;d};9{p;x}'
this is line 1
this is line 2
this is line 3
this is line 4
this is line 5
this is line 9
this is line 7
this is line 8
this is line 6
this is line 10
In the first sed it builds up the hold space with lines 6 through 8.
At line 9 it prints line 9 then prints the hold space (lines 6 through 8) this accomplishes the first move of 9 to place 6. Note: 6h; 6!H avoids a new line at the top of the pattern space.
The second move occurs in the second sed script: it saves line 7 to the hold space, deletes it, and then prints it after line 9.
To make it quasi-generic you can use variables like this:
A=3 && B=7 && sed ':a;'${A}','$((${B}-1))'{'${A}'h;'${A}'!H;d;ba};'${B}'{p;x};' test.txt | sed $(($A+1))'{h;d};'${B}'{p;x}'
Where A and B are the lines you want to swap, in this case lines 3 and 7.
If you want to swap two lines, create a script "swap.sh":
#!/bin/sh
sed -n "1,$((${2}-1))p" "$1"
sed -n "${3}p" "$1"
sed -n "$((${2}+1)),$((${3}-1))p" "$1"
sed -n "${2}p" "$1"
sed -n "$((${3}+1)),\$p" "$1"
Then run it:
sh swap.sh infile_name 14 26 > outfile_name

Conditional replacement of string fragment with sed (one-liner!)

I am trying to process the result of diff operation with sed. This is my diff output, which I pipe into sed
3d2
< 12-03-22_JET_D_CL_UR_l4053_0061 True_Warning All 9 149261
62a62
> 13-01-29_VUE_EPM3_v37_CSAV2_0370 True_Warning All 13 22125
68c68
< 13-05-14_Regular_Front_0062 True_Warning All 13 123383
---
> 13-05-14_Regular_Front_0062 True_Warning All 21 123383
119c119
< CADS4_PMP363_20130202_DPH_069 True_Warning All 13 233405
---
> CADS4_PMP363_20130202_DPH_069 True_Warning All 9 233409
149c149
< CADS4_PMP363_20130315_Fujifilm_UK_186 True_Warning All 21 18611
---
> CADS4_PMP363_20130315_Fujifilm_UK_186 True_Warning All 17 18615
I need to sort out the difference string and prepend the 3rd word in the strings with either "Old" or "New" - depending on the first character. My best effort so far is
diff new_jumps/true.jump old_jumps/true.jump | sed -n "/^[<>]/ s:\(.\) \(\S\+\) \(.\+\):\2 \1,\3: p" | replace ">" Old | replace "<" New
Which give me this result (exactly what I wanted).
12-03-22_JET_D_CL_UR_l4053_0061 New,True_Warning All 9 149261
13-01-29_VUE_EPM3_v37_CSAV2_0370 Old,True_Warning All 13 22125
13-05-14_Regular_Front_0062 New,True_Warning All 13 123383
13-05-14_Regular_Front_0062 Old,True_Warning All 21 123383
CADS4_PMP363_20130202_DPH_069 New,True_Warning All 13 233405
CADS4_PMP363_20130202_DPH_069 Old,True_Warning All 9 233409
CADS4_PMP363_20130315_Fujifilm_UK_186 New,True_Warning All 21 18611
CADS4_PMP363_20130315_Fujifilm_UK_186 Old,True_Warning All 17 18615
My question is - how can I change conditional expression within sed one-liner that will eliminate the need to use replace afterwards? (I assume that it is possible)
Thanks in advance
EDIT:
I know, I missed the option to chain sed expressions, but what I had in mind - is it possible to do it within one substitute operation?
By adding more commands to sed using semicolon (;), like this:
diff new_jumps/true.jump old_jumps/true.jump | sed -n "/^[<>]/ s:\(.\) \(\S\+\) \(.\+\):\2 \1,\3:; s/</New/gp; s/>/Old/gp"
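A variant of the chained approach that avoids the GNU-only \S and \+ is sketched below: one substitution per marker, each printing only when it matches. The function name tag_diff is made up for this example; it expects diff output on standard input.

```shell
# Hypothetical helper: move the first word to the front and tag the line
# as New ("<") or Old (">"), dropping every other line of diff output.
tag_diff() {
  sed -n \
    -e 's/^< \([^ ]*\) /\1 New,/p' \
    -e 's/^> \([^ ]*\) /\1 Old,/p'
}

# Usage: diff new_jumps/true.jump old_jumps/true.jump | tag_diff
```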
With awk I get a faster response. Try this:
diff new_jumps/true.jump old_jumps/true.jump | awk '$1 == "<" || $1 == ">" { temp = ($1 == "<") ? "New," : "Old,"; print $2, temp $3, $4, $5, $6 }'
Here's another solution suggested by Jidder:
awk '/^</{i="old,"}/^>/{i="new,"}i{$2=$2" "i;print;i=0}'
@volcano: here is a one-liner solution in sed, but it relies on interaction with the shell. IMHO if you want to have only one sed substitution command, you cannot avoid that behavior: you have to output to the shell the information of which first character has been seen on the line, the shell then does the mapping to the "Old" or "New" string, and gives the result back to sed.
So the one-liner is not exactly a one-liner because we have to define things in the shell... ;)
replace() { if [ "$1" == ">" ] ; then echo -n "Old"; else echo -n "New" ; fi }
export -f replace
sed -n '/^[<>]/ s:\(.\) \(\S\+\) \(.\+\):echo "\2 $(replace \\\1),\3";:ep' yourfile
Please note that the e flag to the substitution command is a GNU sed extension, we use it here to avoid calling the shell explicitly. If you don't use GNU sed, you can simply replace the last line above by the following:
sed -n '/^[<>]/ s:\(.\) \(\S\+\) \(.\+\):echo "\2 $(replace \\\1),\3";:p' yourfile | bash
The solution I am giving here has been inspired by that other one.
Please also note that all these gymnastics are avoidable if you are willing to replace your three-letter tokens "Old" and "New" with their initials, because then we can neatly use the y command to act first in a tr fashion, like so:
sed -n '/^[<>]/ y/<>/NO/; s:\(.\) \(\S\+\) \(.\+\):\2 \1,\3:p' yourfile
