I have the below code
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4 double NOT NULL,
CONSTRAINT Index1 PRIMARY KEY CLUSTERED
(
column2 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY
) ON PRIMARY
GO
GO
and I want to replace
CONSTRAINT Index1 PRIMARY KEY CLUSTERED
(
column2 ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY
) ON PRIMARY
GO
with
)
You can't assume GO is the last character of the file. After Go there can be another table script.
How can I do that with single sed or awk.
Update:
You can use the following sed command to replace even the last , before the CONSTRAINT block:
sed -r '/,/{N;/CONSTRAINT/{:a;N;/GO/!ba;s/([^,]+).*/\1\n)/};/CONSTRAINT/!n}' input.sql
Let me explain it as a multiline script:
# Search for a comma
/,/ {
# If a command was found slurp in the next line
# and append it to the current line in pattern buffer
N
# If the pattern buffer does not contain the word CONSTRAINT
# print the pattern buffer and go on with the next line of input
# meaning start searching for a comma
/CONSTRAINT/! n
# If the pattern CONSTRAINT was found we loop until we find the
# word GO
/CONSTRAINT/ {
# Define a start label for the loop
:a
# Append the next line of input to the pattern buffer
N
# If GO is still not found in the pattern buffern
# step to the start label of the loop
/GO/! ba
# The loop was exited meaning the pattern GO was found.
# We keep the first line of the pattern buffer - without
# the comma at the end and replace everything else by a )
s/([^,]+).*/\1\n)/
}
}
You can save the above multiline script in a file and execute it using
sed -rf script.sed input.sql
You can use the following sed command:
sed '/CONSTRAINT/{:a;N;/GO/!ba;s/.*/)/}' input.sql
The pattern searches for a line containing /CONSTRAINT/. If the pattern is found a block of commands is started wrapped between { }. In the block we first define a label a through :a. The we get the next line of input through N and append it to the pattern buffer. Unless we find the pattern /GO/! we'll continue at label a using the branch command b. If the pattern /GO/ is found we simply replace the buffer by a ).
An alternative can be using using a range like FredPhil suggested:
sed '/CONSTRAINT/,/GO/{s/GO/)/;te;d;:e}'
This may look scary but it is not difficult to grasp with a bit of explanation:
SED_DELIM=$(echo -en "\001")
START=' CONSTRAINT Index1 PRIMARY KEY CLUSTERED'
END='GO'
sed -n $'\x5c'"${SED_DELIM}${START}${SED_DELIM},"$'\x5c'"${SED_DELIM}${END}${SED_DELIM}{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p" test2.txt
The sed has the following form you may be more familiar with:
sed /regex1/,/regex2/{commands}
First it uses the SOH non-printable as the delimiter \001
Sets the START and END tags for sed multiline match
Then performs the sed command:
-n do not print by default
$'\x5c' is a Bash string literal that corresponds to backslash \
The backslashes are necessary to escape the non-printable delimiter on the multiline range match.
{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p:
s${SED_DELIM}GO${SED_DELIM})${SED_DELIM} replace the line that matches GO with )
t a; if there is a successful substitution in the prior statement then branch to the :a label
d if there is no subsitution then delete the line
p print whatever the result is after the commands
branch to the
I didn't see their answers prior to posting this - this answer is the same as FredPhil/hek2mgl - except in this manner you have a mechanism to be more dynamic on the LHS since you can change the delimiter to a character that is much less likely to appear in the dataset.
With GNU awk for multi-char RS and assuming you want to get rid of the comma before the "CONSTRAINT":
$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
gsub(/\<GO\>/,"\034")
gsub(/,\s*CONSTRAINT[^\034]+\034/,")")
gsub(/\034/,"GO")
print
}
$ gawk -f tst.awk file
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4 double NOT NULL)
GO
The above works by replacing every stand-alone "GO" with a control char that's unlikely to appear in your input (in this case I used the same value as the default SUBSEP) so we can use that char in a negated character list in the middle gsub() to create a regexp that ends with the first "GO" after "CONSTRAINT". This is one way to do "non-greedy" matching in awk.
If there is no char that you KNOW cannot appear in your input, you can create one like this:
$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/\<GO\>/,"b")
gsub(/,\s*CONSTRAINT[^b]+b/,")")
gsub(/b/,"GO"); gsub(/aB/,"b"); gsub(/aA/,"a")
print
}
$
$ gawk -f tst.awk file
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4 double NOT NULL)
GO
The above initially converts all "a"s to "aA" and "b"s to "aB" so that
there are no longer any "b"s in the record, and
since all original "a"s now have an "A" after them, the only occurrences of
"aB" represent where "bs" were originally located
and that means that we can now convert all "GO"s to "b"s just like we converted them to "\034" in the first script above. Then we do the main gsub() and then unroll our initial gsub()s.
This idea of gsub()ing to create chars that cannot previously exist, using those chars, then unrolling the initial gsub()s is an extremely useful idiom to learn and remember, e.g. see https://stackoverflow.com/a/13062682/1745001 for another application.
To see it working one step at a time:
$ cat file
foo bar Hello World World able bodies
$ awk '{gsub(/a/,"aA")}1' file
foo baAr Hello World World aAble bodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB")}1' file
foo aBaAr Hello World World aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b")}1' file
foo aBaAr Hello b b aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The")}1' file
foo aBaAr We Are The b aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World")}1' file
foo aBaAr We Are The World aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b")}1' file
foo baAr We Are The World aAble bodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b"); ; gsub(/aA/,"a")}1' file
foo bar We Are The World able bodies
Related
Suppose I have text as:
This is a sample text.
I have 2 sentences.
text is present there.
I need to replace whole text between two 'text' words. The required solution should be
This is a sample text.
I have new sentences.
text is present there.
I tried using the below command but its not working:
sed -i 's/text.*?text/text\
\nI have new sentence/g' file.txt
With your shown samples please try following. sed doesn't support lazy matching in regex. With awk's RS you could do the substitution with your shown samples only. You need to create variable val which has new value in it. Then in awk performing simple substitution operation will so the rest to get your expected output.
awk -v val="your_new_line_Value" -v RS="" '
{
sub(/text\.\n*[^\n]*\n*text/,"text.\n"val"\ntext")
}
1
' Input_file
Above code will print output on terminal, once you are Happy with results of above and want to save output into Input_file itself then try following code.
awk -v val="your_new_line_Value" -v RS="" '
{
sub(/text\.\n*[^\n]*\n*text/,"text.\n"val"\ntext")
}
1
' Input_file > temp && mv temp Input_file
You have already solved your problem using awk, but in case anyone else will be looking for a sed solution in the future, here's a sed script that does what you needed. Granted, the script is using some advanced sed features, but that's the fun part of it :)
replace.sed
#!/usr/bin/env sed -nEf
# This pattern determines the start marker for the range of lines where we
# want to perform the substitution. In our case the pattern is any line that
# ends with "text." — the `$` symbol meaning end-of-line.
/text\.$/ {
# [p]rint the start-marker line.
p
# Next, we'll read lines (using `n`) in a loop, so mark this point in
# the script as the beginning of the loop using a label called `loop`.
:loop
# Read the next line.
n
# If the last read line doesn't match the pattern for the end marker,
# just continue looping by [b]ranching to the `:loop` label.
/^text/! {
b loop
}
# If the last read line matches the end marker pattern, then just insert
# the text we want and print the last read line. The net effect is that
# all the previous read lines will be replaced by the inserted text.
/^text/ {
# Insert the replacement text
i\
I have a new sentence.
# [print] the end-marker line
p
}
# Exit the script, so that we don't hit the [p]rint command below.
b
}
# Print all other lines.
p
Usage
$ cat lines.txt
foo
This is a sample text.
I have many sentences.
I have many sentences.
I have many sentences.
I have many sentences.
text is present there.
bar
$
$ ./replace.sed lines.txt
foo
This is a sample text.
I have a new sentence.
text is present there.
bar
Substitue
sed -i 's/I have 2 sentences./I have new sentences./g'
sed -i 's/[A-Z]\s[a-z].*/I have new sentences./g'
Insert
sed -i -e '2iI have new sentences.' -e '2d'
I need to replace whole text between two 'text' words.
If I understand, first text. (with a dot) is at the end of first line and second text at the beginning of third line. With awk you can get the required solution adding values to var s:
awk -v s='\nI have new sentences.\n' '/text.?$/ {s=$0 s;next} /^text/ {s=s $0;print s;s=""}' file
This is a sample text.
I have new sentences.
text is present there.
I have been tinkering with this for a while but can't quite figure it out. A sample line within the file looks like this:
"...~236 characters of data...Y YYY. Y...many more characters of data"
How would I use sed or awk to replace spaces with a B character only between positions 236 and 246? In that example string it starts at character 29 and ends at character 39 within the string. I would want to preserve all the text preceding and following the target chunk of data within the line.
For clarification based on the comments, it should be applied to all lines in the file and expected output would be:
"...~236 characters of data...YBBYYY.BBY...many more characters of data"
With GNU awk:
$ awk -v FIELDWIDTHS='29 10 *' -v OFS= '{gsub(/ /, "B", $2)} 1' ip.txt
...~236 characters of data...YBBYYY.BBY...many more characters of data
FIELDWIDTHS='29 10 *' means 29 characters for first field, next 10 characters for second field and the rest for third field. OFS is set to empty, otherwise you'll get space added between the fields.
With perl:
$ perl -pe 's/^.{29}\K.{10}/$&=~tr| |B|r/e' ip.txt
...~236 characters of data...YBBYYY.BBY...many more characters of data
^.{29}\K match and ignore first 29 characters
.{10} match 10 characters
e flag to allow Perl code instead of string in replacement section
$&=~tr| |B|r convert space to B for the matched portion
Use this Perl one-liner with substr and tr. Note that this uses the fact that you can assign to substr, which changes the original string:
perl -lpe 'BEGIN { $from = 29; $to = 39; } (substr $_, ( $from - 1 ), ( $to - $from + 1 ) ) =~ tr/ /B/;' in_file > out_file
To change the file in-place, use:
perl -i.bak -lpe 'BEGIN { $from = 29; $to = 39; } (substr $_, ( $from - 1 ), ( $to - $from + 1 ) ) =~ tr/ /B/;' in_file
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak.
I would use GNU AWK following way, for simplicity sake say we have file.txt content
S o m e s t r i n g
and want to change spaces from 5 (inclusive) to 10 (inclusive) position then
awk 'BEGIN{FPAT=".";OFS=""}{for(i=5;i<=10;i+=1)$i=($i==" "?"B":$i);print}' file.txt
output is
S o mBeBsBt r i n g
Explanation: I set field pattern (FPAT) to any single character and output field seperator (OFS) to empty string, thus every field is populated by single characters and I do not get superfluous space when print-ing. I use for loop to access desired fields and for every one I check if it is space, if it is I assign B here otherwise I assign original value, finally I print whole changed line.
Using GNU awk:
awk -v strt=29 -v end=39 '{ ram=substr($0,strt,(end-strt));gsub(" ","B",ram);print substr($0,1,(strt-1)) ram substr($0,(end)) }' file
Explanation:
awk -v strt=29 -v end=39 '{ # Pass the start and end character positions as strt and end respectively
ram=substr($0,strt,(end-strt)); # Extract the 29th to the 39th characters of the line and read into variable ram
gsub(" ","B",ram); # Replace spaces with B in ram
print substr($0,1,(strt-1)) ram substr($0,(end)) # Rebuild the line incorporating raw and printing the result
}'file
This is certainly a suitable task for perl, and saddens me that my perl has become so rusty that this is the best I can come up with at the moment:
perl -e 'local $/=\1;while(<>) { s/ /B/ if $. >= 236 && $. <= 246; print }' input;
Another awk but using FS="":
$ awk 'BEGIN{FS=OFS=""}{for(i=29;i<=39;i++)sub(/ /,"B",$i)}1' file
Output:
"...~236 characters of data...YBBYYY.BBY...many more characters of data"
Explained:
$ awk ' # yes awk yes
BEGIN {
FS=OFS="" # set empty field delimiters
}
{
for(i=29;i<=39;i++) # between desired indexes
sub(/ /,"B",$i) # replace space with B
# if($i==" ") # couldve taken this route, too
# $i="B"
}1' file # implicit output
With sed :
sed '
H
s/\(.\{236\}\)\(.\{11\}\).*/\2/
s/ /B/g
H
g
s/\n//g
s/\(.\{236\}\)\(.\{11\}\)\(.*\)\(.\{11\}\)/\1\4\3/
x
s/.*//
x' infile
When you have an input string without \r, you can use:
sed -r 's/(.{236})(.{10})(.*)/\1\r\2\r\3/;:a;s/(\r.*) (.*\r)/\1B\2/;ta;s/\r//g' input
Explanation:
First put \r around the area that you want to change.
Next introduce a label to jump back to.
Next replace a space between 2 markers.
Repeat until all spaces are replaced.
Remove the markers.
In your case, where the length doesn't change, you can do without the markers.
Replace a space after 236..245 characters and try again when it succeeds.
sed -r ':a; s/^(.{236})([^ ]{0,9}) /\1\2B/;ta' input
This might work for you (GNU sed):
sed -E 's/./&\n/245;s//\n&/236/;h;y/ /B/;H;g;s/\n.*\n(.*)\n.*\n(.*)\n.*/\2\1/' file
Divide the problem into 2 lines, one with spaces and one with B's where there were spaces.
Then using pattern matching make a composite line from the two lines.
N.B. The newline can be used as a delimiter as it is guaranteed not to be in seds pattern space.
I am trying to print the contents of the whole file with highlighted search string.
For a simple file where Record is equal to a single line then I can do this easily using:
grep --color=auto "myseacrchpattern" inputfile
Here my records are in form of paragraphs not single line. Example:
CREATE TABLE mytable ( id SERIAL,
name varchar(20),
cost int );
CREATE TABLE notmytable ( id SERIAL,
name varchar(20),
cost int );
If I do use grep for keyword : "notmytable",it will give me colored output but only that line is printed.
grep --color=auto 'notmytable' inputfile
CREATE TABLE notmytable ( id SERIAL, # <-- "notmytable" is in red but its not the whole query
I need something like this :
CREATE TABLE notmytable ( id SERIAL, # <---"notmytable" is in red
name varchar(20),
cost int );
I can print the desired paragraph with awk or perl but how to color it :
awk -v RS=';' -v ORS=';\n' '/notmytable/' inputfile
CREATE TABLE notmytable ( id SERIAL,
name varchar(20),
cost int );
OR perl :
perl -00lne 'print $_ if /notmytable/' inputfile
CREATE TABLE notmytable ( id SERIAL,
name varchar(20),
cost int );
perl "-MTerm::ANSIColor qw(:constants)" -00lnE'
next if not /notmytable/;
for (split "\n") { /notmytable/ ? say RED $_, RESET : say }
' input
The :constants tag provides RED and such. There are other ways, see Term::ANSIColor
Note that there has to be some duplicate searching, since we need to first identify the paragraph, but then print only that line in color while printing others normally.
If only the pattern need be colored, double-parsing isn't needed (and it's far easier and nicer)
perl -MTerm::ANSIColor -00lnE'say if s/(notmytable)/colored($1,"red")/eg' input
If a string matching the desired regexp is found then surround it by the appropriate characters to change its color and print the record containing it:
$ awk -v RS=';' -v ORS=';\n' 'gsub(/notmytable/,"<RED>&</RED>")' file
CREATE TABLE <RED>notmytable</RED> ( id SERIAL,
name varchar(20),
cost int );
Just change <RED> and </RED> to the escape sequences for that color, e.g.
awk -v RS=';' -v ORS=';\n' 'gsub(/notmytable/,"\033[31m&\033[0m")' file
or if you don't want to hard-code those color values you could do:
awk -v RS=';' -v ORS=';\n' -v red="$(tput setaf 1)" -v nrm="$(tput sgr0)" 'gsub(/notmytable/,red"&"nrm)' file
BTW if you have blank lines between all records you'll probably find using -v RS= -v ORS='\n\n' works better for you than -v RS=';' -v ORS=';\n'.
Combine your awk or perl solution with grep:
perl -00lne 'print $_ if /notmytable/' input|grep -C1000 notmytable
The -C1000 option makes grep keep 1000 lines of surrounding context around the matching line, actually turning grep into a mere colorizer rather than line selector.
You can wrap it into a bash function:
function paragrep() { perl -00lne 'print $_ if /'"$1"/ "$2"|grep -C1000 "$1"; }
Usage example:
$ paragrep notmytable input
CREATE TABLE notmytable ( id SERIAL, # <---"notmytable" is in red
name varchar(20),
cost int );
$
I have a really bad habit of abusing tr.
I need to find another way, a different style.
all I want to so is print the list horizontally instead of vertically - so I can cut and past it into an email. Check out the use of the TR command. Just terrible.
$ cat /tmp/wig
update PTMM_ARCHIVE.FASTTRACK_USER set user_name = 'monohajoxx' where user_name = 'monohajo'
update PTMM_ARCHIVE.FASTTRACK_USER set user_name = 'wuemxx' where user_name = 'wuem'
update PTMM_ARCHIVE.FASTTRACK_USER set user_name = 'taraziemxx' where user_name = 'taraziem'
update PTMM_ARCHIVE.FASTTRACK_USER set user_name = 'mullankexx' where user_name = 'mullanke'
update PTMM_ARCHIVE.FASTTRACK_USER set user_name = 'fernanjaxx' where user_name = 'fernanja'
$ awk '{print $NF}' /tmp/wig | tr -d "'" | tr "\n" ", \s" ; echo "\n"
monohajo,wuem,taraziem,mullanke,fernanja,\n
Using awk
Here is one way to do it entirely with awk:
$ awk '{gsub(/'\''/,"",$NF); printf "%s%s",(NR>1?",":""),$NF} END{print "\\n"}' wig
monohajo,wuem,taraziem,mullanke,fernanja\n
The gsub command removes the single-quotes from the last field. The printf command prints the last field preceded by a comma if this isn't the first line. The final print statement finishes the line.
And, here is another:
$ awk '{printf "%s%s",(NR>1?",":""),substr($NF,2,length($NF)-2)} END{print "\\n"}' wig
monohajo,wuem,taraziem,mullanke,fernanja\n
This uses a similar printf statement but uses substr to remove the first and last characters of the last field.
Using sed
$ sed -nE "s/.*'([^']*)'/\1/"'; H; 1h; ${x; s/\n/,/g; s/$/\\n/; p}' wig
monohajo,wuem,taraziem,mullanke,fernanja\n
How it works:
-n tells sed not to print anything unless we explicitly ask it to.
-E tells sed to use extended regular expressions so that we don't have to type as many backslashes.
s/.*'([^']*)'/\1/
This removes everything from the line except for the last field single-quoted string (with the quotes are removed).
H; 1h;
H adds a newline to the hold space followed by a copy of the current pattern space (which now contains the last field, minus the quotes).
If this is the first line, however, the h command overwrites the hold space with just the current value of the pattern space (no newline).
${x; s/\n/,/g; s/$/\\n/; p}
On the last line, denoted by $, this does the following:
- `x` exchanges the hold and pattern spaces.
- `s/\n/,/g` converts all those newlines to commas.
- `s/$/\\n/` puts a `\n` at the end.
- `p` causes this pattern space to be printed.
Hello: I have tab separated data of the form
customer-item description-purchase price-category
e.g. a.out contains:
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
3\tNULL\t3.0\tfruit
4\tCarrots\tNULL\tfruit
5\tNULL\tNULL\tfruit
I'm attempting to get rid of all the NULL fields. I can't rely on the simple replacement of the string "NULL" as it may be a substring; so I am attempting
sed -i 's:\tNULL\t:\t\t:g' a.out
when I do this, I end up with
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
3\t\t3.0\tfruit
4\tCarrots\t\tfruit
5.\t\tNULL\tfruit
what's wrong here is that #5 has only suffered a replacement of the first instance of the search string on each line.
If I run my sed command twice, I end up with the result I want:
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
3\t\t3.0\tfruit
4\tCarrots\t\tfruit
5.\t\t\tfruit
where you can see that line 5 has both of the NULLs removed
But I don't understand why I'm suffering this?
awk -F'\t' -v OFS='\t' '{
for (i = 1; i <= NF; ++i) {
if ($i == "NULL") {
$i = "";
}
}
print
}' test.txt
The straightforward solution is to use \t as a field separator and then loop over all of the fields looking for an exact match of "NULL". No substringing.
Here's the same thing as a one liner:
awk -F'\t' -v OFS='\t' '{for(i=1;i<=NF;++i) if($i=="NULL") $i=""} 1' test.txt
Since tabs can't be inside strings in your case since that would imply a new field you might be able to do what you want simply by doing this;
sed ':start ; s/\tNULL\(\t\|$\)/\t\1/ ; t start' a.out
First the inner part s/\tNULL\(\t\|$\)/\t\1/ searches for tab NULL followed by a tab or end of line $ and replace with a tab followed by the character that did appear after NULL (this last part is done using \1). We'll call that expression
We now have:
sed ':start ; expression ; t start' a.out
This is effectively a loop (like goto). :start is a label. ; acts as a statement delimiter. I have described what expression does above. t start says that IF the expression did any substitution that a jump will be made to label start. The buffer will contain the substituted text. This loop occurs until no substitution can be done on the line and then processing continues.
Information on sed flow control and other useful tidbits can be found here
awk makes it simpler:
awk -F '\tNULL\\>' -v OFS='\t' '{$1=$1}1' file
1\t400 Bananas\t3.00\tfruit
2\t60 Oranges\t0.00\tfruit
3\t\t3.0\tfruit
4\tCarrots\t\tfruit
5\t\t\tfruit
From grep(1) on a recent Linux:
The Backslash Character and Special Expressions
The symbols \< and > respectively match the empty string at the
beginning and end of a word. The symbol \b matches the empty string at
the edge of a word [...]
--
So, how about:
sed -i 's:\<NULL\>::g' a.out