Finding a common pattern from input lines using single command - shell

I have the lines below. From each line that contains .script, I want only the part after the last /, using a single command.
Input lines
hello/world/command_altr.program_for_input.script
hello/world/script/deleted_the_input.program_for_output.script
/com/bash/hastag/welcome/program -u util/basic/level/learning
Output expected :
command_altr.program_for_input.script
deleted_the_input.program_for_output.script

If I properly understand your problem, this can be solved with sed.
sed -n -r -e 's/^.*\/([^/]+\.script).*$/\1/p'
It extracts the matching part and prints it. Lines without the pattern are discarded.
^.*\/ matches from the beginning of the line through any characters, ending at a /. Because .* is greedy, this is the last / that still lets the rest of the expression match.
([^/]+\.script) matches one or more characters other than /, followed by a dot and the string "script". The parentheses (...) store the match in a capture group.
.*$ matches any remaining characters up to the end of the line.
/\1/ replaces the whole line with the captured group.
The p flag prints the line if the substitution was made.
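To check the expression against the sample from the question, a minimal sketch follows. The function name extract_scripts and the variable input are made up for illustration, and -E is used in place of -r on the assumption that the installed sed accepts it (GNU and BSD sed both do).

```shell
# Sample input copied from the question.
input='hello/world/command_altr.program_for_input.script
hello/world/script/deleted_the_input.program_for_output.script
/com/bash/hastag/welcome/program -u util/basic/level/learning'

# extract_scripts is a hypothetical helper name.
# -n suppresses default output; the p flag prints only substituted lines.
extract_scripts() {
  sed -n -E 's/^.*\/([^/]+\.script).*$/\1/p'
}

printf '%s\n' "$input" | extract_scripts
```

The third input line has no .script part, so the substitution fails on it and nothing is printed for that line.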

Related

Shell Script sed to search for 2 different matches and replace it with different values in loop

In a shell script, I am unable to find a solution for the following.
I have a file.txt generated as below, whose values are not fixed:
"string1","string2"
"string4","string5"
"string6","string9"
"string10","string11"
I have another file:
<abc><cde>var_1</cde><efg>var_2</efg></abc>
I need to generate output file as below
<abc><cde>string1</cde><efg>string2</efg></abc>
<abc><cde>string4</cde><efg>string5</efg></abc>
<abc><cde>string6</cde><efg>string9</efg></abc>
<abc><cde>string10</cde><efg>string11</efg></abc>
This might work for you (GNU sed):
sed -E '1{x;s/^/cat anotherFile/e;x};G
s/.*"(.*)","(.*)".*\n(.*)var_1(.*)var_2/\3\1\4\2/' txtFile
Store anotherFile in the hold space on the first line only.
Append anotherFile to each line of txtFile and using pattern matching format the result.
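The hold-space approach can be tried in a scratch directory, as sketched below. This assumes GNU sed, since the e flag of the s command is a GNU extension. The function name fill_template and the use of mktemp are illustrative choices, not part of the answer.

```shell
# fill_template is a hypothetical helper. It builds both files in a
# temporary directory, runs the GNU sed command from the answer, and
# removes the directory again.
fill_template() {
  workdir=$(mktemp -d) || return 1
  printf '%s\n' '"string1","string2"' '"string4","string5"' > "$workdir/txtFile"
  printf '%s\n' '<abc><cde>var_1</cde><efg>var_2</efg></abc>' > "$workdir/anotherFile"
  # Line 1: swap to the (empty) hold space, run "cat anotherFile" via the
  # e flag, swap back so the template sits in the hold space.
  # Every line: G appends the template, then s fills in the two values.
  ( cd "$workdir" &&
    sed -E '1{x;s/^/cat anotherFile/e;x};G
s/.*"(.*)","(.*)".*\n(.*)var_1(.*)var_2/\3\1\4\2/' txtFile )
  status=$?
  rm -rf "$workdir"
  return "$status"
}

fill_template
```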
The following sed command would do that
sed 's#"\([^"]*\)","\([^"]*\)"#<abc><cde>\1</cde><efg>\2</efg></abc>#' file.txt
What is going on here?
First, the s#pattern#replacement# command replaces the part of each input line that matches the pattern with the replacement string. Here # serves as the delimiter instead of /, so the slashes in the closing tags need no escaping.
Let's look at the pattern first. The pattern is "\([^"]*\)","\([^"]*\)". It says:
Look for a quote character " in the input.
Capture all characters up to the next quote character as the first group. This is written \([^"]*\), where \(...\) marks a capture group and [^"]* means zero or more characters other than ".
Skip the next "," and capture the second group, up to the next quote character.
The quoted substrings are now available in the replacement as \1 and \2 respectively.
Now for the replacement. It is taken literally, except that \1 and \2 are replaced with the first and second captured groups.
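The substitution can be checked on a couple of the sample rows. In this small sketch the function name to_xml is hypothetical and the input is piped in rather than read from file.txt.

```shell
# to_xml is a hypothetical wrapper around the sed command above.
# \1 and \2 carry the two quoted values into the fixed XML skeleton.
to_xml() {
  sed 's#"\([^"]*\)","\([^"]*\)"#<abc><cde>\1</cde><efg>\2</efg></abc>#'
}

printf '%s\n' '"string1","string2"' '"string10","string11"' | to_xml
```

Note that this variant hard-codes the XML skeleton in the sed command instead of reading it from the second file.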

Add multiple elements to the text file in a specific way using Bash

I have a text file that contains a list of "word sequences", and I need to wrap each word sequence in "" and follow it with a ",". I'm thinking of using a bash command.
Here is the data:
NTSS
NGTG
NVSQ
NITL
NFTS
...
I need to add "" around each word sequence and separate them with ",".
Here is the expected output:
"NTSS",
"NGTG",
"NVSQ",
"NITL",
...
Any recommendation with BASH to do that?
This can be done in many ways, but sed is perfect for the job.
sed 's/^.*$/"\0",/' < file.txt
This replacement simply matches the whole line and replaces it according to what you need.
The one above is a regular expression replacement, which has the structure:
s/<pattern to match>/<replacement>/
^ matches the beginning of the line
.* matches any character any number of times
$ matches the end of the line
In the replacement part, \0 represents the whole string that matched the pattern (the entire line in this case). \0 is accepted by GNU sed; the portable way to write the same thing is &.
Check out some regular expression tutorial for more.
If you prefer a purely bash alternative, you can use:
while IFS= read -r line; do printf '"%s",\n' "$line"; done < file.txt
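For a version that should also work outside GNU sed, the whole match can be written as &, which POSIX defines for the replacement part. The sketch below uses a hypothetical function name, wrap_lines, and feeds three of the sample words through a pipe.

```shell
# wrap_lines is a hypothetical helper name.
# & in the replacement stands for the entire matched text (here: the line).
wrap_lines() {
  sed 's/.*/"&",/'
}

printf '%s\n' NTSS NGTG NVSQ | wrap_lines
```

The ^ and $ anchors are left out here because .* already spans the whole line.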

SED's Substituted string is considered as one-line string, whereas it contains newline character

I am testing the sed command to substitute one line with 3 lines and, then, to delete the last line. (I could have substituted it with only the 2 first lines, but this is deliberately stated like this to showcase the main issue).
Let's say that I have the following text :
// ##OPTION_NAME: xxxx
I want to replace the token ##OPTION_NAME by ##OP-NAME and surround it by 2 new lines; Like so :
// ##OP-START
// ##OP-NAME: xxxx
// ##OP-END
To illustrate this, I put this text in a code.c file, and the sed commands in a sed script named script.sed.
Then, I call the following shell command :
Shell command
sed -f script.sed code.c
script.sed
# Begin by replacing patterns by their equivalents, surrounding them with ##OP-START and ##OP-END lines
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
The problem
Now, I add another sed command in script.sed to delete the line containing ##OP-END. Surprise! All 3 lines are removed!
# Begin by replacing patterns by their equivalents, surrounding them with ##OP-START and ##OP-END lines
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
# Last parse; delete ##OP-END
/##OP-END/d
I tried \r\n instead of \n in the substitution command
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g, but it does not work.
I also tested on ##OP-START to see if it makes some difference,
but alas ! All 3 lines were removed too.
It seems that sed is considering it as one line !
This is not a surprise: d operates on the pattern space, not on individual lines. After the s command, your pattern space contains 3 lines. Its content matches the expression, so the whole pattern space is deleted.
To delete this line from the pattern space, you need to use the s command again:
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
s/\n\/\/ ##OP-END//
About pattern and hold space: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13
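The difference between the two approaches can be seen side by side in the sketch below. The function names delete_with_d and delete_with_s are invented for this example, and the newlines in the replacement are written as a backslash followed by a literal newline, which is assumed to be more portable than \n in the replacement.

```shell
line='// ##OPTION_NAME: xxxx'

# Hypothetical helper: expand the line, then use d.
# d discards the whole pattern space, which now holds all three lines.
delete_with_d() {
  sed 's/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\
\1##OP-NAME:\2\
\1##OP-END/
/##OP-END/d'
}

# Hypothetical helper: expand the line, then cut only the last
# embedded line out of the pattern space with a second s command.
delete_with_s() {
  sed 's/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\
\1##OP-NAME:\2\
\1##OP-END/
s/\n\/\/ ##OP-END//'
}

printf '%s\n' "$line" | delete_with_d   # prints nothing
printf '%s\n' "$line" | delete_with_s   # prints the first two lines
```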

sed error unterminated substitute pattern for new line text

I am writing a script to add new dependencies to the watch list. I am putting a placeholder to know where to add the text, for example:
assets = [
"../../new_app/assets"
# [[NEW_APP_ADD_ASSETS]]
]
Replacing just the placeholder is simple, but my problem is adding a comma to the previous line.
That can be done if I search for and replace
"
# [[NEW_APP_ADD_ASSETS]]
ie "\n # [[NEW_APP_ADD_ASSETS]]
I am not able to search for the new line.
One of the solutions I found for adding a new line was
sed -i '' 's/newline/line one\
line two/' filename.txt
But when I do the same for the search string, it returns: unterminated substitute pattern
sed -i '' s/'assets\"\
#'/'some new text'/ filename.txt
PS: I am working on macOS
sed works line by line, so it is tricky to add the comma to the previous line: that line has already been processed. It is possible, but the sed syntax quickly becomes messy.
To be a bit more specific:
In default operation, sed cyclically shall append a line of input, less its terminating <newline> character, into the pattern space. Reading from input shall be skipped if a <newline> was in the pattern space prior to a D command ending the previous cycle. The sed utility shall then apply in sequence all commands whose addresses select that pattern space, until a command starts the next cycle or quits. If no commands explicitly started a new cycle, then at the end of the script the pattern space shall be copied to standard output (except when -n is specified) and the pattern space shall be deleted. Whenever the pattern space is written to standard output or a named file, sed shall immediately follow it with a <newline>.
In short, if you do not manipulate the pattern space, you cannot process <newline> characters as they just do not appear!
And even shorter, if you only use the substitute command, sed only processes one line at a time!
This is also why you get the unterminated substitute pattern error. You are searching for a newline character, but sed reads one line at a time, so it neither finds a newline nor expects one inside the pattern. The error vanishes if you replace your literal newline with the two characters \n, although the pattern still cannot match while the pattern space holds only a single line.
sed -i '' s/'assets\"\n #'/'some new text'/ filename.txt
A better way to achieve your goals would be to make use of awk. It is a bit more readable:
awk '/# \[\[NEW_APP_ADD_ASSETS\]\]/ { print t ","; t = "line1\nline2"; next }
NR > 1 { print t }
{ t = $0 }
END { print t }' <file>
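The hold-one-line-back idea can be tried on the sample from the question. The sketch below is a variation, not the answer's exact command: it keeps the placeholder so the script can be run again later, and it escapes the square brackets so awk matches them literally instead of reading them as a bracket expression. The names add_asset, entry, and prev, and the two-space indentation, are assumptions made for illustration.

```shell
config='assets = [
  "../../new_app/assets"
  # [[NEW_APP_ADD_ASSETS]]
]'

# add_asset is a hypothetical helper; $1 is the new entry to insert.
add_asset() {
  awk -v entry="$1" '
    /# \[\[NEW_APP_ADD_ASSETS\]\]/ {
      print prev ","            # the held-back line gets its comma
      print "  \"" entry "\""   # new entry, two-space indent (assumed)
      prev = $0                 # keep the placeholder for the next run
      next
    }
    NR > 1 { print prev }       # emit the line held back from the last cycle
    { prev = $0 }
    END { if (NR > 0) print prev }
  '
}

printf '%s\n' "$config" | add_asset "../../other_app/assets"
```

Each line is printed one cycle late, which is what gives the script the chance to append a comma to the line before the placeholder.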

Ignoring lines with blank or space after character using sed

I am trying to use sed to extract some assignments being made in a text file. My text file looks like ...
color1=blue
color2=orange
name1.first=Ahmed
name2.first=Sam
name3.first=
name4.first=
name5.first=
name6.first=
Currently, I am using sed to print all the strings after the name#.first's ...
sed 's/name.*.first=//' file
But of course, this also prints all of the lines with no assignment ...
Ahmed
Sam
# I'm just putting this comment here to illustrate the extra carriage returns above; please ignore it
Is there any way I can get sed to ignore the lines with blank or whitespace only assignments and store this to an array? The number of assigned name#.first's is not known, nor are the number of assignments of each type in general.
This is a slight variation on sputnick's answer:
sed -n '/^name[0-9]\.first=\(.\+\)/ s//\1/p'
The first part (/^name[0-9]\.first=\(.\+\)/) selects the lines you want to pass to the s/// command. The empty pattern in the s command re-uses the previous regular expression and the replacement portion (\1) replaces the entire match with the contents of the first parenthesized part of the regex. Use the -n and p flags to control which lines are printed.
sed -n 's/^name[0-9]\.\w\+=\(\w\+\)/\1/p' file
Output
Ahmed
Sam
Explanations
the -n switch suppresses sed's default behavior of printing every line
s/// is the skeleton for a substitution
^ matches the beginning of a line
name is a literal string
[0-9] matches a single digit
\.\w\+ matches a literal dot (without the backslash, a dot means any character) followed by at least one (\+) word character [a-zA-Z0-9_]
\( \) is a capturing group and \1 refers to the captured text
the p flag prints the line only if the substitution succeeded
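Since \w and \+ are GNU extensions, a variant using only POSIX character classes may be handy on other systems. The sketch below is an assumption-laden alternative rather than either answer's command: the function name first_names is made up, multi-digit indexes are allowed with [0-9]+, and a whitespace-only assignment is added to the sample to show that it is skipped.

```shell
# Sample data based on the question, plus one whitespace-only assignment.
data='color1=blue
color2=orange
name1.first=Ahmed
name2.first=Sam
name3.first=
name4.first=   '

# first_names is a hypothetical helper name.
# [[:space:]]* skips leading blanks; [^[:space:]] requires at least one
# non-blank character, so empty and blank-only values never match.
first_names() {
  sed -n -E 's/^name[0-9]+\.first=[[:space:]]*([^[:space:]].*)$/\1/p'
}

printf '%s\n' "$data" | first_names
```

In bash 4 or later, the result could then be collected into an array with something like mapfile -t names < <(first_names < file).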

Resources