sed extract substring between two characters from a file and save to variable - bash

I am automatically building a package. The automated script needs to get the version of the package to build.
I need to get the version string from the Python script main.py. Line 15 reads:
VERSION="0.2.0.4" #DO NOT MOVE THIS LINE
I need the 0.2.0.4. In the future it could easily become 0.10.3.15 or similar, so the sed command must not assume a fixed length.
I found this on stackoverflow:
sed -n '15s/.*\#\([0-9]*\)\/.*/\1/p'
"This suppresses everything but the second line, then echoes the digits between # and /"
This does not work (adjusted). Which is the last "/"? How can I save the output into a variable called "version"?
version = sed -n ...
throws an error
command -n not found

If you just need the version number:
awk -F\" 'NR==15 {print $2}' main.py
This prints everything between the first pair of double quotes on line 15, for example 0.2.0.4.

With awk:
$ awk -F= 'NR==15 {gsub("\"","",$2); print $2}' main.py
0.2.0.4
Explanation
NR==15 performs actions on line number 15.
-F= defines the field separator as =.
{gsub("\"","",$2); print $2} removes the " character on the 2nd field and prints it.
Update
To be more specific, the line is version="0.2.0.4" #DO NOT MOVE THIS LINE
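$ awk -F'[=#]' 'NR==15 {gsub(/[" ]/,"",$2); print $2}' main.py
0.2.0.4
The multiple field separator -F'[=#]' means a field can end at either # or =. It is quoted so the shell cannot treat the brackets as a glob. The second field still carries the quotes and the space before #, so gsub strips both.
To save it into your version variable, use the expression var=$(command), like:
version=$(awk -F'[=#]' 'NR==15 {gsub(/[" ]/,"",$2); print $2}' main.py)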
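A variant that does not depend on the line number, as a minimal sketch: it assumes the line starts with VERSION= and that the value is double-quoted. The sample file contents and the temporary path are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for main.py.
tmp=$(mktemp)
printf '%s\n' '#!/usr/bin/env python' 'VERSION="0.10.3.15" #DO NOT MOVE THIS LINE' > "$tmp"

# Split on double quotes; field 2 is the text between the first pair.
# The separator is quoted so the shell passes it to awk unchanged.
version=$(awk -F'"' '/^VERSION=/ {print $2; exit}' "$tmp")

printf '%s\n' "$version"
rm -f "$tmp"
```

Splitting on the quote character avoids the gsub step, because the quotes are never part of the field.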

Try:
sed -n '15s/[^"]*"\(.*\)".*/\1/p' inputfile
In order to assign it to a variable, say:
VARIABLE=$(sed -n '15s/[^"]*"\(.*\)".*/\1/p' inputfile)
In order to remove the dependency that the VERSION would occur only on line 15, you could say:
sed -n '/^VERSION=/ s/[^"]*"\(.*\)".*/\1/p' inputfile
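One thing to watch with \(.*\) is that it is greedy: if the trailing comment ever contains another double quote, the capture runs up to the last quote on the line. A minimal sketch that stops at the first closing quote instead, assuming a made-up sample file in place of main.py:

```shell
#!/usr/bin/env bash
# Hypothetical sample file; note the extra quotes inside the comment.
tmp=$(mktemp)
printf '%s\n' 'import sys' 'VERSION="0.2.0.4" #DO NOT "MOVE" THIS LINE' > "$tmp"

# No spaces around "=" in the assignment; $(...) captures sed's output.
# [^"]* cannot cross a quote, so the capture ends at the first closing quote.
version=$(sed -n '/^VERSION=/ s/^[^"]*"\([^"]*\)".*/\1/p' "$tmp")

printf '%s\n' "$version"
rm -f "$tmp"
```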

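There must be no spaces around = when assigning a variable:
version=$(your code)
For example (without -i, which would edit main.py in place instead of printing the match):
version=$(sed -n '15s/[^"]*"\([^"]*\)".*/\1/p' main.py)
or, with backticks:
version=`sed -n '15s/[^"]*"\([^"]*\)".*/\1/p' main.py`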

Related

Trimming a textfile

I want to trim a text file and delete all lines from line n to the end of the file. I tried to use sed for that. For n=26 the sed command looks like this:
sed -i '26,$d' /path/to/textfile
In my text file I don't know n beforehand, but I know that the line contains a unique text. So I tried it this way:
myvar=`grep -n 'unique text' /path/to/textfile | awk -F":" '{print $1 }'`
sed -i "${myvar}"',$d' /path/to/textfile
That works and deletes all the wanted lines, but it prints the error message:
sed: -e expression # 1, character 1: unknown command: »,«
So I tried changing my command to:
myvar=`grep -n 'unique text' /path/to/textfile | awk -F":" '{print $1 }'`
sed -i "${myvar},$d" /path/to/textfile
With that I get the same error message, but it doesn't delete the lines.
I tried some variations with ' and " and different ways of putting the variable in there, but it never works as wanted. Does someone know what I am doing wrong?
I would appreciate other methods for trimming the text file, as long as I can use them in a bash script.
You can replace the fixed line number with a regular expression matching the line to start at.
sed -i '/unique text/,$d' /path/to/textfile
You can also use ed to edit the file, rather than rely on a non-standard sed extension.
printf '/unique text/,$d\nwq\n' | ed /path/to/textfile
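As for why the double-quoted attempt in the question deleted nothing: inside double quotes the shell expands $d as a variable (normally empty) before sed ever sees it. A minimal sketch, assuming a made-up sample file, that keeps the line-number approach but escapes the dollar sign and avoids -i by writing to a temporary file:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 'keep 1' 'keep 2' 'unique text here' 'drop 1' 'drop 2' > "$tmp"

# Line number of the first matching line.
n=$(grep -n 'unique text' "$tmp" | head -n 1 | cut -d: -f1)

# \$ keeps the dollar sign literal inside double quotes, so sed receives "3,$d".
sed "${n},\$d" "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"

result=$(cat "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

If the pattern is not found, n is empty and sed reports an error, so a real script would want to check for that first.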

how to grep everything between single quotes?

I am having trouble figuring out how to grep the characters between two single quotes.
I have this in a file
version: '8.x-1.0-alpha1'
and I would like the output to look like this (the version number can vary):
8.x-1.0-alpha1
I wrote the following but it does not work:
cat myfile.txt | grep -e 'version' | sed 's/.*\?'\(.*?\)'.*//g'
Thank you for your help.
Addition:
I used the sed command sed -n "s#version:\s*'\(.*\)'#\1#p"
I would also like to remove 8.x-, so I edited it to sed -n "s#version:\s*'8.x-\(.*\)'#\1#p".
This command only works on Linux and does not work on a Mac. How can I change it so that it works on a Mac?
sed -n "s#version:\s*'8.x-\(.*\)'#\1#p"
If you only want that one piece of information from the file, you can quickly do:
awk -F"'" '/version/{print $2}' file
Example:
$ echo "version: '8.x-1.0-alpha1'" | awk -F"'" '/version/{print $2}'
8.x-1.0-alpha1
How does this work?
An awk program is a series of pattern-action pairs, written as:
condition { action }
condition { action }
...
where condition is typically an expression and action a series of commands.
-F "'": Here we tell awk to define the field separator FS to be a <single quote> '. This means that every line is split into fields $1, $2, ..., $NF, with a ' between each pair of fields. We can now reference these fields by using $1 for the first field, $2 for the second, and so on up to $NF, where NF is the total number of fields on the line.
/version/{print $2}: This is the condition-action pair.
condition: /version/: The condition reads: if a substring in the current record/line matches the regular expression /version/, then do action. Here, this simply means: if the current line contains the substring version.
action: {print $2}: If the previous condition is satisfied, then print the second field. In this case, the second field is what the OP requests.
There are now several things that can be done.
Improve the condition to /^version:/ && NF==3, which reads: if the current line starts with the substring version: and has 3 fields, then do action.
If you only want the first occurrence, you can tell awk to exit immediately after the find by updating the action to {print $2; exit}
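Both refinements can be combined. A minimal sketch, assuming a made-up sample file with two version lines to show that only the first one is printed:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' "name: 'demo'" "version: '8.x-1.0-alpha1'" "version: '9.x-2.0'" > "$tmp"

# Splitting "version: '8.x-1.0-alpha1'" on ' yields three fields:
# "version: ", the quoted text, and an empty field after the closing quote.
version=$(awk -F"'" '/^version:/ && NF==3 {print $2; exit}' "$tmp")

printf '%s\n' "$version"
rm -f "$tmp"
```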
I'd use GNU grep with pcre regexes:
grep -oP "version: '\\K.*(?=')" file
where we are looking for "version: '" and then the \K directive will forget what it just saw, leaving .*(?=') to match up to the last single quote.
Try something like this: sed -n "s#version:\s*'\(.*\)'#\1#p" myfile.txt. This avoids the redundant cat and grep by finding the "version" line and extracting the contents between the single quotes.
Explanation:
the -n flag tells sed not to print lines automatically. We then use the p command at the end of our sed pattern to explicitly print when we've found the version line.
Search for pattern: version:\s*'\(.*\)'
version:\s* Match "version:" followed by any amount of whitespace
'\(.*\)' Match a single ', then capture everything up to the last ' on the line (.* is greedy)
Replace with: \1; This is the first (and only) capture group above, containing contents between single quotes.
When you only want to look at the quotes, you can use cut:
grep -e 'version' myfile.txt | cut -d "'" -f2
grep can almost do this alone:
grep -o "'.*'" file.txt
But this may also print lines you don't want: it prints every line that contains two single quotes ('), and the output still has the single quotes around it:
'8.x-1.0-alpha1'
But sed alone can do it properly:
sed -rn "s/^version: +'([^']+)'.*/\1/p" file.txt
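On the macOS part of the question: \s and the -r flag are GNU extensions, which is the likely reason the commands above fail with the BSD sed shipped on macOS. A minimal sketch that sticks to POSIX basic regular expressions and should therefore behave the same on both; the sample file is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' "name: 'demo'" "version: '8.x-1.0-alpha1'" > "$tmp"

# [[:space:]] replaces \s, and plain \( \) groups need neither -r nor -E.
# The literal prefix 8.x- sits outside the group, so it is dropped.
short=$(sed -n "s/^version:[[:space:]]*'8\.x-\([^']*\)'.*/\1/p" "$tmp")

printf '%s\n' "$short"
rm -f "$tmp"
```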

Insert a variable in a text file [duplicate]

I am trying to use the sed -i command to insert a string variable as the first line of a text file.
This command works: sed -i '1i header' file.txt
But when I pass a variable, it doesn't work.
Example:
var=$(cat <<-END
This is line one.
This is line two.
This is line three.
END
)
sed -i '1i $var' file.txt # doesn't work
sed -i ’1i $var’ file.txt # doesn't work
Any help with this problem
Thank you
First, let's define your variable a simpler way:
$ var="This is line one.
This is line two.
This is line three."
Since sed is not good at working with variables, let's use awk. This will place your variable at the beginning of a file:
awk -v x="$var" 'NR==1{print x} 1' file.txt
How it works
-v x="$var"
This defines an awk variable x to have the value of shell variable $var.
NR==1{print x}
At the first line, this tells awk to insert the value of variable x.
1
This is awk's shorthand for print-the-line.
Example
Let's define your variable:
$ var="This is line one.
> This is line two.
> This is line three."
Let's work on this test file:
$ cat File
1
2
This is what the awk command produces:
$ awk -v x="$var" 'NR==1{print x} 1' File
This is line one.
This is line two.
This is line three.
1
2
Changing a file in-place
To change file.txt in place using a recent GNU awk:
awk -i inplace -v x="$var" 'NR==1{print x} 1' file.txt
On macOS, BSD or older GNU/Linux, use:
awk -v x="$var" 'NR==1{print x} 1' file.txt >tmp && mv tmp file.txt
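One caveat with -v: awk processes backslash escape sequences in the assigned value, so a header containing \n or \t would be altered. A minimal sketch that passes the text through the environment instead, where it is taken literally; the variable name header and the sample contents are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample file and header text.
tmp=$(mktemp)
printf '%s\n' 1 2 > "$tmp"
var='C:\new\table'

# header=... is exported only for this one awk invocation.
# ENVIRON values are not subject to escape processing.
result=$(header="$var" awk 'NR==1 {print ENVIRON["header"]} 1' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```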
Using printf...
$ var="This is line one.
This is line two.
This is line three.
"
Use cat - to read from stdin and then print into a new file. Move it to the original file if you want to modify it.
$ printf '%s' "$var" | cat - file > newfile && mv newfile file
Not the best job for sed. What about a simple cat?
cat - file.txt <<EOF > newfile.txt
This is line one.
This is line two.
This is line three.
EOF
# you can add mv, if you really want the original file gone
mv newfile.txt file.txt
As for the original problem: sed does not like newlines and spaces in its 'program', so you need to quote and escape the line breaks:
# this works
sed $'1i "abc\\\ncde"' file.txt
# this does not, executes the `c` command from the second line
sed $'1i "abc\ncde"' file.txt
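The same idea applied to the multi-line variable from the question, as a minimal sketch. It assumes the text contains no backslashes of its own (those would need doubling as well), and the sample file is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 1 2 > "$tmp"
var="This is line one.
This is line two.
This is line three."

# Append a backslash to every line of the text except the last, which is
# how sed's "i" command expects continued text lines to be marked.
escaped=$(printf '%s\n' "$var" | sed '$!s/$/\\/')

# "1i\" followed by a newline and the text is the POSIX form of the command.
# Double quotes let the shell expand $escaped before sed parses the script.
result=$(sed "1i\\
$escaped" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```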

Extract first word in colon separated text file

How do I iterate through a file and print only the first word of each line? The lines are colon-separated, for example:
root:01:02:toor
The file contains several lines. This is what I've done so far, but it doesn't work:
FILE=$1
k=1
while read line; do
echo $1 | awk -F ':'
((k++))
done < $FILE
I'm not good with bash scripting at all, so this is probably very trivial for one of you.
Edit: the variable k is there to count the lines.
Use cut:
cut -d: -f1 filename
-d specifies the delimiter
-f specifies the field(s) to keep
If you need to count the lines, just use:
count=$( wc -l < filename )
-l tells wc to count lines
awk -F: '{print $1}' FILENAME
That will print the first word when separated by colon. Is this what you are looking for?
To use a loop, you can do something like this:
$ cat test.txt
root:hello:1
user:bye:2
test.sh
#!/bin/bash
while IFS=':' read -r line || [[ -n $line ]]; do
echo "$line" | awk -F: '{print $1}'
done < test.txt
Example of reading line by line in bash: Read a file line by line assigning the value to a variable
Result:
$ ./test.sh
root
user
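The loop can also do the splitting itself, without calling awk for every line. A minimal sketch, assuming the same kind of colon-separated input; the variable names first, rest, names and count are chosen for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 'root:hello:1' 'user:bye:2' > "$tmp"

count=0
names=""
# IFS=: applies only to read. "first" receives the text before the first
# colon and "rest" swallows the remainder of the line.
# The || test also handles a final line that lacks a trailing newline.
while IFS=: read -r first rest || [ -n "$first$rest" ]; do
    names="$names$first "
    count=$((count + 1))
done < "$tmp"

printf 'first words: %s\n' "$names"
printf 'lines: %s\n' "$count"
rm -f "$tmp"
```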
A solution using perl
%> perl -F: -ane 'print "$F[0]\n";' [file(s)]
change the "\n" to " " if you don't want a new line printed.
You can get the first word without any external commands in bash like so:
printf '%s' "${line%%:*}"
This takes the variable named line and removes the longest suffix that matches the glob :*, so everything from the first colon onward is deleted (that's the %% instead of a single %, which would remove only the shortest match).
Though with this solution you do need to do the loop yourself. If this is the only thing you want to do with the variable the cut solution is better so you don't have to do the file iteration yourself.
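A minimal sketch of that loop, showing the difference between %% and % on the sample line from the question; the temporary file is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 'root:01:02:toor' > "$tmp"

while IFS= read -r line; do
    longest=${line%%:*}   # strip the longest suffix matching ":*"
    shortest=${line%:*}   # strip only the shortest such suffix
    printf '%s\n' "$longest" "$shortest"
done < "$tmp"
rm -f "$tmp"
```

For the sample line, the first expansion leaves root and the second leaves root:01:02.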

How do I print a field from a pipe-separated file?

I have a file with fields separated by pipe characters and I want to print only the second field. This attempt fails:
$ cat file | awk -F| '{print $2}'
awk: syntax error near line 1
awk: bailing out near line 1
bash: {print $2}: command not found
Is there a way to do this?
Or just use one command:
cut -d '|' -f FIELDNUMBER
The key point here is that the pipe character (|) must be escaped from the shell. Use \| or '|' to protect it from shell interpretation and allow it to be passed to awk on the command line.
Reading the comments, I see that the original poster presents a simplified version of the original problem, which involved filtering the file before selecting and printing the fields. A pass through grep was used and the result piped into awk for field selection. That accounts for the wholly unnecessary cat file that appears in the question (it replaces the grep <pattern> file).
Fine, that will work. However, awk is largely a pattern matching tool on its own, and can be trusted to find and work on the matching lines without needing to invoke grep. Use something like:
awk -F\| '/<pattern>/{print $2;}{next;}' file
The /<pattern>/ bit tells awk to perform the action that follows on lines that match <pattern>.
The lost-looking {next;} is a default action skipping to the next line in the input. It does not seem to be necessary, but I have this habit from long ago...
The pipe character needs to be escaped so that the shell doesn't interpret it. A simple solution:
$ awk -F\| '{print $2}' file
Another choice would be to quote the character:
$ awk -F'|' '{print $2}' file
Another way using awk
awk 'BEGIN { FS = "|" } ; { print $2 }'
And 'file' contains no pipe symbols, so it prints nothing. You should either use 'cat file' or simply list the file after the awk program.
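Putting the pieces together, a minimal sketch with a quoted separator, a pattern filter in awk itself, and the file named as an argument. The sample data and the pattern /an/ are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical pipe-separated sample file.
tmp=$(mktemp)
printf '%s\n' 'a|apple|1' 'b|banana|2' 'c|cherry|3' > "$tmp"

# The quotes keep the shell from treating | as a pipeline operator.
# Only lines matching /an/ reach the print action.
result=$(awk -F'|' '/an/ {print $2}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```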

Resources