I'm trying to write a bash script to find all placeholders in a file.
For example, I have the following file:
<property name="sdfasdf" value="$ABC.D"></property>
<property name="sadf" value="$DFG.F.G"></property>
<property name="sadf" value="hello"></property>
<property name="ddd" value="$HJK"></property>
and I would like to get these:
$ABC.D
$DFG.F.G
$HJK
I tried many options but without success.
Could someone help me?
You can grep for the value attributes that hold placeholders, then grep that output again to extract the symbol names.
Example:
$ grep -o 'value="$.\+"' input.txt | grep -oE '\$(\w|\.)+'
$ABC.D
$DFG.F.G
$HJK
Note: this assumes there is only one placeholder value per line.
Details:
The -o flag prints only the part of each line that matches the pattern.
The -E flag enables extended regular expressions, used here to match either a word character or a literal dot.
You can use sed:
sed -n 's/.*value="\($[\.a-zA-Z]*\)".*/\1/p' ./input.txt
where input.txt file contains your text.
Here we use a capture group in the substitution to print only the matched placeholder rather than the entire matching line.
sed -nr 's%(\$[A-Za-z][A-Za-z.]*)%\n\1\n%gp' test | grep '^\$[A-Za-z][A-Za-z.]*'
This is a universal way to find placeholders: it works regardless of how many placeholders appear on a line and whether or not there is a "value=" context near them. All placeholders are printed to STDOUT.
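For comparison, the same extraction can be sketched with a single grep call. This is only a sketch: it assumes a grep that supports -o (GNU and BSD grep do), and the placeholder syntax used here (a $ followed by a letter or underscore, then letters, digits, underscores, or dots) is an assumption based on the sample in the question. The temporary file stands in for input.txt.

```shell
#!/bin/sh
# Hypothetical stand-in for input.txt, built from the sample in the question.
input=$(mktemp)
cat > "$input" <<'EOF'
<property name="sdfasdf" value="$ABC.D"></property>
<property name="sadf" value="$DFG.F.G"></property>
<property name="sadf" value="hello"></property>
<property name="ddd" value="$HJK"></property>
EOF

# -o prints each match on its own line, so several placeholders on one
# line would each be reported separately.
placeholders=$(grep -oE '\$[A-Za-z_][A-Za-z0-9_.]*' "$input")
printf '%s\n' "$placeholders"

rm -f "$input"
```

Because the pattern does not depend on value="...", it also reports placeholders that appear elsewhere on a line.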
Related
I have an input file
RAKESH_ONE
RAKESH-TWO
RAKESH123
RAKESHTHREE
/RAKESH/
FIVERAKESH
456RAKESH
WELCOME123
This is RAKESH
I would like to get the output
RAKESH_ONE
RAKESH-TWO
/RAKESH/
This is RAKESH
I want to print the lines that match the pattern RAKESH. Lines where RAKESH is directly preceded or followed by an alphanumeric character should be skipped.
([^a-zA-Z0-9]+|^)RAKESH([^a-zA-Z0-9]+|$)
This matches RAKESH only where no alphanumeric character comes directly before or after it. It does not match the whole line, but used with grep or sed it lets you output just the lines you need.
UPDATE
As requested, here's the full grep command. Use the -E option to enable extended regex, and single quotes so the shell leaves the $ alone:
grep -E '([^a-zA-Z0-9]+|^)RAKESH([^a-zA-Z0-9]+|$)' file.txt
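A quick way to check the pattern is to run it against the sample input from the question. The sketch below assumes a temporary file in place of file.txt, and uses a slightly simplified form of the regex (a single non-alphanumeric character on each side is enough to decide whether a line matches).

```shell
#!/bin/sh
# Hypothetical stand-in for file.txt, holding the lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
RAKESH_ONE
RAKESH-TWO
RAKESH123
RAKESHTHREE
/RAKESH/
FIVERAKESH
456RAKESH
WELCOME123
This is RAKESH
EOF

# Keep lines where RAKESH has no letter or digit directly before or after it.
matches=$(grep -E '(^|[^a-zA-Z0-9])RAKESH([^a-zA-Z0-9]|$)' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```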
To satisfy some legacy code I have to append a date to a filename as shown below (it is definitely needed, and I cannot modify the legacy code). But I need to remove the date again within the same command, without going to a new line. The command is read from a text file, so this has to be done in a single command.
$(echo "$file_name".`date +%Y%m%d`| sed 's/^prefix_//')
Here I am removing the prefix from the filename and appending a date to it. I also want to remove the date that I added. For example, prefix_filename.txt or prefix_filename.zip should give me the following.
Expected output:
filename.txt
filename.zip
Current output:
filename.txt.20161002
filename.zip.20161002
Assuming all the files are named filename.ext.date, you can pipe the output to the cut command and keep only the first and second fields:
~> X=filename.txt.20161002
~> echo "$X" | cut -d"." -f1,2
filename.txt
I am not sure that I understand your question correctly, but perhaps this does what you want:
$(echo "$file_name".`date +%Y%m%d`| sed -e 's/^prefix_//' -e 's/\.[^.]*$//')
Sample input:
cat sample
prefix_original.txt.log.tgz.10032016
prefix_original.txt.log.10032016
prefix_original.txt.10032016
prefix_one.txt.10032016
prefix.txt.10032016
prefix.10032016
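Match from the start of the line up to, but not including, the last literal dot "." that is followed by a digit. The -P option enables Perl-compatible regex, which the lookahead (?=...) requires: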
grep -oP '^.*(?=\.\d)' sample
prefix_original.txt.log.tgz
prefix_original.txt.log
prefix_original.txt
prefix_one.txt
prefix.txt
prefix
If some lines may have no date suffix, the following variant prints those lines unchanged instead of dropping them:
grep -oP '^.*(?=\.\d)|^.*$' sample
If I understand your question correctly, you want to remove the date part from a variable, AND you already know from the context that the variable DOES contain a date part and that this part comes after the last period in the name.
In this case, the question boils down to removing the last period and what comes after.
This can be done (POSIX shell, bash, zsh, ksh) with
filename_without=${filename_with%.*}
assuming that filename_with contains the filename which has the date part in the end.
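Put together with the prefix removal from the question, this could look like the sketch below. The variable file_name comes from the question; stamped, no_date and result are hypothetical names used only for illustration.

```shell
#!/bin/sh
file_name='prefix_filename.txt'

# What the legacy step produces: the name with a date appended.
stamped="${file_name}.$(date +%Y%m%d)"

# %.* removes the shortest suffix that starts with a dot (the date part).
no_date=${stamped%.*}

# #prefix_ removes the literal prefix from the front, if present.
result=${no_date#prefix_}

printf '%s\n' "$result"
```

No external command is needed for either removal, which may help if the whole thing has to stay on one line.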
% cat example
filename.txt.20161002
filename.zip.20161002
% sed 's/\.[0-9][0-9]*$//' example
filename.txt
filename.zip
%
I have these two lines within a file:
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
where I'd like to get the following as output using awk or sed:
3
50000
Using this sed command does not work as I had hoped, and I suspect this is due to the presence of the quotes and delimiters in my line entry.
sed -n '/WORD1/,/WORD2/p' /path/to/file
How can I extract the values I want from the file?
awk -F'[<>]' '{print $3}' input.txt
input.txt:
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
Output:
3
50000
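The reason $3 is the right field: with -F'[<>]' awk splits the line at every < and >, so field 1 is the empty string before the first <, field 2 is the opening tag with its attributes, and field 3 is the text between the tags. A small sketch that prints every field for one of the sample lines (the variable names are only illustrative):

```shell
#!/bin/sh
line='<first-value system-property="unique.setting.limit">3</first-value>'

# Show how the line is split when both < and > act as field separators.
printf '%s\n' "$line" |
  awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "field %d: [%s]\n", i, $i }'

# The value itself is the third field.
value=$(printf '%s\n' "$line" | awk -F'[<>]' '{ print $3 }')
printf '%s\n' "$value"
```

This relies on each line holding exactly one element; nested or multi-line elements would need a real XML parser.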
sed -e 's/[a-zA-Z.<\/>=" -]//g' file
Using sed:
sed -E 's/.*limit"*>([0-9]+)<.*/\1/' file
Explanation:
.* matches everything that comes before the string limit.
limit"* covers both lines: the one with limit" and the one with just limit.
([0-9]+) matches the number, and only digits, as stated in your requirement.
\1 is a backreference to the first capture group. When part of a pattern is enclosed in parentheses, the text it matches is captured and stored temporarily so it can be reused in the replacement. For more details, please refer to https://www.inkling.com/read/introducing-regular-expressions-michael-fitzgerald-1st/chapter-4/capturing-groups-and
The script solution with parameter expansion:
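#!/bin/bash
# Print the text between the opening and closing tag of each line of file $1.
# "|| test -n" also handles a last line that has no trailing newline.
while IFS= read -r line || test -n "$line" ; do
value="${line%<*}"              # strip the closing tag (shortest suffix from a "<")
printf "%s\n" "${value##*\>}"   # strip everything up to the last remaining ">"
done <"$1"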
output:
$ ./ltags.sh dat/ltags.txt
3
50000
Looks like XML to me, so assuming it forms part of some valid XML, e.g.
<root>
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
</root>
You can use Perl's XML::Simple and do something like this:
perl -MXML::Simple -E '$xml = XMLin("file"); say $xml->{"first-value"}->{"content"}; say $xml->{"second-value-limit"}'
Output:
3
50000
If the XML structure is more complicated, then you may have to drill down a bit deeper to get to the values you want. If that's the case, you should edit the question to show the bigger picture.
Ashkan's awk solution is straightforward, but let me suggest a sed solution that accepts non-integer numbers:
sed -n 's/[^>]*>\([.[:digit:]]*\)<.*/\1/p' input.txt
This extracts the number between the first > character on the line and the following <. With this RE the "number" can be the empty string; if you don't want to accept an empty string, add the -r option to sed and replace \([.[:digit:]]*\) with ([.[:digit:]]+).
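The stricter variant can be tried on a small file. In this sketch the third-value and empty-value lines are hypothetical additions to the sample, included only to show a decimal value being accepted and an empty element being skipped; -E is used in place of -r (GNU sed accepts both).

```shell
#!/bin/sh
# Hypothetical test file: the two lines from the question plus two extras.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<first-value system-property="unique.setting.limit">3</first-value>
<second-value-limit>50000</second-value-limit>
<third-value>2.5</third-value>
<empty-value></empty-value>
EOF

# Require at least one digit or dot between the first > and the next <.
values=$(sed -n -E 's/[^>]*>([.[:digit:]]+)<.*/\1/p' "$xml")
printf '%s\n' "$values"

rm -f "$xml"
```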
I need to write a shell script that does the following:
1. In a given folder with files that fit the pattern update-8.1.0-v46.sql, find the maximum version.
2. Write the maximum version found into a configuration file.
For 1, I've found the following answer: Shell script: find maximum value in a sequence of integers without sorting
The only problem I have is that I can't get down to a list of only the versions,
I tried:
ls | grep -o "update-8.1.0-v\(\d*\).sql"
but I get the entire file name in return and not just the matching part
Any ideas?
Maybe move everything to awk?
I ended up using:
SCHEMA=`ls database/targets/oracle/ | grep -o "update-$VERSION-v.*.sql" | sed "s/update-$VERSION-v\([0-9]*\).sql/\1/p" | awk '$0>x{x=$0};END{print x}'`
based on dreamer's answer
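You can use sed for this:
echo "update-8.1.0-v46.sql" | sed -n 's/update-8\.1\.0-v\([0-9]*\)\.sql/\1/p'
The output in this case will be 46. The -n option suppresses sed's automatic printing, so only the line printed by the p flag appears (without it, 46 would be printed twice).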
grep isn't really the best tool for extracting captured matches, but you can use look-behind assertions if you switch it to use perl-like regular expressions. Anything in the assertion will not be printed when using the -o flag.
ls | grep -Po "(?<=update-8.1.0-v)\d+"
46
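An alternative that avoids parsing ls output is a glob loop with parameter expansion. This is a sketch under assumptions: the temporary directory and its three files are made up for the demonstration, and version and max are hypothetical variable names.

```shell
#!/bin/sh
# Hypothetical directory with a few update scripts.
dir=$(mktemp -d)
touch "$dir/update-8.1.0-v3.sql" "$dir/update-8.1.0-v46.sql" "$dir/update-8.1.0-v12.sql"

version='8.1.0'
max=0
for f in "$dir"/update-"$version"-v*.sql; do
  [ -e "$f" ] || continue          # the glob matched nothing
  n=${f##*-v}                      # drop everything up to the last "-v"
  n=${n%.sql}                      # drop the extension
  case $n in
    ''|*[!0-9]*) continue ;;       # skip anything that is not a plain integer
  esac
  if [ "$n" -gt "$max" ]; then
    max=$n
  fi
done
printf '%s\n' "$max"

rm -rf "$dir"
```

The comparison is numeric, so v12 ranks above v3, which a plain lexical sort would get wrong.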
I've never used sed apart from the few hours trying to solve this. I have a config file with parameters like:
test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value
I need to parse these and output this if REGIONID is US:
test.param=value
prod.param=value
Any help on how to do this (with sed or otherwise) would be great.
This works for me:
sed -n 's/\.us\././p'
i.e. if the ".us." can be replaced by a dot, print the result.
If there are hundreds and hundreds of lines, it might be more efficient to first search for lines containing .us. and then do the string replacement. AWK is another good choice, or pipe grep into sed:
grep '\.us\.' INPUT_FILE | sed 's/\.us\././g'
Of course if '.us.' can be in the value this isn't sufficient.
You could also do this with sed's address syntax (technically you can fold the second sed into the first command as well; I just can't remember the syntax):
sed -n '/\(prod\|test\)\.us\.[^=]*=/p' FILE | sed 's/\.us\././g'
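Folding the two steps into one sed call can be sketched as follows: an address selects the lines, and the s command with the p flag runs only on those lines. The temporary file stands in for FILE, and -E is used so the alternation can be written without backslashes.

```shell
#!/bin/sh
# Hypothetical stand-in for FILE, using the lines from the question.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value
EOF

# /address/ picks prod.us.* and test.us.* keys; s///p rewrites and prints them.
combined=$(sed -n -E '/^(prod|test)\.us\.[^=]*=/s/\.us\././p' "$cfg")
printf '%s\n' "$combined"

rm -f "$cfg"
```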
We should probably do something cleaner. If the format is always environment.region.param, we can restrict the change to the text before the equals sign:
sed -n 's/^\([^.=]*\)\.us\.\([^=]*\)=/\1.\2=/p'
This only matches lines that start with any number of characters other than '.' or '=', followed by '.us.', then any number of characters before the '=' sign. This way a '.us.' found within a value is never modified.
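To drive this from the REGIONID mentioned in the question, the region can be spliced into the expression. A sketch, assuming REGIONID holds an upper-case code such as US that contains no regex metacharacters; the last line of the sample file is a hypothetical addition that shows a '.us.' inside a value being left alone.

```shell
#!/bin/sh
REGIONID=US
# The keys in the file use lower case, so convert the region code first.
region=$(printf '%s' "$REGIONID" | tr '[:upper:]' '[:lower:]')

cfg=$(mktemp)
cat > "$cfg" <<'EOF'
test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value
other.param=see.us.docs
EOF

# Double quotes let the shell expand $region inside the sed expression.
filtered=$(sed -n "s/^\([^.=]*\)\.$region\.\([^=]*\)=/\1.\2=/p" "$cfg")
printf '%s\n' "$filtered"

rm -f "$cfg"
```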