Bash regex to match a word followed by numbers or not [duplicate] - bash

This question already has an answer here:
Regex - two specific digits followed by optional digits
(1 answer)
Closed 3 years ago.
I want to match these strings: value, value1, value2.
I can match the numbered forms so far, but I also need to match the word when no digits follow it.
sed -e 's/value[0-9]//g'

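You can combine multiple expressions into one sed script by separating them with semicolons. Run the digit form first; otherwise the bare-word substitution turns value1 into a stray 1. Hope this helps.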
sed 's/value[0-9]//g;s/value//g' inputfile
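The two substitutions can also be folded into one by making the digits optional. The sketch below is one way to do that; the function name strip_values and the sample input are made up for illustration. Note that [0-9]* removes any run of digits, which is slightly broader than the single digit in the original expression.

```shell
# Remove "value" together with any digits that follow it (zero or more).
# strip_values is a hypothetical helper name, not part of the original answer.
strip_values() {
  sed 's/value[0-9]*//g'
}

printf '%s\n' 'x=value1;y=value;z=value22' | strip_values
# prints: x=;y=;z=
```

To allow at most one digit, as in the original expression, the basic-regex interval value[0-9]\{0,1\} can be used in place of value[0-9]*.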

Related

Adding padded zeros to name files [duplicate]

This question already has answers here:
How to zero pad a sequence of integers in bash so that all have the same width?
(15 answers)
Closed 2 years ago.
Based off this: How to zero pad a sequence of integers in bash so that all have the same width?
I need to create file names, to store in an array, for chromosomes 1-22 using three-digit numbers (chromsome001_results_file.txt..chromsome022_results_file.txt).
Before switching to three-digit numbering (which sorts more easily), I was using:
for i in {1..22};
do echo chromsome${i}_results_file.txt;
done
I have read about printf and seq, but I am not sure how to use them inside a loop, with text on both sides, so that 001 to 022 end up embedded in the file name.
Many thanks
Use printf, specifying a field width and zero padding.
for i in {1..22};
do
printf 'chromsome%03d_results_file.txt\n' "$i"
done
In %03d, d means decimal output, 3 is the minimum field width (3 characters), and 0 means pad with zeros instead of spaces.
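Since the question also mentions seq, here is a sketch of the same result without an explicit loop. It assumes a seq that supports -f (GNU coreutils and the BSD seq both do); the function name chromosome_files is made up for illustration. seq only accepts floating-point conversions, so %03g stands in for %03d.

```shell
# Let seq produce the padded number and the surrounding text in one call.
# chromosome_files is a hypothetical helper name.
chromosome_files() {
  seq -f 'chromsome%03g_results_file.txt' 1 22
}

chromosome_files | head -n 3
# prints:
# chromsome001_results_file.txt
# chromsome002_results_file.txt
# chromsome003_results_file.txt
```

In bash 4 or later, the output can be loaded into an array with mapfile -t files < <(chromosome_files).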

Bash - how to sort negative values? [duplicate]

This question already has answers here:
Linux sort doesn't work with negative float numbers
(3 answers)
Closed 6 years ago.
Is there a way to sort negative numbers with sort in bash? I have sine values written out:
...
0.250109
0.188852
0.126850
0.064349
0.001593
-0.061168
-0.123689
-0.185722
-0.247023
-0.307349
...
The problem is that when I run sort on it, it sorts by absolute value, ignoring the minus sign in front of some values. Is there a way to fix this? Thanks.
Use sort -g (--general-numeric-sort), not sort -n (--numeric-sort).
See sort Invocation for an explanation of the subtle differences between these two options.
My problem was that the data wasn't formatted for my locale: I had to use sed to turn the decimal point into a decimal comma, which is how numbers are written in the Czech Republic.
Thanks.
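As an alternative to rewriting the data, the locale can be overridden for the sort command alone. The sketch below assumes the input uses a decimal point; in the C locale sort -n treats the point as the radix character and orders negative values correctly. The function name sort_numeric_c is made up for illustration.

```shell
# Sort numerically in the C locale so "." is the decimal separator,
# whatever the user's own locale says. sort_numeric_c is a hypothetical name.
sort_numeric_c() {
  LC_ALL=C sort -n
}

printf '%s\n' 0.250109 -0.061168 0.001593 -0.307349 | sort_numeric_c
# prints:
# -0.307349
# -0.061168
# 0.001593
# 0.250109
```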

Ruby regular expression for sequence with specified start and end [duplicate]

This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 7 years ago.
I have this string:
mRNA = "gcgagcgagcaugacgcauguactugacaugguuuaaggccgauuagugaaugugcagacgcgcauaguggcgagcuaaaaacat"
I want to upcase subsequences out of this given sequence. A subsequence should start with aug and should end with either uaa, uag or uga.
When I use the following regular expression in combination with gsub!:
mRNA.gsub!(/(aug.*uaa)|(aug.*uag)|(aug.*uga)/, &:upcase)
it results in
gcgagcgagcAUGACGCAUGUACTUGACAUGGUUUAAGGCCGAUUAGUGAAUGUGCAGACGCGCAUAGUGGCGAGCUAAaaacat
I don’t understand why it upcases one whole chunk instead of giving me two subsequences like this:
gcgagcgagcAUGACGCAUGUACTUGACAUGGUUUAAggccgauuagugaAUGUGCAGACGCGCAUAGuggcgagcuaaaaacat
What regular expression can I use to achieve this?
The * quantifier in .* is "greedy": it grabs as many characters as it can while still allowing the pattern to match.
To grab as few characters as possible, use the "non-greedy" (lazy) form, .*?.
Modifying your original regex:
mRNA.gsub!(/(aug.*?uaa)|(aug.*?uag)|(aug.*?uga)/, &:upcase)
There are certainly smaller regexes that will do the job, though. Using @stribizhev's suggestion:
mRNA.gsub!(/aug.*?(?:uaa|uag|uga)/, &:upcase)

Find the words in string with no spaces [duplicate]

This question already has answers here:
Detect most likely words from text without spaces / combined words
(5 answers)
Closed 8 years ago.
Let's suppose we have a string with no spaces:
Input : "putreturnsbetwenparagaphs"
Output : put returns between paragraphs
This could get more complex as more words overlap. How can this be done really fast? If required, it should also correct spelling while splitting the words.
One problem could be the plural or the case of a word. In your example it could be difficult to tell paragraph from paragraphs.
Do you have more information? Are the words in an explicit grammatical form, or could any inflected form of a dictionary word (case, number, etc.) occur?

Is there a way to check if two regexps can match the same string? [duplicate]

This question already has answers here:
Regex: Determine if two regular expressions could match for the same input?
(5 answers)
Closed 10 years ago.
I have two regexps. I need to determine whether it is possible to build a string of a given length that matches both regexps simultaneously. I need an algorithm to do that.
The string's length won't exceed 20 characters.
It depends. For Perl-compatible regular expressions (PCRE), this is not possible in general: features such as backreferences and recursion take them beyond the regular languages, so the automaton construction below does not apply.
For the original, "clean" form of regular languages as defined in the Chomsky hierarchy, it is known that they are closed under intersection; this is already discussed in this thread.
Once you have the NFA for the intersection, it is easy to check whether any string matches it: if there is a path from the start state to an accepting state, the labels along that path spell out the string you are searching for. Because you need a string of a given length, look for a path with exactly that many transitions. For DFAs, an algorithm is given here; it should be simple to adapt it to NFAs.
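When the alphabet and the length are small, a brute-force check is a simple stand-in for the automaton construction. To be clear, the sketch below is not the NFA intersection described above: it enumerates every string of the given length and keeps the first one that both patterns match in full. The number of candidates grows as alphabet size to the power of the length, so it is only practical for tiny alphabets. The function names all_strings and common_match, and the use of POSIX extended regular expressions via grep -E, are assumptions made for illustration.

```shell
# Hypothetical helper: print every string of length $2 over the characters in $1.
all_strings() {
  awk -v alphabet="$1" -v n="$2" 'BEGIN {
    k = length(alphabet)
    total = k ^ n
    for (i = 0; i < total; i++) {
      s = ""; x = i
      # Write i in base k, using the alphabet characters as digits.
      for (j = 0; j < n; j++) { s = substr(alphabet, x % k + 1, 1) s; x = int(x / k) }
      print s
    }
  }'
}

# Print the first string of length $4 over alphabet $3 that both EREs
# ($1 and $2) match in full; print nothing if there is none.
common_match() {
  all_strings "$3" "$4" | grep -Ex -e "$1" | grep -Ex -e "$2" | head -n 1
}

common_match 'a+b*' '(ab)+b' 'ab' 3
# prints: abb
```

An empty result means no string of that length over the given alphabet matches both patterns.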
