I am looking for a way to search a string starting from the last character.
I have a solution that uses a FOR/NEXT loop and checks the string one character at a time, but there must be a smarter way to do this. I have tried
pos (" "=i$)
but this statement starts from the beginning.
The loop will work, but it's slow. There is a built-in function you can use:
I$="123 ABC DEF"
X = POS (" "=I$,-1)
This gives you the position of the last space in I$.
The result is 8.
Another option is MASK(), which works much like the Unix "grep".
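For readers who want to compare against another language, here is a minimal sketch of the same reverse search in Ruby. It is for illustration only and is not BBx code. The helper names last_index_of and after_last are hypothetical. Ruby counts positions from 0, so the index it reports is not directly comparable to a BASIC position.

```ruby
# Return the zero-based index of the last occurrence of needle, or nil if absent.
# String#rindex scans the string from the end.
def last_index_of(text, needle)
  text.rindex(needle)
end

# Return the part of text after the last occurrence of sep,
# or the whole text when sep does not occur.
def after_last(text, sep)
  idx = text.rindex(sep)
  return text if idx.nil?

  text[(idx + sep.length)..-1]
end

puts last_index_of('123 ABC DEF', ' ')
puts after_last('123 ABC DEF', ' ')
```

As with the built-in function, no explicit loop over the characters is needed.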
Every now and then I stumble over an error message like the one in this case:
if "," in text.erase():
    print ("comma erased")
error(109,1): Too few arguments for "erase()" call. Expected at least 2.
Whatever I try to put into those (), nothing seems to work. How can I find out what arguments I need in such a case?
Provided you have some basic programming knowledge, the editor's Search Help offers useful information in such a case:
void erase ( int position, int chars )
Erases chars characters from the string starting from position.
I have a column in Power Query (standalone Power Query with Excel), with text like this:
"Hazelnut Berries Nuts Raspberry"
I need to be able to identify whether there is more than one instance of "nut" (or "berry") in it and remove the generic word, so that the result is
"Hazelnut Raspberry"
I have seen this post, but it only works when whole words are repeated.
I'm not entirely certain about your criteria for finding the words you want to remove (PQ is fairly limited in how it can evaluate this with built-in functions anyway). This will look through that string and remove any words that start with "Nut" or "Berr":
Text.Combine(List.Transform(Text.Split("Hazelnut Berries Nuts Raspberry", " "), each if (Text.StartsWith(_, "Nut") or Text.StartsWith(_, "Berr")) then null else _), " ")
That gives your desired output. If you need more detailed criteria for evaluating each word, that would probably need a custom function.
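To make the stricter criterion from the question concrete (drop the generic word only when the stem occurs more than once), here is a rough sketch of the logic in Ruby rather than M. It is meant only as an illustration of what such a custom function could do. The stem list and the name drop_generic_words are assumptions, not anything from Power Query.

```ruby
# Hypothetical stems; "berr" covers both "berry" and "berries".
GENERIC_STEMS = %w[nut berr].freeze

# Remove words that START with a stem, but only when at least two words
# in the text CONTAIN that stem (case-insensitive).
def drop_generic_words(text, stems = GENERIC_STEMS)
  words = text.split
  stems.each do |stem|
    matching = words.select { |w| w.downcase.include?(stem) }
    next if matching.size < 2

    # Keep specific words such as "Hazelnut"; drop generic ones such as "Nuts".
    words = words.reject { |w| w.downcase.start_with?(stem) }
  end
  words.join(' ')
end

puts drop_generic_words('Hazelnut Berries Nuts Raspberry')
```

A lone generic word (for example "Mixed Nuts") is left alone, because the stem occurs only once.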
List.Distinct: https://learn.microsoft.com/en-ie/powerquery-m/list-distinct should do it; something like: List.Distinct(Text.Split("Hazelnut Berries Nuts Raspberry", " "))
You might need a bit more if your list could contain multiple spaces or other "stuff"
I'm trying to import a Google Play Store description into a Google spreadsheet, and that works fairly well with this formula:
=importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "//div[@itemprop='description']")
However, I'm running into the issue that this:
Keeping up with friends is faster than ever.<p>• See what friends are up to...</p>
Will be parsed as:
"Keeping up with friends is faster than ever.• See what friends are up to..."
Ideally I'd like to see the <p> tag replaced by a break, or at least a space. I've been trying the following formula:
=importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "normalize-space(translate(//div[@itemprop='description'],'&quot;',' '))")
but this removes every occurrence of the characters &, q, u, o, t and ; individually.
How can I replace these HTML tags with a break or a space?
You can actually use this:
=join(char(10),IMPORTXML("https://play.google.com/store/apps/details?id=com.facebook.katana","//*[@jsname='C4s9Ed']"))
which gives you a newline for each element. Note that if you also want to get rid of the • in the first example, you would substitute it with a space or a newline.
If you just want a space instead of a newline, change the char(10) to " ".
Here is another app page I tried it with:
=join(char(10),IMPORTXML("https://play.google.com/store/apps/details?id=com.facebook.orca","//*[@jsname='C4s9Ed']"))
Try:
=SUBSTITUTE(importXML("https://play.google.com/store/apps/details?id=com.facebook.katana", "//div[@itemprop='description']"), "•"," ")
This is my code
stopwordlist = "a|an|all"
File.open('0_9.txt').each do |line|
  line.downcase!
  line.gsub!( /\b#{stopwordlist}\b/,'')
  File.open('0_9_2.txt', 'w') { |f| f.write(line) }
end
I wanted to remove the words a, an and all.
Instead, it also matches substrings and removes them.
For example, for the input -
Bromwell High is a cartoon comedy. It ran at the same time as some other programs about school life
I get the output -
bromwell high is cartoon comedy. it r t the same time s some other programs bout school life
As you can see, it matched substrings.
How do I make it match only whole words and not substrings?
The | operator in regex has the lowest precedence, so it takes the widest scope possible. Your original regex matches either \ba or an or all\b.
Change the whole regex to:
/\b(?:#{stopwordlist})\b/
Or change stopwordlist into a regex instead of a string, since an interpolated regex is inserted as its own group:
stopwordlist = /a|an|all/
Even better, you may want to use Regexp.union.
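A minimal sketch of the Regexp.union approach, assuming the stop words are kept in an array. The names STOPWORDS, STOPWORD_RE and strip_stopwords are made up for this example, and the whitespace cleanup at the end is an extra that the question did not ask for.

```ruby
# Stop words from the question, as an array instead of a "|"-joined string.
STOPWORDS = %w[a an all].freeze

# Regexp.union escapes each word and joins them with "|".
# Interpolating the resulting Regexp inserts it as a group, so the \b on
# each side applies to the whole alternation, not just the first or last word.
STOPWORD_RE = /\b#{Regexp.union(STOPWORDS)}\b/

# Lowercase the line, remove whole-word stop words, and tidy leftover spaces.
def strip_stopwords(line)
  line.downcase.gsub(STOPWORD_RE, '').squeeze(' ').strip
end

puts strip_stopwords('Bromwell High is a cartoon comedy. It ran at the same time as all others')
```

Words such as "ran", "at" and "as" are left alone because the match must begin and end at a word boundary.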
Try this:
\ba\b|\ban\b|\ball\b
It puts a word boundary on both sides of each alternative.
I'm writing a piece of software using RealBASIC 2011r3 and need a reliable, cross-platform way to break a string out into paragraphs. I've been using the following but it only seems to work on Linux:
dim pTemp() as string
pTemp = Split(txtOriginalArticle.Text, EndOfLine + EndOfLine)
When I try this on my Mac it returns it all as a single paragraph. What's the best way to make this work reliably on all three build targets that RB supports?
EndOfLine changes depending on the platform, and the line endings inside a string depend on the platform that created the string. You'll need to check for the type of EndOfLine in the string. I believe it's sMyString.EndOfLineType. Once you know what it is you can then split on it.
EndOfLine also has platform-specific properties: EndOfLine.Macintosh, EndOfLine.Windows and EndOfLine.Unix.
EndOfLine docs: http://docs.realsoftware.com/index.php/EndOfLine
I almost always search for and replace the combinations of line break characters before continuing. I'll usually do a few lines of:
yourString = replaceAll(yourString,chr(13)+chr(10),"<someLineBreakHolderString>")
yourString = replaceAll(yourString,chr(10)+chr(13),"<someLineBreakHolderString>")
yourString = replaceAll(yourString,chr(10),"<someLineBreakHolderString>")
yourString = replaceAll(yourString,chr(13),"<someLineBreakHolderString>")
The order here matters: replace the two-character pairs before an individual 10 or 13, because you don't want to end up replacing a line break that contains both a 13 and a 10 with two of your line break holders. The 13+10 pair (the Windows line ending) goes first, because two consecutive Windows line breaks (13, 10, 13, 10) contain a 10+13 in the middle that would otherwise be matched first.
It's a bit cumbersome and I wouldn't recommend using it to actually modify the original string, but it definitely helps to convert all of the line breaks to the same item before attempting to further parse the string.
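The normalize-then-split idea can be sketched as follows. This is Ruby rather than RealBASIC, used only to illustrate the approach. The name split_paragraphs is hypothetical, and a single regular expression stands in for the chain of replaceAll calls.

```ruby
# Convert Windows (CR LF) and old Mac (CR) line breaks to Unix (LF),
# then split on runs of two or more line breaks to get paragraphs.
def split_paragraphs(text)
  # "\r\n" is listed first so a Windows line break becomes one "\n", not two.
  normalized = text.gsub(/\r\n|\r/, "\n")
  normalized.split(/\n{2,}/).map(&:strip).reject(&:empty?)
end

sample = "First paragraph.\r\n\r\nSecond one,\rstill second.\r\rThird.\n\nFourth."
split_paragraphs(sample).each { |paragraph| puts paragraph.inspect }
```

Working on a normalized copy leaves the original string untouched, in line with the caution above about not modifying it.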