How do I remove HTML encoded characters from a string?

How do I remove HTML encoded characters from a string? - ruby

I have a string which contains some HTML encoded characters and I want to remove them:
"<div>Hi All,</div><div class=\"paragraph_break\">< /></div><div>Starting today we are initiating PoLS.</div><div class=\"paragraph_break\"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class=\"paragraph_break\"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class=\"paragraph_break\"><br /></div><div>All the best!</div><div class=\"paragraph_break\"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>"

What you want to do is doable many ways. Perhaps looking at why you might want to do that will help. Usually when I want to remove encoded HTML, I want to recover the contents of the HTML. Ruby has some modules that make it easy.
require 'cgi'
require 'nokogiri'
html = "<div>Hi All,</div><div class=\"paragraph_break\">< /></div><div>Starting today we are initiating PoLS.</div><div class=\"paragraph_break\"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class=\"paragraph_break\"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class=\"paragraph_break\"><br /></div><div>All the best!</div><div class=\"paragraph_break\"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>"
puts CGI.unescapeHTML(html)
which outputs:
<div>Hi All,</div><div class="paragraph_break">< /></div><div>Starting today we are initiating PoLS.</div><div class="paragraph_break"><br /></div><div>Please use the following communication protocols:<br /></div><div>1. Task Breakup and allocation - Gravity<br /></div><div>2. All mail communications - BC messages<br /></div><div>3. Reports on PoC / Spikes: Writeboard<br /></div><div>4. Non story related tasks: BC To-Do<br /></div><div>5. All UI and HTML will communicated to you through BC.<br /></div><div>6. For File sharing, we'll be using Dropbox.<br /></div><div>7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.</div><div class="paragraph_break"><br /></div><div>You'll have been given necessary accesses to all these portals. Please start using them judiciously.</div><div class="paragraph_break"><br /></div><div>All the best!</div><div class="paragraph_break"><br /></div><div>Thanks,<br /></div><div>Saurav<br /></div>
If I want to take it a step farther and remove the tags, retrieving all the text:
puts Nokogiri::HTML(CGI.unescapeHTML(html)).content
Will output:
Hi All,Starting today we are initiating PoLS.Please use the following communication protocols:1. Task Breakup and allocation - Gravity2. All mail communications - BC messages3. Reports on PoC / Spikes: Writeboard4. Non story related tasks: BC To-Do5. All UI and HTML will communicated to you through BC.6. For File sharing, we'll be using Dropbox.7. Use Skype for lighter and generic desicussions. However, in case you need any approvals, data for later reference, etc, then please use BC. PoLS conversation has been created on skype.You'll have been given necessary accesses to all these portals. Please start using them judiciously.All the best!Thanks,Saurav
Which is where I usually want to get when I see that sort of string.
Ruby's CGI makes encoding and decoding HTML easy. The Nokogiri gem makes it easy to remove the tags.

I think the easiest way to do this is, Assuming you want to use the html in the string.
raw CGI.unescapeHTML('The string you want to manipulate')

If you have assigned that string to a variable s, is this the result you want?
puts s.gsub(/<[^&]*>/, '')

I would suggest:
clean = str.gsub /<.+?>/, ''

Related

Writing Capybara expectations to verify phone numbers

I'm using AWS Textract to pull information from PDF documents. After the scanned text is returned from AWS and persisted to a var, I'm doing this:
phone_number = '(555) 123-4567'
scanned_pdf_text.should have_text phone_number
But this fails about 20% of the time because of the non-deterministic way that AWS is returning the scanned PDF text. On occasion, the phone numbers can appear either of these two ways:
(555)123-4567 or (555) 123-4567
Some of this scanned text is very large, and I'd prefer not to go through the exercise of sanitizing the text coming back if I can avoid it (I'm also not good at regex usage). I also think using or logic to handle both cases seems to be a little heavy handed just to check text that is so similar (and clearly near-identical to the human eye).
Is there an rspec matcher that'll allow me to check on this text? I'm also using Capybara.default_normalize_ws = true but that doesn't seem to help in this case.

Assuming scanned_pdf_text is a string and the only differences you're seeing is in spaces then you can just get rid of the spaces and compare
scanned_pdf_text.gsub(/\s+/, '').should eq('(555)123-4567') # exact
scanned_pdf_text.gsub(/\s+/, '').should match('(555)123-4567') # partial
scanned_pdf_text.gsub(/\s+/, '').should have_text('(555)123-4567') # partial

Ignore formatting in Slack App when reading messages

I'm building a Slack App which is only interested in actual user input in message, regardless if it's bold, italic, or a web link.
What would be the best way to remove all the formatting characters, such as _, *, etc, and leave actual text only?

The best solution is actually to ignore the text param entirely, as it is inherently lossy, and to instead parse what you want from the blocks param, which gives an exact representation of what the user saw in their slack client message field when they posted their message.
To demonstrate this, here's a simple example
what you type
WYS
text
blocks[0]["elements"]["elements"]
parsing text from blocks
[a][space][*][b][*]
a b
a *b*
[{"type":"text","text":"a "},{"type":"text","text":"b","style":{"bold":true}}]
a b
[a][*][b][*][←][←][←][space]
a *b*
a *b*
[{"type":"text","text":"a *b*"}]
a *b*
Unfortunately for now, the Slack API does not give you the blocks param in command events, only message events. Hopefully they will fix this, but until then you may wish to use regular messages instead of slash commands.

If you are using node.js there is a module for this:
https://openbase.io/js/remove-markdown/documentation
Note that this is for markdown and not Slack's variant mrkdwn so it might not work.

Managed to get the unformatted message in python by looping through the array.
def semakitutu(message, say):
user = message['user']
# print(message)
hasira = message['blocks']
hasira2 = hasira[0]['elements'][0]['elements']
urefu = len(hasira2)
swali = ''
for x in range(urefu):
swali=swali+hasira2[x]['text']
print(swali)
say(f"Hello <#{user}>! I am running your request \n {swali}")```

How to draw a chart from a CSV-file in MQL4?

I'm new to MQL and MetaTrader 4,but I want to read a .CSV-file and draw the values I've got into the chart of the Expert Advisor I'm working on.
Every .CSV file has the form of:
;EURUSD;1
DATE;TIME;HIGH;LOW;CLOSE;OPEN;VOLUME
2014.06.11;19:11:00;1.35272;1.35271;1.35271;1.35272;4
2014.06.11;19:14:00;1.35287;1.35282;1.35284;1.35283;30
Where the EURUSD part is the _Symbol, which another program generated, the 1 is the period, and all the other things are the data to draw.
Is there any form to do it inside an Expert Advisor, or do I need to use a Custom Indicator?
If that's the case, how can I do it in the simplest way?
P.S.: I read the data in a struct:
struct entry
{
string date;
string time;
double high;
double low;
double close;
double open;
int volume;
};

There are three principally different approaches available in MT4
First, one mayreshuffle data-cells into a compatible format T,O,H,L,C,V and import records using F2 History Center [Import] facility of the MetaTrader Terminal. One may create one's own Symbol-name so as to avoid name-colliding cases in the History Center database.
This way, one lets MT4 to create system-level illustrations of the TOHLCV-data, using the platform's underlying graphical engine.
Second,
one may ignore the underlying graphical engine andwork on a user-controlled GUI-overlayso as to implement an algorithm to read a CSV file and create a set of MQL4 GUI-objects algorithmically, based on the data contained in the said CSV file. An experience based decision whether to use an { ExpertAdvisor | CustomIndicator } would yield to use a Script for this purpose, due to it's one-shot processing.One shall realise, MT4 code-execution ecosystem does a specific context-binding between an MQL4-code ( which is being run ) and an MT4.Graph which does not allow a code launched on a GBPJPY MT4.Graph to process directly objects, related with FTSE.100 MT4.Graph. Yes, if asked to, one may implement a few add-ons and develop a sofisticated distributed processing model to make this work "accross" the said context-binding borders.
Third,
and for some cases the most interesting way is a file based approach, whereone may
pre-process CSV data in a similar way as in second option but not inside a live-MT4 process, but "beforehand" and
generate one's own Profile file, keeping an MT4 convention of placing & content of - ~/profiles/<aProfileNAME>/chart01.chr - ~/profiles/<aProfileNAME>/order.wnd
-~/profiles/lastprofile.ini, referring <aProfileNAME> on it's first row
This way, once the MT4 session starts, the pre-fabricated files are pilot-tape auto-loaded and displayed as one wishes to, Q.E.D.
A .chr file syntax sample:
<chart>
id=130394787628125000
comment=msLIB.TERMINAL: _______________2013.04.15 08:00:00 |cpuClockTIXs = 448765484 |
symbol=EURCHF
period=60
leftpos=6188
digits=4
scale=4
graph=1
fore=0
grid=0
volume=1
scroll=0
shift=1
ohlc=1
...
<window>
height=100
fixed_height=0
<indicator>
name=main
<object>
type=10
object_name=Fibo 16762
...
<object>
type=16
object_name=msLIB.RectangleOnEVENT
period_flags=0
create_time=1348596865
color=25600
style=0
weight=1
background=0
filling=0
selectable=1
hidden=0
zorder=0
time_0=1348592400
value_0=1.213992
time_1=1348624800
value_1=1.209486
ray=0
</object>
...
<object>
type=17
object_name=msLIB.TriangleMarker
period_flags=0
create_time=1348064992
color=17919
style=2
weight=1
background=0
filling=0
selectable=1
hidden=0
zorder=0
time_0=1348052400
value_0=1.213026
time_1=1348070400
value_1=1.213026
time_2=1348070400
value_2=1.210476
</object>

Yahoo Pipes: Extracting number from feed item for use in URL builder

Been looking all over the place for a solution to this issue. I have a Yahoo Pipe (http://pipes.yahoo.com/pipes/pipe.info?_id=e5420863cfa494ee40e4c9be43f0e812) that I've created to pull back image content from the Bing Search API. The URL builder includes a $skip attribute that takes an integer and uses it to select the starting (index) point for the result set that the query returns.
My initial plan had been to use the math engine in the Wolfram Alpha API to generate a random number (randomInteger[1000]) that I could use to seed the $skip value each time that the pipe is run. I have an earlier version of the pipe where I was able to get the query / result steps working using either "XPath Fetch" and "Fetch Data". However, regardless of how I Fetch the result, the response returns as an attribute / value pair in a list item.Even when I use "Emit items as string" in XPath Fetch, I still get a list with a single item, when what I really want is the integer that I can plug into my $skip attribute.
I've tried everything in Pipes I can think of, and spent a lot of time online looking for an answer. Is there anyway to extract text (in this case, a number) from a single list item and then use the output as input to "wire" a text parameter in another Pipes block? Any suggestions / ideas welcome. In the meantime, I'm generating a sorta-random number by manipulating a timecode hash, but it just feels tacky :-)
Thanks!

All the sources are for repeated items. You can't have a source that just makes a single number.
I'm not really clear what you're trying to do. You want to put a random number into part of the URL string that gets an RSS feed?

How to trim text and use it as parameter for next step in watir using ruby

This may be very simple question but I am very new to ruby or any programming language. I want to trim some text and use it as parameter for next step. Can any one please write me code for doing this. I am testing a web application which is used in financial domain. I need to use the cvv2 and expiry date of card number which is generated in next step as parameter. The text which gets displayed on html is
CVV2 - 657  Expiry - 05/12 (mm/yy)
Now from the above text I should some how get only '657' and '0512' as value to use is in next step.
Request for urgent assistance.

If all your strings will be formatted like this, I suggest using regexp, using String#gsub. There's lots of places to learn about regexp if you don't already, and rubular allows you to test it in your browser, with a short cheat sheet.

To do it quickly you could use
card_details = "CVV2 - 657 Expiry - 05/12 (mm/yy)"
card_details = card_details.scan(/\d{2,}/)
cvv2 = card_details[0]
expiry = card_details[1] + card_details[2]
Probably better ways of doing it as I'm no expert, but you said urgent, so.
For getting the text out of the cell you could try (I don't use the original watir anymore, so I might not be able to remember this):
card_details = browser.td(:text => /CVV2/).text
If that doesn't work give this a try (actually on second thought TRY THIS ONE FIRST)
card_details = browser.cell(:text => /CVV2/).text
For these examples I'm assuming your browser object is called "browser".

We can use regular expression to achieve the same,
> "CVV2 - 657 Expiry - 05/12 (mm/yy)".match(/\d{3}/)
=> "657"
>"CVV2 - 657 Expiry - 05/12 (mm/yy)".match(/\d+\/\d+/)
=> "05/12"

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How do I remove HTML encoded characters from a string? - ruby

I think the easiest way to do this is, Assuming you want to use the html in the string. raw CGI.unescapeHTML('The string you want to manipulate')

If you have assigned that string to a variable s, is this the result you want? puts s.gsub(/<[^&]*>/, '')

I would suggest: clean = str.gsub /<.+?>/, ''

Related

Writing Capybara expectations to verify phone numbers

Ignore formatting in Slack App when reading messages

How to draw a chart from a CSV-file in MQL4?

Yahoo Pipes: Extracting number from feed item for use in URL builder

How to trim text and use it as parameter for next step in watir using ruby

Categories

Resources