How to make plotly faster in shiny - performance
Is there a good way to speed up plotly's rendering performance? Currently I am trying to isolate the data for the plots and leave only two inputs that reactively change the plotly content.
What confuses me is that when I tested these plotly plots with fixed data (I did use isolate in Shiny), the plots ran pretty fast when I changed those two inputs. However, when I move the same logic into Shiny using isolate() and the two inputs, the plots run extremely slowly.
Please help.
Here is my current code. Thanks.
observe({
  ##### Visualization - Daily Pattern #####
  output$TTPlotB <- renderPlotly({
    # input$file_load
    # data_daily <-
    # data_daily2 <<- data_daily
    date1    <- input$dateRange[1]
    date2    <- input$dateRange[2]
    journey1 <- input$journey[1]
    journey2 <- input$journey[2]

    plot_ly(
      isolate({ Alloc() }) %>%
        filter(date_range >= date1 & date_range <= date2) %>%
        filter(Days_in_journey %in% (journey1:journey2)) %>%
        group_by(date_range, OEM, daily_available) %>%
        summarise(ngm     = min(daily_avail_ngm),     ### use the min based on the selected journey
                  product = min(daily_avail_product), ### use the min based on the selected journey
                  lp      = min(daily_avail_lp),      ### use the min based on the selected journey
                  cart    = min(daily_avail_cart),    ### use the min based on the selected journey
                  email   = min(daily_avail_email)) %>%
        gather(channel, channel_daily_avail, -c(date_range, OEM, daily_available)) %>%
        select(-c(daily_available)) %>%
        spread(OEM, channel_daily_avail) %>%
        group_by(date_range, channel) %>%
        mutate(Daily_tot = sum(HP, Dell, Lenovo, Acer, Asus, Others)) %>%
        mutate(D_HP_avai   = round(100 * HP     / (input$OEM.HP.Aud.wk     * 1000 / 7), 0)) %>%
        mutate(D_Dell_avai = round(100 * Dell   / (input$OEM.Dell.Aud.wk   * 1000 / 7), 0)) %>%
        mutate(D_Leno_avai = round(100 * Lenovo / (input$OEM.Lenovo.Aud.wk * 1000 / 7), 0)) %>%
        mutate(D_Acer_avai = round(100 * Acer   / (input$OEM.Acer.Aud.wk   * 1000 / 7), 0)) %>%
        mutate(D_Asus_avai = round(100 * Asus   / (input$OEM.Asus.Aud.wk   * 1000 / 7), 0)) %>%
        mutate(D_Oth_avai  = round(100 * Others / (input$OEM.Others.Aud.wk * 1000 / 7), 0)) %>%
        as.data.table(),
      height = 500, width = 1450,
      x = ~date_range,
      y = ~Daily_tot,
      type = "scatter",
      color = ~channel,
      mode = "lines+markers",
      hoverinfo = 'text',
      text = ~paste(
        '</br> <B> Channel:</B>', channel,
        '</br> <B> Total Available Machines:</B>', formatC(Daily_tot, format = "d", big.mark = ","),
        '</br> <B> Selected Date:</B>', date_range,
        '</br> <B> HP: </B>', formatC(HP, format = "d", big.mark = ","), ',', round(100 * HP / Daily_tot, 0), '%',
        ' <B> HP Daily:</B>', D_HP_avai, '%',
        '</br> <B> Dell: </B>', formatC(Dell, format = "d", big.mark = ","), ',', round(100 * Dell / Daily_tot, 0), '%',
        ' <B> Dell Daily:</B>', D_Dell_avai, '%',
        '</br> <B> Lenovo: </B>', formatC(Lenovo, format = "d", big.mark = ","), ',', round(100 * Lenovo / Daily_tot, 0), '%',
        ' <B> Lenovo Daily:</B>', D_Leno_avai, '%',
        '</br> <B> Acer: </B>', formatC(Acer, format = "d", big.mark = ","), ',', round(100 * Acer / Daily_tot, 0), '%',
        ' <B> Acer Daily:</B>', D_Acer_avai, '%',
        '</br> <B> Asus: </B>', formatC(Asus, format = "d", big.mark = ","), ',', round(100 * Asus / Daily_tot, 0), '%',
        ' <B> Asus Daily:</B>', D_Asus_avai, '%',
        '</br> <B> Others: </B>', formatC(Others, format = "d", big.mark = ","), ',', round(100 * Others / Daily_tot, 0), '%',
        ' <B> Other Daily:</B>', D_Oth_avai, '%'
      )
    ) %>%
      layout(xaxis = list(title = 'Dates'),
             yaxis = list(title = 'Available Machines'))
  })
})
Related
Scrapy/XPath: Replace inline tags within paragraph
I'm trying to use Scrapy to extract and clean some text from within a <p> which contains inline icons and other tags. In particular, I want to replace the image tags with text extracted from the image src attribute:

from scrapy.selector import Selector

text = '''
<p id="1"><b><br></b>For service <i>to </i>these stations, take the <img src="images/1.png"> to 72 St or Times Sq-42 St and transfer <br>to an uptown <img src="images/1.png"> or <img src="images/2.png"> <i>local</i>. <br> <br>For service <i>from </i>these stations, take the <img src="images/1.png"> or <img src="images/2.png"> to 72 St or 96 St and transfer <br>to a South Ferry-bound <img src="images/1.png">. <br><b>______________________________<br></b> </p>
'''

sel = Selector(text=text)
# do stuff

The result I'm looking for is the string:

For service to these stations, take the (1) to 72 St or Times Sq-42 St and transfer to an uptown (1) or (2) local. For service from these stations, take the (1) or (2) to 72 St or 96 St and transfer to a South Ferry-bound (1).

I can extract the text from src using:

node.css('img').xpath('@src').re_first(r'images/(.+).png')

but I'm stuck on how to iterate through the child nodes, determine whether each one is a text node, and filter out the other inline tags. Here's where I'm at:

description = sel.css('p#1')

def clean_html(description):
    for n in description.xpath('node()'):
        if (n.xpath('self::img')):
            yield n.xpath('@src').re_first(r'images/(.+).png')
        if (n.xpath('self::text()')):
            yield n.css('::text')

text = ''.join(clean_html(description))
In this case, I don't think selectors are particularly helpful. Try processing this in two phases:

1. Use re.sub to substitute the entire img tag with the string you want.
2. Use BeautifulSoup to remove the remaining HTML from the resulting string.

Like this:

from scrapy.selector import Selector
import re
from bs4 import BeautifulSoup

# manually construct a selector for demonstration purposes
DATA = '''
<p id="1"><b><br></b>For service <i>to </i>these stations, take the <img src="images/1.png"> to 72 St or Times Sq-42 St and transfer <br>to an uptown <img src="images/1.png"> or <img src="images/2.png"> <i>local</i>. <br> <br>For service <i>from </i>these stations, take the <img src="images/1.png"> or <img src="images/2.png"> to 72 St or 96 St and transfer <br>to a South Ferry-bound <img src="images/1.png">. <br><b>______________________________<br></b> </p>
'''
sel = Selector(text=DATA)

# get the raw source string to work with
text = sel.extract()

# replace image tag with text from extracted file name
image_regex = re.compile('(<img src="images/)(.+?)(.png">)', re.MULTILINE)
replaced = re.sub(image_regex, r'(\2)', text)

# remove html and return clean text
soup = BeautifulSoup(replaced, 'lxml')
print(soup.get_text())

Results:

For service to these stations, take the (1) to 72 St or Times Sq-42 St and transfer to an uptown (1) or (2) local. For service from these stations, take the (1) or (2) to 72 St or 96 St and transfer to a South Ferry-bound (1). ______________________________
This is how I'd do it without any additional external library:

Get text and image paths:

results = selector.xpath('.//text()|.//img/@src').extract()

Remove extra spaces, new lines and underscores:

results = map(lambda x: x.strip('\n_ '), results)

Remove empty strings:

results = filter(None, results)

Join results into a single paragraph and fix dots:

raw_paragraph = " ".join(results).replace(' .', '.')

Replace images/{Number}.png with ({Number}):

paragraph = re.sub('images/(?P<number>\d+).png', '(\g<number>)', raw_paragraph)

Result:

For service to these stations, take the (1) to 72 St or Times Sq-42 St and transfer to an uptown (1) or (2) local. For service from these stations, take the (1) or (2) to 72 St or 96 St and transfer to a South Ferry-bound (1).
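Assembled end to end, those steps look like this (a minimal sketch; the abridged markup and the manually constructed Selector stand in for a real Scrapy response):

import re
from scrapy.selector import Selector

# an abridged version of the question's markup, standing in for response.selector
selector = Selector(text='''
<p id="1">For service <i>to </i>these stations, take the <img src="images/1.png">
to 72 St or Times Sq-42 St and transfer <br>to an uptown <img src="images/1.png">
or <img src="images/2.png"> <i>local</i>.
<br><b>______________________________<br></b></p>
''')

# 1. text nodes and image paths, in document order
results = selector.xpath('.//text()|.//img/@src').extract()
# 2. strip newlines, underscores and surrounding spaces
results = map(lambda x: x.strip('\n_ '), results)
# 3. drop empty strings
results = filter(None, results)
# 4. join into one paragraph and remove the space before each dot
raw_paragraph = " ".join(results).replace(' .', '.')
# 5. rewrite images/{Number}.png as ({Number})
paragraph = re.sub(r'images/(?P<number>\d+)\.png', r'(\g<number>)', raw_paragraph)

print(paragraph)
# For service to these stations, take the (1) to 72 St or Times Sq-42 St
# and transfer to an uptown (1) or (2) local.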
XPath: how do I exclude the nodes inside the node I want?
In this picture of an HTML tree, I only want the <div class="d"> node, but the <table> node and everything below it is what I want to exclude from the <div class="d"> node.
Well, you can either manually pick them one by one by doing something like this:

tablePath = "//div[@class='d']/table"
table = response.selector.xpath(tablePath).extract()

para_1_Path = "//div[@class='d']/p[5]"
para_1 = response.selector.xpath(para_1_Path).extract()

and so on. OR you can extract all of the div class="d" data and trim it, but this would be tricky as you say you're new to Scrapy.
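For reference, here is that first approach as a self-contained sketch; the markup is a made-up stand-in for the real page, and Selector(text=...) replaces the real response.selector:

from scrapy.selector import Selector

# hypothetical markup standing in for the page in the question
sel = Selector(text='''
<div class="d">
  <p>paragraph to keep</p>
  <p>another paragraph to keep</p>
  <table><tr><td>content to exclude</td></tr></table>
</div>
''')

# pick the wanted pieces one by one, leaving the table out
paras = sel.xpath("//div[@class='d']/p").extract()
print(paras)
# ['<p>paragraph to keep</p>', '<p>another paragraph to keep</p>']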
Try using XPath count: count(preceding-sibling::table)>0

Something like:

>>> import lxml.html
>>> s = '''
... <div class="d">
...     <p style="text-align: center">...</p>
...     <p>...</p>
...     <h2>Daydream...</h2>
...     <p>...</p>
...     <p>...</p>
...     <p>VRsat</p>
...     <table><tbody><tr><td>...</td></tr></tbody></table>
...     <p style="text-align: center">...</p>
...     <p style="text-align: center">...</p>
...     <div id="click_div">...</div>
... </div>
... '''
>>> doc = lxml.html.fromstring(s)
>>> xpath = '//div[@class="d"]/*[self::table or count(preceding-sibling::table)>0]'
>>> for x in doc.xpath(xpath): x.tag
...
'table'
'p'
'p'
'div'

UPDATE: The OP is actually asking about the inverse of my solution above. So, add not(), switch to and, and change the count to =0:

>>> xpath = '//div[@class="d"]/*[not(self::table) and count(preceding-sibling::table)=0]'
>>> for x in doc.xpath(xpath): x.tag
...
'p'
'p'
'h2'
'p'
'p'
'p'
XPath - extracting text between two nodes
I'm encountering a problem with my XPath query. I have to parse a div which is divided into an unknown number of "sections". Each of these is separated by an h5 with a section name. The list of possible section titles is known, and each of them can occur only once. Additionally, each section can contain some br tags. So, let's say I want to extract the text under "SecondHeader".

HTML

<div class="some-class">
  <h5>FirstHeader</h5>
  text1
  <h5>SecondHeader</h5>
  text2a<br>
  text2b
  <h5>ThirdHeader</h5>
  text3a<br>
  text3b<br>
  text3c<br>
  <h5>FourthHeader</h5>
  text4
</div>

Expected result (for SecondHeader)

['text2a', 'text2b']

Query #1

//text()[following-sibling::h5/text()='ThirdHeader']

Result #1

['text1', 'text2a', 'text2b']

It's obviously a bit too much, so I've decided to restrict the result to the content between the selected header and the header before it.

Query #2

//text()[following-sibling::h5/text()='ThirdHeader' and preceding-sibling::h5/text()='SecondHeader']

Result #2

['text2a', 'text2b']

The yielded results meet the expectations. However, this can't be used - I don't know whether SecondHeader/ThirdHeader will exist in the parsed page or not. Only one section title can be used in the query.

Query #3

//text()[following-sibling::h5/text()='ThirdHeader' and not[preceding-sibling::h5/text()='ThirdHeader']]

Result #3

[]

Could you please tell me what I am doing wrong? I've tested it in Google Chrome.
If all h5 elements and text nodes are siblings, and you need to group by section, a possible option is simply to select text nodes by the count of h5 elements that come before them. Example using lxml (in Python):

>>> import lxml.html
>>> s = '''
... <div class="some-class">
...     <h5>FirstHeader</h5>
...     text1
...     <h5>SecondHeader</h5>
...     text2a<br>
...     text2b
...     <h5>ThirdHeader</h5>
...     text3a<br>
...     text3b<br>
...     text3c<br>
...     <h5>FourthHeader</h5>
...     text4
... </div>'''
>>> doc = lxml.html.fromstring(s)
>>> doc.xpath("//text()[count(preceding-sibling::h5)=$count]", count=1)
['\n    text1\n    ']
>>> doc.xpath("//text()[count(preceding-sibling::h5)=$count]", count=2)
['\n    text2a', '\n    text2b\n    ']
>>> doc.xpath("//text()[count(preceding-sibling::h5)=$count]", count=3)
['\n    text3a', '\n    text3b', '\n    text3c', '\n    ']
>>> doc.xpath("//text()[count(preceding-sibling::h5)=$count]", count=4)
['\n    text4\n']
You should be able to just test the first preceding-sibling h5:

//text()[preceding-sibling::h5[1][normalize-space()='SecondHeader']]
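A quick check of that expression against the question's markup (a sketch; lxml is used here only for demonstration):

import lxml.html

# the HTML from the question, abbreviated
s = '''
<div class="some-class">
  <h5>FirstHeader</h5>
  text1
  <h5>SecondHeader</h5>
  text2a<br>
  text2b
  <h5>ThirdHeader</h5>
  text3a<br>
</div>'''

doc = lxml.html.fromstring(s)
# text nodes whose nearest preceding h5 sibling is SecondHeader
nodes = doc.xpath("//text()[preceding-sibling::h5[1][normalize-space()='SecondHeader']]")
print([t.strip() for t in nodes if t.strip()])
# ['text2a', 'text2b']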
How to extract links, text and timestamp from webpage via Html Agility Pack
I am using Html Agility Pack and am trying to extract the links and link text from the following HTML code. The webpage is fetched from a remote page and then saved locally as a whole. From this local webpage I am trying to extract the links and link text. The webpage naturally has other HTML, like other links and text, inside its page, but that is removed here for clarity.

<span class="Subject2"><a href="/some/today.nsf/0/EC8A39D274864X5BC125798B0029E305?open"> Description 1 text here</span> <span class="time">2012-01-20 08:35</span></a><br>
<span class="Subject2"><a href="/some/today.nsf/0/EC8A39XXXX264X5BC125798B0029E312?open"> Description 2 text here</span> <span class="time">2012-01-20 09:35</span></a><br>

But the above is the most unique content to work from when trying to extract the links and link text. This is what I would like to see as the result:

<link>/some/today.nsf/0/EC8A39D274864X5BC125798B0029E305</link>
<title>Description 1 text here</title>
<pubDate>Wed, 20 Jan 2012 07:35:00 +0100</pubDate>

<link>/some/today.nsf/0/EC8A39XXXX264X5BC125798B0029E312</link>
<title>Description 2 text here</title>
<pubDate>Wed, 20 Jan 2012 08:35:00 +0100</pubDate>

This is my code so far:

var linksOnPage = from lnks in document.DocumentNode.SelectNodes("//span[starts-with(@class, 'Subject2')]")
                  where (lnks.Name == "a" &&
                         lnks.Attributes["href"] != null &&
                         lnks.InnerText.Trim().Length > 0)
                  select new
                  {
                      Url = lnks.Attributes["href"].Value,
                      Text = lnks.InnerText,
                      Time = lnks.Attributes["time"].Value
                  };

foreach (var link in linksOnPage)
{
    // Loop through.
    Response.Write("<link>" + link.Url + "</link>");
    Response.Write("<title>" + link.Text + "</title>");
    Response.Write("<pubDate>" + link.Time + "</pubDate>");
}

And it's not working; I am getting nothing. So any suggestions and help would be highly appreciated. Thanks in advance.

Update: I have managed to get the syntax correct now, in order to select the links from the above examples, with the following code:

var linksOnPage = from lnks in document.DocumentNode.SelectNodes("//span[@class='Subject2']//a")

This selects the links nicely with url and text, but how do I go about also getting the time stamp? That is, select the timestamp out of this:

<span class="time">2012-01-20 09:35</span></a>

which follows each link, and have that output with each link inside the output loop according to the above? Thanks for any help in regards to this.
Your HTML example is malformed, that's why you get unexpected results. To find your first and second values you'll have to get the <a> inside your <span class='Subject2'> - the first value is the href attribute value, the second is the InnerText of the anchor. To get the third value you'll have to get the following sibling of the <span class='Subject2'> tag and get its InnerText.

See, this is how you can do it:

var nodes = document.DocumentNode.SelectNodes("//span[@class='Subject2']//a");
foreach (var node in nodes)
{
    if (node.Attributes["href"] != null)
    {
        var link = new XElement("link", node.Attributes["href"].Value);
        var description = new XElement("description", node.InnerText);
        var timeNode = node.SelectSingleNode(
            "..//following-sibling::span[@class='time']");
        if (timeNode != null)
        {
            var time = new XElement("pubDate", timeNode.InnerText);
            Response.Write(link);
            Response.Write(description);
            Response.Write(time);
        }
    }
}

This outputs something like:

<link>/some/today.nsf/0/EC8A39D274864X5BC125798B0029E305?open</link>
<description>Description 1 text here</description>
<pubDate>2012-01-20 08:35</pubDate>
<link>/some/today.nsf/0/EC8A39XXXX264X5BC125798B0029E312?open</link>
<description>Description 2 text here</description>
<pubDate>2012-01-20 09:35</pubDate>
Can't assign value to Variable: undefined method `[]' for nil:NilClass (NoMethodError)
I am completely stumped on this one. I have the following code:

puts block.at_xpath("*/img")["width"].to_i

but when I change it to

width = block.at_xpath("*/img")["width"].to_i

I get this error:

NokogiriTUT.rb:70:in `blockProcessor': undefined method `[]' for nil:NilClass (NoMethodError)

When I have the puts in there it returns the expected value.

Update:

def blockProcessor(block)
  header = block.xpath('td[@class="default"]/*/span[@class="comhead"]')
  array = header.text.split
  if array[0] != nil # checks to make sure we aren't at the top of the parent list

    ### Date and Time ###
    if array[2] == 'hours' || array[2] == 'minutes'
      date = Time.now
    else
      days = (array[1].to_i * 24 * 60 * 60)
      date = Time.now - days
    end

    ## Get Comment ##
    comment = block.at_xpath('*/span[@class="comment"]')
    hash = comment.text.hash
    # puts hash

    ## Manage Parent Here ##
    width = block.at_xpath("*/img")["width"].to_i
    prevlevel = @parent_array[@parent_array.length - 1][1]

    if width == 0 # has parents
      parentURL = header.xpath('a[@href][3]').to_s
      parentURL = parentURL[17..23]
      parentURL = "http://news.ycombinator.com/item?id=#{parentURL}"
      parentdoc = Nokogiri::HTML(open(parentURL))
      a = parentdoc.at_xpath("//html/body/center/table/tr[3]/td/table/tr")
      nodeparent = blockProcessor(a)
      @parent_array = []
      node = [hash, width, nodeparent] # id, level, parent
      @parent_array.push node
    elsif width > prevlevel
      nodeparent = @parent_array[@parent_array.length - 1][0]
      node = [hash, width, nodeparent]
      @parent_array.push node
    elsif width == prevlevel
      nodeparent = @parent_array[@parent_array.length - 1][2]
      node = [hash, width, nodeparent]
      @parent_array.push node
    elsif width < prevlevel
      until prevlevel == w do
        @parent_array.pop
        prevlevel = @parent_array[@parent_array.length - 1][1]
      end
      nodeparent = @parent_array[@parent_array.length - 1][2]
      node = [hash, width, nodeparent]
      @parent_array.push node
    end

    puts "Author: #{array[0]} with hash #{hash} with parent: #{nodeparent}"

    ## Handles Any Parents of Existing Comments ##
    return hash
  end
end

Here is the block that it is acting on:

<tr>
  <td><img src="http://ycombinator.com/images/s.gif" height="1" width="0"></td>
  <td valign="top"><center>
    <a id="up_3004849" href="vote?for=3004849&dir=up&whence=%2f%78%3f%66%6e%69%64%3d%34%6b%56%68%71%6f%52%4d%38%44"><img src="http://ycombinator.com/images/grayarrow.gif" border="0" vspace="3" hspace="2"></a><span id="down_3004849"></span>
  </center></td>
  <td class="default">
    <div style="margin-top:2px; margin-bottom:-10px; "><span class="comhead">patio11 12 days ago | link | parent | on: Ask HN: What % of your job interviewees pass FizzB...</span></div>
    <br><span class="comment"><font color="#000000">Every time FizzBuzz problems come up among engineers, people race to solve them and post their answers, then compete to see who can write increasingly more nifty answers for a question which does not seek niftiness at all.<p>I'm all for intellectual gamesmanship, but these are our professional equivalent of a doctor being asked to identify the difference between blood and water. You can do it. <i>We know</i>. Demonstrating that you can do it is not the point of the exercise. We do it to have a cheap-to-administer test to exclude people-who-cannot-actually-program-despite-previous-job-titles from the expensive portions of the hiring process.</p></font></span><p><font size="1"><u>reply</u></font></p>
  </td>
</tr>
Your basic problem is that you don't understand XPath. (You are in good company there; XPath is quite confusing.) Your selectors simply don't match what you think they match. In particular, the one that blows up, */img, should be //img or something like that.

Now, because the XPath selector doesn't match anything, the value of this Ruby statement:

block.at_xpath("*/img")

is nil. And nil doesn't support [], so when you try to call ["width"] on it, Ruby complains with an undefined method `[]' for nil:NilClass error.

And as for why it only blows up when you assign it to a variable... yeah, that's not actually what's happening. You probably changed something else too.

And now, please allow me to make some other hopefully constructive code criticisms:

Your question was apparently designed to make it difficult to answer. In the future, please isolate the code in question; don't just paste in your whole homework assignment (or whatever this screen scraper is for). It would be extra great if you made it into a single runnable Ruby file that we can execute verbatim on our computers, e.g.:

require "nokogiri"

doc = Nokogiri.parse <<-HTML
<tr>
  <td><img src="http://ycombinator.com/images/s.gif" height="1" width="0"></td>
  <td valign="top"><center>
    <a id="up_3004849" href="vote?for=3004849&dir=up&whence=%2f%78%3f%66%6e%69%64%3d%34%6b%56%68%71%6f%52%4d%38%44"><img src="http://ycombinator.com/images/grayarrow.gif" border="0" vspace="3" hspace="2"></a><span id="down_3004849"></span>
  </center></td>
  <td class="default">
    <div style="margin-top:2px; margin-bottom:-10px; ">
      <span class="comhead">
        patio11 12 days ago | link | parent | on: Ask HN: What % of your job interviewees pass FizzB...
      </span>
    </div>
    <br><span class="comment"><font color="#000000">Every time FizzBuzz problems come up among engineers, people race to solve them and post their answers, then compete to see who can write increasingly more nifty answers for a question which does not seek niftiness at all.<p>I'm all for intellectual gamesmanship, but these are our professional equivalent of a doctor being asked to identify the difference between blood and water. You can do it. <i>We know</i>. Demonstrating that you can do it is not the point of the exercise. We do it to have a cheap-to-administer test to exclude people-who-cannot-actually-program-despite-previous-job-titles from the expensive portions of the hiring process.</p></font></span><p><font size="1"><u>reply</u></font></p>
  </td>
</tr>
HTML

width = doc.at_xpath("*/img")["width"].to_i

That way we can debug with our computers, not just with our minds.

You're writing Ruby now, not Java, so conform to Ruby's spacing and naming conventions: file names are snake_case, indentation is 2 spaces, no tabs, etc. It really is difficult to read code that's formatted wrong -- where "wrong" means "non-standard."

Everywhere you have one of those descriptive comments (### Date and Time ###) is an opportunity to extract a method (def date_and_time(array)) and make your code cleaner and easier to debug.