Intel Advisor: Inspect method including all submethods - performance

Using Intel Advisor and the roof line model, I would like to assess the performance of a certain function. This function uses the Eigen library for matrix operations, where the main part of work is done.
In the output I can see my function with a relatively small self time and several Eigen functions called by my function. Now I would like to group all FLOPS and memory operations of my function together (rather than of each separate function) and use this result for the roof line model. How can I achieve this?
Note: I have posted a similar question in the Intel Support Forum [1].
[1] https://software.intel.com/en-us/forums/intel-advisor-xe/topic/806091

In order to do loopnest/functions aggregation on Advisor Roofline chart, you have to run "Roofline with CallStacks" as described at https://software.intel.com/en-us/articles/roofline-with-callstacks
More specifically, in case of command line, you have to use:
advixe-cl -collect survey -project-dir MyResults -- MyExecutable
advixe-cl -collect tripcounts -flop -stacks -project-dir MyResults -- MyExecutable
(NB -stacks extra flag).
Once done, you'll need to locate the dot corresponding to your function on the chart and "collapse" it. Basically, Roofline with Callstacks makes it possible to configure the granularity / nesting level of your Roofline model/chart.
Navigating through parent-child dots on Roofline with Callstacks is easier when either using Top-Down view side by side with Roofline chart or when expanding righthand-side "roofline call stacks" view:

Related

BigQuery JavaScript UDF process - per row or per processing node?

I'm thinking of using BigQuery's JavaScript UDF as a critical component in a new data architecture. It would be used to logically process each row loaded into the main table, and also to process each row during periodical and ad-hoc aggregation queries.
Using an SQL UDF for the same purpose seems to be unfeasible because each row represents a complex object, and implementing the business logic in SQL, including things such as parsing complex text fields, gets ugly very fast.
I just read the following in the Optimizing query computation documentation page:
Best practice: Avoid using JavaScript user-defined functions. Use native UDFs instead.
Calling a JavaScript UDF requires the instantiation of a subprocess.
Spinning up this process and running the UDF directly impacts query
performance. If possible, use a native (SQL) UDF instead.
I understand why a new process for each processing node is needed, and I know that JS tends to be deployed in a single-thread-per-process manner (even though v8 does support multithreading these days). But it's not clear to me if once a JS runtime process is up, it can be expected to get reused between calls to the same function (e.g. for processing different rows on the same processing node). The amount of reuse will probably significantly affect the cost. My table is not that large (tens to hundreds of millions of rows), but still I need to have a better understanding here.
I could not find any authoritative source on this. Has anybody done any analysis of the actual impact of using a JavaScript UDF on each processed row, in terms of execution time and cost?
If it's not documented, then that's an implementation detail that could change. But let's test it:
CREATE TEMP FUNCTION randomThis(views INT64)
RETURNS FLOAT64
LANGUAGE js AS """
if (typeof variable === 'undefined') {
variable = Math.random()
}
return variable
""";
SELECT randomThis(views), COUNT(*) c
FROM (
SELECT views
FROM `fh-bigquery.wikipedia_v3.pageviews_2019`
LIMIT 10000000
)
GROUP BY 1
ORDER BY 2 DESC
I was expecting ten million different numbers, or a handful, but I only got one: The same process was reused ten million times, and variables were kept around in between calls.
This even happened when I went up to 100 million, signaling that parallelism is bounded by one JS VM.
Again, these are implementation details that could change. But while it stays that way, you can make the best use out of it.
I was expecting ten million different numbers, or a handful, but I only got one
That's because you didn't allow Math.random to be called more than once
and variables were kept around in between calls
due to the variable defined at the global scope.
In other words your code explicitly permits Math.random to be executed once only (by implictly defining the variable at the global scope).
If you try this:
CREATE TEMP FUNCTION randomThis(seed INT64)
RETURNS FLOAT64
LANGUAGE js AS """
let ret = undefined
if (ret === undefined) {
ret = Math.random()
}
return ret
""";
SELECT randomThis(size), COUNT(*) c
FROM (
SELECT repository_size as size
FROM `my-internal-dataset.sample-github-table`
LIMIT 10000000
)
GROUP BY 1
ORDER BY 2 DESC
then you get many rows. And now it does take much longer time to execute, probably because the single VM became a bottleneck.
Used another dataset to reduce the query cost.
Conclusion:
1. There is one VM (or maybe a container) per query to support JS UDF. This is in line with a single subprocess ("Calling a JavaScript UDF requires the instantiation of a subprocess") mentioned in the documentation.
2. If you can apply execute-once pattern (using some kind of a cache or coding technique like memoisation) and write a UDF similar to the previous answer, then the sheer presence of JS UDF has a limited impact on your query.
3. If you have to write a JS UDF like in this answer, then the impact on your query becomes very significant with query execution time skyrocketing even for simple JS code. So for this case it's certainly better to stay out.

What does "emit" mean in general computer science terms?

I just stumbled on what appears to be a generally-known compsci keyword, "emit". But I can't find any clear definition of it in general computer science terms, nor a specific definition of an "emit()" function or keyword in any specific programming language.
I found it here, reading up on MapReduce:
https://en.wikipedia.org/wiki/MapReduce
The context of my additional searches show it has something to do with signaling and/or events. But it seems like it is just assumed that the reader will know what "emit" is and does. For example, this article on MapReduce patterns:
https://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/
There's no mention of what "emit" is actually doing, there are only calls to it. It must be different from other forms of returning data, though, such as "return" or simply "printf" or the equivalent, else the calls to "emit" would be calls to "return".
Further searching, I found a bunch of times that some pseudocode form of "emit" appears in the context of MapReduce. And in Node.js. And in Qt. But that's about it.
Context: I'm a (mostly) self-taught web programmer and system administrator. I'm sure this question is covered in compsci 101 (or 201?) but I didn't take that course.
In the context of web and network programming:
When we call a function the function may return a value.
When we call a function and the function is supposed to send those results to another function we will not user return anymore. Instead we use emit. We expect the function to emit the results to another function by our call.
A function can return results and emit events.
I've only ever seen emit() used when building a simple compiler in academia.
Upon analyzing the grammar of a program, you tokenize the contents of it and emit (push out) assembly instructions. (The compiler program that was written actually even contained an internal function called emit to mirror that theoretical/logical aspect of it.)
Once the grammar analysis is complete, the assembler will take the assembly instructions and generate the binary code (aka machine code).
So, I don't think there is a general CS definition for emit; however, I do know it is used in the pseudocode (and sometimes, actual code) for writing compiler programs. And that is undergraduate level computer science education in the US.
I can think of three contexts in which it's used:
Map/Reduce functions, where some input value causes 0 or more output values to go into the Reduce function
Tokenizers, where a stream of text is processed, and at various intervals, tokens are emitted
Messaging systems
I think the common thread is the "zero or more". A return provides exactly one value back from a function, whereas an "emit" is a function call that could take place zero times or several times.
In the context of the MapReduce programming model, it is said that an operation of a map nature takes an input value and emits a result, which is nothing more than a transformation of the input.

Handling mutually recursive GUI widgets with reactive-banana

I am hunting for a library to write a GUI on top of GLFW and OpenGL. I'm doing this because I am dissatisfied with the common UI library bindings which I feel are too imperative, and I would also like tight control of the look and feel of my UIs. I'd like a declarative approach to defining UI's. I am experimenting with reactive-banana (and temporarily reactive-banana-wx) to see if it meets my needs. I have a problem with defining recursive widgets. Here's my simplest test case:
A text widget that displays a counter.
A button widget that increments the counter.
A button widget that is inactive (so it is greyed out and does not respond to input at all) when the counter is 0 and otherwise active and resets the counter to 0.
The first and third widget have a recursive relationship. The first widget is intuitively a stepper of a union of events fed from the two buttons. However, the reset button is an fmap of the counter, and then the event stream relies on the reset button! What is to be done?
Beyond this question I have a concern about event handling: Since I want to handle device input and input focus within my code instead of relying on a framework, I see difficulties ahead in correctly dispatching events in a scalable way. Ideally I would define a data that encapsulates the hierarchical structure of a widget, a way to install event callbacks between the elements, and then write a function that traverses that data structure in order to define device input processing and graphical output. I am not sure how to take an event stream and split it as easily as event streams can be merged.
Recursion is allowed, as long as it is mutual recursion between a Behavior and an Event. The nice thing about Behaviors is that sampling them at the time of an update will return the old value.
For instance, your example can be expressed as follows
eClick1, eClick2 :: Event t ()
bCounter :: Behavior t Int
bCounter = accumB 0 $ mconcat [eIncrement, eReset]
eIncrement = (+1) <$ eClick1
eReset = (const 0) <$ whenE ((> 0) <$> bCounter) eClick2
See also the question "Can reactive-banana handle cycles in the network?"
As for your second question, you seem to be looking for the function filterE and its cousins filterApply and whenE?
As for your overall goal, I think it is quite ambitious. From what little experience I have gained so far, it seems to me that binding to an existing framework feels quite different from making an "clean-state" framework in FRP. Most likely, there are still some undiscovered (but exciting!) abstractions lurking there. I once started to write an application called BlackBoard that contains a nice abstraction about time-varying drawings.
However, if you care more about the result rather than the adventure, I would recommend a conservative approach: create the GUI toolkit in an imperative style and hook reactive-banana on top of that to get the benefits of FRP.
In case you just wish for any GUI, I am currently focussing on the web browser as a GUI. Here some preliminary experiments with Ji. The main benefit over wxHaskell is that it's a lot easier to get up and running and any API design efforts will benefit a very wide audience.

What are your top 3 XPages performance tips for new XPages developers?

What 3 things would you tell developers new to XPages to do to help maximize the performance of their XPages apps?
Tim Tripcony had given a bunch of suggestion here.
http://www-10.lotus.com/ldd/xpagesforum.nsf/topicThread.xsp?action=openDocument&documentId=365493C31B0352E3852578FD001435D2#AEFBCF8B111E149B852578FD001E617B
Not sure if this tipp is for beginners, but use any of the LifeCyclePhaseListeners from the OpenNTF Snippets to see what is going on in your datasources during a complete or partial refresh (http://openntf.org/XSnippets.nsf/snippet.xsp?id=a-simple-lifecyclelistener-)
Use the extension Library. Report Bugs ( or what you consider a bug ) at OpenNTF.
Use the SampleDb from the extLib. ou can easily modify the samples to your own need. Even good for testing if the issue you encounter is reproducable in this DB.
Use Firebug ( or a similar tool that comes with the browser of your choice ) If you see an error in the error tab, go and fix it.
Since you're asking for only 3, here are the tips I feel make the biggest difference:
Determine what your users / customers mean by "performance", and set the page persistence option accordingly. If they mean scalability (max concurrent users), keep pages on disk. If they mean speed, keep pages in memory. If they want an ideal mixture of speed and scalability, keep the current page in memory. This latter option really should be the server default (set in the server's xsp.properties file), overridden only as needed per application.
Set value bindings to compute on page load (denoted by a $ in the source XML) wherever possible instead of compute dynamically (denoted by a #). $ bindings are only evaluated once, # bindings are recalculated over and over again, so changing computations that only need to be loaded once per page to $ bindings speed up both initial page load and any events fired against the page once loaded.
Minimize the use of SSJS. Wherever possible, use standard EL instead (e.g. ${database.title} instead of ${javascript:return database.getTitle();}). Every SSJS expression must be parsed into an abstract syntax tree to be evaluated, which is incrementally slower than the standard EL resolver.
There are many other ways to maximize performance, of course, but in my opinion these are the easiest ways to gain noticeable improvement.
1. Use the Script Library instead writing a bulk of code into the Xpage.
2. Use the Theme or separate CSS class for each elements [Relational]
3. Moreover try to control your SSJS code. Because server side request only reduce our system performance.
4. Final point consider this as sub point of 3, Try to get the direct functions from our SSJS, Don't use the while llop and for loop for like document collection, count and other things.
The basics like
Use the immediate flags ( or one of the other flags) on serverside events if possible
Check the Flag which (forgot its name..) generates the css and js as
one big file at runtime therefore minimizing the ammount of
requests.
Choose your scope wisely. Dont put everything in your sessionscope but define when, where and how your are using the data and based on that use the correct scope. This can lead to better memory usage..
And of course the most important one read the mastering xpages book.
Other tips I would add:
When retrieving data use viewentrycollections or the viewnavigstor
Upgrade to 8.5.3
Use default html tags if possible. If you dont need the functionality of a xp:div or xp:panel use a <div> instead so you dont generate an extra uicomponent on the tree.
Define what page persistance mode you need
Depends a lot what you mean by performance. For performance of the app:
Use compute on page load wherever feasible. It significantly improves performance.
In larger XPages particularly, combine code into single controls where possible. E.g. Use a single Computed Field control combining literal strings, EL and SSJS rather than one control for each language. On that point, EL performs better than SSJS, and SSJS on the XPage performs better than SSJS in a Script Library.
Use dataContexts for properties that are calculated more than once on an XPage.
Partial Execution mode is a very strong recommendation, but probably beyond new XPages developers at this point. Java will also perform better than SSJS in a Script Library, but again beyond new developers. XPages controls you've created with the Extensibility Framework should perform better, because they should run fewer lines of Java than multiple controls, but I haven't tested that.
If you mean performance of the developer:
Get the Extension Library.
Use themes to set default properties, e.g. A standard style for all your pagers.
Use Firebug. If you're developing for Notes Client or IE, still use Firebug. You'll spend longer suffering through Client/IE thank you will fixing the few quirks that will remain.

easy, programmable data plotting

I spend most of my time plotting data, but unfortunately I haven't found a decent solution for my plotting needs. At the moment, the most powerful and pleasant library I found that performs plotting is matplotlib. The results are stunning, but I mostly spend my time fighting with the library when trying to do simple things like having an arrow as I want. SImilar programs like R and gnuplot produce visually less appealing results, and they are not GUI based.
On the other hand, programs like xmgrace (or better) allow direct manipulation of the plotted objects and direct feedback, but they fail on two important points:
if my dataset (normally stored in csv files) changes for some reason, I have to reimport it and perform the manipulations again, by hand
once I obtain a nice plot setup, the only way I have to recreate the plot is to use a graphical, interactive program. I would like to have the possibility to run a command line utility on my csv files and get the .pdf as a result, with no human intervention.
I still have to find something that provides me both worlds, and it has an affordable price. Ideally, I would need an interactive GUI program (a la Origin) to generate matplotlib-based python scripts.
Does anyone have any hints on software that could address my needs on OSX (preferably) or Linux ?
You may want to check out Igor Pro. It's quite old, and quirky but it provides the most advanced plotting system I've found yet on the Mac. You can modify anything graphically, at a command line or in script files. The most powerful feature (IMO) is the ability to automatically generate a script to recreate a figure or to use a figure to create a script that generates figures like (in style etc.) a particular figure. I use Igor for all publication figures I produce.
Data is stored in "waves" (translation: vectors) which encapsulate data and information about the delta between data points (e.g. time step). Figures reference waves as their data source. When you update a wave (e.g. by re-importing a CSV file and specifying that the data overwrite specific waves), all figures that reference that wave are automatically updated.
You can create "layouts" which are page-layouts containing multiple graphs. These layouts are also automatically updated whenever any of the figures in the layout are updated (see above). You can add drawing/text/annotations to either graphs or layout.s
Be warned: Igor Pro's scripting language is something like the bastard child of VB and Matlab. It makes my eyes bleed. It makes me pray to whatever God that the pain just end. But the entire system is so powerful that it's worth it.
I have always used Matlab or R for this sort of thing. While you may not like how the generic plots look, I find that once I familiarize myself with the libraries I can make them as fancy as I want them to be.
R being free, I would try to stick it out with that. It is extremely powerful and perfectly suited to what you need (generate charts on the fly directly from datafiles). I bet that the more you get comfortable with it, you'll find yourself using R for a wide range of tasks outside of plotting data.
MathGL is cross-platform GPL library which meet all yours criteria. It can produce nice graphics, it can read csv files, it have window for displaying graphics (you don't need to know widget libraries), and it can plot in console (don't need a window or X at all). At this you can use C/C++/Fortran/Python/... for yours own code or MGL scripts for simplicity (see UDAV front-end in the last case).
Finally it can produce bitmaps (PNG/JPEG/GIF/...) or vector (EPS/SVG) output. Later it can be converted to PDF easily. Or you can create a PDF with U3D directly -- you'll need HPDF and U3D libraries in this case.

Resources