BIRT XML data source URL is requested several times

I have a BIRT report which makes use of an XML data source. The location of this data source is provided by a parameter which holds a URL.
At first glance everything works fine.
However, when I look at the log provided by the back-end application that serves the actual XML files used for the reports, I can see that the URL is requested quite a lot. It's not uncommon that, while processing a single report, the data source URL is called 10 times or more.
Is there anything I can do to make BIRT only request the URL once?
I've already enabled "Needs cache for data-engine" for all the Data Sets.
I've also looked at the HTTP headers BIRT sends when requesting the URL. Unfortunately, it doesn't seem to support any kind of HTTP level caching.
Is there something else I can try?

Related

Retrieve all Embedded resources is not working in JMeter

I am using the "Retrieve All Embedded Resources" advanced option to retrieve all static content.
It works fine but does not retrieve the .js files shown below. Is there any filter or option in JMeter to get these files?
I don't see the ".js" files you mean (nothing appears "below"), but just in case, be aware of the following limitation.
As per the JMeter project main page:
JMeter is not a browser, it works at protocol level. As far as web-services and remote services are concerned, JMeter looks like a browser (or rather, multiple browsers); however JMeter does not perform all the actions supported by browsers. In particular, JMeter does not execute the Javascript found in HTML pages. Nor does it render the HTML pages as a browser does (it's possible to view the response as HTML etc., but the timings are not included in any samples, and only one sample in one thread is ever displayed at a time).
So if an "embedded resource" is triggered by client-side JavaScript, JMeter won't be able to process that JavaScript and hence won't download the associated piece of content. If this is your case, you will need to download it using a separate HTTP Request sampler and put both the "main" HTTP Request and the second one under a Transaction Controller to get the cumulative time. If there are multiple occurrences of such assets, put them under a Parallel Controller.
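Outside of JMeter, the idea looks roughly like this (a minimal Python sketch with hypothetical URLs, not the test plan itself): fetch the JS-triggered asset explicitly and time both requests as one unit, which is what the Transaction Controller gives you.

```python
import time

import requests  # third-party "requests" library; any HTTP client works

# Hypothetical URLs for illustration: the "main" page plus a resource that a real
# browser would only fetch via JavaScript (so JMeter never requests it on its own).
MAIN_URL = "https://example.com/app/"
JS_LOADED_URL = "https://example.com/app/lazy-widget.js"

start = time.time()
main_response = requests.get(MAIN_URL)         # the "main" HTTP Request sampler
widget_response = requests.get(JS_LOADED_URL)  # the extra sampler for the JS-triggered asset
elapsed = time.time() - start                  # cumulative time, as a Transaction Controller reports

print(f"main: {main_response.status_code}, widget: {widget_response.status_code}, total: {elapsed:.3f}s")
```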

NiFi - Issue authenticating to a page to download a file

I'm currently trying to get a .zip file from a webpage using NiFi. I can generate a direct download link for the file, but the application needs to log in to the page before opening the direct link. I've tried the InvokeHTTP, ListWebDAV and FetchWebDAV processors and was not able to do this properly.
I even tried adding the login and the password as attributes using the same IDs used by the page (logon, temp_password).
I also tried a Python script but was not able to get any good results with it.
Every time I tried any of these methods, InvokeHTTP received a small file instead of the download: the source code of the login page, saying that authorization is required.
I've looked almost everywhere on the internet without much success :/
I'm now trying to get a processor to actually log into the page and stay logged in, so the InvokeHTTP processor can download the zip file using the direct link.
If somebody has another idea of how I can resolve this, I would be very grateful.
I can provide more info if needed; at the moment I am using NiFi 1.1.2.
Thanks in advance.
Depending on the authentication mechanism the page uses, you'll likely need to chain two InvokeHTTP processors together. Assuming the first page has a form you fill out with the username and password, you'll make one InvokeHTTP which uses the POST method to submit the form with the provided credentials and receives a response that contains some kind of token (session ID, etc.). You will extract this value (either from a response header or the page content) and provide it to the second InvokeHTTP as a request header. Using your browser's Developer Tools feature, as daggett suggested, to observe the authentication process will allow you to determine exactly where these values are provided.
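For comparison, here is roughly what the two chained InvokeHTTP processors do, sketched in Python. The URLs are hypothetical; the "logon" and "temp_password" field names are the ones you mentioned for your page.

```python
import requests  # third-party "requests" library; URLs below are hypothetical

LOGIN_URL = "https://example.com/logon"                 # page the first InvokeHTTP would POST to
DOWNLOAD_URL = "https://example.com/files/archive.zip"  # the direct download link

session = requests.Session()

# Step 1: POST the credentials, like the first InvokeHTTP.
login = session.post(LOGIN_URL, data={"logon": "my_user", "temp_password": "my_password"})
login.raise_for_status()

# Step 2: reuse whatever the server handed back on the second request.
# With a cookie-based login the Session object carries it automatically;
# for a token you would copy it out of the response and set it as a request header.
download = session.get(DOWNLOAD_URL)
with open("archive.zip", "wb") as f:
    f.write(download.content)
```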

How can I cache a web application locally

In my company, we have a report generation team which maintains a local web application that is horribly slow. These reports get generated weekly. The data for these reports resides inside a database which gets queried through this report portal. I cannot ask them to change the application in any way (adding memcache, etc.); the only option I have is to somehow save these pages locally and serve them from there.
As these are not static pages (they use the database to fetch the data), I want to know whether there is any way I can store these pages locally by running a cronjob, and then have super fast access for me and my team.
PS: This application doesn't have any authentication; the pages are plain diffs of two files stored in the database.
There are a lot of options, but the following one may be the easiest:
Generate the HTML pages regularly and update the cache (cache each entire generated HTML page, keyed by whatever uniquely identifies its dynamic content) with some kind of cronjob, as you have mentioned. This job refreshes all the modified dynamic content at regular intervals.
Have a wrapper for every dynamic page that looks up the cache first. On a hit, simply return the already generated HTML page; otherwise, go through the regular flow.
You can also choose to cache the newly generated page at that point.
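A rough sketch of both pieces in Python (the portal URLs and the cache directory are hypothetical, and any scripting language would do):

```python
import hashlib
import pathlib
import urllib.request  # standard library only; URLs and cache path are made up for the sketch

CACHE_DIR = pathlib.Path("/var/cache/report-portal")
REPORT_URLS = [
    "http://report-portal.local/report?id=1",
    "http://report-portal.local/report?id=2",
]

def cache_path(url: str) -> pathlib.Path:
    # Key each cached page by a hash of its URL (what makes the dynamic content unique).
    return CACHE_DIR / (hashlib.sha1(url.encode()).hexdigest() + ".html")

def refresh_cache() -> None:
    """Run this from cron at regular intervals to regenerate the cached pages."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    for url in REPORT_URLS:
        cache_path(url).write_bytes(urllib.request.urlopen(url).read())

def get_page(url: str) -> bytes:
    """The wrapper: serve the cached copy on a hit, otherwise fall back to the slow portal."""
    cached = cache_path(url)
    if cached.exists():
        return cached.read_bytes()
    html = urllib.request.urlopen(url).read()  # regular (slow) flow
    cached.write_bytes(html)                   # optionally cache the freshly generated page too
    return html
```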
Hope it helps!

JMeter exclude URL patterns not working

I was using the JMeter HTTPS Test Script Recorder to record a login request.
Please see the snapshot: I already added the URL patterns to exclude the .js files, but I still get the .js requests.
Why does it fail?
You can check that if you look at the contents of the said requests. Most likely they are GET requests, and most likely they have one or more Parameters. Regex .*\.js looks specifically for .js at the end of the URL. But if GET request has parameters, on recording its URL would look like <...>.js?param=value, so the regex .*\.js will not match (although the name of the request will still be the same).
So you need to specify 2 regex exclusions: .*\.js and .*\.js\?.*
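A quick way to see why the first pattern alone misses recorded URLs that carry a query string (a small Python check with a hypothetical URL; JMeter matches exclude patterns against the whole URL, which re.fullmatch approximates here):

```python
import re

url = "https://example.com/static/app.js?ver=42"  # hypothetical recorded URL

print(bool(re.fullmatch(r".*\.js", url)))      # False: the query string follows ".js"
print(bool(re.fullmatch(r".*\.js\?.*", url)))  # True: the second pattern covers it
```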
I know that it doesn't answer your question, but actually excluding images and .js files is not something you should normally be doing. I would rather use that field to filter out the "external" URLs which are not connected to your application, like 3rd-party banners, widgets, images, etc. - anything which is not related to your application under test. Even if you see them in the response, these entities are loaded from external sources which you cannot control, so they are not interesting and they might skew the picture of your load test.
So I would suggest the following:
In "Grouping" drop-down choose Store 1st sampler of each trade group only
Make sure that Follow Redirects and Retrieve All Embedded Resources. are turned on in the recorded requests. If not - enabled them via HTTP Request Defaults. Also check Use concurrent pool box is ticked as real browsers download images, styles and scripts in multi-threaded manner.
When it comes to running your test add HTTP Cache Manager to your test plan as well-behaved browsers download images, scripts and styles only once, on subsequent requests they are being returned from browsers cache and this situation needs to be properly simulated
For anyone else arriving here from Google looking for an answer to this question:
You may simply be looking in the wrong place.
If you're looking at the WorkBench results tree, you'll see all requests; they are not filtered there. I've thought this was a bug in JMeter more times than I care to admit.
Instead, look inside the Recording Controller tree (which is collapsed by default), where the results are in fact being filtered.

Lazy HTTP caching

I have a website which is displayed to visitors via a kiosk. People can interact with it. However, since the website is not locally hosted and is loaded over an internet connection, the page loads are slow.
I would like to implement some kind of lazy caching mechanism such that as and when people browse the pages - the pages and the resources referenced by the pages get cached, so that subsequent loads of the same page are instant.
I considered using HTML5 offline caching - but it requires me to specify all the resources in the manifest file, and this is not feasible for me, as the website is pretty large.
Is there any other way to implement this? Perhaps using HTTP caching headers? I would also need some way to invalidate the cache at some point to "push" the new changes to the browser...
The usual approach to handling problems like this is with HTTP caching headers, combined with smart construction of URLs for resources referenced by your pages.
The general idea is this: every resource loaded by your page (images, scripts, CSS files, etc.) should have a unique, versioned URL. For example, instead of loading /images/button.png, you'd load /images/button_v123.png and when you change that file its URL changes to /images/button_v124.png. Typically this is handled by URL rewriting over static file URLs, so that, for example, the web server knows that /images/button_v124.png should really load the /images/button.png file from the web server's file system. Creating the version numbers can be done by appending a build number, using a CRC of file contents, or many other ways.
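For instance, deriving the version tag from a CRC of the file contents might look like this (a Python sketch; the web-root location and naming scheme are assumptions):

```python
import pathlib
import zlib

WEB_ROOT = pathlib.Path("/var/www/site")  # assumed location of the static files

def versioned_url(url_path: str) -> str:
    """Turn /images/button.png into /images/button_v<crc>.png based on the file's contents."""
    data = (WEB_ROOT / url_path.lstrip("/")).read_bytes()
    crc = zlib.crc32(data) & 0xFFFFFFFF
    p = pathlib.PurePosixPath(url_path)
    return str(p.with_name(f"{p.stem}_v{crc:08x}{p.suffix}"))

# The server-side rewrite then does the reverse: strip the "_v<crc>" part and
# serve the underlying /images/button.png from disk.
```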
Then you need to make sure that, wherever URLs are constructed in the parent page, they refer to the versioned URL. This obviously requires dynamic code used to construct all URLs, which can be accomplished either by adjusting the code used to generate your pages or by server-wide plugins which affect all text/html requests.
Then you set the Expires header for all resource requests (images, scripts, CSS files, etc.) to a date far in the future (e.g. 10 years from now). This effectively caches them forever. This means that all requests loaded by each of your pages will always be fetched from cache; cache invalidation never happens, which is OK because when the underlying resource changes, the parent page will use a new URL to find it.
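Producing the far-future headers themselves is simple; for example, with the ten-year lifetime mentioned above (Python standard library, shown framework-free):

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def far_future_headers() -> dict:
    """Headers for versioned resources: cache them effectively forever."""
    ten_years = timedelta(days=3650)
    expires = datetime.now(timezone.utc) + ten_years
    return {
        "Cache-Control": f"public, max-age={int(ten_years.total_seconds())}",
        "Expires": format_datetime(expires, usegmt=True),
    }
```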
Finally, you need to figure out how you want to cache your "parent" pages. How you do this is a judgement call. You can use ETag/If-None-Match HTTP headers to check for a new version of the page every time, which will very quickly load the page from cache if the server reports that it hasn't changed. Or you can use Expires (and/or Max-Age) to reload the parent page from cache for a given period of time before checking the server.
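The ETag/If-None-Match handshake for the parent page boils down to logic like this (a framework-agnostic Python sketch; how you plug it into your server is up to you):

```python
import hashlib
from typing import Optional, Tuple

def respond(page_html: bytes, if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body), honouring the If-None-Match request header."""
    etag = '"' + hashlib.sha1(page_html).hexdigest() + '"'
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""    # browser's cached copy is still valid
    return 200, {"ETag": etag}, page_html  # send the page along with its current ETag
```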
If you want to do something even more sophisticated, you can always put a custom proxy server on the kiosk-- in that case you'd have total, centralized control over how caching is done.
