Fiddler filter to hide recurring requests - filter

Is there any way to tell Fiddler not to log requests that have already been sent/logged previously?
Or even to filter them after you stop the capture, so as to get a smaller list to process?
Having a huge list of multiple identical requests is really difficult to debug...
Seemed simple but after many tries, i couldn't find anything.
Thanks in advance!
EDIT
To clarify things :
I am trying to debug a sort of monitoring system, in which the requests and responses change through time but could be hours and thousands of queries before an event changes the system state, hence the request response data. So i would like to skip logging identical request/response sets.

The easiest way to do this would be to write a bit of FiddlerScript (Rules > Customize Rules).
However, how exactly do you define "identical"? The same URL? The same request headers? The same response body? etc.
The definition you choose obviously has a significant impact on what the necessary FiddlerScript will look like.

Related

Process request and return response in chunks

I'm making a search agreggator and I've been wondering how could I improve the performance of the search.
Given that I'm getting results from different websites, currently I need to wait to receive the results for each provider but this is done one after another so the whole request takes a while to respond.
The easiest solution would be to just make a request from the client for each provider, but this would end up with a ton of request per search, (but if this is the proper way I'll just do it.)
Why I've been wondering is if there's way to return results everytime a provider responds, so if we have providers A, B and C and B already returned results then send it back to the client. In order for this to work all the searchs would need to run in parallel of course.
Do you know a way of doing this?
I'm trying to build a search experience similar to SkyScanner, that loads results but then you can see it still keeps getting more records and it sorts them on the fly (on client side as far as I can see).
Caching is the key here. Best practices for external API (or scraping) is to be as little of a 'taker' as possible. So in your Laravel setup, get your results, but cache the results for as long as makes sense for your app. Although the odds in a skyscanner situation is low that two users will make the exact same request, the odds are much higher that a user will make the same request multiple times, or may share the link, etc.
https://laravel.com/docs/8.x/cache
cache(['key' => 'value'], now()->addMinutes(10));
$value = cache('key');
To actually scrape the content, you could use this:
https://github.com/softonic/laravel-intelligent-scraper
Or to use an API which is the nicer route:
https://docs.guzzlephp.org/en/stable/
On the client side, you could just make a few calls to your own service in separate requests and that would give you your asynchronous feel you're looking for.

REST API for main page - one JSON or many?

I'm providing RESTful API to my (JS) client from (Java Spring) server.
Main site page contains a number of logical blocks (news, last comments, some trending stuff), each of them has a corresponding entity on server. Which way is a right one to go, handle one request like
/api/main_page/ ->
{
news: {...}
comments: {...}
...
}
or let the client do a few requests like
/api/news/
/api/comments/
...
I know in general it's better to have one large request/response, but is this an answer to this situation as well?
Ideally, you should have different API calls for fetching individual configurable content blocks of the page from the same API.
This way your content blocks are loosely bounded to each other.
You
can extend, port(to a new framework) and modify them independently at
anytime you want.
This comes extremely useful when application grows.
Switching off a feature is fairly easy in this
case.
A/B testing is also easy in this case.
Writing automation is
also very easy.
Overall it helps in reducing the testing efforts.
But if you really want to fetch this in one call. Then you should add additional params in request and when the server sees that additional param it adds the additional independent JSON in the response by calling it's own method from BL layer.
And, if speed is your concern then try caching these calls on server for some time(depends on the type of application).
I think in general multiple requests can be justified, when the requested resources reflect parts of the system state. (my personal rule of thumb, still WIP).
i.e. if a news gets displayed in your client application a lot, I would request it once and reuse it wherever I can. If you aggregate here, you would need to request for it later, maybe some of them never get actually displayed, and you have some magic to do if the representation of a news differs in the aggregation and /news/{id}-resource.
This approach would increase communication if the page gets loaded for the first time, but decrease communication throughout your client application the longer it runs.
The state on the server gets copied request by request to your client or updated when needed (Etags, last-modified, etc.).
In your example it looks like /news and /comments are some sort of latest or since last visit, but not all.
If this is true, I would design them to be a resurce as well, like /comments/latest or similar.
But in any case I would them only have self-links to the /news/{id} or /comments/{id} respectively. Then you would have a request to /comments/latest, what results in a list of news-self-links, for what I would start a request only if I don't already have that news (maybe I want to check if the cached copy is still up to date).
It is also possible to trigger the request to a /news/{id} only if it gets actually displayed (scrolling, swiping).
Probably the lifespan of a news or a comment is a criterion to answer this question. Meaning the caching in the client it is not that vital to the system, in opposite of a book in an Book store app.

Incremental updates using browser cache

The client (an AngularJS application) gets rather big lists from the server. The lists may have hundreds or thousands of elements, which can mean a few megabytes uncompressed (and some users (admins) get much more data).
I'm not planning to let the client get partial results as sorting and filtering should not bother the server.
Compression works fine (factor of about 10) and as the lists don't change often, 304 NOT MODIFIED helps a lot, too. But another important optimization is missing:
As a typical change of the lists are rather small (e.g., modifying two elements and adding a new one), transferring the changes only sounds like a good idea. I wonder how to do it properly.
Something like GET /offer/123/items should always return all the items in the offer number 123, right? Compression and 304 can be used here, but no incremental update. A request like GET /offer/123/items?since=1495765733 sounds like the way to go, but then browser caching does not get used:
either nothing has changed and the answer is empty (and caching it makes no sense)
or something has changed, the client updates its state and does never ask for changes since 1495765733 anymore (and caching it makes even less sense)
Obviously, when using the "since" query, nothing will be cached for the "resource" (the original query gets used just once or not at all).
So I can't rely on the browser cache and I can only use localStorage or sessionStorage, which have a few downsides:
it's limited to a few megabytes (the browser HTTP cache may be much bigger and gets handled automatically)
I have to implement some replacement strategy when I hit the limit
the browser cache stores already compressed data which I don't get (I'd have to re-compress them)
it doesn't work for the users (admins) getting bigger lists as even a single list may already be over limit
it gets emptied on logout (a customer's requirement)
Given that there's HTML 5 and HTTP 2.0, that's pretty unsatisfactory. What am I missing?
Is it possible to use the browser HTTP cache together with incremental updates?
I think there is one thing you are missing: in short, headers. What I'm thinking you could do and that would match (most) of your requirements, would be to:
First GET /offer/123/items is done normally, nothing special.
Subsequents GET /offer/123/items will be sent with a Fetched-At: 1495765733 header, indicating your server when the initial request has been sent.
From this point on, two scenarios are possible.
Either there is no change, and you can send the 304.
If there is a change however, return the new items since the time stamp previously sent has headers, but set a Cache-Control: no-cache from your response.
This leaves you to the point where you can have incremental updates, with caching of the initial megabytes-sized elements.
There is still one drawback though, that the caching is only done once, it won't cache updates. You said that your lists are not updated often so it might already work for you, but if you really want to push this further, I could think of one more thing.
Upon receiving an incremental update, you could trigger in the background another request without the Fetched-At header that won't be used at all by your application, but will just be there to update your http cache. It should not be as bad as it sounds performance-wise since your framework won't update its data with the new one (and potentially trigger re-renders), the only notable drawback would be in term of network and memory consumption. On mobile it might be problematic, but it doesn't sounds like an app intended to be displayed on them anyway.
I absolutely don't know your use-case and will just throw that out there, but are you really sure that doing some sort of pagination won't work? Megabytes of data sounds a lot to display and process for normal humans ;)
I would ditch the request/response cycle entirely and move to a push model.
Specifically, WebSockets.
This is the standard technology used on financial trading websites serving tables of real-time ticker data. Here is one such production application demonstrating the power of WebSockets:
https://www.poloniex.com/exchange#btc_eth
WebSocket applications have two types of state: global and user. The above link will show three tables of global data. When you're logged in, two aditional tables of user data are displayed at the bottom.
This is not HTTP; you won't be able to just slap this into a Java Servlet. You'll need to run a separate process on your server which communicates over TCP. The good news is, there are mature solutions readily available. A Java-based solution with a very decent free licensing option, which includes both client and server APIs (and does integrate with Angular2) is Lightstreamer. They have a well-organized demo page too. There are also adapters available to integrate with your data sources.
You may be hesitant to ditch your existing servlet approach, but this will be less headaches in the long run, and scales marvelously. HTTP polling, even with well-designed header-only requests, do not scale well with large lists which update frequently.
---------- EDIT ----------
Since the list updates are infrequent, WebSockets are probably overkill. Based on the further details provided by comments on this answer, I would recommend a DOM-based, AJAX-updated sorter and filterer such as DataTables, which has some built-in options for caching. In order to reuse client data across sessions, ajax requests in the previous link should be modified to save the current data in the table to localStorage after every ajax request, and when the client starts a new session, populate the table with this data. This will allow the plugin to manage the filtering, sorting, caching and browser-based persistence.
I'm thinking about something similar to Aperçu's idea, but using two requests. The idea is yet incomplete, so bear with me...
The client asks for GET /offer/123/items, possibly with the ETag and Fetched-At headers.
The server answers with
200 and a full list if either header is missing, or when there are too many changes since the Fetched-At timestamp
304 if nothing has changed since then
304 and a special Fetch-More header telling the client that more data is to be fetched otherwise
The last case is violating how HTTP should work, but AFAIK it's the only way letting the browser cache everything what I want it to cache. Since the whole communication is encrypted, proxies can't punish me for violating the spec.
The client reacts to Fetch-Errata by requesting GET /offer/123/items/errata. This way, the resource has got split into two requests. The split is ugly, but an angular $http interceptor can hide the ugliness from the application.
The second request is cacheable, too, and there can be also a Fetched-At header. The details are unclear, but some strong handwavium makes me believe that it can work. Actually, the errata could itself be inaccurate but still useful and get an errata itself.... etc.
With HTTP/1.1, more requests may mean more latency, but having a couple of them should still be profitable because of the saved bandwidth. The server can decide when to stop.
With HTTP/2, multiple requests could be send at once. The server could be make to handle them efficiently as it knows that they belong together. Some more handwavium...
I find the idea strange, but interesting and I'm looking forward to comments. Feel free to downvote me, but please leave an explanation.

Should requests contain unnecessary parameters which are sent if manually browsing the application

I'm currently testing a asp.net application. I have recorded all the steps i need and i have noticed that if i remove some of the parameters that i'm sending with the request the scripts still work and the desired outcome still happens. Anyway i couldn't find difference in the response time with them or without them, and i was wondering can i remove those parameters which are not needed and is this going to impact the performance in any way? I understand that the most realistic way of executing the scripts should be to do it like a normal user does (send all which is sent with normal usage) but this would really improve the readability of my scripts, any idea?
Thank you in advance and here is a picture which shows for example some parameters which i can remove and the scripts still work this is from a document management system and i'm performing step which doesn't direct the document as the parameters say but the normal usage records those :
Although it may be something very trivial like pre-populating date and time in calendar in user's time zone I believe you shouldn't be omitting any request parameters.
I strongly believe that load testing should mimic real user as close as possible so if it is not a big deal to send these extra parameters and perform their correlation - I would leave them.
Few other tips:
Embedded Resources (scripts, styles, images). Real-browsers download these entities so
Make sure you have "Retrieve All Embedded Resources" box checked
Make sure you "Use concurrent pool" size 3-5 threads
Filter out any "external" stuff via "URLs must match" input
Well-behaved browsers download embedded resources but do it only once. On subsequent requests they're being returned from browser's cache. Add HTTP Cache Manager to your Test Plan to simulate browser cache.
Add HTTP Cookie Manager to represent browser cookies and deal with cookie-based authentication.
See How To Make JMeter Behave More Like A Real Browser article for above tips explained just in case you want to dive into details
Less data to send, faster response time (normally).
Like you said, it's more realistic to test with all data from the recorded case, but if these parameters really doesn't impact your result and measured time, you can remove them for a better readability.
Sometimes jmeter records not necessary parameters because they are only needed for brower compability.

Ajax and Performance/Speed

I'm currently creating a small todo site, and I have multiple questions related to ajax and performance... So here are my questions:
In order to reduce number of request, I want to get all data from one request, so I will pass for example these attributes:
1.1. to get 1 task:
entity=task&id=2&type=single&extra=subtasks%%contexts
1.2. to get list of tasks and events in one listing:
entity=task%%event&user_id=2%type=multiple%order=date&limit=10
Do you think it will reduce number of request and improves some how the performance?
If all requests will go to one file, it means that that .php file might be quite big, is it bad? Or it not really matter?
For the listing. I will be able to change the order of listing and maybe filter it somehow. Do you think it will be better to load all tasks and event to
To keep things fast there are two concerns:
Reduce HTTP requests – if you need two separate bits of data, send them in one file.
Keep the content delivered in each AJAX request small – gzip and caching works wonders here.
So, yes, bundle things together. Large PHP file doesn't make any difference, DB queries are the only real bottleneck in a normally trafficked webpage.
For filtering and sorting, a good approach is to use JSON for the AJAX response, then sort/filter based on that on the client side if you are talking about a smallish number of items (probably upto 1000 items). If you have 100s of thousands of items, then returning a subset from the server will be better.

Resources