Spring Boot Actuator gauge provides unreliable data

I don't understand what this value intends to provide.
According to the documentation, the gauge sounds like the "totals" while the counter is the delta (since last request? delta compared to what?).
Well, in my standard Spring Boot application (still on version 1.4.2), all counters are 0 and the gauge goes up and down without any apparent logic.
I am the only user on the system. I'll hit refresh and get a value like this:
{"gauge.servo.response.rest.star-star":20.0}
I'll hit refresh again on the application, and I'll get this again:
{"gauge.servo.response.rest.star-star":20.0}
Hit refresh again, and I get this:
{"gauge.servo.response.rest.star-star":12.0}
How can this number ever go DOWN? Shouldn't it always increase as the usage of my system increases? Perhaps it gauges only the last few seconds? If so, how do I know when to monitor?
I need a decent monitoring solution that provides the following data:
1. Total number of requests (and rate)
2. Total number of errors
3. Latency information
Seems like item #3 is not supported by metrics at all, and the first two simply don't work as I expect them to.
Note: this server is a Spring Zuul gateway, but I don't think this should impact anything?
Any help would be appreciated.
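
For readers puzzling over the same behaviour, the sketch below illustrates the general distinction between the two metric kinds; it does not reproduce Actuator's internals. A gauge keeps only the most recent value submitted to it, so it can fall as easily as rise, while a counter accumulates increments. The class names and the sample response times are hypothetical, and it is an assumption that the gauge in question is fed a per-request measurement such as a response time in milliseconds.

```python
class Gauge:
    """Holds only the most recently submitted value (a point-in-time reading)."""

    def __init__(self):
        self.value = None

    def submit(self, value):
        self.value = value  # overwrites; earlier readings are forgotten


class Counter:
    """Accumulates increments, so it only grows while increments are positive."""

    def __init__(self):
        self.value = 0

    def increment(self, delta=1):
        self.value += delta


# Hypothetical response times (ms) for three successive requests.
response_times_ms = [20.0, 20.0, 12.0]

last_response = Gauge()
request_count = Counter()
readings = []
for ms in response_times_ms:
    last_response.submit(ms)
    request_count.increment()
    readings.append((last_response.value, request_count.value))

print(readings)  # [(20.0, 1), (20.0, 2), (12.0, 3)]
```

Under that assumption, a reading that drops from 20.0 to 12.0 would simply mean the latest request was faster, not that usage went down.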

Related

investments/transactions/get endpoint - how long to return data?

I've been testing Plaid's investments transactions endpoint (investments/transactions/get) in development.
I'm encountering issues with highly variable delays for data to be returned (following the product initialization with Link). Plaid states that it takes 1–2 minutes to return investment transaction data, but I've found that in practice, it can be up to several hours before the data is returned.
Anyone else using this endpoint and getting data returned within 1–2 minutes, or is it generally a longer wait?
If it is a longer wait, do you simply wait for the DEFAULT_UPDATE webhook before you retrieve the data?
So far, my experience with their investments/transactions/get has been problematic (missing transactions, product doesn't work as described in their docs, limited sandbox dataset, etc.) so I'm very interested in hearing from anyone with more experience with this endpoint.
Do you find this endpoint generally reliable, and the data provided to be usable, or have you had issues? I've not seen any issues with investments/holdings/get, so I'm hoping that my problems are unusual, and I just need to push through it.
I'm testing in development with my own brokerage accounts, so I know what the underlying transactions are compared to what Plaid is returning to me. My calls are set up correctly, and I can't get a helpful answer from Plaid support.

I took a look at the support issue, and it does appear that the problem you're hitting is related to a bug (or two different bugs, in this case).
However, for posterity and anyone else reading this question: I looked it up, and in the general case the endpoint is pretty fast. P95 latency for calling /investments/transactions/get is currently about 1 second. Initial calls on an Item have higher latency because they have more data to fetch and because they are blocked on Plaid extracting the data for the Item for the first time, hence the 1-2 minute guidance in the docs.
In addition, Investments updates at some major brokerages are scheduled to happen only overnight after market close, so there might be a delay of 12+ hours between making a trade and seeing that trade be returned by the API.

Scheduling a simple GET every second or even more often: should I opt for Spring Cloud Task, Spring Batch or org.springframework.scheduling?

Context: in my country, a new instant payment system is planned for November. Basically, the Central Bank will provide two endpoints: (1) a POST endpoint to which we post a single money transfer, and (2) a GET endpoint from which we get the result of a previously sent money transfer; results can arrive completely out of order. Each GET returns a single money transfer result, and a response header indicates whether there is another result to GET. It never says how many results are available: if there is a result, it is returned in the GET response, along with an indication of whether it is the last one or more remain for the next GET.
Main constraint: at most 10 seconds may pass from the moment the end user taps the Transfer button in the mobile app until the final result (success or failure) is shown on the screen.
Strategy: I want a scheduler that triggers a GET to the Central Bank every second, or even more often. The scheduler will basically invoke a simple function which
Calls the GET endpoint
Pushes the result to Kafka or persists it in a database, and
If the response headers indicate that more results are available, starts the same function again.
Issue: since we are Spring users, I thought my decision was between Spring Batch and org.springframework.scheduling.annotation.SchedulingConfigurer/TaskScheduler. I have used Spring Batch successfully for a while, but never with such a short trigger period (never with a 1 second period). I stumbled on a discussion that made me wonder whether, for a very simple task with a very short period, I should consider Spring Cloud Data Flow or Spring Cloud Task instead of Spring Batch.
According to this answer "... Spring Batch is ... designed for the building of complex compute problems ... You can orchestrate Spring Batch jobs with Spring Scheduler if you want". Based on that, it seems I shouldn't use Spring Batch, because my case isn't complex. The design challenge is more about the short trigger period and about triggering another run from the current one than about transformation, calculation or ETL. Nevertheless, as far as I can see, Spring Batch with its tasklets is well designed for restarting, resuming and retrying, and fits a scenario that never finishes, while org.springframework.scheduling seems to be only a way to trigger an event based on a configured period. Well, this is my feeling based on personal use and study.
According to an answer to someone asking about orchestration of composed tasks, "... you can achieve your design goals using Spring Cloud Data Flow along with the Spring Cloud Task/Spring Batch...". In my case, I don't see composed tasks: the second trigger doesn't depend on the result of the previous one. It sounds more like "chained" tasks than "composed" ones. I have never used Spring Cloud Data Flow, but it seems a good candidate for managing and viewing the triggered tasks through a console or dashboard. Nevertheless, I couldn't find any documented limitations or rules of thumb for short trigger periods and "chained" triggers.
So my direct question is: which member of the Spring family is currently recommended for such a short trigger period? Assuming Spring Cloud Data Flow is used for management and dashboards, which Spring component is recommended for the trigger in such scenarios? It seems that Spring Cloud Task is designed for calling complex functions, Spring Batch adds much more than I need, and org.springframework.scheduling.* lacks integration with Spring Cloud Data Flow. As an analogy, not a comparison: in AWS, the documentation clearly says "don't use CloudWatch for less than one minute. If you want less than one minute, start CloudWatch for each minute that start another scheduler/cron each second". There may be a well-known rule of thumb for a simple task that needs to be triggered every second or even more often, one that takes advantage of the Spring family's approach and experience.

This may be a stupid answer, but why do you need a scheduler here? Wouldn't a never-ending job achieve the goal?
You start a job; it does a GET request and pushes the result to Kafka.
If the GET response indicates that there are more results, it immediately does another GET and pushes the result to Kafka.
If the GET response indicates that there are no more results, it sleeps for 1 second and then does the GET request again.
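
The loop described above could be sketched roughly as follows. This is a language-neutral illustration written in Python, not Spring code: fetch and publish are hypothetical callables standing in for the Central Bank GET and the Kafka producer, and the keep_running hook exists only so that the demonstration terminates.

```python
import time


def poll_forever(fetch, publish, idle_seconds=1.0,
                 sleep=time.sleep, keep_running=lambda: True):
    """Drain results back to back; pause only when none are reported as pending.

    fetch() is a hypothetical callable returning (result, has_more),
    where result is None if nothing was available.
    """
    while keep_running():
        result, has_more = fetch()
        if result is not None:
            publish(result)
        if not has_more:
            sleep(idle_seconds)  # nothing pending: wait before the next GET


# Demonstration with stand-ins for the endpoint and for Kafka.
responses = iter([("tx-1", True), ("tx-2", True), ("tx-3", False), (None, False)])
budget = iter(range(4))  # allow exactly four GETs so the demo stops
published, sleeps = [], []

poll_forever(
    fetch=lambda: next(responses),
    publish=published.append,
    sleep=sleeps.append,  # record the pauses instead of really sleeping
    keep_running=lambda: next(budget, None) is not None,
)

print(published)  # ['tx-1', 'tx-2', 'tx-3']
print(sleeps)     # [1.0, 1.0]
```

A real job would also need error handling and backoff around the GET, which this sketch leaves out.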

How to kill the thread of a search request on an Elasticsearch cluster? Is there an API to do this?

I built an Elasticsearch cluster holding a large amount of data, and clients can send search requests to it.
Sometimes the cluster takes a long time to handle a single request.
My question is: is there any API to kill a specific thread that is taking too long?

I wanted to follow up on this answer now that Elasticsearch 1.0.0 has been released. I am happy to announce that new functionality has been introduced that implements some protection for the heap, called the circuit breaker.
With the current implementation, the circuit breaker tries to anticipate how much data is going to be loaded into the field data cache, and if it's greater than the limit (80% by default) it will trip the circuit breaker and thereby kill your query.
There are two parameters for you to set if you want to modify them:
indices.fielddata.breaker.limit
indices.fielddata.breaker.overhead
The overhead is the constant that is used to estimate how much data will be loaded into the field cache; this is 1.03 by default.
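
As a rough illustration of how the two settings interact, here is a simplified model of the check. It is not Elasticsearch's actual code: the function name and the byte figures are hypothetical, the limit is assumed to be a fraction of the JVM heap, and field data that is already loaded is ignored.

```python
def would_trip(estimated_bytes, heap_bytes, limit=0.80, overhead=1.03):
    """Simplified model of the breaker check, not Elasticsearch's implementation.

    limit    -- fraction of the heap allowed for field data
                (indices.fielddata.breaker.limit)
    overhead -- multiplier applied to the raw size estimate
                (indices.fielddata.breaker.overhead)
    """
    padded_estimate = estimated_bytes * overhead
    return padded_estimate > heap_bytes * limit


heap = 1_000_000_000  # hypothetical 1 GB heap
small_query = would_trip(estimated_bytes=700_000_000, heap_bytes=heap)
large_query = would_trip(estimated_bytes=790_000_000, heap_bytes=heap)
print(small_query, large_query)  # False True
```

The second query trips because 790 MB padded by 1.03 exceeds the 800 MB allowance, even though the raw estimate does not.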
This is an exciting development for Elasticsearch and a feature I have been waiting months for.
This is the pull request, if you are interested in seeing how it was made; thanks to dakrone for getting this done!
https://github.com/elasticsearch/elasticsearch/pull/4261
Hope this helps,
MatthewJ

Currently it is not possible to kill or stop long-running queries, but Elasticsearch is going to add a task management API to do this. The API is likely to be added in Elasticsearch 5.0, maybe in 2016 or later.
See Task management 1 and Task management 2.

GWT RequestFactory Performance

I have a question regarding the performance of RequestFactory and GWT. I have a domain entity with 8 fields, and a request that returns around 1000 EntityProxies. The time between firing the request and receiving the response is around 20 seconds. When I do the same but return 10 EntityProxies, the time is 17 seconds, almost the same.
Is this because I'm working in development mode, or will the time be the same when I release the code to the web?
Is there any way to improve the performance? I'm only reading data, so perhaps something that only reads and doesn't write may be the solution?
I read this post with something similar to my problem:
GWT Requestfactory performance suggestions
Thanks a lot.
PS: I read somewhere that one solution could be to create XML on the server, send it to the client and recreate the objects there. I don't want to do this because it would really change the design of my app.

Thank you all for the help. I realize now that using RequestFactory to retrieve thousands of records was perhaps a mistake.
I initially used a Locator and overrode the isLive() and find() methods according to this post:
gwt-requestfactory-performance-suggestions
The response time was reduced to about 13 seconds, but that is still too high.
But I solved it easily. Instead of returning 1000+ entities, I created a new database table in which each field holds all the values of the corresponding original field (1000+ of them) concatenated with a separator (each DB field has a length of about 10000). The table has only one record, with around 8 fields.
Something like this:
Field1 | Field2 | Field3
Field1val;Field1val;Field1val;....... | Field2val;Field2val;Field2val;...... | Field3val;Field3val;Field3val;......
I return that one record through RequestFactory to my client, and it reduced the response time a lot, to around 1 second. I parse these large strings on the client, which takes about 500 ms. So instead of wasting around 20 seconds, it now takes around 1-2 seconds to accomplish the same thing.
By the way, I am only displaying information. It is not necessary to insert, delete or update records, so this solution works for me.
Thought I could share this solution.
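
The client-side parsing step might look roughly like the sketch below. It is written in Python purely for brevity (the real client would be GWT Java); the function name and the sample values are hypothetical, and it assumes no value ever contains the separator.

```python
def split_record(record, separator=";"):
    """Turn one row of separator-joined columns back into per-entity dicts."""
    columns = {name: joined.split(separator) for name, joined in record.items()}
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("columns hold different numbers of values")
    names = list(columns)
    # zip(*...) walks the columns in parallel, yielding one tuple per entity.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Hypothetical single database record with three concatenated fields.
record = {
    "Field1": "a1;a2;a3",
    "Field2": "b1;b2;b3",
    "Field3": "c1;c2;c3",
}
rows = split_record(record)
print(rows[0])  # {'Field1': 'a1', 'Field2': 'b1', 'Field3': 'c1'}
```

If values could contain the separator, an escaping scheme or a structured format such as JSON would be needed instead.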

Performance profiling and fixing issues in GWT is tricky. Avoid all profiling in GWT hosted mode; the numbers it gives are not meaningful.
You should profile only in web mode.
GWT RequestFactory is by design slower than GWT RPC, GWT JSON, etc. This is a trade-off for RequestFactory's ability to calculate deltas and send only a small amount of information to the server on save.
You should recheck your application design to avoid loading thousands of proxies. RequestFactory is meant for "form"-like applications. The only reason you might need thousands of proxies is for a grid display, and in that scenario you can probably use a paginated async grid.

You should profile your app in order to find out how much time is spent on the following steps:
Entities retrieved from the database (server): This can be improved using a second-level cache and optimized queries.
Entities serialized to JSON (server): There is an overhead here because RequestFactory and AutoBean rely on reflection. You can try to transmit only the entities that you are actually going to display on the client. Another optimization that greatly reduces latency is to override the isLive method of your EntityLocator to return true.
HTTP response from server to client to transmit the data (wire): You can think about using gzip compression to reduce the amount of data that has to be transferred (important if you send a lot of objects over the wire).
De-serialization on the client (client): This should be quite fast. There was a benchmark that showed that AutoBean serialization was one of the fastest ways to serialize JSON. Again, this will benefit from not sending the whole object graph over the wire.
One further way to improve performance is caching. You can use HTML5 localStorage to cache data on the client. This applies specifically to data that doesn't change often.

Solution for graphing application events metrics in real time

We have an application that parses tweets, and we want to see the activity in real time. We have tried several solutions without success. Our main problem is that the graphing solution (for example, Graphite) needs a continuous flow of metrics. Also, when the database aggregates the metrics, it applies an average, not a sum.
We recently saw Cube from Square, which would fit our requirements, but it's too new.
Any alternatives?

I found the solution in the latest version of Graphite:
http://graphite.readthedocs.org/en/latest/config-carbon.html#storage-aggregation-conf

If I understood correctly, you cannot feed Graphite in real time, for instance as soon as you discover a new tweet?
If that's the case, it looks like you can specify a Unix timestamp when updating Graphite (each line has the form metric_path value timestamp\n), so you could pass in the time of discovery, publication or whatever, regardless of when you process it.
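
A minimal sketch of that idea, assuming carbon's plaintext listener is reachable on its default port 2003. The metric path, host and timestamp are hypothetical, and send_metric is defined but not called here.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: '<path> <value> <unix timestamp>' plus newline."""
    if timestamp is None:
        timestamp = time.time()  # fall back to the processing time
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    """Send one line to a carbon plaintext listener (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))


# Use the tweet's own (hypothetical) publication time, not the processing time.
line = format_metric("tweets.parsed.count", 1, timestamp=1325376000)
print(repr(line))  # 'tweets.parsed.count 1 1325376000\n'
```

Passing the publication time explicitly means a tweet processed late is still recorded at the moment it happened.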
