Unable to debug azure redis cache locally from azure function app - Timeout performing SETEX - stackexchange.redis

I have created an Azure Function app locally and created an Azure Redis Cache under the Premium plan in the Azure portal.
I am trying to get/set cache entries while debugging the function app locally. I am using StackExchange.Redis version 2.5.61. When I try to access the cache, it fails with the error below:
{"Timeout performing SETEX (5000ms), inst: 0, qu: 1, qs: 0, aw: False, bw: SpinningDown, rs: NotStarted, ws: Idle, in: 0,
serverEndpoint: test.cache.windows.net:6380, mc: 1/1/0, mgr: 10 of 10
available, clientName:, IOCP: (Busy=0,Free=1000,Min=8,Max=1000),
WORKER: (Busy=1,Free=32766,Min=8,Max=32767), POOL:
(Threads=16,QueuedItems=0,CompletedItems=803), v: 2.5.61.22961 (Please
take a look at this article for some common client-side issues that can cause timeouts:
https://stackexchange.github.io/StackExchange.Redis/Timeouts)"}
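The IOCP and WORKER counters in the message above can be read mechanically. The following is a minimal sketch and not part of StackExchange.Redis: the function names `parse_pool_stats` and `looks_starved` are hypothetical, and the rule of thumb used (Busy greater than Min hints at thread-pool throttling) is the one described in the Timeouts article linked in the error.

```python
import re

# Matches fragments such as "IOCP: (Busy=0,Free=1000,Min=8,Max=1000)".
POOL_RE = re.compile(
    r"(IOCP|WORKER):\s*\(Busy=(\d+),Free=(\d+),Min=(\d+),Max=(\d+)\)"
)


def parse_pool_stats(message):
    """Extract the IOCP and WORKER thread-pool counters from a timeout message."""
    stats = {}
    for name, busy, free, minimum, maximum in POOL_RE.findall(message):
        stats[name] = {
            "busy": int(busy),
            "free": int(free),
            "min": int(minimum),
            "max": int(maximum),
        }
    return stats


def looks_starved(stats):
    """Rule of thumb (assumption): Busy above Min suggests pool throttling."""
    return any(pool["busy"] > pool["min"] for pool in stats.values())


# The message from the question, joined onto one line.
MESSAGE = (
    "Timeout performing SETEX (5000ms), inst: 0, qu: 1, qs: 0, aw: False, "
    "bw: SpinningDown, rs: NotStarted, ws: Idle, in: 0, "
    "serverEndpoint: test.cache.windows.net:6380, mc: 1/1/0, "
    "mgr: 10 of 10 available, clientName:, "
    "IOCP: (Busy=0,Free=1000,Min=8,Max=1000), "
    "WORKER: (Busy=1,Free=32766,Min=8,Max=32767)"
)

stats = parse_pool_stats(MESSAGE)
print(stats["WORKER"]["busy"], looks_starved(stats))
```

For this message both pools are well below Min, so thread-pool starvation does not look like the cause here; connectivity from the local machine to port 6380 may be worth checking next.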

Related

K6 script failing in azure devops

Setup
We are using the K6 tool to load test our API.
Locally, the scripts work perfectly fine with up to 50 virtual users.
Problem statement
When running the same script in Azure DevOps using the K6 task with 15 virtual users for a duration of 60s, most of the requests (almost 90%) fail because of a timeout, with the following error:
level=warning msg="Request Failed" error="Get "https://xxxxx": dial tcp xxxx: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond."
I am not sure:
a) What is causing the timeout?
b) What are the other options?
PS: The API is behind the API Management service in Azure, and the load test works fine locally even at 10 times the load.

Some Postgres connections timing out while others don't

I have an AWS EC2 machine running a Laravel 5.2 application that connects to a Postgres 9.6 database running in RDS. While most of the connections work, some of them are rejected while being established, which causes a timeout and consequently an error in my API. I don't know what is causing them to be rejected. It also happens at random: when it does, it can be in any API endpoint and in any query inside that endpoint.
When the timeout is handled by PHP, it shows a message like:
SQLSTATE[08006] [7] timeout expired (SQL: ...)
Sometimes Nginx handles the timeout and replies with a 504 error. When Nginx handles the timeout, I get an error like:
2019/04/24 09:48:18 [error] 20657#20657: *3236 upstream timed out (110: Connection timed out) while reading response header from upstream, client: {client-ip-here}, server: {my-url-here}, request: "GET {my-endpoint-here} HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php7.0-fpm.sock", host: "{}", referrer: "https://app.cartoriovirtual.com/"
All usage charts on RDS and EC2 seem OK; I have plenty of RAM, storage, CPU and available connections for RDS. I also checked the inner VPC flows and they seem all right; however, I have many IPs (listed as attackers) scanning my network interfaces, most of them being rejected. Some (to port 22) are accepted but stopped at authentication; I use a .pem key file for auth.
The RDS network interface only accepts requests from inner VPC machines. In its logs, every 5 minutes I have a Checkpoint like this:
2019-04-25 01:05:29 UTC::#:[22595]:LOG: checkpoint starting: time
2019-04-25 01:05:34 UTC::#:[22595]:LOG: checkpoint complete: wrote 43 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=4.393 s, sync=0.001 s, total=4.404 s; sync files=19, longest=0.001 s, average=0.000 s; distance=16515 kB, estimate=16515 kB
Does anyone have tips on how to find a solution? I looked at all the logs that came to mind and fixed a few small issues, but the error persists. I am running out of ideas.

Understanding Spring Cloud Eureka Server self preservation and renew threshold

I am new to developing microservices, although I have been researching the topic for a while, reading both Spring's docs and Netflix's.
I have started a simple project available on Github. It is basically a Eureka server (Archimedes) and three Eureka client microservices (one public API and two private). Check Github's readme for a detailed description.
The point is that, when everything is running, if one of the private microservices is killed I would like the Eureka server to notice and remove it from the registry.
I found this question on Stackoverflow, and the solution involves setting enableSelfPreservation:false in the Eureka Server config. With this setting, the killed service disappears after a while, as expected.
However I can see the following message:
THE SELF PRESERVATION MODE IS TURNED OFF.THIS MAY NOT PROTECT INSTANCE
EXPIRY IN CASE OF NETWORK/OTHER PROBLEMS.
1. What is the purpose of the self preservation? The doc states that with self preservation on "clients can get the instances that do not exist anymore". So when is it advisable to have it on/off?
Furthermore, when self preservation is on, you may get a prominent warning in the Eureka Server console:
EMERGENCY! EUREKA MAY BE INCORRECTLY CLAIMING INSTANCES ARE UP WHEN
THEY'RE NOT. RENEWALS ARE LESSER THAN THRESHOLD AND HENCE THE
INSTANCES ARE NOT BEING EXPIRED JUST TO BE SAFE.
Now, going on with the Spring Eureka Console.
Lease expiration enabled true/false
Renews threshold 5
Renews (last min) 4
I have come across a weird behaviour of the threshold count: when I start the Eureka Server alone, the threshold is 1.
2. I have a single Eureka server, configured with registerWithEureka: false to prevent it from registering on another server. Then why does it show up in the threshold count?
3. For every client I start, the threshold count increases by 2. I guess it is because each client sends 2 renew messages per minute, am I right?
4. The Eureka server never sends a renew, so the last-minute renews are always below the threshold. Is this normal?
Renews threshold: 5
Renews (last min): (client1) +2 + (client2) +2 -> 4
Server cfg:
server:
  port: ${PORT:8761}
eureka:
  instance:
    hostname: localhost
  client:
    registerWithEureka: false
    fetchRegistry: false
    serviceUrl:
      defaultZone: http://${eureka.instance.hostname}:${server.port}/eureka/
  server:
    enableSelfPreservation: false
    # waitTimeInMsWhenSyncEmpty: 0
Client 1 cfg:
spring:
  application:
    name: random-image-microservice
server:
  port: 9999
eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
    healthcheck:
      enabled: true
I had the same question as @codependent. I googled a lot and did some experiments, so here is what I learned about how the Eureka server and instances work.
Every instance needs to renew its lease with the Eureka server once every 30 seconds; the interval can be defined in eureka.instance.leaseRenewalIntervalInSeconds.
Renews (last min): how many renews the Eureka server received from Eureka instances in the last minute.
Renews threshold: the number of renews the Eureka server expects to receive from Eureka instances per minute.
For example, suppose registerWithEureka is set to false, eureka.instance.leaseRenewalIntervalInSeconds is set to 30, and 2 Eureka instances are running. The two instances send 4 renews to the Eureka server per minute, and the Eureka server's minimal threshold is 1 (hard-coded), so the threshold is 5 (this number is multiplied by a factor, eureka.server.renewalPercentThreshold, which is discussed later).
SELF PRESERVATION MODE: if Renews (last min) is less than Renews threshold, self preservation mode is activated.
So in the example above, SELF PRESERVATION MODE is activated, because the threshold is 5 but the Eureka server only receives 4 renews/min.
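The arithmetic in this example can be sketched as follows. This is a minimal illustration of the numbers quoted in the answer, not Eureka's actual code; the function names are hypothetical.

```python
def renews_per_minute(instances, renewal_interval_seconds=30):
    """Each instance renews once per interval, i.e. 60 / interval times a minute."""
    return instances * (60 // renewal_interval_seconds)


def self_preservation_active(renews_last_min, renews_threshold):
    """Self preservation kicks in when fewer renews arrive than expected."""
    return renews_last_min < renews_threshold


renews = renews_per_minute(2)  # two instances with a 30 s renewal interval
threshold = 1 + renews         # the answer's minimal threshold of 1 plus expected renews
print(renews, threshold, self_preservation_active(renews, threshold))
```

With two instances the server receives 4 renews per minute against a threshold of 5, so the check reports that self preservation is active.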
Question 1:
SELF PRESERVATION MODE is designed to cope with failures caused by poor network connectivity. Suppose connectivity between Eureka instances A and B is good, but B fails to renew its lease with the Eureka server for a short period because of connectivity hiccups. At that point the Eureka server can't simply kick out instance B: if it did, instance A would no longer get B from the Eureka server even though B is available. That is the purpose of SELF PRESERVATION MODE, and it's better to turn it on.
Question 2:
The minimal threshold of 1 is hard-coded. With registerWithEureka set to false, no Eureka instance registers, so the threshold is 1.
In a production environment, we generally deploy two Eureka servers with registerWithEureka set to true. The threshold is then 2, and each Eureka server renews its lease twice per minute, so RENEWALS ARE LESSER THAN THRESHOLD won't be a problem.
Question 3:
Yes, you are right. eureka.instance.leaseRenewalIntervalInSeconds determines how many renews are sent to the server per minute, but the result is multiplied by the factor eureka.server.renewalPercentThreshold mentioned above, whose default value is 0.85.
Question 4:
Yes, it's normal, because the threshold's initial value is set to 1. So if registerWithEureka is set to false, the renews are always below the threshold.
I have two suggestions for this:
Deploy two Eureka servers and enable registerWithEureka.
If you just want to deploy in a demo/dev environment, you can set eureka.server.renewalPercentThreshold to 0.49, so that when you start a Eureka server alone, the threshold will be 0.
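The effect of eureka.server.renewalPercentThreshold on the numbers quoted in this answer can be sketched as below. The model is an assumption, chosen because it reproduces the figures in this thread (1 for a lone server, 5 with two clients, 0 with a factor of 0.49): expected renews per minute start at 2 and grow by 2 per client, and the product with the factor is truncated to an integer. The function name is hypothetical, and the exact formula may differ between Eureka versions.

```python
def renews_threshold(clients, percent_threshold=0.85,
                     renews_per_client=2, initial_expected=2):
    """Assumed model: truncate (initial + clients * renews) * factor to an int."""
    expected = initial_expected + clients * renews_per_client
    return int(expected * percent_threshold)


# Lone server, server with two clients, and lone server with the 0.49 factor.
for clients, percent in [(0, 0.85), (2, 0.85), (0, 0.49)]:
    print(clients, percent, renews_threshold(clients, percent))
```

Under this model a lone server with the 0.49 factor ends up with a threshold of 0, so zero renews per minute can no longer fall below it.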
I've created a blog post with the details of Eureka here, that fills in some missing detail from Spring doc or Netflix blog. It is the result of several days of debugging and digging through source code. I understand it's preferable to copy-paste rather than linking to an external URL, but the content is too big for an SO answer.
You can try setting the renewal threshold in your Eureka server properties. If you have around 3 to 4 microservices to register with Eureka, you can set it to this:
eureka.server.renewalPercentThreshold=0.33
server:
  enableSelfPreservation: false
If set to true, Eureka expects service instances to register themselves and to keep sending lease renewal requests every 30 s. Normally, if Eureka doesn't receive a renewal from a service for three renewal periods (90 s), it deregisters that instance. If renewals drop below the expected threshold, however, Eureka assumes there's a network problem, enters self-preservation mode, and won't deregister service instances.
If set to false, Eureka never enters self-preservation mode and keeps deregistering instances whose leases expire.
Even if you decide to disable self-preservation mode for development, you should leave it enabled when you go into production.
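How lease expiry and self-preservation interact can be sketched like this. It is a simplified illustration rather than Eureka's implementation: the function names are hypothetical, the 90 s default mirrors the three 30 s renewal periods mentioned in the text, and it assumes that self-preservation, when enabled, pauses deregistration while renews are below the threshold.

```python
def lease_expired(seconds_since_last_renewal, lease_duration_seconds=90):
    """A lease expires after three missed 30 s renewals (90 s)."""
    return seconds_since_last_renewal > lease_duration_seconds


def should_deregister(seconds_since_last_renewal, self_preservation_enabled,
                      renews_last_min, renews_threshold):
    """Deregister expired instances unless self-preservation is holding them."""
    in_self_preservation = (
        self_preservation_enabled and renews_last_min < renews_threshold
    )
    return lease_expired(seconds_since_last_renewal) and not in_self_preservation


# Instance silent for 120 s; 4 renews/min against a threshold of 5.
print(should_deregister(120, False, 4, 5))  # self-preservation disabled
print(should_deregister(120, True, 4, 5))   # enabled, renews below threshold
print(should_deregister(120, True, 6, 5))   # enabled, renews above threshold
```

The middle case is the one that surprises people in development: the killed service stays in the registry because the server is protecting all leases, which is why disabling the flag makes it disappear.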

Performance - High Context Switch

I have an application that exposes a web service, which I am trying to load test.
It works for a few concurrent users without any issue.
When I increase the user count to 30, I get this error in JMeter within 100 milliseconds:
Non HTTP response code: java.net.SocketException - Non HTTP response message: Connection reset
[I thought my JMeter config was wrong, but one of the web applications that uses this web service also failed consistently around that time, saying the service was unavailable. So the server itself has some issue.]
I checked the web service's application log: no exceptions, very clean.
CPU and memory utilization on the server machine are also normal.
However, 'Context Switch' and 'Device Interrupts' increase under load.
Context switches average 1500/sec under heavy load; normally the rate is 500/sec.
Is this bad? Is this what makes my application perform badly? I have no clue how to resolve this issue.
Note: It is a JBoss server.

Preferred connection pool settings on slow DB server

In a JSF2+Spring web application, I'm using c3p0 for connection pooling. The DB server seems to be quite slow at establishing new connections: after a restart of the container (tomcat7, no persistence across restarts), requests quite often run into timeouts in awaitAvailable.
However, the DB server's owner isn't able to tweak the server itself, so I need to tweak the connection pool settings.
Any recommendations on this? Which values should I touch?
(DB is Oracle 11; current pool settings are: checkoutTimeout = 10000, maxPoolSize = 110, idleConnectionTestPeriod = 300, minPoolSize = 5)
