I want to capture metrics (number of calls, 95th percentile) about calls made from my backend service to other third party services. I am using WebClient to make these http calls. I couldn't find a specific property to enable WebClient Histogram metrics.
I have added MetricsWebClientFilterFunction to generate the metrics. Here is the logic -
WebClient webClient = WebClient.builder()
        .baseUrl(SERVICE_URL)
        .filter(new MetricsWebClientFilterFunction(meterRegistry,
                new DefaultWebClientExchangeTagsProvider(), "webClientCalls", AutoTimer.ENABLED))
        .build();
It's generating only count and sum metrics. How can I generate histogram metrics for WebClient calls?
Here is the output in /actuator/prometheus endpoint -
webClientCalls_seconds_count{clientName="service_url",method="GET",outcome="SUCCESS",status="200",uri="/hello",} 1.0
webClientCalls_seconds_sum{clientName="service_url",method="GET",outcome="SUCCESS",status="200",uri="/hello",} 2.301044994
webClientCalls_seconds_max{clientName="service_url",method="GET",outcome="SUCCESS",status="200",uri="/hello",} 0.0
Just implement the AutoTimer interface and override its apply method.
public final class AutoTimerHistogram implements AutoTimer {

    @Override
    public void apply(Timer.Builder builder) {
        builder
                .serviceLevelObjectives(
                        Duration.ofMillis(100),
                        Duration.ofMillis(500),
                        Duration.ofMillis(800),
                        Duration.ofMillis(1000),
                        Duration.ofMillis(1200))
                .minimumExpectedValue(Duration.ofMillis(100))
                .maximumExpectedValue(Duration.ofMillis(10000));
    }
}
After that, pass it to your MetricsWebClientFilterFunction:
MetricsWebClientFilterFunction metricsFilter =
        new MetricsWebClientFilterFunction(
                meterRegistry,
                new DefaultWebClientExchangeTagsProvider(),
                "custom.web.client",
                autoTimerHistogram);
With this in place, /actuator/prometheus exposes the histogram buckets:
# TYPE custom_web_client_seconds histogram
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="0.1",} 0.0
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="0.5",} 0.0
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="0.8",} 1.0
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="1.0",} 1.0
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="1.2",} 1.0
custom_web_client_seconds_bucket{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",le="+Inf",} 1.0
custom_web_client_seconds_count{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",} 1.0
custom_web_client_seconds_sum{clientName="login.microsoftonline.com",method="POST",outcome="SUCCESS",status="200",uri="/common/oauth2/v2.0/token",} 0.646656087
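Depending on your Spring Boot version, the same result may also be achievable without custom code: Spring Boot exposes per-meter distribution properties. This is an assumption that your Boot version supports them for the metric name used here; note that older 2.x versions spell the bucket property sla instead of slo:

```properties
management.metrics.distribution.percentiles-histogram.webClientCalls=true
management.metrics.distribution.slo.webClientCalls=100ms,500ms,800ms,1s,1200ms
```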
I am trying to expose a custom Gauge metric from my Spring Boot application. I am using Micrometer with the Prometheus registry to do so. I have set up the PrometheusRegistry and configs as per Micrometer Samples - Github, but that creates one more HTTP server for exposing the Prometheus metrics. I need to redirect or expose all the metrics to Spring Boot's default context path - /actuator/prometheus - instead of a new context path on a new port. I have implemented the following code so far -
PrometheusRegistry.java -
package com.xyz.abc.prometheus;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.time.Duration;
import com.sun.net.httpserver.HttpServer;
import io.micrometer.core.lang.Nullable;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
public class PrometheusRegistry {

    public static PrometheusMeterRegistry prometheus() {
        PrometheusMeterRegistry prometheusRegistry = new PrometheusMeterRegistry(new PrometheusConfig() {

            @Override
            public Duration step() {
                return Duration.ofSeconds(10);
            }

            @Override
            @Nullable
            public String get(String k) {
                return null;
            }
        });
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
            server.createContext("/sample-data/prometheus", httpExchange -> {
                String response = prometheusRegistry.scrape();
                httpExchange.sendResponseHeaders(200, response.length());
                OutputStream os = httpExchange.getResponseBody();
                os.write(response.getBytes());
                os.close();
            });
            new Thread(server::start).run();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return prometheusRegistry;
    }
}
MicrometerConfig.java -
package com.xyz.abc.prometheus;
import io.micrometer.core.instrument.MeterRegistry;
public class MicrometerConfig {

    public static MeterRegistry carMonitoringSystem() {
        // Pick a monitoring system here to use in your samples.
        return PrometheusRegistry.prometheus();
    }
}
Code snippet where I am creating a custom Gauge metric. As of now, it's a simple REST API to test (please read the comments in between):
@SuppressWarnings({ "unchecked", "rawtypes" })
@RequestMapping(value = "/sampleApi", method = RequestMethod.GET)
@ResponseBody
// This @Timed annotation is working fine and its metrics come up in /actuator/prometheus by default
@Timed(value = "car.healthcheck", description = "Time taken to return healthcheck")
public ResponseEntity healthCheck() {
    MeterRegistry registry = MicrometerConfig.carMonitoringSystem();
    AtomicLong n = new AtomicLong();
    // Starting from here, none of the Gauge metrics shows up in /actuator/prometheus;
    // instead they go to /sample-data/prometheus on port 8081 as configured.
    registry.gauge("car.gauge.one", Tags.of("k", "v"), n);
    registry.gauge("car.gauge.two", Tags.of("k", "v1"), n, n2 -> n2.get() - 1);
    registry.gauge("car.help.gauge", 89);
    // This thing never works! This gauge metric never shows up in any configured URI.
    Gauge.builder("car.gauge.test", cpu)
            .description("car.device.cpu")
            .tags("customer", "demo")
            .register(registry);
    return new ResponseEntity("Car is working fine.", HttpStatus.OK);
}
I need all the metrics to show up inside /actuator/prometheus instead of a new HTTP server getting created. I know that I am explicitly creating a new HTTP server, so the metrics are popping up there. Please let me know how to avoid creating a new HTTP server and redirect all the Prometheus metrics to the default path - /actuator/prometheus. Also, if I use Gauge.builder to define a custom gauge metric, it never works. Please explain how I can make that work as well, and let me know where I am going wrong.
Thank you.
Every time you call MicrometerConfig.carMonitoringSystem(), it creates a new Prometheus registry (and tries to start a new server).
You need to inject the MeterRegistry into the class that is creating the gauge and use that injected instance instead.
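A rough sketch of the injection approach (assuming a standard Spring Boot setup where micrometer-registry-prometheus is on the classpath, so the auto-configured MeterRegistry is backed by Prometheus; class and metric names here are illustrative, not from the original post):

```java
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import java.util.concurrent.atomic.AtomicLong;
import org.springframework.stereotype.Component;

@Component
public class CarMetrics {

    // Held as a field so the gauge's state object stays strongly reachable.
    private final AtomicLong gaugeValue = new AtomicLong();

    // Spring injects the auto-configured MeterRegistry; gauges registered on
    // it end up under the same /actuator/prometheus endpoint as @Timed metrics.
    public CarMetrics(MeterRegistry registry) {
        registry.gauge("car.gauge.one", gaugeValue);
        Gauge.builder("car.gauge.test", gaugeValue, AtomicLong::doubleValue)
                .description("car.device.cpu")
                .tags("customer", "demo")
                .register(registry);
    }

    public void update(long value) {
        gaugeValue.set(value);
    }
}
```

This also likely explains why the Gauge.builder metric "never works" in the original snippet: Micrometer gauges hold only a weak reference to the state object they observe, so a gauge built on a request-scoped local variable can silently disappear once that variable is garbage collected. Keeping the state in a long-lived field avoids that.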
We are developing a Spring Boot (2.4.0) application that uses the Spring Integration framework (5.4.1) to build SOAP integration flows and register them dynamically. The time taken to register an IntegrationFlow with the FlowContext grows much faster than linearly as the number of registered flows increases.
Following is a quick snapshot of time taken to register flows:
5 flows – 500 ms
100 flows – 80 sec
300 flows – 300 sec
We see that the first few flows take about 100 ms each to register, and by the time we reach 300 flows each registration takes up to 7 seconds. These flows are identical in nature (they simply log an info message and return).
Any help to resolve this issue would be highly appreciated.
SoapFlowsAutoConfiguration.java (auto-configuration class that registers the flows dynamically (manually))
@Bean
public UriEndpointMapping uriEndpointMapping(
ServerProperties serverProps,
WebServicesProperties webServiceProps,
IntegrationFlowContext flowContext,
FlowMetadataProvider flowMetadataProvider,
@ErrorChannel(Usage.SOAP) Optional<MessageChannel> errorChannel,
BeanFactory beanFactory) {
UriEndpointMapping uriEndpointMapping = new UriEndpointMapping();
uriEndpointMapping.setUsePath(true);
Map<String, Object> endpointMap = new HashMap<>();
flowMetadataProvider
.flowMetadatas()
.forEach(
metadata -> {
String contextPath = serverProps.getServlet().getContextPath();
String soapPath = webServiceProps.getPath();
String serviceId = metadata.id();
String serviceVersion = metadata.version();
String basePath = contextPath + soapPath;
String endpointPath = String.join("/", basePath, serviceId, serviceVersion);
SimpleWebServiceInboundGateway inboundGateway = new SimpleWebServiceInboundGateway();
errorChannel.ifPresent(inboundGateway::setErrorChannel);
endpointMap.put(endpointPath, inboundGateway);
IntegrationFlowFactory flowFactory = beanFactory.getBean(metadata.flowFactoryClass());
IntegrationFlow integrationFlow =
IntegrationFlows.from(inboundGateway).gateway(flowFactory.createFlow()).get();
flowContext.registration(integrationFlow).register();
});
uriEndpointMapping.setEndpointMap(endpointMap);
return uriEndpointMapping;
}
SoapFlow.java (Integration Flow)
@Autowired private SoapFlowResolver soapFlowResolver;
@Autowired private CoreFlow delegate;
@Override
public IntegrationFlow createFlow() {
IntegrationFlow a =
flow -> flow.gateway(soapFlowResolver.resolveSoapFlow(delegate.createFlow()));
return a;
}
SoapFlowResolver.java (common class used by all integration flows to delegate the request to a CoreFlow that is responsible for the business logic implementation)
public IntegrationFlow resolveSoapFlow(
IntegrationFlow coreFlow) {
return flow -> {
flow.gateway(coreFlow);
};
}
CoreFlow.java (Class that handles the business logic)
@Override
public IntegrationFlow createFlow() {
return flow -> flow.logAndReply("Reached CoreFlow");
}
You are creating too many beans, and each new bean is checked against the rest to see whether it was created before. That's why startup time increases as you add more and more flows dynamically.
What I see is an abuse of the dynamic flows purpose. Each time we decide to go this way, we need to think twice about whether we definitely need the whole flow as a fresh instance. Again: a flow is not a volatile object; it registers a bunch of beans in the application context, and they stay there until you remove them. They are singletons, so they can be reused anywhere else in your application.
Another concern is that you are not taking advantage of the best feature of Spring Integration: the MessageChannel pattern. You can define some common flows in advance and connect your dynamic flows to them through a channel between them. You probably just need to create a SimpleWebServiceInboundGateway dynamically and wire it to the channel for your target logic, which is the same for all the flows.
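A rough sketch of that idea (Spring Integration 5.x DSL; channel and method names are illustrative assumptions, not from the original code):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.ws.SimpleWebServiceInboundGateway;

// One shared flow, registered once at startup, holding the common logic.
@Bean
public IntegrationFlow commonSoapFlow() {
    return IntegrationFlows.from("commonSoapChannel")
            .logAndReply("Reached CoreFlow");
}

// Per endpoint, only a lightweight gateway flow is registered dynamically;
// it hands the message off to the shared channel instead of duplicating
// the whole downstream flow for every endpoint.
private IntegrationFlow endpointFlow(SimpleWebServiceInboundGateway gateway) {
    return IntegrationFlows.from(gateway)
            .channel("commonSoapChannel")
            .get();
}
```

With this shape, adding an endpoint registers only a gateway and a bridge to the shared channel, so the per-flow registration cost stays roughly constant.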
I am using the following classes in one of my controllers (Spring Boot 1.5.12 release).
I am unable to find matching classes in the Spring Boot 2.1.9 release.
The following is the code snippet
import org.springframework.boot.actuate.endpoint.CachePublicMetrics;
import org.springframework.boot.actuate.metrics.Metric;
public class CachingController extends CloudRestTemplate {
@Autowired
private CachePublicMetrics metrics;
public #ResponseBody Map<String, Object> getData(#Pattern(regexp=Constants.STRING_VALID_PATTERN, message=Constants.STRING_INVALID_MSG) #PathVariable(required = true) final String name) throws Exception {
boolean success = false;
Map<String, Object> m = Maps.newHashMap();
Collection<Metric<?>> resp = new ArrayList<>();
Collection<Metric<?>> mets = metrics.metrics();
for (Iterator<Metric<?>> iterator = mets.iterator(); iterator.hasNext();) {
Metric<?> met = iterator.next();
String metName = met.getName();
logger.debug(metName+":"+met.getValue());
if(StringUtils.isNotEmpty(metName)
&& metName.indexOf(name) != -1 ){
resp.add(met);
}
}
    m.put("metrics", resp);
    return m;
}
I think you should take a deeper look into Spring Boot Actuator; once you expose all its endpoints you might find what you are looking for. Spring Boot provides a bunch of pre-defined endpoints; below is a list of Spring Boot Actuator endpoints (source):
/auditevents: Exposes audit events information for the current application.
/beans: Returns a list of all Spring beans in the application.
/caches: Gives information about the available caches.
/health: Provides application health information.
/conditions: Provides a list of the conditions that were evaluated during auto-configuration.
/configprops: Returns a list of application-level properties.
/info: Provides information about the current application. This info can be configured in a properties file.
/loggers: Displays logging configurations. Moreover, this endpoint can be used to modify the configurations.
/heapdump: Produces a heap dump file and returns it.
/metrics: Returns various metrics about the application, including memory, heap and thread info. Note that this endpoint doesn't return the metric values directly; it returns a list of available metric names, and a name can then be used in a separate request to fetch the respective details, for instance /actuator/metrics/jvm.memory.max.
/scheduledtasks: Returns a list of scheduled tasks in the application.
/httptrace: Returns the last 100 HTTP interactions in the form of request and response, including the actuator endpoints.
/mappings: A list of all HTTP request mappings. Also includes the actuator endpoints.
Edit:
As discussed in the comments, you need to access /actuator/metrics/jvm.memory.max. You can invoke the same using RestTemplate, or if you want to access it from Java you can explore the Actuator APIs. I wrote a quick program; you can refer to it:
@Autowired
private MetricsEndpoint metricsEndpoint;
public MetricResponse printJavaMaxMemMetrics() {
    ListNamesResponse listNames = metricsEndpoint.listNames();
    listNames.getNames().stream().forEach(name -> System.out.println(name));
    MetricResponse metric = metricsEndpoint.metric("jvm.memory.max", new ArrayList<>());
    System.out.println("metric (jvm.memory.max) -> " + metric);
    List<AvailableTag> availableTags = metric.getAvailableTags();
    availableTags.forEach(tag -> System.out.println(tag.getTag() + " : " + tag.getValues()));
    List<Sample> measurements = metric.getMeasurements();
    measurements.forEach(sample -> System.out.println(sample.getStatistic() + " : " + sample.getValue()));
    return metric;
}
I have a service that interacts with a couple of other services, so I created separate WebClients for them (because of different base paths). I had set timeouts for them individually based on https://docs.spring.io/spring/docs/5.1.6.RELEASE/spring-framework-reference/web-reactive.html#webflux-client-builder-reactor-timeout but that does not seem to be working effectively. For one of the services I tried lowering the ReadTimeout to 2 seconds, but the call doesn't seem to time out (the logs, using logging.level.org.springframework.web.reactive=debug, show that the request takes about 6-7 seconds to complete).
I am using Spring 5.1 and Reactor Netty 0.8. I am blocking on the WebClient calls, though, because we have not gone all-in with WebFlux yet. I tried playing around with the timeouts for each of the calls, and it seems like some calls do respond to the timeout while others do not (more details alongside the code below).
How I initialize webclients -
@Bean
public WebClient serviceAWebClient(@Value("${serviceA.basepath}") String basePath,
        @Value("${serviceA.connection.timeout}") int connectionTimeout,
        @Value("${serviceA.read.timeout}") int readTimeout,
        @Value("${serviceA.write.timeout}") int writeTimeout) {
    return getWebClientWithTimeout(basePath, connectionTimeout, readTimeout, writeTimeout);
}

@Bean
public WebClient serviceBWebClient(@Value("${serviceB.basepath}") String basePath,
        @Value("${serviceB.connection.timeout}") int connectionTimeout,
        @Value("${serviceB.read.timeout}") int readTimeout,
        @Value("${serviceB.write.timeout}") int writeTimeout) {
    return getWebClientWithTimeout(basePath, connectionTimeout, readTimeout, writeTimeout);
}

@Bean
public WebClient serviceCWebClient(@Value("${serviceC.basepath}") String basePath,
        @Value("${serviceC.connection.timeout}") int connectionTimeout,
        @Value("${serviceC.read.timeout}") int readTimeout,
        @Value("${serviceC.write.timeout}") int writeTimeout) {
    return getWebClientWithTimeout(basePath, connectionTimeout, readTimeout, writeTimeout);
}
private WebClient getWebClientWithTimeout(String basePath,
        int connectionTimeout,
        int readTimeout,
        int writeTimeout) {
    TcpClient tcpClient = TcpClient.create()
            .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectionTimeout)
            .doOnConnected(connection ->
                    connection.addHandlerLast(new ReadTimeoutHandler(readTimeout))
                            .addHandlerLast(new WriteTimeoutHandler(writeTimeout)));
    return WebClient.builder().baseUrl(basePath)
            .clientConnector(new ReactorClientHttpConnector(HttpClient.from(tcpClient))).build();
}
How I am essentially using this (have wrapper classes for each webclient) -
Mono<ResponseA> serviceACallMono = ..;
Mono<ResponseB> serviceBCallMono = ..;
Mono.zip(serviceACallMono,serviceBCallMono,
(serviceAResponse, serviceBResponse) -> serviceC.getImportantData(serviceAResponse,serviceBResponse))
.flatMap(Function.identity())
.block();
So in the above, I noticed the following -
If I lower the serviceA ReadTimeout , I do get the timeout error.
If I lower the serviceB ReadTimeout , I do get the timeout error.
If I lower the serviceC ReadTimeout, it DOES NOT respond to lowering the ReadTimeout. It just keeps on waiting until it gets a response.
So, am I missing something here? I was under the impression these timeouts should work in all scenarios. Please let me know if I can add anything more.
Edit : Update, so I sort of can reproduce the issue in a simpler manner.
So, for something like -
return serviceACallMono
.flatMap(notUsed -> serviceBCallMono);
The timeout of serviceACallMono is honored, but no matter how much you lower it for serviceB it doesn't timeout.
And if you just flip the order -
return serviceBCallMono
.flatMap(notUsed -> serviceACallMono);
Now the timeout for serviceB is honored but that for serviceA isn't.
I updated the service to return Mono as well while observing the behavior in this Edit.
Edit 2 :
This is essentially whats happening in ServiceC#getImportantData -
@Override
public Mono<ServiceCResponse> getImportantData(ServiceAResponse requestA,
ServiceBResponse requestB) {
return serviceCWebClient.post()
.uri(GET_IMPORTANT_DATA_PATH, requestB.getAccountId())
.body(BodyInserters.fromObject(formRequest(requestA)))
.retrieve()
.bodyToMono(ServiceC.class);
}
formRequest is a simple POJO transformation method.
I was using spring-boot-starter-parent to pull in the various Spring dependencies. Upgrading it from version 2.1.2 to 2.1.4 seems to resolve the issue.
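For reference, the version bump amounts to changing the parent entry in pom.xml (assuming a standard Maven build; the coordinates below are the stock Spring Boot parent):

```xml
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.4.RELEASE</version>
</parent>
```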
I have developed a Spring Boot microservice which will be used very heavily (close to 10k hits/second). I have used Feign as my REST client and CompletableFutures to hit other services to get data asynchronously.
In this one scenario, I am hitting another service for 4 different objects in one function using CompletableFutures.
My config is as follows :
Tomcat thread size :
server.tomcat.max-threads=250
Executor thread pool size :
private RejectedExecutionHandler rejectedExecutionHandler = new ThreadPoolExecutor.CallerRunsPolicy();
private ExecutorService executorService = new ThreadPoolExecutor(1000, 1000, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(), rejectedExecutionHandler);
My code that I'm using to hit the external service :
CompletableFuture<Integer> future1 = CompletableFuture.supplyAsync(() ->
        service1.function1(params), executorService);
CompletableFuture<List<Object>> future2 = CompletableFuture.supplyAsync(() ->
        service2.function2(params), executorService);
CompletableFuture<Object> future3 = CompletableFuture.supplyAsync(() ->
        service3.function3(params), executorService);
CompletableFuture<Object> future4 = CompletableFuture.supplyAsync(() ->
        service4.function4(params), executorService);
List<Object> response = Stream.of(future1, future2, future3, future4)
        .parallel()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());
These services are SpringBoot services which use Feign internally.
@Autowired
Service1(AppConfig config)
{
this.config=config;
this.currentService=connect(config);
}
@Bean
private Service1 connect(AppConfig config) {
Logger logger = new Logger.JavaLogger();
return Feign.builder()
.encoder(new GsonEncoder())
.decoder(new GsonDecoder())
.logger(logger)
.logLevel(feign.Logger.Level.FULL)
.target(Service1.class, config.getApiUrl());
}
On load testing this call using JMeter (which internally makes these 4 calls to the external service), I am able to achieve a rate of 200 hits/sec on 1 AWS instance. If I increase the load on that instance, I start getting RejectedExecutionException.
I tried the same with 10 AWS instances under a load balancer. The expectation was that it would scale linearly to 2000 hits/sec. However, I could only achieve 1400 hits/sec; any load increased beyond that caused the RejectedExecutionExceptions to come back.
What can be the solution here? Is there any tweak I can try here in the configurations?
Also, could Feign be the bottleneck over here? Should I try another client like Retrofit?