Multi-instance verticle in vert.x is thread safe - thread-safety

I understand that in Vert.x a standard verticle always runs on the same event loop, so we don't need to write thread-safe code in our handlers.
For example, if I have a verticle running an HttpServer:
public class HttpServerVerticle extends AbstractVerticle {
@Override
public void start() throws Exception {
vertx.createHttpServer().requestHandler(req -> {
req.response().putHeader("content-type", "text/html").end(
"<html><body><h1>Hello from vert.x!</h1></body></html>");
}).listen(8080);
}
}
It's guaranteed that at no point in time, my request handler will be called twice (for 2 different requests) on 2 event loops. Therefore I don't have to take care of thread safety in my request handler.
Now if I'm running multiple instances of my HttpServer verticle -
DeploymentOptions deploymentOptions = new
DeploymentOptions().setWorker(false).setInstances(10);
vertx.deployVerticle("com.....HttpServerVerticle", deploymentOptions);
Do I need to take care of thread safety? Is it possible that multiple request handlers (at most 10) will be running in parallel?

In this case what you get is 10 verticles, and HTTP requests will be dispatched in a round-robin fashion among these 10 verticles. Each verticle instance is assigned to an event loop, so within an instance you keep the same thread-safety guarantees.

From https://vertx.io/docs/4.2.0/vertx-core/java/#_specifying_number_of_verticle_instances:
When deploying a verticle using a verticle name, you can specify the number of verticle instances that you want to deploy. This is useful for scaling easily across multiple cores. For example you might have a web-server verticle to deploy and multiple cores on your machine, so you want to deploy multiple instances to utilise all the cores.
I confirmed this through experiments: in a multi-instance deployment of a verticle, all instances share the same deployment ID, and each instance is indeed thread safe.
DEMO code:
public class Server extends AbstractVerticle {
private static final Logger LOGGER = LoggerFactory.getLogger(Server.class);
private int counter = 0;
@Override
public void start(Promise<Void> startPromise) throws Exception {
vertx.createHttpServer().requestHandler(request -> {
LOGGER.info("Request #{} from {}, verticleId: {}, hashcode: {}", ++counter, request.remoteAddress().host(), vertx.getOrCreateContext().deploymentID(), this.hashCode());
request.response().end("Hello!");
}).listen(8090).onComplete(server -> {
if (server.succeeded()){
startPromise.complete();
}else {
startPromise.fail(server.cause());
}
});
}
public static void main(String[] args) {
Vertx vertx = Vertx.vertx();
vertx.deployVerticle(Server.class, new DeploymentOptions().setInstances(4));
}
}
public class Client {
public static void main(String[] args) {
Vertx vertx = Vertx.vertx();
WebClient webClient = WebClient.create(vertx);
for (int i = 0; i < 100; i++) {
webClient.getAbs("http://localhost:8090").send().onSuccess(response -> System.out.println(response.bodyAsString())).onFailure(Throwable::printStackTrace);
}
}
}
However, this does not mean that any verticle can safely be deployed as multiple instances. A multi-instance deployment creates multiple verticle objects, so if your verticle is stateful, the state of each instance is independent of the others. In other words, if you need a single consistent, shared state (like counter in the demo) across instances, keeping it in a verticle field will not work with a multi-instance deployment. Instead, you can use the shared data facilities provided by Vert.x to solve this problem.
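To make the difference between per-instance and shared state concrete, here is a minimal JDK-only sketch. It is a model, not Vert.x code: each single-thread executor stands in for the event loop of one verticle instance, and an AtomicLong stands in for a shared counter of the kind you would obtain through Vert.x shared data. The names InstanceStateSketch, Instance and run are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class InstanceStateSketch {

    /** Stand-in for one verticle instance: a plain field touched only by its own thread. */
    static final class Instance {
        final ExecutorService loop = Executors.newSingleThreadExecutor();
        int localCounter = 0; // per-instance state, needs no synchronization
    }

    /**
     * Dispatches requests round-robin over the instances and returns
     * {shared total, largest per-instance counter}.
     */
    static long[] run(int instances, int requests) {
        AtomicLong shared = new AtomicLong(); // state visible to every instance
        List<Instance> pool = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            pool.add(new Instance());
        }
        for (int r = 0; r < requests; r++) {
            Instance target = pool.get(r % instances);
            target.loop.execute(() -> {
                target.localCounter++;    // only ever runs on target's own thread
                shared.incrementAndGet(); // safe from any thread
            });
        }
        long maxLocal = 0;
        try {
            for (Instance instance : pool) {
                instance.loop.shutdown();
                instance.loop.awaitTermination(10, TimeUnit.SECONDS);
                maxLocal = Math.max(maxLocal, instance.localCounter);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] {shared.get(), maxLocal};
    }

    public static void main(String[] args) {
        long[] result = run(4, 100);
        System.out.println("shared total = " + result[0]
            + ", largest per-instance counter = " + result[1]);
    }
}
```

With 4 instances and 100 requests, each per-instance counter ends at 25 while the shared counter reaches 100.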

Related

Why I can't see the performance difference between Spring Boot and Vert.x

One advantage of Vert.x is its performance, but I can't see any difference in my testing. Does anyone know why? The test simply responds with hello.
I have also performed a test that requests Google (an async request in Vert.x) and then prints the response. It also shows that the 2 frameworks have the same performance.
Vert.x Code:
public class MainVerticle extends AbstractVerticle {
static String HELLO = "hello";
@Override
public void start(Promise<Void> startPromise) throws Exception {
vertx.createHttpServer().requestHandler(req -> {
req.response()
.putHeader("content-type", "text/plain")
.end(HELLO );
}).listen(8888, http -> {
if (http.succeeded()) {
startPromise.complete();
System.out.println("HTTP server started on port 8888");
} else {
startPromise.fail(http.cause());
}
});
}
}
Spring code:
@SpringBootApplication
@RestController
public class DemoApplication {
static String HELLO = "hello";
public static void main(String[] args) {
SpringApplication.run(DemoApplication.class, args);
}
@GetMapping("/")
public String hei(){
return HELLO;
}
}
Apache Benchmark(calling from another machine):
ab -n 50000 -c 10 http://192.168.1.115:8888/
Simply put, you don't see performance benefits because you're testing the wrong things.
Vert.x, and asynchronous frameworks in general, are great for IO-bound operations. And it so happens that most real-world applications are IO bound (waiting for DB, disk, other services, etc).
Your application doesn't perform any significant IO, though. So, that's one reason you don't see a difference.
Another reason is the concurrency level you are using. Spring applications are bound by their thread pool size, but my guess is it's bigger than 10 threads, so you aren't really stressing your application.
The third reason is that you are most likely running this test on the same machine as your server. This is flawed, because your Vert.x application will compete for resources with ab.

How to process multiple AMQP messages in parallel with the same @Incoming method

Is it possible to process multiple AMQP messages in parallel with the same method annotated with @Incoming("queue") with Quarkus and smallrye-reactive-messaging?
To be more precise, I have following class:
@ApplicationScoped
public class Receiver {
@Incoming("test-queue")
public void process(String input) {
System.out.println("start processing:" + input);
try {
Thread.sleep(10_000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("end processing:" + input);
}
}
With the configuration in the application.properties:
amqp-host: localhost
amqp-port: 5672
amqp-username: quarkus
amqp-password: quarkus
mp.messaging.incoming.test-queue.connector: smallrye-amqp
mp.messaging.incoming.test-queue.address: test-queue
Now I'd like to define by configuration how many messages can be processed in parallel. For example, on a 4-core CPU it should run 4 in parallel.
Currently I can just add 4 copies of the method with different names to allow this parallelism, but that is not configurable.
I'm not sure, but I don't think Reactive Messaging supports what you're asking for.
You can, however, do what you want another way. I think it's also a better overall pattern for using messaging.
http://smallrye.io/smallrye-reactive-messaging/smallrye-reactive-messaging/2.5/amqp/amqp.html#amqp-inbound
Find the example with the CompletionStage and the explicit ack(). That variant is asynchronous, so if you combine it with Java's existing concurrency facilities, you'll get efficient parallel processing.
I would send the incoming work to an executor, and then have the executing task ack() when it completes.
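A minimal sketch of that idea using only the JDK, assuming the connector hands you a payload and some way to acknowledge it. The ack is modelled here as a plain Runnable, and ParallelAckSketch, process and demo are hypothetical names rather than anything from Reactive Messaging. The returned stage is only used by the demo to wait for completion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelAckSketch {
    private final ExecutorService workers;
    private final List<String> acked = new CopyOnWriteArrayList<>();

    ParallelAckSketch(int parallelism) {
        // The pool size is the configurable degree of parallelism.
        this.workers = Executors.newFixedThreadPool(parallelism);
    }

    /** Runs the work on the pool and acknowledges only after it succeeded. */
    CompletionStage<Void> process(String payload, Runnable ack) {
        return CompletableFuture
            .runAsync(() -> handle(payload), workers)
            .thenRun(ack); // skipped if handle() throws, so a failed message stays unacked
    }

    private void handle(String payload) {
        // placeholder for the slow, blocking processing
    }

    /** Feeds all payloads through process() and returns the ones that were acked. */
    static List<String> demo(int parallelism, List<String> payloads) {
        ParallelAckSketch sketch = new ParallelAckSketch(parallelism);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (String payload : payloads) {
            Runnable ack = () -> sketch.acked.add(payload); // stand-in for message.ack()
            pending.add(sketch.process(payload, ack).toCompletableFuture());
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0])).join();
        sketch.workers.shutdown();
        return sketch.acked;
    }

    public static void main(String[] args) {
        System.out.println(demo(4, Arrays.asList("a", "b", "c", "d", "e")).size() + " messages acked");
    }
}
```

Because the ack runs in thenRun, a message whose processing throws is never acknowledged in this sketch; whether that leads to redelivery depends on the broker and connector settings.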
I just came across the same scenario and here is how the spec intends for you to handle concurrency:
From the Eclipse MicroProfile spec
Basically, instead of having a class with a method like this:
@Incoming("test-queue")
public void process(String input) {}
You have 2 classes like this:
@ApplicationScoped
public class MessageSubscriberProducer {
@Incoming("test-queue")
public Subscriber<String> createSubscriber() {
return new SubscriberImpl();
}
}
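public class SubscriberImpl implements Subscriber<String> {
private Subscription subscription;
@Override
public void onSubscribe(Subscription subscription) {
this.subscription = subscription;
this.subscription.request(4); // this tells how many messages to grab right away
}
@Override
public void onNext(String val) {
// do processing
this.subscription.request(1); // grab 1 more
}
@Override
public void onError(Throwable t) {
// required by the Subscriber interface: handle a failed stream here
}
@Override
public void onComplete() {
// required by the Subscriber interface: the stream has ended
}
}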
This has the additional advantage of moving your processing code from the vert.x event-loop thread to a worker thread pool.
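The request(n) bookkeeping can be tried out with the JDK's java.util.concurrent.Flow types, which mirror the Reactive Streams interfaces. The sketch below is not the SmallRye wiring: SubmissionPublisher stands in for the queue, the worker pool is an assumption added so that items really are processed in parallel, and all class and method names are hypothetical. It requests a window of items up front and one more each time an item finishes, so at most `window` items are in progress at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubscriberSketch implements Flow.Subscriber<String> {
    private final int window;
    private final ExecutorService workers;
    private final CountDownLatch done;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxInFlight = new AtomicInteger();
    private final AtomicInteger processed = new AtomicInteger();
    private Flow.Subscription subscription;

    BoundedSubscriberSketch(int window, int expectedItems) {
        this.window = window;
        this.workers = Executors.newFixedThreadPool(window);
        this.done = new CountDownLatch(expectedItems);
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(window); // initial demand: the size of the window
    }

    @Override
    public void onNext(String item) {
        // Hand the item to a worker so onNext returns immediately.
        workers.execute(() -> {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            processed.incrementAndGet(); // placeholder for the real work
            inFlight.decrementAndGet();
            subscription.request(1);     // one item finished, ask for one more
            done.countDown();
        });
    }

    @Override
    public void onError(Throwable t) {
        t.printStackTrace();
    }

    @Override
    public void onComplete() {
        // nothing to do in this sketch
    }

    /** Publishes the given number of items; returns {processed, peak in-flight}. */
    static int[] demo(int window, int items) {
        BoundedSubscriberSketch subscriber = new BoundedSubscriberSketch(window, items);
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < items; i++) {
                publisher.submit("msg-" + i);
            }
        }
        try {
            subscriber.done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            subscriber.workers.shutdown();
        }
        return new int[] {subscriber.processed.get(), subscriber.maxInFlight.get()};
    }
}
```

Calling demo(4, 20) processes all 20 items while the peak number of items in progress never exceeds 4. Note that if the processing were done directly inside onNext instead of on the pool, the items would be handled one after another regardless of the window size.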

Vert.x: how to process HttpRequest with a blocking operation

I've just started with Vert.x and would like to understand what is the right way of handling potentially long (blocking) operations as part of processing a REST HttpRequest. The application itself is a Spring app.
Here is a simplified REST service I have so far:
public class MainApp {
// instantiated by Spring
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
Vertx.vertx().deployVerticle(alertsRestService);
}
}
public class AlertsRestService extends AbstractVerticle {
// instantiated by Spring
private PostgresService pgService;
@Value("${rest.endpoint.port:8080}")
private int restEndpointPort;
@Override
public void start(Future<Void> futureStartResult) {
HttpServer server = vertx.createHttpServer();
Router router = Router.router(vertx);
//enable reading of the request body for all routes
router.route().handler(BodyHandler.create());
router.route(HttpMethod.GET, "/allDefinitions")
.handler(this::handleGetAllDefinitions);
server.requestHandler(router)
.listen(restEndpointPort,
result -> {
if (result.succeeded()) {
futureStartResult.complete();
} else {
futureStartResult.fail(result.cause());
}
}
);
}
private void handleGetAllDefinitions(RoutingContext routingContext) {
HttpServerResponse response = routingContext.response();
Collection<AlertDefinition> allDefinitions;
try {
allDefinitions = pgService.getAllDefinitions();
} catch (Exception e) {
response.setStatusCode(500).end(e.getMessage());
return; // the response has already been ended, so stop here
}
response.putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(allDefinitions));
}
}
Spring config:
<bean id="alertsRestService" class="com.my.AlertsRestService"
p:pgService-ref="postgresService"
p:restEndpointPort="${rest.endpoint.port}"
/>
<bean id="mainApp" class="com.my.MainApp"
p:alertsRestService-ref="alertsRestService"
/>
Now the question is: how to properly handle the (blocking) call to my postgresService, which may take longer time if there are many items to get/return ?
After researching and looking at some examples, I see a few ways to do it, but I don't fully understand differences between them:
Option 1. convert my AlertsRestService into a Worker Verticle and use the worker thread pool:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions().setWorker(true);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
What confuses me here is this statement from the Vert.x docs: "Worker verticle instances are never executed concurrently by Vert.x by more than one thread, but can [be] executed by different threads at different times"
Does it mean that all HTTP requests to my alertsRestService are going to be, effectively, throttled to be executed sequentially, by one thread at a time? That's not what I would like: this service is purely stateless and should be able to handle concurrent requests just fine ....
So, maybe I need to look at the next option:
Option 2. convert my service to be a multi-threaded Worker Verticle, by doing something similar to the example in the docs:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions()
.setWorker(true)
.setInstances(5) // matches the worker pool size below
.setWorkerPoolName("the-specific-pool")
.setWorkerPoolSize(5);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
So, in this example - what exactly will be happening? As I understand, ".setInstances(5)" directive means that 5 instances of my 'alertsRestService' will be created. I configured this service as a Spring bean, with its dependencies wired in by the Spring framework. However, in this case, it seems to me the 5 instances are not going to be created by Spring, but rather by Vert.x - is that true? and how could I change that to use Spring instead?
Option 3. use the 'blockingHandler' for routing. The only change in the code would be in the AlertsRestService.start() method in how I define a handler for the router:
boolean ordered = false;
router.route(HttpMethod.GET, "/allDefinitions")
.blockingHandler(this::handleGetAllDefinitions, ordered);
As I understand, setting the 'ordered' parameter to FALSE means that the handler can be called concurrently. Does it mean this option is equivalent to Option #2 with multi-threaded Worker Verticles?
What is the difference? that the async multi-threaded execution pertains to the one specific HTTP request only (the one for the /allDefinitions path) as opposed to the whole AlertsRestService Verticle?
Option 4. and the last option I found is to use the 'executeBlocking()' directive explicitly to run only the enclosed code in worker threads. I could not find many examples of how to do this with HTTP request handling, so below is my attempt - maybe incorrect. The difference here is only in the implementation of the handler method, handleGetAllAlertDefinitions() - but it is rather involved... :
private void handleGetAllAlertDefinitions(RoutingContext routingContext) {
vertx.executeBlocking(
fut -> { fut.complete( sendAsyncRequestToDB(routingContext)); },
false,
res -> { handleAsyncResponse(res, routingContext); }
);
}
public Collection<AlertDefinition> sendAsyncRequestToDB(RoutingContext routingContext) {
Collection<AlertDefinition> allAlertDefinitions = new LinkedList<>();
try {
// keep the DAO result so it can be returned to the caller
allAlertDefinitions = alertDefinitionsDao.getAllAlertDefinitions();
} catch (Exception e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
return allAlertDefinitions;
}
private void handleAsyncResponse(AsyncResult<Object> asyncResult, RoutingContext routingContext){
if(asyncResult.succeeded()){
try {
routingContext.response().putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(asyncResult.result()));
} catch(EncodeException e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
} else {
routingContext.response().setStatusCode(500)
.end(asyncResult.cause().getMessage());
}
}
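The thread hand-off that Option 4 relies on can be modelled with the JDK alone. This is a sketch, not the Vert.x API: a single-thread executor named "event-loop" stands in for the verticle's event loop, a fixed pool named "worker" stands in for the worker pool, and OffloadSketch and demo are hypothetical names. The blocking call runs on a worker thread and the result is handed back to the event-loop thread, which is the only thread that touches the response.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadSketch {

    /** Returns {thread that ran the blocking call, thread that wrote the response}. */
    static String[] demo() {
        ExecutorService eventLoop =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers =
            Executors.newFixedThreadPool(4, r -> new Thread(r, "worker"));
        String[] seen = new String[2];
        try {
            CompletableFuture
                // Blocking part: allowed to take a long time, runs off the event loop.
                .supplyAsync(() -> {
                    seen[0] = Thread.currentThread().getName();
                    return "rows from the database"; // placeholder for the DAO call
                }, workers)
                // Result handling: back on the single event-loop thread.
                .thenAcceptAsync(rows -> {
                    seen[1] = Thread.currentThread().getName();
                }, eventLoop)
                .join();
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] threads = demo();
        System.out.println("blocking call on: " + threads[0]);
        System.out.println("response written on: " + threads[1]);
    }
}
```

Several such calls can be in flight at once, up to the size of the worker pool, while the event-loop thread stays free to accept further requests.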
How is this different from the other options? And does Option 4 provide concurrent execution of the handler, or is it single-threaded like Option 1?
Finally, coming back to the original question: what is the most appropriate Option for handling longer-running operations when handling REST requests?
Sorry for such a long post.... :)
Thank you!
That's a big question, and I'm not sure I'll be able to address it fully. But let's try:
In Option #1 what it actually means is that you shouldn't use ThreadLocal in your worker verticles, if you use more than one worker of the same type. Using only one worker means that your requests will be serialised.
Option #2 is simply incorrect. You cannot use setInstances with an instance of a class, only with its name. You're correct, though, that if you choose to use the name of the class, Vert.x will instantiate the verticles.
Option #3 is less concurrent than using Workers, and shouldn't be used.
Option #4 executeBlocking is basically doing Option #3, and is also quite bad.

Spring Boot with CXF Client Race Condition/Connection Timeout

I have a CXF client configured in my Spring Boot app like so:
@Bean
public ConsumerSupportService consumerSupportService() {
JaxWsProxyFactoryBean jaxWsProxyFactoryBean = new JaxWsProxyFactoryBean();
jaxWsProxyFactoryBean.setServiceClass(ConsumerSupportService.class);
jaxWsProxyFactoryBean.setAddress("https://www.someservice.com/service?wsdl");
jaxWsProxyFactoryBean.setBindingId(SOAPBinding.SOAP12HTTP_BINDING);
WSAddressingFeature wsAddressingFeature = new WSAddressingFeature();
wsAddressingFeature.setAddressingRequired(true);
jaxWsProxyFactoryBean.getFeatures().add(wsAddressingFeature);
ConsumerSupportService service = (ConsumerSupportService) jaxWsProxyFactoryBean.create();
Client client = ClientProxy.getClient(service);
AddressingProperties addressingProperties = new AddressingProperties();
AttributedURIType to = new AttributedURIType();
to.setValue(applicationProperties.getWex().getServices().getConsumersupport().getTo());
addressingProperties.setTo(to);
AttributedURIType action = new AttributedURIType();
action.setValue("http://serviceaction/SearchConsumer");
addressingProperties.setAction(action);
client.getRequestContext().put("javax.xml.ws.addressing.context", addressingProperties);
setClientTimeout(client);
return service;
}
private void setClientTimeout(Client client) {
HTTPConduit conduit = (HTTPConduit) client.getConduit();
HTTPClientPolicy policy = new HTTPClientPolicy();
policy.setConnectionTimeout(applicationProperties.getWex().getServices().getClient().getConnectionTimeout());
policy.setReceiveTimeout(applicationProperties.getWex().getServices().getClient().getReceiveTimeout());
conduit.setClient(policy);
}
This same service bean is accessed by two different threads in the same application sequence. If I execute this particular sequence 10 times in a row, I will get a connection timeout from the service call at least 3 times. What I'm seeing is:
Caused by: java.io.IOException: Timed out waiting for response to operation {http://theservice.com}SearchConsumer.
at org.apache.cxf.endpoint.ClientImpl.waitResponse(ClientImpl.java:685) ~[cxf-core-3.2.0.jar:3.2.0]
at org.apache.cxf.endpoint.ClientImpl.processResult(ClientImpl.java:608) ~[cxf-core-3.2.0.jar:3.2.0]
If I change the sequence such that one of the threads does not call this service, then the error goes away. So, it seems like there's some sort of a race condition happening here. If I look at the logs in our proxy manager for this service, I can see that both of the service calls do return a response very quickly, but the second service call seems to get stuck somewhere in the code and never actually lets go of the connection until the timeout value is reached. I've been trying to track down the cause of this for quite a while, but have been unsuccessful.
I've read some mixed opinions as to whether or not CXF client proxies are thread-safe, but I was under the impression that they were. If this is actually not the case, should I be creating a new client proxy for each invocation, or use a pool of proxies?
Turns out that it is an issue with the proxy not being thread-safe. What I wound up doing was leveraging a solution kind of like one posted at the bottom of this post: Is this JAX-WS client call thread safe? - I created a pool for the proxies and I use that to access proxies from multiple threads in a thread-safe manner. This seems to work out pretty well.
public class JaxWSServiceProxyPool<T> extends GenericObjectPool<T> {
JaxWSServiceProxyPool(Supplier<T> factory, GenericObjectPoolConfig poolConfig) {
super(new BasePooledObjectFactory<T>() {
@Override
public T create() throws Exception {
return factory.get();
}
@Override
public PooledObject<T> wrap(T t) {
return new DefaultPooledObject<>(t);
}
}, poolConfig != null ? poolConfig : new GenericObjectPoolConfig());
}
}
I then created a simple "registry" class to keep references to various pools.
@Component
public class JaxWSServiceProxyPoolRegistry {
private static final Map<Class, JaxWSServiceProxyPool> registry = new HashMap<>();
public synchronized <T> void register(Class<T> serviceTypeClass, Supplier<T> factory, GenericObjectPoolConfig poolConfig) {
Assert.notNull(serviceTypeClass);
Assert.notNull(factory);
if (!registry.containsKey(serviceTypeClass)) {
registry.put(serviceTypeClass, new JaxWSServiceProxyPool<>(factory, poolConfig));
}
}
public <T> void register(Class<T> serviceTypeClass, Supplier<T> factory) {
register(serviceTypeClass, factory, null);
}
@SuppressWarnings("unchecked")
public <T> JaxWSServiceProxyPool<T> getServiceProxyPool(Class<T> serviceTypeClass) {
Assert.notNull(serviceTypeClass);
return registry.get(serviceTypeClass);
}
}
To use it, I did:
JaxWSServiceProxyPoolRegistry jaxWSServiceProxyPoolRegistry = new JaxWSServiceProxyPoolRegistry();
jaxWSServiceProxyPoolRegistry.register(ConsumerSupportService.class,
this::buildConsumerSupportServiceClient,
getConsumerSupportServicePoolConfig());
Where buildConsumerSupportServiceClient uses a JaxWsProxyFactoryBean to build up the client.
To retrieve an instance from the pool I inject my registry class and then do:
JaxWSServiceProxyPool<ConsumerSupportService> consumerSupportServiceJaxWSServiceProxyPool = jaxWSServiceProxyPoolRegistry.getServiceProxyPool(ConsumerSupportService.class);
And then borrow/return the object from/to the pool as necessary.
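The borrow/return step is where mistakes creep in, so here is a minimal sketch of it. To stay self-contained it uses a JDK BlockingQueue in place of Commons Pool's GenericObjectPool, and BorrowReturnSketch, withProxy and idleCount are hypothetical names. The point it illustrates is returning the proxy in a finally block, so that a failed service call does not leak it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

public class BorrowReturnSketch<T> {
    private final BlockingQueue<T> idle;

    BorrowReturnSketch(int size, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pre-create the proxies
        }
    }

    /** Borrows a proxy, runs the call with it, and always gives the proxy back. */
    <R> R withProxy(Function<T, R> call) {
        T proxy;
        try {
            proxy = idle.take(); // blocks while every proxy is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        try {
            return call.apply(proxy); // only this thread holds the proxy here
        } finally {
            idle.add(proxy); // returned even if the call throws
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        BorrowReturnSketch<StringBuilder> pool = new BorrowReturnSketch<>(2, StringBuilder::new);
        System.out.println(pool.withProxy(sb -> sb.append("called").toString()));
        System.out.println("idle proxies: " + pool.idleCount());
    }
}
```

With GenericObjectPool the same shape applies: borrowObject() before the call and returnObject() in the finally block.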
This seems to work well so far. I've executed some fairly heavy load tests against it and it's held up.

How to close a database connection opened by an IBackingMap implementation within a Storm Trident topology?

I'm implementing an IBackingMap for my Trident topology to store tuples to ElasticSearch (I know there are several implementations for Trident/ElasticSearch integration already existing at GitHub however I've decided to implement a custom one which suits my task better).
So my implementation is a classic one with a factory:
public class ElasticSearchBackingMap implements IBackingMap<OpaqueValue<BatchAggregationResult>> {
// omitting here some other cool stuff...
private final Client client;
public static StateFactory getFactoryFor(final String host, final int port, final String clusterName) {
return new StateFactory() {
@Override
public State makeState(Map conf, IMetricsContext metrics, int partitionIndex, int numPartitions) {
ElasticSearchBackingMap esbm = new ElasticSearchBackingMap(host, port, clusterName);
CachedMap cm = new CachedMap(esbm, LOCAL_CACHE_SIZE);
MapState ms = OpaqueMap.build(cm);
return new SnapshottableMap(ms, new Values(GLOBAL_KEY));
}
};
}
public ElasticSearchBackingMap(String host, int port, String clusterName) {
Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", clusterName).build();
// TODO add a possibility to close the client
client = new TransportClient(settings)
.addTransportAddress(new InetSocketTransportAddress(host, port));
}
// the actual implementation is left out
}
You see it gets host/port/cluster name as input params and creates an ElasticSearch client as a member of the class BUT IT NEVER CLOSES THE CLIENT.
It is then used from within a topology in a pretty familiar way:
tridentTopology.newStream("spout", spout)
// ...some processing steps here...
.groupBy(aggregationFields)
.persistentAggregate(
ElasticSearchBackingMap.getFactoryFor(
ElasticSearchConfig.ES_HOST,
ElasticSearchConfig.ES_PORT,
ElasticSearchConfig.ES_CLUSTER_NAME
),
new Fields(FieldNames.OUTCOME),
new BatchAggregator(),
new Fields(FieldNames.AGGREGATED));
This topology is wrapped into some public static void main, packed in a jar and sent to Storm for execution.
The question is, should I worry about closing the ElasticSearch connection, or is it Storm's own business? If it is not done by Storm, how and when in the topology's lifecycle should I do that?
Thanks in advance!
Okay, answering my own question.
First of all, thanks again @dedek for suggestions and reviving the ticket in Storm's Jira.
Finally, since there's no official way to do that, I've decided to go for cleanup() method of Trident's Filter. So far I've verified the following (for Storm v. 0.9.4):
With LocalCluster:
- cleanup() gets called on the cluster's shutdown
- cleanup() DOESN'T get called when killing the topology; this shouldn't be a tragedy, as one very likely won't use LocalCluster for real deployments anyway
With a real cluster:
- it gets called when the topology is killed, as well as when the worker is stopped using pkill -TERM -u storm -f 'backtype.storm.daemon.worker'
- it doesn't get called if the worker is killed with kill -9, or when it crashes, or (sadly) when the worker dies due to an exception
Overall, that gives a more or less decent guarantee that cleanup() gets called, provided you are careful with exception handling (I tend to add 'thundercatches' to each of my Trident primitives anyway).
My code:
public class CloseFilter implements Filter {
private static final Logger LOG = LoggerFactory.getLogger(CloseFilter.class);
private final Closeable[] closeables;
public CloseFilter(Closeable... closeables) {
this.closeables = closeables;
}
@Override
public boolean isKeep(TridentTuple tuple) {
return true;
}
@Override
public void prepare(Map conf, TridentOperationContext context) {
}
@Override
public void cleanup() {
for (Closeable c : closeables) {
try {
c.close();
} catch (Exception e) {
LOG.warn("Failed to close an instance of {}", c.getClass(), e);
}
}
}
}
However, it would be nice if hooks for closing connections became part of the API some day.

Resources