Vertx worker verticle pool for jdbc - jdbc

I'm new to Vert.x and I would like to implement a pool of worker verticles to make database queries using BoneCP. However, I'm a little bit confused about how to 'call' them to work and how to share the BoneCP connection pool between them.
I saw in Vertx DeploymentManager source that the start(Future) method is called synchronously and then the verticle is kept in memory until undeployed. After the start method completes, what's the correct way of calling methods on the worker verticle? If I deploy many instances of the verticle (using DeploymentOptions.setInstances()), will Vertx do load balancing between them?
I saw that Vert.x comes with a JDBC client and a worker pool, but it has limited datatypes I can work with because it uses the EventBus and serializes all data returned by the database. I need to work with many different datatypes (including dates, BigDecimals and binary objects) and I would like to avoid serialization as much as possible, but instead make queries in the worker verticle, process the results and return an object via a Future or AsyncResult (I believe this is done on-heap, so no serialization needed; is this correct?).
Please help me sort out all these questions :) I would really appreciate examples of how I can make this work!
Thanks!

I'll try to answer your questions one by one.
how to 'call' them to work
You call your worker verticles using the EventBus. That's the proper way to communicate between them. Please see this example:
https://github.com/vert-x3/vertx-examples/blob/master/core-examples/src/main/java/io/vertx/example/core/verticle/worker/MainVerticle.java#L27
how to share the BoneCP connection pool between them.
Don't. Instead, create a small connection pool for each. Otherwise, it will cause unexpected behavior.
config.setMinConnectionsPerPartition(1);
config.setMaxConnectionsPerPartition(5);
config.setPartitionCount(1);
will Vertx do load balancing between them
No, not when you call methods on a verticle reference directly. That's the reason @Jochen Bedersdorfer and I suggest using the EventBus. You can hold a reference to your worker verticle, as you suggested, but then you're stuck with a 1:1 configuration.
return an object via a Future or AsyncResult (I believe this is done on-heap, so no serialization needed; is this correct?)
This is correct. But again, you're stuck with a 1:1 mapping then, which is a lot worse in terms of performance than serialization (which uses buffers).
If you still do need something like that, maybe you shouldn't use worker verticles at all, but something like .executeBlocking:
https://github.com/vert-x3/vertx-examples/blob/master/core-examples/src/main/java/io/vertx/example/core/execblocking/ExecBlockingExample.java#L25

In your start(...) method, register event listeners with event bus as this is how you interact with verticles (worker or not).
Yes, if you deploy many instances, Vert.x will use round-robin to send messages to those instances.
For what you describe, Vert.x might not be the best fit, since it works best with asynchronous I/O.
You might be better off using standard Java concurrency tools to manage the load, i.e. Executor and friends.
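The "Executor and friends" suggestion could look roughly like the sketch below. It uses only the JDK, not Vert.x, and the class name BlockingQueryPool and the method fakeBalanceQuery are hypothetical placeholders; a real version would borrow a connection from the pool and run the JDBC query inside the supplier. The point it illustrates is that the result object (here a BigDecimal) is handed back by reference through the future, with no serialization step.

```java
import java.math.BigDecimal;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper: a fixed pool of threads dedicated to blocking database calls.
class BlockingQueryPool implements AutoCloseable {
    private final ExecutorService workers;

    BlockingQueryPool(int threads) {
        this.workers = Executors.newFixedThreadPool(threads);
    }

    /** Runs a blocking task on the pool and returns a future holding the result object itself. */
    <T> CompletableFuture<T> submit(Supplier<T> blockingQuery) {
        return CompletableFuture.supplyAsync(blockingQuery, workers);
    }

    /** Stops accepting new tasks; tasks already submitted still run to completion. */
    @Override
    public void close() {
        workers.shutdown();
    }

    // Placeholder for a real JDBC query (assumption: real code would use a pooled connection here).
    static BigDecimal fakeBalanceQuery() {
        return new BigDecimal("1234.56");
    }
}
```

A caller would create one pool, call pool.submit(BlockingQueryPool::fakeBalanceQuery) and attach a callback (for example with thenAccept) instead of blocking on the result.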

Related

Spring Boot Kafka: Consume same message with all instances for specific topic

I have a spring boot application (let's say it's called app-1) that is connected to a kafka cluster and that consumes from a specific topic, let's say the topic is called "foo". Topic foo always receives a message when another application (let's say it's called app-2) has imported a new foo-item into the database.
The topic is primarily meant to be used in a third application (let's say it's called app-3) which sends out some e-Mail notification to people that may be interested in this new foo-item. App-3 is clustered, meaning there are multiple instances of it running at the same time. Kafka automatically balances the foo-topic messages between all these instances because they use the same consumer-id. This is good and in the case of app-3 it is actually desired.
In the case of app-2, however, the messages from the foo-topic are used for cache eviction. The logic is, basically, that if there is a new foo-item then the currently existing caches should probably be cleared, because their content depends on the foo-items. The issue is that app-2 is also clustered, which means that by default kafka-logic, every instance will only receive some of the messages sent to the foo-topic. This does not work correctly for this specific app though, because whenever there is a new foo-item, all of the instances need to know about it, since all of them need to clear their local caches.
From what I understand I have these two options if I want to keep the current logic:
Introduce a distributed cache for all instances of app-2 so that they all share the same cache. Then it does not matter if only one instance receives a foo-item, because the cache eviction will also affect the cache of the other instances; even though they never learned about the foo-item. I would like to avoid this solution, as a distributed cache would add a noticeable amount of complexity and also overhead.
Somehow manage to use a different consumer-id for each instance of app-2. Then they would be considered different consumers by kafka and they all would get each foo-topic message. However, I don't even know how to programmatically do this. The code of the application is not aware of replicated instances, there is no way to access any information about what node it is. If I use a randomly generated string on startup, then each time such instance restarts it would be considered a new consumer and would have to re-process all previous messages. That would be incorrect behavior as well.
Here is my bottom line question: Is it possible to make all instances of app-2 receive all messages from the foo-topic without completely breaking the way kafka is supposed to work? I know that it is probably very unconventional to use kafka-messages for cache eviction and I am entirely able to find an alternative mechanism for the cache eviction logic that does not depend on kafka-topic messages. However, the applications are for demonstration purposes and I thought it would be cool if more than one app read from this topic. But if I end up having to hack a dirty workaround to make it work then it's also bad for demonstration purposes and I would rather implement an alternative way of cache eviction.
As you mentioned, you could use different consumer ids with random strings.
If notifications are being read from the beginning, then you probably have ConsumerConfig.AUTO_OFFSET_RESET_CONFIG set to "earliest" somewhere in your consumer configuration. If this is the case, removing it will probably solve your problem: when the app starts, it will only receive notifications sent after the consumer started listening.
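A minimal sketch of that configuration, assuming the "consumer id" discussed above is the Kafka group.id. It only builds a java.util.Properties object with the standard consumer keys; the class name BroadcastConsumerConfig, the group prefix and the broker address are hypothetical, and passing the properties to an actual consumer is left out.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper that builds per-instance consumer settings.
class BroadcastConsumerConfig {
    /** Returns consumer properties whose group id is unique to this process. */
    static Properties forInstance(String bootstrapServers, String groupPrefix) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // A fresh group per instance means each instance gets every message on the topic.
        props.setProperty("group.id", groupPrefix + "-" + UUID.randomUUID());
        // "latest": a group without committed offsets starts at the end of the log,
        // so a restarted instance does not re-process old messages.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }
}
```

The trade-off is that an instance never sees messages published while it was down. For cache eviction that is probably acceptable if the local cache starts empty after a restart, which is an assumption about app-2.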

Hard limit connections Spring Boot

I'm working on a simple microservice written in Spring Boot. This service will act as a proxy towards other resources that have a hard concurrent connection limit, and the requests take a while to process.
I would like to impose a hard limit on concurrent connections allowed to my micro service and rejecting any with either a 503 or on tcp/ip level. I've tried to look into different configurations that can be made for Jetty/Tomcat/Undertow but haven't figured out yet something completely convincing.
I found some settings regulating thread pools:
server.tomcat.max-threads=0 # Maximum amount of worker threads.
server.undertow.io-threads= # Number of I/O threads to create for the worker.
server.undertow.worker-threads= # Number of worker threads.
server.jetty.acceptors= # Number of acceptor threads to use.
server.jetty.selectors= # Number of selector threads to use.
But if I understand correctly, these all configure thread pool sizes and will just result in connections being queued on some level.
This seems really interesting, but it hasn't been merged yet and is targeted for Spring Boot 1.5: https://github.com/spring-projects/spring-boot/pull/6571
Am I out of luck using a setting for now? I could of course implement a filter but would rather block it on an earlier level and not have to reinvent the wheel. I guess using apache or something else in front is also an option, but still that feels like an overkill.
Try looking at EmbeddedServletContainerCustomizer;
this gist could give you an idea of how to do that.
TomcatEmbeddedServletContainerFactory factory = ...;
factory.addConnectorCustomizers(connector ->
    ((AbstractProtocol) connector.getProtocolHandler()).setMaxConnections(10000));
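For comparison, the filter approach mentioned in the question boils down to a semaphore that rejects instead of queueing. The sketch below is plain JDK code and is not tied to any servlet API; ConcurrencyGate and its handle method are hypothetical names, and wiring it into an actual filter is left out.

```java
import java.util.concurrent.Semaphore;

// Hypothetical gate: at most maxConcurrent handlers run at once, the rest are rejected.
class ConcurrencyGate {
    static final int SC_OK = 200;
    static final int SC_SERVICE_UNAVAILABLE = 503;

    private final Semaphore permits;

    ConcurrencyGate(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the handler if a permit is free; otherwise returns 503 immediately without waiting. */
    int handle(Runnable handler) {
        if (!permits.tryAcquire()) {
            return SC_SERVICE_UNAVAILABLE; // reject instead of queueing
        }
        try {
            handler.run();
            return SC_OK;
        } finally {
            permits.release(); // always hand the permit back, even if the handler throws
        }
    }
}
```

Unlike the connector setting above, a gate like this only takes effect after the container has accepted the connection and assigned a thread, so it limits work in progress rather than open sockets.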

What is the right approach for an async work queue with results?

I have a REST server on heroku. It will have N-dynos for the REST service and N-dynos for workers.
Essentially, I have some long running rest requests. When these come in I want to delegate them to one of the workers and give the client a redirect to poll the operation and eventually return the result of the operation.
I am going to use JEDIS/REDIS from RedisToGo for this. As far as I can tell there are two ways I can do this.
I can use the PUB/SUB functionality. Have the publisher create unique identities for the work results and return these in a redirect URI to the REST client.
Essentially the same thing but instead of PUB/SUB use RPUSH/BLPOP.
I'm not sure what the advantage is to #1. For example, if I have a task called LongMathOperation it seems like I can simply have a list for this. The list elements are JSON objects that have the math operation arguments as well as a UUID generated by the REST server for where the results should be placed. Then all the worker dynos will just have blocking BLPOP calls and the first one there will get the job, process it, and put the results in REDIS using the key of the UUID.
Make sense? So my question is "why would using PUB/SUB be better than this?" What does PUB/SUB bring to the table here that I am missing?
Thanks!
I would also use lists, because pub/sub messages are not persistent: if there are no subscribers, the messages are lost. In other words, if for whatever reason no workers are listening, the client won't get served properly. Lists, on the other hand, are persistent. The flip side is that pub/sub uses less memory than lists, for the same reason: there is nothing to store.
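The list-based flow from the question can be sketched in memory as follows. This is not Jedis code: a LinkedBlockingQueue stands in for the Redis list (add for RPUSH, take for BLPOP) and a ConcurrentHashMap stands in for the result keys. The class ListBackedWorkQueue and the multiplication "job" are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory stand-in for the Redis list + result keys described in the question.
class ListBackedWorkQueue {
    /** A job carries its arguments and the key under which the result should be stored. */
    static final class Job {
        final String resultKey;
        final long a;
        final long b;

        Job(String resultKey, long a, long b) {
            this.resultKey = resultKey;
            this.a = a;
            this.b = b;
        }
    }

    private final BlockingQueue<Job> jobs = new LinkedBlockingQueue<>();  // plays the Redis list
    private final Map<String, Long> results = new ConcurrentHashMap<>(); // plays the Redis keys

    /** REST side: enqueue the job (RPUSH) and return the UUID the client will poll. */
    String submit(long a, long b) {
        String key = UUID.randomUUID().toString();
        jobs.add(new Job(key, a, b));
        return key;
    }

    /** Worker side: block until a job arrives (BLPOP), compute, store the result under its key. */
    boolean workOnce() {
        try {
            Job job = jobs.take();
            results.put(job.resultKey, job.a * job.b);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Poll side: returns null until a worker has stored the result. */
    Long poll(String key) {
        return results.get(key);
    }
}
```

Because the job stays in the list until some worker takes it, a job submitted while no worker is running is simply picked up later, which is the persistence property the answer refers to.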

Can Netty efficiently handle scores of outgoing connections as a client?

I'm creating a client-server relationship whereby a single client will be connected to an arbitrary number of servers using persistent TCP connections. The actual number of servers is as-of-yet undetermined, but the design goal is to shoot for 1000.
I found an example using direct Java NIO that nearly completely matches my mental model of how this could work:
http://drdobbs.com/jvm/184406242
In general, it opens up all of the channels and adds them to a single thread monitoring java.nio.channels.Selector. The use of the Selector, in particular, is what allows this to scale far better than using the standard thread-per-channel.
I would rather use a (slightly) higher level socket framework like Netty, than direct Java NIO. Unfortunately, I have not been able to determine how Netty would handle a case like this. That is, the examples and discussions I've found all tend to center around the server side, with accepting scores of concurrent connections.
But what about doing this from the client side? If I create a large number of channels and just wait on their events, how is Netty going to handle this at the back-end?
This isn't a direct answer to your question but I hope it is helpful nonetheless. Below, I describe a way for you to determine the answer that you are looking for. This is something that I recently did myself for an upcoming project.
Compared to OIO (Old IO) the asynchronous nature of the Netty framework and NIO will indeed provide much better memory and CPU usage characteristics for your application. The way buffers are handled in Netty will also be of benefit as it will help you to avoid copying byte buffers. The point is that all of the thread pool and NIO details will be handled for you allowing you to focus on your business logic. You mentioned the NIO Selector and you will benefit from that; the nice thing about Netty is that you get the benefits without having to worry about that implementation yourself because it is already done for you.
My understanding of the client side is that it is very similar to the server side and should provide you with commensurate performance gains (as long as your business logic doesn't introduce any performance issues).
My advice would be to throw together a prototype that more or less does what you want. Leave out any time consuming details and just add in the basic Netty handlers that you need to make something that works.
Then I would use jmeter to invoke your client to apply load to the server and client. Using something like jconsole or jvisualvm will show you the performance characteristics of the client and server under load. You could also try jprobe. You can add a listener in jmeter that will indicate the throughput. I would advise running jmeter in server mode, with the client on another machine and the server on yet another. This is a bit of up-front work, but if you decide to move forward you will have these tools ready to go for further testing as you proceed.
I suspect a decent Netty implementation that doesn't introduce any extraneous poorly performing components will give you the performance characteristics you are looking for, but, the only way to know for sure is to measure the system under the expected load.
You need to define what the expected load looks like and the desired performance characteristics under such load. Given these inputs you can measure your system to find out if it will meet your expectations. I personally don't think anyone can tell you if it will behave in the desired manner. You have to measure it. It's the only reliable way to know if the system can meet your needs.
I would rather use a (slightly) higher level socket framework like Netty, than direct Java NIO.
This is the correct approach. You can try implementing your own NIO server and client but why do that when you have the benefit of a highly refined framework at your fingertips already?
Netty will use up to x worker threads that handle the work for you. Each worker thread will have one Selector that is used to register Channels to it. The number of used workers is configurable and by default 2 * cpu-count.
As you can see in the example from Netty's docs (http://netty.io/docs/stable/guide/html/#start.9), you can control exactly the number of worker threads (meaning the number of underlying selectors) on the client side.
Netty solves a number of issues that are very hard to handle in a simple way, such as combining NIO with SSL, and has a lot of default encoders/decoders (for Zip, etc.).
I started using Netty a few weeks ago and it was quite quick to get into. (I recommend downloading the project with all the example code inside; there is a lot of documentation in it that cannot be found at the URL above.)
ChannelFactory factory = new NioClientSocketChannelFactory(
        Executors.newCachedThreadPool(),
        Executors.newCachedThreadPool());
ClientBootstrap bootstrap = new ClientBootstrap(factory);
bootstrap.setPipelineFactory(new ChannelPipelineFactory() {
    public ChannelPipeline getPipeline() {
        return Channels.pipeline(new TimeClientHandler());
    }
});
bootstrap.setOption("tcpNoDelay", true);
bootstrap.setOption("keepAlive", true);
bootstrap.connect(new InetSocketAddress(host, port));
Good luck,
Renaud
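To make the Selector model from the question concrete, here is a small sketch in plain java.nio (not Netty) in which one thread drives many outgoing connections through a single Selector. The class SingleSelectorClients and the demo method, which connects to a throwaway listener on the loopback interface, are hypothetical; Netty's worker threads are described above as doing this kind of bookkeeping internally, one Selector per worker.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SingleSelectorClients {
    /** Opens count non-blocking client channels on one Selector; returns how many connected. */
    static int connectAll(InetSocketAddress target, int count) throws IOException {
        List<SocketChannel> channels = new ArrayList<>();
        int connected = 0;
        try (Selector selector = Selector.open()) {
            for (int i = 0; i < count; i++) {
                SocketChannel ch = SocketChannel.open();
                channels.add(ch);
                ch.configureBlocking(false);
                if (ch.connect(target)) {            // connected immediately
                    ch.register(selector, SelectionKey.OP_READ);
                    connected++;
                } else {                             // connection still in progress
                    ch.register(selector, SelectionKey.OP_CONNECT);
                }
            }
            long deadline = System.nanoTime() + 5_000_000_000L; // give up after about 5 seconds
            while (connected < count && System.nanoTime() < deadline) {
                selector.select(200);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isConnectable() && ((SocketChannel) key.channel()).finishConnect()) {
                        key.interestOps(SelectionKey.OP_READ); // from here on, wait for data
                        connected++;
                    }
                }
            }
        } finally {
            for (SocketChannel ch : channels) {
                ch.close();
            }
        }
        return connected;
    }

    /** Demo: connects count clients to a temporary listener on the loopback interface. */
    static int demo(int count) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), Math.max(count, 50));
            return connectAll((InetSocketAddress) server.getLocalAddress(), count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch stops once every channel is connected; a real client would keep the select loop running and read from the channels whose keys become readable.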

Web crawler in Ruby: How to achieve the best perfomance?

I'm writing a web crawler that should be able to parse multiple pages at the same time. I use Nokogiri for parsing, which is quite good and handles all my tasks, but I don't know how to achieve better performance.
I use threads to make many open-uri requests at the same time and it makes the process quicker, but it seems that it's still far from the potential that I can achieve from a single server. Should I use multiple processes? What are the limits of the threads and processes that can be launched for a single ruby application?
In other words: how do I achieve the best performance in this case?
I really like Typhoeus and Hydra for handling multiple requests at once.
Typhoeus is the http client side, and Hydra is the part that handles multiple requests. The examples are good so go through them and see.
While it sounds like you're not looking for something quite so complex I found this thesis an interesting read awhile ago: Building blocks of a scalable webcrawler - Marc Seeger.
In terms of threading/process limits Ruby has very low threading potential. Standard Ruby (MRI/YARV) and Rubinius don't support simultaneous thread execution, unless using an extension specifically built to support it. Depending on how much of your performance trouble is in the IO and how much is in the processing I could suggest using EventMachine.
With multiple processes, however, Ruby works very well: as long as you've got a good manager/database for all the processes to communicate with, running multiple processes should scale as well as your processing power allows.
Hey another way is to use a combination of Nokogiri and IronWorker (IronMQ and IronCache).
See a full blog entry on the Topic here
We use a combination of ActiveMQ/Active Messaging, Event Machine, and multi-threading for this problem. We start off with a big list of URL's to fetch. We then break them down into batches of 100 URL's per batch. Each batch is then pushed into ActiveMQ. Then, we have an array of poller/consumer processes listening to the queue. These consumers can all be on one computer, or they can be spread across multiple computers. The array of consumers can grow arbitrarily large to support as much parallelism as we want. The consumers use Active Messaging, which is a nice Ruby integration with ActiveMQ.
When a consumer receives a message to process a batch of 100 URL's, it kicks off Event Machine to create a thread pool that can process multiple messages in multiple threads. Like you, we use Nokogiri to process each URL.
So, there are three levels of parallelism:
1) Multiple concurrent requests per consumer process, supported by Event Machine and threads.
2) Multiple consumer processes per computer.
3) Multiple computers.
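The batching step at the start of that pipeline (splitting the URL list into batches of 100 before pushing each batch to the queue) is simple enough to sketch. The system described above is written in Ruby; the version below is in Java only for illustration, and UrlBatcher is a hypothetical name. Publishing each batch to ActiveMQ is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for the "break the URLs into batches" step.
class UrlBatcher {
    /** Splits urls into consecutive batches of at most batchSize elements, preserving order. */
    static List<List<String>> batches(List<String> urls, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> out = new ArrayList<>();
        for (int start = 0; start < urls.size(); start += batchSize) {
            int end = Math.min(start + batchSize, urls.size());
            out.add(new ArrayList<>(urls.subList(start, end))); // copy so each batch is independent
        }
        return out;
    }
}
```

Each resulting batch would become one queue message, so the batch size controls how much work a single consumer takes on at a time.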
If you want something easy go for http://anemone.rubyforge.org/
If you want something fast, code something with eventmachine/em-http-request
I found redis to be a great multi purpose tool for queue management, caching and so on. You could also use specialized things like beanstalkd/active mq/... but at least in my use case, I didn't really find them to be a big advantage compared to redis.
The load on the backend system in particular could be a bottleneck, so choose your database carefully and pay attention to what you save.
