How to choose optimal circuit breaker parameters for microservices? - spring-boot

I was watching this video by Java Brains (https://www.youtube.com/watch?v=CSqxIKJhFRI&list=PLqq-6Pq4lTTbXZY_elyGv7IkKrfkSrX5e&index=14). At the 3:48 timestamp, he states that, in his experience, circuit breaker parameters for microservices are best chosen by trial and error.
I was wondering if anyone could point me to resources on how to choose optimal circuit breaker parameters (e.g. the Hystrix parameters for a Spring Boot application). Also, is there any room for using an algorithm, perhaps machine learning, to predict these optimal parameters for you? I would love to hear your thoughts on this subject. Thanks!
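For reference, here is a minimal sketch of the knobs being tuned, using Hystrix's javanica annotations in a Spring Boot service; the class, the downstream endpoint, and every value are illustrative assumptions, not recommendations.

```java
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;

import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class RatingClient {

    private final RestTemplate restTemplate = new RestTemplate();

    // Every value below is an illustrative assumption, not a recommendation.
    @HystrixCommand(
            fallbackMethod = "getRatingFallback",
            commandProperties = {
                // Minimum requests in the rolling window before the breaker can trip at all.
                @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "20"),
                // Percentage of failures in the window that opens the circuit.
                @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "50"),
                // How long the circuit stays open before a single trial request is let through.
                @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "5000"),
                // Length of the rolling window the failure percentage is computed over.
                @HystrixProperty(name = "metrics.rollingStats.timeInMilliseconds", value = "10000"),
                // Per-call timeout; a timeout counts as a failure toward the threshold.
                @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
            })
    public String getRating(String movieId) {
        // Hypothetical downstream call to a ratings service.
        return restTemplate.getForObject("http://ratings-service/ratings/" + movieId, String.class);
    }

    public String getRatingFallback(String movieId) {
        return "unrated"; // degraded default returned while the circuit is open
    }
}
```

Commonly cited starting heuristics: set the timeout slightly above the dependency's observed 99th-percentile latency, set the volume threshold above what a healthy window normally sees so that sparse traffic cannot trip the breaker, and set the sleep window to roughly how long the dependency needs to recover. From there, refining the values against load tests is essentially the trial and error the video describes.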

Related

Transactions between two microservices

I am thinking about how to solve a problem such as transferring money across banks.
Say I have two microservices, A and B, each with its own separate database. I have considered 2PC and Saga, but my idea differs a bit, so let me explain. Each microservice operation would have a State to track progress, in this case Created and Approved. I would like the two microservices to somehow acknowledge each other's state before changing it to Approved. Below is what I mean; I do not know what to do in step 5.
I would like step 5 to be atomic: either both microservices end up in the Approved state or neither of them does. I suspect that this requirement creates a cyclic dependency between the two microservices, right?
Imagine two banks transferring money; that is what I would like to achieve. I do not know whether my design and thinking are correct. Could you give me some advice on addressing this kind of problem, or point me to patterns that handle this issue more efficiently?
Please ask if my question is not clear.
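For what it's worth, the usual way out of the cyclic mutual-acknowledgement problem in step 5 is a saga with compensating actions: a coordinator drives both services forward, and if the second approval fails it undoes the first, so the system converges to both-Approved or neither. Here is a minimal sketch; all interfaces and names are hypothetical, not a library API.

```java
// A minimal saga-orchestration sketch (hypothetical interfaces, not a library API).
// Instead of A and B acknowledging each other (which creates the cycle noticed above),
// a coordinator drives both and runs a compensating action if the second step fails.

interface TransferParticipant {
    void approve(String transferId) throws Exception;   // Created -> Approved
    void cancel(String transferId);                     // compensating action: undo Approved
}

class TransferSaga {
    private final TransferParticipant serviceA;
    private final TransferParticipant serviceB;

    TransferSaga(TransferParticipant a, TransferParticipant b) {
        this.serviceA = a;
        this.serviceB = b;
    }

    boolean execute(String transferId) {
        try {
            serviceA.approve(transferId);
        } catch (Exception e) {
            return false;                 // nothing to undo yet
        }
        try {
            serviceB.approve(transferId);
            return true;                  // both sides Approved
        } catch (Exception e) {
            serviceA.cancel(transferId);  // undo A so neither side ends up Approved alone
            return false;
        }
    }
}
```

Note what this buys you: not the instantaneous atomicity of 2PC, but eventual atomicity. There is a brief window where A is Approved and B is not, which the compensating cancel then repairs.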

Is there a way to influence AlchemyAPI sentiment analysis?

I was using AlchemyAPI for text analysis. I want to know if there is a way to influence the API's results or fine-tune them to my requirements.
I was trying to analyse different call center conversations available on the internet to understand their sentiment, i.e. whether the customer was unsatisfied or angry and the conversation therefore negative.
For 9 out of 10 conversations it reported the sentiment as positive, and for 1 it was negative. That conversation was about an emergency response system (911 in the US). It seems that words like shooting, fear, panic, police, and siren could have caused this result.
But the whole conversation was actually fruitful: the caller was not angry with the service; instead, the call center agent solved the caller's problem and the caller was relaxed. So logically this should not be treated as negative.
What is the way forward to customize AlchemyAPI's behavior?
We are currently looking at the tools that would be required to allow customization of the AlchemyAPI services. Our current service is entirely pre-trained on billions of web pages, but customization is on the road map. I can't give you any timelines this early, but keep checking back!
Zach, Dev Evangelist AlchemyAPI

When to use a certain Reinforcement Learning algorithm?

I'm studying Reinforcement Learning and reading Sutton's book for a university course. Besides the classic DP, MC, TD, and Q-Learning algorithms, I'm reading about policy gradient methods and genetic algorithms for solving decision problems.
I have no prior experience with this topic and I'm having trouble understanding when one technique should be preferred over another. I have a few ideas, but I'm not sure about them. Can someone briefly explain, or point me to a source describing, the typical situations in which a certain method should be used? As far as I understand:
Dynamic Programming and Linear Programming should be used only when the MDP has few actions and states and the model is known, since they are very expensive. But when is DP better than LP?
Monte Carlo methods are used when I don't have a model of the problem but I can generate samples. They have no bias but high variance.
Temporal Difference methods should be used when MC methods need too many samples to achieve low variance. But when should I use TD and when Q-Learning?
Policy Gradient and Genetic algorithms are good for continuous MDPs. But when is one better than the other?
More precisely, I think that to choose a learning method a programmer should ask himself the following questions:
does the agent learn online or offline?
can we separate exploring and exploiting phases?
can we perform enough exploration?
is the horizon of the MDP finite or infinite?
are states and actions continuous?
But I don't know how these details of the problem affect the choice of a learning method.
I hope that some programmer has already had some experience with RL methods and can help me better understand their applications.
Briefly:
does the agent learn online or offline? This helps you decide between on-line and off-line algorithms (e.g. on-line: SARSA, off-line: Q-learning; in Sutton's terminology this distinction is called on-policy vs. off-policy). On-line methods have more limitations and need more care.
can we separate exploring and exploiting phases? These two phases are normally kept in balance. For example, with epsilon-greedy action selection, you explore with probability epsilon (e.g. by choosing a random action) and exploit with probability 1-epsilon; see the sketch after this list. You can separate the two and ask the algorithm to explore first and exploit afterwards, but this is only feasible when you are learning off-line, probably using a model of the system's dynamics, and it normally means collecting a lot of sample data in advance.
can we perform enough exploration? The amount of exploration you can afford depends on the definition of the problem. For example, if you have a simulation model of the problem in memory, you can explore as much as you want. But real exploration is limited by the resources you have (e.g. energy, time, ...).
are states and actions continuous? Considering this helps you choose the right approach (algorithm). There are both discrete and continuous algorithms developed for RL; some of the "continuous" algorithms internally discretize the state or action spaces.
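To make the exploration/exploitation point and the SARSA/Q-learning distinction concrete, here is a minimal tabular sketch in Java. The update rules are the standard textbook formulas; the alpha, gamma, and epsilon values are arbitrary illustrative choices.

```java
import java.util.Random;

// Minimal tabular sketch of epsilon-greedy selection and the SARSA vs.
// Q-learning update targets (standard textbook formulas, illustrative values).
class TabularAgent {
    final double[][] q;          // Q[state][action]
    final double alpha = 0.1;    // learning rate (assumed value)
    final double gamma = 0.9;    // discount factor (assumed value)
    final double epsilon = 0.1;  // exploration rate (assumed value)
    final Random rng = new Random();

    TabularAgent(int states, int actions) { q = new double[states][actions]; }

    // Explore with probability epsilon, otherwise exploit the greedy action.
    int selectAction(int s) {
        if (rng.nextDouble() < epsilon) return rng.nextInt(q[s].length);
        return argmax(q[s]);
    }

    // SARSA (on-policy): the target uses the action actually taken in s'.
    void sarsaUpdate(int s, int a, double r, int s2, int a2) {
        q[s][a] += alpha * (r + gamma * q[s2][a2] - q[s][a]);
    }

    // Q-learning (off-policy): the target uses the best action in s',
    // regardless of which action the policy will actually take.
    void qLearningUpdate(int s, int a, double r, int s2) {
        q[s][a] += alpha * (r + gamma * q[s2][argmax(q[s2])] - q[s][a]);
    }

    private int argmax(double[] row) {
        int best = 0;
        for (int i = 1; i < row.length; i++) if (row[i] > row[best]) best = i;
        return best;
    }
}
```

Note that the only difference between the two updates is the bootstrap target: SARSA uses the action the policy actually took next, while Q-learning uses the greedy action whatever the policy does.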

Neo4j and Cluster Analysis

I'm developing a web application that will depend heavily on its ability to suggest items based on users with similar preferences. A friend of mine told me that what I'm looking for, mathematically, is some cluster analysis algorithm. On the other hand, here on SO, I was told that Neo4j (or some other graph DB) was the kind of DB I should use for this task (the preferences one).
I started studying both of these tools, and I'm having some doubts.
For cluster analysis purposes, it looks to me like a standard SQL DB would still be the perfect choice, while Neo4j would be better suited to a neural-network kind of approach (although it is still perfectly fit for the task).
Am I missing something? Am I trying to use the wrong tools combination?
I would love to hear some ideas on the subject.
Thanks for sharing
This depends on your data. Neo4j is capable of providing even complex recommendations in real time for one particular node. Say you want to recommend some product to a user: this can be handled within a graph DB in real time,
whereas a clustering system is the best way to compute recommendations for all users at once (and then perhaps save them somewhere so you don't need to calculate them again).
The computational difference:
Neo4j has no initialization cost and can give you one recommendation in an acceptable time.
Clustering needs more time for initialization (not seconds, but more likely minutes or hours) and is better suited to calculating recommendations for the whole dataset. In fact, taking strictly the time of one calculation for a specific user, clustering can do it faster than Neo4j, but the big restriction is the initial setup, which makes it unsuitable for real-time applications.
The practical difference:
If your data are mostly static and it is OK to compute recommendations once in a while, then do clustering with SQL.
If your data are dynamic, updated with each interaction, and you always need to serve the newest recommendation, then use Neo4j.
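To illustrate the real-time, single-node case, here is a minimal sketch using the Neo4j Java driver. The (:User)-[:LIKES]->(:Product) schema, the connection URI, the credentials, and the user id are all assumptions about your setup, and the driver calls follow the 4.x API.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import static org.neo4j.driver.Values.parameters;

// Real-time recommendation for a single user; schema and credentials are assumptions.
public class RealTimeRecommender {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {

            // Products liked by users who share at least one liked product with me,
            // excluding products I already like, ranked by how many such users like them.
            Result result = session.run(
                    "MATCH (me:User {id: $id})-[:LIKES]->(:Product)"
                  + "<-[:LIKES]-(other:User)-[:LIKES]->(rec:Product) "
                  + "WHERE NOT (me)-[:LIKES]->(rec) "
                  + "RETURN rec.name AS name, count(DISTINCT other) AS score "
                  + "ORDER BY score DESC LIMIT 5",
                    parameters("id", "user-42"));

            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("name").asString() + "  " + row.get("score").asLong());
            }
        }
    }
}
```

Because the query starts from one anchored node and follows relationships outward, it touches only a small neighborhood of the graph, which is what makes per-request recommendations feasible without any precomputation.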
I am currently working on various topics related to recommendation and clustering with Neo4j.
I'm not exactly sure what you're looking for, but depending on how you model your data in the graph, you can easily work out clustering algorithms based on counting links to various types of nodes.
If you plan your nodes and relationships correctly, you can then identify groups of nodes that share the most common links to a set of categories.
Let me introduce Reco4J (http://www.reco4j.org): it is an open source framework that provides recommendations based on a graph database source. It uses Neo4j as its graph database management system.
Have a look at it and contact us if you are interested in support.
It is a really early release, but we are working hard to provide extended documentation and new interesting features.
Cheers,
Alessandro

Social network functionality: finding connections you might know

I want to create functionality for suggesting connections in a social network.
In the network you can have connections and connect to other users.
I want to implement a connection suggestion functionality on the network.
I think the most basic approach is to look at my connections' connections, find the user who occurs most often among them and is not already connected to my user, and suggest that user to my user as a new connection.
My questions is:
Is this a good basic approach for an easy connection finder?
Is there a good algorithm I can use to find the most frequently occurring user that my connections are connected to?
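To make the basic approach concrete, here is a minimal in-memory sketch of it; the adjacency-map representation and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// The "most common friend-of-friend" heuristic from the question, over a
// hypothetical in-memory graph: an adjacency map from user id to direct connections.
public class ConnectionSuggester {

    public static List<String> suggest(Map<String, Set<String>> graph, String me, int limit) {
        Set<String> myConnections = graph.getOrDefault(me, Set.of());
        Map<String, Integer> counts = new HashMap<>();

        // For every connection of a connection, count how many of my
        // connections know them -- skipping myself and existing connections.
        for (String friend : myConnections) {
            for (String fof : graph.getOrDefault(friend, Set.of())) {
                if (!fof.equals(me) && !myConnections.contains(fof)) {
                    counts.merge(fof, 1, Integer::sum);
                }
            }
        }

        // Candidates with the most mutual connections come first.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

For what it's worth, this count is known as the common-neighbors score in the link-prediction literature; weighted variants such as Adamic-Adar give rarer mutual connections more influence.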
I'd try a machine learning approach for this problem.
I'll suggest two common machine learning approaches to this problem. For both of them to work, you need to extract features from the data (for example, look at a subgraph and treat friendship with each member of the subgraph as a binary feature; a sketch of this appears at the end of this answer).
The two approaches are:
Classification. Here you are trying to find a classifier C: User x User -> Boolean (a classifier that, given two users, gives a boolean answer: should they be friends?). The classification approach requires you to first manually label, or otherwise extract, classified data (a big enough set of pairs, each with a label). The algorithm will learn this pattern and use it to predict future inputs.
Clustering (a.k.a. unsupervised learning). You can try to find clusters in your graph and suggest that users befriend all the members of their cluster.
I have to admit I have never used either of these methods for friendship suggestion, so I have no idea how accurate they will be. You can use cross-validation to estimate the accuracy of the algorithm.
If you are interested in learning more, a free online course on machine learning started at Stanford two weeks ago: https://class.coursera.org/ml-2012-002
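For completeness, here is one possible (hypothetical) reading of the feature-extraction step mentioned at the top of this answer: for a candidate pair (u, v), each member of a fixed reference subgraph contributes one binary feature, set when both users are connected to that member. The classifier itself (logistic regression, a decision tree, etc.) is deliberately left abstract.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of one possible feature extraction for pair classification: feature i
// means "u and v are both connected to member i of a fixed reference subgraph".
public class PairFeatures {

    public static double[] extract(Map<String, Set<String>> graph,
                                   String u, String v, List<String> subgraphMembers) {
        double[] features = new double[subgraphMembers.size()];
        Set<String> uConn = graph.getOrDefault(u, Set.of());
        Set<String> vConn = graph.getOrDefault(v, Set.of());
        for (int i = 0; i < subgraphMembers.size(); i++) {
            String member = subgraphMembers.get(i);
            // 1.0 when both users are connected to this member, else 0.0.
            features[i] = uConn.contains(member) && vConn.contains(member) ? 1.0 : 0.0;
        }
        return features;
    }
}
```

These vectors, paired with friend/not-friend labels, are what the classification approach above would train and cross-validate on.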
