Validation loss going down, but AUPRC also going down - metrics

I am working on a multi-class classification task,
and I don't understand how this can happen.
Training and validation loss are both going down,
but training and validation AUPRC are going down as well.
I cannot find where the problem is.
Is this possible?
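A toy sketch of why this can happen in principle: cross-entropy loss rewards confident, well-calibrated probabilities averaged over all samples, while AUPRC only looks at how the samples are ranked. The labels and probabilities below are invented for illustration, and the binary case stands in for one class of a multi-class problem scored one-vs-rest; it is not a diagnosis of the asker's model.

```python
import math

def log_loss(labels, probs):
    """Mean binary cross-entropy; probs are the predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(labels)

def average_precision(labels, probs):
    """AUPRC estimate: mean of the precision measured at each positive sample."""
    ranked = sorted(zip(probs, labels), reverse=True)  # highest score first
    hits = 0
    precisions = []
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions)

labels = [1, 1, 0, 0]

# Hypothetical "early" epoch: timid probabilities, but a perfect ranking.
early = [0.56, 0.55, 0.45, 0.44]
# Hypothetical "late" epoch: confident on most samples, but one positive
# is now ranked below a negative.
late = [0.90, 0.30, 0.40, 0.05]

loss_early, ap_early = log_loss(labels, early), average_precision(labels, early)
loss_late, ap_late = log_loss(labels, late), average_precision(labels, late)

print(f"early: loss={loss_early:.3f}  AP={ap_early:.3f}")
print(f"late:  loss={loss_late:.3f}  AP={ap_late:.3f}")
```

Here the loss falls from about 0.59 to about 0.47 while average precision falls from 1.0 to about 0.83, so the two curves can move in the same direction. If the drop in AUPRC is large or happens on the training set too, it is still worth checking how the metric is computed (class averaging, label encoding, which scores are passed in).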

Related

Regularization vs. Validation

What I always see in papers and articles about under/overfitting is a falling curve for training error and a U-shaped curve for testing error, with the area to the left of the bottom of the U-curve described as underfitting and the area to the right of it as overfitting.
To find the best model, we can test each configuration (e.g. changing the number of nodes and layers) and compare the test error values to find the minimum point (typically via cross-validation). That looks straightforward and perfect.
Do we need a regularizer to reach this point? This is the part I am not sure I understand well. To me, it seems that we don't need a regularizer if we can test different model configurations. The only case where a regularizer comes into play is when we have a fixed model configuration (e.g. a fixed number of nodes and layers) and don't want to try other configurations, so we use a regularizer to limit the model complexity by forcing other model parameters (e.g. network weights) to low values. Is this view right?
But if it is right, then what is the intuition behind it? First of all, when using a regularizer we don't know in advance whether this network configuration/complexity puts us to the right or to the left of the minimum of the test error curve. It may already be underfit, overfit, or well fit. Putting math aside, why does forcing weights to lower values make the network more generalizable and less overfit? Is there any analogy between this method and the previous method of moving along the test loss curve to find its minimum? Also, a regularizer does its job during training; it cannot do anything with the test data. How can it help to move toward the minimum test error?
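One way to picture the analogy asked about here, as a minimal sketch under stated assumptions: treat the regularization strength as just another configuration knob, sweep it the same way you would sweep the number of nodes, and pick the value with the lowest validation error. The example uses a one-weight ridge regression with made-up data purely because its closed form is short; the names `ridge_weight`, `lambdas` and the data points are all hypothetical.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form minimiser of sum((y - w*x)^2) + lam * w^2 for one weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def mse(xs, ys, w):
    """Mean squared error of the model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Invented data, for illustration only.
x_train, y_train = [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.3, 2.8]
x_val, y_val = [0.5, 1.5, 2.5], [0.4, 1.6, 2.4]

# Sweep the penalty strength exactly like any other model configuration.
lambdas = [0.0, 0.1, 1.0, 10.0, 100.0]
weights = [ridge_weight(x_train, y_train, lam) for lam in lambdas]
val_errors = [mse(x_val, y_val, w) for w in weights]
best_lambda = lambdas[val_errors.index(min(val_errors))]

for lam, w, err in zip(lambdas, weights, val_errors):
    print(f"lambda={lam:<6} weight={w:.3f} validation MSE={err:.4f}")
print("selected lambda:", best_lambda)
```

The penalty itself only ever sees training data; it is the sweep over `lambdas`, scored on held-out data, that moves you along the validation curve. A larger penalty shrinks the weight (lower effective complexity), a smaller one frees it, so the regularizer gives a continuous complexity dial on top of the discrete choice of architecture.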

Developing cluster apps

I'm not sure exactly where (or even how exactly to ask) this question, so I'm hoping someone here can point me in the right direction.
I have a service that I'm building. That service has different objects in memory, each with its own state. Whenever an object is created, it loads its state from the database and holds it. When changes are made to the object, they are also persisted to the database.
I would like to scale this service. I have looked at solutions such as akka.net (actor model), which has a clustering solution. From what I've read, it synchronizes the state with something they call "gossip", where each node sends the state to the other nodes. I'm not sure that it is really possible to convert my working application to akka.net at this point.
I'm wondering exactly how clusters keep state synced between different nodes (I get the gossip concept). What happens if machine A receives a message and, at the same time, machine B also receives a message, and both change the same state of an object? That will cause data integrity problems between the states. My only thought about this is to lock a shared resource, but that defeats the purpose of the cluster.
Keeping state in the database is also not an option since the database becomes a bottleneck and a single point of failure.
I can't seem to find any relevant reading materials online - but I'm also lacking the technical phrases I need to focus on.
In case it's relevant, I'm using .NET Core and C# for development.
Can anyone explain the concept of clustering, how it works, and how it makes sure nodes stay in sync? Or can anyone point me in the right direction?
You have a big problem. I think that the way you are thinking about the problem is a bigger problem. Let's go through some basics.
Clustering is used to solve big problems, much like the "eat an elephant" problem. To solve this problem, you could design a single, bigger predator with a huge mouth. But history and paleontology have shown us that big predators are not easily sustained (they are expensive on the environment).
So to solve your problem, you could take a bigger, stronger server.
Or, you could use clustering.
Clustering solves the "eat the elephant" problem in a very different way. Instead of sending a single huge predator with a huge mouth to eat the elephant, it uses distributed and shared processing to eat it one bite at a time. When done properly, ants could eat the elephant, if there are enough of them and the circumstances are right.
But notice in my example, ants are very small... A single ant will never carry the entire elephant. You could carry the entire elephant if all the ants worked together but then you run into concurrency and locking problems (you must coordinate the ants).
Ants have shown us a much better way to deal with this. They will take a piece of the elephant and deal with the problem in smaller chunks.
In your system you ask how you can sync data across nodes... My question would be why? If you are syncing data then you are mirroring and your problem becomes even bigger (you are cloning the elephant but can only eat the original).
The solution to your problem is to rethink the solution and see if you can break down the problem into smaller pieces.
In Akka and in the Actor pattern, the best way to deal with problems is to use smaller "processes" (a single ant). While a process on its own is almost useless, at large scale they can become very powerful. When the architecture is properly done, you will notice that taking a flamethrower to ants will not defeat them... More ants will come, and they will continue to work on the problem.
Copying and syncing data is not your solution; clustering it is. You must take your data and break it down to a point where you can give it to a single ant. If you can do this, then you can use Akka. If this approach seems ludicrous, then Akka is not for you.
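A minimal sketch of "give each piece of data to a single ant", assuming every object has a string ID: each ID is mapped to exactly one owning node, and any node that receives a message for that object forwards it to the owner instead of changing a local copy. The function name `owner_for` and the node names are hypothetical, and plain modulo hashing is used only to keep the example short; it is not how Akka's own sharding is implemented.

```python
import hashlib

def owner_for(entity_id, nodes):
    """Pick the single node that owns this entity.

    A stable hash is used (not Python's built-in hash(), which is
    randomised per process) so every node computes the same answer.
    """
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]

nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members

# Machines A and B both receive a message for the same object: both
# resolve the same owner, so only one place ever mutates that state.
seen_by_a = owner_for("order-42", nodes)
seen_by_b = owner_for("order-42", nodes)
print(seen_by_a, seen_by_b)
```

Because the owner handles its messages one at a time, two simultaneous updates to the same object become two messages in one queue rather than a race between two copies. One caveat of this naive scheme: when the node list changes, most IDs are reassigned, which is why real systems tend to use consistent hashing or a shard allocation layer.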
But consider this... You obviously have concerns over your database backend: you don't want to increase IO or introduce a single point of failure. I would have to agree with you. But you need to rethink things. You could use database mirroring to remove the single point of failure, but you are correct that this won't remove the bottleneck. So let's say that mirroring removes the single point of failure... Now let's attack the bottleneck portion.
If you can split up your data into small enough chunks that ants can handle it then I would urge you to tell your ants to only report to the database when the data changes... You can read it once on initialization (you need a backend store, don't kid yourself, electricity can be quickly lost... it must be saved somewhere) but if you tell your ants to persist only changed data then you will remove all the queries from the equation which will drastically shift where the load is coming from. Once you only have updates, inserts and deletes to deal with... the whole landscape will be much simpler.
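The "read once, then only report changes" idea can be sketched as follows. `CountingStore` is a hypothetical in-memory stand-in for the database that simply counts reads and writes, and `Entity` plays the role of one ant; neither corresponds to a real library API.

```python
class CountingStore:
    """Hypothetical stand-in for the database; counts reads and writes."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0
        self.writes = 0

    def load(self, key):
        self.reads += 1
        return dict(self.rows.get(key, {}))

    def save(self, key, state):
        self.writes += 1
        self.rows[key] = dict(state)


class Entity:
    """Holds its state in memory; touches the store only on real changes."""

    def __init__(self, key, store):
        self.key = key
        self.store = store
        self.state = store.load(key)  # the single read, at initialization

    def get(self, field):
        return self.state.get(field)  # served from memory, no query

    def set(self, field, value):
        if self.state.get(field) == value:
            return  # nothing changed, so nothing to persist
        self.state[field] = value
        self.store.save(self.key, self.state)


store = CountingStore({"cart-1": {"items": 2}})
cart = Entity("cart-1", store)
cart.get("items")
cart.get("items")
cart.set("items", 2)  # same value: no write
cart.set("items", 3)  # real change: one write
print("reads:", store.reads, "writes:", store.writes)
```

After the initial load, the database only sees writes for actual changes, which is the shift in load described above. This only stays consistent if exactly one live instance owns each key at a time.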
Clustering should be the solution for you, but only if you can let go of the concept of mirroring.
Cluster nodes can and will crash... But they can be respawned elsewhere on other nodes, so that you always have a quick system. Only when you deal with a crash or the loss of a node/worker process/ant will you have to reload data...
Good luck... you have outlined a formidable problem that I have seen people with software engineering degrees fail at solving.

GameplayKit: Fine tune control of GKAgent

I'm testing out GameplayKit using SpriteKit. I've added a GKAgent to my GKEntity, and I am making my entity seek my touches by creating an endAgent at the touch position.
This works great. The agent moves naturally and chases my touches. However, I have two questions.
How can I stop the agent when it reaches its destination? The agent will circle around forever trying to land exactly on the point. I've tried agent.behavior.removeAllGoals(), figuring that would stop the agent right away since it has no goals, but nothing happens.
The second question is how I can fine-tune movement. An agent would be ideal for something like a missile chasing an airplane. The problem is that it decelerates when reaching its target. The movement pattern is so specific. I've tried playing with the properties mass, maxSpeed, maxAcceleration, etc. Is there anything I'm missing?
The API describes agents in terms of their motivation, but in some ways they act more like physics bodies — that is, they follow Newton's First Law and stay in motion unless "motivated" to change their speed or direction.
To stop an agent when it reaches its destination, you need to make stopping be its primary goal. Check per-frame what your distance to the target is, and when you get "close enough" (whatever counts as that for your gameplay), take out the seek goal and replace it with a target-speed goal whose speed is zero.
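The per-frame check described above can be sketched independently of the framework. This is a plain Python simulation, not GameplayKit code: the `step` function, the `"seek"`/`"stop"` goal labels, the dictionary fields and all numbers are assumptions made for illustration, standing in for swapping a seek goal for a zero target-speed goal.

```python
import math

def step(agent, target, dt, arrive_radius=0.5):
    """Advance one frame; swap the 'seek' goal for 'stop' once close enough."""
    dx = target[0] - agent["pos"][0]
    dy = target[1] - agent["pos"][1]
    dist = math.hypot(dx, dy)

    # The per-frame distance check: close enough means the goal changes.
    if agent["goal"] == "seek" and dist <= arrive_radius:
        agent["goal"] = "stop"

    if agent["goal"] == "seek":
        desired = (dx / dist * agent["max_speed"], dy / dist * agent["max_speed"])
    else:
        desired = (0.0, 0.0)  # target speed of zero

    # Acceleration needed to reach the desired velocity this frame, capped.
    ax = (desired[0] - agent["vel"][0]) / dt
    ay = (desired[1] - agent["vel"][1]) / dt
    mag = math.hypot(ax, ay)
    if mag > agent["max_accel"]:
        ax, ay = ax / mag * agent["max_accel"], ay / mag * agent["max_accel"]

    agent["vel"] = (agent["vel"][0] + ax * dt, agent["vel"][1] + ay * dt)
    agent["pos"] = (agent["pos"][0] + agent["vel"][0] * dt,
                    agent["pos"][1] + agent["vel"][1] * dt)

agent = {"pos": (0.0, 0.0), "vel": (0.0, 0.0),
         "max_speed": 2.0, "max_accel": 1.0, "goal": "seek"}
target = (10.0, 0.0)
for _ in range(200):
    step(agent, target, dt=0.1)

print(agent["goal"], agent["pos"], agent["vel"])
```

Note that without a stop goal the agent keeps its velocity, which matches the Newton's-First-Law behaviour described above. With these numbers the agent also coasts past the target while braking, so in practice the "close enough" radius needs to allow for the braking distance at the current speed.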
For the "heat-seeking missile" behavior, you might try using an intercept-agent goal instead of a seek-agent goal and varying the prediction time to see how that affects the pursuit speed. (And you probably don't need the missile to remain in the scene once it gets close enough to the airplane, so if you can limit the slowdown to "within explosion distance" you can ignore it.)

What Machine Learning algorithm would be appropriate?

I am working on a predictor for learning the most likely period for grape harvesting, depending on the weather and on the characteristics of the grapes, namely sugar level, pH, and acidity. I've got two datasets and I am thinking about how to merge them: one is the pre-harvest analysis data of some Italian vineyards in the 2003-2013 period, the other is the weather over that decade. What I want to do is learn from my samples when to harvest, given a range for the optimal sugar level, pH, and acidity, and given a weather forecast.
I thought that some Reinforcement Learning approach could work. Since the pre-harvest analyses are done about 5 times during the grape maturation period, I thought that those could be the states I step through, while the weather conditions could be the "probabilities" of going from one state to another.
Yet I am not sure which algorithm would be best, as every state and every "probability" depends on several variables. I was told that a Hidden Markov Model would work, but it seems to me that my problem doesn't fit the model perfectly.
Do you have any suggestions? Thanks in advance.
This has nothing to do with the actual algorithm, but the problem you are going to run into here is that weather is extremely local. One vineyard can have completely different weather from another only a mile away, believe it or not. If you put rain gauges at each vineyard, you will find this out. To get really good results you need to have a mini weather station at each vineyard. Absent this, your best option is to use only vineyards in the immediate vicinity of the weather measurements. For example, if your data is from an airport, only use vineyards right next to the airport.
Reinforcement learning is appropriate when you can control the action. It is like a monkey pushing buttons. You push a button and get shocked, so you don't push that button again. Here you have a passive data set and cannot conduct experimental actions, so reinforcement learning does not apply.
Here you have a complex set of uncontrolled inputs, the weather data, a controlled input (harvest time), and several output parameters, sugar etc. Given that data, you want to predict what harvest time to use for some future, unknown weather pattern.
In general, what you are doing is sensitivity analysis: trying to figure out how your factors affected the outcome that occurred. The tricky part is that the outcomes may be driven by some non-obvious pattern. For example, maybe 3 weeks of drought, followed by 2 weeks of heavy rain implies the best harvest will be 65 days hence, or something like that.
So, what you have to do is featurize the data to characterize it in plausible ways, then do a sensitivity analysis. If the analysis shows a strong correlation, then you have found a solution. If it does not, then you have to find a different way to featurize the data. For example, your featurization might be the number of days with rain over 2 inches, or it might be the longest stretch of days without rain, or it might be the total number of days with bright sunshine. Possibly multiple features might combine to make a solution. The options are limited only by your imagination.
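A minimal sketch of what such featurization could look like, assuming one list of daily rainfall readings (in inches) per vineyard and season. The feature names, the 2-inch threshold default and the sample readings are illustrative assumptions; the `pearson` helper is one simple way to check how strongly a feature tracks an outcome.

```python
import math

def rain_features(daily_rain_inches, heavy_threshold=2.0):
    """Summarise one season of daily rainfall into a few candidate features."""
    heavy_days = sum(1 for r in daily_rain_inches if r > heavy_threshold)
    longest_dry = current = 0
    for r in daily_rain_inches:
        current = current + 1 if r == 0 else 0  # extend or reset the dry run
        longest_dry = max(longest_dry, current)
    return {
        "heavy_rain_days": heavy_days,
        "longest_dry_spell": longest_dry,
        "total_rain": sum(daily_rain_inches),
    }

def pearson(xs, ys):
    """Pearson correlation between one feature and one outcome across seasons."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented readings for a single short period, for illustration only.
sample = [0.0, 0.0, 2.5, 0.1, 0.0, 0.0, 0.0, 3.0]
features = rain_features(sample)
print(features)
```

In use, you would compute one feature dictionary per vineyard-season, then pass each feature column and the observed outcome (for example, the harvest day) to `pearson`. A weak correlation is the signal to try a different featurization, as described above.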
Of course, as I was saying above, the fly in the ointment is that your weather data will only roughly approximate the real and actual weather at the particular vineyard, so there will be noise in the data, possibly so much noise as to make getting a good result impossible.
Why you actually don't care too much about the weather
Getting back to the data, having unreliable weather information is actually not a problem, because you actually don't care too much about the weather. The reason is two-fold. First of all, the question you are trying to answer is not when to harvest the grapes, it is whether to wait to harvest or not. The vintner can always measure the current sugar of the grapes. So, he just has to decide, "Should I harvest the grapes now with sugar X%, or should I wait and possibly get a better sugar Z% later?" To answer this question, the real data you need is not the weather, it is a series of sugar/acidity readings taken over time. What you want to predict is whether, given a situation, the grapes will get better or whether they will get worse.
Secondly, grapevines have an optimal amount of moisture they like. If the vine gets too dry, that is bad; if it gets too wet, that is also bad. You cannot predict how moist a vine is from the weather. Some soils hold moisture well, others are sandy. A sandy vineyard will require more rain than a clay vineyard to have the same moisture levels. Also, the vintner can water his vineyards, completely invalidating the rainfall pattern. Therefore, weather is pretty much a non-factor.
I agree with Tyler that from a feasibility standpoint weather might harm your analysis. However, I think this is for you to test and find out! There could be some interesting data that comes out of it.
I'm not sure exactly what your test is, but a simple way to start is perhaps to make this into a classification problem using an SVM (or even logistic regression, since you want probabilities) and use all the data as the input for the algorithm, assuming you know which years were good harvest years and which were not. You could even test each variable individually and see how it affects your performance. I suggest you go this way if you can, just because there are massive amounts of resources on the net and people here on SO who can help you tune your algorithm.
When you have a handle on this, I would, as has already been suggested to you, try the HMM, as it will tell you which day was probably the best for the harvest. This is where the weather might hurt, but you'll come to understand more about your data from the simpler experiments.
The thing I've learned about machine learning is that while there are guidelines for when to choose which algorithm, it's not always set in stone, and you can change your question slightly and try a new approach to the problem, depending on how much freedom you have to play with the data. Good luck and have fun!

Insects following the leader - can I implement Boids algorithm for that?

I would like to illustrate how insects are following their leader in 2 dimensions.
How can I accomplish that?
Is it possible to do this with Boids algorithm?
Or maybe someone knows another algorithm, designed especially for that reason?
Boids-style algorithms should be fine for this; however, you will probably need to tweak the algorithm and experiment a bit before you get something that looks really good. You'll get something like leader/follower behaviours provided you do the following:
Get the "followers" to adjust their heading towards the "leader". Depending on how strong you want the follower effect to be you can make this effect weaker or stronger, or only apply it some of the time etc.
You may choose to either have every bot follow the same leader, or each follow a different leader. If the former, you will get a big flock following a single individual. If the latter, you will tend to get "chains" forming.
You'll probably want the ultimate leader(s) to move relatively independently. Maybe make the leader change heading randomly or even try to head "away" from the centre of the group.
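The points above can be sketched roughly as follows. This is only the leader-following part, not a full Boids implementation (no separation, alignment or cohesion rules), and the names `steer_towards`, `step`, `leader_of` and `strength` are made up for this example.

```python
import math

def steer_towards(heading, pos, leader_pos, strength):
    """Blend the current unit heading with the direction to the leader.

    strength = 0 ignores the leader, strength = 1 points straight at it.
    """
    dx, dy = leader_pos[0] - pos[0], leader_pos[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return heading  # already on top of the leader
    bx = (1 - strength) * heading[0] + strength * dx / dist
    by = (1 - strength) * heading[1] + strength * dy / dist
    norm = math.hypot(bx, by)
    if norm == 0:
        return heading  # the two directions cancelled out exactly
    return (bx / norm, by / norm)

def step(positions, headings, leader_of, speed, strength):
    """Move every insect one tick; leader_of[i] is None for a free leader."""
    new_headings = list(headings)
    for i, leader in enumerate(leader_of):
        if leader is not None:
            new_headings[i] = steer_towards(
                headings[i], positions[i], positions[leader], strength)
    new_positions = [(p[0] + h[0] * speed, p[1] + h[1] * speed)
                     for p, h in zip(positions, new_headings)]
    return new_positions, new_headings

# Everyone follows insect 0: tends to give one big flock.
single_leader = [None, 0, 0, 0]
# Each insect follows the one before it: tends to give chains.
chain = [None, 0, 1, 2]

positions = [(0.0, 0.0), (-1.0, 1.0), (-2.0, -1.0), (-3.0, 0.5)]
headings = [(1.0, 0.0)] * 4
for _ in range(100):
    positions, headings = step(positions, headings, chain, speed=0.1, strength=0.2)
print(positions)
```

Here the leader simply keeps flying straight; to make it move more independently you could perturb its heading a little each tick. Lower `strength` values give looser, lazier followers, and switching `chain` for `single_leader` changes the overall shape of the group.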
