Delivery Assignment Algorithm - algorithm

I have a scenario where deliveries should be managed properly with optimized route.
Let me explain the scenario in detail;
We are dealing with a company who are doing the furniture sales. There may be more than 4000 to 5000 deliveries per day and it is pretty difficult for them to assign each delivery to each vehicle and the fuel cost also not able to control. The overtime allowance also coming very high as they are not properly scheduling the deliveries. So they required an application which will handle these situations.
The application should handle following scenarios;
All the deliveries to be done.
The system should findout how many vehicles required for delivering the items with the following input parameters.
Input 1: The delivery location (Latitude and Longitude).
Input 2: Percentage of space used by the item in Vehicle
Input 3: The FIXING TIME (For items dismantled; which will be refixing at delivery
location)
From the above parameters the system should calculate how many vehicles required and which all deliveries to be assigned to which all vehicles.
The Vehicle working time should be 8 hours (TRAVEL TIME + FIXING TIME).
I have tried this with the help of HUNGARIAN algorithm but failed to get the proper output.
Please suggest me which algorithm will be suitable for my scenario.
Shenu Lal

Related

How to optimize events scheduling with unknown future events?

Scenario:
I need to give users opportunity to book different times for the service.
Caveat is that i dont have bookings in advance but i need to fill them as they come in.
Bookings can be represented as keyvalue pairs:
[startTime, duration]
So, for example, [9,3] would mean event starts at 9 o’clock and has duration of 3 hours.
Rules:
users come in one by one, there is never a batch of users requests
no bookings can overlap
service is available 24/7 so no need to worry about “working time”
users choose duration on their own
obviously, once user chooses&confirms his booking we cannot shuffle it anymore
we dont want gaps to be lesser than some amount of time. this one is based on probability that future users will fill in the gap. for example, if distribution of durations over users bookings is such that probability for future users filling the gap shorter than x hours is less than p then we want a rule that gap cannot be shorter than x. (for purpose of this question, we can assume x being hardcoded, here i just explain reasons)
the goal is to have service-busy-duration maximized
My thinking so far...
I keep the list of bookings made so far
I also keep track of gaps (as they are potential slots for new users booking)
When new user comes with his booking [startTime, duration] i first check for ideal case where gapLength = duration. if there is no such gaps, i find all slots (gaps) that satisfy condition gapLength - duration > minimumGapDuration and order them in descending order by that gapLength - duration value
I assign user to the first gap with maximum value of gapLength - duration since that gives me highest probability that gap remaining after this booking will also get filled in future
Questions:
Are there some problems with my approach that i am missing?
Are there some algorithms solving this particular problem?
Is there some usual approach (good starting point) which i could start with and optimize later? (i am actually trying to get enough infos to start but not making some critical mistake; optimizations can/should come later)
PS.
From research so far it sounds this might be the case for constraint programming. I would like to avoid it if possible as i have no clue about it (maybe its simple, i just dont know) but if it makes a real difference, i will go for its benefits and implement it.
I went through stackoverflow for similar problems but didnt find one with unknown future events. If there is such and this is direct duplicate, please refer to it.

How to optimize employee transport service?

Gurus,
I am in the process of writing some code to optimize employee transport for corporate. I need all you expert's advice on how can this be achieved. Here is my scenario.
There are 100 pick up points all over city from where employees need to be brought to company with multiple vehicles. Each vehicle can occupy say 4 or 6 employees. My objective is to write some code which will group the people from nearby areas and bring them to company. Master data will have addresses and its latitude/longitude. I want to build an algorithm to optimize vehicle occupancy as well as distance and time. Could you guys please give some directions how can this be achieved. I understand I may need to use google maps or direction API for this but looking for some logic hint/advice how this can be achieved.
Some more inputs: These vehicles are of company's vehicle with driver. Travel time should not be more than 1.5 Hrs.
Thanks in advance.
Your problem description is a more complicated version of "The travelling salesman problem". You can look it up on and find some different examples and how they are implemented.
One point that need to be clarified : the vehicules to be use will be the employee vehicule that will be carshared or it will be company's vehicule with a driver?
You also need to define some time constrain. For example 50 employees should have under 30 min travel, 40 employe under 1h travel and 10 employe under 1,5H.
You also need to define the travel time for each road depending of the time, because at different time there will be traffic jam or not.
You also need to define group within the employee: usually people (admin clerck or CEO) in a company don't commute at the same time, it can have 1 hour range or more.
In fine, don't forget to include about 10% of the employee that wil be 2 to 5 min late to their meeting point.

A fast algorithm for determining timeshare requests

I am writing a program that will simulate how members of a certain timeshare will request their apartments. The apartments are only available for certain 'events' throughout the year, and the events each have different durations (counted in days). For each event, there are a number of apartments available, which are divided into groups according to cost (points per day).
We may also have the situation that some of the apartments are already rented out, so it is possible that certain events cannot admit a certain apartment type.
Now I want a member to request apartments according to this strategy:
Maximizing the number of total event days is first priority.
Maximizing the number of points spent is second priority (so the user requests the best possible apartment that he/she can afford and still have as many days as possible).
The total cost of the requested apartments cannot exceed the total amount of available points for that user.
Obviously, if there are no more apartments of a given type for a certain event, the user should not request that particular apartment/event combo.
I am wondering whether the problem is similar to the problem described here:
http://en.wikipedia.org/wiki/Hungarian_algorithm
in that it (I suppose) possible to frame this as a matrix problem where each entry has a cost associated with it.
However, the difference is that for my problem, I am allowed to use the same apartment for several events - it's not 'spent' once it has been used for one event. Also, the cost per entry is not really one-dimensional, since each event/apartment combo both has a number of days associated with it and a number of points - both of which should be maximized (but with priority given to the number of days).
As an example, let's say there are three apartment types, costing 75, 100, and 125 points per day, and three events, with a duration of 2, 10, and 4 days. Let's further say many of the apartments are taken, so the availability matrix looks like this:
cost
75 100 125
2 True False True
days 10 False False True
4 True False True
Let's also say the user has 1250 points available. The solution, in this case, would be that the user requests the 10-day event with the 125-point apartment and nothing else.
The brute-force way of doing this would perhaps be a recursive algorithm:
Let n be the number of events you are currently trying
Find all combinations of events and apartments, and calculate the combo that maximizes the number of days, then the number of points spent (this will both include all permutations of n events, but also the number of ways 3 apartment types can be assigned to n events).
Let n=n-1
This will quickly become overwhelming when the number of events goes up, I think, so I am wondering whether there are any algorithms that can solve this in a less expensive way?
If you have access to a library for http://en.wikipedia.org/wiki/Integer_programming you could try throwing this at it.
Even if you have only one choice of cost for each event, so that you are just trying to chose combinations of events that cover as many total days as possible without going over budget, I think this reduces to http://en.wikipedia.org/wiki/Knapsack_problem. This means that it is unlikely to be solved exactly by a worst case polynomial time algorithm such as the Hungarian algorithm.

Best practices for overall rating calculation

I have LAMP-based business application. SugarCRM to be more precise. There are 120+ active users at the moment. Every day each user generates some records that are used in complex calculation to get so called “individual rating”.
It takes for about 6 seconds to calculate one “individual rating” value. And there was not a big problem before: each user hits the link provided to start “individual rating” calculations, waits for 6-7 seconds, and get the value displayed.
But now I need to implement “overall rating” calculation. That means that additionally to “individual rating” I have to calculate and display to the user:
minimum individual rating among ALL the users of the application
maximum individual rating among ALL the users of the application
current user position in the range of all individual ratings.
Say, current user has individual rating equal to 220 points, minimum value of rating is 80, maximum is 235 and he is on 23rd position among all the users.
What are (imho) the main problems to be solved?
If one calculation lasts for 6 seconds, that overall calculations will take more than 10 minutes. I think it’s no good to make the application almost unaccessible for this period. And what if the quantity of users will rise in the nearest future 2-3 times?
Those calculations could be done as nightly job but all the users are in different timezones. In Russia difference between extreme timezones is 9 hours. So people in west part of Russia are still working in “today”. While people in eastern part is waking up to work in “tomorrow”. So what is the best time for nightly job in this case?
Are there any best practices|approaches|algorithms to build such rating system?
Given only the information provided, the only options I see:
The obvious one - reduce the time taken for a rating calculation (6 seconds to calculate 1 user's rating seems like a lot)
If possible, have intermediate values which you only recalculate some of, as required (for example, have 10 values that make up the rating, all based on different data, when some of the data changes, flag the appropriate values for recalcuation). Either do this recalculation:
During your daily recalculation or
When the update happens
Partial batch calculation - only recalculate x of the users' ratings at chosen intervals (where x is some chosen value) - has the disadvantage that, at all times, some of the ratings can be out of date
Calculate if not busy - either continuously recalculate ratings or only do so at a chosen interval, but instead of locking the system, have it run as a background process, only doing work if the system is idle
(Sorry, didn't manage with "long" comment posting; so decided to post as answer)
#Dukeling
SQL query that takes almost all the time for calculation mentioned above is just a replication of business logic that should be executed in PHP code. The logic was moved into SQL with the hope to reduce calculation time. OK, I’ll try both to optimize SQL query and play with executing logic in PHP code.
Suppose after that optimized application calculates individual rating for just 1 second. Great! But even in this case the first user logged into system should awaits for 120 seconds (120+ users * 1 sec = 120 sec) to calculate overall rating and gets its position in it.
I’m thinking of implementing the following approach:
Let’s have 2 “overall ratings” – “today” and “yesterday”.
For displaying purposes we’ll use “yesterday” overall rating represented as huge already sorted PHP array.
When user hits calculation link he started “today” calculation but application displays him “yesterday” value. Thus we have quickly accessible “yesterday” rating and each user randomly launches rating calculation that will be displayed for them tomorrow.
User list are partitioned by timezones. Each hour a cron job started to check if there’re any users in selected timezone that don’t have “today” individual rating calculated (e.g. user didn’t log into application). If so, application starts calculation of individual rating and puts its value in “today” (still invisible) ovarall rating array. Thus we have a cron job that runs nightly for each timezone-specific user group and fills the probable gaps in case users didn’t log into system.
After all users in all timezones had been worked out, application
sorts “today” array,
drops “yesterday” one,
rename “today” in “yesterday” and
initialize new “today”.
What do you think of it? Is it reasonable enough or not?

Algorithm/Heuristic for grouping chat message histories by 'conversation'/implicit sessions from time stamps?

The problem: I have a series of chat messages -- between two users -- with time stamps. I could present, say, an entire day's worth of chat messages at once. During the entire day, however, there were multiple, discrete conversations/sessions...and it would be more useful to the user to see these divided up as opposed to all of the days as one continuous stream.
Is there an algorithm or heuristic that can 'deduce' implicit session/conversation starts/breaks from time stamps? Besides an arbitrary 'if the gap is more than x minutes, it's a separate session'. And if that is the only case, how is this interval determined? In any case, I'd like to avoid this.
For example, there are...fifty messages that get sent between 2:00 and 3:00, and then a break, and then twenty messages sent between 4:00 and 5:00. There would be a break inserted between there...but how would the break be determined?
I'm sure that there is already literature on this subject, but I just don't know what to search for.
I was playing around with things like edge detection algorithms and gradient-based approaches for a while.
(see comments for more clarification)
EDIT (Better idea):
You can view each message as being of two types:
A continuation of a previous conversation
A brand new conversation
You can model these two types of messages as independent Poisson processes, where the time difference between adjacent messages is an exponential distribution.
You can then empirically determine the exponential parameters for these two types of messages by hand (wouldn't be too hard to do given some initial data). Now you have a model for these two events.
Finally when a new message comes along, you can calculate the probability of the message being of type 1 or type 2. If type 2, then you have a new conversation.
Clarification:
The probability of the message being a new conversation, given that the delay is some time T.
P(new conversation | delay=T) = P(new conversation AND delay=T)/P(delay=T)
Using Bayes' Rule:
= P(delay=T | new conversation)*P(new conversation)/P(delay=T)
The same calculation goes for P(old conversation | delay=T).
P(delay=T | new conversation) comes from the model. P(new conversation) is easily calculable from the data used to generate your model. P(delay=T) you don't need to calculate at all since all you want to do is compare the two probabilities.
The difference in timestamps between adjacent messages depends on the type of conversation and the people participating. Thus you'll want an algorithm that takes into account local characteristics, as opposed to a global threshold parameter.
My proposition would be as follows:
Get the time difference between the last 10 adjacent messages.
Compute the mean (or median)
If the delay until the next message is more than 30 times the the mean, it's a new conversation.
Of course, I came up with these numbers on the spot. They would have to be tuned to fit your purpose.

Resources