Algorithm for picking orders from warehouses

I'll explain my problem with an example.
Let's say we have:
An order from a certain store for five products. We will name those products A, B, C, D and E, with these quantities in the order: A(19), B(25), C(6), D(33), E(40).
A single truck that can fit a different amount of each product: A(30), B(40), C(25), D(50), E(30).
Example: when transporting A and B together, loading the truck with A(19) uses roughly two thirds of what the truck can handle. That leaves about one third for B, which means I can only transport one third of B's maximum truck capacity (40/3 ≈ 13).
A set of warehouses, each containing different amounts of each product.
I made an Excel spreadsheet that contains more useful information about those warehouses (quantities, distance from each other, distance from the store).
I want to deliver this order to the store with the fewest trips and the least distance traveled.
Is there an algorithm for this kind of problem, or something close that I can modify?

I would advise against reinventing the wheel as the very first step of your work. Developing or adapting a custom algorithm for such a problem would be a very painful venture, in my opinion. I would suggest using either a constraint satisfaction (CSP) toolkit or a mixed integer programming (MIP) solver directly.
My point is that it would be much easier to encode your problem using such tools. If the performance or accuracy turns out not to be good enough, you could then design a custom solution based on your preliminary results.
For CSP I would suggest MiniZinc, which has decent documentation and examples.
You could start your MIP research with GLPK. It's not very powerful, but it's definitely capable of dealing with some toy examples.
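Whichever tool is used, the model has to encode the shared-capacity rule from the question's example. A minimal Python sketch of that rule follows, assuming capacity is shared linearly (each unit of a product uses 1/capacity of the truck). The function names are hypothetical. With exact fractions, 19 units of A leave 11/30 of the truck, so 14 units of B fit rather than the rounded 13 in the question.

```python
import math
from fractions import Fraction

# Truck capacity per product when the truck carries only that product
# (figures from the question).
CAPACITY = {"A": 30, "B": 40, "C": 25, "D": 50, "E": 30}
ORDER = {"A": 19, "B": 25, "C": 6, "D": 33, "E": 40}


def load_fraction(load):
    """Fraction of one truck used by a mixed load {product: quantity}.

    Assumes capacity is shared linearly between products.
    """
    return sum(Fraction(qty, CAPACITY[p]) for p, qty in load.items())


def fits(load):
    """True if the mixed load fits in a single trip."""
    return load_fraction(load) <= 1


def max_additional(load, product):
    """Largest whole quantity of `product` that still fits next to `load`."""
    free = 1 - load_fraction(load)
    return max(0, math.floor(free * CAPACITY[product]))


def min_trips(order):
    """Lower bound on the number of trips, ignoring warehouse stock."""
    return math.ceil(load_fraction(order))
```

Under this assumption the whole order needs about 3.49 truckloads, so `min_trips(ORDER)` gives a lower bound of four trips. Warehouse stock and distances would be further constraints and objective terms in the CSP or MIP model.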

Algorithms for Minimum resource requirements

I have a question for which I have made some solutions, but I am not happy with the scalability. I'm looking for input of some different approaches / algorithms to solving it.
Problem:
Software can run on electronic controllers (ECUs) and requires
different resources to run a given feature. It may require a given
amount of storage or RAM or a digital or Analog Input or Output for
instance. If we have multiple features and multiple controller options
we want to find the combination that minimizes the hardware
requirements (cost). I'll simplify the resources to letters to
simplify the understanding.
Example 1:
Feature1(A)
ECU1(A,B,C)
First, a trivial example. Let's assume that a feature requires 1 unit of resource A, and the ECU has 1 unit each of resources A, B and C available. It is obvious that the feature will fit in the ECU, with resources B and C left over.
Example 2:
Feature2(A,B)
ECU2(A|B,B,C)
In this example, Feature 2 requires resources A and B, and the ECU has 3 resources, the first of which can be A or B. You can again see that the feature will fit in the ECU, but only if you check in a certain order. If you assign F(A) to E(A|B), then F(B) to E(B), it works; but if you assign F(B) to E(A|B), there is no resource left on the ECU for F(A), so it doesn't appear to fit. This leads to the observation that we should prefer non-OR'd resources first to avoid such a conflict.
A real-world example of the above: an analog input can also be used as a digital input.
Example 3:
Feature3(A,B,C)
ECU3(A|B|C, B|C, A|C)
Now things are a little bit more complicated, but it is still quite obvious to a person that the feature will fit into the ECU.
My problems are simply scaled-up versions of these examples (i.e. multiple features per ECU, with more ECUs to choose from).
Algorithms
GA
My first approach to this was to use a genetic algorithm. For a given set of features, e.g. F(A,B,C,D), and a list of currently available ECUs, find which single ECU or combination of ECUs fits the requirements.
ECUs would initially be selected at random, and each feature was checked for fit and, if it fitted, added to an ECU. If a feature didn't fit, another ECU was added to the architecture. A population of these architectures was created and ranked by the lowest cost of housing all the features. Architectures could then be mated in successive generations, with mutations and such, to improve fitness.
This approach worked quite well, but tended to get stuck in local minima (not the cheapest option), judging by a golden example I had worked out by hand.
Combinatorial / Permutations
My next approach was to work out all of the possible permutations (the ORs from above) for an ECU to see if the features fit.
If we go back to example 2 and expand the ORs we get 2 permutations;
Feature2(A,B)
ECU2(A|B,B,C) = (A,B,C), (B,B,C)
From here it is trivial to check that the feature fits in the first permutation, but not the second.
...and for example 3 there are 12 permutations
Feature3(A,B,C)
ECU3(A|B|C, B|C, A|C) = (A,B,A), (B,B,A), (C,B,A), (A,C,A), (B,C,A), (C,C,A), (A,B,C), (B,B,C), (C,B,C), (A,C,C), (B,C,C), (C,C,C)
Again it is trivial to check that feature 3 fits in at least one of the permutations (3rd, 5th & 7th).
With this approach I was also able to get a solution, but I have ECUs with so many OR'd inputs that there are millions of ECU permutations, which drastically increased the run time (to minutes). I can live with this, but first wanted to see if there was a better way to skin the cat, apart from parallelizing this approach.
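The expansion described above can be sketched in Python with `itertools.product`. This is a minimal illustration, not the implementation used in the question; the tuple-of-alternatives encoding of an ECU and the names `expand`, `fits_configuration` and `fits` are assumptions.

```python
from collections import Counter
from itertools import product

# Each ECU resource is a tuple of the letters it can act as (assumed encoding).
ECU2 = [("A", "B"), ("B",), ("C",)]
ECU3 = [("A", "B", "C"), ("B", "C"), ("A", "C")]


def expand(ecu):
    """Return every concrete configuration obtained by resolving the ORs."""
    return list(product(*ecu))


def fits_configuration(feature, configuration):
    """True if the configuration supplies every resource the feature needs."""
    missing = Counter(feature) - Counter(configuration)
    return not missing


def fits(feature, ecu):
    """True if the feature fits at least one expanded configuration."""
    return any(fits_configuration(feature, c) for c in expand(ecu))
```

The number of configurations is the product of the number of alternatives per resource (2 for ECU2, 3 x 2 x 2 = 12 for ECU3), which is why the count reaches millions once many inputs are OR'd.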
So that is the problem...
I have more ideas on how to approach it, but I assume that there is a fancy name for such a problem, or an algorithm that has been around for 20+ years that I'm not familiar with. I was hoping someone could point me in that direction, either to some papers or to the names of relevant algorithms.
The obvious suggestion of simply summing the feature resource requirements and creating a new monolithic ECU is not an option. Lastly, no, this is not in any way associated with any assignment or problem given by a school or university.
Sorry for the long question, but hopefully I've sufficiently described what I am trying to do and this piques the interest of someone out there.
Sincerely, Paul.

It looks like fitting an individual feature can be solved as bipartite matching.
Build a bipartite graph:
the left side corresponds to the feature's requirements
the right side corresponds to the ECU's sub-nodes (its individual resources)
edges connect each pair of left and right vertices that share a letter
Let me explain using example 2:
Feature2(A,B)
ECU2(A|B,B,C)
How the graph looks:
2 left vertices: L1 (A), L2 (B)
3 right vertices: R1 (A|B), R2 (B), R3 (C)
3 edges: L1-R1 (A-A|B), L2-R1 (B-A|B), L2-R2 (B-B)
Then you find a maximum matching for this unweighted bipartite graph. There are a few well-known algorithms for this:
https://en.wikipedia.org/wiki/Matching_(graph_theory)
If the maximum matching covers every feature vertex, we can use it to place the feature.
If the maximum matching does not cover every feature vertex, we are short of resources.
Unfortunately, applied one feature at a time, this approach behaves like a greedy algorithm. It does not know about upcoming features and does not adjust the solution to fit more features later. Partial optimizations for simple cases can work as you described in the question, but in general it's a dead end: only an algorithm that accounts for every feature in the whole feature set can produce an effective overall solution.
You can try to add several features to one ECU simultaneously. If you want to add a new feature to a given ECU, run the matching for all already-assigned features plus the candidate feature. This finds a locally optimal solution for the given feature set (if it is possible to fit them all on one ECU).
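The matching check can be sketched with the classic augmenting-path method (often called Kuhn's algorithm), one of the well-known algorithms for bipartite matching. This is a minimal sketch under assumed encodings: requirements are a list of letters, ECU resources are sets of letters, and the function names are hypothetical.

```python
def max_matching(requirements, slots):
    """Maximum bipartite matching via augmenting paths.

    requirements: resource letters the feature(s) need, e.g. ["A", "B"]
    slots: one set of letters per ECU resource, e.g. [{"A", "B"}, {"B"}, {"C"}]
    Returns a dict mapping slot index -> requirement index.
    """
    match = {}  # slot index -> index of the requirement using that slot

    def try_assign(r, seen):
        for s, options in enumerate(slots):
            if requirements[r] in options and s not in seen:
                seen.add(s)
                # Take a free slot, or move the current occupant elsewhere.
                if s not in match or try_assign(match[s], seen):
                    match[s] = r
                    return True
        return False

    for r in range(len(requirements)):
        try_assign(r, set())
    return match


def feature_fits(requirements, slots):
    """True if every requirement can be given its own ECU resource."""
    return len(max_matching(requirements, slots)) == len(requirements)
```

Unlike the order-dependent check in the question, `feature_fits(["B", "A"], [{"A", "B"}, {"B"}, {"C"}])` still succeeds, because the augmenting step moves B from the A|B resource to the plain B resource. To test several features on one ECU together, concatenate their requirement lists before calling `feature_fits`.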

I don't have enough reputation to comment, so here is what I wanted to propose for your problem:
Like GA, there are some other random-based approaches too, e.g. the Bayesian approach, decision trees, etc.
In my opinion a decision tree will suit your problem because, given some input dataset/attributes, it shows a path to each class (in your case, the ECUs) that helps to select the right class/ECU. Train your system with some sample data sets so that it can decide on the right ECU for your actual data set/features.
Check Decision Trees - Machine Learning for more information. Hope it helps!

How is IBM Watson Tradeoff Analytics any different from simple constrained decision making?

I am continuously astounded by the technological genius of the IBM Watson package. The tools do things from recognizing the subjects in images to extracting the emotion in a letter, and they're amazing. And then there's Tradeoff Analytics. In their Nests demo, you select a state and then a series of constraints (price must be between W and X, square footage must be between Y and Z, there must be Insured Escrow financing available, etc.) and they rank the houses based on how well they fit your constraints.
It would seem that all Tradeoff Analytics does is run a simple query on the order of:
SELECT * FROM House WHERE price >= W AND price <= X AND square_footage >= Y
AND square_footage <= Z AND ...
Am I not understanding Tradeoff Analytics correctly? I have tremendous respect for the people over at IBM that built all of these amazing tools, but Tradeoff Analytics seems like simple constrained decision making, which appears in any Intro to Programming course as you're learning if statements. What am I missing?

As @GuyGreer pointed out, the service indeed uses Pareto optimization, which is quite different from simple constraints.
For example:
Say you have three houses
        Sqr Footage  Price
HouseA  6000         1000K
HouseB  9000         750K
HouseC  8000         800K
Now say your constraints are Sqr Footage > 5000 and Price < 900K.
Then you are left with HouseB and HouseC.
Tradeoff Analytics will return only HouseB,
since according to Pareto, given your objectives of price and footage,
HouseB dominates HouseC: it has larger footage and is cheaper.
Obviously, this is a made-up example, and in real life there are more objectives (attributes) that you take into account when you buy a house.
The idea with Pareto is to find the Pareto frontier.
Tradeoff Analytics adds home-grown algorithms on top of Pareto optimization to give you more insight into the tradeoffs.
Finally, the service is accompanied by a client-side widget that uses a novel method for visualizing Pareto frontiers, which is a sophisticated problem in its own right given that such a frontier is multi-dimensional.
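The difference between filtering and Pareto dominance in the house example can be sketched in a few lines of Python. This only illustrates the general idea of a Pareto frontier with two objectives; it is not the service's actual implementation, and the field and function names are made up for the example.

```python
# Houses from the example above; price is in thousands.
HOUSES = [
    {"name": "HouseA", "sqft": 6000, "price": 1000},
    {"name": "HouseB", "sqft": 9000, "price": 750},
    {"name": "HouseC", "sqft": 8000, "price": 800},
]


def dominates(a, b):
    """a dominates b: no worse on both objectives, strictly better on one."""
    no_worse = a["sqft"] >= b["sqft"] and a["price"] <= b["price"]
    better = a["sqft"] > b["sqft"] or a["price"] < b["price"]
    return no_worse and better


def pareto_front(houses):
    """Keep only the houses that no other house dominates."""
    return [h for h in houses if not any(dominates(o, h) for o in houses)]


# Step 1: plain constraint filtering, as in the SQL query from the question.
feasible = [h for h in HOUSES if h["sqft"] > 5000 and h["price"] < 900]
# Step 2: Pareto filtering among the feasible houses.
front = pareto_front(feasible)
```

Here `feasible` contains HouseB and HouseC, while `front` contains only HouseB. With more objectives the frontier usually contains several mutually non-dominated options, which is where the tradeoff starts.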

The page you link to says they use Pareto optimisation, which tries to optimise all the parameters to arrive at a Pareto-optimal solution: a solution, or set of solutions, for the case where you can't optimise every individual parameter at once and so have to settle for sub-optimal values on some of them.
Rather than just finding anything that matches the criteria, they are trying to find some sort of optimal solution(s) given the constraints. That's how it differs from simple constrained decision-making.
Note I'm basing this answer completely off of their statement:
The service uses a mathematical filtering technique called “Pareto Optimization,”...
and what I've read about Pareto problems. I have no experience with this technology or Pareto problems myself.

Algorithm for optimal packing with known inventory

Hospitals are changing the way they sterilize their equipment. Previously the local surgeons kept all their own equipment and made their own surgery trays. Now they have to conform to a country-wide standard. They want to know how many of the new trays they can make from their existing stock, and how much new equipment they need to buy.
The inventory of medical equipment looks like this:
http://pastebin.com/rstWSurU
Each hospital has codes for its various medical equipment, followed by a number for how many of the corresponding item it has.
Three surgery trays with their corresponding items are shown in this dictionary:
http://pastebin.com/bUAZhanK
There are a total of 144 different operation trays.
The hospitals will be told they need 25 of tray x, 30 of tray y, etc.
They would like to maximize the number of trays they can finish with their current stock. They would also like to know what equipment they need to purchase in order to finish the remaining trays.
I have thought about two possible solutions. One is to represent the problem as a linear programming problem. The other is to solve the first 90% of the problem with a round-robin brute-force pass, then solve the remaining 10% by running a randomized algorithm several times and picking the best of those tries.
I would love to hear if anyone knows a smart way of how to tackle this problem!

If I understand this correctly we can optimize for each hospital separately. My guess is that the following would be a good start for an MIP (Mixed Integer Programming) model:
I use the following indices: i is items and t is trays. x(t,i) indicates how many items we assign to each tray type. y(t) counts the number of trays of each type that we can compose using the available items. From the solution we can calculate the shortages that we need to order.
Of course, we are just maximizing the number of trays we can make. There is no consideration of balancing (we might get many trays of one type and few or none of another). I mitigate this a little by not allowing more trays to be created than are required (if we have more items, they need to go to other types of trays). This requirement is formulated as an upper bound on y(t).
For large problems we can restrict the (t,i) combinations to the ones that are possible. This will make the model smaller. When using precise math notation:
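The formulas themselves are not preserved in this text. One formulation consistent with the description above is sketched below; it is a reconstruction under assumptions, not necessarily the original model. The symbols $r_{t,i}$ (number of items $i$ in one tray of type $t$), $s_i$ (stock of item $i$) and $d_t$ (required number of trays of type $t$) are assumed names.

```latex
\begin{align*}
\max \quad & \sum_{t} y_t \\
\text{s.t.} \quad & x_{t,i} = r_{t,i}\, y_t && \forall (t,i) \text{ with } r_{t,i} > 0 \\
& \sum_{t} x_{t,i} \le s_i && \forall i \\
& 0 \le y_t \le d_t, \quad y_t \in \mathbb{Z} && \forall t
\end{align*}
```

After solving, the shortage to order for each item could be computed from the trays still missing and the stock left over:

```latex
\text{shortage}_i = \max\Bigl(0,\; \sum_{t} r_{t,i}\,(d_t - y_t) - \bigl(s_i - \sum_{t} x_{t,i}\bigr)\Bigr)
```

Restricting the first constraint to pairs with $r_{t,i} > 0$ is what keeps the model small. Replacing $x_{t,i}$ by $r_{t,i}\, y_t$ in the stock constraint gives $\sum_t r_{t,i}\, y_t \le s_i$, which removes the $x$ variables altogether.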
A further optimization would be to substitute out the variables x(t,i).
Adding the shipping of surplus items to other hospitals would make the model more difficult. In that case we could end up with a model that needs to look at all hospitals simultaneously. That may be an interesting case for some decomposition approach.

Which data mining algorithm would you suggest for this particular scenario?

This is not directly a programming question; it's about selecting the right data mining algorithm.
I want to infer the age of people from their first names, from the region they live in, and from whether they have an internet product or not. The idea behind it is that:
there are names that are old-fashioned or popular in a particular decade (celebrities, politicians etc.) (this may not hold in the USA, but in the country of interest that's true),
young people tend to live in highly populated regions whereas old people prefer countrysides, and
Internet is used more by young people than by old people.
I am not sure if those assumptions hold, but I want to test that. So what I have is 100K observations from our customer database with
approx. 500 different names (nominal input variable with too many classes)
20 different regions (nominal input variable)
Internet Yes/No (binary input variable)
91 distinct birthyears (numerical target variable with range: 1910-1992)
Because I have so many nominal inputs, I don't think regression is a good candidate. Because the target is numerical, I don't think a decision tree is a good option either. Can anyone suggest a method that is applicable to such a scenario?

I think you could design discrete variables that reflect the split you are trying to determine. It doesn't seem like you need a regression on their exact age.
One possibility is to cluster the ages, and then treat the clusters as discrete variables. Should this not be appropriate, another possibility is to divide the ages into bins of equal distribution.
One technique that could work very well for your purposes is, instead of clustering or partitioning the ages directly, to cluster or partition the average age per name. That is to say, generate a list of the average age for each name, and work with this instead. (There may be some statistical problems in the classifier if the discrete categories here are too fine-grained, though.)
However, the best case is if you have a clear notion of what age range you consider appropriate for 'young' and 'old'. Then, use these directly.
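The "average per name, then partition" idea can be sketched as follows. This is a minimal illustration using birth years, as in the question's data; the sample rows, the bin edges and the function names are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_year_by_name(rows):
    """Average birth year for each first name; rows are (name, year) pairs."""
    years = defaultdict(list)
    for name, year in rows:
        years[name].append(year)
    return {name: mean(values) for name, values in years.items()}


def to_bin(value, edges):
    """Index of the bin `value` falls into, given ascending bin edges."""
    return sum(value >= edge for edge in edges)


# Hypothetical sample data and bin edges, for illustration only.
rows = [("Gertrude", 1950), ("Gertrude", 1940), ("Jennifer", 1985)]
averages = mean_year_by_name(rows)
name_bins = {name: to_bin(avg, [1960, 1980]) for name, avg in averages.items()}
```

Each name is then represented by a small discrete category (here 0, 1 or 2) instead of one of roughly 500 nominal values.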

New answer
I would try using regression, but in the manner that I specify. I would try binarizing each variable (if this is the correct term). The Internet variable is binary, but I would make it into two separate binary values. I will illustrate with an example because I feel it will be more illuminating. For my example, I will just use three names (Gertrude, Jennifer, and Mary) and the internet variable.
I have 4 women. Here are their data:
Gertrude, Internet, 57
Jennifer, Internet, 23
Gertrude, No Internet, 60
Mary, No Internet, 35
I would generate a matrix, A, like this (each row represents a respective woman in my list):
[[1,0,0,1,0],
[0,1,0,1,0],
[1,0,0,0,1],
[0,0,1,0,1]]
The first three columns represent the names and the latter two Internet/No Internet. Thus, the columns represent
[Gertrude, Jennifer, Mary, Internet, No Internet]
You can keep doing this with more names (500 columns for the names), and for the regions (20 columns for those). Then you will just be solving the standard linear algebra problem A*x=b where b for the above example is
b=[[57],
[23],
[60],
[35]]
You may be worried that A will now be a huge matrix, but it is a huge, extremely sparse matrix and thus can be stored very efficiently in sparse matrix form. Each row has three 1's in it (one name, one region, one Internet value) and the rest are 0. You can then just solve this with a sparse matrix solver. You will want to do some sort of correlation test on the resulting predicted ages to see how effective it is.
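Building the matrix from the four-woman example can be sketched in plain Python. The function name `one_hot_rows` is hypothetical, and regions are left out to match the example above.

```python
def one_hot_rows(records, names, internet_levels):
    """Build the 0/1 design matrix: one column per name and Internet value."""
    columns = list(names) + list(internet_levels)
    index = {column: j for j, column in enumerate(columns)}
    rows = []
    for name, internet, _age in records:
        row = [0] * len(columns)
        row[index[name]] = 1       # exactly one name column is set
        row[index[internet]] = 1   # exactly one Internet column is set
        rows.append(row)
    return columns, rows


records = [
    ("Gertrude", "Internet", 57),
    ("Jennifer", "Internet", 23),
    ("Gertrude", "No Internet", 60),
    ("Mary", "No Internet", 35),
]
columns, A = one_hot_rows(
    records, ["Gertrude", "Jennifer", "Mary"], ["Internet", "No Internet"]
)
b = [age for _name, _internet, age in records]
```

Note that the name columns and the Internet columns each sum to a column of ones, so the columns of A are linearly dependent. A*x=b should therefore be solved in the least-squares sense (or with one column per group dropped) rather than by direct inversion.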

You might check out the babynamewizard. It shows the changes in name frequency over time and should help convert your names to a numeric input. Also, you should be able to use population density from census.gov data to get a numeric value associated with your regions. I would suggest an additional flag regarding the availability of DSL access - many rural areas don't have DSL coverage. No coverage = less demand for internet services.
My first inclination would be to divide your response into two groups, those very likely to have used computers in school or work and those much less likely. The exposure to computer use at an age early in their career or schooling probably has some effect on their likelihood to use a computer later in their life. Then you might consider regressions on the groups separately. This should eliminate some of the natural correlation of your inputs.

I would use a classification algorithm that accepts nominal attributes and numeric class, like M5 (for trees or rules). Perhaps I would combine it with the bagging meta classifier to reduce variance. The original algorithm M5 was invented by R. Quinlan and Yong Wang made improvements.
The algorithm is implemented in R (library RWeka)
It also can be found in the open source machine learning software Weka
For more information see:
Ross J. Quinlan: Learning with Continuous Classes. In: 5th Australian Joint Conference on Artificial Intelligence, Singapore, 343-348, 1992.
Y. Wang, I. H. Witten: Induction of model trees for predicting continuous classes. In: Poster papers of the 9th European Conference on Machine Learning, 1997.

I think slightly differently from you: I believe that trees are excellent algorithms for dealing with nominal data, because they help you build a model that you can easily interpret, and identify the influence of each of these nominal variables and its different values.
You can also use regression with dummy variables to represent the nominal attributes; this is also a good solution.
You can also use other algorithms such as SVM (SMO), after first transforming the nominal variables into binary dummy variables, the same as for regression.

Algorithm for creating a school timetable

I've been wondering if there are known solutions for the problem of creating a school timetable algorithmically. Basically, it's about optimizing "hour-dispersion" (for both teachers and classes) for given class-subject-teacher associations. We can assume that the input consists of sets of classes, lesson subjects and teachers associated with each other, and that the timetable should fit between 8AM and 4PM.
I guess that there is probably no exact algorithm for that, but maybe someone knows a good approximation or has hints for developing one.

This problem is NP-Complete!
In a nutshell, in the worst case one needs to explore all possible combinations to find the list of acceptable solutions. Because the circumstances in which the problem appears vary from school to school (for example: Are there constraints with regard to classrooms? Are some of the classes split into sub-groups some of the time? Is this a weekly schedule? etc.), there isn't a well-known problem class which corresponds to all the scheduling problems. Perhaps the Knapsack problem has many elements of similarity with these problems at large.
A confirmation that this is both a hard problem and one for which people perennially seek a solution is to check this (long) list of (mostly commercial) software scheduling tools.
Because of the large number of variables involved, the biggest source of which is, typically, the faculty members' desires ;-), it is typically impractical to consider enumerating all possible combinations. Instead we need to choose an approach which visits a subset of the problem/solution spaces.
- Genetic Algorithms, cited in another answer, are (or, IMHO, seem) well equipped to perform this kind of semi-guided search (the problem being to find a good evaluation function for the candidates to be kept for the next generation).
- Graph Rewriting approaches are also of use with this type of combinatorial optimization problem.
Rather than focusing on particular implementations of an automatic schedule generator program, I'd like to suggest a few strategies which can be applied, at the level of the definition of the problem.
The general rationale is that in most real-world scheduling problems some compromises will be required: not all constraints, expressed and implied, will be fully satisfied. Therefore we help ourselves by:
- Defining and ranking all known constraints.
- Reducing the problem space by manually providing a set of additional constraints. This may seem counter-intuitive, but for example by providing an initial, partially filled schedule (say roughly 30% of the time slots) in a way that fully satisfies all constraints, and by considering this partial schedule immutable, we significantly reduce the time/space needed to produce candidate solutions. Another way additional constraints help is, for example, "artificially" adding a constraint which prevents teaching some subjects on some days of the week (if this is a weekly schedule...); this type of constraint reduces the problem/solution spaces without, typically, excluding a significant number of good candidates.
- Ensuring that some of the constraints of the problem can be quickly computed. This is often associated with the choice of data model used to represent the problem; the idea is to be able to quickly opt for (or prune out) some of the options.
- Redefining the problem and allowing some of the constraints to be broken a few times (typically towards the end nodes of the graph). The idea here is either to remove some of the constraints when filling in the last few slots in the schedule, or to have the automatic schedule generator stop shy of completing the whole schedule and instead provide us with a list of a dozen or so plausible candidates. A human is often in a better position to complete the puzzle, possibly breaking a few of the constraints, using information which is not typically shared with the automated logic (e.g. the "No mathematics in the afternoon" rule can be broken on occasion for the "advanced math and physics" class; or "It is better to break one of Mr Jones's requirements than one of Ms Smith's" ;-) ).
In proofreading this answer, I realize it is quite shy of providing a definite response, but it is nonetheless full of practical suggestions. I hope this helps with what is, after all, a "hard problem".

It's a mess. A royal mess. To add to the answers, which are already very complete, I want to point out my family's experience. My mother was a teacher and used to be involved in the process.
It turns out that having a computer do this is not only difficult to code per se; it is also difficult because there are conditions that are hard to specify to a pre-baked computer program. Examples:
a teacher teaches both at your school and at another institute. Clearly, if he ends the lesson there at 10.30, he cannot start at your premises at 10.30, because he needs some time to commute between the institutes.
two teachers are married. In general, it's considered good practice not to have two married teachers teach the same class. These two teachers must therefore have two different classes.
two teachers are married, and their child attends the same school. Again, you have to prevent the two teachers from teaching in the specific class their child is in.
the school has separate facilities, like one day the class is in one institute, and another day the class is in another.
the school has shared laboratories, but these laboratories are available only on certain weekdays (for security reasons, for example, where additional personnel is required).
some teachers have preferences for the free day: some prefer on Monday, some on Friday, some on Wednesday. Some prefer to come early in the morning, some prefer to come later.
you should not have situations where you have a lesson of say, history at the first hour, then three hours of math, then another hour of history. It does not make sense for the students, nor for the teacher.
you should spread the subjects evenly. It does not make sense to have only math on the first days of the week, and then only literature for the rest of the week.
you should give some teachers two consecutive hours to do evaluation tests.
As you can see, the problem is not NP-complete, it's NP-insane.
So what they do is use a large table with small plastic insets, and they move the insets around until a satisfactory result is obtained. They never start from scratch: they normally start from the previous year's timetable and make adjustments.

The International Timetabling Competition 2007 had a lesson scheduling track and exam scheduling track. Many researchers participated in that competition. Lots of heuristics and metaheuristics were tried, but in the end the local search metaheuristics (such as Tabu Search and Simulated Annealing) clearly beat other algorithms (such as genetic algorithms).
Take a look at the 2 open source frameworks used by some of the finalists:
JBoss OptaPlanner (Java, open source)
UniTime (Java, open source) - more for universities

One of my half-term assignments was a genetic-algorithm school timetable generator.
The whole table is one "organism". There were some changes and caveats to the generic genetic algorithm approach:
Rules were made for "illegal tables": two classes in the same classroom, one teacher teaching two groups at the same time etc. These mutations were deemed lethal immediately and a new "organism" was sprouted in place of the "deceased" immediately. The initial one was generated by a series of random tries to get a legal (if senseless) one. Lethal mutation wasn't counted towards count of mutations in iteration.
"Exchange" mutations were much more common than "Modify" mutations. Changes were only between parts of the gene that made sense - no substituting a teacher with a classroom.
Small bonuses were assigned for bundling certain 2 hours together, for assigning the same generic classroom in sequence for the same group, and for keeping a teacher's work hours and a class's load continuous. Moderate bonuses were assigned for giving the correct classrooms for a given subject, keeping class hours within bounds (morning or afternoon), and such. Big bonuses were for assigning the correct number of hours of a given subject, the given workload for a teacher, etc.
Teachers could create their workload schedules of "want to work then", "okay to work then", "doesn't like to work then", "can't work then", with proper weights assigned. The whole 24h were legal work hours, except that night time was very undesired.
The weight function... oh yeah. The weight function was a huge, monstrous product (as in multiplication) of weights assigned to selected features and properties. It was extremely steep, with one property easily able to change it by an order of magnitude up or down, and there were hundreds or thousands of properties in one organism. This resulted in absolutely HUGE numbers as the weights and, as a direct result, the need to use a bignum library (gmp) to perform the calculations. For a small test case of some 10 groups, 10 teachers and 10 classrooms, the initial set started with a score of 10^-200something and finished with 10^+300something. The search was totally inefficient when the function was flatter. Also, the range of values grew a lot wider with bigger "schools".
Computation-time-wise, there was little difference between a small population (100) over a long time and a big population (10k+) over fewer generations. The computation over the same time produced about the same quality.
The calculation (on some 1GHz CPU) would take some 1h to stabilize near 10^+300, generating schedules that looked quite nice, for said 10x10x10 test case.
The problem is easily parallelizable by providing a networking facility that exchanges the best specimens between computers running the computation.
The resulting program never saw daylight outside getting me a good grade for the semester. It showed some promise but I never got enough motivation to add any GUI and make it usable to general public.

This problem is tougher than it seems.
As others have alluded to, this is an NP-complete problem, but let's analyse what that means.
Basically, it means that, as far as anyone knows, you have to look at all possible combinations in the worst case.
But "look at" doesn't tell you much about what you need to do.
Generating all possible combinations is easy. It might produce a huge amount of data, but you shouldn't have many problems understanding the concepts of this part of the problem.
The second problem is the one of judging whether a given possible combination is good, bad, or better than the previous "good" solution.
For this you need more than just "is it a possible solution".
For instance, is the same teacher working 5 days a week for X weeks straight? Even if that is a working solution, it might not be a better solution than alternating between two people so that each teacher does one week each. Oh, you didn't think about that? Remember, this is people you're dealing with, not just a resource allocation problem.
Even if one teacher could work full-time for 16 weeks straight, that might be a sub-optimal solution compared to a solution where you try to alternate between teachers, and this kind of balancing is very hard to build into software.
To summarize, producing a good solution to this problem will be worth a lot, to many, many people. Hence, it's not an easy problem to break down and solve. Be prepared to stake out some goals that aren't 100% and call them "good enough".

My timetabling algorithm, implemented in FET (Free Timetabling Software, http://lalescu.ro/liviu/fet/ , a successful application):
The algorithm is heuristic. I named it "recursive swapping".
Input: a set of activities A_1...A_n and the constraints.
Output: a set of times TA_1...TA_n (the time slot of each activity. Rooms are excluded here, for simplicity). The algorithm must put each activity at a time slot, respecting constraints. Each TA_i is between 0 (T_1) and max_time_slots-1 (T_m).
Constraints:
C1) Basic: a list of pairs of activities which cannot be simultaneous (for instance, A_1 and A_2, because they have the same teacher or the same students);
C2) Lots of other constraints (excluded here, for simplicity).
The timetabling algorithm (which I named "recursive swapping"):
1) Sort activities, most difficult first. Not a critical step, but it speeds up the algorithm maybe 10 times or more.
2) Try to place each activity (A_i) in an allowed time slot, following the above order, one at a time. Search for an available slot (T_j) for A_i, in which this activity can be placed respecting the constraints. If more slots are available, choose a random one. If none is available, do recursive swapping:
a. For each time slot T_j, consider what happens if you put A_i into T_j. There will be a list of other activities which don't agree with this move (for instance, activity A_k is on the same slot T_j and has the same teacher or same students as A_i). Keep a list of conflicting activities for each time slot T_j.
b. Choose a slot (T_j) with lowest number of conflicting activities. Say the list of activities in this slot contains 3 activities: A_p, A_q, A_r.
c. Place A_i at T_j and make A_p, A_q, A_r unallocated.
d. Recursively try to place A_p, A_q, A_r as in step 2), provided the level of recursion is not too large (say 14) and the total number of recursive calls counted since step 2) began on A_i is not too large (say 2*n).
e. If successfully placed A_p, A_q, A_r, return with success, otherwise try other time slots (go to step 2 b) and choose the next best time slot).
f. If all (or a reasonable number of) time slots were tried unsuccessfully, return without success.
g. If we are at level 0, and we had no success in placing A_i, place it like in steps 2 b) and 2 c), but without recursion. We have now 3 - 1 = 2 more activities to place. Go to step 2) (some methods to avoid cycling are used here).
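The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not FET's actual code: it handles constraint C1 only, "most difficult" is assumed to mean "most conflicts", and the anti-cycling methods mentioned in step g are replaced by a crude step limit (the hypothetical max_steps parameter). All function and parameter names are made up for this sketch.

```python
import random

def recursive_swapping(n_activities, n_slots, conflicts,
                       max_depth=14, max_steps=None, seed=0):
    """Return slot_of (list: activity -> slot) or None if the sketch gives up.

    conflicts: iterable of (a, b) pairs that must not share a slot (C1 only).
    """
    rng = random.Random(seed)
    neighbours = {a: set() for a in range(n_activities)}
    for a, b in conflicts:
        neighbours[a].add(b)
        neighbours[b].add(a)
    slot_of = [None] * n_activities

    def clashing(a, t):
        # Placed activities that would conflict with a in slot t.
        return [b for b in neighbours[a] if slot_of[b] == t]

    def place(a, depth, budget):
        # Step 2: use a random conflict-free slot if there is one.
        free = [t for t in range(n_slots) if not clashing(a, t)]
        if free:
            slot_of[a] = rng.choice(free)
            return True
        budget[0] -= 1
        if depth >= max_depth or budget[0] < 0:
            return False
        # Steps a, b: try slots with the fewest conflicting activities first.
        for t in sorted(range(n_slots), key=lambda t: len(clashing(a, t))):
            snapshot = slot_of[:]
            evicted = clashing(a, t)
            for b in evicted:           # step c: unallocate the conflicts
                slot_of[b] = None
            slot_of[a] = t
            # Step d: recursively re-place the evicted activities.
            if all(place(b, depth + 1, budget) for b in evicted):
                return True             # step e
            slot_of[:] = snapshot       # undo and try the next best slot
        return False                    # step f

    # Step 1: most difficult first (assumed here: most conflicts first).
    queue = sorted(range(n_activities), key=lambda a: -len(neighbours[a]))
    steps_left = max_steps if max_steps is not None else 50 * n_activities
    while queue:
        if steps_left == 0:
            return None                 # crude guard, not FET's anti-cycling
        steps_left -= 1
        a = queue.pop(0)
        if place(a, 0, [2 * n_activities]):
            continue
        # Step g: force a into a least-conflicting slot, requeue the evicted.
        fewest = min(len(clashing(a, t)) for t in range(n_slots))
        t = rng.choice([t for t in range(n_slots)
                        if len(clashing(a, t)) == fewest])
        evicted = clashing(a, t)
        for b in evicted:
            slot_of[b] = None
        slot_of[a] = t
        queue.extend(evicted)
    return slot_of
```

For example, recursive_swapping(4, 3, [(0, 1), (0, 2), (1, 2), (0, 3)]) returns one slot per activity with no conflicting pair sharing a slot, while an impossible instance such as two conflicting activities and a single slot ends with None once the step limit is reached.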

UPDATE: from comments ... should have heuristics too!
I'd go with Prolog ... then use Ruby or Perl or something to clean up your solution into a prettier form.
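% Atoms must start with a lowercase letter; capitalized names such as Jill would be variables.
teaches(jill, math).
teaches(joe, history).
involves(ma101, math).
involves(ss104, history).
% predsort/3 expects the comparison predicate to bind D to '<', '>' or '='.
myHeuristic(D,A,B) :- [test_case]->D='<';D='>'.
createSchedule :- findall(Class,involves(Class,Subject),Classes),
    predsort(myHeuristic,Classes,ClassesNew),
    createSchedule(ClassesNew,[]).
createSchedule(Classes,Scheduled) :- [the actual recursive algorithm].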
I am (still) in the process of doing something similar to this problem, following the same path I just described. Prolog (as a logic programming language) really makes solving NP-Hard problems easier.

Genetic algorithms are often used for such scheduling.
Found this example (Making Class Schedule Using Genetic Algorithm) which matches your requirement pretty well.

Here are a few links I found:
School timetable - Lists some problems involved
A Hybrid Genetic Algorithm for School Timetabling
Scheduling Utilities and Tools

This paper describes the school timetable problem and their approach to the algorithm pretty well: "The Development of SYLLABUS—An Interactive, Constraint-Based Scheduler for Schools and Colleges."[PDF]
The author informs me the SYLLABUS software is still being used/developed here: http://www.scientia.com/uk/

I work on a widely-used scheduling engine which does exactly this. Yes, it is NP-Complete; the best approaches seek to approximate an optimal solution. And, of course there are a lot of different ways to say which one is the "best" solution - is it more important that your teachers are happy with their schedules, or that students get into all their classes, for instance?
The absolute most important question you need to resolve early on is what makes one way of scheduling this system better than another? That is, if I have a schedule with Mrs Jones teaching Math at 8 and Mr Smith teaching Math at 9, is that better or worse than one with both of them teaching Math at 10? Is it better or worse than one with Mrs Jones teaching at 8 and Mr Jones teaching at 2? Why?
The main advice I'd give here is to divide the problem up as much as possible - maybe course by course, maybe teacher by teacher, maybe room by room - and work on solving the sub-problem first. There you should end up with multiple solutions to choose from, and need to pick one as the most likely optimal. Then, work on making the "earlier" sub-problems take into account the needs of later sub-problems in scoring their potential solutions. Then, maybe work on how to get yourself out of painted-into-the-corner situations (assuming you can't anticipate those situations in earlier sub-problems) when you get to a "no valid solutions" state.
A local-search optimization pass is often used to "polish" the end answer for better results.
Note that typically we are dealing with highly resource-constrained systems in school scheduling. Schools don't go through the year with a lot of empty rooms or teachers sitting in the lounge 75% of the day. Approaches which work best in solution-rich environments aren't necessarily applicable in school scheduling.
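To make the "score, then polish" idea concrete, here is a minimal Python sketch of a local-search pass. It is not the engine described above: the cost function, the single-activity move, and all names (polish, conflicts, cost) are assumptions chosen for illustration. It takes an already valid schedule, tries moving one activity at a time, and keeps a move only if no hard conflict appears and the soft cost drops.

```python
import random

def polish(slot_of, n_slots, conflicts, cost, iterations=1000, seed=0):
    """Hill-climb on a valid schedule without breaking hard constraints.

    slot_of:   list, activity -> slot (assumed to be conflict-free already)
    conflicts: dict, activity -> set of activities that cannot share its slot
    cost:      function(schedule) -> number, lower is better (soft preferences)
    """
    rng = random.Random(seed)
    best = list(slot_of)
    best_cost = cost(best)
    for _ in range(iterations):
        a = rng.randrange(len(best))      # pick an activity to move
        t = rng.randrange(n_slots)        # and a candidate slot
        if any(best[b] == t for b in conflicts.get(a, ())):
            continue                      # move would break a hard constraint
        candidate = best[:]
        candidate[a] = t
        c = cost(candidate)
        if c < best_cost:                 # accept strict improvements only
            best, best_cost = candidate, c
    return best, best_cost

# Hypothetical soft preference: earlier slots are better (sum of slot indices).
schedule, score = polish([2, 3, 3], 4, {0: {1}, 1: {0}, 2: set()}, cost=sum)
```

The choice of cost is where the "what makes one schedule better than another?" question has to be answered; the search itself stays the same whatever scoring is plugged in.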

Generally, constraint programming is a good approach to this type of scheduling problem. A search on "constraint programming" and scheduling or "constraint based scheduling", both within Stack Overflow and on Google, will generate some good references. It's not impossible - it's just a little hard to think about when using traditional optimization methods like linear or integer optimization. One output would be - does a schedule exist that satisfies all the requirements? That, in itself, is obviously helpful.
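As a rough illustration of the "does a schedule exist?" question, here is a plain backtracking search in Python. Real constraint solvers add propagation and smarter variable ordering on top of this; the function name and the input format (a dict of conflicting activities) are assumptions made for the sketch.

```python
def find_schedule(n_activities, n_slots, conflicts):
    """Return a list activity -> slot satisfying all conflicts, or None.

    conflicts: dict, activity -> set of activities that cannot share its slot.
    """
    slot_of = [None] * n_activities

    def consistent(a, t):
        # Slot t is allowed if no conflicting activity already sits there.
        return all(slot_of[b] != t for b in conflicts.get(a, ()))

    def solve(a):
        if a == n_activities:
            return True                   # every activity has a slot
        for t in range(n_slots):
            if consistent(a, t):
                slot_of[a] = t
                if solve(a + 1):
                    return True
                slot_of[a] = None         # backtrack
        return False

    return slot_of if solve(0) else None

# Three mutually conflicting activities need three slots.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(find_schedule(3, 3, triangle))  # [0, 1, 2]
print(find_schedule(3, 2, triangle))  # None
```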
Good luck !

I have designed commercial algorithms for both class timetabling and examination timetabling. For the first I used integer programming; for the second a heuristic based on maximizing an objective function by choosing slot swaps, very similar to the original manual process that had evolved. The main things in getting such solutions accepted are the ability to represent all the real-world constraints, and for human timetablers to not be able to see ways to improve the solution. In the end the algorithmic part was quite straightforward and easy to implement compared with the preparation of the databases, the user interface, the ability to report on statistics like room utilization, user education and so on.

You can tackle it with genetic algorithms, yes. But you shouldn't :). It can be too slow, and parameter tuning can be too time-consuming, etc.
There are other successful approaches, all implemented in open source projects:
Constraint based approach
Implemented in UniTime (not really for schools)
You could also go further and use integer programming. This was done successfully at the University of Udine and also at the University of Bayreuth (I was involved there) using commercial software (ILOG CPLEX)
Rule based approach with heuristics - see Drools Planner
Different heuristics - FET and my own
See here for a timetabling software list

I think you should use a genetic algorithm because:
It is best suited for large problem instances.
It reduces running time at the price of a possibly inexact answer (not the ultimate best).
You can specify constraints & preferences easily by adjusting the fitness penalties for unmet ones.
You can specify a time limit for program execution.
The quality of the solution depends on how much time you intend to spend solving the problem.
Genetic Algorithms Definition
Genetic Algorithms Tutorial
Class scheduling project with GA
Also take a look at: a similar question and another one
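A minimal Python sketch of the points above: constraints and preferences become fitness penalties, and the run is bounded by a time limit. The penalty weights, population size, mutation rate, and every name here are assumptions picked for illustration, not values from any of the linked projects. It assumes at least two activities.

```python
import random
import time

def penalty(slot_of, conflicts, preferred):
    # Hard constraint: conflicting activities in the same slot (heavy penalty).
    hard = sum(1 for a, b in conflicts if slot_of[a] == slot_of[b])
    # Soft preference: activity a would like slot preferred[a] (light penalty).
    soft = sum(1 for a, t in preferred.items() if slot_of[a] != t)
    return 100 * hard + soft

def genetic_timetable(n_activities, n_slots, conflicts, preferred,
                      time_limit=0.2, pop_size=30, seed=0):
    rng = random.Random(seed)

    def fitness(ind):
        return penalty(ind, conflicts, preferred)

    pop = [[rng.randrange(n_slots) for _ in range(n_activities)]
           for _ in range(pop_size)]
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        pop.sort(key=fitness)
        if fitness(pop[0]) == 0:
            break                              # nothing left to improve
        parents = pop[:pop_size // 2]          # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            cut = rng.randrange(1, n_activities)
            child = p[:cut] + q[cut:]          # one-point crossover
            if rng.random() < 0.3:             # occasional mutation
                child[rng.randrange(n_activities)] = rng.randrange(n_slots)
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best)

best, score = genetic_timetable(
    n_activities=5, n_slots=3,
    conflicts=[(0, 1), (1, 2), (3, 4)],
    preferred={0: 0, 4: 2},
)
```

A longer time_limit gives the search more generations, which is the "quality depends on time spent" trade-off from the list; the result is the best individual found, not a guaranteed optimum.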

This problem is MASSIVE where I work - imagine 1800 subjects/modules and 350 000 students, each doing 5 to 10 modules, and you want to build an exam timetable over 10 weeks, where papers are 1 hour to 3 days long. One plus point: all exams are online. The downside: we cannot exceed the system's load of max 5k concurrent. So yes, we are doing this now in the cloud on scaling servers.
The "solution" we used was simply to order modules in descending order of how many other modules they "clash" with (a clash being a student who does both), and to "backpack" them, allowing these long papers to actually overlap, else it simply cannot be done.
So when things get too large, I found this "heuristic" to be practical... at least.
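A small Python sketch of that heuristic, under simplifying assumptions: every paper takes exactly one slot (so the overlapping of long papers is not modelled), the load limit is a per-slot count of students, and the names schedule_exams, enrolments and capacity are made up for illustration.

```python
from collections import defaultdict

def schedule_exams(enrolments, n_slots, capacity):
    """Greedy packing: most-clashing modules first, first slot that fits.

    enrolments: dict, student -> set of modules the student takes.
    capacity:   maximum number of students sitting exams in one slot.
    """
    size = defaultdict(int)       # module -> number of students
    clashes = defaultdict(set)    # module -> modules sharing a student
    for modules in enrolments.values():
        for m in modules:
            size[m] += 1
            clashes[m] |= modules - {m}

    # Descending number of clashes; module name breaks ties deterministically.
    order = sorted(size, key=lambda m: (-len(clashes[m]), m))

    slot_of, load = {}, [0] * n_slots
    for m in order:
        for t in range(n_slots):
            no_clash = all(slot_of.get(o) != t for o in clashes[m])
            if no_clash and load[t] + size[m] <= capacity:
                slot_of[m] = t
                load[t] += size[m]
                break
        else:
            raise ValueError(f"no slot found for module {m!r}")
    return slot_of

enrolments = {"s1": {"A", "B"}, "s2": {"B", "C"}, "s3": {"A"}}
print(schedule_exams(enrolments, n_slots=2, capacity=3))
# {'B': 0, 'A': 1, 'C': 1}
```

Being greedy, this can fail on instances that do have a valid timetable; in exchange it runs in a single pass, which is what makes it practical at the sizes described.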

I don't know whether anyone will agree with this code, but I developed it using my own algorithm and it works for me in Ruby. I hope it helps those who are searching for something like it.
In the following code, periodflag, dayflag, subjectflag and teacherflag are hashes mapping the corresponding id to a flag value, which is Boolean.
If you have any issues, contact me.......(-_-)
# k1 (the day id) presumably comes from an enclosing loop over dayflag that is not shown.
periodflag.each do |k2,v2|
  if(TimetableDefinition.find(k2).period.to_i != 0)
    subjectflag.each do |k3,v3|
      if (v3 == 0)
        if(getflag_period(periodflag,k2))
          # Teachers who can teach subject k3 in this division, in random order
          @teachers=EmployeesSubject.where(subject_name: @subjects.find(k3).name, division_id: division.id).pluck(:employee_id)
          @teacherlists=Employee.find(@teachers)
          teacherflag=Hash[teacher_flag(@teacherlists,teacherflag,flag).to_a.shuffle]
          teacherflag.each do |k4,v4|
            if(v4 == 0)
              if(getflag_subject(subjectflag,k3))
                subjectperiod=TimetableAssign.where("timetable_definition_id = ? AND subject_id = ?",k2,k3)
                if subjectperiod.blank?
                  issubjectpresent=TimetableAssign.where("section_id = ? AND subject_id = ?",section.id,k3)
                  if issubjectpresent.blank?
                    # Subject not yet assigned in this section: assign teacher k4 if still free
                    isteacherpresent=TimetableAssign.where("section_id = ? AND employee_id = ?",section.id,k4)
                    if isteacherpresent.blank?
                      @finaltt=TimetableAssign.new
                      @finaltt.timetable_struct_id=@timetable_struct.id
                      @finaltt.employee_id=k4
                      @finaltt.section_id=section.id
                      @finaltt.standard_id=standard.id
                      @finaltt.division_id=division.id
                      @finaltt.subject_id=k3
                      @finaltt.timetable_definition_id=k2
                      @finaltt.timetable_day_id=k1
                      set_school_id(@finaltt,current_user)
                      if(@finaltt.save)
                        setflag_sub(subjectflag,k3,1)
                        setflag_period(periodflag,k2,1)
                        setflag_teacher(teacherflag,k4,1)
                      end
                    end
                  else
                    # Subject already assigned in this section: reuse its existing teacher
                    @subjectdetail=TimetableAssign.find_by_section_id_and_subject_id(@section.id,k3)
                    @finaltt=TimetableAssign.new
                    @finaltt.timetable_struct_id=@subjectdetail.timetable_struct_id
                    @finaltt.employee_id=@subjectdetail.employee_id
                    @finaltt.section_id=section.id
                    @finaltt.standard_id=standard.id
                    @finaltt.division_id=division.id
                    @finaltt.subject_id=@subjectdetail.subject_id
                    @finaltt.timetable_definition_id=k2
                    @finaltt.timetable_day_id=k1
                    set_school_id(@finaltt,current_user)
                    if(@finaltt.save)
                      setflag_sub(subjectflag,k3,1)
                      setflag_period(periodflag,k2,1)
                      setflag_teacher(teacherflag,k4,1)
                    end
                  end
                end
              end
            end
          end
        end
      end
    end
  end
end
