I work at a publishing house and I am setting up one of our presses for "ganging", in other words, printing multiple jobs simultaneously. Given that different print jobs can have different quantities, and anywhere from 1 to 20 jobs might need to be considered at a time, the problem would be to determine which jobs to group together to minimize waste (waste coming from over-printing on smaller-quantity jobs in a given set, that is).
Given the following stable data:
All jobs are equal in terms of spatial size--placement on paper doesn't come into consideration.
There are three "lanes", meaning that three jobs can be printed simultaneously.
Ideally, each lane has one job. Part of the problem is minimizing how many lanes each job is run on.
If necessary, one job could be run on two lanes, with a second job on the third lane.
The "grouping" waste from a given set of jobs (let's say the quantities of them are x, y and z) would be the highest number minus the two lower numbers. So if x is the higher number, the grouping waste would be (x - y) + (x - z). Otherwise stated, waste is produced by printing job Y and Z (in excess of their quantities) up to the quantity of X. The grouping waste would be a qualifier for the given set, meaning it could not exceed a certain quantity or the job would simply be printed alone.
So the question is: how do I determine which sets of jobs are grouped together, out of any given number of jobs, based on the qualifiers of 1) three similar quantities, OR 2) two quantities where one is approximately double the other, AND with the aim of minimal total grouping waste across the various sets?
(Edit) Quantity Information:
Typical job quantities can be from 150 to 350 on foreign languages, or 500 to 1000 on English print runs. This data can be used to set up some scenarios for an algorithm. For example, let's say you had 5 jobs:
1000, 500, 500, 450, 250
By looking at it, I can see a couple of answers. Obviously (1000/500/500) is not efficient as you'll have a grouping waste of 1000. (500/500/450) is better as you'll have a waste of 50, but then you run (1000) and (250) alone. But you could also run (1000/500) with 1000 on two lanes, (500/250) with 500 on two lanes and then (450) alone.
In terms of trade-offs for lane minimization vs. wastage, we could say that any grouping waste over 200 is excessive.
(End Edit)
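To make the waste rule concrete, here is a rough Python sketch (illustrative only, my own function name) of the grouping waste for the sets discussed in the example above, with one quantity per lane:
def grouping_waste(lane_quantities):
    # Waste for one ganged set: the smaller lanes are over-printed up to the
    # quantity of the largest lane.
    longest = max(lane_quantities)
    return sum(longest - q for q in lane_quantities)

print(grouping_waste([1000, 500, 500]))   # 1000 -- too wasteful
print(grouping_waste([500, 500, 450]))    # 50   -- acceptable
print(grouping_waste([500, 500, 500]))    # 0    -- the 1000 job split across two lanes, ganged with a 500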
...Needless to say, quite a problem. (For me.)
I am a moderately skilled programmer but I do not have much familiarity with algorithms and I am not fully studied in the mathematics of the area. I'm in the process of writing a sort of brute-force program that simply tries all options, pruning any branch of the option tree that seems to have excessive grouping waste. However, I can't help but hope there's an easier and more efficient method.
I've looked at various websites trying to find out more about algorithms in general and have been slogging my way through the symbology, but it's slow going. Unfortunately, Wikipedia's articles on the subject are very cross-dependent and it's difficult to find an "in". The only thing I've been able to really find would seem to be a definition of the rough type of algorithm I need: "Exclusive Distance Clustering", one-dimensionally speaking.
I did look at what seems to be the popularly referred-to algorithm on this site, the Bin Packing one, but I was unable to see exactly how it would work with my problem.
This seems similar to the classic Operations Research 'cutting stock' problem. For the formal mathematical treatment try
http://en.wikipedia.org/wiki/Cutting_stock_problem
I've coded solutions for cutting stock problems using the delayed column generation technique from the paper "Selection and Design of Heuristic Procedures for Solving Roll Trim Problems" by Robert W. Haessler (Management Sci., Dec '88). I tested it up to a hundred rolls without problems. Understanding how to get the residuals from the first iteration, and using them to craft the new equation for the next iteration, is quite interesting. See if you can get hold of this paper, as the author discusses variations closer to your problem.
If you get to a technique that's workable I recommend using a capable linear algebra solver, rather than re-inventing the wheel. Whilst the simplex method is easy enough to code yourself for fractional solutions, what you are dealing with here is harder - it's a mixed integer problem. For a modern mixed integer programming (MIP) solver in C, using e.g. branch & bound, with Java/Python bindings, I recommend lp_solve.
When I wrote this I found the NEOS guide page useful. The online solver looks defunct though (for me it returns Perl code rather than executing it), but there's still some background information there.
Edit - a few notes. I'll summarise the differences between your problem and the cutting stock problem:
1) Cutting stock has input lengths that are indivisible. You can simulate your divisible jobs by running the problem multiple times, breaking the jobs up into pieces of 1.0 and {0.5, 0.5} times the original lengths.
2) Your 'length of print run' maps to the section length.
3) Choose a large stock length.
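To make notes 1-3 concrete, a small sketch (my own illustration; the variable names are made up) of how the print jobs from the question could be turned into cutting-stock input for two runs:
jobs = [1000, 500, 500, 450, 250]          # print-run lengths from the question
stock_length = 10 * max(jobs)              # note 3: choose a large stock length

run_whole  = [float(q) for q in jobs]                      # 1.0 x original length
run_halved = [q / 2.0 for q in jobs for _ in range(2)]     # {0.5, 0.5} x original

# Each run would then be fed to a cutting-stock solver with stock_length as
# the stock size and the piece list as the section lengths (note 2).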
I'm going to try and attack the "ideal" case, in which no jobs are split between lanes or printed alone.
Let n be the number of jobs, rounded up to the nearest multiple of 3. Dummy zero-length jobs can be created to make the number of jobs a multiple of 3.
If n=3, this is trivial, because there's only one possible solution. So assume n>3.
The job with the highest quantity (or one of them, if there are several) must inevitably be the highest or joint-highest quantity in the longest job group (or in one of the joint-longest job groups if there is a tie). Equal-quantity jobs are interchangeable, so if there is a tie, just pick one and call it the highest.
So if n=6, you have two job groups, of which the longest-or-equal one has a fixed highest or joint-highest quantity job. So the only question is how to arrange the other 5 jobs between the groups. The formula for calculating the grouping waste can be expressed as 2∑h − ∑x, where the h's are the highest quantities in each group and the x's are the other quantities. So moving from one possible solution to another is going to involve swapping one of the h's with one of the x's. (If you swapped one of the h's with another one of the h's, or one of the x's with another one of the x's, it wouldn't make any difference, so you wouldn't have moved to a different solution.) Since h2 is fixed and x1 and x2 are useless for us, what we are actually trying to minimise is w(h1, x3, x4) = 2h1 - (x3 + x4). If h1 <= x3 <= x4, this is an optimal grouping because no swap can improve the situation. (To see this, let d = x3 - h1 and note that w(x3, h1, x4) - w(h1, x3, x4) = 3d, which is non-negative; by symmetry the same argument holds for swapping with x4.) So that deals with the case n=6.
For n=9, we have 8 jobs that can be moved around, but again, it's useless to move the shortest two. So the formula this time is w(h1, h2, x3, x4, x5, x6) = 2h1 + 2h2 - (x3 + x4 + x5 + x6), but this time we have the constraint that h2 must not be less than the second-smallest x in the formula (otherwise it couldn't be the highest or joint-highest of any group). As noted before, h1 and h2 can't be swapped with each other, so either you swap one of them with an appropriate x (without violating the constraint), or you swap both of them, each with a distinct x. Take h1 <= x3 <= x4 <= h2 <= x5 <= x6. Again, single swaps can't help, and a double swap can't help either because its effect must necessarily be the sum of the effects of two single swaps. So again, this is an optimal solution.
It looks like this argument is going to work for any n. In which case, finding an optimal solution when you've got an "ideal case" (as defined at the top of my answer) is going to be simple: sort the jobs by quantity and then chop the sorted list into consecutive groups of 3. If this solution proves not to be suitable, you know you haven't got an ideal case.
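A minimal sketch of that procedure (my own code, just to make "sort and chop into threes" concrete):
def ideal_case_grouping(quantities):
    # Sort, pad with zero-quantity dummy jobs up to a multiple of 3, then chop
    # the sorted list into consecutive groups of 3.  Waste per group counts
    # every lane (dummy lanes included) printed up to the group's largest job.
    jobs = sorted(quantities)
    jobs = [0] * (-len(jobs) % 3) + jobs
    groups = [jobs[i:i + 3] for i in range(0, len(jobs), 3)]
    waste = sum(max(g) * len(g) - sum(g) for g in groups)
    return groups, waste

print(ideal_case_grouping([1000, 500, 500, 450, 250]))
# -> ([[0, 250, 450], [500, 500, 1000]], 1650)
(For the 5-job example this is very wasteful, which just means that input isn't an "ideal case" in the sense defined at the top.)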
I will have a think about the non-ideal cases, and update this answer if I come up with anything.
If I understand the problem (and I am not sure I do), the solution could be as simple as printing job 1 in all three lanes, then job 2 in all three lanes, then job 3 in all three lanes.
It has a worst case of printing two extra sheets per job.
I can think of cases where this isn't optimal (e.g. three jobs of four sheets each would take six sheets rather than four), but it is likely to be far, far simpler to develop than a bin packing solution (which is NP-complete; each of the three lanes, over time, represents a bin).
Related
Let's say that I want to create a fictional product called g.
I know that:
a+b=c
x+y=z
and finally that
c+z=g
So clearly if I start off with products
a,b,x,y
I can create g in three steps:
a+b=c
x+y=z
c+z=g
So a naive algorithm for reaching a goal could be:
For each component required to make the goal (here c and z), recursively find a cause and effect tuple that can create that component.
But there are snags with that algorithm.
For example, let's say that my cause and effect tuples are:
a+b=c
x+y+c=z (NOTE THE EXTRA 'c' REQUIRED!!)
c+z=g
Now when I run my naive algorithm I will do
a+b=c
x+y+c=z (Using up the 'c' I created in the previous step)
c+z=g (Uh oh! I can't do this because I don't have the 'c' any more)
It seems like quite a basic area of research - how we can combine known causes and effects to reach a goal - so I suspect that work must have been done on it, but I've looked around and couldn't find anything and I don't really know where to look now.
Many thanks for any assistance!
Assuming that using a product consumes one item of it, which can then be replaced by producing a second item of that product, I would model this by giving each product a cost and working out how to minimize the cost of the final product. In this case I think this is the same as minimizing the costs of every product, because minimizing the cost of an input never increases the cost of any output. You end up with loads of equations like
a=min(b+c, d+e, f+g)
where a is the cost of a product that can be produced in alternative ways, one way consuming units with cost of b and c, another way consuming units with costs of d and e, another way consuming units with costs of f and g, and so on. There may be cycles in the associated graph.
One way to solve such a problem would be to start by assigning the cost infinity to all products not originally provided as inputs (with costs) and then repeatedly reducing costs where equations show a way of calculating a cost less than the current cost, keeping track of re-calculations caused by inputs not yet considered or reductions in costs. At each stage I would consider the consequences of the smallest input or recalculated value available, with ties broken by a second component which amounts to a tax on production. The outputs produced from a calculation are always at least as large as any input, so newly produced values are always larger than the recalculated value considered, and the recalculated value considered at each stage never decreases, which should reduce repeated recalculation.
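A rough sketch of that relaxation idea (my own simplification; it drops the tie-breaking "tax" component and just settles products cheapest-first, Dijkstra-style):
import heapq

def product_costs(base_costs, recipes):
    # base_costs: {product: cost} for the raw inputs
    # recipes: list of (list_of_input_products, output_product)
    # Repeatedly settle the cheapest known product and check whether any
    # recipe using it now yields a cheaper output.
    cost = dict(base_costs)
    heap = [(c, p) for p, c in base_costs.items()]
    heapq.heapify(heap)
    while heap:
        c, p = heapq.heappop(heap)
        if c > cost.get(p, float("inf")):
            continue                      # stale heap entry
        for inputs, output in recipes:
            if p in inputs:
                try:
                    new = sum(cost[i] for i in inputs)   # fails if an input is still unknown
                except KeyError:
                    continue
                if new < cost.get(output, float("inf")):
                    cost[output] = new
                    heapq.heappush(heap, (new, output))
    return cost

# The example from the question: a+b=c, x+y+c=z, c+z=g
recipes = [(["a", "b"], "c"), (["x", "y", "c"], "z"), (["c", "z"], "g")]
print(product_costs({"a": 1, "b": 1, "x": 1, "y": 1}, recipes)["g"])   # 6: the cost correctly counts producing c twice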
Another way would be to turn this into a linear program and throw it at a highly optimized guaranteed polynomial time (at least in practice) linear programming solver.
a = min(b+c, d+e, f+g)
becomes
a = b+c-x
a = d+e-y
a = f+g-z
x >= 0
y >= 0
z >= 0
minimize sum(x+y+z+....)
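For a single min() equation you can sanity-check that transformation with an off-the-shelf LP solver. A small sketch using scipy (my choice of solver here, with made-up cost values):
from scipy.optimize import linprog

b_plus_c, d_plus_e, f_plus_g = 7.0, 5.0, 9.0   # hypothetical known costs

# Variables: [a, x, y, z]; minimize x + y + z subject to the equalities above.
c = [0, 1, 1, 1]
A_eq = [
    [1, 1, 0, 0],   # a + x = b + c
    [1, 0, 1, 0],   # a + y = d + e
    [1, 0, 0, 1],   # a + z = f + g
]
b_eq = [b_plus_c, d_plus_e, f_plus_g]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 4, method="highs")
print(res.x[0])   # expect a = min(7, 5, 9) = 5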
I need to analyze 8 chemical samples repeatedly over 5 days (each sample is analyzed exactly once every day). I'd like to generate pseudo-random sample sequences for each day which achieve the following:
avoid bias in the daily sequence position (e.g., avoid some samples being processed mostly in the morning)
avoid repeating sample pairs over different days (e.g. 12345678 on day 1 and 87654321 on day 2)
generally randomize the distance between two given samples from one day to the other
I may have poorly phrased the conditions above, but the general idea is to minimize systematic effects like sample cross-contamination and/or analytical drift over each day. I could just shuffle each sequence randomly, but because the number of sequences generated is small (N=5 versus 40,320 possible combinations), I'm unlikely to approach something like maximum entropy.
Any ideas? I suspect this is a common problem in analytical science which has been solved, but I don't know where to look.
Just thinking out loud:
The base metric you might use is the Levenshtein distance, or some slight modification of it, maybe:
myDist(w1, w2) = min(levD(w1, w2), levD(w1.reversed(), w2))
Since you want to avoid small distances between any pair of days,
the overall metric can be the sum of myDist over all pairs of days:
Similarity = myDist(day1, day2)
+ myDist(day1, day3)
+ myDist(day1, day4)
+ myDist(day1, day5)
+ myDist(day2, day3)
+ myDist(day2, day4)
+ myDist(day2, day5)
+ myDist(day3, day4)
+ myDist(day3, day5)
+ myDist(day4, day5)
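A small sketch of these pieces in Python (levD is a plain dynamic-programming Levenshtein implementation; each day's order is written as a string of sample digits):
from itertools import combinations

def levD(a, b):
    # Standard dynamic-programming Levenshtein distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def myDist(w1, w2):
    return min(levD(w1, w2), levD(w1[::-1], w2))

def total_distance(days):
    # Sum of myDist over all pairs of days (the "Similarity" sum above).
    return sum(myDist(d1, d2) for d1, d2 in combinations(days, 2))

print(total_distance(["12345678", "87654321"]))   # 0 -- a reversed repeat is flagged as bad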
What is still missing is a heuristic for how to create the sample orders.
Your problem reminds me of a shortest-path-finding problem, but with the further difficulty that each selected node influences the weights of the whole graph. So it is much harder.
Maybe a table with all myDist distances between each pair of the 8! orderings can be created (the metric is symmetric, so only a triangular matrix without the diagonal is needed, requiring roughly 1 GB of memory). This may speed things up very much.
Maybe take the max from this matrix and consider each pair with a value below some threshold as equally worthless, to reduce the search space.
Build a starting set:
Use 12345678 as the fixed day 1, since the first day does not matter. Never change this.
Then repeat until n days are chosen:
add the ordering most distant from the current one;
if there are multiple equally good possibilities, use the one that is also most distant from the previous days.
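A rough sketch of that construction (brute force over all 8! orderings, using the myDist sketch above; the tie-breaking by summed distance is my reading of the steps):
from itertools import permutations

def build_start_set(n_days):
    # Greedy construction: fix day 1, then repeatedly pick the ordering
    # farthest from the current day, breaking ties by the summed distance
    # to all previously chosen days.  Slow but straightforward.
    all_orders = ["".join(p) for p in permutations("12345678")]
    days = ["12345678"]
    while len(days) < n_days:
        best = max((o for o in all_orders if o not in days),
                   key=lambda o: (myDist(o, days[-1]),
                                  sum(myDist(o, d) for d in days)))
        days.append(best)
    return days

print(build_start_set(5))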
Now iteratively improve the solution - maybe with some ruin-and-recreate approach. Always keep a backup of the best (maximum-distance) solution found so far; then you can run as many iterations as you want (and have time for):
choose (one or two) day(s) with the smallest distance sums to the other days;
maybe brute-force an optimal (in terms of overall distance) ordering for these two days;
repeat.
If the optimization gets stuck (only the same 2 days are chosen, or the distance is not getting any smaller),
randomly change one or two days to random orders.
Maybe totally random starting sets (apart from day 1) could also be tried.
OK guys I have a real world problem, and I need some algorithm to figure it out.
We have a bunch of orders waiting to be shipped, each order will have a volume (in cubic feet), let's say, V1, V2, V3, ..., Vn
The shipping carrier can provide us four types of containers, and the volume/price of the containers are listed below:
Container Type 1: 2700 CuFt / $2500;
Container Type 2: 2350 CuFt / $2200;
Container Type 3: 2050 CuFt / $2170;
Container Type 4: 1000 CuFt / $1700;
No single order will exceed 2700 CuFt, but orders are likely to exceed 1000 CuFt.
Now we need a program to find a solution that optimizes the freight charges, that is, minimizes the total price.
I appreciate any suggestions/ideas.
EDIT:
My current implementation uses the biggest container first and applies the first-fit decreasing bin-packing algorithm to get a result, then goes through all the containers and adjusts each container's size downward according to the volume of its contents...
I wrote a similar program when I was working for a logistics company. This is a 3-dimensional bin-packing problem, which is a bit trickier than a classic 1-dimensional bin-packing problem - the person at my job who wrote the old box-packing program that I was replacing made the mistake of reducing everything to a 1-dimensional bin-packing problem (volumes of boxes and volumes of packages), but this doesn't work: this problem formulation states that three 8x8x8 packages would fit into a 12x12x12 box, but this would leave you with overlapping packages.
My solution was to use what's called a guillotine cut heuristic: when you put a package into the shipping container then this produces three new empty sub-containers: assuming that you placed the package in the back bottom left of the container, then you would have a new empty sub-container in the space in front of the package, a new empty sub-container in the space to the right of the package, and a new empty sub-container on top of the package. Be certain not to assign the same empty space to multiple sub-containers, e.g. if you're not careful then you'll assign the section in the front-right of the container to the front sub-container and to the right sub-container, you'll need to pick just one to which to assign it. This heuristic will rule out some optimal solutions, but it's fast. (As a concrete example, say you have a 12x12x12 box and you put an 8x8x8 package into it - this would leave you with a 4x12x12 empty sub-container, a 4x8x12 empty sub-container, and a 4x8x8 empty sub-container. Note that the wrong way to divide up the free space is to have three 4x12x12 empty sub-containers - this is going to result in overlapping packages. If the box or package weren't cubes then you'd have more than one way to divide up the free space - you'd need to decide whether to maximize the size of one or two sub-containers or to instead try to create three more or less equal sub-containers.) You need to use a reasonable criteria for ordering/selecting the sub-containers or else the number of sub-containers will grow exponentially; I solved this problem by filling the smallest sub-containers first and removing any sub-container that was too small to contain a package, which kept the quantity of sub-containers to a reasonable number.
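A sketch of just that splitting step (dimensions as (depth, width, height) tuples; the function name and the choice of which corner region goes to which sub-container are mine):
def guillotine_split(container, package):
    # Place the package in the back-bottom-left corner of the container and
    # return the three non-overlapping empty sub-containers described above.
    # This is one way to assign the overlapping corner regions, not the only one.
    cd, cw, ch = container
    pd, pw, ph = package
    front = (cd - pd, cw, ch)        # everything in front of the package
    right = (pd, cw - pw, ch)        # beside the package, behind the front block
    top   = (pd, pw, ch - ph)        # directly above the package
    return [s for s in (front, right, top) if min(s) > 0]

print(guillotine_split((12, 12, 12), (8, 8, 8)))
# -> [(4, 12, 12), (8, 4, 12), (8, 8, 4)], i.e. the 4x12x12, 4x8x12 and 4x8x8 spaces from the example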
There are several options you have: what containers to use, how to rotate the packages going into the container (there are usually six ways to rotate a package, but not all rotations are legal for some packages e.g. a "this end up" package will only have two rotations), how to partition the sub-containers (e.g. do you assign the overlapping space to the right sub-container or to the front sub-container), and in what order you pack the container. I used a randomized algorithm that approximated a best-fit decreasing heuristic (using volume for the heuristic) and that favored creating one large sub-container and two small sub-containers rather than three medium-sized sub-containers, but I used a random number generator to mix things up (so the greatest probability is that I'd select the largest package first, but there would be a lesser probability that I'd select the second-largest package first, and so on, with the lowest probability being that I'd select the smallest package first; likewise, there was a chance that I'd favor creating three medium-sized sub-containers instead of one large and two small, there was a chance that I'd use three medium-sized boxes instead of two large boxes, etc). I then ran this in parallel several dozen times and selected the result that cost the least.
There are other heuristics I considered, for example the extreme point heuristic is slower (while still running in polynomial time - IIRC it's a cubic time solution, whereas the guillotine cut heuristic is linear time, and at the other extreme the branch and bound algorithm finds the optimal solution and runs in exponential time) but more accurate (specifically, it finds some optimal solutions that are ruled out by the guillotine cut heuristic); however, my use case was that I was supposed to produce a fast shipping estimate, and so the extreme point heuristic wasn't appropriate (it was too slow and it was "too accurate" - I would have needed to add 10% or 20% to its results to account for the fact that the people actually packing the boxes would inevitably make sub-optimal choices).
I don't know the name of a program offhand, but there's probably some commercial software that would solve this for you, depending on how much a good solution is worth to you.
Zim Zam's answer is good for big boxes, but assuming relatively small boxes you can use a much simpler algorithm that amounts to solving an integer linear program with a single constraint:
Where a, b, c and d are integers being the number of each type of container used:
Given,
2700a + 2350b + 2050c + 1000d >= V (where V is the total volume of the orders)
You want to find a, b, c, and d such that the following function is minimized:
Total Cost C = 2500a + 2200b + 2170c + 1700d
It would seem you can brute force this problem (which is NP hard). Calculate every possible viable combination of a, b, c and d, and calculate the total cost for each combination. Note that no solution will ever use more than 1 container of type d, so that cuts down the number of possible combinations.
I am assuming orders can be split between containers.
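Under that assumption, a brute-force sketch over the counts (a, b, c, d) could look like this; the upper bound on each count is just enough containers of that type to cover the total volume:
from itertools import product
from math import ceil

containers = [(2700, 2500), (2350, 2200), (2050, 2170), (1000, 1700)]   # (CuFt, price $)

def cheapest_combination(total_volume):
    # Try every count vector (a, b, c, d); keep the cheapest one whose
    # total capacity covers the total order volume.
    best = None
    ranges = [range(ceil(total_volume / vol) + 1) for vol, _ in containers]
    for counts in product(*ranges):
        capacity = sum(n * vol for n, (vol, _) in zip(counts, containers))
        if capacity < total_volume:
            continue
        cost = sum(n * price for n, (_, price) in zip(counts, containers))
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best

print(cheapest_combination(5000))   # -> (4700, (1, 1, 0, 0)): one 2700 CuFt and one 2350 CuFt container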
I've always written software to solve business problems. I came across ILP (integer linear programming) while I was going through one of the SO posts. I googled it, but I am unable to relate how I can use it to solve business problems. I'd appreciate it if someone could help me understand in layman's terms.
ILP can be used to solve essentially any problem involving making a bunch of decisions, each of which only has several possible outcomes, all known ahead of time, and in which the overall "quality" of any combination of choices can be described using a function that doesn't depend on "interactions" between choices. To see how it works, it's easiest to restrict further to variables that can only be 0 or 1 (the smallest useful range of integers). Now:
Each decision requiring a yes/no answer becomes a variable
The objective function should describe the thing we want to maximise (or minimise) as a weighted combination of these variables
You need to find a way to express each constraint (combination of choices that cannot be made at the same time) using one or more linear equality or inequality constraints
Example
For example, suppose you have 3 workers, Anne, Bill and Carl, and 3 jobs, Dusting, Typing and Packing. All of the people can do all of the jobs, but they each have different efficiency/ability levels at each job, so we want to find the best task for each of them to do to maximise overall efficiency. We want each person to perform exactly 1 job.
Variables
One way to set this problem up is with 9 variables, one for each combination of worker and job. The variable x_ad will get the value 1 if Anne should Dust in the optimal solution, and 0 otherwise; x_bp will get the value 1 if Bill should Pack in the optimal solution, and 0 otherwise; and so on.
Objective Function
The next thing to do is to formulate an objective function that we want to maximise or minimise. Suppose that based on Anne, Bill and Carl's most recent performance evaluations, we have a table of 9 numbers telling us how many minutes it takes each of them to perform each of the 3 jobs. In this case it makes sense to take the sum of all 9 variables, each multiplied by the time needed for that particular worker to perform that particular job, and to look to minimise this sum -- that is, to minimise the total time taken to get all the work done.
Constraints
The final step is to give constraints that enforce that (a) everyone does exactly 1 job and (b) every job is done by exactly 1 person. (Note that actually these steps can be done in any order.)
To make sure that Anne does exactly 1 job, we can add the constraint that x_ad + x_at + x_ap = 1. Similar constraints can be added for Bill and Carl.
To make sure that exactly 1 person Dusts, we can add the constraint that x_ad + x_bd + x_cd = 1. Similar constraints can be added for Typing and Packing.
Altogether there are 6 constraints. You can now supply this 9-variable, 6-constraint problem to an ILP solver and it will spit back out the values for the variables in one of the optimal solutions -- exactly 3 of them will be 1 and the rest will be 0. The 3 that are 1 tell you which people should be doing which job!
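Here is a sketch of that model using PuLP (a solver library chosen just for illustration; it isn't part of the answer above), with made-up times in minutes:
# pip install pulp -- PuLP is just one convenient front-end; any ILP solver works.
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

workers = ["Anne", "Bill", "Carl"]
jobs = ["Dusting", "Typing", "Packing"]
minutes = {                               # hypothetical performance-evaluation numbers
    ("Anne", "Dusting"): 20, ("Anne", "Typing"): 30, ("Anne", "Packing"): 40,
    ("Bill", "Dusting"): 35, ("Bill", "Typing"): 25, ("Bill", "Packing"): 30,
    ("Carl", "Dusting"): 25, ("Carl", "Typing"): 45, ("Carl", "Packing"): 20,
}

prob = LpProblem("assignment", LpMinimize)
x = {(w, j): LpVariable(f"x_{w}_{j}", cat="Binary") for w in workers for j in jobs}
prob += lpSum(minutes[w, j] * x[w, j] for w in workers for j in jobs)   # total minutes worked
for w in workers:
    prob += lpSum(x[w, j] for j in jobs) == 1       # (a) each person does exactly 1 job
for j in jobs:
    prob += lpSum(x[w, j] for w in workers) == 1    # (b) each job is done by exactly 1 person
prob.solve()
print([(w, j) for (w, j), var in x.items() if var.value() == 1], value(prob.objective))
Exactly three of the x variables come back as 1, as described above.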
ILP is General
As it happens, this particular problem has a special structure that allows it to be solved more efficiently using a different algorithm. The advantage of using ILP is that variations on the problem can be easily incorporated: for example if there were actually 4 people and only 3 jobs, then we would need to relax the constraints so that each person does at most 1 job, instead of exactly 1 job. This can be expressed simply by changing the equals sign in each of the 1st 3 constraints into a less-than-or-equals sign.
First, read a linear programming example from Wikipedia
Now imagine the farmer producing pigs and chickens, or a factory producing toasters and vacuums - now the outputs (and possibly constraints) are integers, so those pretty graphs are going to go all crookedly step-wise. That's a business application that is easily represented as a linear programming problem.
I've used integer linear programming before to determine how to tile n identically proportioned images to maximize screen space used to display these images, and the formalism can represent covering problems like scheduling, but business applications of integer linear programming seem like the more natural applications of it.
SO user flolo says:
Use cases where I often met it: In digital circuit design you have objects to be placed/mapped onto certain parts of a chip (FPGA placement) - this can be done with ILP. Also in HW-SW codesign the partitioning problem often arises: which part of a program should still run on a CPU and which part should be accelerated in hardware? This can also be solved via ILP.
A sample ILP problem will look something like:
maximize 37∙x1 + 45∙x2
where
x1,x2,... should be positive integers
...but there is a set of constraints of the form
a1∙x1+b1∙x2 < k1
a2∙x1+b2∙x2 < k2
a3∙x1+b3∙x2 < k3
...
Now, a simpler articulation of Wikipedia's example:
A farmer has L m² land to be planted with either wheat or barley or a combination of the two.
The farmer has F grams of fertilizer, and P grams of insecticide.
Every m² of wheat requires F1 grams of fertilizer, and P1 grams of insecticide
Every m² of barley requires F2 grams of fertilizer, and P2 grams of insecticide
Now,
Let a1 denote the selling price of wheat per 1 m²
Let a2 denote the selling price of barley per 1 m²
Let x1 denote the area of land to be planted with wheat
Let x2 denote the area of land to be planted with barley
x1,x2 are positive integers (Assume we can plant in 1 m² resolution)
So,
the profit is a1∙x1 + a2∙x2 - we want to maximize it
Because the farmer has a limited area of land: x1+x2<=L
Because the farmer has a limited amount of fertilizer: F1∙x1+F2∙x2 < F
Because the farmer has a limited amount of insecticide: P1∙x1+P2∙x2 < P
a1,a2,L,F1,F2,F,P1,P2,P - are all constants (in our example: positive)
We are looking for positive integers x1,x2 that will maximize the expression stated, given the constraints stated.
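With small made-up constants the search space is tiny, so you can even sanity-check the formulation by brute force (just to make it concrete, not how you would solve a real instance):
# Hypothetical constants: L m^2 of land, F g fertilizer, P g insecticide,
# per-m^2 requirements F1, P1 (wheat) and F2, P2 (barley), prices a1, a2.
L, F, P = 100, 350, 240
F1, P1, a1 = 4, 2, 7      # wheat
F2, P2, a2 = 3, 3, 6      # barley

best = max(
    ((a1 * x1 + a2 * x2, x1, x2)
     for x1 in range(L + 1)
     for x2 in range(L + 1 - x1)                           # x1 + x2 <= L
     if F1 * x1 + F2 * x2 <= F and P1 * x1 + P2 * x2 <= P),
    key=lambda t: t[0])
print(best)   # (profit, m^2 of wheat, m^2 of barley)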
Hope it's clear...
ILP "by itself" can directly model lots of stuff. If you search for LP examples you will probably find lots of famous textbook cases, such as the diet problem
Given a set of pills, each with a vitamin content and a daily vitamin
quota, find the cheapest cocktail that matches the quota.
Many such problems naturally have instances that require the variables to be integers (perhaps you can't split pills in half).
The really interesting thing, though, is that a great many combinatorial problems reduce to LP. One of my favourites is the assignment problem:
Given a set of N workers, N tasks and an N by N matrix describing how
much each worker charges for each task, determine what task to
give to each worker in order to minimize cost.
Most solutions that naturally come up have exponential complexity, but there is a polynomial solution using linear programming.
When it comes to ILP, ILP has the added benefit/difficulty of being NP-complete. This means that it can be used to model a very wide range of problems (boolean satisfiability is also very popular in this regard). Since there are many good and optimized ILP solvers out there it is often viable to translate an NP-complete problem into ILP instead of devising a custom algorithm of your own.
You can apply linear programming almost everywhere you want to optimize something and the target function is linear. You can make schedules (I mean big ones, like for train companies that need to optimize the utilization of their vehicles and tracks), optimize production (maximize profit), almost everything. Sometimes it is tricky to formulate your problem as an IP, and/or sometimes you meet the problem that your optimal solution says you have to produce e.g. 0.345 cars for the optimal profit. That is of course not possible, so you constrain it even more: your variable for the number of cars must be an integer. Even though that now sounds simpler (because you have far fewer choices for your variable), it's actually harder. At that moment the problem becomes NP-hard. Which actually means you can transform essentially any combinatorial problem on your computer into an ILP; you just have to transform it.
For you I would recommend reading some basic introductory (I)LP material. Off the top of my head I don't know any good online site (but if you google you will find some); as a book I can recommend Linear Programming by Chvátal. It has very good examples, and also describes real use cases.
The other answers here have excellent examples. Two of the gold standards in business of using integer programming and more generally operations research are
the journal Interfaces published by INFORMS (The Institute for Operations Research and the Management Sciences)
winners of the Franz Edelman Award for Achievement in Operations Research and the Management Sciences
Interfaces publishes research that uses operations research applied to real-world problems, and the Edelman award is a highly competitive award for business use of operations research techniques.
Basically, I have a number of values that I need to split into n different groups so that the sums of the groups are as close as possible to one another. The list of values isn't terribly long, so I could potentially just brute-force it, but I was wondering if anyone knows of a more efficient method of doing this. Thanks.
If an approximate solution is enough, then sort the numbers descendingly, loop over them and assign each number to the group with the smallest sum.
groups = [list() for i in range(NUM_GROUPS)]
for x in sorted(numbers, reverse=True):
    mingroup = groups[0]
    for g in groups:
        if sum(g) < sum(mingroup):
            mingroup = g
    mingroup.append(x)
This problem is called the "multiway partition problem" and indeed is computationally hard. Googling for it returned an interesting paper, "Multi-Way Number Partitioning", where the author mentions the heuristic suggested by larsmans and proposes some more advanced algorithms. If the above heuristic is not enough, you may have a look at the paper or maybe contact the author; he seems to be doing research in that area.
Brute force might not work out as well as you think...
Presume you have 100 variables and 20 groups:
You can put 1 variable in 20 different groups, which makes 20 combinations.
You can put 2 variables in 20 different groups each, which makes 20 * 20 = 20^2 = 400 combinations.
You can put 3 variables in 20 different groups each, which makes 20 * 20 * 20 = 20^3 = 8000 combinations.
...
You can put 100 variables in 20 different groups each, which makes 20^100 combinations, which is more than the minimum number of atoms in the known universe (10^80).
OK, you can do that a bit smarter (it doesn't matter where you put the first variable, ...) to get to something like Branch and Bound, but that will still scale horribly.
So either use a fast deterministic algorithm, like larsmans proposes.
Or if you need a more optimal solution and have the time to implement it, take a look at metaheuristic algorithms and software that implement them (such as Drools Planner).
You can sum the numbers and divide by the number of groups. This gives you the target value for the sums. Sort the numbers and then try to get subsets to add up to the required sum. Start with the largest values possible, as they will cause the most variability in the sums. Once you decide on a group that is not the optimal sum (but close), you could recompute the expected sum of the remaining numbers (over n-1 groups) to minimize the RMS deviation from optimal for the remaining groups (if that's a metric you care about). Combining this "expected sum" concept with larsmans answer, you should have enough information to arrive at a fast approximate answer. Nothing optimal about it, but far better than random and with a nicely bounded run time.
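A rough sketch of that "recomputed expected sum" idea (my own interpretation, not a tuned implementation):
def balanced_groups(numbers, n):
    # Greedily fill one group at a time toward the recomputed target
    # (sum of what's left divided by the number of groups left).
    remaining = sorted(numbers, reverse=True)
    groups = []
    for groups_left in range(n, 1, -1):
        target = sum(remaining) / groups_left
        group, total = [], 0
        for x in list(remaining):
            if not group or total + x <= target:
                group.append(x)
                total += x
                remaining.remove(x)
        groups.append(group)
    groups.append(remaining)          # the last group takes whatever is left
    return groups

print(balanced_groups([7, 6, 5, 4, 3, 2, 1], 3))
# -> [[7, 2], [6, 3], [5, 4, 1]] (sums 9, 9, 10)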
Do you know how many groups you need to split it into ahead of time?
Do you have some limit to the maximum size of a group?
A few algorithms for variations of this problem:
Knuth's word-wrap algorithm
algorithms minimizing the number of floppies needed to store a set of files, but keeping any one file immediately readable from the disk it is stored on (rather than piecing it together from fragments stored on 2 disks) -- I hear that "copy to floppy with best fit" was popular.
Calculating a cutting list with the least amount of off-cut waste.
What is a good algorithm for compacting records in a blocked file?
Given N processors, how do I schedule a bunch of subtasks such that the entire job is complete in minimum time? (multiprocessor scheduling)