
Recommend an algorithm for this?

Started by May 21, 2008 06:07 AM
47 comments, last by Timkin 16 years, 5 months ago
This is my problem: say I have a list of 20 people who I have to split up into 5 groups of 4. Each person has 5 integer attributes (labelled e.g. 0, 1, 2, 3, 4). I want to form the 5 groups so that, for each attribute, every group's average is pretty close to the other groups' averages for that attribute. For example:

Group 1 (avg for attribute 1 is 4, avg for attribute 2 is 3, etc.)
Group 2 (avg for attribute 1 is 4.5, avg for attribute 2 is 2.8, etc.)

Can anyone think of any algorithms that might help here? Thank you.

edit: I was considering using backtracking (I think that's what it's called), where I try every arrangement and back track when a solution doesn't work. Might be slow though.

[Edited by - anda on May 21, 2008 6:38:14 AM]
Believe it or not, (and I'm a little low on caffeine here this morning, so don't whack me) THIS is a situation where you could actually use genetic algorithms! I'm sure there are other ways, but I was so amused by the fact that, for once, we have a post where a GA could be used, I had to mention it.

Incidentally, you may want to do a bit of reading on bin-packing algorithms. Suffice to say, these are a bitch. Or, in more scientific terms... they are NP-Hard.

Dave Mark - President and Lead Designer of Intrinsic Algorithm LLC
Professional consultant on game AI, mathematical modeling, simulation modeling
Co-founder and 10 year advisor of the GDC AI Summit
Author of the book, Behavioral Mathematics for Game AI
Blogs I write:
IA News - What's happening at IA | IA on AI - AI news and notes | Post-Play'em - Observations on AI of games I play

"Reducing the world to mathematical equations!"

One thing that might work: randomly generate your groups and score them. Take the weakest team and swap its weakest member with a random member of the strongest team (you don't want to take the strongest guy from the strongest team, because then you get a team that is a bunch of losers with one good guy). Repeat until the delta between the strongest and weakest teams is sufficiently small. It may help to block swaps where the weak team would receive a weaker player than the one it is giving up, since that would just trigger an immediate re-swap; on the other hand, such a swap changes the strongest team too, so it may produce more even swapping.
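The swap loop described above could be sketched roughly like this. This is only an illustration of the idea, not a full solution: the function names, the totals-based notion of "weakest", and the stopping tolerance are all my own assumptions, and the loop can oscillate for the reason noted above.

```python
import random

def totals(group, attrs):
    # overall "strength" of a group: sum of all attribute values of its members
    return sum(sum(attrs[p]) for p in group)

def balance(groups, attrs, max_rounds=1000, tolerance=1.0):
    """Repeatedly swap the weakest member of the weakest group with a
    random member of the strongest group, until strong and weak groups
    are within `tolerance` of each other (or we give up)."""
    for _ in range(max_rounds):
        groups.sort(key=lambda g: totals(g, attrs))
        weak, strong = groups[0], groups[-1]
        if totals(strong, attrs) - totals(weak, attrs) <= tolerance:
            break
        # index of the weakest member of the weakest group
        i = min(range(len(weak)), key=lambda k: sum(attrs[weak[k]]))
        # random member of the strongest group
        j = random.randrange(len(strong))
        weak[i], strong[j] = strong[j], weak[i]
    return groups
```

Since every step is a straight swap, the group sizes never change; only the membership moves around.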
Quote: Original post by InnocuousFox
Believe it or not, (and I'm a little low on caffeine here this morning, so don't whack me) THIS is a situation where you could actually use genetic algorithms!

Indeed. Before you go reaching for the GA cudgel, though, consider simulated annealing. It works a lot like GA, but it's easier to reason about and easier to implement. The initial state would be a random distribution, and the candidate step would be switching two randomly chosen people.
What?!?!? Sneftel and InnocuousFox _recommending_ GAs? WHAT???? I thought it was cold, but....


I agree, SA or GA would work well here.

How much time do you have to run your algorithm? If you have a few minutes, you can just brute-force the problem. I can loop through all 2,546,168,625 partitions (just counting them) in a little over 2 minutes. If you avoid expensive operations (like dividing by 5 or taking square roots), you should be able to find the guaranteed optimum (however you define that) in something like half an hour. You can also abort many branches of the backtracking once the accumulated penalty is larger than the best solution found so far.
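The backtracking-with-pruning idea could look something like this sketch. Everything here is an assumption on my part: the penalty (squared deviation of a group's attribute averages from the global averages), the function names, and the trick of always placing the lowest-numbered unassigned person first so each unordered partition is generated exactly once.

```python
from itertools import combinations

def best_partition(people, attrs, group_size, target):
    """Exhaustive backtracking. `target` holds the global average of each
    attribute; a group's penalty is the squared deviation of its attribute
    averages from those targets. Branches whose accumulated penalty already
    exceeds the best complete solution are abandoned early."""
    n_attrs = len(attrs[0])

    def penalty(group):
        return sum((sum(attrs[p][a] for p in group) / group_size - target[a]) ** 2
                   for a in range(n_attrs))

    best = [float('inf'), None]   # [best penalty, best partition]

    def rec(remaining, groups, acc):
        if acc >= best[0]:        # bound: this branch can't improve
            return
        if not remaining:
            best[0], best[1] = acc, groups
            return
        first, rest = remaining[0], remaining[1:]
        # fix the first free person, so each partition is visited only once
        for others in combinations(rest, group_size - 1):
            group = (first,) + others
            left = tuple(p for p in rest if p not in others)
            rec(left, groups + [group], acc + penalty(group))

    rec(tuple(people), [], 0.0)
    return best[1], best[0]
```

For 20 people in groups of 4 this visits at most the ~2.5 billion partitions mentioned above, fewer with the pruning; for a toy instance of 8 people it is instant.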
The search space is no larger than 305,540,235,000 (a naive upper limit given by C(20,4) × C(16,4) × C(12,4) × C(8,4)).

I'd turn the problem into a decision tree myself. Some simple metrics can early-out before depth 5, pruning in a fashion similar to alpha-beta.

It really depends what exactly is being minimized/maximized here; I don't think it's been defined very well.

Are we taking the maximum deviation from the mean of each attribute (the largest deviation for an attribute across the 5 groups) and then summing the squares of each attribute's deviation?

Or perhaps we are taking the total squared deviation from the mean of each attribute (the usual standard deviation) and then summing those over all attributes?

In any event, regardless of the metric you are using, there must be some simple heuristics that can early-out from a tree search, pruning those 305 billion nodes down significantly.
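The two candidate metrics described above can be written out concretely. This is just my reading of the two options (worst-case deviation vs. total squared deviation); the function names and the `target` parameter (the global per-attribute averages) are assumptions for illustration.

```python
def group_means(groups, attrs, n_attrs):
    # per-group average of each attribute
    return [[sum(attrs[p][a] for p in g) / len(g) for a in range(n_attrs)]
            for g in groups]

def minimax_metric(groups, attrs, target):
    """Worst case: the largest absolute deviation of any group's
    attribute average from the overall mean of that attribute."""
    means = group_means(groups, attrs, len(target))
    return max(abs(m[a] - target[a]) for m in means for a in range(len(target)))

def sum_sq_metric(groups, attrs, target):
    """Total squared deviation over all groups and all attributes."""
    means = group_means(groups, attrs, len(target))
    return sum((m[a] - target[a]) ** 2 for m in means for a in range(len(target)))
```

Both metrics are zero exactly when every group's averages match the global averages, but they penalize outliers differently, which matters for which partitions a search prefers.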
GA is basically just a form of stochastic descent, which might work here. Initialize your groups using a sorted fit-first strategy, then keep applying whichever swap minimizes your sum of squared error until you can't find one. You might want to look at 3-way swaps too.

You'll most likely get stuck in a local minimum, but as others pointed out, it's an NP-hard problem.
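One way to read the "sorted fit-first" seeding step is a snake draft: rank people by total strength and deal them out back and forth. That interpretation, and the function name, are my own assumptions; the point is just to start the swap search from something already roughly balanced.

```python
def seed_groups(attrs, n_groups):
    """Snake draft: rank people by total attribute strength, then deal
    them into groups in order 0..k-1, k-1..0, 0..k-1, ... so that strong
    and weak people are spread evenly across groups."""
    order = sorted(range(len(attrs)), key=lambda p: sum(attrs[p]), reverse=True)
    groups = [[] for _ in range(n_groups)]
    for i, p in enumerate(order):
        rnd, pos = divmod(i, n_groups)
        g = pos if rnd % 2 == 0 else n_groups - 1 - pos
        groups[g].append(p)
    return groups
```

A seeding like this gives the local search a much better starting point than a purely random partition, so fewer swap iterations are needed.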
Quote: Original post by alvaro
How much time do you have to run your algorithm?

Preferably a few seconds for 70 people; definitely under 30 seconds.

Because the interface is a website (using a servlet and JSP), it wouldn't be so bad if I could display the progress of the algorithm, and then it could take longer.


Quote: Original post by Rockoon1
It really depends what exactly is being minimized/maximized here; I don't think it's been defined very well.
What do you mean by being minimized and maximized?



A lot of the terms you guys are using I haven't learned yet.

I'm currently learning trees and graphs at uni; only next semester will I be getting into the more complicated stuff.

I'm not afraid to learn one of these algorithms, but I'll have to pick one or two so I can learn it in time.

I'm currently looking at simulated annealing, but there's a lot of reading to do.

edit:
I haven't completely thought out my design for organising the people, but what I described at the beginning is what I'm trying to accomplish first. I have to start somewhere.
Simulated annealing in a nutshell: You start with a bad solution. Over a series of "rounds", you try to modify it into a good solution. Each round, you think of a way to slightly change the solution, and see how good an idea it is, and maybe do it. Early on in simulated annealing, you're likely to do it no matter how bad an idea it is; near the end, you only do it if it's a good idea.
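The nutshell description above maps quite directly onto code. This is a generic sketch, not this thread's final solution: the candidate move (swapping the group labels of two random people, as Sneftel suggested earlier), the cost signature, and the cooling parameters are all assumptions.

```python
import math
import random

def anneal(assign, attrs, cost, t0=2.0, cooling=0.995, steps=5000):
    """`assign` maps person index -> group index. Each round proposes
    swapping the groups of two random people; a worse result is accepted
    with probability exp(-delta / T), which shrinks as T cools, so bad
    moves are common early and rare near the end."""
    cur = cost(assign, attrs)
    best_cost, best = cur, assign[:]
    t = t0
    for _ in range(steps):
        i, j = random.sample(range(len(assign)), 2)
        if assign[i] == assign[j]:
            t *= cooling
            continue
        assign[i], assign[j] = assign[j], assign[i]
        new = cost(assign, attrs)
        if new < cur or random.random() < math.exp((cur - new) / max(t, 1e-9)):
            cur = new                                    # accept the move
            if cur < best_cost:
                best_cost, best = cur, assign[:]
        else:
            assign[i], assign[j] = assign[j], assign[i]  # undo the swap
        t *= cooling
    return best, best_cost
```

Because the move is a pure swap, group sizes are preserved automatically, and keeping a copy of the best assignment seen means the late random wandering can never lose your best result.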

