Quote: Original post by Kylotan
...I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).
How do you define 'risk' for your agents (quantify it please).
Quote: the duration of the task (which is how long they'll be unavailable for other tasks), and in the situation I envisaged, the attributes of the other members of the proposed group.
This clearly indicates that agents should be trying to maximise their expected future reward (or minimise their expected future loss), given the population of agents and the set of tasks (at least in the ideal solution).
Okay... more info needed...
Can tasks run concurrently, or are all tasks sequentially ordered? If the latter, each agent has the option to participate in every task in turn. If the former, an agent must choose a schedule of tasks to participate in, such that the schedule maximises some quantity over that set of tasks.

If you force agents to choose a task whenever they are free, and to evaluate their potential risk/reward based only on that task, you will not be able to make any assurances about the long-term viability of agents (nor encode this in their solutions). They need to be able to consider what they are giving up by accepting any given task in order to make rational decisions.
You should probably use a discounted future reward model of expected utility.
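For concreteness, here's a rough Python sketch of what I mean by a discounted expected-utility evaluation of a schedule. All of the names here (Task, success_prob, reward, loss, gamma) are just placeholders for whatever attributes your agents actually track; it's not meant as the implementation:

```python
# Minimal sketch of discounted expected utility over a candidate schedule.
# The attribute names are illustrative assumptions, not from the original post.
from dataclasses import dataclass
from typing import List

@dataclass
class Task:
    duration: float       # how long the agent is tied up
    success_prob: float   # agent's estimate, given the proposed group
    reward: float         # payoff on success
    loss: float           # cost on failure

def expected_value(task: Task) -> float:
    """Risk-adjusted expected payoff of a single task."""
    return task.success_prob * task.reward - (1.0 - task.success_prob) * task.loss

def discounted_utility(schedule: List[Task], gamma: float = 0.95) -> float:
    """Sum of expected payoffs, discounted by how far in the future each
    task starts (measured here as elapsed duration)."""
    utility, elapsed = 0.0, 0.0
    for task in schedule:
        utility += (gamma ** elapsed) * expected_value(task)
        elapsed += task.duration
    return utility
```

An agent would evaluate discounted_utility() for each candidate schedule and commit to the one with the highest value; gamma controls how much it cares about payoffs far in the future relative to immediate ones.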
Fundamentally you still have one problem: each agent can only make a fully informed decision after all the other agents have made theirs. You can get around this by asking agents to list their preferences for tasks. Once preferences have been given (the first assignment might be random, or based on some agent attribute), each agent can assess the potential risk/reward of each task more accurately and re-order its preferences. You could iterate this and hope for a stable solution, or simply limit each agent to a finite number of changes it can make to its preference list.
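Something along these lines, again just a sketch: the score() function stands in for whatever risk/reward estimate your agents compute given a provisional assignment, and none of these names come from your post:

```python
# Rough sketch of the iterated-preference idea: agents rank tasks, look at the
# provisional assignment, then re-rank given who else signed up for what.
from typing import Callable, Dict, List

def iterate_preferences(
    agents: List[str],
    tasks: List[str],
    score: Callable[[str, str, Dict[str, List[str]]], float],
    max_rounds: int = 10,
) -> Dict[str, List[str]]:
    """score(agent, task, assignment) -> perceived risk/reward of that task
    for that agent, given the current provisional assignment of all agents."""
    # Start every agent with an arbitrary preference ordering.
    prefs = {a: list(tasks) for a in agents}
    for _ in range(max_rounds):
        # Provisional assignment: each agent joins its current top choice.
        assignment: Dict[str, List[str]] = {t: [] for t in tasks}
        for a in agents:
            assignment[prefs[a][0]].append(a)
        # Re-rank tasks now that group compositions are visible.
        new_prefs = {
            a: sorted(tasks, key=lambda t: score(a, t, assignment), reverse=True)
            for a in agents
        }
        if new_prefs == prefs:   # stable: nobody wants to change
            break
        prefs = new_prefs
    return prefs
```

The max_rounds cap plays the role of the "finite number of changes" above; without it (or some equivalent) there's no guarantee the re-ranking ever settles down.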