
Decision theory, high-risk scenarios, etc

Started by February 15, 2008 10:34 AM
30 comments, last by Kylotan 16 years, 9 months ago
Some people value the highs of living on the edge of death, or saving a life, or glory. In days past it was an honour to die in battle. Modern examples include rescue workers, boxers who ignore doctors' warnings, 'daredevils', David Blaine, etc.

So there should be some agents who will go for the smaller groups. They will all likely be of similar 'dispositions'.
Quote: Original post by Kylotan
Ok, I think my problem is that I don't have an intuitive picture of what this curve looks like, and I have no tool to plot it on. Is it highest where N=1, dropping down as N increases, approaching -C in the limit (N -> inf)?
It's highest at some arbitrary point determined by the constants, which is not necessarily 1.

Quote: Perhaps if I retract my original statement of "arrange themselves in groups" and replace it with "be arranged into groups", it will make things clearer. In other aspects of the game, they will sense and act autonomously, but at this point, I'm not interested in having them maximise their reward, just in forming groups of reasonable sizes where they all anticipate a positive reward, taking into account that chance of a massive cost.
Then just do the curve maximum.

Here's a Matlab plot with $=30, D=50, C=1:

[plot image not preserved]
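For anyone without Matlab, here's a minimal Python sketch that produces a curve of this shape. The thread doesn't show Sneftel's exact formula, so the success-probability model below (chances saturating as the group grows, with a made-up scale constant k) is an assumption, as is reading the "$" in the plot caption as the shared reward; the constants 30, D=50 and C=1 match the plot's.

    import numpy as np
    import matplotlib.pyplot as plt

    R, D, C = 30.0, 50.0, 1.0  # shared reward, death cost, participation cost (as in the plot)

    def expected_utility(n, k=2.0):
        # Assumed model: the chance of success saturates as the group grows.
        p = 1.0 - np.exp(-n / k)
        # Share of the reward on success, minus the risk of dying on failure,
        # minus the flat cost of taking part.
        return p * (R / n) - (1.0 - p) * D - C

    n = np.arange(1, 31)
    u = expected_utility(n)
    print("best group size:", n[np.argmax(u)])  # the curve's maximum

    plt.plot(n, u)
    plt.xlabel("group size N")
    plt.ylabel("expected utility per agent")
    plt.show()

With these made-up constants the maximum lands at N=8. Sneftel's real curve will differ, but the shape should be similar: negative for tiny groups, peaking at some interior N, then declining toward -C as the shared reward is diluted.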
Quote: I think the idea is that since you don't know how many rewards you will get in your future lifetime, you don't know how much you could lose by dying. Therefore death is modelled as an infinitely high cost ...

Again, only if you are immortal. Notice that Sneftel puts a cost on participation for every task. His agents effectively believe they will die after (D / C) tasks, and choose accordingly: losing the death cost D is, in effect, the same as forfeiting the participation cost C on D / C future tasks.

Of course, if you can opt out of tasks without cost, then it's always best to wait until you can max out the curve by joining. What rational agent would be the first to join a task, without knowing how many (if any) other agents will join him?


Quote: Original post by Daerax
So there should be some agents who will go for the smaller groups. They will all likely be of similar 'dispositions'.


Yeah, the characters in this game will have traits that affect their perception of risk and glory. Similarly, some will prefer smaller groups to larger ones. I'll apply these as modifiers after I fix the baseline for the average character.

Quote: Original post by Sneftel
Quote: Original post by Kylotan
Perhaps if I retract my original statement of "arrange themselves in groups" and replace it with "be arranged into groups", it will make things clearer. In other aspects of the game, they will sense and act autonomously, but at this point, I'm not interested in having them maximise their reward, just in forming groups of reasonable sizes where they all anticipate a positive reward, taking into account that chance of a massive cost.
Then just do the curve maximum.


Ok, I'll play around with SciPad and try to recreate what you did there. And then I suppose I'll have to dig out the calculus book. :) The problem for me was always trying to find (or indeed create) this curve, since it's not very intuitive to me.

Quote: Original post by Argus2
Of course, if you can opt out of tasks without cost, then it's always best to wait until you can max out the curve by joining. What rational agent would be the first to join a task, without knowing how many (if any) other agents will join him?


I thought I explained above that there isn't any sort of iterative process - you're presented the option of joining a task with N people in it, and that's it. There's no point where an agent takes the decision to be member N+1. (It's possible that if one of the N rejects the group, I'll present it again to the rest, but that's just an optimisation to avoid having to pick another permutation of characters.)
I think the part that makes the game a bit boring/bland atm is that all agents know the actual chance of winning and losing. Making agents self-contained objects with their own parameters for what they think the chances are would make the results more interesting. Add a small bit of randomness and make them adjust their expectations based on what they've experienced and seen happen to others.

Somehow, it also feels like you should account for the opportunity cost of not being able to do something else while involved with an accepted task - so you could effectively lose out by joining a large group.
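A minimal sketch of that per-agent perception idea, assuming each agent keeps a private estimate of the success chance and nudges it toward what it observes (the learning rate and noise spread here are invented numbers):

    import random

    class Agent:
        def __init__(self, initial_guess=0.5, learning_rate=0.1, noise=0.05):
            self.perceived_p = initial_guess  # the agent's belief about the success chance
            self.learning_rate = learning_rate
            self.noise = noise

        def estimate(self):
            # The estimate the agent acts on: its belief plus a little personal noise.
            p = self.perceived_p + random.gauss(0.0, self.noise)
            return max(0.0, min(1.0, p))

        def observe(self, succeeded):
            # Nudge the belief toward the observed outcome (an exponential moving
            # average); the same update can apply to outcomes seen happening to others.
            outcome = 1.0 if succeeded else 0.0
            self.perceived_p += self.learning_rate * (outcome - self.perceived_p)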
I wrote up a big reply to this thread yesterday but decided not to post it as I wanted to think about it more... and now I'm looking at things slightly differently, so much of what I wrote yesterday may not be the best approach. Here's my current thinking...

In principle, you need to determine whether your agents are risk-averse, risk-neutral or risk-taking, and then each agent can make a decision, per task, as to whether to accept the task or not. The problem you have is that in order to solve this, each agent needs to know how many other agents are going to be involved, so that it can reasonably estimate the probabilities of success/failure and hence the expected rewards. This means the desired result is an equilibrium state of a dynamic decision problem, which is not trivial to find. It has some broad relationships with problems in game theory and economic systems, but I won't go into those here.

Ultimately, if you want an information-theoretic solution, your agents should NOT be limited to making their decisions based only on the currently proposed task.

If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.

Now, the interesting part. Agents should make a bid relative to their risk behaviour. Risk-taking agents will bid more than the expected payoff (but less than the maximum reward that could be obtained... that wouldn't be rational... unless, of course, you want irrational agents to take part as well ;) ). Risk-neutral agents will bid exactly the expected payoff. Risk-averse agents will bid less than the expected payoff.

You can extend this model by actually forcing the agents to pay their bid. If they are accepted and succeed, they'll get some reward back. If they fail, they have paid a cost to participate. If agents are losing money they should adjust their strategy to increase their future expected rewards. The basic way of doing this is for an agent to 'play it safe' by being more risk-averse and offering less money to participate... but then it will take part in fewer tasks. By adding a taxation system you can force agents to become more risk-taking over time. If you get the parameter balance right you can end up with a very nice little system.
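A rough Python sketch of the loop described above. The risk factors, noise spread and tax rate are invented numbers, and the cutoff here is simply the average bid (one of the schemes suggested further down the thread):

    import random

    def make_bid(expected_payoff, risk_attitude, noise=0.5):
        # risk_attitude > 1 bids above the expected payoff (risk-taking),
        # = 1 bids exactly it (risk-neutral), < 1 bids below it (risk-averse).
        return expected_payoff * risk_attitude + random.gauss(0.0, noise)

    def run_task(agents, expected_payoff, total_payoff, success, tax_rate=0.02):
        # Collect bids and accept everyone at or above the average bid.
        bids = [(make_bid(expected_payoff, a["risk"]), a) for a in agents]
        cutoff = sum(b for b, _ in bids) / len(bids)
        accepted = [(b, a) for b, a in bids if b >= cutoff]

        # Participants pay their bid up front...
        for bid, agent in accepted:
            agent["money"] -= bid
        # ...and share the reward evenly if the task succeeds.
        if success and accepted:
            share = total_payoff / len(accepted)
            for _, agent in accepted:
                agent["money"] += share

        # A flat tax erodes the savings of agents who sit out,
        # nudging them toward risk-taking over time.
        for agent in agents:
            agent["money"] -= agent["money"] * tax_rate

An agent that finds itself losing money would then lower (or, under taxation pressure, raise) its risk factor before the next round, as described above.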

I implemented a system like this many years ago; although the application was slightly different, it had similar aims... and it worked very well (particularly with the noise and taxation).

Cheers,

Timkin
Quote: Original post by dascandy
I think the part that makes the game a bit boring/bland atm is that all agents know the actual chance of winning and losing.


No, in the actual game they will be more than capable of misjudging the risk! It's just that while I tweak the parameters to get the basic system working, I have to work with median values to keep the number of variables manageable.

Quote: Original post by Timkin
Ultimately, if you want an information-theoretic solution, your agents should NOT be limited to making their decisions based only on the currently proposed task.


I was hoping to simplify things by presenting tasks one by one and picking groups for them in turn. I'm not committed to any particular brand of solution, just one that appears to demonstrate some appreciation of the risk/reward payoff for any given group.

Quote: If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.


I don't understand how that deals with situations where agents are only safe in larger groups. How do I select the bid level at which I accept participants? I can't do it randomly, because the group size must make sense. And I don't think I can just keep adding participants one by one until some perceived safety threshold is reached, because that obviously affects the payoff from the task.
Quote: Original post by Kylotan
Quote: If you just want an ad hoc method for determining reasonable assignments of players to tasks, I'd go with a bidding system. Each agent bids an amount it is prepared to pay to take part in the task. Add a small amount of noise to bids for more non-deterministic results. Then, for a given task, select the bid level at which you will accept participants (you can do this randomly, or by looking at the actual bids and working out how many agents you want to participate). All agents offering a bid >= this level are accepted and take part.


I don't understand how that deals with situations where agents are only safe in larger groups.


Why do you want to ensure safety? I interpreted your earlier posts to indicate that you wanted agents to be able to make a decision to join or not based on their attitude to risk. By allowing agents to bid, you are essentially asking them to declare their risk strategy (by comparing their bid to the expected payoff).

Quote: How do I select the bid level at which I accept participants?


As to selecting a bid cutoff, you have two choices as I see it.

1) Declare the number of required agents for the task at the beginning of the bidding.

This will enable agents to estimate the expected payoff and bid accordingly. Count down from the highest bid until you have enough agents for the task.

2) Don't declare the required number of agents and simply select a bid cutoff according to some scheme.

For example, take the average bid as the cutoff. Or, starting with the highest bid and working down, accept bids until their sum equals the total payoff for the task (this will ensure a neutral economy if you don't include taxation).
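In Python, those cutoff schemes might look something like this (bids are plain numbers here; which scheme to use is a design choice):

    # Scheme 1: the required group size was declared up front;
    # count down from the highest bid until you have enough agents.
    def cutoff_by_count(bids, required):
        return sorted(bids, reverse=True)[:required]

    # Scheme 2a: take the average bid as the cutoff.
    def cutoff_by_average(bids):
        avg = sum(bids) / len(bids)
        return [b for b in bids if b >= avg]

    # Scheme 2b: starting with the highest bid and working down, accept
    # bids until they add up to the task's total payoff (keeps the
    # economy neutral in the absence of taxation).
    def cutoff_by_payoff(bids, total_payoff):
        accepted, total = [], 0.0
        for bid in sorted(bids, reverse=True):
            if total >= total_payoff:
                break
            accepted.append(bid)
            total += bid
        return accepted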

You may need to iterate over the bidding process several times, culling out people that don't want to participate given the other people bidding. For example, a risk-averse player is unlikely to want to participate with risk-taking players, since the latter will prefer smaller groups, which have a lower probability of success and hence a lower expected payoff (but again, it's not linear, since the share of the rewards goes up with fewer participants).



It would be helpful if we had a little more information about the game, like what are the important factors in its design? Is it desirable for agents to live as long as possible? Are they trying to maximise their short-term rewards? What is the aim of participation? Just to get the tasks completed? Exactly what information is available to the participants?

Cheers,

Timkin

Quote: Original post by Kylotan
I have a system where several agents have to be arranged into groups to attempt certain abstract tasks, for which a fixed reward is on offer to be shared among the group members if successful.
...
One factor not yet mentioned though is that these tasks carry risk; failed tasks can be considered dangerous and an agent may be destroyed as a result of that task.


These agents can die, and are therefore mortal.
Can a replacement agent be purchased, and if so, how much does it cost?
--"I'm not at home right now, but" = lights on, but no ones home
Quote: Original post by Timkin
Why do you want to ensure safety? I interpreted your earlier posts to indicate that you wanted agents to be able to make a decision to join or not based on their attitude to risk.


No, I want agents to form groups where the size of the group reflects their wish to reduce their perceived risk (while not making the payoff trivial).

Quote: It would be helpful if we had a little more information about the game, like what are the important factors in its design? Is it desirable for agents to live as long as possible?


Yes, it is. That's why I started off with the death analogy, to emphasise that sometimes failure is a really bad thing, from their perspective.

Quote: What is the aim of participation?


Trivially, it's to get the reward. That reward is (mostly) a resource which they can then spend on things, and the cycle repeats.

Quote: Exactly what information is available to the participants?


The riskiness of the task, the total reward available (which will be split evenly among participants if successful), the duration of the task (which is how long they'll be unavailable for other tasks), and in the situation I envisaged, the attributes of the other members of the proposed group.
