Ah whoops, I forgot I had a reply to reply to!
Hidden layers
My "memories" setup is just a more generalized form of your standard "hidden layers" construction. Instead of breaking the ANN down into layers, I categorize my neurons as inputs (whose values are determined before batch-processing begins), memories (whose values may change with every iteration), or outputs (whose values are only used after batch-processing finishes). It also saves the values not just between iterations but between batches of iterations, hence the name "memories". It's also like a chunk of memory the ANN can use as "scratch paper", in whatever manner it sees fit.
So it's capable of everything a traditional ANN with hidden layers is capable of, and then some. And if I wanted to enforce that certain coefficients in the coeff matrix must always be zero, I could make it exactly emulate a traditional "hidden layers" ANN.
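To make that concrete, here's a rough sketch of one batch of iterations in my setup (sizes, names, and the tanh activation are made up for illustration; the real thing differs in details):

```python
import numpy as np

def run_batch(coeff, inputs, memories, n_outputs, n_iterations):
    """One batch: inputs stay fixed, memories update every iteration,
    and outputs are only read after the last iteration.
    The state vector is [inputs | memories | outputs]."""
    state = np.concatenate([inputs, memories, np.zeros(n_outputs)])
    n_in = len(inputs)
    for _ in range(n_iterations):
        # every non-input neuron is a squashed weighted sum over the whole state
        state[n_in:] = np.tanh(coeff @ state)
    outputs = state[n_in + len(memories):]
    new_memories = state[n_in:n_in + len(memories)]
    return outputs, new_memories   # memories persist into the next batch

# tiny demo: 3 inputs, 5 memories, 2 outputs, 8 iterations per batch
rng = np.random.default_rng(0)
coeff = rng.normal(scale=0.1, size=(5 + 2, 3 + 5 + 2))
outputs, memories = run_batch(coeff, np.array([0.5, -0.2, 1.0]), np.zeros(5), 2, 8)
```

Forcing chosen entries of `coeff` to stay zero is how the layered feed-forward structure falls out as a special case.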
"The largest size"
Yes, I suppose I could just have an arbitrarily large array of inputs, and have a reasonable "default value" to indicate that they aren't in use... but there really is no upper limit on how many contact points the feet can have at once. And even if I chose an arbitrarily large "maximum" number... how would I make the response "symmetric"? I.e. if there are multiple ways to represent virtually identical states, it should behave nearly identically regardless. I guess I could sort the contact points by priority somehow?
Also, a real person doesn't pack contact points into an array. Maybe I could attempt to categorize them by what part of the foot they're in contact with, what their normal vector is, etc.? Even categorizing the contact points like that, if I still had a separate array for each... I don't know, it sounds weird. Maybe I'm being too stubborn.
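For what it's worth, the "sort by priority, then pad" idea could look something like this (the field layout and the priority key are made up, not anything I've settled on):

```python
MAX_CONTACTS = 8    # arbitrary cap, not a real physical limit
SLOT_SIZE = 7       # (pos_x, pos_y, pos_z, n_x, n_y, n_z, impulse), made-up layout

def pack_contacts(contacts):
    """Canonicalize a variable-length contact list into a fixed-size input block.
    Sorting by a deterministic key means two orderings of the same physical
    state produce the same input vector, which is the "symmetry" I'd want."""
    key = lambda c: (-c[6], c[0], c[1], c[2])      # strongest impulse first, ties by position
    ordered = sorted(contacts, key=key)[:MAX_CONTACTS]
    flat = [x for c in ordered for x in c]
    flat += [0.0] * (SLOT_SIZE * (MAX_CONTACTS - len(ordered)))   # "not in use" default
    return flat

a = (0.0, 0.0, 0.0,  0.0, 1.0, 0.0,  2.0)
b = (1.0, 0.0, 0.0,  0.0, 1.0, 0.0,  5.0)
assert pack_contacts([a, b]) == pack_contacts([b, a])   # order-independent
```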
Normalizing my inputs
You bring up a good point: I haven't been (strictly) normalizing my inputs to [-1, 1]. I did notice that some of the goal-state quantities were way too big (because I was multiplying by 60 Hz to compute them), so I multiplied them by something like 0.02 to compensate. But I didn't want to run the inputs through tanh() unnecessarily, because it seems to me that if the ANN has use for the "squashed" inputs, it can do that itself, whereas if the original linear values would have been more useful, it has to do a bunch of extra work to un-squash that data.
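A quick illustration of why I'd rather scale than squash (the 0.02-ish factor is roughly the per-tick scale for a 60 Hz quantity):

```python
import math

DT = 1.0 / 60.0    # ~0.0167, close to the 0.02-ish factor I've been using

def scale_rate(x):
    # linear scaling preserves relative differences exactly
    return x * DT

# tanh() is only near-linear for small |x|; large values get crushed together,
# and the ANN would have to work to undo that:
# math.tanh(3.0) and math.tanh(6.0) are both ~1, the distinction nearly gone,
# while scale_rate(180.0) vs scale_rate(360.0) stays a clean factor of 2
```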
Not sure what to make of this:
ANN feedforward (if you choose to implement it) need only be accomplished once per frame as long as your delta-t values in the model all accurately reflect the operating frequency.
"Only limited by the fitness function"
You can't actually mean that, can you? Sorry if this is bordering on nitpicking, but that "only" is so inaccurate I can't help but address it at length.
You said yourself you didn't think something without any hidden-layer neurons would be capable of solving this sort of problem. That's just the extreme case of "too few memories". Or do you mean to suggest letting it dynamically increase the number of memories? Hmm... I could try it, but I'd need convincing.
And the number of iterations matters too; iterations are signal propagation. If I configured my ANN to emulate a traditional feed-forward hidden-layers ANN with 5 hidden layers, and it only did one iteration per physics tick, every action it takes would be 5 ticks late relative to the input that prompted it. That's only 83.3 ms, so it's actually better than the average human reaction time, but that assumes 5 hidden layers is sufficient. Even discounting the extreme case of zero iterations, clearly the number of iterations is going to affect the quality of the solutions.
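The latency point is easy to demonstrate with a toy shift-register version of the layered emulation (pure illustration, identity weights between layers):

```python
LAYERS = 5          # 5 "hidden layers" emulated as a chain of memory slots
TICK_HZ = 60.0

state = [0.0] * (LAYERS + 1)   # state[0] = input, state[LAYERS] = output
state[0] = 1.0                 # input changes at t = 0

ticks = 0
while state[LAYERS] == 0.0:
    # one iteration per physics tick: each slot copies its predecessor
    state = [state[0]] + state[:-1]
    ticks += 1

latency_ms = ticks / TICK_HZ * 1000.0   # 5 ticks -> ~83.3 ms
```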
And then there's the choice of inputs and outputs!
I don't know what it would look like, but I can say with reasonable confidence that there is some curve in (number of memories, number of iterations) space below which no coefficient matrix will be "good enough" (choice of inputs and outputs are implicit parameters). Though, as I said earlier, I would be very much surprised if (100, 8) is on the wrong side of that curve.
Multiple behaviors
The thing that makes this difficult is that I have a multifaceted goal state, all the parts of which need to be achieved simultaneously (I think). The most recent formulation of this goal state is a desired net force and torque vector on each of three bones: the left foot, the right foot, and the pelvis. And by "net" I mean what's measurable after the constraint solver has had its say, so it's equivalent to specifying a linear and angular velocity I want it to end up with, or a position and orientation. And in fact, a desired pos & ori is how I'm currently selecting the desired force and torque vectors... but in theory I can choose a goal state in any of those terms.
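In case it helps, the pose-to-force conversion I mean is just simple rigid-body bookkeeping, something like this (1-D linear case, names made up; the angular part is analogous with the inertia tensor):

```python
DT = 1.0 / 60.0   # physics tick

def desired_net_force(mass, pos, vel, pos_goal):
    """The net force (as measured after the constraint solver) that would
    land the bone on pos_goal one tick from now, via the velocity it implies."""
    vel_goal = (pos_goal - pos) / DT       # velocity that reaches the goal in one tick
    return mass * (vel_goal - vel) / DT    # force that produces that velocity change

# angular version: tau = inertia * (omega_goal - omega) / DT,
# with omega_goal derived from the orientation error
```

This is why the goal state can be expressed interchangeably as forces/torques, end-of-tick velocities, or a pose: each form determines the next.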
"Dynasties"
The thing is, I can't just let them evolve completely separately and then hope to simply lerp them together or something. If I don't force it to attempt multiple goals simultaneously, the strategies it comes up with to achieve one goal will come at the exclusion of the others.
I've been experimenting with a scheme for GA with multiple simultaneous scoring categories, which at one point I considered calling "dynasties". Inspiration for the idea came from a phrase I once encountered in a paper, "GA with villages". I didn't look up what it meant, but from the sound of it, I guessed it's a scheme for compromising between preserving genetic diversity (between "villages") and maintaining enough homogeneity for crossovers to be viable (within "villages").
Anyway here's how my "dynasties" scheme works:
For each scoring category there is a group of N parents, and every generation, each parent chooses a replacement for itself, which may be an exact clone, a single-parent mutant, or a crossover with any of the parents in any of the categories. When there were only 1 or 2 parents per category, the label "dynasties" felt more appropriate; with a lot, it's more like "nepotistic apprenticeships".
But whatever, it's not like I'm planning on patenting it.
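For reference, one generation of the scheme looks roughly like this (the uniform 1/3 split between clone, mutant, and crossover is a placeholder; in practice the "choosing" is tied to the category's scoring):

```python
import random

def next_generation(categories, mutate, crossover, rng=random):
    """categories: dict mapping scoring-category name -> list of N parent genomes.
    Each parent picks its own replacement: an exact clone, a single-parent
    mutant, or a crossover with any parent from any category."""
    all_parents = [g for parents in categories.values() for g in parents]
    new = {}
    for name, parents in categories.items():
        children = []
        for p in parents:
            roll = rng.random()
            if roll < 1 / 3:
                children.append(p)                      # exact clone
            elif roll < 2 / 3:
                children.append(mutate(p))              # single-parent mutant
            else:
                # crossover partner may come from any category (the nepotism)
                children.append(crossover(p, rng.choice(all_parents)))
        new[name] = children
    return new
```

Crossing between categories is what keeps the populations homogeneous enough for crossovers to stay viable, village-style.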
I can't really tell if it's a step in the right direction. It definitely isn't "good enough" yet though, and I get the sense it's not going to get there just by being left to run for a few days.
Emulating PD with feed-forward ANN
I don't see how that's possible. To get values for the I and D terms, it needs to remember what those components were on the last tick. If it stores "memories" separate from the normal inputs and outputs, doesn't it cease to qualify as "feed-forward"? Or is it a special case where some of the outputs are fed back in as inputs? Because that sounds exactly like the original premise that led me to come up with my "memories" system. And if you do that, regardless of whether it still qualifies as "feed-forward", the idea of having a "training set" of the correct outputs for a given set of inputs (with the memories appearing as both inputs and outputs) becomes much more complicated, if not impossible. At least, I don't know how to do it.
Or did you just mean I would give it the values for P, I, and D as normal inputs, rather than any kind of "emulation"?
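For context, here's the discrete PID step as I understand it; the integral and the previous error both have to survive between ticks, which is exactly what my "memories" slots would hold (gains and names are illustrative):

```python
DT = 1.0 / 60.0   # physics tick

def pid_step(error, integral, prev_error, kp=1.0, ki=0.1, kd=0.05):
    integral += error * DT                   # needs memory #1: accumulated error
    derivative = (error - prev_error) / DT   # needs memory #2: last tick's error
    output = kp * error + ki * integral + kd * derivative
    return output, integral, error           # carry both memories to the next tick
```

A stateless feed-forward pass gets `error` and produces `output`, but `integral` and `prev_error` have nowhere to live unless they're fed back in.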
Aside: rules check: are my posts here too blog-like? I know some places have rules (or guidelines) against using the forum as your dev blag, and I posted this in the AI board (though it's since been moved to Math & Physics) rather than "Your Announcements"... But I just have so much to say!