Quote: Original post by Raghar
Quote: Original post by wodinoneeye
From what I've seen so far of Intel's Larrabee, it may be very useful in AI applications. The chip, which allegedly will start being produced late this year, is to have 16 or 32 P5 2 GHz CPU cores on it (each with its own L1 cache).
Intel plans to initially use them as GPUs, but since each core is capable of 'atomic' (non-ganged) operation, they should be able to run independent programs.
Larrabee is an in-order CPU, which decreases real processing power by about 3x. Larrabee is a GFX card. Larrabee has a thermal design power (TDP) of 300 W.
The majority of people simply like their electricity bill too much to use a 300 W monster.
Quote: AI in games could easily be improved by magnitudes to get away from the mannequin-like or limited, choreographed, scripted NPCs/Monsters/Opponents. Simultaneously running 32 AI scripts would allow more complex/longer/deeper behaviors. Planners and task-driven AI methods would be more doable (these often call for a lot of evaluation of parallel solutions where only one gets picked as 'best'), as would doing numerous pattern matchings to classify a changing situation.
All my AIs are compiled (in Java), so no scripting.
The majority of bugs in games are because of scripting, thus multiple parallel scripts are a recipe for disaster.
Quote: One limitation may be the memory bandwidth. Some AI solutions search through large data sets (the AI 'programming' itself is often really just script data).
Yes, the on-die controller is still too weak.
Quote: 32 data-hungry cores would quickly overwhelm memory processing (even if they go to GDDR5...). Trying to retain code in the cache space of each core might be difficult (how small can you fit a bytecode engine???)
It depends on the memory interface. A GFX card could have a wide memory interface, and thus easily compensate for the increased number of cores.
With 8 memory chips, the card would have a 512-bit memory interface. With more chips, or wider chips, it would increase its transfer rate even more, as long as the data would stay on die.
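For a rough sense of what a bus that wide buys you, here's a back-of-the-envelope sketch in Java (the effective transfer rate below is an assumed figure for illustration, not a published Larrabee spec):

// Back-of-the-envelope peak bandwidth for a hypothetical 512-bit memory interface.
// The transfer rate is an assumption for illustration, not a Larrabee spec.
public class BandwidthEstimate {
    public static void main(String[] args) {
        int busWidthBits = 512;        // e.g. 8 chips x 64 bits each, per the post above
        double effectiveGTps = 4.0;    // assumed effective transfer rate in GT/s
        double peakGBps = (busWidthBits / 8.0) * effectiveGTps;
        System.out.printf("Peak: %.0f GB/s total, %.1f GB/s per core across 32 cores%n",
                peakGBps, peakGBps / 32.0);
    }
}

Even with optimistic numbers like those, 32 cores all streaming independent AI data each see only a slice of that peak, which is the contention point I come back to below.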
Larrabee is a chip that Intel also has plans to use for data crunching (not just GPU/PhysX). Likely there will be variants with core counts in multiples of 8 (so they can sell the half-defective chips...), and they don't have to stop at 32. It's also allegedly to use a 45 nm fabrication process which will be superseded later, dropping the power used significantly.
When I say scripting, it could be bytecode (or equivalent) or compiled scripts; it's more the pattern of coding at a somewhat higher level. It might be the source of errors, but it's the sheer bulk of it that is needed and the irregularity of it (it's effectively an 'asset', often created by semi-programmers). The testing difficulties are also problematic because of the combinatoric explosion of edge cases within so much logic (and the specs being 'loose').
With this architecture, if you could get the interpreter to fit in the L1 cache (which each core has) and run the compacted bytecode code blocks through the data cache, it might be similar to/faster(?) than bulkier compiled code which has to be continually fetched over the wide memory bus.
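To make that concrete, here's a minimal sketch of the kind of dispatch loop I mean, in Java (the opcode set and layout are invented purely for illustration): the hot loop itself is tiny, and the compacted bytecode just streams through as data.

// Minimal stack-based bytecode interpreter sketch. The dispatch loop is only a
// handful of instructions, so it (plus a small working stack) could plausibly stay
// resident in a per-core L1 cache while the bytecode streams through as data.
// Opcodes are invented for illustration.
public class TinyInterp {
    static final byte PUSH = 0, ADD = 1, JMPLT = 2, HALT = 3;

    static int run(byte[] code, int[] stack) {
        int pc = 0, sp = 0;
        while (true) {
            switch (code[pc++]) {
                case PUSH:
                    stack[sp++] = code[pc++];         // push immediate operand
                    break;
                case ADD:
                    stack[sp - 2] += stack[sp - 1];   // add top two stack values
                    sp--;
                    break;
                case JMPLT: {
                    int target = code[pc++];
                    int b = stack[--sp];              // top of stack
                    int a = stack[--sp];              // next on stack
                    if (a < b) pc = target;           // conditional branch
                    break;
                }
                case HALT:
                    return stack[sp - 1];             // result left on top
                default:
                    throw new IllegalStateException("bad opcode " + code[pc - 1]);
            }
        }
    }

    public static void main(String[] args) {
        byte[] program = { PUSH, 2, PUSH, 3, ADD, HALT };   // computes 2 + 3
        System.out.println(run(program, new int[16]));      // prints 5
    }
}

A real script engine needs far more opcodes than this, but the point stands: the interpreter core is cache-friendly, and the per-NPC behavior lives in compact bytecode that gets fetched as data.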
As for a wide bus, you still have to wait on cache misses (in contention with 31 other cores), aggravated by the more random nature of AI data flow. Context switching between objects' data when each core is running hundreds of scripts makes for high data turnover, which only increases if more complex AI is used. Planner-style task processing evaluates many solutions (and their options) and then picks only one to execute. Behaviors reactive to the game situation require that to be done constantly (churning through A LOT of data AND script logic...).
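That evaluate-many/pick-one pattern is roughly the shape below, in Java (the Plan type and its scoring are invented placeholders; the parallel stream just stands in for fanning the candidate evaluations out across cores):

import java.util.Comparator;
import java.util.List;

// Sketch of planner-style "evaluate many candidates, execute one" selection.
// Plan and its toy scoring are placeholders invented for illustration.
public class PlannerSketch {
    record Plan(String name, double cost, double expectedUtility) {
        double score() { return expectedUtility - cost; }   // toy heuristic
    }

    static Plan pickBest(List<Plan> candidates) {
        // Each candidate evaluation could run on its own core; the parallel
        // stream here just stands in for that fan-out.
        return candidates.parallelStream()
                .max(Comparator.comparingDouble(Plan::score))
                .orElseThrow();
    }

    public static void main(String[] args) {
        List<Plan> options = List.of(
                new Plan("flank left", 3.0, 7.5),
                new Plan("hold position", 1.0, 4.0),
                new Plan("charge", 5.0, 9.0));
        System.out.println("Chosen: " + pickBest(options).name());   // flank left
    }
}

The catch is exactly the data churn described above: every candidate touches its own slice of game state, all of it gets scored, and then everything but the winner is thrown away, only to be re-evaluated the next time the situation changes.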