
AI Algorithms on GPU

Started by August 04, 2005 06:55 PM
20 comments, last by Name_Unknown 19 years, 6 months ago
Just my 2 cents...

In the game we're currently working on, it's often the GPU that's falling behind... so having it do more computations would just be silly.

With next-gen consoles, I really don't see why we would want to compute these things on the GPU, and as for the PC, there's too much hardware variation to really rely on it.

Eric
Maybe we're approaching this from the wrong angle. Up to now, we've thought about how to adapt current AI algorithms for use on a GPU, but what about a graphical effect that can be enhanced with AI running on the GPU?

If we can build a complete particle system engine that is GPU-based, then why not start from there and improve on the particle itself by making "smart" particles? So, let's say we use the particle system in a slightly different way, in a game where nanotech exists and the player can use a system called "smart lighting dust": a particle system of light sources that swarms around the level or around the player, highlighting certain things or increasing visibility in general.
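As a purely hypothetical sketch, here is what one "smart" particle update might look like, in plain C++ standing in for the fragment shader that would run it per particle on the GPU (all the names and constants here are illustrative, not from any existing engine):

struct Vec3 { float x, y, z; };

Vec3 operator-(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator+(Vec3 a, Vec3 b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(float s, Vec3 v) { return Vec3{s * v.x, s * v.y, s * v.z}; }

struct Particle { Vec3 position, velocity; };

// Steer a light-dust particle toward a point of interest. On the GPU,
// position and velocity would live in float textures, and this would be
// one fragment's worth of work per particle.
void updateParticle(Particle& p, Vec3 target, float dt) {
    const float stiffness = 2.0f;  // pull strength toward the target
    const float damping   = 0.9f;  // keeps the swarm from oscillating
    Vec3 steer = stiffness * (target - p.position);
    p.velocity = damping * (p.velocity + dt * steer);
    p.position = p.position + dt * p.velocity;
}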

The original starting point of GPGPU was not just trying to get the GPU to do things other than graphics, but rather asking whether we can piggyback things onto our graphics that benefit the overall cause. So, instead of thinking about how to translate AI onto the GPU, why not think about whether we can piggyback AI into the graphics itself? Maybe "smart graphics", or even "smart textures".
If I remember correctly, the slowdown from reading back from a graphics card is less of a problem on PCI Express cards; the PCI Express bus was designed with two-way communication in mind.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

Quote:
Original post by Sneftel
You still need to deal with the fact that the GPU expects to work on 4-element floating point vectors; that it wants to perform triangle rasterization; that it expects to only perform random access in the form of "textures"; that it's unable to perform any cross-core communication. None of these limitations are inherent to SIMD architectures, and none of them needs to be present in a consumer-level SIMD chip.

This is merely an argument to use Sh or Brook to abstract that layer though. Both of these can target GPUs and provide a generalized stream processing interface. I also wouldn't be at all surprised if both of these can target Cell, multi-core, etc. in the future as well.

Quote:
Original post by Sneftel
Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.

... both companies already have multi-core processors on the market. I don't think there has been any doubt about this evolution for the past few years at least. Still, even multi-core processors are not designed as efficient stream processing engines (like the Cell's SPEs). We do need generalized stream-processing/DSP-like capabilities in modern computers, that's for sure. Whether that ends up being provided by some evolution of GPUs, or built into the main hardware, is really immaterial. Either GPUs will become more general and not be just "GPUs" any more, or we'll convert to something like the Cell architecture and probably won't need GPUs any more.

In either case, GPUs are here now; they're cheap and fast stream processors, and thus I see no reason not to continue research using them. If a large chunk of the research were about finding ways around GPU limitations that would not exist in generalized DSPs, then maybe there'd be something to complain about (research that would probably be obsolete within the next five years). However, I don't think that this is the case at all: the problems we're hitting now with GPUs are, for the most part, general stream-processing and parallel-algorithm problems. Sh and Brook already solve 99% of the interface problems.
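To make the "interface" point concrete, here is a toy version of the stream programming model that Sh and Brook expose, in plain C++ (mapStream and the kernel signature are my own illustrative names, not either library's actual API): a pure function applied independently to every element, with no cross-element communication, which is exactly what lets a backend target a GPU, Cell, or multi-core freely.

#include <cstddef>
#include <vector>

// Apply 'kernel' independently to each element of the input stream.
// A GPU backend would run one fragment per element; a CPU backend
// would just loop. The kernel must not depend on other elements.
template <typename In, typename Out>
std::vector<Out> mapStream(const std::vector<In>& input, Out (*kernel)(In)) {
    std::vector<Out> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
        output[i] = kernel(input[i]);
    return output;
}

float addOne(float x) { return x + 1.0f; }  // example kernel

// usage: std::vector<float> out = mapStream(in, addOne);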

Quote:
Original post by AndyTX
Quote:
Original post by Sneftel
You still need to deal with the fact that the GPU expects to work on 4-element floating point vectors; that it wants to perform triangle rasterization; that it expects to only perform random access in the form of "textures"; that it's unable to perform any cross-core communication. None of these limitations are inherent to SIMD architectures, and none of them needs to be present in a consumer-level SIMD chip.

This is merely an argument to use Sh or Brook to abstract that layer though. Both of these can target GPUs and provide a generalized stream processing interface. I also wouldn't be at all surprised if both of these can target Cell, multi-core, etc. in the future as well.

Brook and Sh handle the annoying programming interface issues, but not the underlying hardware design. They can't add features the card doesn't have, such as inter-fragment communication. If you want to do something parallelizable and future-proof, you'd be better off with MPI (though of course MPI is oriented toward message passing rather than stream processing).
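For contrast, here's a minimal MPI example of the kind of explicit message passing being suggested (standard MPI calls; compile with mpicxx and run with mpirun -np 2). Unlike a stream kernel, any rank can talk to any other, which is precisely the cross-core communication GPUs lack:

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        int value = 42;                       // rank 0 sends a value...
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value = 0;                        // ...and rank 1 receives it
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("rank 1 got %d from rank 0\n", value);
    }
    MPI_Finalize();
    return 0;
}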

Quote:
Original post by Sneftel
Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.

... both companies already have multi-core processors on the market. I don't think there has been any doubt of this evolution for the past few years at least.
I think I was unclear there... when I say "multicore", I'm not talking about hyperthreading or the AMD X2. I'm talking about the sort of stream processing that GPUs and the Cell processor can do. Still, I think we're mostly on the same page here.

Well, there seems to be some pathfinding already written as pixel shaders. See "Pathfinding on the GPU" on the shadertech.com site (source is available).

Also found another one on their first page, but I haven't tried that one yet.

Quote:
Original post by Sneftel
Brook and Sh handle the annoying programming interface issues, but not the underlying hardware design. They can't add in features the card doesn't have, such as inter-fragment communication.

Well, excepting perhaps "scatter" (which can still be done on the GPU with various trickery, although not altogether efficiently), all of the standard stream processing operations CAN be done on the GPU. I'm not sure what you're referencing with inter-fragment communication, since to my knowledge NOT allowing it is a fundamental property that differentiates DSPs from standard CPUs (perhaps THE most important property). If you're speaking of things like "reductions", those CAN be implemented efficiently on a GPU (and are in Brook, IIRC, and soon in Sh).
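For example, here's the log-step reduction pattern in plain C++ (a generic sketch, not Brook's or Sh's actual implementation): on the GPU, each iteration of the outer loop corresponds to one render pass into a half-sized buffer, with each output fragment combining two inputs from the previous pass.

#include <cstddef>
#include <vector>

// Sum-reduce a stream by repeated pairwise halving: O(log n) passes.
float reduceSum(std::vector<float> data) {
    while (data.size() > 1) {
        std::vector<float> next((data.size() + 1) / 2);
        for (std::size_t i = 0; i < next.size(); ++i) {
            float a = data[2 * i];
            float b = (2 * i + 1 < data.size()) ? data[2 * i + 1] : 0.0f;
            next[i] = a + b;   // one "fragment" = one pairwise combine
        }
        data.swap(next);       // ping-pong between the two buffers
    }
    return data.empty() ? 0.0f : data[0];
}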

Quote:
Original post by Sneftel
I think I was unclear there... when I say "multicore", I'm not talking about hyperthreading or the AMD X2. I'm talking about the sort of stream processing that GPUs and the Cell processor can do. Still, I think we're mostly on the same page here.

I won't disagree with you here. Like I said, either the GPU will morph into a general stream processor (and then optionally be integrated into the motherboard - that detail is really insignificant to our development efforts), or something more general will come along and be able to take over the role that the GPU is playing right now.

In any case I think we all agree that stream/parallel processing is becoming more and more important. Right now, I'd still argue though that GPUs are relatively cheap, powerful and readily available stream processors that - with the aid of Sh and/or Brook - are easily used as such.
- Making A* on a GPU is silly: it needs random access to memory, which is slow on a GPU.
- A* is for finding paths...
- An algorithm to find paths the "matrix" way? Perhaps this (see the sketch below):
1) Build a matrix A, where A(i,j) = 1 if we can reach node(j) from node(i) in one step, A(i,i) = 0, and A(i,j) = infinity otherwise
2) For the matrix multiplications, replace + by min, and * by +
3) Compute A^n : A(i,j) will give you the distance (number of steps) needed to reach node(j) from node(i), if this distance is <= n
- A^n is cheap to compute by repeated squaring. Example:
A^7 = (A^4) * (A^3)
A^4 = (A^2) * (A^2)
A^3 = (A^2) * A
A^2 = A * A
4 multiplications instead of 6, log2(n) instead of n ;)

- It could help for pathfinding; I'm not sure it's very interesting (complexity, efficiency).
Anyway, with a special matrix (paths on a square grid, for example), I'm fairly sure there are tricks to reduce the computations.
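Here's a minimal CPU sketch of the scheme above, assuming unit edge weights (on the GPU the inner loops would become shader passes over matrix textures, but the arithmetic is identical). INF stands in for "no edge":

#include <cstddef>
#include <vector>
#include <limits>
#include <algorithm>

const float INF = std::numeric_limits<float>::infinity();
typedef std::vector<std::vector<float> > Matrix;

// (min, +) matrix product: min replaces +, and + replaces *.
Matrix multiply(const Matrix& a, const Matrix& b) {
    std::size_t n = a.size();
    Matrix c(n, std::vector<float>(n, INF));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] = std::min(c[i][j], a[i][k] + b[k][j]);
    return c;
}

// power(A, n)(i, j) = length of the shortest path from i to j using
// at most n edges, via repeated squaring: O(log n) products.
Matrix power(Matrix a, unsigned n) {
    std::size_t dim = a.size();
    Matrix result(dim, std::vector<float>(dim, INF));
    for (std::size_t i = 0; i < dim; ++i)
        result[i][i] = 0.0f;                 // identity: zero-length paths
    while (n > 0) {
        if (n & 1) result = multiply(result, a);
        a = multiply(a, a);
        n >>= 1;
    }
    return result;
}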
Quote:
Original post by Sneftel
Mark my words: within five years, we'll see multi-core processing on Intel and AMD processors. Get ready for that.


Hmm, you do know the Athlon X2 and Pentium D are already available? ;-) I've had dual- and quad-CPU SMP machines for years.

"It's such a useful tool for living in the city!"

