
physx chip

Started by April 20, 2006 07:42 PM
223 comments, last by GameDev.net 18 years, 5 months ago
The GPU has already filled the niche for data-parallel tasks. There are already basic physics simulations in the "graphics gems" vein, and it's only going to get better over time. The new shader models will only make physics simulation easier and more powerful, without involving a middleman. I believe the people at AGEIA and some investors took what seemed like a good idea and ran with it without really thinking things through. Time will tell that I am right.
The problem with the GPU for general-purpose computation is, it's stupid at doing things that don't fit neatly into four-float vectors. The thing is, that's why a PPU is also (in my opinion) a stupid idea. I DO think we're going to have something like a PPU in our systems, but it's going to be a general-purpose vectorized processing unit. Its application will not be etched into its silicon.
Quote: Original post by Anonymous Poster
The GPU has already filled the niche for data-parallel tasks. There are already basic physics simulations in the "graphics gems" vein, and it's only going to get better over time. The new shader models will only make physics simulation easier and more powerful, without involving a middleman. I believe the people at AGEIA and some investors took what seemed like a good idea and ran with it without really thinking things through. Time will tell that I am right.



The PhysX SDK (aka NovodeX) is also the physics engine included with the PS3 dev kits. Ageia's software has very good support for parallel processing, which is why they haven't really dug themselves into a hole even if the PPU doesn't take off.
Quote: PhysX is nothing but a 400 MHz chip with a handful of SIMD units that are very much like SSE. And given the extra overhead the actual available processing power of a PPU could be closer to a CPU than you think.

By the way, let's have a look at Cell. 4 GHz x 8 vector coprocessors x 8 floating-point calculations per clock cycle = 256 GFLOPS. This just wipes the floor with PhysX. Also, a GeForce 7800 GTX is 165 GFLOPS. And yes, Cell is a CPU! x86 processors are evolving in the same direction.


Cell is overhyped :-P

Anyways I dunno who is the better performer, but I will this summer ;-)

I plan on upgrading my Athlon 64 system to a dual-core FX and purchasing a PPU. (I'm a Mech E; I figure it will be nice for some fluid simulations and rigid body dynamics. Not 100% accurate, but who knows, maybe it will be?)
Another good thing is that NovodeX (aka the PhysX SDK) is free to use in a commercial product if you support hardware-accelerated features via a PPU. On top of that, NovodeX is better than ODE; dunno how it compares to Newton, though.

Quote:
Quote: Once upon a time, real-time lighting wasn't efficient either. When it became practical, it started appearing everywhere.

What are you talking about? Lighting has been real-time since the first 3D game. Don't mistake a modern CPU for a pocket calculator.


I'm not sure that's what he meant. Basic real-time lighting is a simple "choose greyscale light based on dot product" calculation, which is pretty much required for 3D. But good-looking dynamic lights, lighting based on blended diffuse, pixel shading, and newer techniques like HDR and normal maps weren't efficient until more recently. Now they're everywhere, possibly even on those pocket calculators.
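
For reference, the "greyscale light based on dot product" idea is what is usually called Lambertian diffuse shading. A minimal sketch in Python, assuming non-zero 3D vectors given as tuples; the names `normalize` and `lambert_intensity` are made up for illustration and don't come from any engine mentioned in this thread:

```python
import math

def normalize(v):
    """Return v scaled to unit length (assumes v is not the zero vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambert_intensity(normal, to_light):
    """Greyscale diffuse intensity in [0, 1]: the dot product of the unit
    surface normal and the unit direction to the light, clamped at zero
    so that surfaces facing away from the light stay black."""
    n = normalize(normal)
    l = normalize(to_light)
    return max(0.0, sum(a * b for a, b in zip(n, l)))

# A surface facing straight at the light is fully lit; one facing away is dark.
facing = lambert_intensity((0, 0, 1), (0, 0, 5))
away = lambert_intensity((0, 0, 1), (0, 0, -1))
```

That is one dot product and one clamp per vertex or pixel, which is why it was affordable long before per-pixel shading was.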

Check out my new game Smash and Dash at:

http://www.smashanddashgame.com/

FYI, from BFG Tech's website:

Specifications

Processor: AGEIA PhysX
Memory Interface: 128-bit GDDR3
Memory Capacity: 128MB
Peak Instruction Bandwidth: 20 Billion/sec
Sphere-Sphere Collisions: 530 Million/sec max
Convex-Convex (Complex Collisions): 533,000/sec max

smokin!
Quote: Original post by Cubed3
Processor: AGEIA PhysX
Memory Interface: 128-bit GDDR3
Memory Capacity: 128MB
Peak Instruction Bandwidth: 20 Billion/sec
Sphere-Sphere Collisions: 530 Million/sec max
Convex-Convex (Complex Collisions): 533,000/sec max

Processor: Intel Pentium D 950
Memory Capacity: ~2 GB
Instruction Bandwidth: 20.4 Gµops/sec (sustainable)
Sphere-Sphere Collisions: 1.7 billion/sec (theoretical)
Triangle-Triangle Intersection: 425 million/sec (theoretical)

I should also add that this processor has a crappy architecture by next-generation standards. Efficient branching and caching also allow advanced optimizations that avoid wasting time on the brute-force approach. So let's not get fixated on the raw numbers. I'm sure Ageia fears a direct benchmark between PhysX and the latest CPU in a real game.

Smokin?
Quote: Original post by C0D1F1ED
Processor: Intel Pentium D 950
Memory Capacity: ~2 GB
Instruction Bandwidth: 20.4 Gµops/sec (sustainable)
Sphere-Sphere Collisions: 1.7 billion/sec (theoretical)
Triangle-Triangle Intersection: 425 million/sec (theoretical)




Somehow I highly doubt those numbers...
Where did you get them?


I don't think it's nearly as powerful as that.


PhysX vs Pentium XE 840 HT

[Edited by - Cubed3 on April 22, 2006 9:40:40 AM]
Quote: Original post by Cubed3
Where did you get them?

Just calculate them. 3.4 GHz x dual-core x 3 instructions per clock (sustained) = 20.4 gigainstructions per second. SSE can do 4 floating-point operations per clock (but only one can start every clock cycle) so GFLOPS might be even higher. It's a theoretical maximum but so is the number from PhysX. The other numbers are derived from this, assuming optimal SSE code.
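
The peak figure above is just clock rate times core count times instructions per clock. A small sketch of that arithmetic, using only the numbers quoted in this post; the function name is made up for illustration, and the result is a theoretical ceiling, not a measurement:

```python
def peak_instructions_per_second(clock_hz, cores, instructions_per_clock):
    """Theoretical peak: every core retires its maximum every cycle."""
    return clock_hz * cores * instructions_per_clock

# Pentium D 950 as described in the post: 3.4 GHz, dual-core, 3 per clock.
pentium_d_950 = peak_instructions_per_second(3.4e9, 2, 3)
print(f"{pentium_d_950 / 1e9:.1f} G instructions/sec")  # prints "20.4 G instructions/sec"
```

Real code rarely sustains this rate because of cache misses, branch mispredictions, and dependency stalls, which is the same caveat that applies to the PhysX peak number.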

Quote: PhysX vs Pentium XE 840 HT

What's the source of this video? Ageia? Of course they will show a demo with a badly optimized software version! Don't be fooled by that. Their marketing is perfect, they want to sell the hardware, but I'm only interested in the capabilities of the product in a real situation versus optimized software.

Besides, a 6+ GFLOPS CPU not capable of handling 6000 objects at more than 5 FPS? Please. That's 200,000 floating-point operations per object. Two-hundred-thousand! Unless you're doing some really stupid brute force collision detection that's a vast amount of processing power.
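
To make the 200,000 figure concrete: it is the budget per object per frame, obtained by dividing the CPU's throughput by the number of object updates per second. A sketch using only the numbers from this post (6 GFLOPS, 6000 objects, 5 FPS); the function name is made up for illustration:

```python
def flops_per_object_per_frame(flops_per_second, objects, frames_per_second):
    """Floating-point operations available for each object in each frame."""
    object_updates_per_second = objects * frames_per_second
    return flops_per_second / object_updates_per_second

budget = flops_per_object_per_frame(6e9, 6000, 5)
print(budget)  # prints 200000.0
```

This assumes the full 6 GFLOPS is actually achieved, so it is an upper bound on the budget, not a measurement of the demo.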
Quote: Original post by C0D1F1ED
Quote: Original post by Cubed3
Where did you get them?

Just calculate them. 3.4 GHz x dual-core x 3 instructions per clock (sustained) = 20.4 gigainstructions per second. SSE can do 4 floating-point operations per clock (but only one can start every clock cycle) so GFLOPS might be even higher. It's a theoretical maximum but so is the number from PhysX. The other numbers are derived from this, assuming optimal SSE code.

Quote: PhysX vs Pentium XE 840 HT

What's the source of this video? Ageia? Of course they will show a demo with a badly optimized software version! Don't be fooled by that. Their marketing is perfect, they want to sell the hardware, but I'm only interested in the capabilities of the product in a real situation versus optimized software.

Besides, a 6+ GFLOPS CPU not capable of handling 6000 objects at more than 5 FPS? Please. That's 200,000 floating-point operations per object. Two-hundred-thousand! Unless you're doing some really stupid brute force collision detection that's a vast amount of processing power.


How did you calculate the sphere-sphere collisions? I'm quite curious :P

Also, the video is from Ageia, of course, but the software physics is being done via NovodeX, which is a very optimized physics engine.

200,000 floating point operations per object? Is that per frame or per second?
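
On the sphere-sphere question: the thread doesn't show the derivation, but dividing the quoted 20.4 billion instructions/sec by the quoted collision rates gives the implied budget per test, namely 12 instructions per sphere-sphere test and 48 per triangle-triangle test. The sketch below is a plain scalar overlap test (compare the squared distance between centres with the squared sum of the radii). The function names are made up for illustration, and this is not the optimized SSE code the quoted figures assume:

```python
def spheres_overlap(center_a, radius_a, center_b, radius_b):
    """True if two spheres touch or intersect. Works on squared distances,
    so no square root is needed."""
    dx = center_a[0] - center_b[0]
    dy = center_a[1] - center_b[1]
    dz = center_a[2] - center_b[2]
    dist_sq = dx * dx + dy * dy + dz * dz
    radius_sum = radius_a + radius_b
    return dist_sq <= radius_sum * radius_sum

def implied_instructions_per_test(instructions_per_second, tests_per_second):
    """Instruction budget per test implied by two quoted peak rates."""
    return instructions_per_second / tests_per_second

sphere_budget = implied_instructions_per_test(20.4e9, 1.7e9)    # 12 per test
triangle_budget = implied_instructions_per_test(20.4e9, 425e6)  # 48 per test
```

The scalar test above comes to 11 arithmetic and comparison operations, which is in the same ballpark as the implied budget, though that count ignores loads, stores, and cache misses.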

