Alpha maps in the frame buffer

Started by July 22, 2004 07:29 PM
18 comments, last by vincoof 20 years, 4 months ago
It was a logic error (pretty much what vincoof pointed out earlier, but it took me a little while to figure out). AP - the mistake you noticed was introduced with adapting code for this post.

Thanks for the input, though, everyone.
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
Does it mean it works now? If so, congrats :)
Quote: Original post by vincoof
Does it mean it works now? If so, congrats :)


Yes, it works now. However I'm pretty appalled by my GF2's performance fillrate-wise. It seems at ~1000x700 I get around 60-70 frames per second (windowed mode). However, even after rigorous culling and depth exclusion, when I render:

1761 quads (for picking info, no texture or color data)
7044 multitextured triangle fans of 6 vertices (including texture coordinates, color data and blending enabled)

I get this:



While the framerate looks generally on par with what an empty frame would produce, it still seems a bit too low.

Things get out of hand when I add a few splatted regions. The primitive counts for the following frame are:

1659 quads (feedback mode)
6923 triangle fans (this includes up to two multitextured extra passes on some regions, three passes total)

Here's the result:




To me that framerate seems quite horrid given that I'm actually passing less geometrical data to the graphics card this time. I wonder if this is because of the somewhat crippled GF2 I'm using (there's something wrong with it - stuff like TV-out not working) or might this be a driver issue? I'd go with "I can't code", but all of the passes are subjected to the same culling tests. Just one more sidenote: when I decrease the splatted texture scaling (say, by 100-fold), framerate goes up to 24-28...
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
You don't run picking every frame, do you? That is going to kill performance.

Also, when picking is enabled, do you send all vertex information while picking ? You shouldn't send texcoords, normals, etc.

As a last note, it's true that excessive texture repeating can slow things down. A lot of hardware is optimized for texcoords in the [0,1] range, so when a texcoord exceeds some value, there may be a dramatic framerate drop.
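One way to keep repeated texcoords small without tracking them by hand is to shift each primitive's coordinates by a whole number of repeats just before submitting it. The sketch below assumes the texture uses the GL_REPEAT wrap mode (so an integer shift does not change what is drawn) and that each primitive has its own copy of its texcoords. The function name `rebase_texcoords` is made up for this example, and whether this helps on a given card is untested.

```c
#include <stddef.h>

/* Largest whole number <= x, written without <math.h>. */
static float floor_float(float x)
{
    float f = (float)(long)x;      /* truncates toward zero */
    return (f > x) ? f - 1.0f : f; /* correct the negative case */
}

/* Shift all (s, t) pairs of ONE primitive by a whole number of repeats so
   that the smallest s and the smallest t land in [0, 1).
   uv holds n pairs laid out as s0 t0 s1 t1 ...
   Only valid with GL_REPEAT wrapping (assumption of this sketch). */
void rebase_texcoords(float *uv, size_t n)
{
    if (n == 0)
        return;

    float min_s = uv[0], min_t = uv[1];
    for (size_t i = 1; i < n; ++i) {
        if (uv[2 * i] < min_s)     min_s = uv[2 * i];
        if (uv[2 * i + 1] < min_t) min_t = uv[2 * i + 1];
    }

    float off_s = floor_float(min_s);
    float off_t = floor_float(min_t);
    for (size_t i = 0; i < n; ++i) {
        uv[2 * i]     -= off_s;
        uv[2 * i + 1] -= off_t;
    }
}
```

The span of the coordinates inside one primitive is unchanged, so a fan that covers 50 repeats still covers 50 repeats. Only the absolute magnitude of the values goes down.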
No, I'm not sending anything but essential vertex data in feedback mode. As for texture coordinates - I think I'm currently abusing them rather extensively (as in scale ranging from 50 to 100). Sigh... if only it weren't such a headache to keep track of them at all times...
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
Feedback? Aren't you using the select buffer instead? Also, do you query it EVERY frame? Can you disable the pass dedicated to picking and compare the framerate?

As for texture coordinates, I think 50 to 100 is not really "abusing". The framerate drops dramatically when it comes to thousands of repeats. However, it may be true that even 50 to 100 repeats hurt the framerate.
Select, yes of course - a typo. I'm updating every other frame - I could optimise it further by only updating when the viewport or the geometry changes, but right now I'm only drawing a trivial 3000 triangles in select mode and disabling it shows no framerate increase. All in all, I've made my peace with the framerate: since there seems to be no single resource bottleneck and the code is pretty optimised by now, I'll just put the blame on the video card. I haven't had the opportunity to test it on any other computer so far either, so speculation seems a bit rash.
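For reference, the "only update when the viewport or the geometry changes" idea could look roughly like the sketch below. Everything in it is hypothetical: `PickCache`, `pick_pass_needed` and the `geometry_version` counter (bumped by the application whenever the pickable geometry changes) are names invented for this example, not part of the actual code.

```c
#include <stdbool.h>
#include <string.h>

/* Remembers the state that the last selection pass was rendered with. */
typedef struct {
    float    view[16];          /* view matrix used for the last pick pass */
    unsigned geometry_version;  /* app-side counter, bumped on geometry edits */
    bool     valid;             /* false until the first pick pass has run */
} PickCache;

/* Returns true if the selection pass has to be re-rendered, and records the
   new state in that case. The matrices are compared bit for bit, which is
   enough to detect "nothing changed since last time". */
bool pick_pass_needed(PickCache *c, const float view[16],
                      unsigned geometry_version)
{
    if (c->valid &&
        c->geometry_version == geometry_version &&
        memcmp(c->view, view, sizeof c->view) == 0)
        return false;

    memcpy(c->view, view, sizeof c->view);
    c->geometry_version = geometry_version;
    c->valid = true;
    return true;
}
```

The render loop would call `pick_pass_needed` once per frame and skip the select-mode pass whenever it returns false, reusing the previous hit records.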

The absolutely biggest bottleneck right now is the use of glTexSubImage2D() - even though I'm only updating ~ 9-60 pixels per frame (this only happens occasionally, depending on user input), I need to update one pixel at a time and the framerate drop is quite noticeable (by at least a third, if not more). Is it possible to increase performance through the use of some extension?
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
I'm surprised that disabling selection doesn't increase the framerate. I suspect that either your FPS count is off or selection isn't actually being disabled.

As for speeding up texture updates, there are two things that could help:
1- if the texels to update are close enough together, pack them into one bigger sub-rectangle and send them in a single glTexSubImage2D call (or a few glTexSubImage2D calls, if a single pack would be too big but you can make 3-4 small packs).
2- try copying from graphics card memory to graphics card memory, that is, use glCopyTexSubImage2D if you can. The same packing rule applies in that case.
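A minimal sketch of the packing in point 1, assuming the application keeps a CPU-side copy of the texture (called `shadow` here) in tightly packed RGBA, with the changed texels already written into it. The types `DirtyTexel` and `Rect` and both function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int x, y; } DirtyTexel;   /* position of a changed texel */
typedef struct { int x, y, w, h; } Rect;   /* sub-rectangle of the texture */

/* Smallest rectangle that contains every dirty texel (n must be > 0). */
Rect bounding_rect(const DirtyTexel *t, size_t n)
{
    assert(n > 0);
    int x0 = t[0].x, y0 = t[0].y, x1 = t[0].x, y1 = t[0].y;
    for (size_t i = 1; i < n; ++i) {
        if (t[i].x < x0) x0 = t[i].x;
        if (t[i].x > x1) x1 = t[i].x;
        if (t[i].y < y0) y0 = t[i].y;
        if (t[i].y > y1) y1 = t[i].y;
    }
    Rect r = { x0, y0, x1 - x0 + 1, y1 - y0 + 1 };
    return r;
}

/* Copy rectangle r out of the CPU-side texture copy (RGBA, tex_w texels per
   row) into out, row by row. out must hold r.w * r.h * 4 bytes. */
void pack_rect(const unsigned char *shadow, int tex_w, Rect r,
               unsigned char *out)
{
    for (int row = 0; row < r.h; ++row) {
        const unsigned char *src =
            shadow + ((size_t)(r.y + row) * (size_t)tex_w + (size_t)r.x) * 4;
        memcpy(out + (size_t)row * (size_t)r.w * 4, src, (size_t)r.w * 4);
    }
}
```

The packed buffer can then go up in one call, for example `glTexSubImage2D(GL_TEXTURE_2D, 0, r.x, r.y, r.w, r.h, GL_RGBA, GL_UNSIGNED_BYTE, out);`. If the dirty texels are scattered far apart, the bounding rectangle gets large, which is the case where splitting them into 3-4 smaller packs makes more sense.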
Well, disabling picking only gives me 2-4 frames per second (if picking is otherwise done every other frame) - doesn't sound like much of a performance bottleneck to me.

As for grouping texels for updating VRAM, I sometimes need to update small rectangular areas (but mostly it's just one texel here and there), which I could group into lines, but other than that I don't see any way to optimise it further.

I added per-vertex lighting, which effectively eats up a few more frames. Since I've only ever done lightmapping and PVL, I was wondering what the gains of per-pixel lighting are compared to the other methods. Sure, lightmapping eats up fillrate while PVL doesn't allow fancier shading, but how does PPL serve better than those two?
"Literally, it means that Bob is everything you can think of, but not dead; i.e., Bob is a purple-spotted, yellow-striped bumblebee/dragon/pterodactyl hybrid with a voracious addiction to Twix candy bars, but not dead."- kSquared
PPL vs Lightmapping
PPL gives good results for dynamic lights and/or viewer-dependent lighting effects (such as specular lighting) that lightmaps cannot render.
The biggest problems with PPL are:
1- It eats a lot of shading power (vertex processing and fragment processing), and most PPL algorithms need multiple passes to handle multiple lights,
2- The user needs good hardware, not only for performance reasons but, first of all, to be able to render the effect at all. And the older the hardware, the more passes you need per light,
3- The engine is much more complex, and artists have much more work to do. Needless to say, very few file formats out there can handle normal maps and such, so developers generally write their own file format, and thus their own exporter/importer.

If your lightmapping works and you don't need to move the light or render specular highlights, then stick to lightmapping. It's by far the easiest way to render a nice-looking environment, costs very little performance, and often looks way better than PPL.

If you're looking for PPL for performance reasons, it's a dead end. There are very few applications where PPL can speed up the rendering compared to lightmaps.

PPL vs PVL
Nothing special to say about it, except the obvious:
1- PPL looks better than PVL in all cases,
2- PPL is slower to render than PVL in all cases,
3- PPL is much more complex to implement in a 3D engine than PVL,
moreover, in the case of per-pixel bump-mapping (an enhanced version of PPL):
4- PPL is much more complex to handle on the artist side,
5- PPL needs textures, and thus uses texture memory and takes up texture units,
6- PPL is a pain in the @$$ when it comes to rendering multiple lights: either it kills performance, or it's very hard to set up, or it's sometimes not even possible (due to blending limitations in OpenGL). Note that this last point also applies in the non-bump-mapping case.
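To illustrate point 1, here is a toy CPU-side comparison along a single edge. It is not OpenGL code, and the light model (attenuation by squared distance only, no normals) is a simplification picked for this sketch. All function names are made up. The only difference between the two shading functions is the order of "interpolate" and "light".

```c
/* Toy light: intensity falls off with squared distance from the light,
   which sits at horizontal position light_x, light_height above the edge. */
float light_at(float x, float light_x, float light_height)
{
    float dx = x - light_x;
    return 1.0f / (1.0f + dx * dx + light_height * light_height);
}

/* Per-vertex: light the two end points, then interpolate the results
   (t = 0 is the first vertex, t = 1 the second). */
float shade_per_vertex(float x0, float x1, float t, float lx, float lh)
{
    float c0 = light_at(x0, lx, lh);
    float c1 = light_at(x1, lx, lh);
    return c0 + t * (c1 - c0);
}

/* Per-pixel: interpolate the position, then light that position. */
float shade_per_pixel(float x0, float x1, float t, float lx, float lh)
{
    float x = x0 + t * (x1 - x0);
    return light_at(x, lx, lh);
}
```

With an edge running from x = -10 to x = 10 and the light hovering one unit above its midpoint, the per-pixel version gives 0.5 at the midpoint while the per-vertex version gives about 0.01, because neither vertex is anywhere near the light. This is why PVL loses highlights that fall in the middle of large primitives, and why finer tessellation narrows the gap.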
