
An experiment in very high performance blending: Pulse Width Modulative Blending

Started by
10 comments, last by curved_ 2 years, 1 month ago

sorry – I posted this while trying to get a newline after this image. No delete button? idk. Sorry for the confusion.


I am developing a 5v5 multiplayer FPS called Inferno Arena, for which I wrote an engine from the ground up with the goal of having the lowest possible input latency. One of the fundamental design decisions I made for the graphics portion of the engine was to try a blending method I had been pondering that requires no object sorting.

I call it Pulse Width Modulative Blending, and it works by blending over time instead of on a frame-by-frame basis. A blending function in the pixel shader generates a pseudorandom number for each pixel and uses it to decide whether the pixel is rendered at full opacity or not rendered at all. The chance of rendering is the texel's alpha value. This all-or-nothing rendering lets the Z-buffer function properly when subsequent polygons are rendered to the same screen pixel, so the Z-buffer on the GPU does the sorting instead of the CPU.
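A minimal CPU-side sketch of the idea described above, assuming a single grayscale pixel and independent random draws per layer. The `Layer` struct and all function names are hypothetical and are not taken from the Inferno Arena engine. It suggests why no sorting is needed: averaged over many frames, the all-or-nothing result approaches ordinary sorted alpha blending regardless of submission order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// One translucent surface covering the pixel (grayscale for brevity).
struct Layer {
    double depth;  // smaller = closer to the camera
    double color;  // 0..1 intensity
    double alpha;  // 0..1 opacity, used as the probability of drawing
};

// Render one pixel for one frame: each layer is either drawn fully opaque
// (with probability alpha) or skipped, and a depth test keeps the nearest hit.
double render_pixel_once(const std::vector<Layer>& layers, double background,
                         std::mt19937& rng) {
    double nearest = std::numeric_limits<double>::infinity();
    double result = background;
    for (const Layer& l : layers) {            // submission order is arbitrary
        double u = rng() / 4294967296.0;       // uniform in [0, 1)
        bool drawn = u < l.alpha;              // all or nothing
        if (drawn && l.depth < nearest) {      // emulated Z-buffer test
            nearest = l.depth;
            result = l.color;
        }
    }
    return result;
}

// Average many frames, standing in for what the eye integrates over time.
double average_over_frames(const std::vector<Layer>& layers, double background,
                           int frames, std::uint32_t seed) {
    assert(frames > 0);
    std::mt19937 rng(seed);
    double sum = 0.0;
    for (int i = 0; i < frames; ++i) {
        sum += render_pixel_once(layers, background, rng);
    }
    return sum / frames;
}

// Reference: classic back-to-front "over" blending of depth-sorted layers.
double sorted_alpha_blend(std::vector<Layer> layers, double background) {
    std::sort(layers.begin(), layers.end(),
              [](const Layer& a, const Layer& b) { return a.depth > b.depth; });
    double result = background;
    for (const Layer& l : layers) {
        result = l.alpha * l.color + (1.0 - l.alpha) * result;
    }
    return result;
}
```

Calling `average_over_frames` with the same layers in a different order should land close to `sorted_alpha_blend`, with the remaining difference being the noise that a high frame rate is meant to hide.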

In screenshots the technique looks rough, but while playing at 2,000 FPS, it actually looks pretty good. Its main advantage is that full-screen particle effects such as smoke have a much lower performance impact than with traditional methods.

This is definitely not a technique applicable to most games or engines, but I thought you all would enjoy the concept.

curved_ said:
In screenshots the technique looks rough, but while playing at 2,000 FPS, it actually looks pretty good.

So basically you do stochastic transparency, and you leave temporal accumulation to happen for free in the player's brain?

But how can you display 2000 fps? Some displays can do 360 afaik, but not more than that?

@JoeJ right

The refresh rate of the monitor does limit the number of pixel changes per second that the player sees, but it works well in my opinion even at 60 hz. There are monitors in development with 1,000 hz refresh rates by the way, so 2,000 FPS is not a meaningless amount.

curved_ said:
so 2,000 FPS is not a meaningless amount.

Then we have 0.5 ms for one update of the game's physics and other systems. /:O\

But if you can render 2000 fps and display at 60 hz, you could accumulate 33 frames. That's enough to get AA, motion blur, and almost noise free transparency.
And you could post a YouTube video.
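A rough sketch of that accumulation idea, assuming frames are available as flat arrays of per-pixel intensities. `frames_per_refresh` and `FrameAccumulator` are hypothetical names, not part of any engine mentioned in this thread. At 2000 fps and 60 Hz, integer division gives 33 rendered frames per displayed frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of whole rendered frames that fit into one display refresh.
int frames_per_refresh(int render_fps, int display_hz) {
    assert(render_fps > 0 && display_hz > 0);
    return render_fps / display_hz;
}

// Sums rendered frames and resolves their per-pixel mean for display.
class FrameAccumulator {
public:
    explicit FrameAccumulator(std::size_t pixel_count) : sum_(pixel_count, 0.0f) {}

    // Add one rendered frame (for example a noisy all-or-nothing frame).
    void add(const std::vector<float>& frame) {
        assert(frame.size() == sum_.size());
        for (std::size_t i = 0; i < sum_.size(); ++i) {
            sum_[i] += frame[i];
        }
        ++count_;
    }

    // Return the mean of all frames added so far and reset for the next refresh.
    std::vector<float> resolve() {
        assert(count_ > 0);
        std::vector<float> out(sum_.size());
        for (std::size_t i = 0; i < sum_.size(); ++i) {
            out[i] = sum_[i] / static_cast<float>(count_);
        }
        std::fill(sum_.begin(), sum_.end(), 0.0f);
        count_ = 0;
        return out;
    }

private:
    std::vector<float> sum_;
    int count_ = 0;
};
```

In a real renderer this would more likely be an additive blend into a floating-point render target on the GPU; the CPU version here is only meant to show the arithmetic.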

Though, if all i do per frame is this:

I only get 1400 fps.
So probably i need to upgrade… : )

Inferno Arena runs all the game code (physics etc) at 1,000 ticks per second in another thread, and input polling in another thread at up to 70,000 hz, but the fastest mouse out there runs at 8k hz, so it is artificially limited.
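The thread does not show how the 1,000 ticks per second are scheduled, so the following is only a sketch of one common approach: a fixed-timestep accumulator that the game thread could feed with measured elapsed time. `FixedTickClock` and its parameters are hypothetical. The `max_ticks_per_call` cap is an added assumption that drops backlog after a stall instead of trying to catch up forever.

```cpp
#include <cassert>
#include <cstdint>

// Converts measured wall-clock time into a whole number of fixed-size ticks.
class FixedTickClock {
public:
    FixedTickClock(std::int64_t ticks_per_second, int max_ticks_per_call)
        : tick_ns_(1000000000LL / ticks_per_second),
          max_ticks_(max_ticks_per_call) {
        assert(ticks_per_second > 0 && ticks_per_second <= 1000000000LL);
        assert(max_ticks_per_call > 0);
    }

    // Feed the time elapsed since the previous call; returns how many
    // simulation ticks should run now.
    int advance(std::int64_t elapsed_ns) {
        assert(elapsed_ns >= 0);
        accumulated_ns_ += elapsed_ns;
        int ticks = 0;
        while (accumulated_ns_ >= tick_ns_ && ticks < max_ticks_) {
            accumulated_ns_ -= tick_ns_;
            ++ticks;
        }
        if (accumulated_ns_ >= tick_ns_) {
            // Hit the cap: discard whole ticks of backlog, keep the remainder.
            accumulated_ns_ %= tick_ns_;
        }
        return ticks;
    }

    std::int64_t tick_ns() const { return tick_ns_; }

private:
    std::int64_t tick_ns_;
    int max_ticks_;
    std::int64_t accumulated_ns_ = 0;
};
```

A game thread would call `advance` with the time measured by something like `std::chrono::steady_clock` and run the returned number of simulation steps, each with a constant delta of `tick_ns()`.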

The whole thing is an exercise in reducing input latency. Faster buffer flips with de-synchronized drawing can technically get you a newer game state to (a part of) the screen.

Furthermore, what happens when scene complexity changes and you temporarily overshoot your processing budget? Hitching. The engine is designed with absolutely no compromises on performance. Ultimately I am developing the game to make the most of the best monitors out there, not 60 Hz ones.

@JoeJ also that looks a lot like ImGui. If it is, I implemented my in-game console with it, and found ImGui::Render() causes a lot of performance impact even if you display nothing with it.

curved_ said:
ImGui::Render() causes a lot of performance impact even if you display nothing with it.

Actually the larger problem to me is: At such high framerates, my input sucks. The mouse moves too slowly and feels broken. I'm doing something wrong, but i haven't tried to figure it out yet.

curved_ said:
Inferno Arena runs all the game code (physics etc) at 1,000 ticks per second in another thread, and input polling in another thread at up to 70,000 hz, but the fastest mouse out there runs at 8k hz, so it is artificially limited.

I see that's a kind of Descent clone with no lighting. And it's noisy and 5v5 - so is that your game? Pretty interesting!

I've played Quake 3 on a 120Hz CRT, but since that time i'm locked to 60Hz flatscreens.
Though, modern games struggle to hit even that. I remember the recent Halo: the introduction level ran at a stable 60fps, but then, as soon as the open world shows up, it drops to 45. Felt unplayable, so i reduced all settings to minimum, and then i got 55. On an 8 core CPU with a Vega 56. :(

Recently i tried the UE5 city demo, which runs at 20 fps. Are they serious? And still, one AAA dev after another proudly announcing to use it? Requiring 1000 bucks GPUs to get what? 40 fps? They really can't wait to kill their business, it seems.

It's really nice to see some indie people keep doing their own thing. The future is ours ;D

curved_ said:
Inferno Arena runs all the game code (physics etc) at 1,000 ticks per second in another thread

That's not possible for the regular case with gravity and interactive rigid bodies, which causes a lot of constraints to solve and collision detection work.
Physics simulation also causes runtime spikes, e.g. if a formerly sleeping stack of boxes tips over due to some impact. Runtime would fluctuate between 100 and 1000 Hz in such cases. That's pointless, so personally i aim for physics at locked 90 or 120 Hz.

For high fps graphics with nice lighting e.g. realtime GI, we would need to decouple lighting from rendering the frame, so a form of caching like texture space shading. But that's very hard to implement and nobody has done it yet.
Currently the industry goes the easy route of doing everything with temporal accumulation in screenspace, which gives similar performance advantages of distributing work over multiple frames.
I still plan to work on the former, but my ideas are quite rough so far.

Seriously, i really think you should try this:

JoeJ said:
But if you can render 2000 fps and display at 60 hz, you could accumulate 33 frames. That's enough to get AA, motion blur, and almost noise free transparency. And you could post a YouTube video.

You can render 2000 fps for real - so you must do this! You would be the first to demonstrate real motion blur, and all advantages would become visible to anybody.
It's almost no work to implement, and then you can turn ‘wasting power for no benefit’ into a true and unique advantage.
I guess 60fps + motion blur is as good as 2000 fps to our brains, and high fps displays will probably never become mainstream. (Increasing display brightness is the true goal here, not frequency. We would need 10k nits to match some realistic impression, and current displays only have 500.)

Never heard of this approach to stippling for alpha. Is the choice to discard a fragment based on a screen space texel grid, either computed with math on the fragment's location or read from a screenspace texture? I get that the alpha of the surface comes into play for probability, but where does the actual stippling pattern come from?

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

It's not a stippling pattern, it is a PRNG I created that uses a series of shader inputs to create a fairly good pattern with low moire qualities.

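// Create a pseudorandom number for each pixel and reduce it to the range [0, 1000).
// Multipliers are all prime numbers to prevent moire effects.
// random_number is a global int updated by the CPU every frame that acts as a seed.
// inner_position is an interpolated position of this fragment generated by the vertex shader.
float frag_rand = inner_position.y*7079.0+inner_position.x*6959.0;
float frag_rand2 = float(random_number)+inner_position.z*4969.0;
float frag_rand3 = inner_position.y*inner_position.x*inner_position.z;
// Work in uint so the rotate and the modulo stay well defined:
// with signed ints the shift count and the remainder could go negative.
uint frag_mod = uint(int(frag_rand));
uint frag_bits2 = uint(int(frag_rand2));
uint rot = frag_mod & 31u;
// Rotate right by rot; the mask avoids an undefined shift by 32 when rot is 0.
frag_mod ^= (frag_bits2 >> rot) | (frag_bits2 << ((32u - rot) & 31u));
frag_mod ^= frag_bits2 * uint(int(frag_rand));
frag_mod ^= uint(int(frag_rand3));
frag_mod %= 1000u;

// gl_FragColor.a is assumed to have been written earlier in the shader.
if(float(frag_mod) < gl_FragColor.a * 1000.0)
{
// draw frag
}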
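For comparison, here is a CPU-side sketch of the same keep-or-skip decision using a different, swapped-in technique: a xorshift-multiply integer mixer instead of the PRNG posted above. The function names, the mixer constants, and the way x, y, layer and frame are combined are all assumptions for illustration, not the engine's code. Running it on the CPU makes it easy to check that the fraction of kept pixels tracks alpha.

```cpp
#include <cassert>
#include <cstdint>

// Xorshift-multiply bit mixer (constants are an assumed, illustrative choice).
std::uint32_t mix32(std::uint32_t x) {
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Decide whether a fragment is drawn this frame: true with probability alpha.
// Including layer and frame decorrelates overlapping surfaces and successive frames.
bool keep_fragment(std::uint32_t x, std::uint32_t y, std::uint32_t layer,
                   std::uint32_t frame, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0);
    std::uint32_t h = mix32(x);
    h = mix32(h ^ y);
    h = mix32(h ^ layer);
    h = mix32(h ^ frame);
    double u = h / 4294967296.0;  // map the 32-bit hash to [0, 1)
    return u < alpha;
}

// Fraction of pixels kept in a width x height region for one frame.
double coverage(std::uint32_t width, std::uint32_t height,
                std::uint32_t frame, double alpha) {
    assert(width > 0 && height > 0);
    std::uint32_t kept = 0;
    for (std::uint32_t y = 0; y < height; ++y) {
        for (std::uint32_t x = 0; x < width; ++x) {
            if (keep_fragment(x, y, 0u, frame, alpha)) {
                ++kept;
            }
        }
    }
    return static_cast<double>(kept) / (static_cast<double>(width) * height);
}
```

The same structure ports to a fragment shader with `uint` arithmetic, where a fragment that fails the test would typically be discarded.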

This topic is closed to new replies.
