Tuesday, July 21, 2009

Advanced Lighting for TGEA

This advanced lighting work has been lying around collecting dust for a few months now because I have been too busy with other things. I decided to spend a few days rounding off the new lighting system I developed, as well as finishing the port for TGEA, otherwise I’ll never do it. This technique should not be confused with the new advanced lighting that comes with T3D, as they are radically different from one another.

The first part of the video (inside the cave) is done with the up-to-date lighting system. The rest of the video is old, recompressed prototype footage, so the quality is low. I don’t have time to make proper video captures…

Here are a few screenshots from a scene I used to do some testing with.

screenshot_123-00006

screenshot_123-00005

screenshot_123-00002

screenshot_123-00003

This lighting system is intended for shader model 3.0 (SM3) and upward GPUs. Theoretically I could make it work for SM2, but the performance would be poor. This is what I like to call ‘Real Lighting’ because everything happens in real time: there are no pre-calculated light and shadow textures, and every texel gets lit on demand. ALL lighting and transform computations take place on the GPU; even the light data for each light resides on the GPU. Shaders can easily compute the light/shadow factors by using an abstraction shader library I developed to extract the relevant data from textures. Lighting is full HDR, and shadow fidelity is not affected by the complexity of the geometry. This has advantages such as proper specular reflections and high-quality normal (bump) mapping.

Everything in the scene gets lit uniformly by the light sources. In other words, whether it is a ShapeBase, TSStatic or InteriorInstance object, they all get lit by the same lighting dynamics and they all cast shadows. There is no separate shadow algorithm for any of these object types; shadow computation is independent of object type and is done in the shaders. So there are no ugly, inaccurate, out-of-date projected shadows casting through walls etc. All objects are also shadow receivers. All shadow data is embedded/abstracted in the lighting solution one gets by using the shader lighting library.

“So are the shadows screenspace shadows?” I’ve been asked this question many times by people who have seen the lighting system in action. No, they aren’t, although I can see how this can easily be mistaken for screenspace shadows. Keep in mind that with this technique all the lighting and shadow data is available to the shaders, so shadow factors are already factored into the fragment output of any pixel shader that does lighting calculations. Screenspace shadows (or shadow masks, as some people like to call them) are a post effect that usually takes a Gaussian-blurred greyscale shadow buffer and modulates it with the final scene to overlay the shadows. This usually requires additional shadow render passes, which is just nasty. The biggest drawback in each implementation of screenspace shadows I came across is that the shadow masks simply ignore light color values. Imagine a green light shining on a spot from the left and a red light shining on the same spot from the right. If an object now stands on the right edge of the spot, blocking off some of the red light, we should see a green shadow silhouette on the spot, because although some of the red light doesn’t reach the spot, the green light still does. Shadow masks seem to modulate over all colors regardless.
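The red/green example can be checked with a few lines of arithmetic. The following is only a CPU-side Python sketch of the idea, not the shader library itself; the function names, the light tuples and the mask value of 0.5 are all made up for illustration.

```python
# CPU-side sketch of per-light shadowing versus a single greyscale shadow mask.
# All names and values here are illustrative assumptions, not engine code.

def shade_per_light(albedo, lights):
    """Sum each light's colour scaled by that light's own shadow factor."""
    r = g = b = 0.0
    for colour, visibility in lights:  # visibility: 1.0 = lit, 0.0 = shadowed
        r += albedo[0] * colour[0] * visibility
        g += albedo[1] * colour[1] * visibility
        b += albedo[2] * colour[2] * visibility
    return (r, g, b)

def shade_with_mask(albedo, lights, mask):
    """Light the point without shadows, then darken every channel by one mask value."""
    unshadowed = shade_per_light(albedo, [(c, 1.0) for c, _ in lights])
    return tuple(channel * mask for channel in unshadowed)

white = (1.0, 1.0, 1.0)
green_light = ((0.0, 1.0, 0.0), 1.0)  # unobstructed
red_light = ((1.0, 0.0, 0.0), 0.0)    # blocked by the object

per_light = shade_per_light(white, [green_light, red_light])
masked = shade_with_mask(white, [green_light, red_light], mask=0.5)
```

With per-light shadow factors the shadowed spot comes out pure green, while the mask version only dims the yellow mix of both lights, so the red light still leaks into the shadow.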

Last but not least, one needs a proper light manager to determine exactly which lights to render and which not. I won’t go into details here. All I can say is that the manager is very effective at collecting only visible lights: no occluded lights get rendered and there are no lights ‘popping’ in the scene. The implementation is well optimized; the demo video never dropped below 60 fps with DOF, SSAO, bloom and volumetric lighting enabled.

If there are people interested in the details of the technique described on this page then they should post questions and I’ll answer them. I’ll update this blog at a later stage and clarify the details a bit.

Thursday, April 9, 2009

First post for 2009!

Hi folks! It has been some time since I’ve posted anything on my blog. I have been very busy with some new developments. I developed a new framework SDK for structural fatigue testing and analysis in the aerospace and automotive industry, called the Cubus Host Framework SDK, for a British-based company I do work for. It is being tested and integrated world-wide at the moment, so I have been MIA for quite some time. I have also been heads down on some new features for GG’s T3D release and am still busy debugging some stuff! Developing features for game engines and integrating them isn’t trivial! I have received various emails asking why the blog is standing still. Well, now you know!

For those of you who missed it, there are a few blogs showcasing some of the work on this page over at Garage Games’ site. Here are the links:

http://www.garagegames.com/community/blogs/view/16083
http://www.garagegames.com/community/blogs/view/16165
http://www.garagegames.com/community/blogs/view/16437
http://www.garagegames.com/community/blogs/view/16587

Here is what I plan to blog about next,

AI Part I – Environmental awareness.

GPU Particles

GPU destruction dynamics

Cloth enhancements

Enhanced lighting and shadowing.

All or most of the above should go onto GG’s blog in the near future as well.

Thursday, October 2, 2008

Simulating wetness in TGEA

Since starting this blog I received numerous emails asking me to reveal some of the 'secrets' of the techniques displayed on this blog. So I decided to post a few things in a more informative way.

In this post I'll describe the use of the COLOR0-COLOR3 render targets (introduced in DX9, if I remember correctly) during a single render pass to obtain depth-buffer data.

I'll specifically focus on adding depth buffer information to the rendering pipeline. Shader model 4.0 (DX10) allows one to read the depth buffer directly. Unfortunately this is not available in DX9, which is what TGEA currently runs on.

I'll first give a short overview (taking major quantum leaps) of how to add depth buffer info into the rendering pipeline and then I'll give a practical example of depth buffer usage. FSAO and depth of field (blogged on this page) make use of it as well.

Getting depth buffer data requires no pre-processing on the CPU, apart from defining the depth buffer render target and attaching it to one of the open render target slots (for example COLOR1), as well as handling screen resizing etc.

Now, most of you probably bind your pixel shaders to COLOR0 by defining the pixel shader main function something like,

float4 main( Conn In ) : COLOR
or,
struct Frag
{
   float4 col : COLOR0;
};
Frag main( Conn In ) ....


Now, simply extending the above struct to,




struct Frag
{
   float4 col : COLOR0;
   float depth : COLOR1;
};



and binding your depth render target to COLOR1 before a scene render pass gives you the ability to set and access depth info during scene render passes. Actually, you need two depth render targets and need to swap them after each scene render pass, because you cannot have a render target bound to both input and output slots at the same time.
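The swapping of the two depth render targets can be pictured with a tiny bookkeeping class. This is just a Python sketch of the idea; the class name and the "depthA"/"depthB" strings are hypothetical stand-ins for real render target handles.

```python
class DepthTargetPair:
    """Two depth targets: one is written this pass, the other one is read."""

    def __init__(self, a, b):
        self._targets = [a, b]
        self._write_index = 0

    @property
    def write_target(self):
        # Bound to the COLOR1 output slot for the current scene pass.
        return self._targets[self._write_index]

    @property
    def read_target(self):
        # Bound as a texture input; holds the depth from the previous pass.
        return self._targets[1 - self._write_index]

    def swap(self):
        """Call after each scene render pass."""
        self._write_index = 1 - self._write_index

pair = DepthTargetPair("depthA", "depthB")
```

Because the read and write targets are always different objects, the same target is never bound to an input and an output slot at once.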



Now in your pixel shader you can output data to both COLOR0 and COLOR1 (depth) in the same pass. So with little shader overhead you have depth info without any pre-processing on the CPU, i.e.



OUT.col = ...;   // some value
OUT.depth = ...; // depth value (computed in whichever way seems practical)
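One practical way to compute the depth value is a linear mapping of the view-space distance between the near and far planes. The Python sketch below only shows the arithmetic; the function name and the near/far parameters are assumptions for this example, and other encodings work just as well.

```python
def linear_depth(view_z, near, far):
    """Map a view-space distance in [near, far] to a depth value in [0, 1]."""
    d = (view_z - near) / (far - near)
    return min(max(d, 0.0), 1.0)  # clamp, like saturate() in a shader
```

For example, with near = 1 and far = 101, a texel 51 units from the camera gets a depth of 0.5.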



OK, I'm not going to write about this any further, by now you get the idea.



Let's look at an example of how the depth buffer can be applied in practice to solve a specific problem.



Let me start off by giving an overview of the video clip. The clip demonstrates the use of a multi-pass CustomMaterial chain to simulate wetness. It also shows conditional pass execution. I'll blog about this at a later stage, but in this demo the custom $Weather::IsRaining script variable has to evaluate to 'true' so that the pass that adds wetness to geometry can execute. I also extended the precipitation class to allow for shaded ripples. This is done by drawing a quad on a surface where a rain drop collides with it. The normals used for the waves are generated in the shader. Why do I use a depth buffer? Look at the video clip below: the ripple quads protruding over surface edges draw over these edges, leaving ripples floating in the air. The debug version of the ripple shader draws the erroneous ripple areas in red. These areas can be determined by evaluating the ripple texel depths against the current depth buffer.
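The depth test on the ripple quads boils down to one comparison per texel. Here is a minimal Python sketch of that comparison, assuming linear depths in the range 0 to 1; the function name and the bias value are hypothetical and not taken from the actual ripple shader.

```python
def ripple_texel_state(ripple_depth, scene_depth, bias=0.002):
    """Classify a ripple texel by comparing its depth with the stored scene depth.

    Depths are assumed to be linear values in [0, 1]; bias is a made-up tolerance.
    """
    if abs(ripple_depth - scene_depth) <= bias:
        return "draw"  # the quad lies on the surface it collided with
    return "clip"      # the quad overhangs an edge (drawn red in a debug view)
```

A texel whose depth matches the scene depth is kept, while a texel hanging over an edge sees the much larger depth of whatever lies behind it and gets clipped.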



I suggest you download the high-res version on this one, a lot of detail is lost in the online version...





Download Hi-Res version here -



To summarize, this example shows how the depth buffer can be used to clip geometry. Consider also that the SAME depth buffer data gets used for FSAO and Depth of field.

Monday, September 22, 2008

GPU Water Simulation

The video shows an example of how Verlet integration can be used to simulate dynamic water surfaces.

It uses the same technology as the cloth simulation with minor modifications.

This video shows how the water surface interacts with rain and projectile collisions.

Simulation is 100% GPU based. All collision computations are done with a fast approximation shader. There is, however, a bit of pre-processing involved with collisions, in that data from the CPU has to be transferred to the shader.
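To give a feel for how Verlet integration can drive a water surface, here is a small Python sketch of a one-dimensional height field. It is a simplified CPU illustration under my own assumptions (fixed ends, a single stiffness constant, simple damping) and not the shader used in the video.

```python
def step_water(curr, prev, stiffness=0.25, damping=0.99):
    """One Verlet step of a 1D height field with fixed (zero) ends.

    curr/prev are lists of heights at this and the previous step.
    stiffness plays the role of (wave_speed * dt / spacing)^2 and
    should stay below 1 for stability in this 1D sketch.
    """
    nxt = [0.0] * len(curr)
    for i in range(1, len(curr) - 1):
        # Neighbours pull the column toward their average height.
        accel = stiffness * (curr[i - 1] + curr[i + 1] - 2.0 * curr[i])
        # Verlet: the velocity is implied by (curr - prev).
        nxt[i] = curr[i] + damping * (curr[i] - prev[i]) + accel
    return nxt, curr

heights = [0.0, 0.0, 1.0, 0.0, 0.0]  # a rain drop pushes the middle column up
after, _ = step_water(heights, list(heights))
```

After one step the bump has started to spread to the neighbouring columns, which is the ripple behaviour one wants from rain and projectile hits.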

See my post on GPU Cloth for more information...

Download Hi-Res version here - 

Friday, September 19, 2008

Artwork

The interiors and character (dubbed 'Dude' for lack of a better name!) were modeled by Ruan van der Westhuizen.

Ruan is a highly talented animator/modeler. The character seen in the demo videos was done purely for the purpose of having a nice character for the demos. Ruan later adapted it for the 'Make Something Unreal' competition hosted by Intel and Epic Games.

This was his first attempt at modeling a character for TGEA, not too bad at all if I may say so... The other amazing thing is the short amount of time he needs to churn these things out... Go have a look at his work on his blog at ruanwest.blogspot.com.

In the meantime here is a short video giving a quick overview of his workflow.

Email update

Updated the email address in my profile to gerhard@hybrid-logic.com.

Thursday, September 18, 2008

GPU Cloth Simulation in Torque Game Engine

Download Hi-Res version here - 

Cloth simulation is done using Verlet integration and is 100% GPU based, using shader model 3.0.
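For readers new to Verlet integration, the core of a cloth solver is a position update followed by distance constraints between neighbouring vertices. The Python sketch below shows both for a single pinned/free vertex pair in 2D. It is a generic textbook-style illustration with made-up names and constants, running on the CPU, not the GPU implementation described here.

```python
import math

GRAVITY = (0.0, -9.81)

def verlet_step(pos, prev, dt, accel=GRAVITY):
    """Advance one particle: x_new = x + (x - x_prev) + a * dt^2."""
    new = (pos[0] + (pos[0] - prev[0]) + accel[0] * dt * dt,
           pos[1] + (pos[1] - prev[1]) + accel[1] * dt * dt)
    return new, pos

def satisfy_distance(p, q, rest_length, p_pinned=False):
    """Move p and q so they are rest_length apart (only q moves if p is pinned)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return p, q
    corr = (dist - rest_length) / dist
    if p_pinned:
        return p, (q[0] - dx * corr, q[1] - dy * corr)
    half = 0.5 * corr
    return (p[0] + dx * half, p[1] + dy * half), (q[0] - dx * half, q[1] - dy * half)

anchor = (0.0, 0.0)  # pinned cloth vertex
free, free_prev = (1.0, 0.0), (1.0, 0.0)
for _ in range(100):
    free, free_prev = verlet_step(free, free_prev, dt=0.01)
    _, free = satisfy_distance(anchor, free, rest_length=1.0, p_pinned=True)
```

The free vertex swings below the anchor under gravity while the constraint keeps it one unit away. A full cloth repeats the same two steps over a grid of vertices.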

To date, TGEA has no support for vertex texture fetching. I integrated this seamlessly into the TGEA texture stage pipeline. One can now specify vertex textures inside CustomMaterial instances using shader model 3.0, or dynamically from code (which I do quite a lot). This was necessary to make GPU cloth simulation possible in TGEA.

There is a high level of interaction with the game environment, including object and character collisions.

Collision detection is 100% GPU based using a very fast collision approximation shader algorithm.

Cloth can be easily manipulated using the world editor. Cloth vertices can be attached to any scene node(s) and then be manipulated by these control nodes.

A vast array of simulation and rendering parameters can be accessed via the mission inspector.

Forces like wind and gravity can easily be applied. For example, a custom wind force could be used or the wind settings of the current mission could be used.

Cloth can be cut manually in the editor, which can be used to 'sculpt' the cloth and then save it for later use in a mission/level. Cloth can also be cut by objects such as projectiles colliding with it.

Friday, September 12, 2008

Fullscreen ambient occlusion mapping

This post contains a brief description of my implementation of fullscreen ambient occlusion (FSAO), also known as Screen Space Ambient Occlusion (SSAO) mapping, inside Torque Game Engine Advanced.

The demo videos demonstrate SSAO at a quality level which has a negligible effect on the frame rate. The SSAO has been exaggerated for demonstration purposes.

Ambient occlusion mapping can add great lighting detail to a scene and make it look more natural.

Download Hi-Res version here -

Fullscreen ambient occlusion has many advantages over baked ambient occlusion maps. Baked ambient occlusion maps are static, are usually time consuming to generate, require a lot of memory, and usually need to be baked into the shadow maps, which also require a lot of memory to achieve decent quality.

I came across an ambient occlusion map resource on Garage Games' site. Unfortunately this resource bakes the ambient occlusion maps into the shadow maps during scene lighting. Not only is the baking process VERY slow, but because it is baked it is also very static. It therefore does not support ambient occluded lighting for dynamic, moving in-game objects, which is a major drawback.

Furthermore, baked ambient occluded lighting in Torque can ONLY be applied to interiors; StaticShape and ShapeBase objects are excluded from receiving ambient occlusion.

Other options are dynamic ambient occlusion and fullscreen ambient occlusion. Dynamic ambient occlusion is a better option than baked ambient occlusion but requires a lot of maintenance like tree structures and object culling making it tedious and also very much dependent on the CPU.

I decided to implement the best solution that matches the in-game speed of baked ambient occlusion, which is fullscreen ambient occlusion (FSAO).

An advantage of FSAO is that it uses information that is already present in most shaders: texel depth and texel normals. Shaders simply write this information to the relevant render targets, which in turn get used by the fullscreen FSAO shader during render post-processing. See my blog posts on my new implementation of the CustomMaterial class as well as my post on multi-channel rendering to understand how this can easily be done during a single scene render pass. FSAO is also 100% GPU based, which is an indication of how little code management is needed to run it.
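As a rough picture of what a fullscreen occlusion pass does with the depth data, here is a deliberately crude, depth-only Python sketch. The function name, the sampling pattern and the bias are assumptions for this example. A real FSAO shader would also use the normals and a smarter sample kernel.

```python
def ambient_occlusion(depth, x, y, radius=1, bias=0.01):
    """Fraction of neighbouring texels that are nearer to the camera than (x, y).

    depth is a 2D list of linear depths (smaller = nearer). This is a crude,
    depth-only estimate used purely for illustration.
    """
    h, w = len(depth), len(depth[0])
    centre = depth[y][x]
    occluders = samples = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            sx, sy = x + dx, y + dy
            if (dx, dy) == (0, 0) or not (0 <= sx < w and 0 <= sy < h):
                continue  # skip the centre texel and off-screen samples
            samples += 1
            if depth[sy][sx] < centre - bias:
                occluders += 1
    return occluders / samples if samples else 0.0

flat = [[0.5] * 3 for _ in range(3)]
pit = [[0.4, 0.4, 0.4],
       [0.4, 0.5, 0.4],
       [0.4, 0.4, 0.4]]
```

A texel on a flat wall gets no occlusion, while a texel at the bottom of a pit is surrounded by nearer geometry and is fully occluded.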

Use of shaders during rapid prototyping development

 

Texturing is probably still the most important aspect to achieve a good look and feel for any game. A lot of detail and a specific look and feel can be achieved by using texturing.

Intelligent use of shaders, however, also plays an important role and can greatly enhance the look and feel of a game.

This video illustrates how a minimally textured scene can be brought to life by the use of shaders.

Download Hi-Res version here -

Skin shader (Subsurface skin scatter simulation)

 

Single-pass, fast subsurface skin scattering shader, using a normal map for surface detailing and a custom Lambert shading model for subsurface illumination simulation.
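The exact custom Lambert model is not spelled out here. One common, cheap way to modify Lambert shading for skin is so-called wrap lighting, shown below as a Python sketch. Treat it as an assumed stand-in for illustration only: the wrap parameter and function names are mine.

```python
def lambert(n_dot_l):
    """Standard Lambert term: no light past the terminator."""
    return max(n_dot_l, 0.0)

def wrapped_lambert(n_dot_l, wrap=0.5):
    """Let light 'wrap' past the terminator to imitate scattering under the skin."""
    return max((n_dot_l + wrap) / (1.0 + wrap), 0.0)
```

For a surface facing slightly away from the light (N·L = -0.25) plain Lambert gives 0, while the wrapped version still returns some light, which softens the shadow edge on skin.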

An important aspect of the shader is that it achieves a respectable level of realism and light interaction without degrading in-game frame rate and performance.

Download Hi-Res version here -

Torque Game Engine Advanced Enhancements

 

Following is a VERY brief description of some changes, fixes and additions I made to TGEA 1.0.3 and TGEA 1.7.0. I don't attempt to address everything I've done in detail (I could write a book by now, LOL!). I will focus on some of the core issues.

AI

The one important thing missing in TGEA! Added a fully functional AI system into the torque pipeline. Techniques used are mostly computational intelligence techniques like Particle Swarm Optimization (PSO) for swarm behavior and fitness based solution finding, genetic algorithms for evolving AI and neural networks for agent decision making. These are all used in a layered approach. The AI component is fully customizable.
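For readers who have not met Particle Swarm Optimization before, here is a minimal global-best PSO in Python that minimises a simple test function. It is a generic textbook version with commonly used constants, not the AI system described above; the function names and parameter values are assumptions for this example.

```python
import random

def pso(f, dim=2, particles=20, iterations=100, seed=0,
        inertia=0.7, cognitive=1.5, social=1.5):
    """Minimise f with a basic global-best swarm; returns (best, value, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]              # each particle's personal best
    best_val = [f(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    history = [g_val]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                # Blend old velocity, pull to personal best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * rng.random() * (best_pos[i][d] - pos[i][d])
                             + social * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = f(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], value
                if value < g_val:
                    g_pos, g_val = pos[i][:], value
        history.append(g_val)
    return g_pos, g_val, history

def sphere(p):
    """Test fitness function with its minimum at the origin."""
    return sum(x * x for x in p)

best, value, history = pso(sphere)
```

The fitness function is the only problem-specific part. In a game it could score candidate positions or behaviours for an agent instead of the sphere function used here.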

Multicore Support

TGEA doesn't really have multicore support. In build 1.0 the thread affinity was single! I made some extensive modifications and adjustments to the main game loop to add multi-core support using OpenMP, maintaining integrity and keeping all components that run on different threads, like networking and input events, in sync. I ported this over to my version of TGEA 1.7.0. The AI component also makes extensive use of this multicore support.

Materials

TGEA provides a very good Materials pipeline. Two classes are available, "Material" and "CustomMaterial". The first is very good in that one just specifies features and the shaders are procedurally generated for you. This, however, is completely inadequate if you are planning to do some advanced shading; I see it as sort of the artist's version of shader support in TGEA. The problem is that shader features are hard-coded. For example, you are forced to use the normals shader as generated by the Material class and cannot specify your own normals shader, so the features are sort of static.

Luckily Garage Games had the insight to provide the CustomMaterial component for the purpose of using your own custom shaders. However, all is not what it seems. I made extensive use of it in TGEA 1.0, but I had to make a LOT of changes and tweaks, and I still did not have the level of control I wanted. CustomMaterial seems to be a bit of an orphaned entity in the world of GG. I came to the full realization of this after I had ported my code to TGEA 1.7.0; a lot of things were missing, which I will not go into here.

I extended and modified the CustomMaterial class (basically ripped it all out and replaced it) to give me more control. It now has better pass control as well as logic flow-control over multiple passes. This is a very powerful feature to have and basically allows me to do what I want. It allows shader passes to render or to be ignored based on certain user-defined states in the game. It also allows specific versions (not pixel shader versions) of a shader to execute depending on whether or not a certain game state is true (this is used by Multi-channel rendering, described below). It also allows different texture stages for different passes.
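Conditional pass execution can be thought of as a filter over the pass list that is evaluated against the current game state. The Python sketch below illustrates just that idea; the Pass class, its condition field and the state dictionary are hypothetical and do not mirror the real CustomMaterial interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

GameState = Dict[str, bool]

@dataclass
class Pass:
    name: str
    # Hypothetical predicate: should this pass render for the given game state?
    condition: Callable[[GameState], bool] = lambda state: True

def active_passes(passes: List[Pass], state: GameState) -> List[str]:
    """Return the names of the passes that should render, in order."""
    return [p.name for p in passes if p.condition(state)]

material = [
    Pass("base"),
    Pass("wetness", lambda s: s.get("$Weather::IsRaining", False)),
]
```

With the rain flag set, both passes run; without it, only the base pass runs and the wetness pass is skipped at no cost.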

Multi-channel Rendering

Added the ability and support to render shader output to multiple render targets (COLOR0-COLOR3). It makes use of COLORx registers as an optimization opportunity to output pixels to relevant render targets. It is, however, not limited to the allotted COLORx registers; any number of additional channels (render targets) can be specified. Rendering to multiple render targets allows one to get rid of multiple-pass rendering, which can induce a big performance hit with complex scenes. It is, for instance, used to combine multiple fullscreen shader effects which share a subset of inputs. An example is screenspace ambient occlusion and depth of field, which both depend on a pixel depth buffer. It is therefore practical to let shaders output pixel depth to, for example, the COLOR2 render target and output color texels to the COLOR0 render target, all in one scene render pass. I actually output pixel depth information to the COLOR0 render target's alpha channel using SrcBlendAlphaOnly (I added the state to TGEA's render states). By knowing beforehand which fullscreen shaders can be applied, one can make shaders output all additional required data to the COLOR1-COLOR3 render targets in a single scene render pass. My new CustomMaterial class mentioned in the previous paragraph allows switching dynamically between certain versions of the same shader depending on whether it needs to output data to certain render targets. For example, fullscreen ambient occlusion requires normals data to be rendered to one of the render targets. A specific version of the shader, either writing normals data to the render target or not, will be dynamically activated depending on whether fullscreen ambient occlusion is active or not.
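Deciding which extra outputs the scene shaders must write can be sketched as a union over the inputs of the active fullscreen effects. The Python below is only an illustration: the table, the effect names and the slot assignment are assumptions, apart from SSAO needing depth and normals and depth of field needing depth, as described above.

```python
# Hypothetical table: which extra per-pixel outputs each fullscreen effect needs.
EFFECT_INPUTS = {
    "ssao": {"depth", "normals"},
    "dof": {"depth"},
    "bloom": set(),
}

def required_outputs(active_effects):
    """Union of the extra outputs the scene shaders must write this frame."""
    needed = set()
    for effect in active_effects:
        needed |= EFFECT_INPUTS[effect]
    return needed

def assign_targets(needed):
    """Map colour to COLOR0 and each extra output to the next free COLORx slot."""
    slots = {"color": "COLOR0"}
    for index, name in enumerate(sorted(needed), start=1):
        slots[name] = "COLOR%d" % index
    return slots
```

The resulting set is what selects the shader version: with SSAO off, the variant that skips writing normals is picked, and shared inputs such as depth are written only once.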

FXMaterial

Added a new Material type for using HLSL FX shaders generated in tools like RenderMonkey, FX Composer and Mental Mill directly in TGEA. Work is still in progress, but it is functional at the time of this writing...

Vertex Texture Stages (D3DVERTEXTEXTURESAMPLERx)

To date, TGEA has no support for vertex texture fetching. I integrated this seamlessly into the TGEA texture stage pipeline. One can now specify vertex textures inside CustomMaterial instances using shader model 3.0, or dynamically from code (which I do quite a lot).

DynamicFX

DynamicFX is basically an addition I made to TGEA which makes use of shader model 3.0's vertex texture fetching capabilities. Examples of DynamicFX in my version of TGEA include dynamic destruction of objects, interactive water and cloth (all of which are 100% GPU based, including collision detection) and deformable bodies.

Shaders

I wrote so many shaders for TGEA that I will not attempt to list them all here. I'll demonstrate some of them in the demo videos, but they include object self ambient occlusion, fullscreen ambient occlusion, dynamic depth of field, enhanced sunlight and automatic dirt for interiors, and subsurface scattering (used for the skin on the demo character).

Shader Particles

Added support for lighting and mapping shaders to particle textures. This allows me to do things like shaded rain and particle-based heat haze effects.

Misc Issues

Fixed many issues with things like render targets and the backbuffer. TGEA 1.7.0 seems to have some issues with the backbuffer and offscreen render targets.

Added dynamic lighting to static meshes imported into interiors. This is missing in TGEA, where static interior meshes only receive ambient light.

Will extend this list at a later stage (It is quite a long list)