DarkPlaces engine modification - (a bit) faster shadows

Developer discussion of experimental fixes, changes, and improvements.



  • Hello everyone!

    I'm new to this forum, so please forgive me if this is not the best way to announce it. I've been looking at the DarkPlaces shadowing system and I've made a patch that uses the GPU in the shadow volume algorithm: in both the zPass and zFail versions, vertex extrusion is now done on the GPU, which is pretty close to one of the TODOs about moving it to the GPU side.

    I've also changed a few small things in skeletal animation and memory management (MEMPARANOIA or something, probably a known issue on the Nexuiz forums?), so it runs a bit better.

    The patch is against the latest stable release of DarkPlaces from LordHavoc's site, in case anyone wants to know.

    The patch is available here:

    http://students.mimuw.edu.pl/~ts248384/ ... dows.patch

    cheers,

    Tom
    tomasz.swierczek
    Newbie
     
    Posts: 8
    Joined: Wed Aug 26, 2009 2:28 pm


  • Cool! Too bad the patch didn't work for me with the latest SVN engine though, as it broke all dynamic lights with shadows (no light was rendered) :(
    FruitieX
    Keyboard killer
     
    Posts: 588
    Joined: Mon Nov 13, 2006 4:47 pm
    Location: Finland

Wed Aug 26, 2009 4:27 pm

Wed Aug 26, 2009 5:01 pm

  • It seems to break realtime DYNAMIC lights when shadows are enabled, but the realtime world lights work absolutely GREAT! Awesome work!

    Edit: apparently shadows are not cast correctly... Try the stormkeep2 map for an example: http://pics.nexuizninjaz.com/images/2u7 ... vmianf.jpg
    FruitieX


  • tomasz.swierczek wrote:I'm new to this forum, so please forgive me if this is not the best way to announce it. I've been looking at the DarkPlaces shadowing system and I've made a patch that uses the GPU in the shadow volume algorithm: in both the zPass and zFail versions, vertex extrusion is now done on the GPU, which is pretty close to one of the TODOs about moving it to the GPU side.


    While looking through your patch I have a few comments:
    You disabled some memset's in the skeletal model animation code - I know these are redundant but in my experience it improves framerate to clear memory before writing it (it forces it to be in CPU cache before it is written).

    You disabled MEMPARANOIA defines - I agree with this in principle however divverent has told me that he has fixed the performance problems more recently, so this part is unnecessary now (but in the source you based your patch on, it is probably a good idea).

    You are using a texcoord1f array for projection distance which could easily be merged into the vertex3f array to make it a vertex4f array which would read faster on the GPU.
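
A minimal sketch of the merge suggested above, assuming positions are stored as three floats per vertex and the projection distance as one float per vertex. The function name `pack_vertex4f` and its parameters are hypothetical and are not taken from the DarkPlaces source or the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a DarkPlaces function): interleave positions
 * (x, y, z per vertex) and a per-vertex projection distance into a single
 * float[4] stream, so the shader reads one attribute instead of two. */
void pack_vertex4f(const float *vertex3f, const float *distance1f,
                   size_t numverts, float *out_vertex4f)
{
    size_t i;
    assert(vertex3f && distance1f && out_vertex4f);
    for (i = 0; i < numverts; i++)
    {
        out_vertex4f[i * 4 + 0] = vertex3f[i * 3 + 0];
        out_vertex4f[i * 4 + 1] = vertex3f[i * 3 + 1];
        out_vertex4f[i * 4 + 2] = vertex3f[i * 3 + 2];
        out_vertex4f[i * 4 + 3] = distance1f[i]; /* distance rides in w */
    }
}
```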

    You are still copying vertex data for static meshes when casting shadows - it is possible to preprocess the shadowmesh of the model such that it has a vertex4f array with w=0 and w=1 variants of every vertex, and the CPU only needs to generate a new triangle elements array to produce a shadow volume, leaving the shadowvertex array as a static VBO.
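
One way the preprocessing described above could look, as a sketch under stated assumptions rather than the engine's actual code. `build_shadow_vertex4f` and `emit_shadow_elements` are hypothetical names; the layout (copy i keeps w=1, copy i+numverts gets w=0) and the triangle winding are assumptions. The vertex array is built once and could live in a static VBO, while only the element array is regenerated per light.

```c
#include <assert.h>
#include <stddef.h>

/* Build once per model: vertex i keeps w=1 (stays in place), vertex
 * i+numverts gets w=0 (to be pushed away from the light by the shader).
 * out must hold 2 * numverts * 4 floats. */
void build_shadow_vertex4f(const float *vertex3f, int numverts, float *out)
{
    int i, c;
    assert(vertex3f && out && numverts >= 0);
    for (i = 0; i < numverts; i++)
    {
        float *near_v = out + 4 * i;
        float *far_v = out + 4 * (i + numverts);
        for (c = 0; c < 3; c++)
            near_v[c] = far_v[c] = vertex3f[3 * i + c];
        near_v[3] = 1.0f;
        far_v[3] = 0.0f;
    }
}

/* Per light: emit elements for one triangle that faces the light.
 * edge_is_silhouette[k] is nonzero when the neighbor across edge k
 * (k=0: v0-v1, k=1: v1-v2, k=2: v2-v0) does not face the light.
 * Returns the number of indices written (at most 24). */
int emit_shadow_elements(const int tri[3], const int edge_is_silhouette[3],
                         int numverts, int *out)
{
    int n = 0, k;
    assert(tri && edge_is_silhouette && out && numverts > 0);
    /* front cap: the original triangle */
    out[n++] = tri[0];
    out[n++] = tri[1];
    out[n++] = tri[2];
    /* back cap: the w=0 copies, winding reversed */
    out[n++] = tri[2] + numverts;
    out[n++] = tri[1] + numverts;
    out[n++] = tri[0] + numverts;
    for (k = 0; k < 3; k++)
    {
        int a, b;
        if (!edge_is_silhouette[k])
            continue;
        a = tri[k];
        b = tri[(k + 1) % 3];
        /* side quad between the edge and its w=0 copy, as two triangles */
        out[n++] = b;
        out[n++] = a;
        out[n++] = a + numverts;
        out[n++] = b;
        out[n++] = a + numverts;
        out[n++] = b + numverts;
    }
    return n;
}
```

A vertex shader would then leave the w=1 vertices in place and push the w=0 vertices away from the light; that part is not shown here.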

    You are not making use of the R_SetupShader_SetPermutation system in gl_rmain.c which manages all shader generation, in particular you should take a look at the function R_SetupGenericShader and make a copy of that which activates your new shader, automatically compiling it on first use (by way of the permutation system).
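
The idea of compiling a shader variant on first use can be sketched as a small cache keyed by a feature bitmask. This is not the R_SetupShader_SetPermutation code from gl_rmain.c; the flag names, the #define strings, and the stubbed compile step are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature bits; a real engine defines its own set. */
enum
{
    PERM_DEPTH_OR_SHADOW = 1 << 0,
    PERM_GPU_EXTRUSION   = 1 << 1,
    PERM_COUNT           = 1 << 2
};

typedef struct
{
    int compiled;       /* nonzero once this permutation has been built */
    char preamble[128]; /* "#define ..." lines prepended to the shared source */
} permutation_t;

static permutation_t perm_cache[PERM_COUNT];
static int compile_count; /* how often the stubbed compile step ran */

/* Stand-in for a real compile: only assembles the #define preamble. */
static void compile_permutation(unsigned int perm)
{
    permutation_t *p = &perm_cache[perm];
    p->preamble[0] = '\0';
    if (perm & PERM_DEPTH_OR_SHADOW)
        strcat(p->preamble, "#define MODE_DEPTH_OR_SHADOW\n");
    if (perm & PERM_GPU_EXTRUSION)
        strcat(p->preamble, "#define USE_GPU_EXTRUSION\n");
    p->compiled = 1;
    compile_count++;
}

/* Select a permutation, compiling it on first use only. */
const char *set_permutation(unsigned int perm)
{
    assert(perm < PERM_COUNT);
    if (!perm_cache[perm].compiled)
        compile_permutation(perm);
    return perm_cache[perm].preamble;
}

int get_compile_count(void)
{
    return compile_count;
}
```

With this shape, a new shader path only adds one flag and one #define to the shared source, and nothing is compiled until the flag combination is first requested.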
    LordHavoc
    Site Admin
     
    Posts: 191
    Joined: Wed Mar 29, 2006 7:39 am
    Location: western Oregon, USA

Thu Sep 03, 2009 12:04 pm

  • FruitieX wrote:It seems to break realtime DYNAMIC lights when shadows are enabled, but the realtime world lights work absolutely GREAT! Awesome work!

    Edit: apparently shadows are not cast correctly... Try the stormkeep2 map for an example: http://pics.nexuizninjaz.com/images/2u7 ... vmianf.jpg


    Well... maybe I am wrong (so please correct me), but I can't see anything bad in this picture: there is that bright lava, and the health pickup is casting shadows just fine as far as I can tell. Could you post a comparison with the unpatched version?

    I had some problems with the range of the shadows, though I did not notice errors in their direction.
    tomasz.swierczek


  • LordHavoc wrote:
    tomasz.swierczek wrote:I'm new to this forum, so please forgive me if this is not the best way to announce it. I've been looking at the DarkPlaces shadowing system and I've made a patch that uses the GPU in the shadow volume algorithm: in both the zPass and zFail versions, vertex extrusion is now done on the GPU, which is pretty close to one of the TODOs about moving it to the GPU side.


    While looking through your patch I have a few comments:
    You disabled some memset's in the skeletal model animation code - I know these are redundant but in my experience it improves framerate to clear memory before writing it (it forces it to be in CPU cache before it is written).

    You disabled MEMPARANOIA defines - I agree with this in principle however divverent has told me that he has fixed the performance problems more recently, so this part is unnecessary now (but in the source you based your patch on, it is probably a good idea).

    You are using a texcoord1f array for projection distance which could easily be merged into the vertex3f array to make it a vertex4f array which would read faster on the GPU.

    You are still copying vertex data for static meshes when casting shadows - it is possible to preprocess the shadowmesh of the model such that it has a vertex4f array with w=0 and w=1 variants of every vertex, and the CPU only needs to generate a new triangle elements array to produce a shadow volume, leaving the shadowvertex array as a static VBO.

    You are not making use of the R_SetupShader_SetPermutation system in gl_rmain.c which manages all shader generation, in particular you should take a look at the function R_SetupGenericShader and make a copy of that which activates your new shader, automatically compiling it on first use (by way of the permutation system).


    As far as I know, static meshes cast shadows in a precompiled way: the shadow mesh is built once when a map is loaded (while compiling the realtime lights, or something like that?), so there is no need to extrude any vertex on the GPU, as they are only extruded once. Correct me if I'm wrong.

    I realize that the shaders, to match the engine's conventions, should be done with permutations, but I must say I do not appreciate this kind of shader code: it makes things harder to read and, more importantly, harder to profile and optimize.

    Also, as this is my first time posting a patch here, and there are some problems with how it works (though I didn't see any on my laptop), I will leave it as it is for now, and maybe try to fix the lighting errors once I know where they are (see my previous post about the image).

    Once everything works, this should surely be done the permutation way, but again, only to be compatible with Nexuiz conventions.

    As for performance: it matters on low-end CPUs. Even if you don't notice the difference now, it will become more visible with more dynamic lights and more triangles. Every CPU cycle is worth optimizing, isn't it? :-)
    tomasz.swierczek


  • tomasz.swierczek wrote:As far as I know, static meshes cast shadows in a precompiled way: the shadow mesh is built once when a map is loaded (while compiling the realtime lights, or something like that?), so there is no need to extrude any vertex on the GPU, as they are only extruded once. Correct me if I'm wrong.


    The world is statically cast, yes, but most models (pickups, for example) are not animated. This allows static geometry to be used in more optimized ways than is possible with dynamic geometry; the double vertex float[4] array I mentioned is the usual technique.

    tomasz.swierczek wrote:I realize that the shaders, to match the engine's conventions, should be done with permutations, but I must say I do not appreciate this kind of shader code: it makes things harder to read and, more importantly, harder to profile and optimize.


    Note that I'm not complaining about this, only saying that it would have been easier for you, and ultimately it will have to be adapted to this approach for engine maintenance reasons.

    tomasz.swierczek wrote:Also, as this is my first time posting a patch here, and there are some problems with how it works (though I didn't see any on my laptop), I will leave it as it is for now, and maybe try to fix the lighting errors once I know where they are (see my previous post about the image).


    It's a good idea, although I think it needs more optimizations before it will be a consistent speed boost, some people say it's actually slower.

    tomasz.swierczek wrote:Once everything works, this should surely be done the permutation way, but again, only to be compatible with Nexuiz conventions.

    As for performance: it matters on low-end CPUs. Even if you don't notice the difference now, it will become more visible with more dynamic lights and more triangles. Every CPU cycle is worth optimizing, isn't it? :-)


    Low-end GPUs are usually paired with low-end CPUs, and the low-end GPUs are much slower than the low-end CPUs at this particular task. It's a matter of baseline performance: the method that works best on the worst computers is used.

    Note that shadow optimization can only increase fps by as much as turning off shadows does. If shadows are a large percentage of render time then optimizations are warranted, but I have never seen them be a large percentage in practice.
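
The bound described above is the usual Amdahl's-law argument, and can be checked with a few lines of arithmetic. The function names are hypothetical, and the shadow share of frame time is an input that would have to be measured.

```c
#include <assert.h>

/* fps after speeding up only the shadow part of the frame.
 * shadow_fraction: share of frame time spent on shadows, in [0, 1)
 * speedup: how many times faster the shadow part becomes (>= 1) */
double fps_after_shadow_speedup(double fps, double shadow_fraction,
                                double speedup)
{
    double frame_ms, new_ms;
    assert(fps > 0.0 && speedup >= 1.0);
    assert(shadow_fraction >= 0.0 && shadow_fraction < 1.0);
    frame_ms = 1000.0 / fps;
    new_ms = frame_ms * (1.0 - shadow_fraction)
           + frame_ms * shadow_fraction / speedup;
    return 1000.0 / new_ms;
}

/* Upper bound: the fps reached when shadows cost nothing at all,
 * which is the same as turning them off. */
double fps_without_shadows(double fps, double shadow_fraction)
{
    assert(fps > 0.0);
    assert(shadow_fraction >= 0.0 && shadow_fraction < 1.0);
    return fps / (1.0 - shadow_fraction);
}
```

For instance, with made-up figures of 60 fps and a 10% shadow share, doubling shadow speed gives about 63.2 fps, while removing shadows entirely would give about 66.7 fps.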
    LordHavoc


  • LordHavoc wrote:
    Low-end GPUs are usually paired with low-end CPUs, and the low-end GPUs are much slower than the low-end CPUs at this particular task. It's a matter of baseline performance: the method that works best on the worst computers is used.

    Note that shadow optimization can only increase fps by as much as turning off shadows does. If shadows are a large percentage of render time then optimizations are warranted, but I have never seen them be a large percentage in practice.



    You are absolutely right about this: it really is quite a small performance gain, no doubt. Even so, putting those calculations on the GPU frees the CPU to do, e.g., more vertex animation. With gprof under my Ubuntu 9.04 I've seen that the time spent on shadow generation is about half of what it was. If people are experiencing it being slower, well, you can't fit everyone's combination of CPU, GPU and driver version. Also, writing code for TNTs or even the GF 5 today is a bit... well, you name it.

    I am currently working on a shooter game as part of my studies. It's kind of a big, one-year project with a couple of friends, and I am working on the 3D engine side (and have been for a bit longer, as it is also my area of interest). You can see sample screenshots here:

    https://students.mimuw.edu.pl/~ts248384 ... enSlotter/

    (that's the university's server address; the content is from OpenArena)

    Doing this kind of graphics while keeping up with old hardware doesn't seem like a good road map to me today. At some point people will have to understand that without at least a GF 9600 they won't be able to play games using modern technologies, or they'll have to stick to the old Quake III Arena.

    I will take another look at my shadow code this weekend, and will surely try to do everything with GPU extrusion using vec4 rather than vec3 with texcoord1f.

    I will do my best to make this modification useful.

    I will probably post something here around Monday evening (my local time, I'm in Poland).

    Thanks for the comments :-)
    tomasz.swierczek

Fri Sep 04, 2009 6:29 pm

Fri Sep 04, 2009 6:37 pm

  • Not really, at least until I realized that each arrow is carefully pointing at a shadow on the FRONT of the ledge, are these bidirectional shadows of some kind? :)
    LordHavoc

Fri Sep 04, 2009 6:38 pm

Mon Sep 07, 2009 3:59 pm

  • LordHavoc wrote:Not really, at least until I realized that each arrow is carefully pointing at a shadow on the FRONT of the ledge, are these bidirectional shadows of some kind? :)


    In fact... yes, these actually were "bidirectional shadows", because I made a really, really stupid mistake in the GLSL shader code :-)

    I've fixed it now. Here is the patch (against the Nexuiz SVN version I downloaded on Friday last week, as everyone wanted):

    https://students.mimuw.edu.pl/~ts248384 ... vec4.patch

    changes:

    - uses vec4 instead of vec3 for dynamic shadows (no texcoord1f, only vertices)
    - uses R_SetupShader_SetPermutation and that uber shader code
    - looks as it should (no bidirectional shadows)

    hope it will work :)
    tomasz.swierczek

Mon Sep 07, 2009 4:27 pm

  • Seems to work fine now. I got a tiny performance boost of about 1 fps (from 13.5 to 14.4) by using the patch. I'm getting a warning during compile though:

    Code:
    r_shadow.c: In function ‘R_Shadow_VolumeFromList’:
    r_shadow.c:1247: warning: passing argument 2 of ‘R_SetupGPUExtrusionShader’ discards qualifiers from pointer target type


    Edit: oh wait, it seems to crash the GLSL shaders, which disables deluxemapping/offsetmapping. I get this error in the DP console:
    Code:
    <19:19:56> GLSL shader glsl/default.glsl depth/shadow failed!  some features may not work properly.
    <19:19:56> OpenGL 2.0 shaders disabled - unable to find a working shader permutation fallback on this driver (set r_glsl 1 if you want to try again)
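
The compile warning above typically appears when a pointer to const data is passed to a parameter declared without const. A minimal sketch of the usual fix, using hypothetical stand-in functions, since the real signature of R_SetupGPUExtrusionShader is not shown in this thread:

```c
#include <assert.h>

/* Hypothetical stand-in for a function that only reads its array argument.
 * Declaring the parameter as `const float *` lets callers pass pointers to
 * const data without a "discards qualifiers" warning. */
float sum_positions(int numverts, const float *vertex3f)
{
    float total = 0.0f;
    int i;
    assert(vertex3f != 0 || numverts == 0);
    for (i = 0; i < numverts * 3; i++)
        total += vertex3f[i];
    return total;
}

/* A caller that only holds read-only data, as a shadow volume routine might. */
float caller_with_const_data(void)
{
    static const float verts[6] = { 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f };
    /* If the parameter were a plain `float *`, GCC would warn on this call. */
    return sum_positions(2, verts);
}
```

Casting the const away at the call site would also silence the warning, but it hides the problem if the callee ever starts writing to the array.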
    FruitieX

Mon Sep 07, 2009 6:15 pm

  • FruitieX wrote:Seems to work fine now. I got a tiny performance boost of about 1 fps (from 13.5 to 14.4) by using the patch. I'm getting a warning during compile though:

    Code:
    r_shadow.c: In function ‘R_Shadow_VolumeFromList’:
    r_shadow.c:1247: warning: passing argument 2 of ‘R_SetupGPUExtrusionShader’ discards qualifiers from pointer target type


    Edit: oh wait, it seems to crash the GLSL shaders, which disables deluxemapping/offsetmapping. I get this error in the DP console:
    Code:
    <19:19:56> GLSL shader glsl/default.glsl depth/shadow failed!  some features may not work properly.
    <19:19:56> OpenGL 2.0 shaders disabled - unable to find a working shader permutation fallback on this driver (set r_glsl 1 if you want to try again)


    What is your GPU?

    There is probably a mistake in that builtin shader string; I am fixing it right now (hell... so many #defines...)

    Edit:

    Try this one:

    http://students.mimuw.edu.pl/~ts248384/ ... c4_2.patch

    On my GPU (NVIDIA GF 8600 GT) it's fine.
    tomasz.swierczek

Mon Sep 07, 2009 6:53 pm

Mon Sep 07, 2009 9:48 pm

  • FruitieX wrote:Edit: Yeah, it works now. :)


    Hurray! :D thanks for testing, FruitieX :)
    tomasz.swierczek

Wed Sep 09, 2009 10:07 am

  • *cough* :(
    no improvement on my GeForce 7950 GX2 / Core 2 Duo E6600

    This GPU/CPU combination seems to be flawed in many ways. There are many situations where I just get ridiculously low FPS where other people don't notice a thing...
    Blµb
    Alien trapper
     
    Posts: 277
    Joined: Thu Mar 29, 2007 1:49 pm

Wed Sep 09, 2009 11:24 am

  • For what it's worth, this patch completely breaks shadows on GF4MX. I suspect a missing feature is not being detected properly.
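
One possible way to guard against this, sketched under assumptions: check the OpenGL extension string for GLSL support before enabling the GPU path, and otherwise keep CPU extrusion. The function names are hypothetical, and which extensions the patch really requires is an assumption; the three listed are the ARB extensions that expose GLSL vertex shaders on OpenGL 1.x drivers.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if `name` appears as a whole space-separated token in `extlist`
 * (the kind of string returned by glGetString(GL_EXTENSIONS)). */
int has_extension(const char *extlist, const char *name)
{
    size_t len;
    const char *p;
    assert(extlist && name);
    len = strlen(name);
    if (len == 0)
        return 0;
    for (p = extlist; (p = strstr(p, name)) != NULL; p += len)
    {
        int starts_token = (p == extlist) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
    }
    return 0;
}

/* GPU extrusion needs GLSL vertex shaders; without them the caller
 * should stay on the CPU extrusion path. */
int can_use_gpu_extrusion(const char *extlist)
{
    return has_extension(extlist, "GL_ARB_shader_objects")
        && has_extension(extlist, "GL_ARB_vertex_shader")
        && has_extension(extlist, "GL_ARB_shading_language_100");
}
```

Matching whole tokens matters because a plain substring search would also accept a longer extension name that merely starts with the one being looked for.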
    parasti
    Alien
     
    Posts: 110
    Joined: Sun May 11, 2008 11:32 pm
    Location: On the walls and the ceiling

Wed Sep 09, 2009 7:02 pm

  • tomasz.swierczek wrote:There is probably a mistake in that builtin shader string; I am fixing it right now (hell... so many #defines...)


    Be sure to use developer 1 or higher when testing shaders, otherwise you don't get the errors/warnings the driver wants to report.

    I ought to make all debug builds default to developer 1 too...
    LordHavoc


