CFlags for recompiling

Tips on how to tweak Nexuiz for the best performance


Postby Ed » Mon Sep 18, 2006 3:41 pm

The Gentoo wiki carries this page of safe cflags for different CPU architectures:
http://gentoo-wiki.com/Safe_Cflags

Do the Nexuiz devs think these are safe cflags for DP/Nexuiz? What cflags are used by default?
Laters losers.
Ed
Forum addon
 
Posts: 1172
Joined: Wed Mar 01, 2006 12:32 am
Location: UK

Postby esteel » Mon Sep 18, 2006 3:54 pm

Nexuiz, or rather DarkPlaces, currently uses:
-O2 -fno-strict-aliasing

-funroll-loops is known to make the binary very large while being only very slightly faster.
-ffast-math is known to cause problems for DP with gcc 4.1 and 64-bit machines.
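
The size cost of a flag such as -funroll-loops can be checked directly by comparing two builds. A minimal sketch, assuming both binaries are already built; the function name and the file names in the comment are placeholders, not part of the DarkPlaces build:

```shell
#!/bin/bash
# Hypothetical helper: print how much larger (in percent) the file $2 is than the file $1.
size_growth() {
    local base new
    base=$(wc -c < "$1")   # size of the reference binary in bytes
    new=$(wc -c < "$2")    # size of the binary built with the extra flag
    awk -v b="$base" -v n="$new" 'BEGIN { printf "%.1f\n", (n - b) * 100 / b }'
}

# Example call (placeholder file names):
# size_growth nexuiz-glx-O2 nexuiz-glx-O2-unroll
```

A positive figure means the second binary is larger; it says nothing about speed, which still has to be benchmarked.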
esteel
Site admin and forum addon
 
Posts: 3924
Joined: Wed Mar 01, 2006 8:27 am

Postby Ed » Tue Sep 19, 2006 2:55 pm

I wrote a script to compile multiple binaries with different -march flags and optimisations and then benchmark each one to compare performance:
#!/bin/bash
# Compile many versions of Nexuiz and benchmark each one.
# Run from the darkplaces source directory.

# Available processor architectures:
# i386 i486 i586 pentium pentium-mmx i686 pentiumpro pentium2 pentium3 c3 c3-2 k6 k6-2 k6-3 athlon athlon-tbird athlon-xp athlon-4 athlon-mp etc.
for arch in i686 pentiumpro pentium2 pentium3 c3-2 athlon athlon-tbird athlon-xp athlon-4 athlon-mp
do
    # Optimisation types: -O -O2 -O3 -Os
    for optimise in -O2 -O3
    do
        make clean
        # Skip the benchmark if this build fails, so no stale binary gets measured
        make nexuiz CFLAGS_RELEASE="" OPTIM_RELEASE="" CFLAGS_COMMON="-march=$arch $optimise -fno-strict-aliasing -ffast-math -fomit-frame-pointer" || continue
        cp nexuiz-glx "../nexuiz-glx-$arch$optimise"
        # The benchmark runs from the parent directory, then we return to the source
        cd ..
        "./nexuiz-glx-$arch$optimise" -game compile -sndspeed 48000 -benchmark demos/demo1
        cd darkplaces
    done
done


Extract the DP source to a darkplaces subdir of your Nexuiz directory, put this script there and run it. In this form it only builds glx binaries, as my SDL libs are a mess, though I have done it with SDL on another computer.
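
Since a full run compiles the engine once per combination, it can help to list the combinations before starting. A minimal dry-run sketch; list_variants is a hypothetical helper, and the only thing taken from the script is its nexuiz-glx-$arch$optimise naming:

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the binary name for every arch/optimisation pair.
# $1 = space-separated -march values, $2 = space-separated optimisation flags.
list_variants() {
    local archs=$1 opts=$2 arch opt
    # Unquoted on purpose so the lists split on spaces
    for arch in $archs; do
        for opt in $opts; do
            echo "nexuiz-glx-$arch$opt"
        done
    done
}

list_variants "i686 athlon-xp" "-O2 -O3"
```

Piping the output through wc -l gives the number of builds a given pair of lists would trigger.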

My system:
Athlon XP 2400+
1 GB RAM
Geforce 6600GT 128 MB AGP
Linux 2.6.12-12mdk
GCC 4.0.1
Nexuiz 2.0

The settings in Nexuiz are purely arbitrary but are fairly high in this case. -fomit-frame-pointer was used in most tests as I have found it to give a meaningful performance boost:
[Image: benchmark results]

The standard make settings give 50.37 fps. Enabling -fomit-frame-pointer gives 50.94 fps, and changing to -O3 gives 51.14 fps.
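
To put those three figures in proportion, the gain over the standard build can be worked out as a percentage. A small sketch using awk for the floating-point arithmetic; pct_gain is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: percentage change from a baseline fps ($1) to a new fps ($2).
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2f\n", (new - base) * 100 / base }'
}

pct_gain 50.37 50.94   # -fomit-frame-pointer vs. standard settings
pct_gain 50.37 51.14   # -O3 vs. standard settings
```

That works out to roughly 1.1% and 1.5% over the standard build.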

The startling thing is the difference by architecture. It doesn't make that much difference! The athlon-xp option should be fastest but isn't. Athlon-4 beats it and has a small lead over the i686. c3-2 is even quicker! These results of course don't reflect how Nexuiz would perform on those CPUs.

I did try all the -O options including -O and -Os which were in general significantly slower.

I have also tried this to a lesser extent on a Via Nehemiah (c3-2) and found that i686 with -Os is best there. c3-2 is surprisingly slow.

I would recommend sticking to i686, as compiling for your own architecture doesn't seem to make much difference (and can be a lot worse), but do try different -O options and try -fomit-frame-pointer.
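
When trying several options, it is easier to compare them if the results are sorted. A minimal sketch, assuming the fps figures have been collected by hand into "label fps" lines; rank_builds and the labels are hypothetical, and the three figures are the ones quoted above:

```shell
#!/bin/bash
# Hypothetical helper: read "label fps" lines on stdin and print them fastest first,
# with each build's fps as a percentage of the fastest build.
rank_builds() {
    LC_ALL=C sort -k2,2nr |
        awk 'NR == 1 { best = $2 } { printf "%s %s %.1f%%\n", $1, $2, $2 * 100 / best }'
}

printf '%s\n' 'standard 50.37' 'fomit-frame-pointer 50.94' 'O3 51.14' | rank_builds
```

The last column makes it obvious when a build is within a percent or two of the best one and when it is far behind.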

For the official build, i686 seems like the best bet to stay with, despite the Nexuiz requirements needing a Pentium 3/Athlon or better. -O3 and -fomit-frame-pointer may be worth looking at, particularly with GCC 4.

I would be interested if other people can repeat this experiment on different architectures. I would do a full run on my Nehemiah but its graphics are hopeless and ATi broke my Dad's tbird. Feel free to add or remove options from the script; this was just what works on my setup.

Postby tZork » Tue Sep 19, 2006 3:12 pm

Interesting stuff, Ed. One thing I would suggest is running the benchmark twice without exiting the game, though I'm not sure how to do that with automation :| This would remove the shader compile times from the equation, as those are pretty uninteresting from a game performance point of view.
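
A related option that is easy to script is to repeat each benchmark several times and throw away the first run as a warm-up. Note that this is not the same thing as the suggestion above: each run starts the game afresh, so it smooths out disk caching and run-to-run noise rather than removing shader compile time. A minimal sketch, assuming the fps figure from each run has been collected one per line; mean_after_warmup is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: read one fps figure per line on stdin, discard the first
# (warm-up) run and print the mean of the remaining runs.
mean_after_warmup() {
    awk 'NR > 1 { sum += $1; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# Example call, with one fps figure per run held in shell variables:
# printf '%s\n' "$run1" "$run2" "$run3" | mean_after_warmup
```

With fewer than two runs there is nothing left after the warm-up, so the helper prints nothing.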
tZork
tZite Admin
 
Posts: 1337
Joined: Tue Feb 28, 2006 6:16 pm
Location: Halfway to somwhere else

Postby esteel » Tue Sep 19, 2006 3:21 pm

Your benchmark makes the differences look rather huge, but remember the difference from slowest to fastest was only TWO FPS. That's not very much, and only Gentoo types would spend much time to get 2 fps more :P Sorry, could not resist..

Postby Ed » Tue Sep 19, 2006 3:54 pm

tZork wrote:One thing I would suggest is running the benchmark twice without exiting the game ... this would remove the shader compile times from the equation, as those are pretty uninteresting from a game performance point of view.

The shader compilation makes the benchmark more CPU bound, so it will show up more of the differences between builds. The benchmark is of course entirely arbitrary and I would not use the same settings on every machine, as the point is the difference between builds on the same machine doing the same thing.

esteel wrote:the difference from slowest to fastest was only TWO FPS.

That's not all of the compilations I did. Many of the -O and -Os options were around 47 fps and the slowest I got with any settings was 28 fps. That makes a huge difference.

esteel wrote:only gentoo types would spend much time to get 2 fps more

Part of the point is actually that the Gentoo cflags I linked to at the beginning are not actually the best options. People compiling Nexuiz through Portage are not getting the performance they think they are and would be better off using the standard binary. However, there are also ways of getting more performance than the normal binary gives, using other options that would not be used in Portage as they'd be unsafe system-wide.

