RtStage2 source release

Postby Phantom on 08 May 2006, 09:04

Release is imminent. I am building a small package now.

The original plan was to release both my 'pretty' tracer (with textures, lights and stuff) and the 'speedking' edition (bare bones tracer), but after looking at the 'pretty' tracer I decided that it is not good enough to release. It just contains too many problems that I fixed recently (esp. memory management issues), which will probably require tons of support should anyone try to build and run it.

The original plan was also to release the current 'speedking' edition, but without the actual tracer (just the kd-tree compiler), but since I am holding on to the 'pretty' tracer, I will release my full current project. It's somewhat 'work in progress', obviously, so I will update it regularly. Right now, it is completely optimized to run Toxie's benchmark scenes, it does not contain code for recursive ray tracing (just first-hit code), no shading, no texturing and so on. It also has quite a bit of 'work in progress' code, like #defines for stuff that I am testing at the moment. It does however show how to implement a very basic 4x4 packet tracer, how to feed it rays, how to shade, how to do intersections (Carsten style), how to do memory management and of course how to build a kd-tree in O(N log N) time.

As soon as the package is up, I'll drop another note here.
--------------------------------------------------------------
Arauna - Game-oriented real-time ray tracing
http://igad.nhtv.nl/~bikker
Phantom
Overlord
 
Location: Houten, Netherlands

Postby Phantom on 08 May 2006, 09:23

File has been uploaded to the repository:

http://ompf.org/alpha/bikker/rtStage2_may_8th_2006.zip
(thusly moved)

Some notes, besides the one in my first post:
- Only one scene included: Sponza. Add extra scenes to 'meshes' directory.
- Code compiles under VS2005 and VC6/icc combo; icc is by far the fastest. VS is used for debugging.
- See common.h for settings: MAXTREEDEPTH, PRIMSPERLEAF influence tree generation. TRAVCOST and INTRCOST are self-explanatory, I assume. TREECLIP is used to enable/disable full clipping. REORDERINGPRIMS still works, but doesn't help at all, like I mentioned. REBUILDTREE cannot be disabled, as tree saving/loading is broken atm.
- kdhelp contains all helper functions, the old nlog2n compiler (might still work) and some code backups. You shouldn't need it, but if you are looking for list insertion/deletion and that sort of stuff, it's there. :)
- This package is the exact code that was used for the demo I released earlier, except for a couple of things:
* Intersection code has been changed to Carsten's approach. It's faster and more accurate.
* Timer has been replaced by tbp's wallclock. Much more stable, and more accurate.
* Incoherent packets are now detected properly, but not yet handled (code is there, but doesn't work).
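
The notes above do not say how TRAVCOST and INTRCOST enter the build. A common choice for this kind of kd-tree compiler is the surface area heuristic, so the sketch below assumes that. The names `kTravCost`, `kIntrCost`, `Aabb`, `sahCost` and `leafCost`, and the cost values, are made up for illustration and are not taken from common.h.

```cpp
#include <cassert>

// Hypothetical stand-ins for the TRAVCOST / INTRCOST settings;
// the values are illustrative only.
const float kTravCost = 1.0f;
const float kIntrCost = 1.5f;

struct Aabb { float min[3], max[3]; };

// Surface area of an axis-aligned box.
float surfaceArea(const Aabb& b) {
    const float dx = b.max[0] - b.min[0];
    const float dy = b.max[1] - b.min[1];
    const float dz = b.max[2] - b.min[2];
    return 2.0f * (dx * dy + dy * dz + dz * dx);
}

// Estimated cost of splitting 'node' at 'pos' along 'axis', with nLeft and
// nRight primitives ending up in the two children (assumed SAH formulation).
float sahCost(const Aabb& node, int axis, float pos, int nLeft, int nRight) {
    assert(axis >= 0 && axis < 3);
    assert(pos >= node.min[axis] && pos <= node.max[axis]);
    Aabb left = node, right = node;
    left.max[axis] = pos;
    right.min[axis] = pos;
    const float invArea = 1.0f / surfaceArea(node);
    const float pLeft = surfaceArea(left) * invArea;    // chance a ray hits the left child
    const float pRight = surfaceArea(right) * invArea;  // chance a ray hits the right child
    return kTravCost + kIntrCost * (pLeft * nLeft + pRight * nRight);
}

// Cost of not splitting: every primitive in the node gets tested.
float leafCost(int nPrims) { return kIntrCost * nPrims; }
```

A builder would compare `sahCost` for each candidate plane against `leafCost` and stop splitting once no plane is cheaper.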

Well I suppose that's it for now, if there are questions, just let me know. Small fixes will be posted in this thread, major releases will get a new package upload.

Postby Ho Ho on 08 May 2006, 09:46

No (official) Linux support? I guess I'll have to port it myself then. More fun for me :)
I'll get started in around 6 hours after I get home from work. If it is not too much different from the RT articles it shouldn't take too long.

Btw, does it compile with GCC? If it does it will probably be easy to get running in Linux. With previous code most of the time was spent on making it digestible for GCC.

[edit]
I should have read both of the posts. Too bad ICC can't compile in VC mode under Linux.

[edit2]

Have you tested it on a 64bit OS? I hope you have coded it 64bit-safe. If not, that means even more fun for me since I have no 32bit OS installed :-P
In theory, there is no difference between theory and practice. But, in practice, there is.
Jan L.A. van de Snepscheut
Ho Ho
 
Location: Estonia

Postby Phantom on 08 May 2006, 09:51

You shouldn't have too much trouble getting it to run under gcc, I hope. I did obey the for..next scope rules, for example. That said, I have little experience with gcc, so I really shouldn't make any claims. :) Keep me posted.

EDIT: If you edit, so do I. 64bit-ness: It's a pure 32bit thingy right now. The kd-tree contains raw pointers, so that's going to be a problem. 64-bitness is also pretty low on my priority list, as I won't have access to a 64-bit machine for the next couple of months. My next laptop (next month) will be dualcore, but not 64bit. Tbp once mentioned some issues for 64bit apps (kd-tree pointers being the most obvious); it shouldn't be too hard to get it working as a 64-bit app, and it should give a small speed boost. I believe tbp mentioned something like a 20% immediate gain, mostly because of the extra registers.

EDIT2: Does anyone know if XP32bit accepts 64bit executables if the host platform is 64bit?

Postby tbp on 08 May 2006, 09:58

Gcc plays it rough with alignments. I haven't looked at the source yet, so i can't say if that's going to hit you in the back.
Glad you've used the wallclock thingy, now i'd need to excise the full thing with proper crossplatformness.

Got to walk the walk, edit time: going for 64 bit is a breeze unless you've been naughty. Then it hurts.
Anyway, you can cross-compile to 32bit provided you have all the required 32bit libs somewhere.

About xp64/32, i dunno. But guess the answer is a big no (that would require some massive emulation).
tbp
Overlord
 
Location: France

Postby Phantom on 08 May 2006, 10:03

The previous stats suffered hugely from the lack of timer accuracy. I was dividing the total rays cast figure by the msec count, which means that at 10fps (100 msec per frame) I could only detect changes in steps of 1%. Right now I can detect much smaller changes, which is very nice.
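
The 1% figure is just the timer step divided by the frame time. A small sketch of that arithmetic, with a hypothetical helper name (`resolvablePercent`) that is not part of the package:

```cpp
#include <cassert>

// Smallest relative change (in percent) that a timer with the given tick
// size can resolve when one measurement spans a single frame.
double resolvablePercent(double framesPerSecond, double tickMilliseconds) {
    assert(framesPerSecond > 0.0 && tickMilliseconds > 0.0);
    const double frameMilliseconds = 1000.0 / framesPerSecond;
    return 100.0 * tickMilliseconds / frameMilliseconds;
}
```

With a 1 msec tick at 10fps this gives 1%; a finer-grained clock shrinks the step proportionally.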

Postby Ho Ho on 08 May 2006, 10:05

Phantom wrote:EDIT2: Does anyone know if XP32bit accepts 64bit executables if the host platform is 64bit?
AFAIK, 32bit OS can only run 32bit apps. 64bit OS can run both, 64 and 32bit apps.

You could probably install a 64bit OS in a virtual machine. It wouldn't be fast but hopefully usable. Unfortunately I don't know for sure which VMs support that. I think qemu might be able to handle it.

Postby tbp on 08 May 2006, 10:14

GetTickCount is absolutely horrible, lots of jitter, large grain... Perf counters are a bit better, they generally rely on PIC timers (~3 MHz); the real trouble is the high latency of those calls (i measured them back in the days to be > 5000 cycles on a mono cpu box) and the fact that they aren't that reliable (faulty HAL and so on). On the other hand, CPUs with varying frequency aren't an issue.

I prefer to use rdtsc, mostly because then measuring has negligible impact, but you need to be extra extra careful: you can't handle varying freq, multi core/cpu systems are a pain (even more so on xp when the kernel doesn't synchronize them) etc...

Once again ...

Postby Shadow007 on 08 May 2006, 13:31

Once again, I'd like to thank you for sharing the source!
Shadow007
 

Postby Phantom on 08 May 2006, 14:15

What's the normal way to deactivate rays in a packet? I tried setting tfar to -1 for the rays that should be skipped, but that causes problems that look like mailboxing problems, i.e. black spots near node boundaries. I added a special mask now for ray deactivation that is used inside the intersection test to mask out primitive & distance updates just before they are updated, but this feels like I'm keeping track of rays that are supposed to be inactive for too long...
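
For reference, the mask approach described above could look roughly like this. It is a scalar sketch for a 4-ray packet, not the SSE code from the package; `RayPacket`, `activeMask` and `commitHits` are hypothetical names.

```cpp
#include <cassert>

const int kPacketSize = 4;

struct RayPacket {
    float dist[kPacketSize];   // nearest hit distance found so far
    int prim[kPacketSize];     // index of the nearest primitive, -1 for none
    unsigned activeMask;       // bit i set = ray i is still active
};

// Commit candidate hit distances for one primitive. A ray's record is only
// updated if the ray is active and the candidate is closer than its current
// hit, so inactive rays never pick up a primitive or a distance.
void commitHits(RayPacket& p, const float candidate[kPacketSize], int primIndex) {
    assert(primIndex >= 0);
    for (int i = 0; i < kPacketSize; ++i) {
        const bool active = ((p.activeMask >> i) & 1u) != 0;
        if (active && candidate[i] > 0.0f && candidate[i] < p.dist[i]) {
            p.dist[i] = candidate[i];
            p.prim[i] = primIndex;
        }
    }
}
```

In an SSE version the same condition would presumably be built as a compare mask and applied with and/andnot before the stores.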

Postby Ho Ho on 08 May 2006, 15:52

I just started to port it. So far I have successfully compiled surface.cpp :)
I use cbuild for building. http://awiki.tomasu.org/bin/view/Main/CBUILD

I wanted to ask what extra libraries I may use. I wouldn't like to use plain X for opening a window and handling input. I would prefer Allegro, but if I must I can use SDL too.

I last used SDL about three years ago to do a really simple test. Opening a window wouldn't probably be a big problem, I'll just have to read some documentation to remember that stuff.
I have used Allegro for the last five years for everything that needed graphics output. It would give all sorts of nice things like bitmap loading (bmp, tga and pcx native, jpg and png as addons), simple thread based timers and keyboard and mouse input. With allegrogl it even has really simple OpenGL support including extension loading. Allegro works cross-platform on windows, Linux and OSX in both 32 and 64bit modes. Also it has DOS support together with some Linux console graphics libraries :D

[edit]

It seems I need to have some replacements for __forceinline, __int64 and LARGE_INTEGER. I'll find the __forceinline from gcc manual. Does MSVC support intXX_t? If it does I could replace those other two with either int64_t or uint64_t.
For QueryPerformance* I'll probably use gettimeofday replacement.

If anyone thinks there are better alternatives let me know.
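
A minimal sketch of what such replacements might look like, assuming GCC on a POSIX system for the non-Windows branch. The names `int64`, `FORCEINLINE`, `wallclockSeconds` and `toMicroseconds` are hypothetical and are not the ones used in the package or in tbp's wallclock code.

```cpp
#include <cassert>

#if defined(_MSC_VER)
  typedef __int64 int64;
  #define FORCEINLINE __forceinline
#else
  #include <stdint.h>
  typedef int64_t int64;
  #define FORCEINLINE inline __attribute__((always_inline))
#endif

#if defined(_WIN32)
  #include <windows.h>
  // Wall-clock time in seconds from the performance counter.
  double wallclockSeconds() {
      LARGE_INTEGER freq, now;
      QueryPerformanceFrequency(&freq);
      QueryPerformanceCounter(&now);
      return double(now.QuadPart) / double(freq.QuadPart);
  }
#else
  #include <sys/time.h>
  // Wall-clock time in seconds from gettimeofday (microsecond granularity).
  double wallclockSeconds() {
      timeval tv;
      gettimeofday(&tv, 0);
      return double(tv.tv_sec) + double(tv.tv_usec) * 1e-6;
  }
#endif

// Example use of both shims: convert a duration to whole microseconds.
FORCEINLINE int64 toMicroseconds(double seconds) {
    assert(seconds >= 0.0);
    return (int64)(seconds * 1e6);
}
```

Timing a frame is then a matter of calling `wallclockSeconds()` before and after and subtracting.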

[edit2]

Woot!
Now three files out of six are compiled: surface, kdtree and kdhelp. Half way there!
Hopefully the GetTickCount and wallclock_t replacements work the same way as under windows.

[edit3]
A little help here. In raytracer.cpp line 55:
_declspec(align(16))
IData Engine::m_ID[8];

Is it correct if I write it this way:
IData Engine::m_ID[8] __attribute__ ((aligned (16)));
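
One way to keep a single source line for both compilers is a small macro. This is only a sketch: `ALIGN16`, `isAligned16` and the stand-in `IData` layout are hypothetical, and the real IData in raytracer.cpp will look different.

```cpp
#include <cassert>
#include <stdint.h>

#if defined(_MSC_VER)
  #define ALIGN16(decl) __declspec(align(16)) decl
#else
  #define ALIGN16(decl) decl __attribute__((aligned(16)))
#endif

// Hypothetical stand-in for the IData struct (16 bytes, SSE-friendly).
struct IData { float v[4]; };

// Expands to the MSVC prefix form or the GCC suffix form.
ALIGN16(IData g_id[8]);

// True if the address is a multiple of 16 bytes.
bool isAligned16(const void* p) {
    return (reinterpret_cast<uintptr_t>(p) & 15u) == 0;
}
```

Checking `isAligned16(g_id)` once at startup is a cheap guard before any aligned SSE loads touch the array.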
Last edited by Ho Ho on 08 May 2006, 17:21, edited 1 time in total.

Postby Phantom on 08 May 2006, 17:20

The __int64 is purely for the timer class.

Please note that besides soloapp.cpp, there should be no platform-specific code anywhere. Soloapp opens a window via the win32 api, and passes a pointer to the tracer. Soloapp.cpp also handles timing, camera positions and some other things.

Btw, even surface.cpp/.h should be pretty much platform independent; it simply encapsulates a 16 or 32 bit linear frame buffer.

Postby Ho Ho on 08 May 2006, 17:23

Phantom wrote:Please note that besides soloapp.cpp, there should be no platform-specific code anywhere
If you don't count the #include <windows.h> you have almost everywhere then you are correct :)
I got the __int64 thingie covered and I think it should work correctly. Of course I can't be sure until I get it running.

Btw, I edited the last post for the third time.

[edit]

Another question. GCC complains about some function calls. Functions are defined like this:
void SetColor( Color& a_Color )
and called like this:
SetColor( Color( r, g, b ) );

GCC says:
no matching function for call to `Raytracer::Material::SetColor(Raytracer::Color)'

Last time I simply made a local variable with the specified color, but is there a better way that would not mean adding a huge pile of local variables all over the place? I know I'll have the same problem later on with vectors.

Postby Phantom on 08 May 2006, 17:45

Your alignment replacement is correct.

About the SetColor call: You indeed need to place the parameter in a temp var, otherwise gcc can't reference it. It's ugly, but the only way. I remember that hassle from my Symbian days... If anyone knows a better way I would like to hear it too.

Postby Ho Ho on 08 May 2006, 18:03

Ok, now all but soloapp are compiled. Luckily I didn't have to use nearly as many local variables as I did in the legodemo you gave me some while ago.

I'll bike around for half an hour to clear my mind for the last part of porting and then start trying to get it to actually work with 64bit. So far it seemed there weren't too many places where you used pointers and ints in the same places. Hopefully I didn't miss something big and ugly :)

Postby tbp on 08 May 2006, 18:16

<stdint.h> has all the types you need in a portable - ehe - way, that's why you see uint32_t etc in my code.

_MM_ALIGN16 comes with Intel SSE headers and provides portable alignment.

SetColor shouldn't need a temporary, but you don't quote enough code for me to help.

I said i'll post proper code for the wallclock_t thingy, code which works on win32/linux with gcc/msvc etc...

There's no need for sdl, it's bloated and slow, just use some bare X11 with shm extension for fast buffer copy. FreeImage is a simple lib that supports every format under the sun and works on win32 and linux just fine.

I may have forgotten some remarks along the way :P

Postby Ho Ho on 08 May 2006, 18:54

tbp wrote:_MM_ALIGN16 comes with Intel SSE headers and provides portable alignment.
I'll take a look at it but it seemed like there was only a single place where alignment was used. I'll fix it later when I've got the other things working.
tbp wrote:SetColor shouldn't need a temporary, but you don't quote enough code for me to help.
In scene.h, Material class:
void SetColor( Color& a_Color ) { m_Color = a_Color; }
In scene.cpp somewhere near line 110:
m_Mat[curmat]->SetColor( Color( r, g, b ) );
GCC said that it couldn't find `Raytracer::Material::SetColor(Raytracer::Color)' but that a possible candidate might be `Raytracer::Material::SetColor(Raytracer::Color&)'

I can't see a reason why you would need more code. If you do you can see it for yourself :)
tbp wrote:I said i'll post proper code for the wallclock_t thingy, code which works on win32/linux with gcc/msvc etc...
Until you do that I'll use my replacement that I hope works OK.
tbp wrote:There's no need for sdl, it's bloated and slow, just use some bare X11 with shm extension for fast buffer copy.
Do you happen to know where I could see a plain "hello world" example for that?

Btw, could someone create a svn server somewhere? It would make working on a project with several developers much easier.

Postby tbp on 08 May 2006, 19:12

Dunno, try with this signature "void SetColor(const Color &a_Color)"
Btw a class with only public members/methods is really a struct in disguise :)
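
The const signature helps because standard C++ does not let a temporary such as `Color( r, g, b )` bind to a non-const reference, while a const reference accepts it; MSVC tolerates the non-const form as a language extension. A cut-down sketch, where the `Color` and `Material` bodies are simplified stand-ins rather than the classes from scene.h:

```cpp
#include <cassert>

struct Color {
    float r, g, b;
    Color() : r(0.0f), g(0.0f), b(0.0f) {}
    Color(float a_R, float a_G, float a_B) : r(a_R), g(a_G), b(a_B) {}
};

class Material {
public:
    // With 'Color& a_Color' a call like SetColor( Color( r, g, b ) ) is
    // rejected by GCC; 'const Color&' accepts the temporary directly.
    void SetColor(const Color& a_Color) { m_Color = a_Color; }
    const Color& GetColor() const { return m_Color; }
private:
    Color m_Color;
};

// Helper showing the call site: no named local variable is needed.
void setGrey(Material& a_Mat, float a_Level) {
    assert(a_Level >= 0.0f);
    a_Mat.SetColor(Color(a_Level, a_Level, a_Level));
}
```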

For the X11/shm skeleton, i don't have any link handy. But that shouldn't be too hard to find, i mean you have the sources for the whole system, don't you?
I could excise some code, but not tonite.

Postby Ho Ho on 08 May 2006, 19:28

tbp wrote:Dunno, try with this signature "void SetColor(const Color &a_Color)"
Thanks, that fixed it.
tbp wrote:Btw a class with only public members/methods is really a struct in disguise :)
I know that. Basically the only difference between classes and structs is that everything is private by default in a class but public in a struct.
tbp wrote:But that shouldn't be too hard to find, i mean you have the sources for the whole system, don't you?
And what do you think, how many programs use plain X?

Now thinking about it perhaps glxgears does that. I'll take a look at its sources :)

[edit]

I checked glxgears and it is not very suitable unless I would want to send the data to GPU. Luckily I found a little tutorial on the subject that should help me: http://nobug.ifrance.com/nobug2/article ... t_xwin.htm

Postby tbp on 09 May 2006, 04:40

Ho Ho wrote:And what do you think, how many programs use plain X?

Seriously? A metric ton, that begins with those bundled with X11. There was a X11 world before gnome, KDE and sdl you know.

I should have pointed to it, but it didn't cross my mind, PTC from Mr Gaffer did exactly what you're looking for.

Postby Phantom on 09 May 2006, 05:17

Ah, PTC! http://www.gaffer.org. Great stuff.

Postby Phantom on 09 May 2006, 12:51

Quick progress report: I got incoherent packets working, Toxie's transformed kitchen is flawless now except for some minor but apparent kd-tree building issues at the kitchen door (now that I'm writing this, I think it's simply exceeding the maximum tree depth, which is erroneously handled by not emitting triangles in my compiler), and performance impact is zero (since only one in every 255 tiles requires splitting, and splitting itself has near zero overhead, average overhead is negligible), so that's great.

Next step is multi-threading, as I will have access to a dual core laptop tomorrow. I'm rapidly switching notebooks atm, so I need to focus on things that I can actually test. :) Besides, on my current system, the icc installer doesn't work for some reason, so I can't do a proper fresh demo... Sorry.

About the SVN: It's all fine with me, I could even setup something on my server (I have good experiences with tortoise, might install that), but I won't participate. I can't handle random contributions right now. So if you want to move forward with that, you'll have to maintain your own version, and sync it every now and then with my releases.

Postby Ho Ho on 09 May 2006, 13:09

I coded until 2a.m last night and I got it to compile. Of course that was the easy part compared to what needs to be done :)
I'm having trouble with 64bitness in KDTree building. Currently I haven't got past the ebox struct with all of its pointer arithmetic. Could you describe in a few words what the ebox struct does and what its variables hold?

Also you seem to assume certain struct sizes. Unfortunately in 64bit most of the structs and classes are bigger than in 32bit and probably mess up the pointer arithmetic. For example the ebox struct is 24 bytes in 32bit but 36 bytes in 64bit IIRC. Same with every other class where you use longs or pointers.

I think one possibility to solve 32/64bit issues would be to abstract things a bit and use simple 32bit ints as offset from array start instead of the real pointers. They shouldn't cause too much slowdown but quite a lot of code needs to be changed. Tbp, how have you solved it?
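
To illustrate the offset idea: a node that stores 32-bit indices into one array keeps the same size and layout on 32-bit and 64-bit targets, while a pointer-based node grows with the pointer width. The types below (`PtrNode`, `IdxNode`, `kNoChild`, `leftDepth`) are hypothetical and much simpler than the real kd-tree nodes.

```cpp
#include <cassert>
#include <stdint.h>
#include <vector>

// Pointer-based node: its size depends on the target's pointer width.
struct PtrNode {
    PtrNode* left;
    PtrNode* right;
    float split;
};

// Index-based node: children are 32-bit offsets into one node array, so the
// layout is identical in 32-bit and 64-bit builds.
struct IdxNode {
    uint32_t left;
    uint32_t right;
    float split;
};

const uint32_t kNoChild = 0xFFFFFFFFu;

// Follow the left links from 'root' and count the nodes visited.
int leftDepth(const std::vector<IdxNode>& nodes, uint32_t root) {
    int depth = 0;
    for (uint32_t i = root; i != kNoChild; i = nodes[i].left) {
        assert(i < nodes.size());
        ++depth;
    }
    return depth;
}
```

The price is one extra add per child lookup (base plus offset), which is the small slowdown mentioned above.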

I don't think I'll create my own server. Unless you change a lot of code every update I don't think manual syncing would be a big trouble.

Postby Phantom on 09 May 2006, 14:16

Hoho,

Manually syncing should not be an issue. I made several minor changes, but these are all pretty isolated, a simple merge tool should make the job very easy.

64bit stuff: Can't you just port to 32bit linux first? That way, you can start with working code before diving into 64bit stuff...

About using offsets in an array instead of pointers: That should work, but there's one problem: The memory manager currently allocates small blocks of memory on demand, which means that kd-tree nodes are not in a contiguous chunk of memory. This could be fixed by changing the memory manager so that it allocates a new (larger) block each time the existing block appears to be too small (i.e., create larger block, copy old block to large one, delete old one). This would also make the kd-tree saving & loading valid again.
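
A rough sketch of such a growing block, handing out offsets instead of pointers. `NodePool` and `Node` are hypothetical; the actual memory manager in the package is organized differently, and a `std::vector<Node>` would give the same grow-and-copy behaviour.

```cpp
#include <cassert>
#include <cstring>
#include <stdint.h>

struct Node { uint32_t left, right; float split; };

class NodePool {
public:
    NodePool() : m_Data(0), m_Count(0), m_Capacity(0) {}
    ~NodePool() { delete[] m_Data; }

    // Returns the offset of a fresh node. Offsets stay valid when the pool
    // grows; raw pointers into the old block would not.
    uint32_t Alloc() {
        if (m_Count == m_Capacity) Grow();
        return m_Count++;
    }
    Node& At(uint32_t a_Index) {
        assert(a_Index < m_Count);
        return m_Data[a_Index];
    }
    uint32_t Size() const { return m_Count; }

private:
    NodePool(const NodePool&);             // non-copyable
    NodePool& operator=(const NodePool&);

    // Create a larger block, copy the old block over, delete the old one.
    void Grow() {
        const uint32_t newCapacity = m_Capacity ? m_Capacity * 2 : 16;
        Node* bigger = new Node[newCapacity];
        if (m_Count) std::memcpy(bigger, m_Data, m_Count * sizeof(Node));
        delete[] m_Data;
        m_Data = bigger;
        m_Capacity = newCapacity;
    }

    Node* m_Data;
    uint32_t m_Count, m_Capacity;
};
```

Because the nodes end up in one block addressed by offsets, the block can in principle be written to disk and read back as-is.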

About ebox & EBox: An 'ebox' holds three events, one for each axis, i.e. primitive start or end, plus planar events.

An EBox encapsulates two ebox'es, thus 6 events. Advantage is that some data that is shared among ebox'es needs to be stored only once (e.g., the pointer to the original primitive), and that all data for a single primitive is closely packed together in memory. See tbp's notes for details.

Actual data layout: an ebox has three 'next' pointers, which are used to place the events in a singly linked list. Since the address of each ebox is divisible by four, the two lowest bits of each pointer are always zero and can hold a flag. These bits mark the event as a 'primitive start', 'planar' or 'primitive end' event. Other than that, the event needs a position.
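
The low-bit trick can be sketched as follows. The flag values and the names `Event`, `packNext`, `nextOf` and `typeOf` are invented for this example; the real ebox stores three such pointers, one per axis.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical flag encoding; the actual values in the package may differ.
enum EventType { EVENT_END = 0, EVENT_PLANAR = 1, EVENT_START = 2 };

struct Event {
    float pos;              // position of the event along its axis
    uintptr_t taggedNext;   // next pointer with the event type in bits 0..1
};

// Combine a 4-byte-aligned pointer and a 2-bit type into one word.
inline uintptr_t packNext(Event* next, EventType type) {
    const uintptr_t bits = reinterpret_cast<uintptr_t>(next);
    assert((bits & 3u) == 0);   // the trick relies on 4-byte alignment
    return bits | static_cast<uintptr_t>(type);
}

// Strip the flag bits to recover the real pointer.
inline Event* nextOf(uintptr_t tagged) {
    return reinterpret_cast<Event*>(tagged & ~static_cast<uintptr_t>(3));
}

// Read the flag bits.
inline EventType typeOf(uintptr_t tagged) {
    return static_cast<EventType>(tagged & 3u);
}
```

Using `uintptr_t` rather than a 32-bit integer keeps the packing valid on 64-bit targets as well.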

The EBox contains, as said, two ebox'es, one for 'minima' along each of the three axes, one for 'maxima'. Besides that, there's a pointer to the original primitive, some flags, and a pointer to a clone: When a primitive is straddling the split plane, it needs to be split, and thus a full EBox is needed in both the left and right child lists. For this, the EBox is first cloned, then clipped to the left and right nodes.
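
The clone-then-clip step, reduced to bare bounds, might look like this. It is a simplification: `Bounds` and `splitBounds` are hypothetical, and only the box is clipped along the split axis, whereas full clipping (see TREECLIP) would presumably clip the primitive itself and tighten the other axes too.

```cpp
#include <cassert>

struct Bounds { float min[3], max[3]; };

// Split the bounds of a primitive that straddles the plane at 'pos' on
// 'axis': clone the box twice, then clip each copy to its side.
void splitBounds(const Bounds& box, int axis, float pos,
                 Bounds& left, Bounds& right) {
    assert(axis >= 0 && axis < 3);
    assert(box.min[axis] < pos && pos < box.max[axis]);  // must straddle
    left = box;              // clone
    right = box;             // clone
    left.max[axis] = pos;    // clip to the left child
    right.min[axis] = pos;   // clip to the right child
}
```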

That's all. Everything should be fine with offsets, as long as you change the memory manager slightly.

Postby toxie on 09 May 2006, 15:16

first of all: nice release! definitely good to read source code and thus should help a lot of the guys out there to get into the world of rtrt! (your optimized 4x4 traversal ROCKZ btw.! how much faster is it over 2x2??!)

second: bad news: everything works fine with intel c++ 8.x+9.0, but crashes with 9.1 (only when optimizations are enabled though).. Weird thing: The tree setup visualizer works BUT it crashes during rendering, as Scene::GetKdTree()->GetRoot() returns some strange (invalid) address..

Some debugging just showed that the first two packets are traced and the third gets this weird address -> crash..
toxie
 

Postby tbp on 09 May 2006, 15:24

http://gcc.gnu.org/onlinedocs/gcc-4.1.0 ... 64-Options
See -m32 for 32bit codegen.

It's much simpler and efficient, again, to have 2 representations: one when building the tree, another when using it to render stuff.
As said in another thread, in fact you don't need the tree when building - you do that via recursion/stack, so it could be just streamed to disk or whatever. Lowering the representation is cheap, you can tweak the whole tree and it's easy to put it in a contiguous block.
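
A small sketch of that lowering step, assuming a pointer-based build tree and a depth-first layout in which the left child always directly follows its parent. `BuildNode`, `FlatNode` and `flatten` are hypothetical names, not code from either tracer.

```cpp
#include <cassert>
#include <stdint.h>
#include <vector>

// Build-time node: convenient to construct, scattered in memory.
struct BuildNode {
    BuildNode* left;    // both children are null for a leaf
    BuildNode* right;
    float split;
    int axis;
};

// Render-time node: lives in one contiguous array.
struct FlatNode {
    uint32_t right;     // index of the right child; left child is own index + 1
    float split;
    int32_t axis;       // -1 marks a leaf
};

// Append 'node' and its subtree to 'out' in depth-first order and return
// the index at which 'node' was stored.
uint32_t flatten(const BuildNode* node, std::vector<FlatNode>& out) {
    assert(node != 0);
    const uint32_t index = static_cast<uint32_t>(out.size());
    FlatNode flat;
    flat.right = 0;
    flat.split = node->split;
    flat.axis = node->left ? node->axis : -1;
    out.push_back(flat);
    if (node->left) {
        assert(node->right != 0);
        flatten(node->left, out);   // stored at index + 1
        const uint32_t rightIndex = flatten(node->right, out);
        out[index].right = rightIndex;   // index again: 'out' may have reallocated
    }
    return index;
}
```

Once flattened, the build-time nodes can be thrown away and the array written out or traversed directly.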

Postby Phantom on 09 May 2006, 15:58

I didn't have to go through any kind of special trouble to create a tree directly in a renderable format. Only thing I need is a postprocessing step, where I replace a linked list of primitives with a pointer in the object list array plus a size (number of prims in that leaf). See 'BuildTriAccels' (this method actually has a bad name now, as TriAccel's are created together with the primitives; this method only does the operation I just described).
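
The postprocessing step described above could be sketched like this. The types are hypothetical, and the sketch stores an index into the object list where the post speaks of a pointer; it is not the actual 'BuildTriAccels' code.

```cpp
#include <cassert>
#include <vector>

struct Primitive { int id; };

// Build-time representation: a linked list of primitives per leaf.
struct PrimLink {
    const Primitive* prim;
    PrimLink* next;
};

struct Leaf {
    PrimLink* list;    // build-time list, cleared by packLeaf
    unsigned first;    // index of the leaf's first entry in the object list
    unsigned count;    // number of primitives in the leaf
};

// Copy the leaf's primitives into the shared object list and replace the
// linked list by a (first, count) range.
void packLeaf(Leaf& leaf, std::vector<const Primitive*>& objectList) {
    leaf.first = static_cast<unsigned>(objectList.size());
    leaf.count = 0;
    for (const PrimLink* l = leaf.list; l != 0; l = l->next) {
        assert(l->prim != 0);
        objectList.push_back(l->prim);
        ++leaf.count;
    }
    leaf.list = 0;   // no longer needed for rendering
}
```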

Toxie: I didn't have any problems with the intel compiler. Did you use it under Linux? I believe I used the win32/icc 9.0x version.

Postby toxie on 09 May 2006, 16:24

No Linux (aka Hippie-OS ;)) used.
But it only happens with 9.1 (not 9.0x) which just came out today.

And to bug you again: How much faster is the 4x4 compared to 2x2 in your case?!
toxie
 

Postby Ho Ho on 09 May 2006, 16:25

Phantom wrote:64bit stuff: Can't you just port to 32bit linux first? That way, you can start with working code before diving into 64bit stuff...
Of course I could if I had a 32bit OS installed. I tried a livecd, but as its kernel didn't have half the needed modules I couldn't even access my files.
tbp wrote:See -m32 for 32bit codegen.
Thanks, that seems to do the trick and I don't have to install 32bit OS for now.

Postby Phantom on 09 May 2006, 16:50

@Toxie: Hard to say, I changed the entire intersection code, changed the kd-tree compiler and replaced the intersection code so I have no idea how much impact each of these had on its own. Bad research, I guess...

Postby Ho Ho on 09 May 2006, 17:22

I got it compiling with the -m32 flag and it enters the mainloop too. Unfortunately I can't get any visuals yet because I can't link 32bit programs with the installed 64bit Xlib. Even if I could link it my X code is not fully functional anyway.

I guess I'll make it render to files until I get 64bit support done or a 32bit OS installed. That way people without X installed can run it too :)

Postby Phantom on 09 May 2006, 17:31

Image

Sorry had to post that. Despair.com rules. :)

Postby Ho Ho on 09 May 2006, 18:46

Some of the first images from under 32bit Linux:
Image
Image
I'm not sure if that black line is meant to be there. Perhaps it is caused by the PPM exporter* or something I changed in rendering. As can be seen those lines moved when camera moved.

*) I borrowed and modified it from tbp's sphereflake tracer :)

As you can see there is definitely something wrong with my wallclock replacement. When I compared the time it took to render and write ten frames with the time it took to render and not write those ten frames I found that it took about 0.37s per frame or ~2.7 FPS @ 1024x1024 on P4 3.6GHz.

Postby Phantom on 09 May 2006, 20:42

That frame rate includes writing to a file? Otherwise it would be a bit slow. :)

About the black lines: Those are the incoherent packets, i.e. 4x4 tiles where not all ray directions have the same signs. The version you have still simply skips them. And as you move the camera, these lines thus also move. I fixed this today, I just downloaded the new icc (9.1), if all works OK I'll upload a fixed version (that'll be tomorrow).
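
The coherence test described above amounts to comparing the direction signs of all rays in a tile. A scalar sketch with hypothetical names (`Vec3`, `signCode`, `isCoherent`); the tracer itself presumably does this with SSE sign masks.

```cpp
#include <cassert>

const int kRaysPerPacket = 16;   // one 4x4 tile

struct Vec3 { float x, y, z; };

// 3-bit code: bit 0 set if x is negative, bit 1 for y, bit 2 for z.
// Zero components count as positive in this sketch.
inline int signCode(const Vec3& d) {
    return (d.x < 0.0f ? 1 : 0) | (d.y < 0.0f ? 2 : 0) | (d.z < 0.0f ? 4 : 0);
}

// A packet is coherent if every ray has the same direction signs on all
// three axes.
bool isCoherent(const Vec3* dirs, int count) {
    assert(dirs != 0 && count > 0);
    const int first = signCode(dirs[0]);
    for (int i = 1; i < count; ++i)
        if (signCode(dirs[i]) != first) return false;
    return true;
}
```

Tiles that fail the test are the ones that show up as black lines when they are skipped instead of being split.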

Postby Ho Ho on 09 May 2006, 20:57

Phantom wrote:That frame rate includes writing to a file?
I found that the framerate I reported earlier was wrong. It actually showed that images are written to file at 2.7 images per second :D

I added my own timer that I know works and found that with SPEEDTEST defined I get about 4.6FPS. Without that defined I get ~11.2FPS, both without writing to file.
Is it correct to assume that with a single pointlight and textures framerate drops about 2x?

I'll start porting to 64bit tomorrow after merging with your new code that you'll upload.

Postby Phantom on 09 May 2006, 21:36

Toxie: Did you test any scenes besides the KitchenTransformed with the 9.1 compiler? I'm getting crashes on that scene, but on all other scenes, the tracer / compiler works fine. I'll download BoundsChecker tonight to see what's wrong with the kitchen.

I do get very excellent speed by the way, the 9.1 compiler is great. VS2005 integration is very cool too, now I can finally see if PGO helps.

I'm VERY happy with the new intersection code btw, it's far more accurate; fairy forest always had some black spots, but right now, it's flawless. That also means it's faster; tracing the black spots tends to be expensive (as the rays pass through the entire scene).

Postby Phantom on 09 May 2006, 22:10

New release now available. Check here:
rtStage2_may_9th_2006.zip in the scene repository
(tbp, can you move it? tnx)

This package now contains both a working executable (compiled with icc 9.1) and a source release.

Changes:
- Incoherent packet handling;
- VS2005 project files for icc9.1 (rename sln_old to sln if you don't want to use icc);
- Resolution & 'speedtest' define now read from scene.txt.

If you just want to play around, do the following:
- Open scene.txt, select the scene you want (13 = scene6, 14 = fairy forest, feel free to add your own);
- Select resolution (see top of file);
- Uncomment 'speedtest' to get a canonical timing, or comment it to get a nice fly-by;
- Store more obj/mtl combos or ra2 files in the meshes folder.

Source release:
- Sources are in src dir;
- Use winmerge or araxis to do a nice merge, shouldn't be too hard;
- Enjoy. Code works under icc9.1 with all optimizations set to max, except for the kitchen. Please report additional problems. There should be no visible artifacts.

I have a feeling that I am forgetting a ton of things, but it's late here.. Need to catch some sleep.

O yeah, one thing I wanted to ask: icc9.1 isn't producing .dyn files during instrumented sessions. Some obscure website (intel.com) mentioned something about code not calling any functions, but I don't see what I could be doing wrong. Anyone? And if any Intel guy is reading this: Could Intel support this community with some free icc licenses? Right now I'm basically moving from one evaluation to another.

Postby Ho Ho on 09 May 2006, 22:43

From the looks of it, it shouldn't be too hard to merge my changes with your code. I'll do it sometime tomorrow and when I've got something worth showing I'll upload it.
The biggest thing is window management. Currently I have none, but there are quite a few #ifdefs to cut the Windows part out of soloapp.cpp. I guess window management should be abstracted away a bit so there wouldn't be a need to mix X11, winapi and other code together in one file.
Phantom wrote:And if any Intel guy is reading this: Could Intel support this community with some free icc licenses? Right now I'm basically moving from one evaluation to another.
Intel provides non-commercial versions of its software for Linux. Unfortunately their versions are a bit behind compared to the latest ones. ICC 9.0 has been available for quite some time already, but the newest vtune version will only be available a few months from now.
In theory, there is no difference between theory and practice. But, in practice, there is.
Jan L.A. van de Snepscheut
Ho Ho
 
Location: Estonia

Postby Phantom on 10 May 2006, 08:37

The original plan was to put all platform specific stuff in soloapp.cpp, but right now, there's a bit more code there (like moving the camera, loading global settings), so it might need to be split. You could also just exclude soloapp from the linux build, and replace it by your own soloapp_linux.cpp or whatever. Then again, perhaps you can get away with some #ifdef's, so that a single source package works on all machines...
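A minimal sketch of the "single source package" idea, assuming the tracer only needs to hand finished frames to some surface and ask whether it should stop. All names here (Surface, NullSurface, CreateSurface, the RT_HAVE_* macros) are hypothetical and do not come from the RtStage2 sources; the point is only that the #ifdef can be confined to one factory function while each backend lives in its own file.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical interface: none of these names come from the RtStage2 sources.
class Surface
{
public:
    virtual ~Surface() {}
    // Shows one finished frame of 32-bit pixels.
    virtual void Present(const std::uint32_t* pixels, int width, int height) = 0;
    // Returns true once the application should stop rendering.
    virtual bool ShouldQuit() const = 0;
};

// Backend without any window, e.g. for timing runs on a machine with no
// display. It asks to quit after a fixed number of frames.
class NullSurface : public Surface
{
public:
    explicit NullSurface(int maxFrames) : m_maxFrames(maxFrames), m_frames(0) {}
    void Present(const std::uint32_t*, int, int) override { ++m_frames; }
    bool ShouldQuit() const override { return m_frames >= m_maxFrames; }
private:
    int m_maxFrames;
    int m_frames;
};

// The only place that knows about platforms. RT_HAVE_WIN32_SURFACE and
// RT_HAVE_X11_SURFACE are made-up macros; Win32Surface and X11Surface would
// be defined in their own, platform-specific source files.
std::unique_ptr<Surface> CreateSurface(int maxFrames)
{
#if defined(RT_HAVE_WIN32_SURFACE)
    return std::unique_ptr<Surface>(new Win32Surface());
#elif defined(RT_HAVE_X11_SURFACE)
    return std::unique_ptr<Surface>(new X11Surface());
#else
    return std::unique_ptr<Surface>(new NullSurface(maxFrames));
#endif
}
```

The render loop would then only call CreateSurface once and use Present and ShouldQuit, regardless of the platform it was built for.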

FYI, the icc9.1 compiler complained about the same SetColor( Color(r,g,b) ) that you mentioned, so I fixed it in the latest source code. It was really just a couple of instances, and it's all in init code (file loading, primarily).
Phantom
Overlord
 
Location: Houten, Netherlands

Postby Ho Ho on 10 May 2006, 08:41

Phantom wrote:FYI, the icc9.1 compiler complained about the same SetColor( Color(r,g,b) ) that you mentioned, so I fixed it in the latest source code. It was really just a couple of instances, and it's all in init code (file loading, primarily).
Didn't you try what tbp suggested here? It worked for me.
Ho Ho
 
Location: Estonia

Postby Phantom on 10 May 2006, 08:47

No, didn't bother. In my experience this causes problems in other places, where the passed argument is not const. And since this is not mission-critical, who cares. :)
Phantom
Overlord
 
Location: Houten, Netherlands

Postby toxie on 10 May 2006, 08:57

@phantom: New SRC version now works with 9.1.. The old one crashed when using Sponza (didn't try other scenes) and Optimizations enabled..
toxie
 

Postby Phantom on 10 May 2006, 09:15

Hm, odd, I just added a couple of random alignment statements... Perhaps that was the problem.
Phantom
Overlord
 
Location: Houten, Netherlands

Postby tbp on 10 May 2006, 09:57

Phantom wrote:No, didn't bother. In my experience this causes problems in other places, where the passed argument is not const. And since this is not mission-critical, who cares. :)

Overload? Idioms are mixed in that class anyway.
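A minimal sketch of the overload idea, assuming the complaint was about binding a temporary such as Color(r,g,b) to a non-const reference parameter (standard C++ rejects that; Visual C++ has historically accepted it as an extension). The thread does not show the real signatures, so Color, Material, SetColor's parameter types and LastOverload are stand-ins for illustration only.

```cpp
// Stand-in types: the real classes in the tracer are not shown in this thread.
struct Color
{
    float r, g, b;
    Color(float r_, float g_, float b_) : r(r_), g(g_), b(b_) {}
};

class Material
{
public:
    Material() : m_color(0.0f, 0.0f, 0.0f), m_lastOverload(0) {}

    // Assumed original signature: binds only to named, non-const objects.
    void SetColor(Color& c) { m_color = c; m_lastOverload = 1; }

    // Added overload: binds to temporaries and to const objects.
    void SetColor(const Color& c) { m_color = c; m_lastOverload = 2; }

    const Color& GetColor() const { return m_color; }

    // Demo-only helper reporting which overload ran last (1 or 2).
    int LastOverload() const { return m_lastOverload; }

private:
    Color m_color;
    int m_lastOverload;
};
```

With both overloads present, existing call sites that pass a non-const object keep resolving to the first version, while SetColor( Color(r,g,b) ) picks the const one.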

Uploaded.
http://ompf.org/alpha/bikker/rtStage2_may_9th_2006.zip

Phantom wrote:I didn't have to go through any kind of special trouble to create a tree directly in a renderable format.

I'm not saying you had. Just that it cannot be easily extended to properly support 64bitness and that realloc-ing is just plain wrong (that's not a freaking vector, that's a tree damnit!). :P

As much as i'd prefer to work on a collaborative effort, i'm not going to chase Mr Bikker releases. Set up a sourceforge project or better something on savannah or ...
tbp
Overlord
 
Location: France

Postby tbp on 10 May 2006, 10:09

A quick excerpt that should help a bit with X11 (it's a tad crappy but eh...).
http://ompf.org/tidbits/x11.cpp
tbp
Overlord
 
Location: France

kudos, questions, etc

Postby madmethods on 10 May 2006, 20:53

First off, kudos for putting this package out.

First question, do people have some basic timing results from this package that I can compare to? (both for building and tracing)

Second question, just to confirm -- none of this (building or tracing) is currently multi-threaded?

-G
madmethods
 

Re: kudos, questions, etc

Postby madmethods on 10 May 2006, 21:07

Okay, forget the multithreading question (already answered by snooping around :).

Still curious about timings.

-G
madmethods
 

Just a few words ...

Postby Shadow007 on 11 May 2006, 11:27

Just a few words to say that I've begun to include the source in my Ogre plugin. At the moment, my goal is to overlay the raytraced rendering on top of the Ogre one. I got that partly working. So far, I'm still only showing the KDtree compilation in front of the image. I remade it this morning to work out the minimal state I need in order to integrate the next source release.

At the moment, the scene used is still the one provided by Jakko.

The KDTree compilation being quite long (10 minutes in debug), I'll try to get it "split" (by making it non-recursive first) so that I can build the tree over more than 1 frame. (I couldn't find a way to do the ShowWindow thing).

I guess it won't be good for the building length, but it should be OK.
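A minimal sketch of what "un-recursing" the build could look like, assuming a much simpler builder than the one in the release: it splits a list of 1D primitive centers at the median, whereas the real compiler works on 3D geometry and chooses its split planes differently. IncrementalBuilder, Node and Step are hypothetical names. The recursion is replaced by an explicit to-do stack, so the host application can call Step with a small budget once per frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Children are indices into the node array (-1 = none); a node covers the
// range [first, first + count) of the center array.
struct Node
{
    int left;
    int right;
    std::size_t first;
    std::size_t count;
};

class IncrementalBuilder
{
public:
    IncrementalBuilder(std::vector<float> centers, std::size_t maxLeaf)
        : m_centers(std::move(centers)), m_maxLeaf(maxLeaf == 0 ? 1 : maxLeaf)
    {
        const Node root = { -1, -1, 0, m_centers.size() };
        m_nodes.push_back(root);
        m_todo.push_back(0);
    }

    // Processes at most 'budget' nodes; returns true once the tree is complete.
    bool Step(std::size_t budget)
    {
        for (std::size_t i = 0; i < budget && !m_todo.empty(); ++i)
        {
            const int idx = m_todo.back();
            m_todo.pop_back();
            const Node n = m_nodes[idx];        // copy: push_back below may reallocate
            if (n.count <= m_maxLeaf) continue; // small enough, stays a leaf

            // Median split: the lower half ends up left of 'half'.
            const std::size_t half = n.count / 2;
            std::vector<float>::iterator begin = m_centers.begin() + n.first;
            std::nth_element(begin, begin + half, begin + n.count);

            const int l = static_cast<int>(m_nodes.size());
            const Node leftChild = { -1, -1, n.first, half };
            const Node rightChild = { -1, -1, n.first + half, n.count - half };
            m_nodes.push_back(leftChild);
            m_nodes.push_back(rightChild);
            m_nodes[idx].left = l;
            m_nodes[idx].right = l + 1;
            m_todo.push_back(l);
            m_todo.push_back(l + 1);
        }
        return m_todo.empty();
    }

    const std::vector<Node>& Nodes() const { return m_nodes; }
    const std::vector<float>& Centers() const { return m_centers; }

private:
    std::vector<float> m_centers;
    std::size_t m_maxLeaf;
    std::vector<Node> m_nodes;
    std::vector<int> m_todo; // explicit stack replacing the recursion
};
```

Calling Step(budget) from the frame loop until it returns true spreads the build over several frames; the total work is the same as for the recursive version, plus a little bookkeeping for the stack.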

The step after that will be to capture the primitives from the Ogre engine, and add them sequentially to the scene.

Then, I'll have to check allocations/deallocations so that I'm able to avoid memory leaks after compiling/rendering a set of scenes (as opposed to only one rendered for the process's duration).

I've only got one question, about the use of static classes/members: is that for performance reasons?
Shadow007
 

Postby Phantom on 11 May 2006, 11:32

Yes it is. :)
Phantom
Overlord
 
Location: Houten, Netherlands

Postby toxie on 12 May 2006, 11:26

Played around with Pluecker in my own source(s) but it didn't work out for me (even with 4x4).

But i can suggest 2 small optimizations to your code:
a) In the tri intersection loop over the 4 ray sub-packets it helps to check if the current sub-packet is valid/not masked (otherwise don't calculate the intersection for this sub-packet)
b) (v0s & v1s & v2s) | ((v0s^0xf) & (v1s^0xf) & (v2s^0xf)) is the same as
(v0s & v1s & v2s) | ((v0s | v1s | v2s)^0xf) (at least if i haven't gone braindead during the last year ;))
so this fits better to your (v0s | v1s | v2s) < 0xf test.
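The rewrite in (b) follows from De Morgan's law restricted to 4 bits: (~a & ~b & ~c) equals ~(a | b | c), and x^0xf is the 4-bit complement of x. A small sketch that checks it exhaustively, assuming v0s, v1s and v2s are 4-bit masks with one bit per ray of a sub-packet; the function names are made up for this check.

```cpp
#include <cstdint>

// Original form: bit i is set when the three masks agree on bit i
// (all set, or all clear).
inline std::uint32_t AgreeOriginal(std::uint32_t v0s, std::uint32_t v1s, std::uint32_t v2s)
{
    return (v0s & v1s & v2s) | ((v0s ^ 0xf) & (v1s ^ 0xf) & (v2s ^ 0xf));
}

// Shorter form: reuses (v0s | v1s | v2s), which the existing
// (v0s | v1s | v2s) < 0xf test already needs.
inline std::uint32_t AgreeShort(std::uint32_t v0s, std::uint32_t v1s, std::uint32_t v2s)
{
    return (v0s & v1s & v2s) | ((v0s | v1s | v2s) ^ 0xf);
}

// Compares both forms for every combination of three 4-bit masks
// (16 * 16 * 16 = 4096 cases); returns true if they never differ.
inline bool FormsAgreeEverywhere()
{
    for (std::uint32_t a = 0; a < 16; ++a)
        for (std::uint32_t b = 0; b < 16; ++b)
            for (std::uint32_t c = 0; c < 16; ++c)
                if (AgreeOriginal(a, b, c) != AgreeShort(a, b, c))
                    return false;
    return true;
}
```

The identity only holds while the masks stay within the low 4 bits, which is the case if they come from a 4-wide sign or compare mask.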
toxie
 
