--
Jon Olick
id Software
http://www.mobygames.com/developer/sheet/view/developerId,54871/
Heavenly Sword (2007), Sony Computer Entertainment Europe Ltd.
Ratchet & Clank Future: Tools of Destruction (2007), Sony Computer Entertainment America, Inc.
Warhawk (2007), Sony Computer Entertainment America, Inc.
MotorStorm (2006), Sony Computer Entertainment Incorporated
Resistance: Fall of Man (2006), Sony Computer Entertainment Incorporated
Jak X: Combat Racing (2005), SCEA
Men of Valor (2004), Vivendi Universal Games, Inc.
Medal of Honor: Allied Assault (2002), Electronic Arts, Inc.
CIA Operative: Solo Missions (2001), ValuSoft, Inc.
toxie wrote:ray tracey, you're a stalker..
toxie wrote:uhmmm.. wow.. so all the talking/speculation about id using some voxel grid ray tracing/casting were indeed true..
are you allowed to leak some details at this point?
(rough info on geometry handling vs. textures vs. dynamic/moving/new objects vs. lighting vs. ...)
Ray Tracey wrote:Wow, very impressive track record: http://www.mobygames.com/developer/sheet/view/developerId,54871/
Heavenly Sword (2007), Sony Computer Entertainment Europe Ltd.
Ratchet & Clank Future: Tools of Destruction (2007), Sony Computer Entertainment America, Inc.
Warhawk (2007), Sony Computer Entertainment America, Inc.
MotorStorm (2006), Sony Computer Entertainment Incorporated
Resistance: Fall of Man (2006), Sony Computer Entertainment Incorporated
Jak X: Combat Racing (2005), SCEA
Men of Valor (2004), Vivendi Universal Games, Inc.
Medal of Honor: Allied Assault (2002), Electronic Arts, Inc.
CIA Operative: Solo Missions (2001), ValuSoft, Inc.
Lead Programmer, Naughty Dog
SCE, Playstation Edge
and now id!
on topic: do you perform raycasting on the GPU or CPU?
beason wrote:Sweet!
But, as a class, does that mean it will have very limited seating? Say, around 30?
I want to witness the awesome tech but don't want to either stand in line for hours or take up a spot from a more applicable participant.
Lastly, a nub question: why do ray casting for the scene? What does ray casting give you that the traditional hw API can't do? If I had to guess, I would say you could get more polys on screen, but maybe only by a factor of 10 or less, depending on system memory vs. video card memory. (I don't do much hw rasterization so I don't know its limits.)
Zelex wrote:The primary benefit to a single player game is that it gives you a generational skip in geometric complexity (which is about 8x additional geometry). However, there are lots of development pipeline benefits behind the scenes that come in for free. If you never have to worry about geometric complexity and you never have to worry about texture detail, then development is greatly simplified. You can simply throw a ton of artists at the problem without worrying about impacts on performance or runtime memory requirements. That's huge and should save lots of development effort.
Zelex wrote:Currently using CUDA, but I plan on trying out Larrabee.
I’ve been pitching this idea to both NVIDIA and Intel and just everybody about directions as we look toward next generation technologies.
PCPER: How dramatic would a hardware change have to be to take advantage of the structures you are discussing here?
CARMACK: It’s interesting in that the algorithms would be something that, it’s almost unfortunate in the aspect that these algorithms would take great advantage of simpler bit-level operations in many cases and they would wind up being implemented on this 32-bit floating point operation-based hardware. Hardware designed specifically for sparse voxel ray casting would be much smaller and simpler and faster than a general purpose solution but nobody in their right mind would want to make a bet like that and want to build specific hardware for technology that no one has developed content for. The idea would be that you have to have a general purpose solution that can approach all sorts of things and is at least capable of doing the algorithms necessary for this type of ray tracing operation at a decent speed. I think it’s pretty clear that that’s going to be there in the next generation. In fact, years and years ago I did an implementation of this with complete software based stuff and it was interesting; it was not competitive with what you could do with hardware, but it’s likely that I’ll be able to put something together this year probably using CUDA. If I can make something that renders a small window at a modest frame rate and we can run around some geometrically intricate sparse voxel octree world and make a 320x240 window at 10 fps and realize that on next-generation hardware that’s optimized more for doing this we can go ahead and get 1080p 60 Hz on there.
Zelex wrote:Hybrid rendering
Characters/Dynamic Objects with traditional rendering techniques
World with raycasting (and is completely static, so lighting is baked)
Ray Tracey wrote:So no more hexagonal trees and tubes in the next generation
Ray Tracey wrote:Can you tell what performance you're getting on a GTX280 for a 720p scene?
Ray Tracey wrote:Which architecture (Larrabee or GPU) do you think is better for sparse voxel ray casting?
Ray Tracey wrote:Carmack said in the PCPer interview:
I’ve been pitching this idea to both NVIDIA and Intel and just everybody about directions as we look toward next generation technologies.
and
PCPER: How dramatic would a hardware change have to be to take advantage of the structures you are discussing here?
CARMACK: It’s interesting in that the algorithms would be something that, it’s almost unfortunate in the aspect that these algorithms would take great advantage of simpler bit-level operations in many cases and they would wind up being implemented on this 32-bit floating point operation-based hardware. Hardware designed specifically for sparse voxel ray casting would be much smaller and simpler and faster than a general purpose solution but nobody in their right mind would want to make a bet like that and want to build specific hardware for technology that no one has developed content for. The idea would be that you have to have a general purpose solution that can approach all sorts of things and is at least capable of doing the algorithms necessary for this type of ray tracing operation at a decent speed. I think it’s pretty clear that that’s going to be there in the next generation. In fact, years and years ago I did an implementation of this with complete software based stuff and it was interesting; it was not competitive with what you could do with hardware, but it’s likely that I’ll be able to put something together this year probably using CUDA. If I can make something that renders a small window at a modest frame rate and we can run around some geometrically intricate sparse voxel octree world and make a 320x240 window at 10 fps and realize that on next-generation hardware that’s optimized more for doing this we can go ahead and get 1080p 60 Hz on there.
Will "next-generation" graphics cards (or Larrabee) contain special hardware to accelerate the ray casting and if so, what performance boost could be expected from that?
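To put the quoted 320x240 at 10 fps prototype next to the 1080p 60 Hz target, here is a rough back-of-the-envelope sketch. It assumes one primary ray per pixel and takes 1080p to mean 1920x1080; both are assumptions made for illustration, not figures from the interview.

```python
def rays_per_second(width, height, fps):
    """Primary rays cast per second, assuming exactly one ray per pixel."""
    return width * height * fps


prototype = rays_per_second(320, 240, 10)    # the small test window
target = rays_per_second(1920, 1080, 60)     # assumed meaning of "1080p 60 Hz"

print(prototype)            # rays/s for the prototype
print(target)               # rays/s for the target
print(target / prototype)   # how much more throughput the target needs
```

Under those assumptions the target needs 162 times the ray throughput of the prototype, which gives a sense of how much of the gap the hoped-for hardware would have to close.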
Ray Tracey wrote:So is it correct to say that the world is raycasted voxels and characters/dynamic objects is rasterized polygons?
That's very interesting. Do you think GPUs are going to have/need generic caches?
I basically explain how it's kind of the destiny of graphics cards to have hardware that enhances raycasting, as it will likely be developed to enhance rasterization and then raycasting can piggyback on the technology.
If they plan to compete with Larrabee, they better
Ray Tracey wrote:Do you think GPUs are going to have/need generic caches?
ingenious wrote:This may be a lame question here, but what do you exactly mean by "sparse voxel octree ray casting" ? A practical application of ray tracing through a good old octree in order to get some nice secondary/volume effects in a game? And why grids? Because they're useful for things other than ray casting? Because their traversal is stackless? Because they're cheap to build/store?
Zelex wrote:ingenious wrote:This may be a lame question here, but what do you exactly mean by "sparse voxel octree ray casting" ? A practical application of ray tracing through a good old octree in order to get some nice secondary/volume effects in a game? And why grids? Because they're useful for things other than ray casting? Because their traversal is stackless? Because they're cheap to build/store?
Lots of good questions. The primary difference between this approach and a typical raycaster is that in this approach you don't have polygons. The oct-tree is the geometry. So why do it this way? You get unique texturing and unique geometry as a side effect, and it turns out that this is incredibly good for CUDA / Larrabee. I'll attempt to answer those questions in a bit more detail in the talk.
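A minimal sketch of what "the oct-tree is the geometry" can mean in practice: space is stored as a tree of occupied cells that each carry a color, and empty octants are simply absent. This is an illustrative toy, not the id Software data structure; `Node`, `insert` and `lookup` are hypothetical names, and the interior-node color is a crude stand-in for a properly filtered value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Leaf: the voxel color. Interior node: a coarse stand-in color
    # (here simply the first color inserted below it).
    color: tuple
    # Child octant index (0..7) -> Node. A missing key means empty space.
    children: dict = field(default_factory=dict)


def child_index(x, y, z, half):
    """Octant of (x, y, z) inside a cube of side 2*half with its min corner at the origin."""
    return int(x >= half) | (int(y >= half) << 1) | (int(z >= half) << 2)


def insert(node, x, y, z, size, color):
    """Store one voxel at integer (x, y, z) in a cube of side `size` (a power of two)."""
    if size == 1:
        node.color = color
        return
    half = size // 2
    idx = child_index(x, y, z, half)
    child = node.children.get(idx)
    if child is None:
        child = node.children[idx] = Node(color)
    # Re-express the position relative to the chosen child cube.
    insert(child, x % half, y % half, z % half, half, color)


def lookup(node, x, y, z, size, max_depth=32):
    """Return the color at (x, y, z), or None for empty space.

    Stopping early at max_depth returns the coarser interior color,
    which is how the same tree can serve as its own level of detail.
    """
    depth = 0
    while size > 1 and depth < max_depth:
        half = size // 2
        child = node.children.get(child_index(x, y, z, half))
        if child is None:
            return None
        node, x, y, z, size = child, x % half, y % half, z % half, half
        depth += 1
    return node.color


root = Node((0, 0, 0))
insert(root, 5, 2, 7, 8, (255, 0, 0))   # one red voxel in an 8x8x8 world
print(lookup(root, 5, 2, 7, 8))          # the voxel itself
print(lookup(root, 0, 0, 0, 8))          # empty space
```

Geometry and "texture" live in the same nodes here, and a single voxel in an 8x8x8 world costs four nodes rather than 512 grid cells.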
JohnTsakok wrote:I'm interested in the resolutions you guys can render now, and whether the actual voxels are small enough not to be visible. Also, for texture mapping, are textures independent of the geometry, or is the geometry the actual texture, since artists now just paint geometry...
Nobody's-Octree wrote:This octree representation may be very useful for LEGO computer games - a very bricky solution
Is it something like in this paper from Klein et al. : http://cg.cs.uni-bonn.de/default.asp?page=http://www.cg.cs.uni-bonn.de/publications/publication.asp?id=247&language=en ?
Zelex wrote:the geometry and texture are the same thing.
bouliiii wrote:the streamer has to be really good when the point of view is moving fast
Brian Karis wrote:My colleagues and I have been doing some research in this area and the closest paper we have found is this:
http://www.crs4.it/vic/cgi-bin/bib-page ... :2008:SGR'
close second:
http://artis.imag.fr/Publications/2008/CNL08/
For those of us unable to attend will there be any papers, powerpoints, videos, etc. posted afterwards?
bouliiii wrote:Finally, I am not sure that a virtual sparse octree is the only solution for handling huge quantities of geometry. Brute-force tessellation also gives very good results. I wrote a small application which is already able to output subdivision surfaces at 500 million triangles/sec on an 8800GT, i.e. almost the theoretical limit of the rasterizer. The tessellation also gives a very cache-efficient result. For the production of data, I am not sure whether it is better or worse than an octree. Tessellation is natural with characters (think of ZBrush, which easily provides a coarse mesh and its displacement map) and of course with terrains. Of course, octrees are really good in that they can be painted directly and require no data about the topology of the surface. However, mega-texturing already provides the ability to paint everything. So for the future:
megatexture + tessellation > virtual sparse octree or megatexture + tessellation < virtual sparse octree ??
I have no firm opinion on it myself. But we are currently investigating the first solution for the current gaming platforms.
Ben
PS: Zelex: you did not try to implement the octree on the CELL ?! It may be nice to have such a nice implementation on dev-net like Edge.
Ray Tracey wrote:So basically you're doing quantum physics now (Schrödinger's cat in box is dead and alive at the same time)?
Ray Tracey wrote:I would like to know that as well. Is there a lag or some kind of progressive refinement happening when you're rapidly streaming in new voxel sets and is this where the lack of caches on GPU's hurts the most?


jogshy wrote:- Assume you want a moderate view distance (like 500 m or 1 km in an outdoor scene).
- Assume a 256 MB graphics card (think of a PS3, for example).
- 1 voxel = RGBA color (4 bytes) + 1 quantized normal (let's say... 1 byte) + 1 materialID (for properties like specular, etc...) = 6 bytes
- That's a 355^3 voxel scene.
How can 355x355x355 be enough for an outdoor scene like a beach+mountains in Crysis? Ok, Ok... you can perform some kind of adaptive voxeling so the furthest voxels won't be displayed at full detail. Ok, a tri-linear LOD can help, but I bet you'll need something like 5k^3 to get a good result for a Crysis/Quake Wars outdoor level (because you want clouds, mountains, forest, etc...). ZBrush uses pixols (voxels with extended data)... but a model is not a large scene...
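The 355^3 figure can be checked with a few lines of arithmetic. The sketch below assumes the card's memory is exactly 256 MiB and entirely available for voxels, and it reads "5k^3" as 5000^3; both are assumptions for illustration. It describes a dense grid only, where every cell is stored whether occupied or not.

```python
def dense_grid_side(budget_bytes, bytes_per_voxel):
    """Largest N such that a fully populated N x N x N grid fits in the budget."""
    voxels = budget_bytes // bytes_per_voxel
    side = round(voxels ** (1.0 / 3.0))
    # Correct any floating-point rounding in the cube root.
    while side ** 3 > voxels:
        side -= 1
    while (side + 1) ** 3 <= voxels:
        side += 1
    return side


def dense_grid_bytes(side, bytes_per_voxel):
    """Memory needed by a fully populated side^3 grid."""
    return side ** 3 * bytes_per_voxel


BYTES_PER_VOXEL = 4 + 1 + 1      # RGBA + quantized normal + material ID
BUDGET = 256 * 1024 * 1024       # assumption: the whole 256 MiB holds voxels

print(dense_grid_side(BUDGET, BYTES_PER_VOXEL))               # grid side that fits
print(dense_grid_bytes(5000, BYTES_PER_VOXEL) / 1e9, "GB")    # cost of a dense 5000^3 grid
```

This gives a side of 355 for the budget above and 750 GB for a dense 5000^3 grid at 6 bytes per voxel, which is the gap the post is pointing at. How much of that a sparse representation and streaming would recover is exactly the open question in the thread.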
jogshy wrote:On the other hand... what do you do if, for example, you change the point of view suddenly (think of Portal)... how do you stream that without a big loading laaaaaaaag? Even more so in an FPS game, where frames per second are important...
jogshy wrote:And... what about dynamic objects? You could stream a static scene... but to voxelize a character, a barrel falling due to gravity, etc... you're going to need a super-fast octree construction.
jogshy wrote:And another problem... do you convert polygons to voxels using an offline tool or do you do it in real time? Or do your artists model directly using voxels/point clouds?
jogshy wrote:Perhaps a uniform grid + simple raycast could better handle uniformly-distributed scenes? Uniform grids can be dynamically updated at lightspeed and they are cache-friendly and easily parallelizable... but the teapot-in-stadium can be a problem when you traverse it...
jogshy wrote:Btw.. have you seen this article?
Model made of 1,5M voxels:
Tons of instances of that model with shadows and reflection:
Zelex wrote:jogshy wrote:On the other hand... what do you do if, for example, you change the point of view suddenly (think of Portal)... how do you stream that without a big loading laaaaaaaag? Even more so in an FPS game, where frames per second are important...
There are a few techniques to mitigate that problem, but Quake Wars solved a similar issue by dropping the player in by parachute. So there are technology solutions and game play solutions to that issue.
Ray Tracey wrote:Have you been experimenting with solid state drives yet and do you think they will be cheap enough for next generation systems?
Zelex wrote:Doesn't the Wii already have a solid state drive?
Point taken (damn Wii)
Zelex wrote:If Microsoft and Sony want their consoles to be entertainment centers, they are going to need massive storage. So, probably not solid state for this next generation, but one can hope!
Zelex wrote:jogshy wrote:Perhaps a uniform grid + simple raycast could better handle uniformly-distributed scenes? Uniform grids can be dynamically updated at lightspeed and they are cache-friendly and easily parallelizable... but the teapot-in-stadium can be a problem when you traverse it...
I don't think a uniform grid will work for most games.
straaljager wrote:Could you elaborate on that? Is it mainly because of the teapot in the stadium problem or are there other limitations compared to octrees?
straaljager wrote:BTW Do you know if John Carmack is going to give some details on the sparse voxel octree raycasting in his Quakecon keynote this Thursday?
Clément ELBAZ wrote:I am still a bit puzzled about one fact : you seem to say that sparse voxel octree can only handle static geometry, at least for now.
...
Clément ELBAZ wrote:Hardware limitations can be bypassed.
Clément ELBAZ wrote:Games like Crysis, STALKER, even Doom 3 have succeeded in providing complex non-static environments.
Clément ELBAZ wrote:Carmack talked about a proof-of-concept released this year, do you know if it will be publicly available ?
Clément ELBAZ wrote:A straightforward method for handling octree-based dynamic scenes composed of instances of static objects is to have one octree per "base object", and to shoot rays, converted per instance from world space to octree space, into the tree. That handles multiple instances of the data contained in the octree without problems. Would that kind of approach be feasible with SVO?
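The per-instance conversion described above can be sketched as follows. This is only an illustration of the general idea: it assumes each instance is placed by a rotation, a uniform scale and a translation, and the function names are hypothetical. The octree itself is never touched; only the ray is moved into the object's space.

```python
import math


def rotation_y(angle):
    """3x3 rotation about the Y axis, as a tuple of rows."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, 0.0, s), (0.0, 1.0, 0.0), (-s, 0.0, c))


def transpose(m):
    return tuple(tuple(m[c][r] for c in range(3)) for r in range(3))


def mat_vec(m, v):
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def world_ray_to_object(origin, direction, rotation, translation, scale=1.0):
    """Express a world-space ray in the space of one instance's octree.

    The instance places the object with
        p_world = rotation * (scale * p_object) + translation
    so the inverse is applied to the ray. The direction is divided by the
    scale as well, so a hit at parameter t means the same point in both spaces.
    """
    inv_rot = transpose(rotation)  # the inverse of a pure rotation is its transpose
    shifted = tuple(o - t for o, t in zip(origin, translation))
    local_origin = tuple(c / scale for c in mat_vec(inv_rot, shifted))
    local_dir = tuple(c / scale for c in mat_vec(inv_rot, direction))  # no translation for directions
    return local_origin, local_dir


# One shared octree, two instances: only the ray changes per instance.
ray_origin, ray_dir = (10.0, 0.0, -4.0), (0.0, 0.0, 1.0)
instance_a = world_ray_to_object(ray_origin, ray_dir, rotation_y(0.0), (10.0, 0.0, 0.0), scale=2.0)
instance_b = world_ray_to_object(ray_origin, ray_dir, rotation_y(math.pi / 2), (0.0, 0.0, 0.0))
print(instance_a)
print(instance_b)
```

Each instance then costs a small transform per ray rather than a copy of the voxel data, which is the appeal of the scheme. Whether the extra work fits in the tracing kernel is the concern raised in the reply.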
Zelex wrote:
In some respects, you are correct. Certainly some rasterization tricks are still applicable like deferred shading and we plan on using them for dynamic objects. However, if you want to use ray tracing to light your environment entirely, the hardware just isn't fast enough for the resolutions required by next generation games.
Zelex wrote:
Doom 3 had static environments and dynamic objects, but non-static lighting. Rage and Half-life 2 have certainly proven that a static environment with mostly static lighting and dynamic objects produces compelling gameplay and visuals.
Zelex wrote:I highly doubt it, but you never know!
Zelex wrote:There are some complications in the details here. Because of register limitations, putting additional work into the ray casting kernel would make it noticeably slower, even if the code was rarely executed. So that means that you have to dynamically update the oct-tree to avoid that. Which means uploading more traffic to the GPU, or alternatively generating the modified oct-tree data structure on the GPU. In any of those cases, it means dynamically managing the allocation of nodes in the oct-tree, which adds all sorts of complications such as fragmentation-related cache issues. Doing node allocation on the GPU sounds nice but requires lots of extra data to do properly. And what happens if you need to modify a section of data which isn't loaded into memory? Do you modify that section at load time? If you do, then you have to coordinate your dynamic updating of the oct-tree with the GPU, which may also be dynamically updating the geometry. There is a different approach, which I talk about in my talk, which does not require updating of the oct-tree but would require slight modifications to the tracing code. That will affect performance, which is why I say that it's probably not a good idea for current generation technology. Emphasis on the "probably" and "current generation technology". If the next generation console technology is better than I expect it to be, some of these issues may no longer be a problem, in which case there is definitely a plan for that.
Clément ELBAZ wrote:
Mmmmh, so if I summarize as I understand it :
...
...
Clément ELBAZ wrote:If the VRAM is saturated with the main octree, we are indeed a bit stuck. Maybe there is room for a second octree, with less memory allocated to it, that would hold constantly changing data? You could instantiate the same object many times by simply changing the ray directions, as I described previously. With this kind of "instancing", the bandwidth would be less problematic (but it implies that dynamic geometry is not unique; I'm aware that fits pretty badly with the id Tech 5 and 6 philosophy). You could also upload a "frame" of an animated pack of voxels (or two frames to blend, or three or four to blend even more smoothly...).
As we increase the size of the dynamic octree, we would probably have to decrease the size of the static one, but remember it could be kept coherent with the dynamic/static ratio of the rendered world: as more parts of the world become dynamic and fewer stay static, the sizes of the two octrees change accordingly.
Rendering two octrees doesn't seem to pose very serious problems. You render the static one, obtain the depth for each pixel on the screen, then you render the dynamic one and z-kill the dynamic pixels against the depth of the static ones to obtain a proper fit between the two images.
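The depth-based merge of the two renders can be sketched per pixel as below. It is a toy version on flat Python lists, assuming each pass has already produced a color and a depth per pixel and that a ray which hits nothing reports infinite depth; `composite` is a hypothetical name.

```python
MISS = float("inf")  # depth reported by a ray that hit nothing


def composite(static_color, static_depth, dynamic_color, dynamic_depth):
    """Merge two per-pixel renders, keeping whichever surface is nearer.

    A dynamic pixel is discarded ("z-killed") unless it is strictly closer
    than the static surface already stored for that pixel.
    """
    out_color, out_depth = [], []
    for sc, sd, dc, dd in zip(static_color, static_depth, dynamic_color, dynamic_depth):
        if dd < sd:
            out_color.append(dc)
            out_depth.append(dd)
        else:
            out_color.append(sc)
            out_depth.append(sd)
    return out_color, out_depth


# Three pixels: the wall in front, the character in front, nothing dynamic.
colors, depths = composite(
    ["wall", "wall", "sky"], [1.0, 5.0, MISS],
    ["hero", "hero", None], [2.0, 3.0, MISS],
)
print(colors)
print(depths)
```

The same test works whether the dynamic pass is another ray cast or a rasterized mesh, as long as both passes write comparable depth values.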
Zelex wrote:That's an awful lot of assumptions based on a single paragraph. I would love to explain it all to you, but I don't really have the time at the moment. I'll do my best to answer all your questions in the SIGGRAPH talk.
Zelex wrote:Certainly a valid technique, however that doesn't solve many of the problems I mentioned.
for each frame
    // Drawing static
    if( static octree deprecated ) stream to GPU the static octree update
    ray trace static octree into gBuffer using point of view aligned rays
    // Handling of dynamic objects
    for each type of dynamic object
        load the dynamic octree on the GPU with data of this dynamic object from the RAM
        for each instance of this dynamic object in the world
            load the matrix World, matrix world of this instance
            create matrix WorldView, multiplication of World with the point of view matrix
            ray trace dynamic octree into gBuffer, using WorldView aligned rays (and zkill the result against the current gBuffer).
    // Handling dynamic lighting
    get shaded image from deferred shader and current gBuffer.
    display shaded image !
Clément ELBAZ wrote:Too bad I will not attend SIGGRAPH
Currently, what are you working on ?
Clément ELBAZ wrote:Sure, generally speaking, dynamic environments can't be as optimized as static ones, that is a fact.
...
Zelex wrote:bouliiii wrote:Finally, I am not sure that a virtual sparse octree is the only solution for handling huge quantities of geometry. Brute-force tessellation also gives very good results. I wrote a small application which is already able to output subdivision surfaces at 500 million triangles/sec on an 8800GT, i.e. almost the theoretical limit of the rasterizer. The tessellation also gives a very cache-efficient result. For the production of data, I am not sure whether it is better or worse than an octree. Tessellation is natural with characters (think of ZBrush, which easily provides a coarse mesh and its displacement map) and of course with terrains. Of course, octrees are really good in that they can be painted directly and require no data about the topology of the surface. However, mega-texturing already provides the ability to paint everything. So for the future:
megatexture + tessellation > virtual sparse octree or megatexture + tessellation < virtual sparse octree ??
I have no firm opinion on it myself. But we are currently investigating the first solution for the current gaming platforms.
Ben
PS: Zelex: you did not try to implement the octree on the CELL ?! It may be nice to have such a nice implementation on dev-net like Edge.
What you say is definitely true. An infinite geometry engine is definitely possible on current generation hardware. I wrote one while at Sony. However, it requires lots of artist man-hours to tweak the art data into submission. Ray casting is kind of the next evolution of that kind of technology, where the goal is to make it easy and economical. There are still plenty of hurdles to clear before that goal is met in my critical eyes, but I'm making great strides towards it.
Clément ELBAZ wrote:Mmmmh, so if I summarize as I understand it :
- In your implementation, the octree never leaves the GPU. However, you previously talked about progressive refinement. I guess it is kept to minimal traffic, so am I correct in saying that for a static scene and a static point of view, the octree stays in the GPU indefinitely and no octree data is sent to the GPU each frame?