> Jag Raytracer
Orion_
post 11 May 2006, 20:05
Post #1


Rick dangerous
***

Group: Level1
Posts: 1.043
Joined: 2-1 06
Member No.: 29



This is a small, rough attempt at fixed-point raytracing on the Jaguar.
The raytracer is 100% GPU code (836 bytes).

I know it doesn't really look like a raytracer, but it actually is one :)
The bad colors and precision are due to the fixed point :/
It's not optimised; it was only a little technical test to see how fast this would be on the GPU ^^
This was coded on the Project Tempest emulator, but it also works on a real Jaguar, except that there seem to be some glitches with the Jagware logo :/

Anyway, enjoy :)

(Promise, I will try to start making some more useful stuff from now on ^^)

Jag Raytracer



--------------------
My website with all my homebrew projects!

"That's where you see the superiority of vitamin C over Dragibus!" - Fadest, RGC 2008
GT Turbo
post 11 May 2006, 20:11
Post #2


Another world
****

Group: Administrators
Posts: 3.204
Joined: 13-5 05
From: Alsace, France
Member No.: 2



QUOTE (Orion_ @ 11 May 2006, 21:05) *
(Promise, I will try to start making some more useful stuff from now on ^^)


Nice attempt, Orion, we're waiting for the next thing!


GT


--------------------
Come do the Poulpe dance with us! (Sh3-RG)

C Vitamin is Superior to Dragibus !! (@Fadest) C.V. Rules !!

'asm is what real coders use, all the others are script kiddies' (@CJ)

'HLL is High Level Lamer' (@CJ)

'for each routine we spend 1 week in development and 2 months in optimising GOOD WORK, C' (@GGN)

C.V.S.D. member of the Jagware community
Zerosquare
post 11 May 2006, 20:25
Post #3


Rick dangerous
***

Group: Administrators
Posts: 2.096
Joined: 4-1 06
Member No.: 30



Lovely!


--------------------
"My PC is so unstable you'd think it was an Amiga" – GT Turbo
"Let A be a level of absurdity; there exists a post N such that..." – Azrael et al., 2006
Symmetry of TNG
post 13 May 2006, 09:45
Post #4


Great giana sister
*

Group: Level1
Posts: 146
Joined: 31-3 06
From: Asmroad 101
Member No.: 40



Cool! =)

I have no idea how it actually works, but I can't imagine it is what I assume "raytracing" to be (i.e. a lot of vector algebra, cross products, dot products and so on), since I would expect that code to be bigger than 836 bytes ;)
There must be some simplification involved... or?

Is that 3 shaded spheres, or just one (the green one)? I mean, regarding fixed point, why do the red and blue ones look more smoothly shaded than the green one? Wouldn't the math be the same?
Are there some Z values involved? And how do you calculate the shade values?

Just curious :D

Is it .8 or .16 fixed point (or signed 7.8 or something)?
And sorry for being "nosy" :) but I'm just curious whether you use vector math or not, and whether that is the precision you get with fixed point.

It's nice work, Orion! =)



Some thoughts:
- Could the DSP do half of the screen? (i.e. ~2x faster; the "~" is because of the 16-bit bus to the DSP)

- Do you write pixels to screen space with external stores (storew)?
A good idea would be to let the GPU build, say, 1 scanline of the screen in GPU memory, and then phrase-blit the whole scanline to screen space, killing all external stores!
Combine this with the 1st point, letting the DSP do the same thing for the other half of the screen...

- Another GPU trick is to do 2 pixels at the same time. (I don't know exactly how the inner loop looks, but) usually you can optimise it by interleaving 2 similar calculations, building 2 pixels at the same time and making 1 store to GPU memory (storing two 16-bit pixels at once), and then after a scanline doing the phrase blit to screen...
(This might double part of the code size (and the register usage), since you have to do the same thing twice, and you might need special "pre-loop" setups to get it working (common on the Falcon's DSP). But you kill all the wait states you might have in the code and halve the loop count, making use of true RISC power (1 tick/instruction).)
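
To make the scanline-buffer and two-pixels-per-store ideas above a bit more concrete, here is a rough Python sketch, not Jaguar code. Everything in it is an assumption for illustration: the 320-pixel width, the `shade` stand-in for the per-pixel work, the left pixel going in the high half of the word, and `blit_scanline` standing in for the phrase blit.

```python
# Hypothetical sketch: build one scanline in a local buffer, two 16-bit
# pixels per 32-bit word, then copy the whole line out in one step.
WIDTH = 320  # assumed screen width in pixels


def shade(x, y):
    """Stand-in for the real per-pixel work; returns a 16-bit pixel value."""
    return (x * 7 + y * 13) & 0xFFFF


def build_scanline(y):
    """Return WIDTH // 2 words, each holding two adjacent 16-bit pixels."""
    line = []
    for x in range(0, WIDTH, 2):           # half the loop count
        left = shade(x, y)                 # the two calculations would be
        right = shade(x + 1, y)            # interleaved in real RISC code
        line.append((left << 16) | right)  # one 32-bit store for two pixels
    return line


def blit_scanline(screen, y, line):
    """Stand-in for the phrase blit: copy the finished line in one go."""
    start = y * (WIDTH // 2)
    screen[start:start + len(line)] = line


def unpack(word):
    """Split a 32-bit word back into its (left, right) 16-bit pixels."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF
```

The point of the structure is that the inner loop only ever touches the local `line` buffer; the external memory is written once per scanline instead of once per pixel.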

>I will try to start making some more useful stuff from now on ^^)

Ohh... it was that then ;)
But this could be "useful"! ;) In my eyes it is definitely NOT a waste of time anyway :P

Cheers


--------------------
--Aim for the future or dwell in the past?--
--Support JagCF! & Gameplay-for-the-Masses!--
Orion_
post 13 May 2006, 10:26
Post #5


Rick dangerous
***

Group: Level1
Posts: 1.043
Joined: 2-1 06
Member No.: 29



QUOTE (Symmetry of TNG @ 13 May 2006, 10:45) *
I have no idea how it actually works, but I can't imagine it is what I assume "raytracing" to be (i.e. a lot of vector algebra, cross products, dot products and so on), since I would expect that code to be bigger than 836 bytes ;)
There must be some simplification involved... or?

It is real raytracing: only one light and 3 shaded spheres, no rotation, but it uses a lot of dot products, 2 square roots and some divisions, and that's for each pixel. The GPU is quite fast, I was impressed, but the fixed point makes this fast too.
I can release the source code if you want :) but it's not really clean source code, and it has some dirty last-minute hacks because I had problems with some distance comparisons.
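
For readers wondering what "dot products, square roots and divisions per pixel" can look like in fixed point, here is a minimal Python sketch of a ray-sphere intersection with 8 fractional bits. It is not the actual GPU code from this demo (which is not shown in this thread); the helper names `fx`, `fmul`, `fdot`, `fsqrt`, `normalize` and `intersect` are made up for illustration.

```python
from math import isqrt

FRAC = 8            # 8 fractional bits, as in a ".8" fixed-point format
ONE = 1 << FRAC     # fixed-point representation of 1.0


def fx(v):
    """Convert a float to fixed point (illustration helper)."""
    return int(round(v * ONE))


def fmul(a, b):
    """Fixed-point multiply: the raw product has 16 fractional bits."""
    return (a * b) >> FRAC


def fdot(a, b):
    """Dot product of two fixed-point 3-vectors."""
    return sum(fmul(x, y) for x, y in zip(a, b))


def fsqrt(a):
    """Fixed-point square root of a non-negative value."""
    return isqrt(a << FRAC)


def normalize(v):
    """Scale v to unit length: one square root and three divisions."""
    length = fsqrt(fdot(v, v))
    return [(x << FRAC) // length for x in v]


def intersect(origin, direction, center, radius):
    """Distance to the nearest hit along a unit-length ray, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = fdot(oc, direction)
    c = fdot(oc, oc) - fmul(radius, radius)
    disc = fmul(b, b) - c
    if disc < 0:
        return None              # the ray misses the sphere
    t = -b - fsqrt(disc)         # second square root
    return t if t > 0 else None
```

Every `fmul` throws away the low 8 bits of the product, so small differences between distances can vanish, which may be the kind of thing that makes distance comparisons awkward in a format like this.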

QUOTE
Is that 3 shaded spheres, or just one (the green one)? I mean, regarding fixed point, why do the red and blue ones look more smoothly shaded than the green one? Wouldn't the math be the same?
Are there some Z values involved? And how do you calculate the shade values?

The green one is closer to the camera than the 2 others, and I think the shaded colors are ugly because of the fixed point and the loss of precision :/
The shadevalue is calculated by doing a crossproduct between the light vector (from the light to the intersected point on the sphere) and the normal at the intersected point on the sphere surface.
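
As a hedged illustration of shading from a light vector and a surface normal, the sketch below uses the usual diffuse (Lambert) term, which is a dot product of the two unit vectors. That is an assumption about what is meant here, not the demo's actual code. The 8 fractional bits, the 32 shade levels and all function names are likewise illustrative.

```python
from math import isqrt

FRAC = 8
ONE = 1 << FRAC


def fmul(a, b):
    """Fixed-point multiply with 8 fractional bits."""
    return (a * b) >> FRAC


def fdot(a, b):
    """Dot product of two fixed-point 3-vectors."""
    return sum(fmul(x, y) for x, y in zip(a, b))


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    length = isqrt(fdot(v, v) << FRAC)
    if length == 0:
        return list(v)
    return [(x << FRAC) // length for x in v]


def shade(hit, center, light_pos, max_level=31):
    """Diffuse shade level in 0..max_level for a point on a sphere."""
    normal = normalize([h - c for h, c in zip(hit, center)])
    to_light = normalize([l - h for l, h in zip(light_pos, hit)])
    d = fdot(normal, to_light)
    d = max(0, min(ONE, d))      # clamp: facing away gives 0, rounding may exceed 1.0
    return (d * max_level) >> FRAC
```

With only 8 fractional bits in the normal and the light vector, neighbouring pixels often round to the same dot product, which is one plausible source of visible banding.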

QUOTE
Is it .8 or .16 fixed point (or signed 7.8 or something)?

signed .8 (else I think it will overflow)

QUOTE
- Could the DSP do half of the screen? (i.e. ~2x faster; the "~" is because of the 16-bit bus to the DSP)

- Do you write pixels to screen space with external stores (storew)?
A good idea would be to let the GPU build, say, 1 scanline of the screen in GPU memory, and then phrase-blit the whole scanline to screen space, killing all external stores!
Combine this with the 1st point, letting the DSP do the same thing for the other half of the screen...

(I use external storew)
Actually I was thinking of doing that before starting the raytracer; that's why I tried a simple test with the GPU, to predict how fast it would be using those optimisations. Even by reducing the screen to 160x120, I don't think I will do better than 2 fps, so I don't know if it would be worth it.


Symmetry of TNG
post 14 May 2006, 09:44
Post #6


Great giana sister
*

Group: Level1
Posts: 146
Joined: 31-3 06
From: Asmroad 101
Member No.: 40



So it's raytracing, nice! =)

Well, source code is always nice, and I'm interested in the topic; I could do a quick check to see whether I notice any obvious optimisation possibilities. So if you want to, then please do so =)

(I have some PC C source for this that I planned to dig into and convert, but I never got to that.)

>shadevalue is calculated doing crossproduct
You mean dot product? ;) But still, you mean you do a cross product at each pixel of the sphere to find the normal and then do a light*normal to get the shade? That's still A LOT of work (and it is the true "algebra way"), so then it isn't that bad speed-wise =)


>signed .8 (else I think it will overflow)
Well, it might still overflow if you do MACs. But with s7.8 (signed, 7 integer bits, 8 fractional bits) you get the correct sign with the built-in multiply instructions. If you go for s15.16 then you get higher precision, but you need a lot more work :(
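
A small Python sketch of the trade-off described above, simulating register widths by hand. The 32-bit accumulator in `mac_s7_8` is an assumption chosen for illustration, not a statement about the Jaguar hardware, and all function names are made up.

```python
def wrap(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def mul_s7_8(a, b):
    """s7.8 multiply: a signed 16x16 product always fits in 32 bits."""
    product = wrap(a * b, 32)    # sign comes out right with no extra work
    return product >> 8          # drop 8 of the 16 fractional bits


def mac_s7_8(pairs):
    """Sum of products in an assumed 32-bit accumulator; may wrap around."""
    acc = 0
    for a, b in pairs:
        acc = wrap(acc + a * b, 32)
    return acc >> 8


def mul_s15_16(a, b):
    """s15.16 multiply: the 32x32 product needs up to 64 bits before the shift."""
    return wrap((a * b) >> 16, 32)
```

For example, `mul_s7_8(-384, 512)` is -1.5 times 2.0 and gives -768, which is -3.0 in s7.8. The s15.16 version is the "a lot more work" case on a machine with only a 16x16 multiplier, since the wide product has to be assembled from several partial products.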


>I use external storew
Well, if there is nothing else on the bus then perhaps. But in the worst case 1 storew will take as long as it takes the OP to process all the objects in the object list, since the OP hogs the bus while it is processing the object list. Storing internally is a much better way, since the GPU can always do that independently of what the rest of the system is doing.

I did this when I did the maniac optimisation of my fire routine. The first version used storew, and it became much, much faster to build a scanline internally and phrase-blit it.
I noticed this even more with the "water" routine I made (aarrgghhh!). I ended up with a circular scanline buffer that was phrase-blitted in and out, and calculated 2 pixels at a time with half the loop count, and that was very much faster than doing 5 or 6 external memory accesses for each pixel. But it becomes a heck of a lot more complex...
Ahh well...

It's still nice work you did! =)
And it IS useful! I can imagine a 96K TYS demo where that could be a "gfx renderer" instead of storing the bitmap ;) So...

Cheers!

