GPU DMAEN Bit Breaks STOREP


Tursi    0

The GPU DMAEN bit and STOREP don't work together.

 

When DMAEN is set on the GPU, the high data register does not appear to be written to external memory on a STOREP (or rather, a fixed value appears to be written instead of the value in the high register). Disabling DMAEN restores normal operation.

 

Didn't test LOADP.

 

Also, it doesn't work to overlap your OPL and your animation buffer. ;)

 

Zerosquare    10

Interesting - yet another Jaguar bug (as if it didn't have enough already :D )

 

But the documentation states that DMAEN should not be used anyway (that's TOM bug #24 in TechRef). Enabling it gives the GPU higher bus priority than the Object Processor, so if bus contention occurs between the two, the OP may not get the graphics data quickly enough and the display can be corrupted.

Tursi    0

The documentation isn't a good reason not to experiment. :) We know it's got a lot of flaws.

 

If the OP is not running or is processing only stop and branch objects, then the condition can't occur. I used the blitter high priority mode very successfully by running it during an OP GPU Interrupt, while the OP was halted. This particular case was an experiment and the experiment didn't work, but I think it's a good idea to document any known odd behaviour. Sometimes you can exploit odd behaviour in unexpected ways. ;)

 

SCPCD    0

In your test, is the OP or the DSP running?

I ran a test, and the following code works fine (with neither the OP nor the DSP running):

 

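	.phrase
gpu_code_start3:
.gpu
.org		G_RAM
pGflags					.equr	r28			;pointer to G_FLAGS
cGflags					.equr	r29			;current G_FLAGS value

movei		#G_FLAGS,pGflags			;GPU flags register
load		(pGflags),cGflags
bclr		#3,cGflags					;clear IMASK
bclr		#14,cGflags					;select bank0
bset		#15,cGflags					;set DMAEN
store		cGflags,(pGflags)			;update the flags

nop
nop
movei		#G_HIDATA,r1				;high data register used by STOREP
movei		#$100000,r2					;destination address in external memory

movei		#256,r3						;number of phrases to write

movei		#-1,r4						;low long, counts down
moveq		#0,r0						;high long, counts up
.gpu_loop:
store		r0,(r1)						;high long -> G_HIDATA
storep		r4,(r2)						;write the phrase

addqt		#8,r2						;next phrase
addqt		#1,r0
subq		#1,r3						;loop counter (sets the flags tested by jr)
subqt		#1,r4						;subqt leaves the flags untouched

jr			NE,.gpu_loop
nop

GPU_STOP

.68000
gpu_code_end3:
dc.l		0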

 

Tursi    0

Yeah, the OP is running (though the machine is in VBlank and so it should only be executing branch and stop objects). It's actually just my JagLion code except I changed the initialization of the GPU to set DMAEN, and the only place that uses STOREP is the buffer clear function. Instead of clearing it I got a regular pattern.

 

If you want to try it, the latest version went on my site today (just a lot of little bugfixes to improve stability, and I was using it as a test base for my project). http://harmlesslion.com/software/jaglion

 

You'll find the line for setting DMAEN commented out in CALCMAND.S on line 240.

 

Of course, if you find a bug in my work there that'd be great to clear that pending issue. Or even if you confirm it. :)

 

SebRmv    2

I didn't even know about this flag :D

Thanks for pointing that out.

 

While reading the description, I also read the following warning (in Jag_v8)

 

WARNING - writing a value to the flag bits and making use of those flag bits in the following instruction will not work properly due to pipe-lining effects. If it is necessary to use flags set by a STORE instruction, then ensure that at least one other instruction lies between the STORE and the flags dependent instruction.

 

So let's deviate from the original subject :P

Does this mean that the sample code given by Atari for leaving an interrupt handler does not always work?

 

Remember that the sample code (given by Atari) is:

int_serv:
  movei #GPU_FLAGS,r30  ; point R30 at flags register
  load (r30),r29        ; get flags
  bclr #3,r29           ; clear IMASK
  bset #11,r29          ; and interrupt 2 latch
  load (r31),r28        ; get last instruction address
  addq #2,r28           ; point at next to be executed
  addq #4,r31           ; updating the stack pointer
  jump (r28)            ; and return
  store r29,(r30)       ; restore flags

 

Is the restore flags instruction in the right place?

If I understand the warning correctly, they say that it is not safe to put the last store after the jump.

 

This theory is consistent with the following remark (found in the bug sections):

· We've found that you can't put the IMASK clear in the delay slot of the jump out of the interrupt, because the instruction that was interrupted may not get the correct register bank (TWI - Brian McKee)

 

Has anybody experienced such things?

I may have a (random) bug related to this.

SCPCD    0
> So let's deviate from the original subject :P
> Does this mean that the sample code given by Atari for leaving an interrupt handler does not always work?

> WARNING - writing a value to the flag bits and making use of those flag bits in the following instruction will not work properly due to pipe-lining effects. If it is necessary to use flags set by a STORE instruction, then ensure that at least one other instruction lies between the STORE and the flags dependent instruction.

This means that the instruction immediately following the STORE must not use any bit of the flags.

For example:

movei #GPU_FLAGS,r30
load (r30),r29
bset #14,r29    ; select BANK1
store r29,(r30)
jump T,(r3)        ; r3 is valid in BANK1
nop

This code is not safe, because there is no guarantee that the jump instruction reads r3 from BANK1.

But:

movei #GPU_FLAGS,r30
load (r30),r29
bset #14,r29    ; select BANK1
store r29,(r30)
nop                ; gives the flags write time to take effect
jump T,(r3)        ; r3 is valid in BANK1
nop

is safe, because the extra instruction slot gives the pipeline time to write the r29 result to the GPU_FLAGS register.

 

 

The next code is also safe:

movei #GPU_FLAGS,r30
load (r30),r29
bset #14,r29    ; select BANK1
jump T,(r3)        ; r3 is valid in BANK0 <- Warning: bank 0, not bank 1!
store r29,(r30)

because:

- the jump instruction reads the r3 register and the flags (which are not used here, because the condition is T) before the store instruction executes,
- then the store instruction executes,
- and then the jump is taken.

 

> int_serv:
>   movei #GPU_FLAGS,r30  ; point R30 at flags register
>   load (r30),r29        ; get flags
>   bclr #3,r29           ; clear IMASK
>   bset #11,r29          ; and interrupt 2 latch
>   load (r31),r28        ; get last instruction address
>   addq #2,r28           ; point at next to be executed
>   addq #4,r31           ; updating the stack pointer
>   jump (r28)            ; and return
>   store r29,(r30)       ; restore flags
>
> Is the restore flags instruction in the right place?

Yes, because r28 must be read from the correct bank (which here is BANK0, selected because IMASK = 1), and consequently r28 must be read before the flags register is written.

And since we must switch back to the previous bank (0 or 1, depending on the bank in use before the interrupt) before returning to the interrupted code, the only good place for the store is in the "nop" slot of the jump :)

 

 

> If I understand the warning correctly, they say that it is not safe to put the last store after the jump.

No, it's just that the coder should take the pipeline effect into account when modifying the flags register for the next instruction (as with all instructions that don't have writeback protection).

 

> This theory is consistent with the following remark (found in the bug sections):
> · We've found that you can't put the IMASK clear in the delay slot of the jump out of the interrupt, because the instruction that was interrupted may not get the correct register bank (TWI - Brian McKee)

> Has anybody experienced such things?
> I may have a (random) bug related to this.

Except that this one belongs with the "lies and damned lies": it is not true.

If this didn't work, it would be impossible to use interrupts when code uses BANK1. I always use bank 1 for the main part and bank 0 for interrupts and saved state, and it works perfectly (for example in the demo FACTS).

 

SCPCD    0

I think I have found why it doesn't work:

you set the GPU's DMA priority mode when it starts, which means that the GPU always runs at DMA priority, so every external access by the GPU core takes priority on the bus.

For example:

    loadw (r20),r24    ; save old scanline
.lwait1:
    nop
    loadw (r20),r4
    and r18,r4        ; mask VC off to just a line counter
    shrq #1,r4        ; divide by 2
    cmp r4,r24        ; wait for a change
    jr EQ,.lwait1
    nop
    move r4,r24        ; save the old value

r20 points to the VC register, which is external to the GPU core, like all registers that are not in the GPU section (all registers from page 10 to 17 are regarded as external registers, like the Jerry registers).

 

This way, the GPU takes priority during OP processing.

I think strange things could happen in these cases.

 

One way to have DMA priority only during VBLANK is to add:

cpuint0:
    movei #G_FLAGS,r30
    load (r30),r29            ; read flags    
    bset #15,r29            ; DMA mode
    store    r29,(r30)
    nop

at the start of the interrupt routine, and to modify the interrupt exit routine like this:

exitint:
; finished interrupt, clean up
    movei #G_FLAGS,r30
    load (r30),r29            ; read flags
    bclr #3,r29                ; clear IMASK
    bset #9,r29                ; reset CPU int
    
    bclr #15,r29            ; return to normal mode

 

The GPU then runs at normal priority during normal operation, switches to the higher priority level during the interrupt routine, and returns to normal mode after the interrupt.
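
The exit routine above stops before the actual return. A minimal sketch of a complete version, assuming the rest follows the Atari sample return sequence quoted earlier in this thread (stack pointer in r31, return address fetched into r28, flags written in the delay slot of the jump), might look like this. It has not been checked against the JagLion source, so treat the register usage as an assumption:

```asm
exitint:
; finished interrupt, clean up
    movei #G_FLAGS,r30
    load (r30),r29          ; read flags
    bclr #3,r29             ; clear IMASK
    bset #9,r29             ; reset CPU int
    bclr #15,r29            ; clear DMAEN: return to normal priority
    load (r31),r28          ; return address (assumed pushed on entry, as in the Atari sample)
    addq #2,r28             ; point at the next instruction to execute
    addq #4,r31             ; pop the stack
    jump T,(r28)            ; return to the interrupted code
    store r29,(r30)         ; delay slot: flags are written after r28 has been read
```

With this layout DMAEN is dropped by the same store that clears IMASK, so the interrupted code resumes at normal bus priority.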

 

 

Share this post


Link to post
Share on other sites
Tursi    0
> If this didn't work, it would be impossible to use interrupts when code uses BANK1. I always use bank 1 for the main part and bank 0 for interrupts and saved state, and it works perfectly (for example in the demo FACTS).

 

I can confirm that I do a lot of GPU interrupt work and have never had trouble with putting the flag restore in the delay slot. Another thing to remember is that the manual also notes a 3 cycle stall after a jump, while the pipeline reloads. This gives the flags instruction lots of time to propagate through the pipeline. (Here we are, p62, where it lists all the stalls:)

 

after a jump or jr (three clock cycles if executing out of internal memory).

Tursi    0
> I think I have found why it doesn't work:
> you set the GPU's DMA priority mode when it starts, which means that the GPU always runs at DMA priority, so every external access by the GPU core takes priority on the bus.

 

This is a good theory - in that I didn't think of it at all. But have you run the program and looked at the results? They do not look like any kind of OP interruption at all, it is very clearly only the memory buffer that is being impacted. It's extremely consistent. The OP is, in my experience, a sensitive little bugger, and doesn't like to be trifled with, doing all kinds of screwy things that usually result in an empty screen. ;)

 

I will give your suggestion a try and see if it makes a difference. ;)

 

(edit) And now I have done so. This is very interesting... you are right that it clears up the buffer corruption.

 

I can't say that I understand why, though. The interference is extremely odd. It boils down to:

 

-Having DMAEN set

-While the GPU is reading VC and manipulating BG

-While the OPL is processing the display list with a bitmap object

-Causes STOREP to behave incorrectly on a GPU-processed CPU interrupt during vertical blank

 

Note that the STOREs work fine. Confirmed that it doesn't matter if DMAEN is enabled while writing G_HIDATA, that register still gets the correct value. Also confirmed it's not just the OPL screwing up on transparent phrases (walk the lion and you can see that the bytes which are cleared during the animation sequence stay cleared until the keyframe is reloaded.)
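
For anyone who wants to reproduce the check, here is a rough sketch of a readback test. It is not the code from JagLion. RESULTS is a hypothetical scratch area in GPU RAM, the $100000 target is borrowed from SCPCD's test and assumed to be free, and the two patterns are arbitrary. It only records what comes back and leaves the interpretation (including which long of the phrase lands at which address) to whoever inspects the results:

```asm
RESULTS     equ     G_RAM+$F00      ; hypothetical scratch area, assumed unused

    movei   #G_HIDATA,r1
    movei   #$100000,r2             ; phrase-aligned target in external RAM (assumed free)
    movei   #RESULTS,r10
    movei   #$12345678,r5           ; arbitrary pattern for the high data register
    movei   #$9ABCDEF0,r4           ; arbitrary pattern for the register operand
    store   r5,(r1)                 ; write G_HIDATA
    nop
    load    (r1),r6                 ; read G_HIDATA back
    store   r6,(r10)                ; RESULTS+0: G_HIDATA as read back
    storep  r4,(r2)                 ; the phrase store under test
    load    (r2),r6                 ; long at target+0
    addq    #4,r10
    store   r6,(r10)                ; RESULTS+4
    addq    #4,r2
    load    (r2),r6                 ; long at target+4
    addq    #4,r10
    store   r6,(r10)                ; RESULTS+8
```

Running it once with DMAEN clear and once with DMAEN set, under the same OP load, and comparing the three longs at RESULTS should show whether only the phrase write is affected.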

 

Very strange, but thanks for helping to narrow the cases a bit further.

 

 
