
About The Blitter Speed


Orion_


OK, I just ran some tests with the blitter because I want to do an effect with it, and I wanted to know whether the blitter is faster when the source graphics are in GPU cache or in RAM. The answer is yes and no: it depends on the blitter mode.

So here are the results. All tests were done with the blit destination in RAM, using a 12x12 sprite and a plain copy.

 

In Phrase Mode: the blitter is a bit faster when the source is in RAM, so a bit slower when the source is in GPU cache.

In Pixel Mode: the blitter is approximately twice as fast when the source is in GPU cache, so about half as fast when the source is in RAM.

 

I hope these results help you speed up your blitting code ;)

 

You can see a little test here: http://onori.free.fr/BLIT2.zip

The first red bar is the blit with the source in RAM and the second red bar is the blit with the source in GPU cache, both in phrase mode. The test was done and calibrated for 50Hz.


  • 2 weeks later...

Another blitter speed test. Here is the advice of the week:

Better to call the blitter from the GPU than from the 68k! (And don't forget to park the 68k with stop #$2000 ;))
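
Parking the 68k could look like the minimal sketch below. It assumes the GPU program has already been started and that any interrupt still enabled has a handler that simply returns; the label park68k is only illustrative.

```asm
park68k:
    stop    #$2000        ; load SR with $2000 (supervisor mode, interrupt mask 0) and halt until an interrupt arrives
    bra.s   park68k       ; when the interrupt handler returns, go back to sleep
```

While the 68k sits in stop it is not fetching instructions, which is presumably the point of the advice: the GPU and blitter no longer compete with it for the bus.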

 

yeah, because the 68k is daaaaamn sloooow !!!

I did a little map blit test using tiles. The code that calls the blitter is really, really simple,

so the speed of the surrounding code itself (everything except the blitter call) should hardly count, whether it runs on the GPU or on the 68k.

The routine calling the blitter from the 68k takes 75% of the VBL.

The same routine in GPU code (calling the blitter the same way) takes 45% of the VBL !!

 

do the math ;) (and kill the 68k !!)
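
Spelled out, as a rough sketch: the ratio comes straight from the two percentages above. The millisecond and per-tile figures assume that "VBL" means one full 50Hz frame (20 ms), as in the first test, and that the map is the 20 x 16 = 320 tiles used by the loops in the code posted further down. The post does not state either, so treat those as assumptions.

```latex
\[
\frac{75\%}{45\%} \approx 1.67
\quad\text{(the GPU version is about 1.67 times faster)}
\]
\[
0.75 \times 20\,\text{ms} = 15\,\text{ms},
\qquad
0.45 \times 20\,\text{ms} = 9\,\text{ms}
\]
\[
\frac{15\,\text{ms}}{320} \approx 47\,\mu\text{s per tile (68k)},
\qquad
\frac{9\,\text{ms}}{320} \approx 28\,\mu\text{s per tile (GPU)}
\]
```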


As I said in another thread about the blitter, it is also really important to set as few blitter registers as possible for each blit. In your case, I guess that A1_BASE, for example, is always the same, so it is better to set it once and for all outside your loop.

With this very small change, I got a great speed improvement with 68k blitting (and I guess it is the same with the GPU).


I already do that in both the 68k and GPU code ;) but thanks for the advice.

 

Here is my code for the test:

 

68k:

    move.l    #screen,A1_BASE
    move.l    #tileset,A2_BASE

    move.w    #1,d0        ; Y
    swap    d0
    move.w    #-16,d0        ; X
    move.l    d0,A1_STEP
    move.l    d0,A2_STEP

    move.l    #PIXEL16|XADDPHR|WID320|PITCH1,A1_FLAGS
    move.l    #PIXEL16|XADDPHR|WID64|PITCH1,A2_FLAGS

    move.l    #$00100010,d6            ; B_COUNT 16x16
    move.l    #LFU_REPLACE|SRCEN|UPDA1|UPDA2,d7; B_CMD !

    move.w    #$F000,BG

    moveq    #16-1,d1
Ylop:    moveq    #20-1,d0
Xlop:

; Blit Info Position

    move.w    d1,d2        ; Y
    lsl.w    #4,d2        ; *16
    swap    d2
    move.w    d0,d2        ; X
    lsl.w    #4,d2        ; *16
    move.l    d2,A1_PIXEL

    moveq    #32,d2        ; X: tile 1
    move.l    d2,A2_PIXEL

; Blit !!

    move.l    d6,B_COUNT
    move.l    d7,B_CMD

    dbra    d0,Xlop
    dbra    d1,Ylop
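
The GPU version further down polls bit 0 of B_CMD before each blit, while the 68k loop above has no such wait. If the 68k loop needed the same guard, a minimal sketch could look like this, placed right after Xlop. It assumes d3 is free (it is not used in the loop above) and that reading B_CMD returns the blitter status with bit 0 set when the blitter is idle, as the GPU wait loop implies; .waitblit is a hypothetical label.

```asm
.waitblit:
    move.l  B_CMD,d3      ; reading B_CMD returns the blitter status
    btst    #0,d3         ; bit 0 set = blitter idle
    beq.s   .waitblit     ; still busy, keep polling
```

Whether the extra polling changes the 75% figure is not something this thread measures.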

 

 

GPU:

    movei    #screen,r0
    movei    #A1_BASE,r1
    store    r0,(r1)
    movei    #tileset,r0
    movei    #A2_BASE,r1
    store    r0,(r1)

    movei    #$0001FFF0,r0; y+1 x-16
    movei    #A1_STEP,r1
    store    r0,(r1)
    movei    #A2_STEP,r1
    store    r0,(r1)


    movei    #PIXEL16|XADDPHR|WID320|PITCH1,r0
    movei    #A1_FLAGS,r1
    store    r0,(r1)

    movei    #PIXEL16|XADDPHR|WID64|PITCH1,r0
    movei    #A2_FLAGS,r1
    store    r0,(r1)

    movei    #$00100010,r0            ; B_COUNT 16x16
    movei    #B_COUNT,r1
    movei    #LFU_REPLACE|SRCEN|UPDA1|UPDA2,r2; B_CMD !
    movei    #B_CMD,r3

    moveq    #1,r11        ; WAIT BLITTER MASK

    movei    #A1_PIXEL,r6
    movei    #A2_PIXEL,r7
    movei    #32,r8        ; X: tile 1

    movei    #Ylop,r13


    movei    #$F000,r4
    movei    #BG,r5
    storew    r4,(r5)


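    movei    #16,r4        ; Y (counts 16..1, so rows land one tile lower than in the 68k loop's 15..0)
Ylop:    movei    #20,r5        ; X (counts 20..1, likewise one tile further right than the 68k loop's 19..0)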
Xlop:

.waitb:    load    (r3),r12    ; Wait for the blitter to complete !
    and    r11,r12
    jr    EQ,.waitb
    nop


; Blit Info Position

    move    r4,r9        ; Y
    shlq    #4,r9        ; *16
    rorq    #16,r9        ; swap
    move    r5,r10        ; X
    shlq    #4,r10        ; *16
    or    r10,r9
    store    r9,(r6)

    store    r8,(r7)        ; A2_PIXEL, X: tile 1


; Blit !!

    store    r0,(r1)        ; B_COUNT
    store    r2,(r3)        ; B_CMD

    subq    #1,r5
    jr    NE,Xlop
    nop

    subq    #1,r4
    jump    NE,(r13)
    nop
