Jagware

About The Blitter Speed

Orion_    1

OK, I just ran some tests with the blitter because I want to do an effect with it, and I wanted to know whether the blitter is faster when it fetches graphics from the GPU cache or from RAM. The answer is: yes and no, it depends on the mode.

So here are the results. All tests were done with the blit destination in RAM, with a 12x12 sprite, using copy only.

 

In phrase mode: the blitter is a bit faster when the source is in RAM, so a bit slower when the source is in GPU cache.

In pixel mode: the blitter is approximately twice as fast when the source is in GPU cache, so about half as fast when the source is in RAM.

 

I hope these results help you speed up your blitting code ;)

 

Here is a little test you can try: http://onori.free.fr/BLIT2.zip

The first red bar is the blit with the source in RAM, and the second red bar is the blit with the source in GPU cache, both in phrase mode. The test was done and calibrated for 50Hz.

Orion_    1

Another blitter speed test, and here is the advice of the week:

It's better to call the blitter from the GPU than from the 68k! (And don't forget to put the 68k to sleep with stop #$2000 ;))

 

yeah, because the 68k is daaaaamn sloooow !!!

I did a little map blit test using tiles. The code that calls the blitter is really, really simple,

so it hardly matters whether the part of the routine that doesn't call the blitter runs faster on the GPU than on the 68k.

The routine calling the blitter from the 68k takes 75% of the VBL.

The same routine in GPU code (calling the blitter the same way) takes 45% of the VBL!!

 

do the math ;) (and kill the 68k !!)
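
For anyone who wants the math spelled out, here is a minimal sketch in Python (used only because it runs anywhere, it is not Jaguar code). The 75% and 45% figures come from the post above. The 50 Hz frame length is an assumption carried over from the first test, and the constant names are made up for the example.

```python
# VBL shares reported in the post above.
M68K_SHARE = 0.75   # routine driven by the 68k
GPU_SHARE = 0.45    # same routine driven by the GPU

# Assumption: a 50 Hz display, so one VBL period lasts 1000 / 50 = 20 ms.
FRAME_MS = 1000 / 50

speedup = M68K_SHARE / GPU_SHARE          # how many times faster the GPU version is
time_saved = 1 - GPU_SHARE / M68K_SHARE   # fraction of the 68k time that disappears

print(f"68k: {M68K_SHARE * FRAME_MS:.1f} ms per frame")
print(f"GPU: {GPU_SHARE * FRAME_MS:.1f} ms per frame")
print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
```

Under those assumptions the GPU version is about 1.67 times faster, which frees roughly 40% of the time the 68k version needed.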

SebRmv    2

As I said in another thread about the blitter, it is also really important to write as few blitter registers as possible for each blit. In your case, I guess that A1_BASE, for example, is always the same, so it is better to set it once and for all outside your loop.

With this very small change I got a great speed improvement with 68k blitting (and I guess it is the same with the GPU).
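
To get a feel for how much this advice saves, here is a rough Python model that only counts register writes (it is not a timing measurement). The split between constant and per-tile registers and the 20x16 map size are assumptions for illustration, and `blitter_writes` is a hypothetical helper, not part of any Jaguar library.

```python
def blitter_writes(tiles, invariant_regs, per_tile_regs, hoisted):
    """Count the blitter register writes needed to draw `tiles` tiles."""
    if hoisted:
        # Constant registers are written once, before the loop.
        return len(invariant_regs) + tiles * len(per_tile_regs)
    # Every register is rewritten for every tile.
    return tiles * (len(invariant_regs) + len(per_tile_regs))

# Assumption: these registers never change while the map is drawn.
INVARIANT = ["A1_BASE", "A2_BASE", "A1_FLAGS", "A2_FLAGS", "A1_STEP", "A2_STEP"]
# Assumption: these have to be written for each tile.
PER_TILE = ["A1_PIXEL", "A2_PIXEL", "B_COUNT", "B_CMD"]
TILES = 20 * 16  # hypothetical map of 20 x 16 tiles

naive = blitter_writes(TILES, INVARIANT, PER_TILE, hoisted=False)
hoisted = blitter_writes(TILES, INVARIANT, PER_TILE, hoisted=True)
print(f"all registers inside the loop: {naive} writes")
print(f"constants set once outside:    {hoisted} writes")
```

With these numbers the count drops from 3200 writes to 1286. How much time that saves on real hardware depends on which processor does the writes, so the model only shows the trend.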

Orion_    1

I already do that in both 68k and GPU code ;) but thanks for the advice.

 

here is my code for the test:

 

68k:

    move.l    #screen,A1_BASE
    move.l    #tileset,A2_BASE

    move.w    #1,d0        ; Y
    swap    d0
    move.w    #-16,d0        ; X
    move.l    d0,A1_STEP
    move.l    d0,A2_STEP

    move.l    #PIXEL16|XADDPHR|WID320|PITCH1,A1_FLAGS
    move.l    #PIXEL16|XADDPHR|WID64|PITCH1,A2_FLAGS

    move.l    #$00100010,d6            ; B_COUNT 16x16
    move.l    #LFU_REPLACE|SRCEN|UPDA1|UPDA2,d7; B_CMD !

    move.w    #$F000,BG

    moveq    #16-1,d1
Ylop:    moveq    #20-1,d0
Xlop:

; Blit Info Position

    move.w    d1,d2        ; Y
    lsl.w    #4,d2        ; *16
    swap    d2
    move.w    d0,d2        ; X
    lsl.w    #4,d2        ; *16
    move.l    d2,A1_PIXEL

    moveq    #32,d2        ; X: tile 1
    move.l    d2,A2_PIXEL

; Blit !!

    move.l    d6,B_COUNT
    move.l    d7,B_CMD

    dbra    d0,Xlop
    dbra    d1,Ylop

 

 

GPU:

    movei    #screen,r0
    movei    #A1_BASE,r1
    store    r0,(r1)
    movei    #tileset,r0
    movei    #A2_BASE,r1
    store    r0,(r1)

    movei    #$0001FFF0,r0; y+1 x-16
    movei    #A1_STEP,r1
    store    r0,(r1)
    movei    #A2_STEP,r1
    store    r0,(r1)


    movei    #PIXEL16|XADDPHR|WID320|PITCH1,r0
    movei    #A1_FLAGS,r1
    store    r0,(r1)

    movei    #PIXEL16|XADDPHR|WID64|PITCH1,r0
    movei    #A2_FLAGS,r1
    store    r0,(r1)

    movei    #$00100010,r0            ; B_COUNT 16x16
    movei    #B_COUNT,r1
    movei    #LFU_REPLACE|SRCEN|UPDA1|UPDA2,r2; B_CMD !
    movei    #B_CMD,r3

    moveq    #1,r11        ; WAIT BLITTER MASK

    movei    #A1_PIXEL,r6
    movei    #A2_PIXEL,r7
    movei    #32,r8        ; X: tile 1

    movei    #Ylop,r13


    movei    #$F000,r4
    movei    #BG,r5
    storew    r4,(r5)


    movei    #16,r4        ; Y
Ylop:    movei    #20,r5        ; X
Xlop:

.waitb:    load    (r3),r12    ; Wait for the blitter to complete !
    and    r11,r12
    jr    EQ,.waitb
    nop


; Blit Info Position

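    move    r4,r9        ; Y counter (16..1)
    subq    #1,r9        ; -> tile row 15..0, same as the 68k version
    shlq    #4,r9        ; *16
    rorq    #16,r9        ; swap
    move    r5,r10        ; X counter (20..1)
    subq    #1,r10        ; -> tile column 19..0
    shlq    #4,r10        ; *16
    or    r10,r9
    store    r9,(r6)        ; A1_PIXEL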

    store    r8,(r7)        ; A2_PIXEL, X: tile 1


; Blit !!

    store    r0,(r1)        ; B_COUNT
    store    r2,(r3)        ; B_CMD

    subq    #1,r5
    jr    NE,Xlop
    nop

    subq    #1,r4
    jump    NE,(r13)
    nop
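
Both listings build the same packed 32-bit values: Y in the high word and X in the low word. The sketch below redoes that arithmetic in Python so the constants can be checked on any machine. `pack_yx` and `tile_pixel` are hypothetical helper names, and the tile coordinates follow the 68k loop (columns 0..19, rows 0..15).

```python
def pack_yx(y, x):
    """Pack two signed 16-bit values: Y in the high word, X in the low word."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)

def tile_pixel(tile_x, tile_y, tile_size=16):
    """A1_PIXEL value for the top-left corner of a tile on the screen."""
    return pack_yx(tile_y * tile_size, tile_x * tile_size)

step = pack_yx(1, -16)     # after each line: Y + 1, X - 16 (two's complement)
count = pack_yx(16, 16)    # 16 x 16 pixels per blit
last = tile_pixel(19, 15)  # last tile of a 20 x 16 map

print(f"A1_STEP  = ${step:08X}")
print(f"B_COUNT  = ${count:08X}")
print(f"A1_PIXEL = ${last:08X}")
```

The first two values match the constants in the listings ($0001FFF0 and $00100010), and the last one is $00F00130, that is Y = 240 and X = 304.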
