Jagware

SCPCD

Level2
  • Content count

    1,134
Posts posted by SCPCD


  1. For me the Jaguar CF has always been an interesting project; it is nice to see you are still making progress and I hope all goes well.

    Thanks :)

    I heard the other day that the CF has a network function, which was news to me as I was not aware of it being mentioned previously. What exactly is the purpose of the JagCF Network?

    If it is for networked gaming, will it be compatible with the existing Jaguar network specs and interfaces such as the Catbox and JagLink 2, or is it a totally new networking solution that will only work in CF-to-CF networks?

    The purpose of the JagCF Network is to replace the current Jaguar network with one that has many integrated features (network addresses, error checking, prioritized messages, and more :)).

    It is not compatible with existing games.


  2. Hi !

     

    I'm working on the last hardware part of the JagCF before the launch of the final prototype.

     

    These are two new boards for testing new JagCF hardware that is not included on my current prototype:

    The first one: Spy Board (for spying on the network with a PC)

    - JagCF Network

     

    The second one: Test Board

    - new JagCF power circuitry (for better efficiency)

    - JagCF Network (because I don't have this on my current JagCF prototype)

    - Integrated Catnip

     

    post-5-1189436172_thumb.jpg

     

    Regards !


  3. So.. you are saying that it's slower to copy data FROM GPU mem to DRAM (5632) than it is to copy TO GPU mem (3584)?

    Any ideas why? I don't really understand that from a logical point of view.. do the DRAM timings differ for read vs write? SRAM should have the same read & write time, right? How about DRAM?

    Or is it an OP stall, a page miss on writes to DRAM, or what? Why is it slower to copy TO DRAM than it is to copy from it?

    I think it's a pipelining effect, which takes a different amount of time for a read than for a write:

    a read from slow memory into fast memory is easily pipelined. In the opposite direction it is not so easy to avoid losing cycles.

    What is the GPU doing in your example code?

    The GPU is not running in my example :)

    I'll try soon with code running on it.

    I'm just curious because I have always wondered what happens when you start a GPU->DRAM blit and continue running GPU code afterwards... I mean, logically someone would have to halt to give the blitter access to GPU mem, right? I.e. a GPU->DRAM blit would halt GPU execution and parallelism would be lost, or?

    Can this be verified with your LA? I.e. faster blits from GPU mem if the GPU is turned off. (Or perhaps it's just a stall issue, and once the blit starts it finishes in the same time?!)

    When the GPU executes code from internal memory, it uses at most 50% of the internal memory bandwidth (1 instruction per cycle, and 2 instructions prefetched per access -> 1 memory access every 2 cycles).

    So blitting while the GPU runs at its highest speed reduces the blit speed.
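    The 50% figure above can be reproduced with a little arithmetic. This is only a rough sketch: it assumes one internal-memory access slot per clock cycle and ignores any loads/stores the GPU program itself performs; the function name is made up for illustration.

```python
def gpu_fetch_share(instructions_per_cycle=1, instructions_per_fetch=2):
    """Fraction of internal-memory access slots used by GPU instruction fetch.

    Assumption: one access slot is available per clock cycle.
    """
    # One fetch brings in `instructions_per_fetch` instructions, so the GPU
    # needs instructions_per_cycle / instructions_per_fetch fetches per cycle.
    return instructions_per_cycle / instructions_per_fetch


share = gpu_fetch_share()
print(f"GPU fetch: {share:.0%} of slots, left for the blitter: {1 - share:.0%}")
```

    Under these assumptions, a GPU running flat out leaves about half of the slots for the blitter, which matches the idea that the blit slows down while the GPU is busy.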

     

    I can verify it with the LA ;)

    Last question:

    When I did my own timing test of the memory (simple color "rasters" on screen), the DRAM->DRAM blit code was as fast as anything that had to do with the GPU memory! I assume src & dest were within the same page in DRAM, hence the speed.

    But it would be nice to see such a timing...

    I will do it someday ;)

     

    Sorry for all the questions, but timing issues are interesting from an optimisation point of view :P

    Exactly !

     

    Nice work!

    Thanks ;)


  4. Interesting! And what about pixel mode?

    In pixel mode the cycle time per access is the same, but each access transfers one pixel instead of one phrase. To move one phrase (64 bits) of data:

    - in PIXEL1 there are 64 DRAM reads

    - in PIXEL2 there are 32 DRAM reads

    - in PIXEL4 there are 16 DRAM reads

    - in PIXEL8 there are 8 DRAM reads

    - in PIXEL16 there are 4 DRAM reads

    - in PIXEL32 there are 2 DRAM reads

     

    The number of cycles for each read/write is exactly the same as before, depending on the source and destination.

     

    So PIXEL16 in pixel mode is 4 times slower than in phrase mode.
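    The read counts above are simply a 64-bit phrase divided by the pixel size. A minimal Python sketch of that arithmetic (the function name is only for illustration, and it assumes one memory access per pixel, as described above):

```python
PHRASE_BITS = 64  # one phrase is 64 bits


def accesses_per_phrase(bits_per_pixel):
    """Number of pixel-mode accesses needed to move one phrase of data."""
    return PHRASE_BITS // bits_per_pixel


for bpp in (1, 2, 4, 8, 16, 32):
    n = accesses_per_phrase(bpp)
    # Phrase mode needs a single access for the same 64 bits,
    # so n is also the slowdown factor relative to phrase mode.
    print(f"PIXEL{bpp}: {n} accesses per phrase, {n}x slower than phrase mode")
```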

     

    In phrase mode, is the result the same when using 8 bpp, 16 bpp or 32 bpp?

    Yes, it's exactly the same result.

     

    (I haven't verified that the data is correct at less than 32 bpp for transfers to and from the GPU.)


  5. Now the OP !

     

    First test, with a standard 16-bit, unscaled object:

     

    post-5-1186935385_thumb.jpg

     

    E: write-back of the previous obj.

    The time between the last pixel phrase read and the write-back completion is 5 cycles.

    This is followed by the 2 phrases of obj type 1, with one read every 2 cycles.

     

    The time between the last obj phrase and the first pixel phrase is 5 cycles.

     

    Then the OP reads pixels:

    We can see that the first 2 phrases are read very close together. I don't have a real explanation for that yet (maybe a pipelining effect?).

     

    Then 3 cycles between A and B,

    followed by all the other pixel phrases at a rate of one every 3 cycles.

     

     

    Then a 16-bit RMW, unscaled object:

     

    post-5-1186934483_thumb.jpg

     

    E: write-back of the previous obj.

    The time between the last pixel phrase read and the write-back completion is 5 cycles.

    This is followed by the 2 phrases of obj type 1, with one read every 2 cycles.

     

    The time between the last obj phrase and the first pixel phrase is 5 cycles.

     

    Then the OP reads pixels:

    5 cycles between A and B,

    followed by all the other pixel phrases at a rate of one every 4 cycles.

     

     

    Note :

    Transparent or not, CRY or RGB, the timing is the same. ;)
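    To get a feel for what the 3-cycle and 4-cycle rates mean for a whole line, here is a rough Python estimate. It only counts the steady-state pixel phrase reads and deliberately ignores the fixed overhead measured above (object header reads, write-back, the closely spaced first phrases); the function name and the 320-pixel width are just example assumptions.

```python
import math

PHRASE_BITS = 64


def pixel_fetch_cycles(width_pixels, bits_per_pixel, cycles_per_phrase):
    """Approximate OP cycles spent reading one line of pixel data."""
    phrases = math.ceil(width_pixels * bits_per_pixel / PHRASE_BITS)
    return phrases * cycles_per_phrase


# Example: one 320-pixel-wide, 16-bit line = 80 phrases
normal = pixel_fetch_cycles(320, 16, 3)  # standard object: 3 cycles per phrase
rmw = pixel_fetch_cycles(320, 16, 4)     # RMW object: 4 cycles per phrase
print(normal, rmw)
```

    With these numbers an RMW object costs roughly a third more fetch time per line than a standard one.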


  6. The goal of this topic is to give information about the blitter's access timings and to learn more about how the blitter operates.

     

    First of all, the example code:

        move.l        #PITCH1|PIXEL32|WID128|XADDPHR,d0   ; 32-bit pixels, phrase-mode X add
        moveq        #0,d1                               ; zero for every unused register

        move.l        d0,A2_FLAGS                         ; A2 = source
        move.l        #source,A2_BASE
        move.l        d1,A2_PIXEL
        move.l        d1,A2_STEP

        move.l        d0,A1_FLAGS                         ; A1 = destination
        move.l        #destination,A1_BASE
        move.l        d1,A1_PIXEL
        move.l        d1,A1_FPIXEL
        move.l        d1,A1_STEP
        move.l        d1,A1_FSTEP
        move.l        d1,A1_CLIP
        move.l        d1,A1_INC
        move.l        d1,A1_FINC

        move.l        #$00010400,B_COUNT                  ; outer = $0001, inner = $0400 pixels
        move.l        #SRCEN|LFU_REPLACE,B_CMD            ; writing B_CMD starts the blit

    This configures the blitter for 32 bits per pixel and transfers in phrase mode.

    The source and destination differ for each case described below.

     

    We blit: $0001 * $0400 * 4 (because 32-bit pixels are selected) = 4096 bytes.

    SRCEN enables the source read, and LFU_REPLACE selects a simple data copy.

    All other blitter registers are unused and are initialised to zero here.
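    The size calculation above can be written out as a small Python sketch. It assumes, as in the example, that the high word of B_COUNT is the outer loop count and the low word is the inner (pixel) count; the function name is only for illustration.

```python
def blit_size_bytes(b_count, bits_per_pixel):
    """Bytes moved by a blit, from the B_COUNT value and the pixel size."""
    outer = (b_count >> 16) & 0xFFFF  # number of lines
    inner = b_count & 0xFFFF          # pixels per line
    return outer * inner * bits_per_pixel // 8


size = blit_size_bytes(0x00010400, 32)
print(size)  # $0001 * $0400 * 4 bytes
```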

     

    GPU->DRAM transfer in phrase mode:

    source: $F03000 (G_RAM)

    destination: somewhere in DRAM (phrase aligned)

     

    post-5-1186932355_thumb.jpg

    Result : 11 cycles per phrase -> 4096*11/8 = 5632 cycles for 4K

     

    DRAM->GPU transfer in phrase mode:

    source: somewhere in DRAM (phrase aligned)

    destination: $F03000 (G_RAM)

     

    post-5-1186932269_thumb.jpg

    Result : 7 cycles per phrase -> 4096*7/8 = 3584 cycles for 4K

     

    DRAM->GPU fast transfer in phrase mode (destination offset by $8000):

    source: somewhere in DRAM (phrase aligned)

    destination: $F03000+$8000 (G_RAM+$8000)

     

    post-5-1186932120_thumb.jpg

    Result : 5 cycles per phrase -> 4096*5/8 = 2560 cycles for 4K
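    The three results above all use the same formula: bytes / 8 bytes per phrase * cycles per phrase. A short Python sketch that reproduces the totals from the measured per-phrase figures (the names are only for illustration):

```python
PHRASE_BYTES = 8  # one phrase = 64 bits


def blit_cycles(size_bytes, cycles_per_phrase):
    """Total cycles for a phrase-mode blit of size_bytes."""
    return size_bytes // PHRASE_BYTES * cycles_per_phrase


# Cycles per phrase as measured in the captures above
measured = {
    "GPU->DRAM": 11,
    "DRAM->GPU": 7,
    "DRAM->GPU (G_RAM+$8000)": 5,
}

for name, cpp in measured.items():
    print(f"{name}: {blit_cycles(4096, cpp)} cycles for 4K")
```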

     

     

    -----------------------------------------------

    More information in the future :)

    If you have a specific timing you would like measured, I can help ;)


  7. What are the limitations on the Jag, sprite-wise?

    The only limitations on the Jag are the bandwidth and the memory size (2 MB).

    How many sprites per scan line?

    Only limited by the bandwidth of the Jag's memory.

    How many colors per palette?

    The Jag can draw 1-, 2-, 4-, 8-, 16-, or 24-bit sprites; for 1, 2, 4 and 8 bits it uses a color palette.

    More importantly, how much memory is generally allocated for sprite usage? (Meaning: how big can a sprite be (tiling or not) and how many sprites can be loaded at a time?)

    It's easily possible (i.e. with a simple sprite engine) to draw 5 full-screen sprites (320x240, 16-bit) per VBL (bigger sprites are possible too; see the Hi-rez demo).

    It's easy to draw more than 100 32x32, 16-bit sprites per VBL.
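    Since bandwidth is the real limit, it can help to see how much data those two examples represent per VBL. This is just illustrative arithmetic in Python (the function name is made up, and it counts raw pixel data only, not object headers or other bus traffic):

```python
PHRASE_BYTES = 8


def sprite_bytes(width, height, bits_per_pixel, count=1):
    """Raw pixel data read per frame for `count` identical sprites."""
    return width * height * bits_per_pixel // 8 * count


full_screen = sprite_bytes(320, 240, 16, count=5)  # 5 full-screen layers
small = sprite_bytes(32, 32, 16, count=100)        # 100 small sprites

for name, size in (("5 x 320x240x16", full_screen), ("100 x 32x32x16", small)):
    print(f"{name}: {size} bytes = {size // PHRASE_BYTES} phrases per VBL")
```

    The five full-screen layers move several times more data than the hundred small sprites, which is why the sprite count alone says little about the load.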

     

    Regards.


  8. Hi !

     

    I have tested JagWorm ;)

     

    Graphics are very nice :) and it's a fun game :)

     

    But I noticed that the game crashes at startup when the JagCF is plugged into the cartridge slot.

    Does your game read or check something in the cartridge memory map (including EEPROM/GPIO1/2/3/4)? (I haven't implemented all of these on my JagCF yet, so that could be the problem ;))

     

    It would be good to find the cause of this problem, so I can add your game to my CF demos & mini games ;)

     

     

    Regards. :)
