GP2X
From DaphneWiki
How to map memory
First, open "/dev/mem" with the 'open' function, using O_RDWR.
Magic Eyes Registers
mmap(0, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, /dev/mem fd, 0xc0000000)
Remember that most of the registers seem to be 16-bit, so you may want to use a (Uint16 *) pointer (you'll have to divide all offsets by 2 if you do).
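A minimal sketch of the register mapping described above, in C. The function and macro names (map_registers, reg16_index, GP2X_REG_BASE, GP2X_REG_SIZE) are made up for illustration; only the mmap arguments come from the notes above. The device path is passed in as a parameter so the helper can be tried against something other than "/dev/mem".

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define GP2X_REG_BASE 0xc0000000UL  /* physical base of the register block (from the text) */
#define GP2X_REG_SIZE 0x10000UL     /* length of the mapping (from the text) */

/* Convert a byte offset from the register base into an index for a
   16-bit pointer, i.e. divide it by 2. */
static size_t reg16_index(uint32_t byte_offset)
{
    assert((byte_offset & 1u) == 0);  /* 16-bit registers sit on even offsets */
    return byte_offset >> 1;
}

/* Open the memory device read/write and map the register block.
   Returns NULL if either step fails. */
static volatile uint16_t *map_registers(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    void *p = mmap(0, GP2X_REG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, GP2X_REG_BASE);
    close(fd);  /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (volatile uint16_t *)p;
}
```

With this, a register documented at byte offset 0x0904 would be accessed as regs[reg16_index(0x0904)], where regs is the pointer returned by map_registers("/dev/mem").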
Video Memory
mmap(0, 0x4B000, PROT_READ|PROT_WRITE, MAP_SHARED, /dev/mem fd, 0x4000000 - 0x4B000)
The above is what Rlyeh does in his code. The 0x4B000 is 320x240x4. He may start at 0x3FB5000 (that is, 0x4000000 - 0x4B000) because of a wiki post here [1].
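The arithmetic behind that mmap call can be checked with a few lines of C. This is only a sketch; the names (FB_WIDTH, FB_FACTOR, GP2X_RAM_END, fb_size, fb_phys_base) are made up here, and the notes do not say whether the "x4" factor is bytes per pixel or a buffer count, so it is kept as an unnamed factor.

```c
#include <assert.h>
#include <stdint.h>

enum {
    FB_WIDTH  = 320,
    FB_HEIGHT = 240,
    FB_FACTOR = 4   /* the "x4" from the text; its exact meaning is not stated */
};

#define GP2X_RAM_END 0x4000000UL  /* upper bound used in the mmap call above */

/* Length of the mapping: 320 x 240 x 4 bytes. */
static unsigned long fb_size(void)
{
    return (unsigned long)FB_WIDTH * FB_HEIGHT * FB_FACTOR;
}

/* Physical offset passed to mmap: the last fb_size() bytes below GP2X_RAM_END. */
static unsigned long fb_phys_base(void)
{
    assert(fb_size() <= GP2X_RAM_END);
    return GP2X_RAM_END - fb_size();
}
```

fb_size() works out to 0x4B000 and fb_phys_base() to 0x3FB5000, which is a multiple of 4096 as mmap offsets generally need to be.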
FDC
Frame Dimension Converter (13.2)
(from the Magic Eyes PDF) "The Frame Dimension Converter (FDC) transforms 2-D data made during the process of Encoding/Decoding MPEG into the 1-D data format. During this work, 4:2:0 to 4:2:2 format conversion, Rotation are performed."
i.e. 2D frame to 1D frame
2D separated 4:2:0, 4:2:2, or 4:4:4 --> YUY2 4:2:2
Rotation (0, 90, 180, 270 degrees).
Registers:
Control Register FDC_CNTL
Frame Size Register FDC_FRAME_SIZE
Source Address Registers FDC_LUMA_OFFSET FDC_CB_OFFSET FDC_CR_OFFSET
Destination Address Registers FDC_DST_BASE_L FDC_DST_BASE_H
Status Register FDC_STATUS (can indicate when FDC is busy)
Other Regs FDC_DERING FDC_OCC_CNTL
SC
Scale Processor (13.3)
Receives data from external memory, the ISP, or the Frame Dimension Converter (FDC).
Supports Coarse scale and fine scale.
Scale is split into Pre-scale and Post-scale.
PreScale
Coarse scaling without filtering. Only down scaling by more than 2x is possible.
SC_PRE_VRATIO SC_PRE_HRATIO SC_SRC_PXL_WIDTH SC_SRC_PXL_REQCNT
PostScale
Up/Down scale with filtering.
SC_POST_VRATIO SC_POST_HRATIO
Field/Frame Function
SC_SRC_ODD_ADDR SC_SRC_EVEN_ADDR SC_DST_ADDR SC_DST_WPXL_WIDTH
Y/Cb/Cr Separating Function
Only seems to be needed for mpeg or jpeg encoding.
SC_SEP_ADDR SC_DST_PXL_WIDTH
Mirror
SC_MIRROR
bits:
0 - src horizontal mirror (0 = disable, 1 = mirror)
1 - src vertical mirror
2 - dst horizontal mirror
3 - dst vertical mirror
4-5 - scale _source_ data format (0 = yuy2, 1 = yvyu, 2 = uyvy, 3 = vyuy)
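The SC_MIRROR bit layout above could be packed with a helper like the following. This is a sketch only; the macro, enum and function names are invented for illustration, and only the bit positions and format codes come from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Source data format codes for bits 4-5 (values from the list above). */
enum sc_src_format {
    SC_FMT_YUY2 = 0,
    SC_FMT_YVYU = 1,
    SC_FMT_UYVY = 2,
    SC_FMT_VYUY = 3
};

#define SC_MIRROR_SRC_H     (1u << 0)  /* src horizontal mirror */
#define SC_MIRROR_SRC_V     (1u << 1)  /* src vertical mirror */
#define SC_MIRROR_DST_H     (1u << 2)  /* dst horizontal mirror */
#define SC_MIRROR_DST_V     (1u << 3)  /* dst vertical mirror */
#define SC_MIRROR_FMT_SHIFT 4          /* format field occupies bits 4-5 */

/* Build a value for SC_MIRROR from mirror flags and a source format. */
static uint16_t sc_mirror_value(unsigned mirror_flags, enum sc_src_format fmt)
{
    assert((mirror_flags & ~0xFu) == 0);  /* only bits 0-3 are mirror flags */
    assert((unsigned)fmt <= 3u);
    return (uint16_t)(mirror_flags | ((unsigned)fmt << SC_MIRROR_FMT_SHIFT));
}
```

For example, sc_mirror_value(SC_MIRROR_SRC_H | SC_MIRROR_DST_V, SC_FMT_UYVY) gives 0x29.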
Scale Status
SC_STATUS (has "busy" and "done" bits)
Output
Outputs to either external memory or the external display directly.
Other Registers
SC_CMD (Scale control register) SC_SEP_LUMA_ADDR SC_SEP_CB_ADDR SC_SEP_CR_ADDR SC_DELAY SC_MEM_CNTR SC_IRQ
MLC (Multi Layer Controller)
Inputs
Receives data from the Scale Processor, the Frame Dimension Converter, and external memory. (See MLC_YUV_CNTL for where to set this for YUV.)
Data Format
Y/Cr/Cb, RGB, indexed color, OSD/sub-picture/cursor data format.
Mirroring
Controls 4-division display of Y/Cr/Cb data. The mirror function can only be enabled when the source is memory (section 13-25).
Color Correction
Does brightness/contrast for luminance, and hue/saturation for chrominance. Does dithering and gamma correction for RGB.
Color Space Conversion
Converts Y/Cr/Cb to RGB.
Scaling
For Y/Cr/Cb and RGB: coarse scaling only (see 13.4.2.10).
Registers for YUV Region A (top): MLC_YUVA_TP_HSC (section 13-64), MLC_YUVA_TP_VSC (see 13.4.2.10 for others)
Algorithm:
horizontal scale value = (W / 320) * 1024
vertical scale value = (H / 240) * MLC_VLA_TP_PXW (MLC_VLA_TP_PXW _may_ be equal to 320 * 2 according to Rlyeh's code)
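The two formulas above, written in integer arithmetic. This is a sketch under a few assumptions that the notes do not confirm: W and H are taken to be the source image width and height in pixels, the results are assumed to fit a 16-bit register, and the multiplication is done before the division so that integer division does not throw away the fraction. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* horizontal scale value = (W / 320) * 1024 */
static uint16_t mlc_hscale(unsigned w)
{
    unsigned v = (w * 1024u) / 320u;  /* multiply first to keep precision */
    assert(v <= 0xFFFFu);             /* ASSUMPTION: target register is 16-bit */
    return (uint16_t)v;
}

/* vertical scale value = (H / 240) * pxw, where pxw is the value of the
   pixel-width register (possibly 320 * 2, per the note above). */
static uint16_t mlc_vscale(unsigned h, unsigned pxw)
{
    unsigned v = (h * pxw) / 240u;
    assert(v <= 0xFFFFu);
    return (uint16_t)v;
}
```

With W = 320 and H = 240 these give 1024 and pxw respectively, which would correspond to no scaling if the formulas are right.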
YUV Layer
Handles only Y/Cr/Cb data. Supports scaling, mirroring, and 4-division display.
Input: Receives data from scale processor, FDC, or external memory.
Divided into Region A and Region B. Each Region is divided into Top Region and Bottom Region. The bottom regions can only be enabled when the source is memory (section 13-25).
Region Differences (this is why you'd have to choose one or the other)
Region A receives data from external memory and scale processor.
Region B receives data from external memory and Frame Dimension Converter.
Y/Cr/Cb format
Basically this seems to be just regular YUY2.
Region Dimensions Example
YUV Region A (top)
MLC_YUVA_STX (section 13-64) : starting X (horizontal)
MLC_YUVA_ENDX (section 13-65) : ending X
MLC_YUVA_TP_STY (section 13-65) : starting Y (vertical)
MLC_YUVA_TP_ENDY (13-65) : ending Y
Interlaced/Progressive Display
Section 13.4.2.11. This may be necessary to set explicitly.
Source Addresses (where image data comes from, I presume)
MLC_YUVA_TP_OADR[L/H] (13-65) - Odd Field Source Address of Top Region A
MLC_YUVA_TP_EADR[L/H] (13-65) - Even Field Source Address of Top Region A
Other Registers
MLC_OVLAY_CNTR (overlay control register) - used to enable/disable RGB regions, and YUV regions
MLC_YUV_EFFECT (13-63) - controls mirroring and whether to divide regions into top/bottom
MLC_YUV_CNTL (13-63) - controls whether Region A inputs from external memory or scale processor, whether Region B inputs from external memory or FDC, and whether region A or region B has priority. Also controls primitive down scaling (by skipping pixels, 1/2, 1/3 or 1/4 down scaling).
MLC_YUVA_TP_PXW (13-64) - horizontal pixel width (I'm not sure what this does)
RGB Stuff
MLC_STL_CNTL (13-68) - still image control register (still image seems equivalent to RGB)
MLC_STL_MIXMUX (13-68) - STL, color key, alpha blending stuff
MLC_STL_ALPHA[L/H] (13-68) - alpha blending stuff
MLC_STL_OADR[L/H] (13-71) - odd field source address of RGB layer
MLC_STL_EADR[L/H] (13-72) - even field source address of RGB layer
Hardware Cursor
see section 13-72
Luminance and Chrominance Enhancement
see section 13-73
CPU #2 Stuff
Pause/Unpause
SYSCLKENREG (C000 0904h) page 95. Bit 0 enables or disables the second CPU.
States: Reset vs Operation
DUALCTRL940 (C000 3B48h) page 80. Bit 7 -> 0: operation, 1: reset
CPU Addressing
DUALCTRL940 (C000 3B48h) page 80. Bits 0-6 -> this value is added to the upper 8 bits of the ARM940's addresses.
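The two DUALCTRL940 fields described above can be packed into one register value like this. A sketch only: the macro and function names are invented, and the base-address calculation (addressing bits shifted left by 24) is an inference from "upper 8 bits", not something quoted from the data sheet.

```c
#include <assert.h>
#include <stdint.h>

#define DUALCTRL940_RESET     (1u << 7)  /* bit 7: 0 = operation, 1 = reset */
#define DUALCTRL940_ADDR_MASK 0x7Fu      /* bits 0-6: CPU addressing value */

/* Build a DUALCTRL940 value from a reset flag and the addressing bits. */
static uint16_t dualctrl940_value(int in_reset, unsigned addr_bits)
{
    assert((addr_bits & ~DUALCTRL940_ADDR_MASK) == 0);
    return (uint16_t)((in_reset ? DUALCTRL940_RESET : 0u) | addr_bits);
}

/* ASSUMPTION: the addressing bits land in the top byte of the address,
   so the base that the ARM940's address 0 maps to is addr_bits << 24. */
static uint32_t arm940_phys_base(unsigned addr_bits)
{
    assert((addr_bits & ~DUALCTRL940_ADDR_MASK) == 0);
    return (uint32_t)addr_bits << 24;
}
```

For addressing bits 0x03 this gives 0x83 while in reset, 0x03 in operation, and a base of 0x03000000.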
Interrupts
Interrupt Generation
It's possible to write to a location on cpu #1 and generate an interrupt on cpu #2. Similarly, it's possible to write to a location on cpu #2 and generate an interrupt on cpu #1. This seems like it could enhance performance, but I don't know how to use this.
ARM 920 Interrupt Enable Register (DUALINT920, C000 3B40, page 79). Safe to set to 0.
ARM 940 Interrupt Enable Register (DUALINT940, C000 3B42, page 79). Safe to set to 0.
Interrupt Pending
Registers 3b44 and 3b46 deal with some pending interrupt state. Sample code sets these both to 0xFFFF.
To load a program into CPU #2 and run it
- Set CPU #2 into reset state.
- Pause CPU #2.
- Set interrupt state to safe defaults described above.
- Load binary code into a shared memory location. For example, in DZZ's Ogg Vorbis code that runs on CPU #2, he mmaps the shared memory at 0x03000000 (from /dev/mem) and then sets CPU #2's CPU Addressing bits to '03' so that CPU #2's addresses are relative to 0x03000000. He then adds 0x200000 to this address to make a place for shared variables (including buffers), presumably because he assumes that the 940 code's size won't exceed 0x200000.
- Set CPU #2 into Operational state, and also set the CPU addressing at the same time. The CPU addressing must match up with the shared memory's address (see previous step).
- Unpause CPU #2.
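The steps above can be sketched in C as follows. This is not DZZ's or Rlyeh's actual code. It assumes regs is a 16-bit pointer to the register block mapped at 0xc0000000 and shared is a pointer to the mmap'ed shared memory. The macro and function names are made up, and the polarity of the pause bit (bit 0 clear = paused, set = running) is an assumption, since the notes only say that bit 0 "enables or disables" the second CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets from the register base, as listed in the text. */
#define OFF_CLOCK_ENABLE 0x0904  /* bit 0 enables/disables CPU #2 */
#define OFF_DUALINT920   0x3B40  /* ARM 920 interrupt enable */
#define OFF_DUALINT940   0x3B42  /* ARM 940 interrupt enable */
#define OFF_PENDING_A    0x3B44  /* pending interrupt state (hypothetical name) */
#define OFF_PENDING_B    0x3B46  /* pending interrupt state (hypothetical name) */
#define OFF_DUALCTRL940  0x3B48  /* bit 7 = reset, bits 0-6 = addressing */

/* 16-bit access: byte offsets are divided by 2 to index a uint16_t pointer. */
#define REG16(regs, off) ((regs)[(off) >> 1])

static void cpu2_load_and_run(volatile uint16_t *regs, uint8_t *shared,
                              const uint8_t *code, size_t code_len,
                              unsigned addr_bits)
{
    assert(addr_bits <= 0x7Fu);

    /* 1. Put CPU #2 into reset state. */
    REG16(regs, OFF_DUALCTRL940) = (uint16_t)(0x80u | addr_bits);

    /* 2. Pause CPU #2 (ASSUMPTION: clearing bit 0 pauses it). */
    REG16(regs, OFF_CLOCK_ENABLE) &= (uint16_t)~1u;

    /* 3. Interrupt state: the safe defaults described above. */
    REG16(regs, OFF_DUALINT920) = 0;
    REG16(regs, OFF_DUALINT940) = 0;
    REG16(regs, OFF_PENDING_A)  = 0xFFFF;
    REG16(regs, OFF_PENDING_B)  = 0xFFFF;

    /* 4. Copy the 940 binary to the start of the shared memory. */
    memcpy(shared, code, code_len);

    /* 5. Operational state, with the addressing bits set in the same write. */
    REG16(regs, OFF_DUALCTRL940) = (uint16_t)addr_bits;

    /* 6. Unpause CPU #2 (ASSUMPTION: setting bit 0 lets it run). */
    REG16(regs, OFF_CLOCK_ENABLE) |= 1u;
}
```

Because the function only touches memory through the two pointers it is given, it can be tried on ordinary arrays before pointing it at real hardware.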
To stop a program from running in CPU #2
- Set CPU #2 into reset state.
- Pause CPU #2.
Caveats for compiling for 940
- Function pointers do not seem to get initialized at all and do not get statically allocated either (i.e. the address of the function pointer is outside of the 940 binary). So you will need to leave a little room after the 940 binary for these 'ghost' variables.
- Integers, if initialized to 0, may not get statically allocated. You may need to initialize them to something else.
- It's a really good idea to initialize all memory that will be used by the 940 to 0 before loading and running the 940 program. This may save you some extreme debugging pain!
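One way to act on these caveats before starting the 940 is sketched below. Everything here is hypothetical naming: region is the mmap'ed shared memory, vars_offset is where the shared variables begin (0x200000 in the example earlier), and slack is however much room you decide to leave after the binary for the 'ghost' variables. How much slack is actually needed is not known from these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero the whole region, then copy the 940 binary to its start.
   Returns 0 on success, or -1 if the binary plus the requested slack
   would run into the shared-variable area at vars_offset. */
static int prepare_940_memory(uint8_t *region, size_t region_len,
                              const uint8_t *code, size_t code_len,
                              size_t slack, size_t vars_offset)
{
    assert(region != NULL && code != NULL);

    if (vars_offset > region_len || code_len > vars_offset)
        return -1;
    if (slack > vars_offset - code_len)
        return -1;

    memset(region, 0, region_len);   /* start from all-zero memory */
    memcpy(region, code, code_len);  /* binary goes at offset 0 */
    return 0;
}
```

Zeroing first means that any variable the compiler silently left out of the binary at least starts life as 0 instead of whatever was in memory before.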