

# logiJPGD Multi-Channel Motion JPEG Decoder

February 15<sup>th</sup>, 2021 Data Sheet Version: v2.0

## Xylon d.o.o.

Fallerovo setaliste 22 10000 Zagreb, Croatia Phone: +385 1 368 00 26 Fax: +385 1 365 51 67

E-mail: <a href="mailto:support@logicbricks.com">support@logicbricks.com</a>
URL: <a href="http://www.logicbricks.com">www.logicbricks.com</a>

## **Features**

- Supports Xilinx<sup>®</sup> Zynq<sup>®</sup>-7000 All Programmable SoC, 7 series FPGA families and Zynq UltraScale+™ MPSoC architectures
- Compliant with the Baseline Sequential DCT mode of the ISO/IEC 10918-1 JPEG standard
- Supports multi-channel configurations and works simultaneously with up to four HD video streams

| Core Facts                            |                                                   |
|---------------------------------------|---------------------------------------------------|
| Provided with Core                    |                                                   |
| Documentation                         | User's Manual                                     |
| Design File Formats                   | Encrypted VHDL                                    |
| Constraints Files                     | Reference designs constraint files                |
| Reference Designs & Application Notes | Reference designs for Xilinx ZC702 Evaluation Kit |
| Additional Items                      | Bare-metal software driver                        |
| Simulation Tool Used                  | Mentor Graphics' ModelSim                         |
| Support                               | Support provided by Xylon                         |

- Keeps the exact frame rate to ensure smooth playback of Motion JPEG encoded video
- Automatic parsing of JPEG markers and recovery from the error state (invalid marker recognition)
- On-the-fly Huffman (2 DC, 2 AC) and Quantization (up to 4) tables reprogramming from the header data
- Advanced scatter-gather input DMA minimizes the CPU overhead in video over IP applications
- High data throughput of 2 pixels/clock, i.e. 200 MPix/sec when running at 100 MHz
- Video input/output resolutions up to 2048x2048
- Pixel formats: YUV 4:2:0 and YUV 4:2:2
- Frame integrity check able to detect the "frozen" video channel
- Generates the vertical sync (VSYNC) pulse to enable easy synchronization with other SoC modules
- Hardware handshaking controlled double/triple buffering ensures flicker-free video output
- Included bare-metal software driver enables faster application development
- Configurable input interface: DMA (AXI4 Master interface) or streaming (AXI4-Stream interface)
- Configurable output interface: DMA (AXI4 Master interface) or streaming (AXI4-Stream video interface)
- 64-bit ARM<sup>®</sup> AMBA<sup>®</sup> AXI4 compliant memory interface
- Supports native non-swizzle video outputs and swizzle (Xylon proprietary) for the highest efficiency

Table 1: Example Implementation Statistics for Xilinx® FPGAs

| Family (Device)                              | Fmax sclk (MHz) | Fmax clk (MHz) | LUT  | FF   | IOB | BRAM | DSP48 | PLL/MMCM | BUFG/BUFR | GTx | Design Tools  |
|----------------------------------------------|-----------------|----------------|------|------|-----|------|-------|----------|-----------|-----|---------------|
| Zynq<sup>®</sup>-7000<br>(XC7Z020-1)         | 120             | 84             | 6695 | 3997 | 0   | 26.5 | 36    | 0        | 0         | 0   | Vivado 2020.1 |
| Zynq<sup>®</sup>-7000<br>(XC7Z045-2)         | 150             | 150            | 5982 | 3987 | 0   | 26.5 | 36    | 0        | 0         | 0   | Vivado 2020.1 |
| Kintex<sup>®</sup>-7<br>(XC7K325-2)          | 150             | 150            | 5972 | 3987 | 0   | 26.5 | 36    | 0        | 0         | 0   | Vivado 2020.1 |
| Zynq<sup>®</sup> UltraScale+™<br>(XCZU9EG-1) | 150             | 150            | 6030 | 3974 | 0   | 26.5 | 36    | 0        | 0         | 0   | Vivado 2020.1 |

<sup>1)</sup> Assuming the following configuration: 32-bit AXI4-Lite register interface, 8-bit AXI4-Stream video input interface, YUV422 AXI4-Stream output interface, maximal resolution 2048 pixels, No fast first stage Huffman decoding, Huffman decoder LUT x1.

<sup>2)</sup> Assuming all interfaces internally connected.



Figure 1: logiJPGD Architecture

## **Features (cont.)**

- Simple programming of control registers through AXI4-Lite interface
- Parameterizable VHDL design allows tuning of slice consumption and feature set
- IP core deliverables available for Xilinx Vivado<sup>®</sup> IP Integrator (IPI) and ISE<sup>®</sup> (XPS)\* implementation tools
- IP deliverables include software driver, documentation and technical support
- Reference design for the Xilinx ZC702 Evaluation Kit available on request
- Plug and Play with Xilinx, third-party and other Xylon logicBRICKS IP cores, like the logiCVC-ML display controller, the logiWIN video input and the logiVIEW video processor IP core

## **Applications**

This IP core can be used in all applications that require JPEG or MJPEG video decompression. It is particularly well-suited for multi-channel video over IP applications like:

- Four-camera Surround View and other Advanced Driver Assistance Systems (ADAS)
- Multi-camera surveillance systems
- ...

## **General Description**

The logiJPGD Multi-Channel Motion JPEG Decoder is a Xylon logicBRICKS IP core for still image and video decompression applications on Xilinx All Programmable MPSoC, SoC, and FPGA devices. It includes all logic blocks necessary for quick implementation of memory-based DMA and/or streaming-based SoC architectures and enables simultaneous MJPEG decompression of up to four HD video streams.

At the center of the logiJPGD IP core is the ISO/IEC 10918-1 JPEG standard compliant Baseline DCT decoder block that works with the YUV (4:2:0 and 4:2:2) video formats and supports all standard features such as the automatic parsing of JPEG markers and invalid marker recognition. The JPEG decoder block is coupled with the advanced AMBA AXI4 compatible DMA logic blocks that handle video data transfers to and from an external memory.

<sup>\*</sup> The last available logiJPGD IP core version for ISE Design Suite: v1\_01\_a

The compressed video data can be fetched from memory or accepted as a stream input. If the generic C\_INPUT\_TYPE is set to AXI4 Master (DMA), the input scatter-gather DMA fetches the compressed video data from non-contiguous memory buffers and feeds it to the JPEG decoder block. This DMA block is especially well suited for "video over IP" applications (Figure 2), where the video payload from multiple video channels arrives in a non-guaranteed order, encapsulated in network frames such as Ethernet UDP packets. With a very low burden on the system CPU, it enables highly automated transfers of the unwrapped video while preserving the correct order of video channels. If the generic C\_INPUT\_TYPE is set to AXI4-Stream, the logiJPGD accepts compressed video data over the AXI4-Stream interface.
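
The gather step described above can be pictured in software. The following is a minimal Python sketch, not the core's actual descriptor format: the `Descriptor` fields, the `gather` helper and the byte-string memory model are hypothetical and only illustrate how payload fragments scattered across memory can be presented as one contiguous stream per channel.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Descriptor:
    """Hypothetical scatter-gather entry: one payload fragment in memory."""
    channel: int   # video channel the fragment belongs to
    address: int   # byte offset of the payload inside `memory`
    length: int    # payload length in bytes (network headers excluded)


def gather(memory: bytes, descriptors: List[Descriptor]) -> Dict[int, bytes]:
    """Concatenate fragments per channel, in descriptor order."""
    streams: Dict[int, bytearray] = {}
    for d in descriptors:
        fragment = memory[d.address:d.address + d.length]
        streams.setdefault(d.channel, bytearray()).extend(fragment)
    return {ch: bytes(data) for ch, data in streams.items()}


# Two channels whose fragments are interleaved and non-contiguous in memory.
memory = b"HDR0AAAA....HDR1bbbb..HDR0CCCC"
descriptors = [
    Descriptor(channel=0, address=4, length=4),    # "AAAA"
    Descriptor(channel=1, address=16, length=4),   # "bbbb"
    Descriptor(channel=0, address=26, length=4),   # "CCCC"
]
streams = gather(memory, descriptors)
```

In this toy model the CPU only builds the descriptor list; the payload bytes themselves are never copied by software, which is the point of the scatter-gather approach.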

The decompressed video data can be stored to memory or sent as a stream output. If the generic C\_OUTPUT\_TYPE is set to AXI4 Master (DMA), the logiJPGD output DMA logic block stores the decompressed video frames in contiguous video memory buffers, in a format suitable for further video processing and video display. This block generates the vertical sync (VSYNC) pulse and implements the hardware handshaking controlled double and triple buffering to ensure flicker-free video output and simpler synchronization with downstream video processing IP cores. A special feature of the logiJPGD IP core, which enables easy integration with other IP cores and smooth video playback, is its ability to keep the exact frame rates of the input video channels. If the generic C\_OUTPUT\_TYPE is set to AXI4-Stream, the logiJPGD sends decompressed video data over the AXI4-Stream video interface.
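
To illustrate why triple buffering avoids flicker, here is a small Python toy model. It is a sketch under stated assumptions: the `TripleBuffer` class and its method names are hypothetical, and it does not model the core's handshake signals or registers. It only shows the invariant that the writer never fills the buffer the reader is currently displaying.

```python
from typing import Optional


class TripleBuffer:
    """Toy model of triple buffering between a writer and a reader."""

    COUNT = 3

    def __init__(self) -> None:
        self.write_idx: int = 0                 # buffer currently being filled
        self.ready_idx: Optional[int] = None    # newest completed frame
        self.read_idx: Optional[int] = None     # buffer currently displayed

    def frame_done(self) -> None:
        """Writer finished a frame: publish it and move to a free buffer."""
        self.ready_idx = self.write_idx
        busy = {self.ready_idx, self.read_idx}
        # With three buffers at most two are busy, so one is always free.
        self.write_idx = next(i for i in range(self.COUNT) if i not in busy)

    def start_display(self) -> Optional[int]:
        """Reader latches the newest complete frame at its own frame start."""
        if self.ready_idx is not None:
            self.read_idx = self.ready_idx
        return self.read_idx


tb = TripleBuffer()
tb.frame_done()                 # frame 0 complete, writer moves to buffer 1
shown_first = tb.start_display()
tb.frame_done()                 # writer completes buffer 1, moves to buffer 2
tb.frame_done()                 # reader is slow: writer reuses buffer 1
shown_second = tb.start_display()
```

With only two buffers the writer would have to wait whenever the reader still holds the other buffer; the third buffer removes that stall.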

A variety of interrelated factors affect the achievable performance of the logiJPGD IP core. The targeted Xilinx All Programmable device, the overall SoC design complexity, the input video resolutions, the frame rates and the number of video channels are some of the more common parameters that affect the core's performance.



With the assured 2 pixels/clock output, the logiJPGD IP core provides high data throughput even in low- and mid-range FPGAs, i.e. 200 MPix/sec when running at 100 MHz. Such data throughput enables smooth decoding of one 1080p60 Full HD or multiple 720p30 HD video inputs.
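
The arithmetic behind these figures can be checked with a few lines of Python. This is an idealized estimate that counts only active pixels; the helper names are illustrative, and the result ignores memory bandwidth, blanking and the configured number of channels.

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Active pixels per second for one video stream."""
    return width * height * fps


def max_streams(clock_hz: int, width: int, height: int, fps: int,
                pixels_per_clock: int = 2) -> int:
    """Ideal upper bound on streams that fit in the peak decoder throughput."""
    return (clock_hz * pixels_per_clock) // pixel_rate(width, height, fps)


peak = 100_000_000 * 2                     # 200 MPix/s at 100 MHz
full_hd_60 = pixel_rate(1920, 1080, 60)    # one 1080p60 stream
hd_30 = pixel_rate(1280, 720, 30)          # one 720p30 stream
four_hd_30 = 4 * hd_30                     # four 720p30 streams
```

One 1080p60 stream needs about 124.4 MPix/s and four 720p30 streams need about 110.6 MPix/s, so both cases fit within the 200 MPix/s peak.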

If the data throughput of a single logiJPGD instance is insufficient, the required video performance can be achieved by adding logiJPGD instances that process multiple video channels in parallel.

The logiJPGD is fully embedded into Xilinx Vivado IP Integrator to hide complexity from the end user and to make its integration with the on-chip AMBA AXI4 bus easy. The provided logiJPGD software driver simplifies the IP core programming. The parameterizable VHDL design allows tuning of slice consumption and feature set through an easy-to-use GUI. The logiJPGD can be smoothly integrated with other logicBRICKS, Xilinx or third-party IP cores.

Xylon provides the logiJPGD reference design that can be used as a starting point to evaluate and develop Xilinx-based MJPEG video processing embedded systems. Please contact <a href="mailto:info@logicbricks.com">info@logicbricks.com</a> to submit a request for evaluation.

## **Application Example**

Figure 2 illustrates an example use of the logiJPGD IP core in a multi-camera Advanced Driver Assistance Systems (ADAS) application. Four video cameras installed around the vehicle's perimeter encapsulate the MJPEG compressed video in Ethernet UDP packets and stream them through a network switch and a single Ethernet cable to the Gigabit Ethernet controller IP core in the Xilinx Zynq-7000 All Programmable SoC.

To ensure the maximum data transfer speed and the minimum CPU load, the Ethernet UDP/IP software stack runs only a basic packet analysis and stores the encapsulated UDP packets in physically scattered memory buffers in the DRAM (Figure 2, Data Flow 1).



Figure 2: logiJPGD Use Case Scenario Example

The logiJPGD IP core fetches only video payload from the scattered Ethernet UDP packets and does the MJPEG video decompression (Figure 2, Data Flow 2). The uncompressed video is stored in contiguous video frame buffers within the DRAM memory (Figure 2, Data Flow 3) from where it can be fetched (Figure 2, Data Flow 4) for further video processing or video display.

## **Functional Description**

The logiJPGD consists of the following blocks: Read DMA, AXIS2Parallel, JPEG Decoder, Output Buffer (DMA), Stream Generator and Registers. Figure 1 shows the internal structure of the IP core.

### Read DMA

The Read DMA implements the scatter-gather DMA: it reads the JPEG video data from the memory and feeds the compressed video to the JPEG Decoder. The Read DMA is in use only when the generic C\_INPUT\_TYPE is set to DMA.

### AXIS2Parallel

The AXIS2Parallel module is instantiated if the stream input is enabled. It converts the AXI4-Stream interface to a parallel interface compatible with the JPEG Decoder. Several error checks are also performed in this module.

### JPEG Decoder

The JPEG Decoder is compliant with the Baseline Sequential DCT mode of the ISO/IEC 10918-1 JPEG standard. It decodes (decompresses) the JPEG coded YUV video input and provides the decompressed YUV video output.
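
As a software illustration of the marker structure the decoder has to parse, the Python sketch below walks the header segments of a JPEG byte stream. It relies only on the marker layout defined by the JPEG standard (a `0xFF` byte, a marker code, and a big-endian length that includes its own two bytes). The function name and the reduced marker table are illustrative, and raising an exception on an invalid marker is a simplification that does not model the core's error recovery.

```python
import struct
from typing import List, Tuple

# Subset of marker codes used by baseline JPEG headers.
MARKER_NAMES = {0xC0: "SOF0", 0xC4: "DHT", 0xD8: "SOI",
                0xD9: "EOI", 0xDA: "SOS", 0xDB: "DQT"}


def list_header_segments(data: bytes) -> List[Tuple[str, int]]:
    """Return (marker name, payload length) pairs up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = [("SOI", 0)]
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"invalid marker at offset {pos}")
        code = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        name = MARKER_NAMES.get(code, f"0x{code:02X}")
        segments.append((name, length - 2))   # length counts its own 2 bytes
        if code == 0xDA:                      # entropy-coded data follows SOS
            break
        pos += 2 + length
    return segments


# Synthetic header: SOI, one DQT segment (1 + 64 bytes), one SOS header.
sample = (b"\xff\xd8"
          + b"\xff\xdb" + struct.pack(">H", 67) + bytes(65)
          + b"\xff\xda" + struct.pack(">H", 8) + bytes(6))
segments = list_header_segments(sample)
```

The DQT and DHT segments found this way carry the quantization and Huffman tables that the Features list describes as reprogrammed on the fly from the header data.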

### Output Buffer (DMA)

The Output Buffer buffers and transfers the decoded/decompressed video data to the memory. Output Buffer is in use only when generic C\_OUTPUT\_TYPE is set to DMA.

### Stream Generator

The Stream Generator converts JPEG blocks to video lines and generates a video stream compliant with the AXI4-Stream Video standard from the decompressed video. The Stream Generator is in use only when the generic C\_OUTPUT\_TYPE is set to Streaming.
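
The block-to-line conversion can be sketched in Python for a single colour component. This is a simplified illustration: the function name is hypothetical, and the sketch ignores the interleaving of luma and chroma blocks and the chroma subsampling that a real YUV 4:2:2 or 4:2:0 stream involves.

```python
from typing import List

BLOCK = 8  # JPEG operates on 8x8 sample blocks


def blocks_to_lines(blocks: List[List[List[int]]],
                    blocks_per_row: int) -> List[List[int]]:
    """Reorder 8x8 blocks (left-to-right, top-to-bottom) into raster lines."""
    lines: List[List[int]] = []
    for first in range(0, len(blocks), blocks_per_row):
        block_row = blocks[first:first + blocks_per_row]
        for y in range(BLOCK):
            line: List[int] = []
            for block in block_row:
                line.extend(block[y])   # row y of each block, side by side
            lines.append(line)
    return lines


# Two blocks side by side: block 0 filled with 0, block 1 filled with 1.
blocks = [[[value] * BLOCK for _ in range(BLOCK)] for value in (0, 1)]
lines = blocks_to_lines(blocks, blocks_per_row=2)
```

The sketch shows why a line-based output needs buffering: a full row of blocks (eight lines) must be decoded before the first complete video line is available.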

### Registers

All logiJPGD registers are instantiated in this block. The CPU has access to all these registers through the AXI4-Lite interface.

## **Core Modifications**

The core is supplied in encrypted VHDL formats compatible with Xilinx Vivado IP Integrator and ISE Platform Studio implementation tools, which allows the user to take full control over configuration parameters. Table 2 outlines some important logiJPGD configuration parameters selectable prior to the VHDL synthesis. For a complete list of parameters, please consult the logiJPGD User's Manual delivered with the IP core.

| Parameter                 | Description                                                                                                                                               |
|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|
| C_INPUT_TYPE              | Input type: 0 – DMA (AXI4 Master), 1 – Streaming (AXI4-Stream)                                                                                            |
| C_OUTPUT_TYPE             | Output type: 0 – DMA (AXI4 Master), 1 – Streaming (AXI4-Stream Video)                                                                                     |
| C_OUT_SWIZZLE_EN          | Video output type: 0 – non-swizzle, 1 – swizzle output image                                                                                              |
| C_ROW_STRIDE              | Distance, in number of pixels, between same-column pixels of adjacent rows                                                                                |
| C_BUFFER_OFFSET           | Double and triple buffering: buffer offset in number of lines, relative to the memory start                                                               |
| C_IN_M_LITTLE_ENDIAN      | Input memory layout endianness: 0 – big endian, 1 – little endian                                                                                         |
| C_OUT_M_LITTLE_ENDIAN     | Output memory layout endianness: 0 – big endian, 1 – little endian                                                                                        |
| C_M_AXIS_FORMAT           | Video AXI4-Stream JPEG format: 0 – YUV4:2:2, 3 – YUV4:2:0                                                                                                 |
| C_CH_NO                   | Number of channels                                                                                                                                        |
| C_FAST_FIRST_STAGE_DECODE | If fast first stage decode is enabled (1), Huffman decoding takes one clock period per symbol; otherwise, decoding takes two clock periods per symbol     |

**Table 2: Subset of logiJPGD Configuration Parameters** 
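
As an illustration of how a row stride given in pixels and a buffer offset given in lines could combine into memory addresses, consider the Python sketch below. The address formula, the function name and the 2 bytes/pixel figure are assumptions made for this example and are not taken from the logiJPGD User's Manual, which remains the authoritative reference for the real memory layout.

```python
def pixel_address(base: int, buffer_index: int, x: int, y: int,
                  row_stride: int, buffer_offset: int,
                  bytes_per_pixel: int = 2) -> int:
    """Byte address of pixel (x, y) in one of the video buffers.

    Assumed layout: buffers follow each other every `buffer_offset` lines,
    and each line occupies `row_stride` pixels (hypothetical model).
    """
    line = buffer_index * buffer_offset + y
    return base + (line * row_stride + x) * bytes_per_pixel


BASE = 0x1000_0000      # example frame memory start (arbitrary value)
ROW_STRIDE = 2048       # stands in for C_ROW_STRIDE, in pixels
BUFFER_OFFSET = 1080    # stands in for C_BUFFER_OFFSET, in lines

buffer1_start = pixel_address(BASE, 1, 0, 0, ROW_STRIDE, BUFFER_OFFSET)
second_line = pixel_address(BASE, 0, 0, 1, ROW_STRIDE, BUFFER_OFFSET)
```

Under these assumptions, consecutive lines are 4096 bytes apart and the second buffer starts 0x438000 bytes after the first.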

The logiJPGD has been designed to be adaptable to various JPEG types. However, there may be cases where source code modification is necessary. If you wish to optimize your use of the logiJPGD core and/or add specific functions of your own, you can order the source code or have Xylon tailor the logiJPGD to your requirements. The logiJPGD VHDL source code is available from Xylon at an additional cost.

## **Core I/O Signals**

The core signal I/Os have not been fixed to specific device pins, which provides flexibility for interfacing with user logic. Descriptions of all signal I/Os are provided in Table 3.

Table 3: Core I/O Signals

| Signal                             | Signal<br>Direction | Description                                                                 |
|------------------------------------|---------------------|-----------------------------------------------------------------------------|
| Global Signals                     |                     |                                                                             |
| RST                                | Input               | Global reset input                                                          |
| CLK                                | Input               | Main clock input                                                            |
| Memory Interface                   |                     |                                                                             |
| IN_M AXI4 Master Interface         | Bus                 | Input memory ARM AMBA AXI4 master interface                                 |
| OUT_M AXI4 Master Interface        | Bus                 | Output memory ARM AMBA AXI4 master interface                                |
| Streaming Interface                |                     |                                                                             |
| S_AXIS AXI4-Stream Interface       | Bus                 | Input ARM AMBA AXI4-Stream slave interface                                  |
| M_AXIS AXI4-Stream Video Interface | Bus                 | Output ARM AMBA AXI4-Stream master video interface                          |
| Register Interface                 |                     |                                                                             |
| AXI4-Lite Slave Interface          | Bus                 | AMBA AXI4-Lite slave interface                                              |
| Auxiliary Signals                  |                     |                                                                             |
| VSYNC                              | Output              | Vertical sync                                                               |
| CURR_VBUFF[C_CH_NO*2-1:0]          | Output              | The current external stream video memory buffer (two bits per channel)      |
| NEXT_VBUFF[C_CH_NO*2-1:0]          | Input               | The next external stream video memory buffer (two bits per channel)         |
| SW_VBUFF_REQ[C_CH_NO-1:0]          | Output              | External request for the memory buffer switch (one bit per channel)         |
| SW_VBUFF_GRANT[C_CH_NO-1:0]        | Input               | Granted external request for the memory buffer switch (one bit per channel) |
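
The buffer vectors carry two bits per channel, so a testbench or monitoring script has to slice them per channel. The Python sketch below shows one way to do that. It assumes that channel 0 occupies the least significant bit pair, which is not stated in this data sheet and should be checked against the User's Manual; the helper names are illustrative.

```python
from typing import List


def buffer_index(vbuff: int, channel: int) -> int:
    """Extract the 2-bit buffer index of one channel from a packed vector."""
    return (vbuff >> (2 * channel)) & 0b11


def pack_buffer_indices(indices: List[int]) -> int:
    """Pack per-channel buffer indices (channel 0 first) into one vector."""
    vbuff = 0
    for channel, index in enumerate(indices):
        if not 0 <= index <= 3:
            raise ValueError("buffer index must fit in two bits")
        vbuff |= index << (2 * channel)
    return vbuff


curr_vbuff = 0b10_01_00_10   # example value for a four-channel configuration
per_channel = [buffer_index(curr_vbuff, ch) for ch in range(4)]
```

The same slicing applies to NEXT_VBUFF, while SW_VBUFF_REQ and SW_VBUFF_GRANT need only a single-bit test per channel.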

## **Verification Methods**

The logiJPGD is fully supported by the Xilinx Vivado Design Suite HLx. This tight integration significantly shortens IP integration and verification. The logiJPGD is already used in automotive production systems, and its full implementation does not require any particular skills beyond general knowledge of the Xilinx tools.

## **Recommended Design Experience**

The user should have experience in the following areas:

- Xilinx design tools
- ModelSim

## **Available Support Products**

Xylon provides the logiJPGD reference design for the Xilinx Zynq-7000 AP SoC based ZC702 Evaluation Kit on request. The reference design demonstrates video decompression of MJPEG encoded movie clips stored on the SD card plugged into the ZC702 kit. The logiJPGD design includes evaluation logicBRICKS IP cores and can be used as a starting point to evaluate and develop Xilinx-based MJPEG video processing embedded systems.

To learn more about the Xylon reference designs, contact Xylon or visit the web:

Email: <a href="mailto:info@logicbricks.com">info@logicbricks.com</a>

The logiJPGD IP core is often used with the logiVIEW Perspective Transformation and Lens Correction Image Processor IP core for video and imaging applications. The logiVIEW IP core removes fisheye distortions caused by extreme wide-angle Field Of View (FOV) lenses and performs complex homographic and non-homographic transformations, e.g. video texturing on curved surfaces.

To learn more about the Xylon logiVIEW IP core, contact Xylon or visit the web:

Email: sales@logicbricks.com

URL: www.logicbricks.com/Products/logiVIEW.aspx


## **Ordering Information**

This product is available directly from Xylon under the terms of Xylon's IP License. Please visit our web shop or contact Xylon for pricing and additional information:

Email: <u>sales@logicbricks.com</u>
URL: <u>www.logicbricks.com</u>

This publication has been carefully checked for accuracy. However, Xylon does not assume any responsibility for the contents or use of any product described herein. Xylon reserves the right to make changes to the product without further notice. Our customers should ensure that they take appropriate action so that their use of our products does not infringe upon any patents. Xylon products are not intended for use in life support applications. Use of Xylon products in such applications is prohibited without written Xylon approval.

## **Related Information**

### Xilinx Programmable Logic

For information on Xilinx programmable logic or development system software, contact your local Xilinx sales office, or:

Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 Phone: +1 408-559-7778 Fax: +1 408-559-7114

URL: <u>www.xilinx.com</u>

## **Revision History**

| Version | Date        | Note                                                                             |
|---------|-------------|----------------------------------------------------------------------------------|
| 1.00.   | 17.04.2014. | Initial draft                                                                    |
| 1.10.   | 13.10.2015. | Official logicBRICKS release                                                     |
|         |             | Added AXI4-Stream input and AXI4-Stream output.                                  |
| 2.0     | 03.01.2017. | Updated Features list, General Description and logiJPGD architecture.            |
|         |             | Updated Figure 1, Table 1, Table 2 and Table 3.                                  |
| 2.0     | 19.09.2017. | Added support for newer devices and newer implementation tools. Updated Table 1. |
| 2.0     | 15.02.2021. | Updated Table 1.                                                                 |