
Model Design for AXI4-Stream Interface Generation

For designs that require high-speed data transfers, use AXI4-Stream interfaces. You can implement a simplified streaming protocol in your model by using HDL Coder™. The software generates AXI4-Stream interfaces in the IP core.

Choose from these three modeling styles based on how your algorithm operates:

  • Sample-Based Modeling — Use these guidelines when your algorithm operates on a stream of samples.

  • Frame-Based Modeling — Use these guidelines when your algorithm operates on a complete frame of data. The data signals at the design under test (DUT) boundary can be vectors or matrices. Do not use this mode if you want to model the Valid and Ready signals.

  • Legacy Frame-Based Modeling — Use these guidelines when your algorithm operates on a stream of samples and you want to simulate the data signal as a frame on the boundary of the design under test (DUT).

    Note

    The Legacy Frame-Based Modeling style will be deprecated in a future release. If you want to model the Valid and Ready signals, use the sample-based modeling style.

Sample-Based Modeling

When you want to simulate the data signal as a stream of samples on the DUT boundary, model in sample-based mode. You can model the data signal as either a scalar or a vector. If you model the data signal as a vector, in the HDL Coder Workflow Advisor, in Task 1.2. Set Target Interface > Interface Options, set the Sample Packing Dimension to All. HDL Coder packs the vector elements together and treats the vector as a single sample. You can specify how HDL Coder packs the data by using the Packing Mode. See Sample Packing Dimension.

Simplified Streaming Protocol

To map the design under test (DUT) ports to AXI4-Stream interfaces, use the simplified AXI4-Stream protocol instead of modeling the actual AXI4-Stream protocol. When you run the IP Core Generation workflow, the generated HDL code contains wrapper logic that translates between the simplified protocol and the actual AXI4-Stream protocol. The simplified protocol requires fewer protocol signals, eases the handshaking mechanism between the Valid and Ready signals, and supports bursts of arbitrary lengths.

Use the simplified AXI4-Stream protocol for write and read transactions. When you want to generate an AXI4-Stream interface in your IP core, in your DUT interface, implement these signals:

  • Data

  • Valid

Optionally, when you map scalar DUT ports to an AXI4-Stream interface, you can model these signals:

  • Ready

  • Other protocol signals, such as:

    • TSTRB

    • TKEEP

    • TLAST

    • TID

    • TDEST

    • TUSER

Data and Valid Signals

When the Data signal is valid, the Valid signal is asserted. This diagram illustrates the Data and Valid signal relationship according to the simplified streaming protocol. When you run the IP Core Generation workflow, HDL Coder adds a streaming interface module in the HDL IP core that translates the simplified protocol to the full AXI4-Stream protocol. In this image, the clock signal is clk.

Model Data and Valid Signals in Simulink

  1. Enclose the algorithm that processes the Data signal by using an enabled subsystem.

  2. Control the enable port of the enabled subsystem by using the Valid signal.

For example, you can directly connect the Valid signal to the enable port.

You can also use a controller in your DUT that generates an enable signal for the enabled subsystem.
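The enabled-subsystem pattern above can be sketched as a cycle-by-cycle simulation. This is a minimal illustration in Python, not HDL Coder output. The function name `run_enabled_subsystem` is hypothetical, and the sketch assumes the subsystem holds its last output while disabled.

```python
def run_enabled_subsystem(data, valid, algorithm, initial_output=0):
    """Apply `algorithm` only on cycles where Valid is high.

    Assumption: the output is held at its last value while Valid is low.
    """
    output = initial_output
    trace = []
    for sample, is_valid in zip(data, valid):
        if is_valid:
            # Valid drives the enable port, so the algorithm runs this cycle.
            output = algorithm(sample)
        trace.append(output)
    return trace


data = [1, 2, 3, 4, 5]
valid = [1, 0, 1, 1, 0]
trace = run_enabled_subsystem(data, valid, lambda x: 2 * x)
print(trace)  # [2, 2, 6, 8, 8]
```

Samples that arrive while Valid is low (2 and 5 in this run) never reach the algorithm, which is the behavior the enable port provides.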

Ready Signal (Optional).  Downstream components use back pressure to tell upstream components they are not ready to receive data. The AXI4-Stream interfaces in your DUT can optionally include a Ready signal. Use the Ready signal to:

  • Apply back pressure in an AXI4-Stream slave interface. For example, drop the Ready signal when the downstream component is not ready to receive data.

  • Respond to back pressure in an AXI4-Stream master interface. For example, stop sending data when the downstream component Ready signal is low.

When you use a single streaming channel, by default, HDL Coder generates the Ready signal and the logic to handle the back pressure. The back pressure logic ties the Ready signal to the DUT Enable signal. When the input master Ready signal is low, the DUT is disabled, and the output slave Ready signal is driven low. Because HDL Coder generates the back pressure logic and Ready signal, when you use a single streaming channel, the Ready signal is optional and you do not have to model this signal at the DUT port.

Block diagram view illustrating the auto-generated Ready signal and back pressure logic.

When you use multiple streaming channels, HDL Coder generates a Ready signal but does not generate the back pressure logic. In a DUT that has multiple streaming channels:

  • The master channel ignores the Ready signal from downstream components.

  • The slave channel Ready signal is high, which causes upstream components to continue sending data.

The absence of back pressure logic might result in data being dropped. To avoid data loss and to apply back pressure on the slave interface or respond to back pressure from the master interface in your design:

  • Model the Ready signal for each additional stream interface.

  • Map the modeled Ready signal to a DUT port for the additional interface.

When you do not model the Ready signal, the Set Target Interface task displays a warning that provides names of interfaces that require a Ready port. If your design does not require applying or responding to back pressure, ignore this warning.

AXI4-Stream interface with Ready Signals

AXI4-Stream Input

This image illustrates the timing relationship between the Data, Valid, and Ready signals according to the simplified streaming protocol. In this image, the clock signal is clk. The AXI4-Stream Slave module sends the DataIn and ValidIn signals after the DUT asserts the ReadyIn signal. Data packets A, B, D, and E in the image represent this case. When you drop the ReadyIn signal, the module always sends one more DataIn and ValidIn signal. Data packet C in the image represents this case. When you model the ReadyIn signal, the DUT must be able to accept one more value after deasserting the Ready signal.

Simplified streaming protocol input timing diagram

For example, if you have a first in first out (FIFO) in your DUT to store a frame of data, to apply backpressure to the upstream component, model the Ready signal based on the FIFO almost full signal.

Simplified streaming protocol Ready signal model
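The reason for driving Ready from the almost-full signal rather than the full signal can be checked with a small occupancy simulation. This is a sketch under stated assumptions, not a model of the generated hardware: the upstream module is assumed to send a sample on every cycle in which the Ready value from the previous cycle was high, the DUT is assumed to read one entry every `drain_every` cycles, and the name `simulate_input_fifo` is hypothetical.

```python
def simulate_input_fifo(num_cycles, depth, ready_threshold, drain_every):
    """Return the peak FIFO occupancy reached during the run.

    Ready is high while occupancy < ready_threshold. The upstream module
    reacts one cycle late, so one more sample arrives after Ready drops.
    """
    occupancy = 0
    ready_prev = True  # Ready value the upstream module saw last cycle
    peak = 0
    for cycle in range(num_cycles):
        ready = occupancy < ready_threshold
        if ready_prev:
            occupancy += 1  # upstream sends DataIn with ValidIn high
        peak = max(peak, occupancy)
        if cycle % drain_every == drain_every - 1 and occupancy > 0:
            occupancy -= 1  # the DUT algorithm consumes one entry
        ready_prev = ready
    return peak


depth = 4
# Ready tied to "not almost full": the extra sample still fits.
peak_almost_full = simulate_input_fifo(40, depth, depth - 1, drain_every=4)
# Ready tied to "not full": the extra sample exceeds the FIFO depth.
peak_full = simulate_input_fifo(40, depth, depth, drain_every=4)
print(peak_almost_full <= depth, peak_full > depth)
```

With the almost-full threshold, one slot is always reserved for the sample that arrives after Ready drops. With the full threshold, that sample has nowhere to go.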

AXI4-Stream Output

This image illustrates the timing relationship between the Data, Valid, and Ready signals according to the simplified streaming protocol. In this image, the clock signal is clk. You can send DataOut and ValidOut signals to the AXI4-Stream Master module after the ReadyOut signal is asserted. Data packets A, B, E, F, and G in the image represent this case. You can optionally send one more DataOut and ValidOut signal after the ReadyOut signal drops. Data packet C in the image represents this case. You can send only one additional packet after the ReadyOut signal drops. Subsequent data packets are dropped until the Ready signal is asserted again. Data packet D in the image represents this case.

Simplified streaming protocol output

The optional one cycle latency between the Valid and Ready signals of the simplified streaming protocol allows you to use a classic or first word fall through (FWFT) FIFO to handle the backpressure from downstream components.

Downstream backpressure handling with FWFT FIFO

You can use an FWFT FIFO to store a frame of data and handle back pressure from downstream components by modeling the ValidOut signal as the logical AND of the ReadyOut signal and the FIFO not-empty signal.

Classic FWFT FIFO based Ready signal model
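The ValidOut rule for the FWFT case can be sketched as a per-cycle function. This is an illustrative Python model, not generated code. The function name `fwft_output_cycle` is hypothetical, and the sketch assumes the FIFO pops an entry in the same cycle in which ValidOut is high.

```python
from collections import deque


def fwft_output_cycle(fifo, ready_out):
    """Model one clock cycle of an FWFT FIFO output stage.

    ValidOut is ReadyOut AND FIFO-not-empty. In an FWFT FIFO the head
    entry is already visible, so it can be sent in the same cycle.
    """
    valid_out = ready_out and len(fifo) > 0
    data_out = fifo.popleft() if valid_out else None
    return data_out, valid_out


fifo = deque(["A", "B", "C"])
ready_pattern = [True, False, True, True, True]
sent = []
valid_trace = []
for ready_out in ready_pattern:
    data_out, valid_out = fwft_output_cycle(fifo, ready_out)
    valid_trace.append(valid_out)
    if valid_out:
        sent.append(data_out)
print(sent)  # ['A', 'B', 'C']
```

ValidOut stays low both when the downstream component is not ready (second cycle) and when the FIFO has run empty (last cycle), so no sample is sent twice or lost.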

For a DUT that uses an FWFT FIFO, there is zero latency for the Ready signal between upstream and downstream components. This image shows the timing relationship between the Data, Valid, and Ready signals. The clock signal is clk.

Timing diagram showing relationship between Data, Valid, and Ready signals for FWFT FIFO

Downstream backpressure handling with Classic FIFO

If your DUT uses a classic FIFO to store a frame of data, model the ValidOut signal as the logical AND of the ReadyOut signal and the FIFO not-empty signal.

Classic FIFO based DUT Ready signal model

This image shows the timing relationship between the Data, Valid, and Ready signals. The clock signal is clk.

Timing diagram showing relationship between Data, Valid, and Ready signals for Classic FIFO

If you do not model the Ready signal, HDL Coder generates the signal and the associated back pressure logic. When you generate the IP core, HDL Coder adds a streaming interface module in the HDL IP core that translates the simplified protocol to the full AXI4-Stream protocol.

Note

If you enable delay balancing, the coder inserts one or more delays on the Ready signal. Disable delay balancing for the Ready signal path.

TLAST Signal (optional).  The AXI4-Stream interface on your DUT can optionally model a TLAST signal, which is used to indicate the end of a frame of data. If you do not model this signal, HDL Coder generates it for you. On the AXI4-Stream Slave interface, the incoming TLAST signal is ignored. On the AXI4-Stream Master interface, the autogenerated TLAST signal is asserted when the number of valid samples counts up to the default frame length value. The default frame length value can be set by using the AXI4-Stream interface options in the Target Interface Table. See Interface Options for AXI4-Stream Data.

When the IP core has an AXI4 Slave interface, the default frame length value is stored in a programmable register in the IP core. You can change the default frame length during run time. When the default frame length register is changed in the middle of a frame, the TLAST counter state is reset to zero and the TLAST signal is asserted early. You can find the address for the programmable TLAST register in the IP core generation report.
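The autogenerated TLAST behavior on the master interface amounts to a counter over valid samples. The sketch below illustrates that counting rule only. The name `auto_tlast` is hypothetical, and the sketch does not model the programmable register or the early assertion that occurs when the register changes in the middle of a frame.

```python
def auto_tlast(valid, frame_length):
    """Return the TLAST value for each cycle of a Valid sequence.

    TLAST is asserted on the valid sample that brings the count up to
    `frame_length`; the counter then restarts for the next frame.
    """
    count = 0
    tlast = []
    for is_valid in valid:
        last = False
        if is_valid:
            count += 1
            if count == frame_length:
                last = True
                count = 0
        tlast.append(last)
    return tlast


valid = [1, 1, 0, 1, 1, 1, 1]
tlast = auto_tlast(valid, frame_length=3)
print(tlast)  # [False, False, False, True, False, False, True]
```

Cycles in which Valid is low do not advance the counter, so the frame boundary always falls on the third valid sample regardless of gaps in the stream.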

Frame-Based Modeling

You can design your DUT to operate on frames of data and map the data ports to a streaming interface. To map frame ports (vectors, matrices, and complex matrices) to an AXI4-Stream interface, use the frame-to-sample optimization. For more information, see Model Design for Frame-Based IP Core Generation.

Legacy Frame-Based Modeling

Design your algorithm to operate on a stream of samples and model the data signal as a vector. To operate in this mode, in the HDL Coder Workflow Advisor, in Task 1.2. Set Target Interface > Interface Options, set the Sample Packing Dimension to None.

Note

This modeling style will be deprecated in a future release.

Data and Valid Signal Modeling Requirements

When you map vector ports to AXI4-Stream interfaces:

  • Connect each DUT input vector data port to a Serializer1D block.

    The Serializer1D block must have a ValidOut port and the Ratio set to the vector bit width.

  • Connect each DUT output vector data port to a Deserializer1D block.

    The Deserializer1D block must have a ValidIn port and the Ratio set to the vector bit width.

  • Connect each scalar port that maps to an AXI4-Lite interface to a Rate Transition block.

    The ratio in the Rate Transition block must match the ratio in the Serializer1D and Deserializer1D blocks.

  • Each scalar port that maps to an external port must have the same sample time as the streaming algorithm subsystem.

The streaming algorithm subsystem follows the same Data and Valid signal modeling pattern as the pattern for mapping scalar ports to an AXI4-Stream interface. See Model Data and Valid Signals in Simulink. When you use frame-based modeling, you cannot use protocol signals other than Data and Valid. For example, Ready and TLAST are not supported.

Example

To map vector ports to AXI4-Stream interfaces, open the hdlcoder_sfir_fixed_vector.slx model. In the hdlcoder_sfir_fixed_vector.slx model, the symmetric_fir block is the streaming algorithm subsystem.

Model Designs with Multiple Streaming Channels

When you run the IP Core Generation workflow, you can map multiple scalar DUT ports to AXI4-Stream Master and AXI4-Stream Slave channels. In legacy frame-based mode, you can use at most one AXI4-Stream Master channel and one AXI4-Stream Slave channel.

Note

In sample-based mode, when you use multiple streaming channels, HDL Coder generates the Ready signal but does not generate the back pressure logic. If you want your design to handle back pressure, model the Ready signal in your design.

To learn more, see Generate HDL IP Core with Multiple AXI4-Stream and AXI4 Master Interfaces.

When you model your DUT by using the frame-based mode, you can map multiple frame DUT ports to AXI4-Stream Master and AXI4-Stream Slave channels. In frame-based mode, HDL Coder generates the Ready and Valid signals for all the streaming ports.

Model Designs That Have Multiple Sample Rates

When you run the IP Core Generation workflow, you can use HDL Coder for designs that have multiple sample rates. When you map interface ports to AXI4-Stream Master or AXI4-Stream Slave interfaces in a multirate design, the DUT ports mapped to these interfaces can run at the fastest rate of the design or at rates slower than the design rate.

HDL Coder runs the DUT ports mapped to AXI4-Stream master and slave interfaces at rates slower than the model design rate by:

  • Setting the AXI4-Stream master channel valid signal to high at the first cycle every N clock cycles. For example, if the design rate is eight times faster than the slow rate DUT ports, the valid signal is high for the first clock cycle every eight clock cycles.

  • Asserting back pressure on the AXI4-Stream slave interface to make sure that the incoming data is streamed at the rate of one data frame every N clock cycles. For example, if the design rate is eight times faster than the slow rate DUT ports, the first frame is streamed at clock cycle one, the second frame at clock cycle nine, and so on.
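The valid-signal pattern for slow-rate ports can be sketched as follows. This is an illustration of the timing described in the bullets above, with cycles counted from zero. The name `slow_rate_valid` is hypothetical.

```python
def slow_rate_valid(num_cycles, n):
    """Return a Valid pattern that is high on the first cycle of every n cycles."""
    return [cycle % n == 0 for cycle in range(num_cycles)]


# Design rate is eight times faster than the slow-rate DUT ports.
valid = slow_rate_valid(17, 8)
high_cycles = [cycle for cycle, is_high in enumerate(valid) if is_high]
print(high_cycles)  # [0, 8, 16]
```

Counting from one instead of zero, these are clock cycles one, nine, and seventeen, which matches the frame timing in the slave interface example.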

When you map the AXI4-Stream Interface DUT port to the fastest rate in the design, the valid signal is high, making sure there is no back pressure on the AXI4-Stream slave interface.

When you design models that have multiple sample rates, all signals of each AXI4-Stream interface, such as the data signal, the valid signal, and all the optional signals, must be mapped at the same rate.

To learn more, see Multirate IP Core Generation.

Interface Options for AXI4-Stream Data

When you run the IP Core Generation workflow on a model that has vector data, you can specify whether the vector data is treated as a sample or as a frame by using the Sample Packing Dimension. When the vector data is treated as a sample, you can specify how the vector elements are packed together by using the Packing Mode option.

When you run the IP Core Generation workflow on a model that has an AXI4-Stream interface with none of the signals mapped to TLAST, you can specify the TLAST register value by using the DefaultFrameLength option.

Default Frame Length

When you do not model the TLAST signal, specify the default frame length (TLAST) value for the AXI4-Stream Master interface. HDL Coder creates the TLAST signal in the generated IP core and asserts the signal when the number of valid samples counts up to the value in the default frame length counter. When the generated IP core has an AXI4 Slave interface, HDL Coder generates the default frame length as a programmable register. When the default frame length register is changed in the middle of a frame, the TLAST counter state is reset to zero and the TLAST signal is asserted early. For more information, see TLAST Signal (optional). When you do not select the Generate default AXI4 slave interface option, the default frame length is generated as a constant value and not as a programmable register.

Sample Packing Dimension

Specify if the vector data is treated as a sample or as a frame:

  • None. This value is the default value. When you specify None, vectors are treated as frames and vector elements are streamed one after the other. For example, when the input is a six-by-one vector, the first vector element is streamed in the first clock cycle, the second vector element in the second clock cycle, and so on. To use this mode, the model must contain a Serializer block for the inputs and a Deserializer block for the outputs. The Packing Mode option is not available when the Sample Packing Dimension is set to None.

  • All. When you specify All, the vectors are packed together and streamed in a single clock cycle. For example, when the input is a six-by-one vector, all vector elements are packed together and streamed in a single clock cycle. In this case, you can specify how the vector elements are packed by using the Packing mode option.
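The difference between the two settings can be sketched by listing what is streamed in each clock cycle. This is a conceptual illustration only. The function name `stream_vector` is hypothetical and is not an HDL Coder API.

```python
def stream_vector(vector, sample_packing_dimension):
    """Return a list with one entry per clock cycle.

    Each entry holds the vector elements streamed in that cycle.
    """
    if sample_packing_dimension == "None":
        # Vector treated as a frame: one element per clock cycle.
        return [[element] for element in vector]
    if sample_packing_dimension == "All":
        # Vector treated as a sample: all elements in one clock cycle.
        return [list(vector)]
    raise ValueError("expected 'None' or 'All'")


vector = [10, 20, 30, 40, 50, 60]  # six-by-one input
cycles_none = stream_vector(vector, "None")
cycles_all = stream_vector(vector, "All")
print(len(cycles_none), len(cycles_all))  # 6 1
```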

Packing Mode

When you set Sample Packing Dimension to All, you can specify how HDL Coder packs vector elements, complex data, and complex vectors by specifying the Packing Mode parameter to either Bit Aligned or Power of 2 Aligned. This setting applies to the AXI4-Stream master and slave channels.

Non-Complex Vector Data Packing

  • Bit Aligned. When you set the Packing Mode to this setting, HDL Coder packs the vector elements directly next to each other. If the packed bit width is less than the AXI4-Stream channel width, then HDL Coder pads the packed data with zeros to match the channel width.

    For example, if the AXI4-Stream channel width is 256 bits, there are four vector elements, and the vector elements are 30 bits long, the total data width is 120 bits. When the packing mode is set to Bit Aligned, HDL Coder packs the AXI4-Stream data as shown in this diagram.

    zero padded bit aligned vector data

  • Power of 2 Aligned. In this mode, the vector elements are first padded with zeros to the closest power of two boundary. Then, the padded elements are packed together. If the packed vector bit width is less than the AXI4-Stream channel width, then the packed data is padded with zeros to match the channel width.

    For example, if the AXI4-Stream channel width is 256 bits, there are four vector elements, and the vector elements are 30 bits long, the total data width is 120 bits. When the packing mode is set to Power of 2 Aligned, HDL Coder packs the AXI4-Stream data as shown in this diagram.

    zero padded power of two aligned data

    Each vector element of bit width 30 is padded with zeroes of bit width two to extend the vector element width to 32, the nearest power of two boundary.
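The two packing modes can be compared with a short bit-packing sketch. This is an illustration, not the HDL Coder implementation. The name `pack_vector` is hypothetical. The sketch assumes the first vector element occupies the least significant bits and that zero padding fills the upper bits, so check the diagrams and the generated IP core report for the actual bit order.

```python
def next_power_of_two(width):
    """Return the smallest power of two that is >= width (width >= 1)."""
    return 1 << (width - 1).bit_length()


def pack_vector(elements, element_width, channel_width, mode):
    """Pack unsigned elements into one channel word.

    Returns (word, packed_width). Bits above packed_width are zero, which
    corresponds to the zero padding up to the channel width.
    Assumption: element 0 sits in the least significant bits.
    """
    if mode == "Bit Aligned":
        slot = element_width
    elif mode == "Power of 2 Aligned":
        slot = next_power_of_two(element_width)
    else:
        raise ValueError("unknown packing mode")
    packed_width = slot * len(elements)
    if packed_width > channel_width:
        raise ValueError("packed data does not fit in the channel")
    word = 0
    for index, value in enumerate(elements):
        if not 0 <= value < (1 << element_width):
            raise ValueError("element does not fit in element_width bits")
        word |= value << (index * slot)
    return word, packed_width


elements = [1, 2, 3, 4]  # four 30-bit vector elements
bit_word, bit_width = pack_vector(elements, 30, 256, "Bit Aligned")
p2_word, p2_width = pack_vector(elements, 30, 256, "Power of 2 Aligned")
print(bit_width, p2_width)  # 120 128
```

In this run, Bit Aligned leaves 136 bits of padding in the 256-bit channel, and Power of 2 Aligned leaves 128 bits after each element grows from 30 to 32 bits.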

Complex Data Packing.  When the input data is a complex number, HDL Coder pads the real and imaginary parts with zeroes to the closest power of two boundary. HDL Coder then packs the data together and pads the data with zeroes to match the channel width. For example, if the AXI4-Stream channel width is 64 bits and the complex data type is ufix7, HDL Coder packs the stream data as shown in this image:

complex data with zero padding to the closest power of two boundary

HDL Coder pads the real and imaginary parts with zeroes of bit width one to extend each part to a size of eight, and then pads the data with zeroes of width 48 to extend the complex data frame size to 64, the AXI4-Stream channel width.

When you use the generated software interface model to stream complex data to the target board, you must pack the complex data as shown in the example images.

Complex Vector Data Packing.  To model complex data signals as frames on the DUT boundary follow the frame-based modeling requirements. In this case, HDL Coder packs the complex data elements of the frame as described in the Complex Data Packing section.

Alternatively, if you want to treat the entire complex vector as a sample, follow the guidelines in the Sample-Based Modeling section. HDL Coder sets the Sample Packing Dimension parameter to All and the Packing Mode parameter to Power of 2 Aligned.

For example, if the AXI4-Stream channel width is 64 bits, the complex data type is ufix7, and the complex vector has three elements, HDL Coder packs the data as shown in this image:

complex vector with zeroes padded to the closest power of two boundary

HDL Coder pads the complex data with:

  • Zeroes of bit width one after every real and imaginary vector element to extend the data size to eight, the nearest power of two boundary.

  • Zeroes of bit width 16 at the end to extend the frame size to 64, the width of the AXI4-Stream channel.
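The padding widths in the complex scalar and complex vector examples follow from simple arithmetic, sketched below. The function name `complex_padding` is hypothetical, and the sketch computes widths only. It makes no claim about where the real and imaginary parts sit within the channel word.

```python
def complex_padding(word_length, num_elements, channel_width):
    """Return (pad_per_part, pad_at_end) in bits for complex data.

    Each real and imaginary part is padded to the closest power of two
    boundary, and the packed data is then padded to the channel width.
    """
    slot = 1 << (word_length - 1).bit_length()  # closest power of two >= word_length
    pad_per_part = slot - word_length
    used_bits = 2 * slot * num_elements  # real and imaginary part per element
    if used_bits > channel_width:
        raise ValueError("packed data does not fit in the channel")
    return pad_per_part, channel_width - used_bits


# ufix7 complex scalar on a 64-bit channel
print(complex_padding(7, 1, 64))  # (1, 48)
# ufix7 complex vector with three elements on a 64-bit channel
print(complex_padding(7, 3, 64))  # (1, 16)
```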

When you use the generated software interface model to stream complex data to the target board, you must pack the complex data in the data packing format as shown in the example images.

Restrictions

When you map scalar or vector DUT ports to AXI4-Stream interfaces:

  • Xilinx® Zynq®-7000 or Intel® Quartus® Prime must be your target platform.

  • Xilinx Vivado® or Intel Quartus Prime must be your synthesis tool.

  • Processor/FPGA synchronization must be Free Running.
