Complex Divide HDL Optimized
Divide one input by another using CORDIC algorithm and generate optimized HDL code
Since R2021a
Libraries: Fixed-Point Designer HDL Support / Math Operations
Description
The Complex Divide HDL Optimized block outputs the result of dividing the scalar num by the scalar den, such that y = num/den.
Examples
Implement Hardware-Efficient Complex Divide HDL Optimized
How to use the Complex Divide HDL Optimized block.
Customize Output Value of Real Divide HDL Optimized Block When Denominator Is Zero
Use the divideByZero port to customize the value of the block output when division by zero occurs.
How to Set CORDIC Input Word Length and Maximum Shift Value to Achieve Desired Precision
Provides a starting point for the input data type and number of iterations or maximum shift value required for the CORDIC algorithm to achieve a desired accuracy.
Limitations
Data type override is not supported for the Complex Divide HDL Optimized block.
Ports
Input
num
Numerator, specified as a scalar, vector, or matrix. num and den must have the same dimensions.
Slope-bias representation is not supported for fixed-point data types.
Data Types: single | double | fixed point
Complex Number Support: Yes
den
Denominator, specified as a scalar, vector, or matrix. num and den must have the same dimensions.
Slope-bias representation is not supported for fixed-point data types.
Data Types: single | double | fixed point
Complex Number Support: Yes
validIn
Whether the input is valid, specified as a Boolean scalar. This control signal indicates when the data at the num and den input ports is valid. When this value is 1 (true), the block captures the values at the input ports num and den. When this value is 0 (false), the block ignores the input samples.
Data Types: Boolean
Output
y
Output computed by dividing num by den, such that y = num/den, returned as a complex scalar.
Tips
The data type at the output port y is specified by the Output datatype parameter.
Data Types: single | double | fixed point
divideByZero
Since R2024b
Whether the value at the y output port is the result of a division by zero operation, returned as a Boolean scalar. When the value of this signal is 1 (true), the corresponding output value at the y port is the result of division by zero. When the value of this signal is 0 (false), the corresponding output value at the y port is the result of division by a nonzero value.
See Division by Zero Behavior for a description of the default divide by zero behavior.
Dependencies
To enable this port, select the Show divide by zero port parameter.
Data Types: Boolean
validOut
Whether the output data is valid, returned as a Boolean scalar. When the value of this control signal is 1 (true), the block has successfully computed the output at port y. When this value is 0 (false), the output data is not valid.
Data Types: Boolean
Parameters
Output datatype
Data type of output y, specified as fixdt(1,18,10), single, double, fixdt(1,16,0), or as a user-specified data type expression. The type can be specified directly or expressed as a data type object, such as Simulink.NumericType.
Programmatic Use
To set the block parameter value programmatically, use the set_param function.
To get the block parameter value programmatically, use the get_param function.
| Parameter: | OutputType |
| Values: | fixdt(1,18,10) (default), single, double, fixdt(1,16,0), <data type expression> |
| Data Types: | char, string |
Example: set_param(gcb,"OutputType","fixdt(1,16,0)")
Show divide by zero port
Since R2024b
Select this parameter to show the divideByZero port.
Programmatic Use
To set the block parameter value programmatically, use the set_param function.
To get the block parameter value programmatically, use the get_param function.
| Parameter: | dbzPort |
| Values: | 0 (false) (default), 1 (true) |
| Data Types: | logical |
Example: set_param(gcb,"dbzPort",1)
Automatically select CORDIC maximum shift value based on input word length
Since R2024b
Automatically select the CORDIC maximum shift value based on input word length. When this parameter is selected, the default CORDIC maximumShiftValue is equal to wl - 1, where wl = max(num.WordLength + ~issigned(num), den.WordLength + ~issigned(den)).
Programmatic Use
To set the block parameter value programmatically, use the set_param function.
To get the block parameter value programmatically, use the get_param function.
| Parameter: | autoMaximumShiftVal |
| Values: | on (default), off |
| Data Types: | char, string |
Example: set_param(gcb,"autoMaximumShiftVal","off")
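As a rough illustration of the rule above, the following sketch works through the automatic maximum shift value for one hypothetical pair of input types. The variable names and the example word lengths are assumptions chosen for illustration; only the formula for wl comes from the parameter description.

```matlab
% Hypothetical input types (assumed for illustration only):
% num is unsigned with a 16-bit word length, den is signed with an 18-bit word length.
numWordLength = 16;  numIsSigned = false;
denWordLength = 18;  denIsSigned = true;

% An unsigned input contributes one extra bit, because ~issigned evaluates to 1.
wl = max(numWordLength + ~numIsSigned, denWordLength + ~denIsSigned);

% Automatically selected CORDIC maximum shift value.
maximumShiftValue = wl - 1;

fprintf("wl = %d, maximumShiftValue = %d\n", wl, maximumShiftValue);
```

In this sketch the signed 18-bit denominator determines wl, even though the unsigned numerator is widened by one bit.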
CORDIC maximum shift value
Since R2024b
Maximum shift value of hyperbolic vectoring CORDIC, specified as a positive integer-valued scalar. The default value for this parameter is wl - 1, where wl = max(num.WordLength + ~issigned(num), den.WordLength + ~issigned(den)).
Dependencies
To enable this parameter, clear the Automatically select CORDIC maximum shift value based on input word length parameter.
Tips
See Customizable Pipelining for more information.
Programmatic Use
To set the block parameter value programmatically, use the set_param function.
To get the block parameter value programmatically, use the get_param function.
| Parameter: | maximumShiftValue |
| Values: | 10 (default), positive integer-valued scalar |
| Data Types: | char, string |
Example: set_param(gcb,"maximumShiftValue","10")
Number of iterations per pipeline register
Since R2024b
Number of CORDIC iterations to perform per pipeline stage, specified as a positive integer-valued scalar.
See Customizable Pipelining for more information. For examples showing how this parameter affects latency and hardware resource utilization, see How to Interface with the Complex Divide HDL Optimized Block and Hardware Resource Utilization.
Programmatic Use
To set the block parameter value programmatically, use the set_param function.
To get the block parameter value programmatically, use the get_param function.
| Parameter: | nIterPerReg |
| Values: | 1 (default), positive integer-valued scalar |
| Data Types: | char, string |
Example: set_param(gcb,"nIterPerReg","2")
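The individual set_param examples above can be combined into one configuration script. This is a minimal sketch, not a definitive recipe: the block path myModel/Complex Divide HDL Optimized is hypothetical, and the sketch assumes that the model containing the block is already loaded. The parameter names and values are the ones listed in the Programmatic Use tables.

```matlab
% Hypothetical block path; replace it with the path to the block in your model.
blk = "myModel/Complex Divide HDL Optimized";

% Output data type of port y.
set_param(blk,"OutputType","fixdt(1,18,10)");

% Show the divideByZero output port.
set_param(blk,"dbzPort",1);

% Turn off automatic selection so that maximumShiftValue can be set manually.
set_param(blk,"autoMaximumShiftVal","off");
set_param(blk,"maximumShiftValue","10");

% Perform two CORDIC iterations per pipeline register.
set_param(blk,"nIterPerReg","2");

% Read a value back to confirm the setting.
get_param(blk,"nIterPerReg")
```

The automatic selection is turned off before maximumShiftValue is set because the CORDIC maximum shift value parameter is enabled only when the automatic option is cleared.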
Tips
The blocks Divide by Constant HDL Optimized, Real Divide HDL Optimized, and Complex Divide HDL Optimized all perform the division operation and generate optimized HDL code.
- Real Divide HDL Optimized and Complex Divide HDL Optimized are based on the CORDIC algorithm. These blocks accept a wide variety of inputs, but have greater latency.
- Divide by Constant HDL Optimized accepts only real inputs and a constant divisor. Use of this block consumes DSP slices, but will complete the division operation in fewer cycles and at a higher clock rate. 
Algorithms
CORDIC is an acronym for COordinate Rotation DIgital Computer. The Givens rotation-based CORDIC algorithm is one of the most hardware-efficient algorithms available because it requires only iterative shift-add operations (see References). The CORDIC algorithm eliminates the need for explicit multipliers. Using CORDIC, you can calculate various functions such as sine, cosine, arcsine, arccosine, arctangent, and vector magnitude. You can also use this algorithm for divide, square root, hyperbolic, and logarithmic functions.
The precision of the CORDIC algorithm is a function of the data type used and the maximum shift value or number of iterations of the CORDIC kernel. Using a data type with a larger word length and performing more iterations of the CORDIC algorithm can reduce the numeric error of the result. However, doing so also increases the latency of the computation and uses more hardware resources. For more information, see How to Set CORDIC Input Word Length and Maximum Shift Value to Achieve Desired Precision.
Division by Zero Behavior
For fixed-point inputs when the denominator den is zero:
For floating-point inputs, the Complex Divide HDL Optimized block follows IEEE® Standard 754.
Tip
Enable the divideByZero port to output a flag when the block is given zero as an input value for the divisor.
How to Interface with the Complex Divide HDL Optimized Block
Because it is fully pipelined, the Complex Divide HDL Optimized block can accept input data on any cycle, including consecutive cycles. To send input data to the block, set the validIn signal to true. When the block has finished the computation and is ready to send the output, it sets validOut to true for one clock cycle. For inputs sent on consecutive cycles, validOut is also true on consecutive cycles. Both the numerator and the denominator must be sent together on the same cycle.
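The handshake described above can be pictured with a small behavioral model. This is only a sketch under stated assumptions: it treats validOut as validIn delayed by a fixed block latency, uses a latency of 41 cycles (the value for 18-bit signed inputs with one iteration per pipeline register), and uses a hypothetical validIn pattern. It does not model the data path or the generated HDL.

```matlab
% Assumed latency: 18-bit signed inputs, one iteration per pipeline register.
latency = 41;

% Hypothetical validIn pattern, one entry per clock cycle.
validIn = logical([1 1 0 1 0 0 1]);

% Behavioral model: validOut is validIn delayed by the block latency.
validOut = [false(1, latency), validIn];

inCycles  = find(validIn);    % cycles on which inputs are captured
outCycles = find(validOut);   % cycles on which outputs are valid

% First row: input cycles. Second row: corresponding output cycles.
disp([inCycles; outCycles]);
```

Inputs captured on consecutive cycles produce valid outputs on consecutive cycles, each offset by the same latency.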

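The latency depends on the input data type, as summarized in the table. When the input is a fixed-point or scaled double fi, the word lengths of the inputs num and den can differ. In the table, u represents the input with the larger word length.
| Input Type | Latency |
|---|---|
| Fixed point or scaled double | ceil(N/nIterPerReg) + 1, where N is the total number of rotation, normalization, and CORDIC division iterations, which depends on u.WordLength and the CORDIC maximum shift value, and nIterPerReg is the value of the Number of iterations per pipeline register parameter |
| Floating point | 0 |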
The Complex Divide HDL Optimized block uses a fully pipelined architecture that implements iterative CORDIC-based rotation, normalization, and a CORDIC-based division algorithm. If the inputs num and den are fixed-point or scaled double data types, the block uses multiple pipeline stages for computation. If both inputs are signed and have the same word length, then rotating the denominator into a real value requires num.WordLength iterations. The normalization requires nextpow2(num.WordLength) iterations. The number of CORDIC division iterations depends on the value of the CORDIC maximum shift value parameter. A larger word length can provide higher resolution, but requires more iterations to process. The Complex Divide HDL Optimized block can perform multiple iterations per pipeline stage, which results in lower latency at the cost of a longer critical path in the generated HDL code.
For example, if num and den are signed and have 18-bit word lengths, then rotating the denominator into a real value requires 18 iterations and normalization requires 5 iterations. If the Automatically select CORDIC maximum shift value based on input word length parameter is selected, the CORDIC maximum shift value is 18 - 1 = 17 and requires 17 iterations. The total number of iterations is 18 + 5 + 17 = 40 and the latency of the block is ceil((total number of iterations)/nIterPerReg) + 1. If the number of iterations per pipeline register is set to 1, then the block latency is 41; if the number of iterations per pipeline register is set to 2, then the block latency is 21. If the number of iterations per pipeline register is greater than the total number of required iterations, the block performs all iterations in one pipeline stage and the total latency is minimized to 2.
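The arithmetic in this example can be reproduced with a few lines of MATLAB. This is a sketch that assumes signed inputs with equal word lengths and automatic selection of the maximum shift value, as in the example; the variable names are chosen for illustration.

```matlab
% Assumptions: signed inputs, equal word lengths, automatic maximum shift value.
wordLength = 18;

rotationIters      = wordLength;             % rotate the denominator into a real value
normalizationIters = nextpow2(wordLength);   % 5 for an 18-bit word length
maximumShiftValue  = wordLength - 1;         % 17 CORDIC division iterations
totalIters = rotationIters + normalizationIters + maximumShiftValue;

% Latency as a function of the number of iterations per pipeline register.
latency = @(nIterPerReg) ceil(totalIters ./ nIterPerReg) + 1;

for nIterPerReg = [1 2 3 64]
    fprintf("nIterPerReg = %2d: latency = %d\n", nIterPerReg, latency(nIterPerReg));
end
```

Setting nIterPerReg to a value larger than the total iteration count (64 in this sketch) collapses all iterations into one pipeline stage, which gives the minimum latency of 2.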
Hardware Resource Utilization
This block supports HDL code generation using the Simulink® HDL Workflow Advisor. For an example, see HDL Code Generation and FPGA Synthesis from Simulink Model (HDL Coder) and Implement Digital Downconverter for FPGA (DSP HDL Toolbox).
This example data was generated by synthesizing the block on a Xilinx® Zynq®-7000 xc7z045 SoC. The synthesis tool was Vivado® v2023.1.2.
The following synthesis results show the effect of the Number of iterations per pipeline register parameter on the latency and hardware resource utilization.
nIterPerReg = 1
These parameters were used for synthesis:
- Input data type: sfix18_en10
- Output data type: sfix18_en10
- Input dimension: scalar
- Automatically select CORDIC maximum shift value based on input word length: on
- Number of iterations per pipeline register: 1
- Target frequency: 300 MHz
- Latency for this configuration: 41
| Resource | Usage | Available | Utilization (%) |
|---|---|---|---|
| Slice LUTs | 3246 | 218600 | 1.48 |
| Slice Registers | 2668 | 437200 | 0.61 |
| DSPs | 0 | 900 | 0.00 |
| Block RAM Tile | 0 | 545 | 0.00 |
| URAM | 0 | 0 | |

| | Value |
|---|---|
| Requirement | 3.3333 ns (300 MHz) |
| Data Path Delay | 2.829 ns |
| Slack | 0.485 ns |
| Clock Frequency | 351.08 MHz |
nIterPerReg = 2
These parameters were used for synthesis:
- Input data type: sfix18_en10
- Output data type: sfix18_en10
- Input dimension: scalar
- Automatically select CORDIC maximum shift value based on input word length: on
- Number of iterations per pipeline register: 2
- Target frequency: 150 MHz
- Latency for this configuration: 21
| Resource | Usage | Available | Utilization (%) |
|---|---|---|---|
| Slice LUTs | 2999 | 218600 | 1.37 |
| Slice Registers | 1393 | 437200 | 0.32 |
| DSPs | 0 | 900 | 0.00 |
| Block RAM Tile | 0 | 545 | 0.00 |
| URAM | 0 | 0 | |

| | Value |
|---|---|
| Requirement | 6.6667 ns (150 MHz) |
| Data Path Delay | 3.153 ns |
| Slack | 3.495 ns |
| Clock Frequency | 315.29 MHz |
nIterPerReg = 3
These parameters were used for synthesis:
- Input data type: sfix18_en10
- Output data type: sfix18_en10
- Input dimension: scalar
- Automatically select CORDIC maximum shift value based on input word length: on
- Number of iterations per pipeline register: 3
- Target frequency: 150 MHz
- Latency for this configuration: 15
| Resource | Usage | Available | Utilization (%) |
|---|---|---|---|
| Slice LUTs | 2988 | 218600 | 1.37 |
| Slice Registers | 1008 | 437200 | 0.23 |
| DSPs | 0 | 900 | 0.00 |
| Block RAM Tile | 0 | 545 | 0.00 |
| URAM | 0 | 0 | |

| | Value |
|---|---|
| Requirement | 6.6667 ns (150 MHz) |
| Data Path Delay | 4.394 ns |
| Slack | 2.266 ns |
| Clock Frequency | 227.24 MHz |
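The Clock Frequency values in the three timing tables appear to be consistent with the reciprocal of the required clock period minus the slack. The following sketch checks that relationship against the reported numbers. The relationship is an observation about these tables, not a statement from the synthesis tool documentation, and the variable names are chosen for illustration.

```matlab
% Values taken from the timing tables for nIterPerReg = 1, 2, and 3.
targetMHz   = [300 150 150];
slackNs     = [0.485 3.495 2.266];
reportedMHz = [351.08 315.29 227.24];

% Required clock period in nanoseconds for each target frequency.
periodNs = 1000 ./ targetMHz;

% Assumed relationship: achievable frequency = 1 / (period - slack).
estimatedMHz = 1000 ./ (periodNs - slackNs);

% Show the estimate next to the reported value for each configuration.
disp([estimatedMHz; reportedMHz]);
```

Under this assumption, the design with three iterations per pipeline register has the longest critical path and therefore the lowest achievable clock frequency, which matches the trade-off between latency and critical path length described earlier.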
References
[1] Volder, Jack E. “The CORDIC Trigonometric Computing Technique.” IRE Transactions on Electronic Computers. EC-8, no. 3 (Sept. 1959): 330–334.
[2] Andraka, Ray. “A Survey of CORDIC Algorithm for FPGA Based Computers.” In Proceedings of the 1998 ACM/SIGDA Sixth International Symposium on Field Programmable Gate Arrays, 191–200. https://dl.acm.org/doi/10.1145/275107.275139.
[3] Walther, J.S. “A Unified Algorithm for Elementary Functions.” In Proceedings of the May 18-20, 1971 Spring Joint Computer Conference, 379–386. https://dl.acm.org/doi/10.1145/1478786.1478840.
[4] Schelin, Charles W. “Calculator Function Approximation.” The American Mathematical Monthly, no. 5 (May 1983): 317–325. https://doi.org/10.2307/2975781.
Extended Capabilities
Slope-bias representation is not supported for fixed-point data types.
HDL Coder™ provides additional configuration options that affect HDL implementation and synthesized logic.
This block has one default HDL architecture.
| General | |
|---|---|
| ConstrainedOutputPipeline | Number of registers to place at the outputs by moving existing delays within your design. Distributed pipelining does not redistribute these registers. The default is |
| In R2024b: FlattenHierarchy | Remove PWM Reference Generator block hierarchy from generated HDL code. The default is |
| InputPipeline | Number of input pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is |
| OutputPipeline | Number of output pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is |
Supports binary-point scaled fixed-point data types only.
Version History
Introduced in R2021a
Several improvements have been made to the Complex Divide HDL Optimized block:
- Custom pipelining is supported via the new CORDIC maximum shift value and Number of iterations per pipeline register parameters. 
- The latency of this block has been reduced. Latency depends on the specified data type and pipeline configuration. See How to Interface with the Complex Divide HDL Optimized Block for more information. 
- HDL resource utilization has been further optimized to require fewer hardware resources. See Hardware Resource Utilization for example synthesis results. 
- An optional divideByZero port has been added to output a flag when the corresponding output is a result of division by zero. 