Vector functions#
Functions#
Element-by-element clipping of a value.
Multiply a vector of complex f16 by a vector of real f16.
Multiply complex f16 vectors.
Adds a constant offset to each element of a vector.
Subtract two vectors of 16 bit floats.
Fills a constant value into a floating-point vector.
Computes the dot product of two vectors.
Computes the absolute value of a vector on an element-by-element basis.
Computes the dot product of two complex vectors.
Negate a vector of 16 bit floats.
Elementwise multiply two vectors of 16 bit floats.
Add two vectors of 16 bit floats.
Add two vectors of signed 8 bit integers.
Clamp all signed 8 bit integers in a vector to a certain range.
Copy one 16 bit float vector into another.
Scale a vector of 16-bit floats by a float16 scale factor.
Conjugates the elements of a complex data vector.
Computes the magnitude squared of the elements of a complex data vector.
Function Documentation#
sl_math_mvp_vector_clip_f16#
sl_status_t sl_math_mvp_vector_clip_f16 (const float16_t * input, float16_t * output, float16_t low, float16_t high, size_t num_elements)
Element-by-element clipping of a value.
[in] | input | Input vector, input A. |
[out] | output | Output vector, output Z. |
[in] | low | Lower bound. |
[in] | high | Higher bound. |
[in] | num_elements | Length of input and output vectors. |
This function clips each element of the input vector so that it lies between the two bounds, low and high. Both vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20) elements, and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 68 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_clip.h
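As a rough illustration of the clipping semantics described for sl_math_mvp_vector_clip_f16, the following portable C sketch constrains each element to the range [low, high]. It is a reference model only, not the MVP implementation: `ref_vector_clip` is a hypothetical name, and `float` stands in for `float16_t`, which is not a standard C type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model of element-by-element clipping.
 * float is used as a stand-in for float16_t. */
void ref_vector_clip(const float *input, float *output,
                     float low, float high, size_t num_elements)
{
  assert(input != NULL && output != NULL);
  assert(low <= high);
  for (size_t i = 0; i < num_elements; i++) {
    float v = input[i];
    if (v < low) {
      v = low;        /* below the lower bound: replace with low */
    } else if (v > high) {
      v = high;       /* above the upper bound: replace with high */
    }
    output[i] = v;
  }
}
```

A model like this can be useful for checking results from the accelerated function on a host machine.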
sl_math_mvp_complex_vector_mult_real_f16#
sl_status_t sl_math_mvp_complex_vector_mult_real_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)
Multiply a vector of complex f16 by a vector of real f16.
[in] | input_a | The complex vector, input A. |
[in] | input_b | The real vector, input B. |
[out] | output | The complex output vector, output Z. |
[in] | num_elements | The number of elements in the input and output vectors. |
This function will perform the following operation: Z = A * B. Both vectors must be of the same length. If both vector buffers are 4-byte aligned, the function will operate twice as fast using MVP complex processing. Maximum vector length is 1M (2^20) elements in the 4-byte aligned case, and 512K when one or more of the complex vectors are 2-byte aligned.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 67 of file platform/compute/math/mvp/inc/sl_math_mvp_complex_vector_mult.h
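The operation Z = A * B with a complex A and a real B can be sketched in portable C as below. This is a reference model under stated assumptions, not the MVP implementation: it assumes that complex values are stored interleaved as (real, imaginary) pairs, which this page does not state, and it uses `float` in place of `float16_t`. The name `ref_complex_mult_real` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model: Z[i] = A[i] * B[i], A and Z complex, B real.
 * Assumption: complex values are stored interleaved as (re, im) pairs.
 * float is used as a stand-in for float16_t. */
void ref_complex_mult_real(const float *input_a, const float *input_b,
                           float *output, size_t num_elements)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    output[2 * i]     = input_a[2 * i]     * input_b[i]; /* real part */
    output[2 * i + 1] = input_a[2 * i + 1] * input_b[i]; /* imaginary part */
  }
}
```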
sl_math_mvp_complex_vector_mult_f16#
sl_status_t sl_math_mvp_complex_vector_mult_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)
Multiply complex f16 vectors.
[in] | input_a | Complex input vector A. |
[in] | input_b | Complex input vector B. |
[out] | output | Complex output vector. |
[in] | num_elements | The number of complex elements in the input and output vectors. |
This function will multiply two complex vectors element by element. It is assumed that both input vectors and the output vector have the same length. All input and output buffers must be 4-byte aligned. Maximum vector length is 1M (2^20) elements.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 91 of file platform/compute/math/mvp/inc/sl_math_mvp_complex_vector_mult.h
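For reference, an element-wise complex multiplication follows (a + bi)(c + di) = (ac - bd) + (ad + bc)i. The sketch below models that in portable C. It assumes interleaved (real, imaginary) storage, which this page does not specify, uses `float` in place of `float16_t`, and the name `ref_complex_vector_mult` is hypothetical. It is not the MVP implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model of element-wise complex multiplication.
 * Assumption: interleaved (re, im) storage; float stands in for float16_t. */
void ref_complex_vector_mult(const float *input_a, const float *input_b,
                             float *output, size_t num_elements)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    float ar = input_a[2 * i], ai = input_a[2 * i + 1];
    float br = input_b[2 * i], bi = input_b[2 * i + 1];
    output[2 * i]     = ar * br - ai * bi; /* real part */
    output[2 * i + 1] = ar * bi + ai * br; /* imaginary part */
  }
}
```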
sl_math_mvp_vector_offset_f16#
sl_status_t sl_math_mvp_vector_offset_f16 (const float16_t * input, const float16_t offset, float16_t * output, size_t num_elements)
Adds a constant offset to each element of a vector.
[in] | input | Input vector. |
[in] | offset | Offset value. |
[out] | output | Output vector. |
[in] | num_elements | The number of elements in the input and output vectors. |
Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 62 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_offset.h
sl_math_mvp_vector_sub_f16#
sl_status_t sl_math_mvp_vector_sub_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)
Subtract two vectors of 16 bit floats.
[in] | input_a | First input vector, input A. |
[in] | input_b | Second input vector, input B. |
[out] | output | Output vector, output Z. |
[in] | num_elements | Length of all input and output vectors. |
This function will perform the following operation: Z = A - B. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 66 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_sub.h
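Several functions on this page document a faster path and a larger length limit when all buffers are 4-byte aligned. A caller might check those conditions before a call with helpers along these lines. Both helper names are hypothetical and are not part of the MVP math library; the limits are the ones quoted for the real-valued f16 functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the address is a multiple of 4 bytes. */
bool ref_is_4_byte_aligned(const void *ptr)
{
  return ((uintptr_t)ptr & 0x3u) == 0;
}

/* Hypothetical helper: documented element limit for a two-input,
 * one-output real-valued f16 function. 2M elements when every buffer
 * is 4-byte aligned, otherwise 1M (2^20) elements. */
size_t ref_max_elements(const void *input_a, const void *input_b,
                        const void *output)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  bool all_aligned = ref_is_4_byte_aligned(input_a)
                     && ref_is_4_byte_aligned(input_b)
                     && ref_is_4_byte_aligned(output);
  return all_aligned ? ((size_t)2 << 20) : ((size_t)1 << 20);
}
```

Because a 16-bit element occupies two bytes, a pointer to an odd-indexed element of a 4-byte aligned array is only 2-byte aligned, so sub-vectors can fall onto the slower path.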
sl_math_mvp_vector_fill_f16#
sl_status_t sl_math_mvp_vector_fill_f16 (float16_t * output, const float16_t value, size_t num_elements)
Fills a constant value into a floating-point vector.
[out] | output | Vector to fill. |
[in] | value | Fill value. |
[in] | num_elements | Length of the output vector. |
Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 61 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_fill.h
sl_math_mvp_vector_dot_product_f16#
sl_status_t sl_math_mvp_vector_dot_product_f16 (const float16_t * input_a, const float16_t * input_b, size_t num_elements, float16_t * output)
Computes the dot product of two vectors.
[in] | input_a | Input vector a. |
[in] | input_b | Input vector b. |
[in] | num_elements | The number of elements in the input vectors. |
[out] | output | The result. |
The vectors are multiplied element-by-element and then summed.
All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Note
Depending on the function arguments, the MVP implementation can calculate the dot product in different ways that may affect the rounding errors. If the same input vectors are processed with different memory alignment, the results may not be identical.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 72 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_dot_product.h
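The rounding note for sl_math_mvp_vector_dot_product_f16 can be illustrated with a portable C sketch. Floating-point addition is not associative, so summing the same products in a different order can give a different result. The two functions below are hypothetical models: one uses a single accumulator, the other splits even and odd indices into two accumulators. The actual MVP summation order is not described on this page, so the two-accumulator scheme is only an assumption chosen to show the effect, and `float` stands in for `float16_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: sum the products in index order, one accumulator.
 * volatile forces each partial sum to be rounded to float. */
float ref_dot_sequential(const float *a, const float *b, size_t n)
{
  assert(a != NULL && b != NULL);
  volatile float acc = 0.0f;
  for (size_t i = 0; i < n; i++) {
    acc = acc + a[i] * b[i];
  }
  return acc;
}

/* Hypothetical model: even and odd indices use separate accumulators,
 * which are combined at the end. */
float ref_dot_two_lane(const float *a, const float *b, size_t n)
{
  assert(a != NULL && b != NULL);
  volatile float lane0 = 0.0f;
  volatile float lane1 = 0.0f;
  for (size_t i = 0; i < n; i++) {
    if (i % 2 == 0) {
      lane0 = lane0 + a[i] * b[i];
    } else {
      lane1 = lane1 + a[i] * b[i];
    }
  }
  return lane0 + lane1;
}
```

With a = {1e8, 1, -1e8, 1} and b = {1, 1, 1, 1} in single precision, the sequential order loses the first 1 when it is added to 1e8 and returns 1, while the two-accumulator order returns 2. Code that compares results across alignments should therefore use a tolerance rather than exact equality.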
sl_math_mvp_vector_abs_f16#
sl_status_t sl_math_mvp_vector_abs_f16 (float16_t * input, float16_t * output, size_t num_elements)
Computes the absolute value of a vector on an element-by-element basis.
[in] | input | Input vector. |
[out] | output | Output vector. |
[in] | num_elements | The number of elements in the vectors. |
The output vector can be the same as, or different from, the input vector. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 62 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_abs.h
sl_math_mvp_complex_vector_dot_product_f16#
sl_status_t sl_math_mvp_complex_vector_dot_product_f16 (float16_t * input_a, float16_t * input_b, size_t num_elements, float16_t * output)
Computes the dot product of two complex vectors.
[in] | input_a | Input vector a. |
[in] | input_b | Input vector b. |
[in] | num_elements | The number of complex elements in the input vectors. |
[out] | output | Dot product result. |
The vectors are multiplied element-by-element and then summed.
Maximum vector length is 1M (2^20), and all vectors must be 4-byte aligned.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 63 of file platform/compute/math/mvp/inc/sl_math_mvp_complex_vector_dot_product.h
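A complex dot product as described above (element-by-element multiplication followed by summation) might be modelled in portable C as follows. This sketch makes several assumptions that this page does not confirm: complex values are stored interleaved as (real, imaginary) pairs, neither input is conjugated, and the result is written as one (real, imaginary) pair. `float` stands in for `float16_t`, and `ref_complex_dot_product` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model of a complex dot product.
 * Assumptions: interleaved (re, im) storage, no conjugation of either
 * input, float as a stand-in for float16_t. */
void ref_complex_dot_product(const float *input_a, const float *input_b,
                             size_t num_elements, float *output)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  float sum_re = 0.0f;
  float sum_im = 0.0f;
  for (size_t i = 0; i < num_elements; i++) {
    float ar = input_a[2 * i], ai = input_a[2 * i + 1];
    float br = input_b[2 * i], bi = input_b[2 * i + 1];
    sum_re += ar * br - ai * bi; /* real part of A[i] * B[i] */
    sum_im += ar * bi + ai * br; /* imaginary part of A[i] * B[i] */
  }
  output[0] = sum_re;
  output[1] = sum_im;
}
```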
sl_math_mvp_vector_negate_f16#
sl_status_t sl_math_mvp_vector_negate_f16 (const float16_t * input, float16_t * output, size_t num_elements)
Negate a vector of 16 bit floats.
[in] | input | Input vector. |
[out] | output | Output vector. |
[in] | num_elements | Length of input and output vectors. |
This function will perform the following operation: Z = -A. Vectors must be of equal length. If both vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case. In-place negation is supported (input and output reference the same buffer).
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 66 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_negate.h
sl_math_mvp_vector_mult_f16#
sl_status_t sl_math_mvp_vector_mult_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)
Elementwise multiply two vectors of 16 bit floats.
[in] | input_a | First input vector, input A. |
[in] | input_b | Second input vector, input B. |
[out] | output | Output vector, output Z. |
[in] | num_elements | Length of all input and output vectors. |
This function will perform the following operation: Z[i] = A[i] * B[i]. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 66 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_mult.h
sl_math_mvp_vector_add_f16#
sl_status_t sl_math_mvp_vector_add_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)
Add two vectors of 16 bit floats.
[in] | input_a | First input vector, input A. |
[in] | input_b | Second input vector, input B. |
[out] | output | Output vector, output Z. |
[in] | num_elements | The number of elements in the vectors. |
This function will perform the following operation: Z = A + B. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 67 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_add.h
sl_math_mvp_vector_add_i8#
sl_status_t sl_math_mvp_vector_add_i8 (const int8_t * input_a, const int8_t * input_b, int8_t * output, size_t num_elements)
Add two vectors of signed 8 bit integers.
[in] | input_a | First input vector, input A. |
[in] | input_b | Second input vector, input B. |
[out] | output | Output vector, output Z. |
[in] | num_elements | The number of elements in the vectors. |
All vectors must be of the same length. This function will perform the following operation: Z = A + B. The addition saturates, so the result never wraps around. If a sum would overflow (> 127), the result is 127. If a sum would underflow (< -128), the result is -128.
Returns
SL_STATUS_OK.
Definition at line 92 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_add.h
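The saturating behaviour described for sl_math_mvp_vector_add_i8 can be sketched in portable C by computing each sum in a wider type and limiting it to the int8_t range. This is a reference model with the hypothetical name `ref_vector_add_i8`, not the MVP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model of a saturating int8 vector addition. */
void ref_vector_add_i8(const int8_t *input_a, const int8_t *input_b,
                       int8_t *output, size_t num_elements)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    /* Compute in int so the intermediate sum cannot overflow. */
    int sum = (int)input_a[i] + (int)input_b[i];
    if (sum > INT8_MAX) {
      sum = INT8_MAX;   /* would overflow: saturate at 127 */
    } else if (sum < INT8_MIN) {
      sum = INT8_MIN;   /* would underflow: saturate at -128 */
    }
    output[i] = (int8_t)sum;
  }
}
```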
sl_math_mvp_clamp_i8#
sl_status_t sl_math_mvp_clamp_i8 (int8_t * data, size_t num_elements, int8_t min, int8_t max)
Clamp all signed 8 bit integers in a vector to a certain range.
[inout] | data | Vector with data values. |
[in] | num_elements | The number of elements in the vector. |
[in] | min | Minimum value, after operation no elements will be < min. |
[in] | max | Maximum value, after operation no elements will be > max. |
Given min and max values, this function ensures that no element of the vector is < min or > max. Any element that is < min is set to min, and any element that is > max is set to max. The vector is modified in place.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 65 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_clamp.h
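The in-place clamp described for sl_math_mvp_clamp_i8 corresponds to the following portable C sketch. It is a reference model with the hypothetical name `ref_clamp_i8`, not the MVP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model of an in-place int8 clamp. */
void ref_clamp_i8(int8_t *data, size_t num_elements, int8_t min, int8_t max)
{
  assert(data != NULL);
  assert(min <= max);
  for (size_t i = 0; i < num_elements; i++) {
    if (data[i] < min) {
      data[i] = min;    /* raise values below the range to min */
    } else if (data[i] > max) {
      data[i] = max;    /* lower values above the range to max */
    }
  }
}
```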
sl_math_mvp_vector_copy_f16#
sl_status_t sl_math_mvp_vector_copy_f16 (const float16_t * input, float16_t * output, size_t num_elements)
Copy one 16 bit float vector into another.
[in] | input | Input vector, input A. |
[out] | output | Output vector, output Z. |
[in] | num_elements | Length of input and output vectors. |
This function will perform the following operation: Z = A. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 65 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_copy.h
sl_math_mvp_vector_scale_f16#
sl_status_t sl_math_mvp_vector_scale_f16 (const float16_t * input, float16_t scale, float16_t * output, size_t num_elements)
Scale a vector of 16-bit floats by a float16 scale factor.
[in] | input | The input vector. |
[in] | scale | The value by which to scale the vector. |
[out] | output | Output vector, output Z. |
[in] | num_elements | Length of input and output vectors. |
This function will perform the following operation: Z[i] = A[i] * scale. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 66 of file platform/compute/math/mvp/inc/sl_math_mvp_vector_scale.h
sl_math_mvp_complex_vector_conjugate_f16#
sl_status_t sl_math_mvp_complex_vector_conjugate_f16 (float16_t * input, float16_t * output, size_t num_elements)
Conjugates the elements of a complex data vector.
[in] | input | Input vector. |
[out] | output | Output vector. |
[in] | num_elements | The number of complex elements in the vectors. |
Maximum vector length is 1M (2^20), and all vectors must be 4-byte aligned.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 61 of file platform/compute/math/mvp/inc/sl_math_mvp_complex_vector_conjugate.h
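Conjugation keeps the real part of each element and negates the imaginary part. A portable C sketch of that operation is shown below. It assumes interleaved (real, imaginary) storage, which this page does not specify, uses `float` in place of `float16_t`, and the name `ref_complex_conjugate` is hypothetical. It is not the MVP implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model of complex conjugation.
 * Assumption: interleaved (re, im) storage; float stands in for float16_t. */
void ref_complex_conjugate(const float *input, float *output,
                           size_t num_elements)
{
  assert(input != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    output[2 * i]     = input[2 * i];        /* real part is unchanged */
    output[2 * i + 1] = -input[2 * i + 1];   /* imaginary part is negated */
  }
}
```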
sl_math_mvp_complex_vector_magnitude_squared_f16#
sl_status_t sl_math_mvp_complex_vector_magnitude_squared_f16 (const float16_t * input, float16_t * output, size_t num_elements)
Computes the magnitude squared of the elements of a complex data vector.
[in] | input | Input vector. |
[out] | output | Output vector. |
[in] | num_elements | The number of complex elements in the input vector and the number of scalar elements in the output vector. |
The input shall point to a vector of complex numbers, and the output shall point to the vector of scalar values where the results will be written.
Returns
SL_STATUS_OK on success. On failure, an appropriate sl_status_t errorcode is returned.
Definition at line 64 of file platform/compute/math/mvp/inc/sl_math_mvp_complex_vector_magnitude_squared.h
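The magnitude squared of a complex value a + bi is a^2 + b^2, so each complex input element yields one scalar output element. The sketch below models this in portable C. It assumes interleaved (real, imaginary) storage for the input, which this page does not specify, uses `float` in place of `float16_t`, and the name `ref_complex_magnitude_squared` is hypothetical. It is not the MVP implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model: output[i] = re(input[i])^2 + im(input[i])^2.
 * Assumption: interleaved (re, im) input; float stands in for float16_t.
 * The input holds 2 * num_elements floats, the output num_elements floats. */
void ref_complex_magnitude_squared(const float *input, float *output,
                                   size_t num_elements)
{
  assert(input != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    float re = input[2 * i];
    float im = input[2 * i + 1];
    output[i] = re * re + im * im;
  }
}
```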