Vector functions#

Functions#

sl_status_t

sl_math_mvp_vector_add_f16(const float16_t *input_a, const float16_t *input_b, float16_t *output, size_t num_elements)

Add two vectors of 16 bit floats.

sl_status_t

sl_math_mvp_vector_add_i8(const int8_t *input_a, const int8_t *input_b, int8_t *output, size_t num_elements)

Add two vectors of signed 8 bit integers.

sl_status_t

sl_math_mvp_vector_clip_f16(const float16_t *input, float16_t *output, float16_t low, float16_t high, size_t num_elements)

Element-by-element clipping of a value.

sl_status_t

sl_math_mvp_vector_mult_f16(const float16_t *input_a, const float16_t *input_b, float16_t *output, size_t num_elements)

Elementwise multiply two vectors of 16 bit floats.

sl_status_t

sl_math_mvp_vector_copy_f16(const float16_t *input, float16_t *output, size_t num_elements)

Copy one 16 bit float vector into another.

sl_status_t

sl_math_mvp_complex_vector_dot_product_f16(float16_t *input_a, float16_t *input_b, size_t num_elements, float16_t *output)

Computes the dot product of two complex vectors.

sl_status_t

sl_math_mvp_vector_scale_f16(const float16_t *input, float16_t scale, float16_t *output, size_t num_elements)

Scale a vector of 16-bit floats by a float16 scale factor.

sl_status_t

sl_math_mvp_complex_vector_conjugate_f16(float16_t *input, float16_t *output, size_t num_elements)

Conjugates the elements of a complex data vector.

sl_status_t

sl_math_mvp_vector_sub_f16(const float16_t *input_a, const float16_t *input_b, float16_t *output, size_t num_elements)

Subtract two vectors of 16 bit floats.

sl_status_t

sl_math_mvp_complex_vector_magnitude_squared_f16(const float16_t *input, float16_t *output, size_t num_elements)

Computes the magnitude squared of the elements of a complex data vector.

sl_status_t

sl_math_mvp_vector_abs_f16(float16_t *input, float16_t *output, size_t num_elements)

Computes the absolute value of a vector on an element-by-element basis.

sl_status_t

sl_math_mvp_vector_negate_f16(const float16_t *input, float16_t *output, size_t num_elements)

Negate a vector of 16 bit floats.

sl_status_t

sl_math_mvp_clamp_i8(int8_t *data, size_t num_elements, int8_t min, int8_t max)

Clamp all signed 8 bit integers in a vector to a certain range.

sl_status_t

sl_math_mvp_vector_offset_f16(const float16_t *input, const float16_t offset, float16_t *output, size_t num_elements)

Adds a constant offset to each element of a vector.

sl_status_t

sl_math_mvp_complex_vector_mult_real_f16(const float16_t *input_a, const float16_t *input_b, float16_t *output, size_t num_elements)

Multiply a vector of complex f16 by a vector of real f16.

sl_status_t

sl_math_mvp_complex_vector_mult_f16(const float16_t *input_a, const float16_t *input_b, float16_t *output, size_t num_elements)

Multiply complex f16 vectors.

sl_status_t

sl_math_mvp_vector_dot_product_f16(const float16_t *input_a, const float16_t *input_b, size_t num_elements, float16_t *output)

Computes the dot product of two vectors.

sl_status_t

sl_math_mvp_vector_fill_f16(float16_t *output, const float16_t value, size_t num_elements)

Fills a constant value into a floating-point vector.

Function Documentation#

sl_math_mvp_vector_add_f16#

sl_status_t sl_math_mvp_vector_add_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)

Add two vectors of 16 bit floats.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	First input vector, input A.
const float16_t *	[in]	input_b	Second input vector, input B.
float16_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	The number of elements in the vectors.

This function will perform the following operation: Z = A + B. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.
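The alignment rule above can be checked on the caller's side before choosing a vector length. The following is a minimal sketch, not part of the MVP API: the helper names `is_4byte_aligned` and `max_vector_len` and the two macros are hypothetical, and the limits are simply the 1M and 2M figures quoted in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limits quoted from the description above (hypothetical macro names). */
#define MAX_LEN_UNALIGNED ((size_t)1 << 20)  /* 1M elements */
#define MAX_LEN_ALIGNED   ((size_t)2 << 20)  /* 2M elements */

/* Hypothetical helper: true if the address is a multiple of 4 bytes. */
static bool is_4byte_aligned(const void *p)
{
  return ((uintptr_t)p & 0x3u) == 0;
}

/* Hypothetical helper: largest num_elements the description allows
 * for the given input and output buffers. */
static size_t max_vector_len(const void *a, const void *b, const void *z)
{
  assert(a != NULL && b != NULL && z != NULL);
  bool all_aligned = is_4byte_aligned(a)
                     && is_4byte_aligned(b)
                     && is_4byte_aligned(z);
  return all_aligned ? MAX_LEN_ALIGNED : MAX_LEN_UNALIGNED;
}
```

A caller could compare `num_elements` against `max_vector_len(input_a, input_b, output)` and split longer vectors into several calls.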

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_add_i8#

sl_status_t sl_math_mvp_vector_add_i8 (const int8_t * input_a, const int8_t * input_b, int8_t * output, size_t num_elements)

Add two vectors of signed 8 bit integers.

Parameters

Type	Direction	Argument Name	Description
const int8_t *	[in]	input_a	First input vector, input A.
const int8_t *	[in]	input_b	Second input vector, input B.
int8_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	The number of elements in the vectors.

All vectors must be of the same length. This function will perform the following operation: Z = A + B. The addition saturates, so the result never overflows or underflows. When the sum of two elements would overflow (> 127), the result is 127. When the sum would underflow (< -128), the result is -128.
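The saturation rule can be illustrated with a plain C reference model that runs on any host. This is only a sketch for comparing expected results, not the MVP implementation; the function name `ref_vector_add_i8` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side reference model (hypothetical name): Z[i] = saturate(A[i] + B[i]). */
static void ref_vector_add_i8(const int8_t *a, const int8_t *b,
                              int8_t *z, size_t num_elements)
{
  assert(a != NULL && b != NULL && z != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    int sum = (int)a[i] + (int)b[i];  /* widen so the sum cannot wrap */
    if (sum > INT8_MAX) {
      sum = INT8_MAX;                 /* overflow saturates to 127 */
    } else if (sum < INT8_MIN) {
      sum = INT8_MIN;                 /* underflow saturates to -128 */
    }
    z[i] = (int8_t)sum;
  }
}
```

For example, adding 100 and 100 gives 127, and adding -100 and -100 gives -128.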

Returns

SL_STATUS_OK.

sl_math_mvp_vector_clip_f16#

sl_status_t sl_math_mvp_vector_clip_f16 (const float16_t * input, float16_t * output, float16_t low, float16_t high, size_t num_elements)

Element-by-element clipping of a value.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	Input vector, input A.
float16_t *	[out]	output	Output vector, output Z.
float16_t	[in]	low	Lower bound.
float16_t	[in]	high	Higher bound.
size_t	[in]	num_elements	Length of input and output vectors.

This function clips the input vector element by element: each element is constrained to lie between the two bounds, low and high. Both vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_mult_f16#

sl_status_t sl_math_mvp_vector_mult_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)

Elementwise multiply two vectors of 16 bit floats.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	First input vector, input A.
const float16_t *	[in]	input_b	Second input vector, input B.
float16_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	Length of all input and output vectors.

This function will perform the following operation: Z[i] = A[i] * B[i]. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_copy_f16#

sl_status_t sl_math_mvp_vector_copy_f16 (const float16_t * input, float16_t * output, size_t num_elements)

Copy one 16 bit float vector into another.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	Input vector, input A.
float16_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	Length of input and output vectors.

This function will perform the following operation: Z = A. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_complex_vector_dot_product_f16#

sl_status_t sl_math_mvp_complex_vector_dot_product_f16 (float16_t * input_a, float16_t * input_b, size_t num_elements, float16_t * output)

Computes the dot product of two complex vectors.

Parameters

Type	Direction	Argument Name	Description
float16_t *	[in]	input_a	Input vector a.
float16_t *	[in]	input_b	Input vector b.
size_t	[in]	num_elements	The number of complex elements in the input vectors.
float16_t *	[out]	output	Dot product result.

The vectors are multiplied element-by-element and then summed.

Maximum vector length is 1M (2^20), and all vectors must be 4-byte aligned.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_scale_f16#

sl_status_t sl_math_mvp_vector_scale_f16 (const float16_t * input, float16_t scale, float16_t * output, size_t num_elements)

Scale a vector of 16-bit floats by a float16 scale factor.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	The input vector.
float16_t	[in]	scale	The value by which to scale the vector.
float16_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	Length of input and output vectors.

This function will perform the following operation: Z[i] = A[i] * scale. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_complex_vector_conjugate_f16#

sl_status_t sl_math_mvp_complex_vector_conjugate_f16 (float16_t * input, float16_t * output, size_t num_elements)

Conjugates the elements of a complex data vector.

Parameters

Type	Direction	Argument Name	Description
float16_t *	[in]	input	Input vector.
float16_t *	[out]	output	Output vector.
size_t	[in]	num_elements	The number of complex elements in the vectors.

Maximum vector length is 1M (2^20), and all vectors must be 4-byte aligned.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_sub_f16#

sl_status_t sl_math_mvp_vector_sub_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)

Subtract two vectors of 16 bit floats.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	First input vector, input A.
const float16_t *	[in]	input_b	Second input vector, input B.
float16_t *	[out]	output	Output vector, output Z.
size_t	[in]	num_elements	Length of all input and output vectors.

This function will perform the following operation: Z = A - B. All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_complex_vector_magnitude_squared_f16#

sl_status_t sl_math_mvp_complex_vector_magnitude_squared_f16 (const float16_t * input, float16_t * output, size_t num_elements)

Computes the magnitude squared of the elements of a complex data vector.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	Input vector.
float16_t *	[out]	output	Output vector.
size_t	[in]	num_elements	The number of complex elements in the input vector and the number of scalar elements in the output vector.

The input vector shall point to the source that is a vector of complex numbers and the output vector shall point to a vector where the result will be written.
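The relationship between the complex input and the scalar output can be sketched with a host-side reference model. Two assumptions are made here that the description does not state: the complex data is stored interleaved as (real, imag) pairs, and `float` stands in for `float16_t` so that the code compiles on any host. The function name `ref_complex_mag_squared` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference model (hypothetical name).
 * Assumption: input holds num_elements interleaved (real, imag) pairs,
 * so it contains 2 * num_elements scalars; output holds num_elements scalars. */
static void ref_complex_mag_squared(const float *input, float *output,
                                    size_t num_elements)
{
  assert(input != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    float re = input[2 * i];
    float im = input[2 * i + 1];
    output[i] = re * re + im * im;  /* |z|^2 = re^2 + im^2 */
  }
}
```

Under these assumptions, the complex element 3 + 4i produces the scalar 25.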

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_abs_f16#

sl_status_t sl_math_mvp_vector_abs_f16 (float16_t * input, float16_t * output, size_t num_elements)

Computes the absolute value of a vector on an element-by-element basis.

Parameters

Type	Direction	Argument Name	Description
float16_t *	[in]	input	Input vector.
float16_t *	[out]	output	Output vector.
size_t	[in]	num_elements	The number of elements in the vectors.

The output vector can be the same as or different from the input vector. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_negate_f16#

sl_status_t sl_math_mvp_vector_negate_f16 (const float16_t * input, float16_t * output, size_t num_elements)

Negate a vector of 16 bit floats.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	Input vector.
float16_t *	[out]	output	Output vector.
size_t	[in]	num_elements	Length of input and output vectors.

This function will perform the following operation: Z = -A. Vectors must be of equal length. If both vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case. In-place negation is supported (input and output may reference the same buffer).

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_clamp_i8#

sl_status_t sl_math_mvp_clamp_i8 (int8_t * data, size_t num_elements, int8_t min, int8_t max)

Clamp all signed 8 bit integers in a vector to a certain range.

Parameters

Type	Direction	Argument Name	Description
int8_t *	[inout]	data	Vector with data values.
size_t	[in]	num_elements	The number of elements in the vector.
int8_t	[in]	min	Minimum value, after operation no elements will be < min.
int8_t	[in]	max	Maximum value, after operation no elements will be > max.

Given a min and a max value, this function ensures that no element in the vector is < min or > max. Any element that is < min is set to min. Any element that is > max is set to max.
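The in-place clamping behaviour can be expressed as a short host-side reference model. This is a sketch for checking expected values, not the MVP implementation; the name `ref_clamp_i8` is hypothetical, and the precondition min <= max is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side reference model (hypothetical name): clamp data[] in place. */
static void ref_clamp_i8(int8_t *data, size_t num_elements,
                         int8_t min, int8_t max)
{
  assert(data != NULL);
  assert(min <= max);  /* assumption of this sketch */
  for (size_t i = 0; i < num_elements; i++) {
    if (data[i] < min) {
      data[i] = min;   /* raise values below the range */
    } else if (data[i] > max) {
      data[i] = max;   /* lower values above the range */
    }
  }
}
```

Clamping {-128, -5, 0, 5, 127} to the range -4 to 4 leaves {-4, -4, 0, 4, 4} in the buffer.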

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_offset_f16#

sl_status_t sl_math_mvp_vector_offset_f16 (const float16_t * input, const float16_t offset, float16_t * output, size_t num_elements)

Adds a constant offset to each element of a vector.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input	Input vector.
const float16_t	[in]	offset	Offset value.
float16_t *	[out]	output	Output vector.
size_t	[in]	num_elements	The number of elements in the input and output vectors.

Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_complex_vector_mult_real_f16#

sl_status_t sl_math_mvp_complex_vector_mult_real_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)

Multiply a vector of complex f16 by a vector of real f16.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	The complex vector, input A.
const float16_t *	[in]	input_b	The real vector, input B.
float16_t *	[out]	output	The complex output vector, output Z.
size_t	[in]	num_elements	The number of elements in the input and output vectors.

This function will perform the following operation: Z = A * B. Both vectors must be of the same length. If both vector buffers are 4-byte aligned, the function will operate twice as fast using MVP complex processing. Maximum vector length is 1M (2^20) elements in the 4-byte aligned case, and 512K when one or more of the complex vectors are 2-byte aligned.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_complex_vector_mult_f16#

sl_status_t sl_math_mvp_complex_vector_mult_f16 (const float16_t * input_a, const float16_t * input_b, float16_t * output, size_t num_elements)

Multiply complex f16 vectors.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	Complex input vector A.
const float16_t *	[in]	input_b	Complex input vector B.
float16_t *	[out]	output	Complex output vector.
size_t	[in]	num_elements	The number of complex elements in the input and output vectors.

This function will multiply two complex vectors element by element. It is assumed that both input vectors and the output vector have the same length. All input and output buffers must be 4-byte aligned. Maximum vector length is 1M (2^20) elements.
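The per-element arithmetic of a complex multiply can be sketched as a host-side reference model. As assumptions not stated in the description, the complex data is taken to be interleaved (real, imag) pairs and `float` stands in for `float16_t`. The name `ref_complex_vector_mult` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference model (hypothetical name).
 * Assumption: each vector holds num_elements interleaved (real, imag) pairs.
 * (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
static void ref_complex_vector_mult(const float *input_a, const float *input_b,
                                    float *output, size_t num_elements)
{
  assert(input_a != NULL && input_b != NULL && output != NULL);
  for (size_t i = 0; i < num_elements; i++) {
    /* Copy the operands first so the sketch also works in place. */
    float a = input_a[2 * i], b = input_a[2 * i + 1];
    float c = input_b[2 * i], d = input_b[2 * i + 1];
    output[2 * i]     = a * c - b * d;  /* real part */
    output[2 * i + 1] = a * d + b * c;  /* imaginary part */
  }
}
```

With these assumptions, (1 + 2i) * (3 + 4i) yields -5 + 10i.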

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_dot_product_f16#

sl_status_t sl_math_mvp_vector_dot_product_f16 (const float16_t * input_a, const float16_t * input_b, size_t num_elements, float16_t * output)

Computes the dot product of two vectors.

Parameters

Type	Direction	Argument Name	Description
const float16_t *	[in]	input_a	Input vector a.
const float16_t *	[in]	input_b	Input vector b.
size_t	[in]	num_elements	The number of elements in the input vectors.
float16_t *	[out]	output	The result.

The vectors are multiplied element-by-element and then summed.

All vectors must be of equal length. If all vector buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Note

Depending on the function arguments, the MVP implementation can calculate the dot product in different ways, which may affect the rounding errors. If the same input vectors are processed with different memory alignment, the results may not be identical.
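The effect described in this note is a general property of floating-point summation: changing the order of the additions can change the rounded result. The sketch below shows this on a host using `float` as a stand-in for `float16_t`. Both function names are hypothetical, and the two-accumulator variant is only an illustration of a reordered sum; it does not model the order the MVP actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product with one accumulator, summed in index order. */
static float dot_sequential(const float *a, const float *b, size_t n)
{
  assert(a != NULL && b != NULL);
  volatile float acc = 0.0f;  /* volatile: round to float after every step */
  for (size_t i = 0; i < n; i++) {
    acc = acc + a[i] * b[i];
  }
  return acc;
}

/* Dot product with separate accumulators for even and odd indices,
 * combined at the end (illustrative reordering only). */
static float dot_two_accumulators(const float *a, const float *b, size_t n)
{
  assert(a != NULL && b != NULL);
  volatile float even = 0.0f;
  volatile float odd = 0.0f;
  size_t i = 0;
  for (; i + 1 < n; i += 2) {
    even = even + a[i] * b[i];
    odd = odd + a[i + 1] * b[i + 1];
  }
  if (i < n) {
    even = even + a[i] * b[i];  /* leftover element when n is odd */
  }
  return even + odd;
}
```

For a = {1e8, 1, -1e8, 1} and b = {1, 1, 1, 1}, the sequential sum loses the first 1 when it is added to 1e8 and returns 1, while the two-accumulator sum returns the exact value 2.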

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.

sl_math_mvp_vector_fill_f16#

sl_status_t sl_math_mvp_vector_fill_f16 (float16_t * output, const float16_t value, size_t num_elements)

Fills a constant value into a floating-point vector.

Parameters

Type	Direction	Argument Name	Description
float16_t *	[out]	output	Vector to fill.
const float16_t	[in]	value	Fill value.
size_t	[in]	num_elements	Length of the output vector.

Maximum vector length is 1M (2^20), and 2M in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.