Matrix functions#

Functions#

sl_status_t

sl_math_mvp_matrix_scale_f16(const sl_math_matrix_f16_t *input, float16_t scale, sl_math_matrix_f16_t *output)

Scale each element in a float16 matrix by a float16 scale.

void

sl_math_matrix_init_f16(sl_math_matrix_f16_t *matrix, size_t num_rows, size_t num_cols, float16_t *data)

Matrix initialization.

sl_status_t

sl_math_mvp_matrix_vector_mult_f16(const sl_math_matrix_f16_t *input_a, const float16_t *input_b, float16_t *output)

Multiply a matrix by a vector, both of 16-bit floats.

sl_status_t

sl_math_mvp_complex_matrix_mult_f16(const sl_math_matrix_f16_t *input_a, const sl_math_matrix_f16_t *input_b, sl_math_matrix_f16_t *output)

Multiply two matrices of complex 16-bit floats.

sl_status_t

sl_math_mvp_matrix_mult_f16(const sl_math_matrix_f16_t *input_a, const sl_math_matrix_f16_t *input_b, sl_math_matrix_f16_t *output)

Multiply two matrices of 16-bit floats.

sl_status_t

sl_math_mvp_matrix_sub_f16(const sl_math_matrix_f16_t *input_a, const sl_math_matrix_f16_t *input_b, sl_math_matrix_f16_t *output)

Subtract two matrices of 16-bit floats.

sl_status_t

sl_math_mvp_matrix_transpose_f16(const sl_math_matrix_f16_t *input, sl_math_matrix_f16_t *output)

Transpose a matrix.

sl_status_t

sl_math_mvp_complex_matrix_transpose_f16(const sl_math_matrix_f16_t *input, sl_math_matrix_f16_t *output)

Transpose a complex f16 matrix.

sl_status_t

sl_math_mvp_matrix_add_f16(const sl_math_matrix_f16_t *input_a, const sl_math_matrix_f16_t *input_b, sl_math_matrix_f16_t *output)

Add two matrices of 16-bit floats.

Function Documentation#

sl_math_mvp_matrix_scale_f16#

sl_status_t sl_math_mvp_matrix_scale_f16 (const sl_math_matrix_f16_t * input, float16_t scale, sl_math_matrix_f16_t * output)

Scale each element in a float16 matrix by a float16 scale.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input	Input matrix.
float16_t	[in]	scale	Scale value.
sl_math_matrix_f16_t *	[out]	output	Output matrix.

This function will multiply each element in the input matrix by a scale and write the result to the output matrix. The input and output matrices must be the same size.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
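The element-wise behaviour described above can be modelled in portable C. This is only a host-side reference sketch, not the MVP-accelerated implementation: `ref_matrix_t`, `ref_matrix_scale`, the field names, the row-major layout, and the use of `float` in place of `float16_t` are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for sl_math_matrix_f16_t. float replaces float16_t
   so the sketch builds with any host compiler. */
typedef struct {
  size_t num_rows;
  size_t num_cols;
  float *data;
} ref_matrix_t;

/* Returns 0 on success and -1 when input and output sizes differ.
   The real function reports errors through sl_status_t instead. */
int ref_matrix_scale(const ref_matrix_t *in, float scale, ref_matrix_t *out)
{
  if (in->num_rows != out->num_rows || in->num_cols != out->num_cols) {
    return -1;
  }
  size_t count = in->num_rows * in->num_cols;
  for (size_t i = 0; i < count; i++) {
    out->data[i] = in->data[i] * scale;
  }
  return 0;
}
```

A model like this can be useful for comparing accelerator output against expected values in a unit test, within the precision limits of 16-bit floats.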

sl_math_matrix_init_f16#

void sl_math_matrix_init_f16 (sl_math_matrix_f16_t * matrix, size_t num_rows, size_t num_cols, float16_t * data)

Matrix initialization.

Parameters

Type	Direction	Argument Name	Description
sl_math_matrix_f16_t *	[in]	matrix	Pointer to a matrix.
size_t	[in]	num_rows	The number of rows in the matrix.
size_t	[in]	num_cols	The number of columns in the matrix.
float16_t *	[in]	data	A pointer to the matrix data.
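As a rough picture of what initialization does, the sketch below fills a matrix descriptor with its dimensions and a pointer to caller-owned storage. The struct layout, the names `ref_matrix_t`, `ref_matrix_init`, and `ref_matrix_at`, the row-major ordering, and the `float` element type are assumptions for illustration and are not taken from the library headers.

```c
#include <stddef.h>

/* Hypothetical stand-in for sl_math_matrix_f16_t (float replaces float16_t). */
typedef struct {
  size_t num_rows;
  size_t num_cols;
  float *data;
} ref_matrix_t;

/* Record the dimensions and the data pointer. No data is copied, so the
   buffer must outlive the descriptor. */
void ref_matrix_init(ref_matrix_t *matrix, size_t num_rows, size_t num_cols,
                     float *data)
{
  matrix->num_rows = num_rows;
  matrix->num_cols = num_cols;
  matrix->data = data;
}

/* Element access under the assumed row-major layout. */
float ref_matrix_at(const ref_matrix_t *matrix, size_t row, size_t col)
{
  return matrix->data[row * matrix->num_cols + col];
}
```

Because several functions on this page require or benefit from 4-byte aligned buffers, it is worth declaring the real `float16_t` storage with an explicit alignment before passing it to the initializer.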

sl_math_mvp_matrix_vector_mult_f16#

sl_status_t sl_math_mvp_matrix_vector_mult_f16 (const sl_math_matrix_f16_t * input_a, const float16_t * input_b, float16_t * output)

Multiply a matrix by a vector, both of 16-bit floats.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input_a	The input matrix.
const float16_t *	[in]	input_b	The input vector.
float16_t *	[out]	output	The output vector.

This function will perform the following operation: Z = A * b (matrix-vector multiplication). The length of the input vector must equal the number of columns in matrix A. The length of the output vector equals the number of rows in matrix A.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
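The length rules for the input and output vectors follow directly from the arithmetic, as this reference loop shows. It is a sketch under stated assumptions: `ref_matrix_vector_mult` is a hypothetical name, the matrix is taken to be stored row-major, and `float` stands in for `float16_t`.

```c
#include <stddef.h>

/* z = A * b for a row-major matrix A with `rows` rows and `cols` columns.
   b must hold `cols` elements and z must hold `rows` elements. */
void ref_matrix_vector_mult(const float *a, size_t rows, size_t cols,
                            const float *b, float *z)
{
  for (size_t r = 0; r < rows; r++) {
    float acc = 0.0f;
    for (size_t c = 0; c < cols; c++) {
      acc += a[r * cols + c] * b[c];
    }
    z[r] = acc;
  }
}
```

For a 2 x 3 matrix, the input vector therefore has 3 elements and the output vector has 2.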

sl_math_mvp_complex_matrix_mult_f16#

sl_status_t sl_math_mvp_complex_matrix_mult_f16 (const sl_math_matrix_f16_t * input_a, const sl_math_matrix_f16_t * input_b, sl_math_matrix_f16_t * output)

Multiply two matrices of complex 16-bit floats.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input_a	First input matrix, input A.
const sl_math_matrix_f16_t *	[in]	input_b	Second input matrix, input B.
sl_math_matrix_f16_t *	[out]	output	Output matrix, output Z.

The number of columns in the first matrix must equal the number of rows in the second matrix. The output matrix must have the same number of rows as matrix A and the same number of columns as matrix B. All input and output matrix data buffers must be 4-byte aligned (a complex f16 element occupies 4 bytes of storage). The maximum matrix size is 1024 x 1024, which is 1M (2^20) complex f16 elements. No row or column dimension may exceed 1024 elements.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
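The constraints listed above can be checked before calling the function. The following is a minimal sketch of such a pre-check; `ref_shape_t`, `ref_is_aligned4`, `ref_complex_mult_args_ok`, and `REF_MAX_DIM` are hypothetical names introduced here, and the library may perform its own validation in a different way.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REF_MAX_DIM 1024u  /* maximum row or column count stated above */

/* Hypothetical descriptor: only the shape and buffer address matter here. */
typedef struct {
  size_t num_rows;
  size_t num_cols;
  const void *data;
} ref_shape_t;

static bool ref_is_aligned4(const void *p)
{
  return ((uintptr_t)p & 3u) == 0u;
}

/* True when A, B, and Z satisfy the documented shape, size, and
   alignment requirements for complex matrix multiplication. */
bool ref_complex_mult_args_ok(const ref_shape_t *a, const ref_shape_t *b,
                              const ref_shape_t *z)
{
  if (a->num_cols != b->num_rows) {
    return false;
  }
  if (z->num_rows != a->num_rows || z->num_cols != b->num_cols) {
    return false;
  }
  if (a->num_rows > REF_MAX_DIM || a->num_cols > REF_MAX_DIM
      || b->num_cols > REF_MAX_DIM) {
    return false;
  }
  return ref_is_aligned4(a->data) && ref_is_aligned4(b->data)
         && ref_is_aligned4(z->data);
}
```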

sl_math_mvp_matrix_mult_f16#

sl_status_t sl_math_mvp_matrix_mult_f16 (const sl_math_matrix_f16_t * input_a, const sl_math_matrix_f16_t * input_b, sl_math_matrix_f16_t * output)

Multiply two matrices of 16-bit floats.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input_a	First input matrix, input A.
const sl_math_matrix_f16_t *	[in]	input_b	Second input matrix, input B.
sl_math_matrix_f16_t *	[out]	output	Output matrix, output Z.

This function will perform the following operation: Z = A * B (matrix multiplication). The number of columns in the first matrix must equal the number of rows in the second matrix. If the input is 4-byte aligned and the number of columns in matrix B is divisible by 2, the function runs twice as fast.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
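For checking results on a host, Z = A * B can be modelled with a plain triple loop. This sketch is not the library's implementation: `ref_matrix_t` and `ref_matrix_mult` are hypothetical, `float` replaces `float16_t`, a row-major layout is assumed, and the output-shape check is an assumption added for safety.

```c
#include <stddef.h>

/* Hypothetical stand-in for sl_math_matrix_f16_t (float replaces float16_t). */
typedef struct {
  size_t num_rows;
  size_t num_cols;
  float *data;
} ref_matrix_t;

/* Returns 0 on success and -1 when the shapes are incompatible. */
int ref_matrix_mult(const ref_matrix_t *a, const ref_matrix_t *b,
                    ref_matrix_t *z)
{
  if (a->num_cols != b->num_rows
      || z->num_rows != a->num_rows
      || z->num_cols != b->num_cols) {
    return -1;
  }
  for (size_t r = 0; r < a->num_rows; r++) {
    for (size_t c = 0; c < b->num_cols; c++) {
      float acc = 0.0f;
      for (size_t k = 0; k < a->num_cols; k++) {
        acc += a->data[r * a->num_cols + k] * b->data[k * b->num_cols + c];
      }
      z->data[r * z->num_cols + c] = acc;
    }
  }
  return 0;
}
```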

sl_math_mvp_matrix_sub_f16#

sl_status_t sl_math_mvp_matrix_sub_f16 (const sl_math_matrix_f16_t * input_a, const sl_math_matrix_f16_t * input_b, sl_math_matrix_f16_t * output)

Subtract two matrices of 16-bit floats.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input_a	First input matrix, input A.
const sl_math_matrix_f16_t *	[in]	input_b	Second input matrix, input B.
sl_math_matrix_f16_t *	[out]	output	Output matrix, output Z.

This function will perform the following operation: Z = A - B. All matrices must have equal dimensions. If all matrix buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. The maximum matrix size is 1M (2^20) elements, or 2M elements in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
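The size limit depends on buffer alignment, which a caller can test up front. The helper below is a sketch under stated assumptions: the function names are hypothetical, and "2M elements" is read as 2^21, which is an interpretation of the text above rather than a value taken from the library source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool ref_is_aligned4(const void *p)
{
  return ((uintptr_t)p & 3u) == 0u;
}

/* Maximum element count: 2^21 when all three buffers are 4-byte aligned
   (assumed meaning of "2M"), otherwise 2^20. */
size_t ref_elementwise_max_elements(const void *a, const void *b,
                                    const void *z)
{
  bool all_aligned = ref_is_aligned4(a) && ref_is_aligned4(b)
                     && ref_is_aligned4(z);
  return all_aligned ? ((size_t)1 << 21) : ((size_t)1 << 20);
}

/* True when a rows x cols element-wise operation fits the limit. */
bool ref_elementwise_size_ok(size_t rows, size_t cols, const void *a,
                             const void *b, const void *z)
{
  return rows * cols <= ref_elementwise_max_elements(a, b, z);
}
```

The same check applies to any element-wise operation that documents these limits.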

sl_math_mvp_matrix_transpose_f16#

sl_status_t sl_math_mvp_matrix_transpose_f16 (const sl_math_matrix_f16_t * input, sl_math_matrix_f16_t * output)

Transpose a matrix.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input	Input matrix.
sl_math_matrix_f16_t *	[out]	output	Output matrix.

This function will fill the output matrix with the transpose of the input matrix. Neither the number of rows nor the number of columns may exceed 1024.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
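The index mapping behind a transpose, together with the 1024 limit, can be sketched as follows. `ref_matrix_transpose` and `REF_MAX_DIM` are hypothetical names, `float` stands in for `float16_t`, and row-major storage is assumed.

```c
#include <stddef.h>

#define REF_MAX_DIM 1024u  /* documented limit for rows and columns */

/* in is rows x cols, out is cols x rows, both row-major.
   Returns 0 on success and -1 when a dimension exceeds the limit. */
int ref_matrix_transpose(const float *in, size_t rows, size_t cols,
                         float *out)
{
  if (rows > REF_MAX_DIM || cols > REF_MAX_DIM) {
    return -1;
  }
  for (size_t r = 0; r < rows; r++) {
    for (size_t c = 0; c < cols; c++) {
      out[c * rows + r] = in[r * cols + c];
    }
  }
  return 0;
}
```

Note that the output descriptor must be initialized with the row and column counts swapped relative to the input.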

sl_math_mvp_complex_matrix_transpose_f16#

sl_status_t sl_math_mvp_complex_matrix_transpose_f16 (const sl_math_matrix_f16_t * input, sl_math_matrix_f16_t * output)

Transpose a complex f16 matrix.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input	Input matrix.
sl_math_matrix_f16_t *	[out]	output	Output matrix.

This function will fill the output matrix with the transpose of the input matrix. Neither the number of rows nor the number of columns may exceed 1024. Matrix input and output data buffers must be 4-byte aligned.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
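For complex data, each element moves as a pair of values. The sketch below assumes an interleaved layout (real part followed by imaginary part), which is not stated in this reference, and it performs a plain transpose without conjugation. `ref_complex_matrix_transpose` is a hypothetical name and `float` stands in for `float16_t`.

```c
#include <stddef.h>

/* in is rows x cols complex elements, out is cols x rows, both row-major.
   Each complex element occupies two consecutive values: real, imaginary
   (assumed layout). */
void ref_complex_matrix_transpose(const float *in, size_t rows, size_t cols,
                                  float *out)
{
  for (size_t r = 0; r < rows; r++) {
    for (size_t c = 0; c < cols; c++) {
      size_t src = 2u * (r * cols + c);
      size_t dst = 2u * (c * rows + r);
      out[dst] = in[src];          /* real part */
      out[dst + 1u] = in[src + 1u]; /* imaginary part, sign unchanged */
    }
  }
}
```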

sl_math_mvp_matrix_add_f16#

sl_status_t sl_math_mvp_matrix_add_f16 (const sl_math_matrix_f16_t * input_a, const sl_math_matrix_f16_t * input_b, sl_math_matrix_f16_t * output)

Add two matrices of 16-bit floats.

Parameters

Type	Direction	Argument Name	Description
const sl_math_matrix_f16_t *	[in]	input_a	First input matrix, input A.
const sl_math_matrix_f16_t *	[in]	input_b	Second input matrix, input B.
sl_math_matrix_f16_t *	[out]	output	Output matrix, output Z.

This function will perform the following operation: Z = A + B. All matrices must have equal dimensions. If all matrix buffers are 4-byte aligned, the function will operate twice as fast using MVP parallel processing. The maximum matrix size is 1M (2^20) elements, or 2M elements in the 4-byte aligned case.

Returns

SL_STATUS_OK on success. On failure, an appropriate sl_status_t error code is returned.
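The equal-dimensions requirement for Z = A + B is illustrated by this host-side reference sketch. It does not reflect how the MVP hardware computes the result: `ref_matrix_t` and `ref_matrix_add` are hypothetical, and `float` replaces `float16_t`.

```c
#include <stddef.h>

/* Hypothetical stand-in for sl_math_matrix_f16_t (float replaces float16_t). */
typedef struct {
  size_t num_rows;
  size_t num_cols;
  float *data;
} ref_matrix_t;

/* Returns 0 on success and -1 unless A, B, and Z share the same shape. */
int ref_matrix_add(const ref_matrix_t *a, const ref_matrix_t *b,
                   ref_matrix_t *z)
{
  if (a->num_rows != b->num_rows || a->num_cols != b->num_cols
      || a->num_rows != z->num_rows || a->num_cols != z->num_cols) {
    return -1;
  }
  size_t count = a->num_rows * a->num_cols;
  for (size_t i = 0; i < count; i++) {
    z->data[i] = a->data[i] + b->data[i];
  }
  return 0;
}
```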