According to NVIDIA Developer documentation, the grouped GEMM API in cuBLASLt offers several advanced capabilities. Each matrix in the group can have its own dimensions $M_i$, $N_i$, and $K_i$, and mixed precision is listed as a further capability.
Unlike standard GEMM APIs that take a single set of matrix pointers, the grouped GEMM interface typically requires arrays of metadata.
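To make the "arrays of metadata" idea concrete, here is a minimal CPU-only sketch in C++ of what a grouped GEMM computes: for each group $i$, $C_i = \alpha A_i B_i + \beta C_i$ with $A_i$ of size $M_i \times K_i$ and $B_i$ of size $K_i \times N_i$. It does not call cuBLASLt. The struct `GroupedProblem` and the function `grouped_gemm_reference` are hypothetical names chosen for illustration, and row-major storage with tightly packed rows is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for per-group metadata (illustrative only,
// not a cuBLASLt type). Entry g of every array describes group g.
struct GroupedProblem {
    std::vector<int> m, n, k;          // per-group dimensions M_i, N_i, K_i
    std::vector<const float*> a_ptrs;  // A_i: M_i x K_i, row-major (assumption)
    std::vector<const float*> b_ptrs;  // B_i: K_i x N_i, row-major (assumption)
    std::vector<float*> c_ptrs;        // C_i: M_i x N_i, row-major (assumption)
};

// CPU reference: C_i = alpha * A_i * B_i + beta * C_i for every group i.
void grouped_gemm_reference(const GroupedProblem& p, float alpha, float beta) {
    for (std::size_t g = 0; g < p.m.size(); ++g) {
        const int M = p.m[g], N = p.n[g], K = p.k[g];
        const float* A = p.a_ptrs[g];
        const float* B = p.b_ptrs[g];
        float* C = p.c_ptrs[g];
        for (int i = 0; i < M; ++i) {
            for (int j = 0; j < N; ++j) {
                float acc = 0.0f;
                for (int l = 0; l < K; ++l) {
                    acc += A[i * K + l] * B[l * N + j];
                }
                // When beta is zero, C is overwritten without being read.
                C[i * N + j] = (beta == 0.0f)
                                   ? alpha * acc
                                   : alpha * acc + beta * C[i * N + j];
            }
        }
    }
}
```

The point of the sketch is the data layout: one array entry per group for each dimension and each pointer, so groups of different shapes can be described in a single call. A GPU library would consume the same kind of per-group arrays and schedule the groups together instead of looping over them on the host.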
cuBLASLt Grouped GEMM: Accelerating Irregular Matrix Workloads
float alpha = 1.0f, beta = 0.0f;  // beta = 0.0f: C is overwritten, not accumulated into

cublasLtMatmulGrouped(handle, nullptr, matmulDesc,
                      &alpha, &beta,
                      (void**)A_ptrs, (void**)B_ptrs,
                      (void**)C_ptrs, (void**)C_ptrs,  // C_ptrs is passed twice
                      groupCount, groupPlans);