Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optm l1 #579

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Optm l1 #579

wants to merge 3 commits into from

Conversation

nsinghamd
Copy link

Level 1 Optimizations

  • Complex axpy (single complex and double complex datatype SIMD)
  • Complex dot (single complex and double complex datatype SIMD)

nsinghamd and others added 3 commits November 11, 2021 15:45
Details
    - Added Framework optimizations for BLAS and CBLAS interfaces for caxpyv_(cblas_caxpyv) and zaxpyv_ (cblas_zaxpyv).
    - Added new axpyv AVX2 kernels for c and z data types for AMD EPYC family.

AMD-Internal: [CPUPL-1231]

Change-Id: I9bc0c21fef9da84533adcef76427977430b27ea7
…head

Details:
    - Kernel is called directly from API call to avoid framework overhead in case of complex float and complex double precisions.
    - Added SIMD code for complex float and complex double and unrolled for loop 5 times to improve performance

AMD-Internal: [CPUPL-1057]

Change-Id: I3b9d202398cacc0168882c9d6da2b450c27466a0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant