Add support for c_next in the auxinfo_t struct. #632

Open
wants to merge 3 commits into
base: master

Conversation

fgvanzee
Member

@fgvanzee fgvanzee commented May 11, 2022

This branch contains preliminary support for a new .c_next field within the auxinfo_t struct. It is fully implemented for gemm. Caveats:

  • The "wrap-around" address computation for the edge cases is not yet verified (but should be close to correct).
  • Only the gemm macrokernel (bli_gemm_ker_var2()) sets the .c_next field for now. The gemmt, trmm, and trsm macrokernels are still oblivious to it.
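
To make the "wrap-around" caveat concrete, here is a minimal sketch of how a next-microtile address for C could be chosen. It is not the code in this branch. The helper name `next_c_tile`, its parameters, and the loop order (inner loop over row microtiles, outer loop over column microtiles) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of BLIS.
   Given the base address of the C block handled by the macrokernel and the
   indices (i, j) of the current microtile, return the address of the microtile
   assumed to be visited next:
     - same column panel, next row microtile, if one exists;
     - otherwise the first microtile of the next column panel;
     - otherwise (last microtile overall) wrap around to the base address.
   rstep_c and cstep_c are the element strides between adjacent microtiles
   in the row and column directions, respectively. */
double* next_c_tile( double* c,
                     size_t i, size_t j,
                     size_t m_iter, size_t n_iter,
                     ptrdiff_t rstep_c, ptrdiff_t cstep_c )
{
    assert( m_iter > 0 && n_iter > 0 );
    assert( i < m_iter && j < n_iter );

    if ( i + 1 < m_iter )
    {
        /* Interior case: step down one microtile within the same panel. */
        return c + ( ptrdiff_t )( i + 1 ) * rstep_c + ( ptrdiff_t )j * cstep_c;
    }
    if ( j + 1 < n_iter )
    {
        /* Edge case: last row microtile, so wrap to the top of the next panel. */
        return c + ( ptrdiff_t )( j + 1 ) * cstep_c;
    }

    /* Edge case: last microtile overall, so wrap back to the first one. */
    return c;
}
```

Under these assumptions, with m_iter = 2, n_iter = 3, rstep_c = 4, and cstep_c = 16, microtile (0,0) is followed by c + 4, microtile (1,0) by c + 16, and microtile (1,2) wraps back to c. The two edge branches are the part the caveat above describes as not yet verified in the actual branch.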

(h/t to @devinamatthews and AMD for their contributions to this feature)

Note: I think we should wait until some of @devinamatthews's pending changes (which impact the non-gemm macrokernels) are merged before we extend this to the other level-3 operations. (I'm referring specifically to de-macroification.)

Details:
- Added .c_next field to auxinfo_t struct definition.
- Defined accessor macros for the auxinfo_t.c_next field.
- Compute reasonable values for c_next within the gemm macrokernel
  (bli_gemm_ker_var2.c) and embed them within the local auxinfo_t
  that is passed along into the gemm microkernel. Thanks to Devin
  Matthews and AMD for their contributions toward this feature.
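
As a rough illustration of the first two items above, the following sketch shows what a .c_next field and its accessors could look like. The type and function names (`sketch_auxinfo_t`, `sketch_auxinfo_set_next_c`, `sketch_auxinfo_next_c`) are hypothetical stand-ins rather than the identifiers added by this branch, and the struct is reduced to three pointer fields for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for auxinfo_t. The real struct carries
   more fields; only the "next" pointers matter for this sketch. */
typedef struct
{
    const void* a_next; /* next micropanel of A */
    const void* b_next; /* next micropanel of B */
    const void* c_next; /* next microtile of C (the new field) */
} sketch_auxinfo_t;

/* Setter, as the macrokernel might call it before invoking the microkernel. */
static inline void sketch_auxinfo_set_next_c( const void* p, sketch_auxinfo_t* aux )
{
    assert( aux != NULL );
    aux->c_next = p;
}

/* Getter, as a microkernel might call it to read the hint. */
static inline const void* sketch_auxinfo_next_c( const sketch_auxinfo_t* aux )
{
    assert( aux != NULL );
    return aux->c_next;
}
```

In this sketch the macrokernel would set the pointer on its local struct once per microtile, and the microkernel would read it back, presumably as a prefetch hint in the same way the existing next-A and next-B pointers are used.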
@fgvanzee fgvanzee marked this pull request as ready for review May 11, 2022 22:21
@fgvanzee fgvanzee self-assigned this May 12, 2022
@fgvanzee
Member Author

Note to self: Credit LeickR in the final squashed commit log.
