flash-attention/csrc/fused_softmax
File                                          Last commit message                                               Date
fused_softmax.cpp                             Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_masked_softmax_cuda.cu                 Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_masked_softmax.h                       Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_upper_triang_masked_softmax_cuda.cu    Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_upper_triang_masked_softmax.h          Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
setup.py                                      Make nvcc threads configurable via environment variable (#885)   2024-03-13 20:46:57 -07:00
type_shim.h                                   Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
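The setup.py entry above notes that the number of nvcc compilation threads was made configurable via an environment variable (#885). A minimal sketch of that pattern, assuming a variable named NVCC_THREADS and a default of 4 (both are assumptions, not confirmed details of the actual change):

    # Sketch of the env-var pattern the setup.py commit describes.
    # NVCC_THREADS and the default "4" are assumed names/values.
    import os

    def nvcc_threads_args():
        # nvcc's --threads flag (CUDA 11.2+) compiles for multiple
        # GPU architectures in parallel, speeding up the build.
        nvcc_threads = os.getenv("NVCC_THREADS") or "4"
        return ["--threads", nvcc_threads]

    # Usage: append to the nvcc flags passed to CUDAExtension, e.g.
    #   extra_compile_args={"cxx": ["-O3"],
    #                       "nvcc": ["-O3"] + nvcc_threads_args()}

Reading the value at build time rather than hard-coding it lets users trade compile-time parallelism against memory use on their own machines without editing setup.py.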