flash-attention/csrc/fused_softmax
File                                          Last commit message                                               Date
fused_softmax.cpp                             Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_masked_softmax_cuda.cu                 Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_masked_softmax.h                       Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_upper_triang_masked_softmax_cuda.cu    Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
scaled_upper_triang_masked_softmax.h          Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
setup.py                                      Make nvcc threads configurable via environment variable (#885)   2024-03-13 20:46:57 -07:00
type_shim.h                                   Add Megatron attention implementation for benchmarking           2022-10-23 23:04:16 -07:00
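The setup.py entry above notes that the number of nvcc compilation threads was made configurable via an environment variable (#885). A minimal sketch of that pattern, assuming a variable named NVCC_THREADS and a default of 4 (both are assumptions, not confirmed details of the actual change):

    # Sketch of the env-var pattern the setup.py commit describes.
    # NVCC_THREADS and the default "4" are assumed names/values.
    import os

    def nvcc_threads_args():
        # nvcc's --threads flag (CUDA 11.2+) compiles for multiple
        # GPU architectures in parallel, speeding up the build.
        nvcc_threads = os.getenv("NVCC_THREADS") or "4"
        return ["--threads", nvcc_threads]

    # Usage: append to the nvcc flags passed to CUDAExtension, e.g.
    #   extra_compile_args={"cxx": ["-O3"],
    #                       "nvcc": ["-O3"] + nvcc_threads_args()}

Reading the value at build time rather than hard-coding it lets users trade compile-time parallelism against memory use on their own machines without editing setup.py.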