# CacheFlow

## Installation

```bash
pip install cmake torch transformers
pip install flash-attn  # This may take up to 10 minutes.
pip install -e .
```
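
After installation, a quick sanity check is to confirm that PyTorch sees a GPU and that the installed packages import cleanly. Below is a minimal sketch; the module names `flash_attn` and `cacheflow` are assumed from the package names above.

```python
# Sanity check: verify GPU availability and that the installed packages import.
import torch
import flash_attn   # assumed module name for the flash-attn package
import cacheflow    # assumed module name for the locally built package

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```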

## Run

```bash
python server.py
```