vllm/cacheflow/models (latest commit: 2023-03-01 21:13:08 -08:00)

__init__.py        Add input metadata                                    2023-02-22 19:01:20 +00:00
attention.py       Use FlashAttention for multi_query_kv_attention (#4)  2023-03-01 21:13:08 -08:00
input_metadata.py  Fix attention                                         2023-02-23 23:02:25 +00:00
model_utils.py     Fix a bug in tying OPT embeddings (#1)                2023-02-24 16:29:36 -08:00
opt.py             Fix a bug in tying OPT embeddings (#1)                2023-02-24 16:29:36 -08:00
sample.py          Fix sampler                                           2023-02-23 20:30:12 +00:00