squall / vllm
vllm / attention / backends (at commit 51a08e7d8f)
Latest commit: 0ab278ca31 [Core] Remove unnecessary copies in flash attn backend (#5138) by Antoni Baum, 2024-06-03 09:39:31 -07:00
__init__.py | [Core] Refactor Attention Take 2 (#3462) | 2024-03-25 04:39:33 +00:00
abstract.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00
blocksparse_attn.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00
flash_attn.py | [Core] Remove unnecessary copies in flash attn backend (#5138) | 2024-06-03 09:39:31 -07:00
flashinfer.py | [Misc] Take user preference in attention selector (#4960) | 2024-05-23 07:55:56 +09:00
rocm_flash_attn.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00
torch_sdpa.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00
xformers.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00