squall/vllm - vllm

Author	SHA1	Message	Date
Richard Liu	cd34029e91	Refactor TPU requirements file and pin build dependencies (#10010) Signed-off-by: Richard Liu <ricliu@google.com>	2024-11-05 16:48:44 +00:00
Nikita Furin	1b73ab2a1f	[CI/Build] Quoting around > (#9956)	2024-11-02 12:50:28 -07:00
Woosuk Kwon	211fe91aa8	[TPU] Correctly profile peak memory usage & Upgrade PyTorch XLA (#9438)	2024-10-30 09:41:38 +00:00
Daniele	a2c71c5405	[CI/Build] remove .github from .dockerignore, add dirty repo check (#9375)	2024-10-17 10:25:06 -07:00
Daniele	ee5f34b1c2	[CI/Build] use setuptools-scm to set __version__ (#4738) Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-23 09:44:26 -07:00
Yangshen⚡Deng	6a512a00df	[model] Support for Llava-Next-Video model (#7559) Co-authored-by: Roger Wang <ywang@roblox.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2024-09-10 22:21:36 -07:00
Woosuk Kwon	eeffde1ac0	[TPU] Upgrade PyTorch XLA nightly (#7967)	2024-08-28 13:10:21 -07:00
Woosuk Kwon	90bab18f24	[TPU] Use mark_dynamic to reduce compilation time (#7340)	2024-08-10 18:12:22 -07:00
Woosuk Kwon	73388c07a4	[TPU] Fix dockerfile.tpu (#7331)	2024-08-08 20:24:58 -07:00
Earthwalker	7f8d612d24	[TPU] Support tensor parallelism in async llm engine (#6891)	2024-07-29 12:42:21 -07:00
Woosuk Kwon	fad5576c58	[TPU] Reduce compilation time & Upgrade PyTorch XLA version (#6856)	2024-07-27 10:28:33 -07:00
Woosuk Kwon	c467dff24f	[Hardware][TPU] Support MoE with Pallas GMM kernel (#6457)	2024-07-16 09:56:28 -07:00
Woosuk Kwon	4552e37b55	[CI/Build][TPU] Add TPU CI test (#6277) Co-authored-by: kevin <kevin@anyscale.com>	2024-07-15 14:31:16 -07:00
Woosuk Kwon	08c5bdecae	[Bugfix][TPU] Fix outlines installation in TPU Dockerfile (#6256)	2024-07-09 02:56:06 -07:00
Woosuk Kwon	1a8bfd92d5	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00

15 Commits