ollama

Daniel Hiltgen 17b7186cd7 Enable concurrency by default	2024-06-21 15:45:05 -07:00

This adjusts our default settings to enable multiple models and parallel requests to a single model. Users can still override these by the same env var settings as before.

Parallel has a direct impact on num_ctx, which in turn can have a significant impact on small VRAM GPUs so this change also refines the algorithm so that when parallel is not explicitly set by the user, we try to find a reasonable default that fits the model on their GPU(s). As before, multiple models will only load concurrently if they fully fit in VRAM.
auth.go	Revert "use post token"	2024-05-11 22:19:14 -07:00
download.go	server: skip blob verification for already verified blobs	2024-06-05 16:39:11 -07:00
fixblobs_test.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
fixblobs.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
images.go	server: remove jwt decoding error (#5027)	2024-06-13 11:21:15 -07:00
layer.go	Merge pull request #3718 from ollama/mxyng/modelname-3	2024-05-29 12:02:07 -07:00
manifest_test.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
manifest.go	fix: skip removing layers that no longer exist	2024-06-10 11:32:19 -07:00
model.go	fix: multiple templates when creating from model	2024-06-12 13:35:49 -07:00
modelpath_test.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
modelpath.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
prompt_test.go	change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)	2024-03-26 13:04:17 -07:00
prompt.go	change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)	2024-03-26 13:04:17 -07:00
routes_create_test.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
routes_delete_test.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
routes_list_test.go	add OLLAMA_MODELS to envconfig (#5029)	2024-06-13 12:52:03 -07:00
routes_test.go	Extend api/show and ollama show to return more model info (#4881)	2024-06-19 14:19:02 -07:00
routes.go	Extend api/show and ollama show to return more model info (#4881)	2024-06-19 14:19:02 -07:00
sched_test.go	Enable concurrency by default	2024-06-21 15:45:05 -07:00
sched.go	Enable concurrency by default	2024-06-21 15:45:05 -07:00
upload.go	lint	2024-06-04 11:13:30 -07:00