- only reload the running llm if the model has changed, or the options for loading the running model have changed
- rename loaded llm to runner to differentiate from loaded model image
- remove logic which keeps the first system prompt in the generation context

| File |
|---|
| client.go |
| client.py |
| types.go |