* restore model load duration on generate response
  - set model load duration on generate and chat done response
  - calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go

| Name | | |
|---|---|---|
| llama.cpp | ||
| ggml.go | ||
| gguf.go | ||
| llama.go | ||
| llm.go | ||
| utils.go | ||