This guide walks through creating an Ollama model from an existing model on HuggingFace, whether it is in PyTorch, Safetensors, or GGUF format. It optionally covers pushing the model to [ollama.ai](https://ollama.ai/library).
## Supported models
Ollama supports a set of model architectures, with support for more coming soon.
Next, create a `Modelfile` for your model. This file is the blueprint for your model, specifying weights, parameters, prompt templates, and more:
```
FROM ./q4_0.bin
```
(Optional) Many chat models require a prompt template in order to answer correctly. A default prompt template can be specified with the `TEMPLATE` instruction in the `Modelfile`:
```
FROM ./q4_0.bin
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
```
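Beyond `FROM` and `TEMPLATE`, a `Modelfile` can also set a system prompt and default sampling parameters. A minimal sketch, assuming a Mistral-style `[INST]` template; the system prompt, stop strings, and temperature value here are illustrative assumptions, not recommendations:

```
FROM ./q4_0.bin
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
SYSTEM "You are a concise, helpful assistant."
# stop sequences keep generation from running past the end of a turn
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
# lower temperature for more deterministic answers
PARAMETER temperature 0.7
```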
### Step 4: Create an Ollama model
Finally, create a model from your `Modelfile`:
```
ollama create example -f Modelfile
```
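If the build succeeds, the new model should appear in your local model list, which you can check with `ollama list`:

```
ollama list
```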
Next, test the model with `ollama run`:
```
ollama run example "What is your favourite condiment?"
```
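The model can also be queried through Ollama's local REST API; a minimal sketch, assuming the server is running on the default port `11434`:

```
curl http://localhost:11434/api/generate -d '{
  "model": "example",
  "prompt": "What is your favourite condiment?"
}'
```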
### Step 5: Publish your model (optional - in alpha)
Publishing models is in early alpha. If you'd like to publish your model to share with others, follow these steps:
1. Create [an account](https://ollama.ai/signup)
2. Ollama uses SSH keys, similar to Git. Find your public key with `cat ~/.ollama/id_ed25519.pub` and copy it to your clipboard.
3. Add your public key to your [Ollama account](https://ollama.ai/settings/keys)
Next, copy your model to your username's namespace:
```
ollama cp example <yourusername>/example
```
Then push the model:
```
ollama push <yourusername>/example
```
After publishing, your model will be available at `https://ollama.ai/<yourusername>/example`.
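Once published, others should be able to pull and run it by name; for example, using the placeholder username from above:

```
ollama run <yourusername>/example
```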
## Quantization reference
The quantization options are as follows, from highest to lowest level of quantization. Note: some architectures, such as Falcon, do not support K quants. A sketch of applying one of these options with llama.cpp follows the list.
- `q2_K`
- `q3_K`
- `q3_K_S`
- `q3_K_M`
- `q3_K_L`
- `q4_0` (recommended)
- `q4_1`
- `q4_K`
- `q4_K_S`
- `q4_K_M`
- `q5_0`
- `q5_1`
- `q5_K`
- `q5_K_S`
- `q5_K_M`
- `q6_K`
- `q8_0`
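If you are producing the quantized file yourself, llama.cpp ships a quantization tool that takes an f16 GGUF and one of the option names above. A minimal sketch, assuming an f16 GGUF already exists under `models/` and that the tool is built as `quantize` (the binary name and file paths are assumptions and vary between llama.cpp versions):

```
# run from inside the llama.cpp checkout
./quantize ./models/model-f16.gguf ./models/model-q4_0.gguf q4_0
```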
## Manually converting & quantizing models
### Prerequisites
Start by cloning the `llama.cpp` repo to your machine in another directory: