Document supported models (#127)

Woosuk Kwon, 2023-06-02 22:35:17 -07:00, committed by GitHub
parent 0eda2e0953
commit 62ec38ea41
5 changed files with 58 additions and 3 deletions


@@ -39,11 +39,13 @@ class LLM:
     def generate(
         self,
-        prompts: List[str],
+        prompts: Union[str, List[str]],
         sampling_params: Optional[SamplingParams] = None,
         prompt_token_ids: Optional[List[List[int]]] = None,
         use_tqdm: bool = True,
     ) -> List[RequestOutput]:
+        if isinstance(prompts, str):
+            prompts = [prompts]
         if sampling_params is None:
             # Use default sampling params.
             sampling_params = SamplingParams()
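
With this change, `generate` accepts either a single prompt string or a list of prompts; a lone string is wrapped into a one-element list before the requests are queued. A minimal usage sketch follows; the model name is illustrative, and the top-level `SamplingParams` import and its `temperature` argument are assumptions (the diff itself only shows `from cacheflow import LLM`):

```python
from cacheflow import LLM, SamplingParams  # SamplingParams export is assumed

llm = LLM(model="facebook/opt-125m")  # illustrative model name or path

# After this commit, a bare string works; it becomes [prompt] internally.
single = llm.generate("Hello, my name is")

# Lists batch as before. sampling_params is optional and falls back to
# SamplingParams() when omitted, per the default-handling branch above.
batch = llm.generate(
    ["Hello, my name is", "The capital of France is"],
    sampling_params=SamplingParams(temperature=0.8),  # temperature kwarg assumed
)
```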


@@ -14,7 +14,6 @@ make html
 ## Open the docs with your browser

 ```bash
-cd build/html
-python -m http.server
+python -m http.server -d build/html/
 ```

 Launch your browser and open localhost:8000.
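
A small portability note: the `-d`/`--directory` option of `http.server` was added in Python 3.7, so the new one-liner assumes a 3.7+ interpreter; on older versions the previous `cd build/html && python -m http.server` form still works.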


@@ -10,3 +10,10 @@ Documentation
    getting_started/installation
    getting_started/quickstart
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Models
+
+   models/supported_models
+   models/adding_model


@@ -0,0 +1,7 @@
+.. _adding_a_new_model:
+
+Adding a New Model
+==================
+
+Placeholder


@@ -0,0 +1,40 @@
+.. _supported_models:
+
+Supported Models
+================
+
+CacheFlow supports a variety of generative Transformer models in `HuggingFace Transformers <https://github.com/huggingface/transformers>`_.
+The following is the list of model architectures that are currently supported by CacheFlow.
+Alongside each architecture, we include some popular models that use it.
+
+.. list-table::
+   :widths: 25 75
+   :header-rows: 1
+
+   * - Architecture
+     - Models
+   * - :code:`GPT2LMHeadModel`
+     - GPT-2
+   * - :code:`GPTNeoXForCausalLM`
+     - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
+   * - :code:`LlamaForCausalLM`
+     - LLaMA, Vicuna, Alpaca, Koala
+   * - :code:`OPTForCausalLM`
+     - OPT, OPT-IML
+
+If your model uses one of the above model architectures, you can seamlessly run your model with CacheFlow.
+Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
+Alternatively, you can raise an issue on our `GitHub <https://github.com/WoosukKwon/cacheflow/issues>`_ project.
+
+.. tip::
+    The easiest way to check if your model is supported is to run the program below:
+
+    .. code-block:: python
+
+        from cacheflow import LLM
+
+        llm = LLM(model=...)  # Name or path of your model
+        output = llm.generate("Hello, my name is")
+        print(output)
+
+    If CacheFlow successfully generates text, it indicates that your model is supported.
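
The tip above loads the full model; a lighter pre-check is possible if you assume the names in the table correspond to the `architectures` field of the model's HuggingFace config, which is how `transformers` records the model class. A hypothetical sketch of that check, without downloading any weights:

```python
# Hypothetical pre-check: compare a model's declared architecture against
# the supported-architectures table, without loading the model itself.
from transformers import AutoConfig

SUPPORTED_ARCHITECTURES = {
    "GPT2LMHeadModel",
    "GPTNeoXForCausalLM",
    "LlamaForCausalLM",
    "OPTForCausalLM",
}

config = AutoConfig.from_pretrained("facebook/opt-125m")  # any model name or path
supported = any(arch in SUPPORTED_ARCHITECTURES for arch in config.architectures)
print(supported)  # True for OPT models
```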