# CacheFlow

## Installation

```bash
pip install psutil numpy torch transformers
pip install flash-attn  # This may take up to 10 minutes.
pip install -e .
```

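The commands in this README invoke `python`, `pip`, and `ray`, and `ray` does not appear in the `pip install` lines above. A quick check that all three are on `PATH` can catch a missing tool early. The following is a minimal sketch, assuming a POSIX-compatible shell; `require_cmd` is a hypothetical helper and is not part of CacheFlow.

```bash
# Hypothetical helper (not part of CacheFlow): report whether a command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the tools that the commands in this README invoke.
# "|| true" keeps the loop going so that every missing tool is reported.
for cmd in python pip ray; do
    require_cmd "$cmd" || true
done
```

If `ray` is reported missing after installation, it may need to be installed separately.
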
## Test simple server

```bash
ray start --head
python simple_server.py
```

To list the arguments accepted by `simple_server.py`, run:

```bash
python simple_server.py --help
```

## FastAPI server

Install the following additional dependencies:

```bash
pip install fastapi uvicorn
```

To start the server:

```bash
ray start --head
python -m cacheflow.http_frontend.fastapi_frontend
```

To test the server:

```bash
python -m cacheflow.http_frontend.test_cli_client
```

## Gradio web server

Install the following additional dependencies:

```bash
pip install gradio
```

To start the server, run the FastAPI frontend first, then launch the Gradio web server in a second terminal:

```bash
python -m cacheflow.http_frontend.fastapi_frontend
# In another terminal
python -m cacheflow.http_frontend.gradio_webserver
```
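
If a second terminal is inconvenient, both processes can be launched from one shell by running the first in the background. The following is a minimal sketch, assuming a POSIX-compatible shell; `run_pair` is a hypothetical helper and is not part of CacheFlow. It does not wait for the first process to finish starting, so the second command may need to be retried if it runs too early.

```bash
# Hypothetical helper (not part of CacheFlow): run one command in the
# background, run a second in the foreground, then stop the first.
run_pair() {
    # "exec" makes the background PID belong to the command itself.
    sh -c "exec $1" &
    bg_pid=$!

    # Run the foreground command and remember its exit status.
    status=0
    sh -c "$2" || status=$?

    # Stop the background process once the foreground command exits.
    kill "$bg_pid" 2>/dev/null || true
    wait "$bg_pid" 2>/dev/null || true
    return "$status"
}

# Usage, with the two commands from this section:
# run_pair "python -m cacheflow.http_frontend.fastapi_frontend" \
#          "python -m cacheflow.http_frontend.gradio_webserver"
```

The helper returns the exit status of the foreground command, so a failure in the web server is still visible to the calling shell.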