1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
| INFO llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 135.23 seconds INFO api_server.py:958] Starting vLLM API server on http://0.0.0.0:8000 Available routes are: Route: /openapi.json, Methods: GET, HEAD Route: /docs, Methods: GET, HEAD Route: /docs/oauth2-redirect, Methods: GET, HEAD Route: /redoc, Methods: GET, HEAD Route: /health, Methods: GET Route: /ping, Methods: GET, POST Route: /tokenize, Methods: POST Route: /detokenize, Methods: POST Route: /v1/models, Methods: GET Route: /version, Methods: GET Route: /v1/chat/completions, Methods: POST Route: /v1/completions, Methods: POST Route: /v1/embeddings, Methods: POST Route: /pooling, Methods: POST Route: /score, Methods: POST Route: /v1/score, Methods: POST Route: /v1/audio/transcriptions, Methods: POST Route: /rerank, Methods: POST Route: /v1/rerank, Methods: POST Route: /v2/rerank, Methods: POST Route: /invocations, Methods: POST INFO: Started server process [1] INFO: Waiting for application startup. INFO: Application startup complete.
|