# Restrict Model Access

## Restrict models by Virtual Key

Set allowed models for a key using the `models` param:
```shell
curl 'http://0.0.0.0:4000/key/generate' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{"models": ["gpt-3.5-turbo", "gpt-4"]}'
```
This key can only make requests to `gpt-3.5-turbo` and `gpt-4`.

Verify this is set correctly:

**Allowed access** (expect this to succeed, since `gpt-4` is in the key's allowed models):

```shell
curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4",
    "messages": [
        {"role": "user", "content": "Hello"}
    ]
}'
```
**Disallowed access** (expect this to fail, since `gpt-4o` is not in the key's allowed models):

```shell
curl -i http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-1234" \
-d '{
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Hello"}
    ]
}'
```
## Restrict models by `team_id`

`litellm-dev` can only access `azure-gpt-3.5`.

1. Create a team via `/team/new`
```shell
curl --location 'http://localhost:4000/team/new' \
--header 'Authorization: Bearer <your-master-key>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "team_alias": "litellm-dev",
    "models": ["azure-gpt-3.5"]
}'
# returns {...,"team_id": "my-unique-id"}
```
2. Create a key for the team

```shell
curl --location 'http://localhost:4000/key/generate' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data-raw '{"team_id": "my-unique-id"}'
```
3. Test it

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-qo992IjKOC2CHKZGRoJIGA' \
--data '{
    "model": "BEDROCK_GROUP",
    "messages": [
        {
            "role": "user",
            "content": "hi"
        }
    ]
}'
```
```json
{"error":{"message":"Invalid model for team litellm-dev: BEDROCK_GROUP. Valid models for team are: ['azure-gpt-3.5']\n\n\nTraceback (most recent call last):\n File \"/Users/ishaanjaffer/Github/litellm/litellm/proxy/proxy_server.py\", line 2298, in chat_completion\n _is_valid_team_configs(\n File \"/Users/ishaanjaffer/Github/litellm/litellm/proxy/utils.py\", line 1296, in _is_valid_team_configs\n raise Exception(\nException: Invalid model for team litellm-dev: BEDROCK_GROUP. Valid models for team are: ['azure-gpt-3.5']\n\n","type":"None","param":"None","code":500}}
```
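The team-level validation that produces this error can be sketched as below. This is a simplified stand-in for LiteLLM's `_is_valid_team_configs` check, not its actual code; `check_team_model` is a hypothetical name.

```python
# Hypothetical sketch of a team-level model check: a request fails when
# the team has a model list and the requested model is not in it.

def check_team_model(team_alias: str, team_models: list[str], requested: str) -> None:
    if team_models and requested not in team_models:
        raise ValueError(
            f"Invalid model for team {team_alias}: {requested}. "
            f"Valid models for team are: {team_models}"
        )

check_team_model("litellm-dev", ["azure-gpt-3.5"], "azure-gpt-3.5")  # passes silently
try:
    check_team_model("litellm-dev", ["azure-gpt-3.5"], "BEDROCK_GROUP")
except ValueError as e:
    print(e)
```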
## Per-Member Model Overrides (Team-Scoped Defaults)

Requires the `TEAM_MODEL_OVERRIDES=true` environment variable or `litellm.team_model_overrides_enabled = True`.
By default, every team member can access all models in `team.models`. With per-member model overrides, you can:

- Set `default_models` on a team: the models every member gets by default
- Set `models` on individual team members: additional models only they can access

A member's effective models are `default_models ∪ member.models`. If neither is set, the member falls back to `team.models` (full backward compatibility).
### Enable the Feature

Add to your `config.yaml`:

```yaml
environment_variables:
  TEAM_MODEL_OVERRIDES: "true"
```
### 1. Create a Team with Default Models

```shell
curl -L 'http://localhost:4000/team/new' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
    "team_alias": "engineering",
    "models": ["gpt-4", "gpt-4o-mini", "gpt-4o"],
    "default_models": ["gpt-4o-mini"]
}'
```
- `models`: the full pool of models the team is allowed to use
- `default_models`: the subset every member gets by default (must be a subset of `models`)
### 2. Add Members with Per-User Overrides

```shell
# Alice gets the default (gpt-4o-mini only)
curl -L 'http://localhost:4000/team/member_add' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
    "team_id": "<team-id>",
    "member": {"role": "user", "user_id": "alice"}
}'

# Bob gets gpt-4o in addition to the default
curl -L 'http://localhost:4000/team/member_add' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
    "team_id": "<team-id>",
    "member": {"role": "user", "user_id": "bob", "models": ["gpt-4o"]}
}'
```
| Member | Override | Effective Models |
|---|---|---|
| Alice | none | ["gpt-4o-mini"] |
| Bob | ["gpt-4o"] | ["gpt-4o-mini", "gpt-4o"] |
### 3. Generate Keys and Test

```shell
# Generate key for Bob
curl -L 'http://localhost:4000/key/generate' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{"team_id": "<team-id>", "user_id": "bob"}'
```
**Allowed (Bob → gpt-4o)**

```shell
curl -L 'http://localhost:4000/chat/completions' \
-H 'Authorization: Bearer <bob-key>' \
-H 'Content-Type: application/json' \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```

Returns `200 OK`: `gpt-4o` is in Bob's effective set.

**Denied (Bob → gpt-4)**

```shell
curl -L 'http://localhost:4000/chat/completions' \
-H 'Authorization: Bearer <bob-key>' \
-H 'Content-Type: application/json' \
-d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```

Returns `401 Unauthorized`: `gpt-4` is in the team pool but not in Bob's effective set.
### 4. Update Member Overrides

```shell
# Add gpt-4 to Bob's overrides
curl -L 'http://localhost:4000/team/member_update' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
    "team_id": "<team-id>",
    "user_id": "bob",
    "models": ["gpt-4o", "gpt-4"]
}'

# Remove all overrides (Bob falls back to default_models only)
curl -L 'http://localhost:4000/team/member_update' \
-H 'Authorization: Bearer <your-master-key>' \
-H 'Content-Type: application/json' \
-d '{
    "team_id": "<team-id>",
    "user_id": "bob",
    "models": []
}'
```
### Validation Rules

| Rule | Error |
|---|---|
| `default_models` must be a subset of `team.models` | 400 on `/team/new` and `/team/update` |
| Member `models` must be a subset of `team.models` | 400 on `/team/member_add` and `/team/member_update` |
| Key `models` must be a subset of effective models | 403 on `/key/generate` |
| Narrowing `team.models` auto-prunes stale `default_models` | Automatic on `/team/update` |
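The subset and auto-prune rules above can be sketched as follows. These helper names (`validate_subset`, `prune_defaults`) are hypothetical, illustrating the behavior rather than LiteLLM's actual code.

```python
# Sketch of the validation rules: child lists must be subsets of the
# team pool, and narrowing the pool prunes stale defaults.

def validate_subset(child: list[str], parent: list[str], what: str) -> None:
    extra = [m for m in child if m not in parent]
    if extra:
        # The proxy would surface this as a 400/403 error response.
        raise ValueError(f"{what} contains models not in team.models: {extra}")

def prune_defaults(default_models: list[str], team_models: list[str]) -> list[str]:
    # Keep only defaults still present in the (possibly narrowed) pool.
    return [m for m in default_models if m in team_models]

validate_subset(["gpt-4o-mini"], ["gpt-4", "gpt-4o-mini"], "default_models")  # ok
print(prune_defaults(["gpt-4o-mini", "gpt-4o"], ["gpt-4o-mini"]))  # ['gpt-4o-mini']
```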
### Backward Compatibility

When the feature flag is off, or when neither `default_models` nor member `models` is configured:

- `get_effective_team_models()` returns `team.models` unchanged
- All existing teams and keys work exactly as before
- Zero extra database queries on the auth hot path
## View Available Fallback Models
Use the /v1/models endpoint to discover available fallback models for a given model. This helps you understand which backup models are available when your primary model is unavailable or restricted.
The include_metadata parameter serves as an extension point for exposing additional model metadata in the future. While currently focused on fallback models, this approach will be expanded to include other model metadata such as pricing information, capabilities, rate limits, and more.
### Basic Usage

Get all available models:

```shell
curl -X GET 'http://localhost:4000/v1/models' \
-H 'Authorization: Bearer <your-api-key>'
```
### Get Fallback Models with Metadata

Include metadata to see fallback model information:

```shell
curl -X GET 'http://localhost:4000/v1/models?include_metadata=true' \
-H 'Authorization: Bearer <your-api-key>'
```
### Get Specific Fallback Types

You can specify the type of fallbacks you want to see.

**General fallbacks** are alternative models that can handle the same types of requests:

```shell
curl -X GET 'http://localhost:4000/v1/models?include_metadata=true&fallback_type=general' \
-H 'Authorization: Bearer <your-api-key>'
```

**Context window fallbacks** are models with larger context windows that can handle requests when the primary model's context limit is exceeded:

```shell
curl -X GET 'http://localhost:4000/v1/models?include_metadata=true&fallback_type=context_window' \
-H 'Authorization: Bearer <your-api-key>'
```

**Content policy fallbacks** are models that can handle requests when the primary model rejects content due to safety policies:

```shell
curl -X GET 'http://localhost:4000/v1/models?include_metadata=true&fallback_type=content_policy' \
-H 'Authorization: Bearer <your-api-key>'
```
### Example Response

When `include_metadata=true` is specified, the response includes fallback information:

```json
{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai",
      "fallbacks": {
        "general": ["gpt-3.5-turbo", "claude-3-sonnet"],
        "context_window": ["gpt-4-turbo", "claude-3-opus"],
        "content_policy": ["claude-3-haiku"]
      }
    }
  ]
}
```
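Reading fallback metadata out of a response shaped like the example above can be sketched as below. The response dict is hard-coded here in place of an actual HTTP call, and `fallbacks_for` is a hypothetical helper.

```python
# Sketch: extract a given fallback type for a model from a
# /v1/models?include_metadata=true style response body.

response = {
    "data": [
        {
            "id": "gpt-4",
            "object": "model",
            "fallbacks": {
                "general": ["gpt-3.5-turbo", "claude-3-sonnet"],
                "context_window": ["gpt-4-turbo", "claude-3-opus"],
                "content_policy": ["claude-3-haiku"],
            },
        }
    ]
}

def fallbacks_for(model_id: str, fallback_type: str) -> list[str]:
    for model in response["data"]:
        if model["id"] == model_id:
            return model.get("fallbacks", {}).get(fallback_type, [])
    return []

print(fallbacks_for("gpt-4", "context_window"))  # ['gpt-4-turbo', 'claude-3-opus']
```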
### Use Cases
- High Availability: Identify backup models to ensure service continuity
- Cost Optimization: Find cheaper alternatives when primary models are expensive
- Content Filtering: Discover models with different content policies
- Context Length: Find models that can handle larger inputs
- Load Balancing: Distribute requests across multiple compatible models
### API Parameters

| Parameter | Type | Description |
|---|---|---|
| `include_metadata` | boolean | Include additional model metadata, including fallbacks |
| `fallback_type` | string | Filter fallbacks by type: `general`, `context_window`, or `content_policy` |
## Advanced: Model Access Groups

For advanced use cases, use Model Access Groups to dynamically group multiple models and manage access without restarting the proxy.