Monitoring & Quotas
Gasket exposes metrics for token usage, API call latency, request outcomes, and daily active users, and enforces configurable token quotas per backend profile.
Metrics
All metrics carry the following labels:
| Label | Description |
|---|---|
| user | Gasket user identity |
| api_key | API key identifier |
| backend_profile | Backend profile name |
| openai_backend | Specific OpenAI backend URL |
| model | Model name from the request |
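As an illustration of how these labels partition a metric, here is a minimal in-memory sketch. It is not Gasket's implementation: `MetricLabels`, `token_totals`, and `record_tokens` are hypothetical names, and the example label values (including the backend URL) are made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricLabels:
    """The five labels from the table above (hypothetical container type)."""
    user: str
    api_key: str
    backend_profile: str
    openai_backend: str
    model: str


# Hypothetical store: (metric name, labels) -> running total.
token_totals: Counter = Counter()


def record_tokens(labels: MetricLabels, prompt_tokens: int, completion_tokens: int) -> None:
    """Add one request's token counts to the series identified by `labels`."""
    token_totals[("prompt_tokens", labels)] += prompt_tokens
    token_totals[("completion_tokens", labels)] += completion_tokens


labels = MetricLabels(
    user="alice",
    api_key="key-1",
    backend_profile="default",
    openai_backend="https://backend.example/v1",  # made-up URL
    model="example-model",
)
record_tokens(labels, prompt_tokens=120, completion_tokens=30)
record_tokens(labels, prompt_tokens=80, completion_tokens=20)
```

Because the label set is part of the key, two requests that differ in any label (for example, a different `model`) accumulate into separate series.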
Metric Types
| Metric | Description |
|---|---|
| Token usage | Prompt and completion token counts per request |
| API call latency | Request duration in milliseconds |
| API call success/failure | Success and error counters |
| Daily active unique API users (Gasket) | Distinct Gasket users per day |
| Daily active unique API users (Open WebUI) | Distinct Open WebUI users per day |
| Daily active unique API users (all) | Combined distinct users per day |
| Daily active unique Gasket Portal users | Distinct portal logins per day |
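The daily-active metrics can be thought of as set sizes per day. The sketch below is only an illustration under stated assumptions: the store, the function names, and the source names `"gasket"` and `"open_webui"` are hypothetical, and it assumes that Gasket and Open WebUI identities are counted as distinct users even when their identifiers match.

```python
from collections import defaultdict
from datetime import date
from typing import Optional

# Hypothetical in-memory store: day -> source -> set of user ids.
active_users = defaultdict(lambda: defaultdict(set))


def record_activity(day: date, source: str, user_id: str) -> None:
    """Remember that `user_id` made at least one API call on `day`."""
    active_users[day][source].add(user_id)


def daily_active(day: date, source: Optional[str] = None) -> int:
    """Distinct users for one source, or for all sources when source is None."""
    by_source = active_users.get(day, {})
    if source is not None:
        return len(by_source.get(source, set()))
    # Assumption: identities are namespaced by source before combining.
    return len({(src, uid) for src, users in by_source.items() for uid in users})


day = date(2024, 1, 1)
record_activity(day, "gasket", "alice")
record_activity(day, "gasket", "alice")      # repeat calls do not inflate the count
record_activity(day, "gasket", "bob")
record_activity(day, "open_webui", "alice")
```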
Quota Configuration
Quotas are configured per backend profile and define a maximum number of tokens allowed within a rolling time period (e.g. 10,000 tokens per 24 hours).
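A rolling-window check can be sketched as follows. This is a minimal illustration, not Gasket's query: `Quota`, `tokens_in_window`, and `is_exceeded` are hypothetical names, and the sketch assumes a quota counts as exceeded only when usage is strictly greater than the limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Quota:
    max_tokens: int      # e.g. 10,000
    window: timedelta    # e.g. 24 hours


def tokens_in_window(usage: Iterable[Tuple[datetime, int]], now: datetime, window: timedelta) -> int:
    """Sum tokens from usage events newer than `now - window`."""
    cutoff = now - window
    return sum(tokens for ts, tokens in usage if ts > cutoff)


def is_exceeded(quota: Quota, usage: Iterable[Tuple[datetime, int]], now: datetime) -> bool:
    # Assumption: "exceeded" means strictly more than max_tokens.
    return tokens_in_window(usage, now, quota.window) > quota.max_tokens


quota = Quota(max_tokens=10_000, window=timedelta(hours=24))
now = datetime(2024, 1, 2, 12, 0)
usage = [
    (now - timedelta(hours=30), 6_000),  # outside the window, ignored
    (now - timedelta(hours=20), 4_000),
    (now - timedelta(hours=1), 5_000),
]
```

Because the window rolls, the 6,000 tokens used 30 hours ago no longer count, so this usage totals 9,000 tokens and stays under the limit.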
Quota Scopes
| Scope | Description |
|---|---|
| Per API key | Limits token usage for a single API key |
| Per Gasket user per backend profile | Limits a user across all their keys on a profile |
| Per Open WebUI user per backend profile | Limits an Open WebUI identity within a single backend profile |
| Per Gasket user globally | Limits a user across all backend profiles |
| Per Open WebUI user globally | Limits an Open WebUI identity across all backend profiles |
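One way to picture the scopes is as a set of counters that a single request contributes to. The sketch below is hypothetical: `Scope`, `RequestIdentity`, and `scope_keys` are illustrative names, and it assumes a request may carry a Gasket user, an Open WebUI user, or both.

```python
from enum import Enum
from typing import List, NamedTuple, Optional, Tuple


class Scope(Enum):
    API_KEY = "api_key"
    GASKET_USER_PROFILE = "gasket_user_profile"
    WEBUI_USER_PROFILE = "webui_user_profile"
    GASKET_USER_GLOBAL = "gasket_user_global"
    WEBUI_USER_GLOBAL = "webui_user_global"


class RequestIdentity(NamedTuple):
    api_key: str
    backend_profile: str
    gasket_user: Optional[str] = None
    webui_user: Optional[str] = None


def scope_keys(req: RequestIdentity) -> List[Tuple[Scope, tuple]]:
    """Return every (scope, subject) pair whose quota this request counts against."""
    keys: List[Tuple[Scope, tuple]] = [(Scope.API_KEY, (req.api_key,))]
    if req.gasket_user is not None:
        keys.append((Scope.GASKET_USER_PROFILE, (req.gasket_user, req.backend_profile)))
        keys.append((Scope.GASKET_USER_GLOBAL, (req.gasket_user,)))
    if req.webui_user is not None:
        keys.append((Scope.WEBUI_USER_PROFILE, (req.webui_user, req.backend_profile)))
        keys.append((Scope.WEBUI_USER_GLOBAL, (req.webui_user,)))
    return keys


req = RequestIdentity(api_key="key-1", backend_profile="default", gasket_user="alice")
keys = scope_keys(req)
```

Per-profile scopes include the profile name in the subject, while global scopes omit it, so usage on any profile accumulates against the same global counter.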
Quota Enforcement Flow
Incoming request
│
▼
Check database for active block status
│
├─ Blocked? ──► Reject request (429)
│
▼
Proxy request to backend
│
▼
Allow response through
│
▼
Trigger background task
│
▼
Query to evaluate quota consumption
│
├─ Quota exceeded? ──► Write block status + expiry to database
Block Status
When a quota is exceeded:
- A block status and expiry timestamp are written to the PostgreSQL database for the relevant user/key/scope.
- On every subsequent incoming request, the block status is checked before proxying; while the block is active, the request is rejected with HTTP 429.
- Once the block expires, requests flow through normally.
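The enforcement flow and block status can be sketched end to end as below. This is a simplified illustration, not Gasket's code: plain dictionaries stand in for the PostgreSQL tables, the quota evaluation runs inline rather than as a background task, and `MAX_TOKENS`, `WINDOW`, and especially `BLOCK_DURATION` are assumed values (the block length is not specified here).

```python
from datetime import datetime, timedelta

MAX_TOKENS = 10_000                   # assumed quota
WINDOW = timedelta(hours=24)          # assumed rolling window
BLOCK_DURATION = timedelta(hours=1)   # assumption: block length is not documented

blocks = {}  # subject -> block expiry (stand-in for the database block table)
usage = {}   # subject -> list of (timestamp, tokens)


def is_blocked(subject: str, now: datetime) -> bool:
    """A block is active only until its expiry timestamp."""
    expiry = blocks.get(subject)
    return expiry is not None and now < expiry


def evaluate_quota(subject: str, now: datetime) -> None:
    """Write a block with an expiry if usage in the window exceeds the quota."""
    cutoff = now - WINDOW
    used = sum(tokens for ts, tokens in usage.get(subject, []) if ts > cutoff)
    if used > MAX_TOKENS:
        blocks[subject] = now + BLOCK_DURATION


def handle_request(subject: str, tokens: int, now: datetime) -> int:
    """Return the HTTP status the client would see."""
    if is_blocked(subject, now):
        return 429
    # Stand-in for proxying to the backend and recording token usage.
    usage.setdefault(subject, []).append((now, tokens))
    status = 200
    # In the flow above this runs as a background task after the response.
    evaluate_quota(subject, now)
    return status


t0 = datetime(2024, 1, 1, 9, 0)
first = handle_request("key-1", 6_000, t0)
second = handle_request("key-1", 6_000, t0 + timedelta(minutes=1))
third = handle_request("key-1", 100, t0 + timedelta(minutes=2))
```

Note that the second request is still served even though it pushes usage past the limit: the quota is evaluated only after the response, so the block takes effect from the next request onward.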