sAIfety is a drop-in proxy server that sits between your application and OpenAI or Anthropic. It automatically applies safety guardrails — PII redaction, prompt injection blocking, topic filtering, and toxicity detection — to every AI request and response without requiring code changes.

Does sAIfety work with OpenAI and Anthropic?

Yes. sAIfety supports both the OpenAI API (via POST /v1/chat/completions) and the Anthropic Messages API (via POST /v1/messages). You only need to change the base_url in your existing client — no other code changes are required.

How do I integrate sAIfety into my app?

Sign up at app.saifety.dev, add your OpenAI or Anthropic API key, and copy your proxy key. Then change one line in your code: set base_url to https://app.saifety.dev/v1 and use your proxy key as the api_key. All requests are then automatically checked against your configured guardrails.

Is sAIfety open source?

Yes. sAIfety is open source under the Apache 2.0 licence. The source code is available at github.com/marshallprogramming/saifety. A hosted, managed version is also available at saifety.dev.

Open source · Works with OpenAI & Anthropic

sAIfety
Guardrails for every AI call.
Zero code changes.

A drop-in proxy that automatically protects your AI features — blocking prompt injection, redacting PII, filtering topics, and validating outputs before they reach your users.

Get started free → See how it works

The problem

Every AI feature ships with the same risks.

Most teams build the same safety checks from scratch, every time. sAIfety means you build them once.

Without sAIfety

Rebuilt in every app, every time

PII reaches the model and gets logged by your AI provider
Users manipulate the AI by overriding your system prompt
No record of what's being sent to or returned from the AI
Safety logic scattered across multiple codebases
One rule change means updating every app that calls AI
Different teams apply different (or no) standards

With sAIfety

Consistent, centralised, automatic

PII is redacted before it ever leaves your infrastructure
Injection attempts are blocked at the proxy layer
Every request and outcome is logged to an audit trail
All rules live in one YAML file, not scattered in code
Update a policy once — every app inherits it instantly
Per-tenant profiles for different risk levels

How it works

One URL change. Full coverage.

sAIfety is a transparent proxy — it speaks the same API as OpenAI and Anthropic. Sign up, add your AI key, and point your code at sAIfety.

Sign up and add your keys

Create a free account at app.saifety.dev. Add your OpenAI or Anthropic API key — we store it encrypted and never log it.

# You get a proxy key like:
sk-saifety-a1b2c3d4e5f6...

Change one line

Point your OpenAI or Anthropic client at sAIfety instead of the AI API directly. Use your proxy key. That's the only code change.

# before
OpenAI(api_key="sk-...")

# after — one line change
OpenAI(
  api_key="sk-saifety-...",
  base_url="https://app.saifety.dev/v1"
)

Configure your rules

Use the dashboard to configure which guardrails apply — PII redaction, prompt injection blocking, topic filters. Changes take effect instantly.

pii: enabled: true
  action: redact
prompt_injection:
  enabled: true
topic_filter:
  blocked_topics:
    - competitor
    - lawsuit

Guardrails

Everything that can go wrong, covered.

Six guardrails run on every request and response, configurable per tenant.

🔏

PII Redaction

Detects sensitive personal data in user messages before they reach the model. Choose to silently redact or block the request entirely.

🛡️

Prompt Injection

Blocks attempts to override your system instructions — jailbreaks, persona hijacking, "ignore all previous instructions" patterns.

🚫

Topic Filter

Blocks requests that mention any topic you configure as off-limits. Useful for brand safety, legal compliance, or competitive reasons.

☣️

Toxicity Filter

Inspects model responses before they reach your users, blocking hate speech, slurs, and harmful content.

📐

Output Validation

Enforce a maximum response length, or require the model's response to conform to a JSON schema — useful for structured data pipelines.

📋

Audit Log

Every request is logged — tenant, API used, outcome, reason if blocked, and message preview. Queryable via API or the dashboard.

Integration

Works with what you already use.

Compatible with the official OpenAI and Anthropic SDKs in Python and JavaScript — no wrapper libraries, no lock-in.

✓ OpenAI Python & JS SDK
✓ Anthropic Python & JS SDK
✓ Any HTTP client (curl, fetch, axios)
✓ LangChain, LlamaIndex, and other frameworks that use OpenAI-compatible endpoints

# The only change needed in your entire codebase
from openai import OpenAI

client = OpenAI(
    api_key="sk-saifety-...",            # ← your proxy key
    base_url="https://app.saifety.dev/v1",  # ← add this
)

# Everything else stays identical
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

// The only change needed in your entire codebase
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-saifety-...",             // ← your proxy key
  baseURL: "https://app.saifety.dev/v1",  // ← add this
});

// Everything else stays identical
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

# Works with the official Anthropic SDK too
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-saifety-...",             # ← your proxy key
    base_url="https://app.saifety.dev",    # ← add this
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

Dashboard

See everything. Understand anything.

A live dashboard shows every request in real time — what was blocked, why, and by which guardrail. Edit your guardrail rules per-tenant directly from the UI. No YAML files, no restarts.

sAIfety Dashboard — localhost:8000

Total Requests

1,284

Blocked

Pass Rate

96.3%

Proxy Status

● Online

TimeTenantOutcomeDetails

14:32:01 default passed What is the capital of France?

14:31:58 strict blocked Request contains PII: email

14:31:44 default blocked Prompt injection attempt detected

14:31:39 default passed Summarise this document for me...

Pricing

Free to self-host. Hosted plans start at $49.

Run sAIfety on your own infrastructure for free — forever. Or use our hosted service and skip the ops work entirely.

Free

Try the hosted version free, or self-host with no limits.

forever

200 requests / month (hosted)
All guardrails included
OpenAI & Anthropic support
Dashboard with live tester
Audit log & token metrics
Unlimited if self-hosted

sAIfety Guardrails for every AI call. Zero code changes.