
🛡️ Pegasi Shield



A lightweight safety and reliability layer for large‑language‑model (LLM) applications.


Overview

Pegasi Shield sits between your application and any LLM (OpenAI, Claude, local models, etc.).
It inspects every prompt and response, blocks or edits unsafe content, and logs decisions for auditing—all with minimal latency and no data egress.

🔬 Research: FRED

Pegasi Shield’s hallucination module is powered by FRED — Financial Retrieval‑Enhanced Detection & Editing. The method was peer‑reviewed and accepted to the ICML 2025 Workshop. Code, evaluation harness and demo notebooks are in fred/.

Open ICML Streamlit Demo

Read the paper on OpenReview


🔧 Key capabilities

  • Prompt security: detects and blocks prompt injections, role hijacking, and system‑override attempts.
  • Output sanitisation: removes personal data, hate speech, defamation, and other policy violations.
  • Hallucination controls: scores and rewrites ungrounded text using a 4B‑parameter model, with performance on par with o3.
  • Observability: emits structured traces and metrics (OpenTelemetry) for dashboards and alerts.
  • Deployment: pure‑Python middleware, Docker image, or Helm chart for Kubernetes / VPC installs.

⚡ Quick start

*Coming July 18th*

pip install pegasi-shield

from pegasi_shield import Shield
from openai import OpenAI

client = OpenAI()
shield = Shield()                       # uses default policy

messages = [{"role": "user", "content": "Tell me about OpenAI o3"}]

# Shield inspects the prompt, runs the wrapped call, then inspects the response
response = shield.chat_completion(
    lambda: client.chat.completions.create(model="gpt-4.1-mini", messages=messages)
)

print(response.choices[0].message.content)

Shield.chat_completion accepts a callable that runs your normal LLM request. Shield returns the same response object—or raises ShieldError if the call is blocked.
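
For example, a blocked call can be handled with an ordinary try/except. This is a minimal sketch: it reuses client and messages from the quick start above, and it assumes ShieldError is importable from pegasi_shield.

from pegasi_shield import Shield, ShieldError   # ShieldError import path is an assumption

shield = Shield()

try:
    response = shield.chat_completion(
        lambda: client.chat.completions.create(model="gpt-4.1-mini", messages=messages)
    )
    print(response.choices[0].message.content)
except ShieldError as exc:
    # The prompt or response violated policy, so Shield blocked the call.
    print(f"Blocked by Shield: {exc}")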


📚 How it works

  1. Prompt firewall — lightweight rules (regex, AST, ML) followed by an optional LLM check.

  2. LLM request — forwards the original or patched prompt to your provider.

  3. Output pipeline
    • heuristics → vector similarity checks → policy LLM
    • optional “Hallucination Lens” rewrite if the factuality score is below threshold

  4. Trace — JSON event with the allow/block/edit decision and risk scores (see the illustrative event after this list).
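
As a rough illustration of step 4 only, a trace event might carry fields like the ones below; every field name here is hypothetical, not Shield’s documented schema.

# Hypothetical trace event (illustrative field names only)
trace_event = {
    "timestamp": "2025-07-18T12:00:00Z",
    "stage": "output_pipeline",          # which stage produced the decision
    "decision": "edit",                  # allow / block / edit
    "risk_scores": {
        "prompt_injection": 0.02,
        "pii": 0.00,
        "factuality": 0.41,
    },
}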

All stages are configurable via YAML or Python.
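
As a sketch of the Python side only, configuration could look like passing a policy mapping when constructing Shield; the constructor argument and every key below are hypothetical, so check the docs for the real option names.

from pegasi_shield import Shield

# Hypothetical policy mapping: key names are illustrative, not documented options.
policy = {
    "prompt_firewall": {"llm_check": True},        # stage 1: rules plus an optional LLM check
    "output_pipeline": {
        "hallucination_lens": True,                # rewrite ungrounded text
        "factuality_threshold": 0.7,               # rewrite when the factuality score falls below this
    },
}

shield = Shield(policy)   # assumed signature; Shield() with no arguments uses the default policy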


Roadmap

  • v0.5 launch (July 18th)
  • LiveKit Agent Tutorial
  • LangGraph Agent Tutorial
  • Fine‑grained policy language
  • Streaming output inspection
  • JavaScript/TypeScript SDK

Contributing

Issues and pull requests are welcome. See CONTRIBUTING.md for details.


License

Apache 2.0

