Prometheus AI is an Adaptive Defense Platform that uses machine learning to detect threats, classify attacker intent, and drive automated response. With that capability comes responsibility. This policy outlines how we ensure our AI systems operate transparently, fairly, and safely.
1. AI System Transparency
We believe you should understand what our models do and how they reach their conclusions:
- Model Architecture: Prometheus uses a 5-model deep learning ensemble: a deep threat classifier (residual network), a behavioral sequence model (LSTM), a malware feature classifier, an ensemble meta-learner (stacking), and an anomaly autoencoder. Each model contributes a score, and the ensemble produces a final classification.
- Feature Extraction: Models analyze fixed-size feature vectors (up to 32 dimensions per event type) extracted from security telemetry. Features cover network traffic patterns, process behavior metrics, authentication sequences, file system activity, and system-level indicators. No personally identifiable information (PII) is used as a model input.
- Classification Output: Every detection includes a threat type classification, confidence score, contributing model versions, and the specific features that drove the classification. This information is available in the Client Portal and via API.
- Explainability: Our detection pipeline generates human-readable explanations for every alert, mapping detected behaviors to specific MITRE ATT&CK techniques and kill chain phases.
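To illustrate how per-model scores can combine into a final classification with a per-model contribution breakdown, here is a minimal sketch. The model names, weights, threshold, and the weighted-average combiner are illustrative assumptions, not the production meta-learner.

```python
# Hypothetical sketch: combine per-model threat scores into a final verdict
# and surface each model's contribution for explainability. Weights and the
# weighted-average combiner are illustrative assumptions.

MODEL_WEIGHTS = {
    "threat_classifier": 0.30,
    "behavioral_lstm": 0.25,
    "malware_classifier": 0.20,
    "meta_learner": 0.15,
    "anomaly_autoencoder": 0.10,
}

def classify(scores: dict, threshold: float = 0.5) -> dict:
    """Combine per-model scores (each 0..1) into a final classification."""
    contributions = {
        name: scores[name] * weight for name, weight in MODEL_WEIGHTS.items()
    }
    final = sum(contributions.values())
    return {
        "verdict": "threat" if final >= threshold else "benign",
        "confidence": round(final, 4),
        "contributions": contributions,  # per-model breakdown for explainability
    }

result = classify({
    "threat_classifier": 0.9,
    "behavioral_lstm": 0.8,
    "malware_classifier": 0.7,
    "meta_learner": 0.85,
    "anomaly_autoencoder": 0.6,
})
```

The returned `contributions` dict is what lets an alert show which models drove the verdict, mirroring the "contributing model versions" and "specific features" fields described above.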
2. Human Oversight
AI does not operate unsupervised. We maintain multiple layers of human oversight:
- Shadow Scoring: New models run in shadow mode alongside production models. Their predictions are logged and compared but never acted upon until validated by our promotion process.
- 5-Gate Promotion: Before a model can enter production, it must pass five validation gates: accuracy threshold, false positive rate limit, consistency check against the existing model, performance benchmarking, and stability over a minimum observation period.
- Analyst Review: High-impact automated actions (endpoint isolation, network containment) can be configured to require human approval through our playbook approval workflow. Clients control which actions are fully automated and which require sign-off.
- Disagreement Tracking: When shadow and production models disagree, the disagreement is logged, surfaced in the admin dashboard, and analyzed to improve both models.
- Auto-Rollback: If a promoted model shows a spike in false positives in production, the system automatically rolls back to the previous model version without requiring human intervention.
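The five promotion gates above can be sketched as a single all-or-nothing check. The specific thresholds, metric names, and the latency comparison below are assumptions for demonstration, not the production values.

```python
# Illustrative sketch of the 5-gate promotion check. Thresholds and metric
# names are assumptions; a candidate must clear every gate to be promoted.

def passes_promotion_gates(candidate: dict, production: dict) -> bool:
    """Return True only if the candidate model clears all five gates."""
    gates = [
        candidate["accuracy"] >= 0.99,                   # 1. accuracy threshold
        candidate["false_positive_rate"] <= 0.01,        # 2. false positive limit
        candidate["agreement_with_production"] >= 0.95,  # 3. consistency check
        candidate["p95_latency_ms"]
            <= production["p95_latency_ms"] * 1.1,       # 4. performance benchmark
        candidate["shadow_days"] >= 7,                   # 5. minimum observation period
    ]
    return all(gates)

candidate = {
    "accuracy": 0.996,
    "false_positive_rate": 0.004,
    "agreement_with_production": 0.97,
    "p95_latency_ms": 42,
    "shadow_days": 14,
}
production = {"p95_latency_ms": 40}
```

A single failing gate (for example, too few days in shadow mode) blocks promotion, which is the intended fail-closed behavior.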
3. Bias Mitigation
We actively work to prevent bias in our detection models:
- Balanced Training Data: Our training pipeline balances classes through undersampling over-represented categories and SMOTE-like interpolation for rare attack types. This prevents the model from being biased toward common attack patterns at the expense of detecting rare but critical threats.
- Leakage Audits: We perform data leakage audits to ensure that training data does not contain features that inflate test performance but would be unavailable in real-world detection scenarios.
- Cross-Sensor Validation: Models are trained and validated on data collected from six independent honeypot sensors, each receiving distinct attack traffic, with held-out test data strictly separated from training data.
- Diverse Real-World Data: Our models are trained on over 2.6 million real attack sessions collected from six live honeypot sensors, representing thousands of unique threat actor IPs from diverse geographic origins. This provides authentic, varied threat signal rather than synthetic or lab-generated data.
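The balancing strategy described above (undersampling over-represented classes, SMOTE-like interpolation for rare ones) can be sketched as follows. This is a minimal illustration, not the production pipeline; the target size and interpolation details are assumptions.

```python
import random

# Hypothetical sketch of class balancing: undersample majority classes and
# synthesize rare-class samples by interpolating between existing feature
# vectors (SMOTE-like). Not the production training pipeline.

def balance(classes: dict, target: int, seed: int = 0) -> dict:
    """Resample each class (label -> list of feature vectors) to `target` size."""
    rng = random.Random(seed)
    balanced = {}
    for label, samples in classes.items():
        if len(samples) > target:
            # Undersample the over-represented class.
            balanced[label] = rng.sample(samples, target)
        else:
            # SMOTE-like: add points on the line between two real samples.
            synthetic = list(samples)
            while len(synthetic) < target:
                a, b = rng.sample(samples, 2)
                t = rng.random()
                synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
            balanced[label] = synthetic
    return balanced
```

Interpolated samples stay inside the convex hull of the real rare-class data, which is why this approach adds signal for rare attack types without inventing unrealistic feature combinations.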
4. Data Privacy in ML
We design our ML pipeline with privacy at every layer:
- No PII in Features: Our feature extraction pipeline processes raw telemetry into numerical feature vectors. No usernames, hostnames, IP addresses, or other PII is used as a model input. Features represent behavioral patterns (connection frequency, entropy scores, timing distributions) rather than identifiable data.
- Anonymization: When telemetry is used for model training, client identifiers are stripped and source data is anonymized using SHA256 hashing before entering the training pipeline.
- Federated Threat Intelligence: Our BOLO (Be On the Lookout) system shares threat indicators across clients with source identities explicitly stripped at the code level. Receiving clients benefit from collective intelligence without ever seeing which client was originally targeted.
- Self-Hosted Deployment: The entire Prometheus platform — detection engine, ML models, and API — can be deployed on-premise using our Docker-based deployment. Agents connect to the customer-hosted server via a configurable endpoint URL. In a self-hosted deployment, no data leaves the organization's network. The agent also performs local event collection and queuing, supporting offline resilience.
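The identifier-stripping step before training can be sketched as below. The field names and the per-deployment salt are illustrative assumptions; the source states only that client identifiers are stripped and data is anonymized with SHA256.

```python
import hashlib

# Minimal sketch of pre-training anonymization: direct identifiers are
# replaced with salted SHA-256 digests. Field names and the salt value are
# illustrative assumptions, not the production schema.

SALT = b"per-deployment-secret"  # hypothetical; would be deployment-specific

def anonymize(event: dict) -> dict:
    """Replace identifier fields with irreversible hashed tokens."""
    hashed = dict(event)
    for field in ("client_id", "hostname", "username"):
        if field in hashed:
            digest = hashlib.sha256(SALT + hashed[field].encode()).hexdigest()
            hashed[field] = digest[:16]  # truncated digest as an opaque token
    return hashed
```

Hashing is deterministic, so the same client maps to the same opaque token across training batches (preserving behavioral continuity) while the original identifier never enters the pipeline.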
5. Model Safety
We ensure that ML models cannot cause harm even when they fail:
- Deterministic Rules Backstop: ML models are never the sole decision-maker. Deterministic detection rules (beaconing patterns, ransomware file mutations, brute force thresholds) operate independently of ML and will catch threats even if all models fail.
- Confidence Thresholds: Automated response actions require ML confidence scores above configurable thresholds. Low-confidence detections generate alerts for human review rather than triggering automated responses.
- Response Level Controls: Clients can set maximum response levels (monitor, alert, contain, neutralize) that cap the severity of automated actions regardless of model confidence.
- Protected System Accounts: The response executor maintains a hardcoded list of 30+ protected system accounts whose processes can never be killed and which can never be quarantined, regardless of model output.
- Rollback Capability: All containment playbooks include defined rollback actions (IP unblocking, user re-enabling, domain unblocking). Rollback can be triggered via a single API call or admin action on any playbook execution, reversing containment immediately.
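The interaction of these safeguards (confidence thresholds, response level caps, and protected accounts) can be sketched as a single decision function. The level names mirror the policy; the 0.85 threshold and the account list excerpt are assumptions.

```python
# Illustrative decision logic combining the safeguards above. The confidence
# threshold and the protected-account excerpt are assumptions; level names
# come from the policy (monitor < alert < contain < neutralize).

LEVELS = ["monitor", "alert", "contain", "neutralize"]
PROTECTED_ACCOUNTS = {"SYSTEM", "root", "LOCAL SERVICE"}  # excerpt, illustrative

def decide_response(confidence: float, proposed: str, max_level: str,
                    target_account: str = None) -> str:
    """Return the action actually taken for a proposed automated response."""
    if target_account in PROTECTED_ACCOUNTS:
        return "alert"            # never act against protected accounts
    if confidence < 0.85:         # assumed threshold; configurable per client
        return "alert"            # low confidence -> route to human review
    # Cap the action at the client's configured maximum response level.
    allowed = min(LEVELS.index(proposed), LEVELS.index(max_level))
    return LEVELS[allowed]
```

Note that every guard degrades to `alert` rather than silently dropping the detection: the event is always surfaced, only the automated action is withheld.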
6. Continuous Monitoring and Improvement
Our AI systems are continuously monitored and improved:
- Drift Detection: Two complementary drift detection systems operate continuously. The DriftMonitor worker tracks false positive rates against model baselines and triggers automatic rollback when performance degrades. The real-time DriftTracker monitors prediction confidence, class distribution, and anomaly rates at the inference layer using a sliding window, triggering retraining when distribution shifts exceed configurable thresholds.
- Analyst Feedback Loop: Security analysts can provide feedback on detections (confirm, dismiss, reclassify). This feedback is incorporated into model retraining to continuously improve accuracy.
- Honeypot Network: Our network of six live honeypot sensors provides a continuous stream of real attack data via an always-on feed service. This data drives model retraining and keeps detection current with evolving attacker techniques.
- Performance Metrics: We track precision, recall, F1 score, false positive rate, and detection latency across all models. Current ensemble accuracy: 99.63% on a held-out test set of 108,867 samples with independently verified leakage-free validation (zero feature leakage, zero duplicate samples across splits).
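A sliding-window confidence monitor of the kind the DriftTracker description implies can be sketched as follows. The window size, baseline comparison, and drop threshold are assumptions; the production tracker also watches class distribution and anomaly rates.

```python
from collections import deque

# Sketch of a sliding-window drift signal: track mean prediction confidence
# over the last N inferences and flag drift when it falls a configurable
# amount below the model's baseline. Parameters are illustrative assumptions.

class DriftTracker:
    def __init__(self, baseline_confidence: float, window: int = 100,
                 max_drop: float = 0.10):
        self.baseline = baseline_confidence
        self.max_drop = max_drop
        self.recent = deque(maxlen=window)

    def observe(self, confidence: float) -> bool:
        """Record one prediction's confidence; return True if drift is detected."""
        self.recent.append(confidence)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough observations yet
        mean = sum(self.recent) / len(self.recent)
        return (self.baseline - mean) > self.max_drop
```

Because the window slides, a transient dip decays out of the signal, while a sustained confidence collapse keeps the drift flag raised until retraining or rollback restores the baseline.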
7. Contact
Questions about our AI practices? Contact us at [email protected] or visit our Contact page.