MLSecOps, the practice of embedding security into every phase of the machine learning lifecycle, is no longer optional for enterprises running AI in production. A traditional DevSecOps program covers source code, dependencies, and infrastructure. It does not cover training datasets, model weights, fine-tuning pipelines, or the behavioral properties of deployed models. That gap is where attackers now operate.
This guide covers a four-phase enterprise MLSecOps framework: data pipeline security, training and fine-tuning security, evaluation gates, and deployment and runtime protection. Each phase has specific controls, recommended tooling, and detection logic you can apply today.
Key Takeaways
- ML pipelines are a distinct attack surface from software pipelines: non-determinism, data dependencies, and model artifacts require dedicated controls that DevSecOps tools do not provide.
- Data poisoning can embed persistent malicious behavior with as few as one poisoned example per 10,000 fine-tuning samples.
- Model artifact signing using the OpenSSF Model Signing specification and Sigstore is production-ready and should be mandatory in every CI/CD pipeline.
- AIBOM (AI Bill of Materials) generation is required for EU AI Act Article 53 compliance and is the foundation of ML supply chain traceability.
- OPA (Open Policy Agent) policy gates in your ML pipeline enforce security requirements without adding manual review overhead.
- Inference endpoints need the same hardening as any external API: authentication, rate limiting, input validation, and structured logging.
- Behavioral drift monitoring in production is the detection layer that catches compromised or degraded models after deployment.
Why MLSecOps Is Not Just DevSecOps with a New Name
DevSecOps addresses deterministic systems: given the same inputs, the same code produces the same outputs. Security controls such as SAST, dependency scanning, and container image signing work because the artifact under review is static.
ML systems differ in three fundamental ways.
First, behavior is encoded in data, not code. A model's outputs are determined by its training data and weight values, not its source code. You can audit every line of the training script and still deploy a model with embedded backdoors if the dataset was poisoned.
Second, model artifacts are opaque. A PyTorch checkpoint or ONNX file does not have a predictable manifest. Weights are high-dimensional floating-point tensors, and detecting malicious modifications requires behavioral analysis, not static scanning.
Third, the supply chain includes public model registries. In early 2026, researchers discovered multiple malicious model weight files on Hugging Face containing backdoors activated by specific input trigger tokens. This applies the software supply chain attack pattern directly to ML artifacts, bypassing every traditional SAST tool.
MLSecOps adds the controls that address these three differences. It extends DevSecOps rather than replacing it.
The OpenSSF MLSecOps Whitepaper from August 2025 provides a detailed breakdown of how ML-specific controls map to existing DevSecOps toolchains.
Phase 1: Data Pipeline Security
Data is the first-class dependency in ML. Every training dataset, validation set, and fine-tuning corpus is an attack surface. The controls in this phase protect the integrity, provenance, and access to that data before it reaches a training job.
Data Provenance Tracking
Every dataset entering a training pipeline should have a provenance record: where it came from, who ingested it, when it was last validated, and what transformations were applied. Without provenance, you cannot trace a behavioral anomaly back to its source data.
In practice, use a metadata store such as MLflow, DVC, or a purpose-built data catalog to record a cryptographic hash of each dataset version alongside its lineage. Flag any training job that uses an unregistered or hash-mismatched dataset for human review before execution.
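A minimal sketch of the registration step, assuming files-on-disk datasets and MLflow as the metadata store; the run name, provenance fields, and path below are illustrative:

```python
import hashlib
import mlflow

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the dataset file so large corpora do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

dataset_path = "data/finetune_corpus_v3.jsonl"  # illustrative path
with mlflow.start_run(run_name="dataset-registration"):
    mlflow.log_params({
        "dataset_path": dataset_path,
        "dataset_sha256": sha256_of_file(dataset_path),
        "ingested_by": "data-eng-service-account",  # illustrative provenance fields
        "source": "internal-crm-export-2025-11",
    })
```

A training job can then refuse to start unless the hash it computes over its input matches the registered value.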
Poisoning Detection at Ingestion
Data poisoning research has demonstrated that injecting as few as one malicious example per 10,000 fine-tuning samples can reliably embed backdoor behavior. Detection at the dataset level uses statistical methods: outlier detection on label distributions, embedding-space clustering to identify injected outlier samples, and cross-reference against known-clean dataset checksums.
For fine-tuning datasets specifically, run an unsupervised outlier detector such as DBSCAN (density-based clustering) or Isolation Forest on the embedding space of training examples before each fine-tuning job. Examples with unusually high outlier scores warrant manual review before the job proceeds.
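A minimal sketch of that screen with scikit-learn's IsolationForest, assuming example embeddings have already been computed; the contamination rate, review count, and file path are illustrative rather than tuned values:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# embeddings: (n_examples, dim) array for the candidate fine-tuning set
embeddings = np.load("finetune_embeddings.npy")  # illustrative path

detector = IsolationForest(
    n_estimators=200,
    contamination=0.001,  # assume roughly 0.1% of examples are suspect
    random_state=0,
)
detector.fit(embeddings)

# score_samples returns higher values for inliers; the most negative scores
# are the most isolated examples and go to manual review.
scores = detector.score_samples(embeddings)
review_indices = np.argsort(scores)[:50]
print(f"Flagged {len(review_indices)} examples for manual review before fine-tuning")
```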
Access Control on Training Data
Training data stores are high-value targets. Treat them with the same access discipline as production databases: least-privilege read access for training jobs, write access requiring an approval workflow, and audit logging on every read operation.
A common pattern in enterprise ML teams is to grant data scientists broad read access to all datasets by default. This creates an unnecessarily wide blast radius if a training environment is compromised. Scope data access to the specific datasets needed for each training job, using service accounts with short-lived credentials scoped to the job duration.
For regulated data (PII, PHI, financial records), apply field-level encryption before ingestion into training datasets. An AI security audit should verify that no regulated data reaches training pipelines without explicit data classification and consent tracking.
Phase 2: Training and Fine-Tuning Security
Training environments are often treated as internal tools with relaxed security postures. In practice, a compromised training job can produce a backdoored model that passes all functional tests and reaches production.
RBAC on Training Job Submission
Every training job submission should be authenticated and authorized. Explicitly define who can submit a training job against production datasets, who can modify training configurations, and who can trigger fine-tuning on a production model.
Define roles clearly: data scientist (read datasets, submit jobs to dev environments), ML engineer (submit to staging), ML ops (promote to production). Enforce these roles through your ML platform (Vertex AI, SageMaker, Azure ML) and route all job submissions to a centralized audit log.
Model Artifact Signing with Sigstore
Model artifact signing is production-ready. The OpenSSF Model Signing (OMS) specification defines a standard for cryptographically signing ML models using Sigstore. The Python library model-signing supports Sigstore keyless signing, binding signatures to CI/CD OIDC identity rather than long-lived keys.
Integrate artifact signing at the end of every training job:
```python
from model_signing import sign

sign.sign_model_directory(
    model_dir="./model_output",
    signer="sigstore",  # Binds to CI/CD OIDC identity
    output_sig="./model_output/model.sig",
)
```
Before any model is promoted to staging or production, verify its signature:
```python
from model_signing import verify

result = verify.verify_model_directory(
    model_dir="./model_to_deploy",
    sig_file="./model_to_deploy/model.sig",
)
if not result.success:
    raise ValueError(f"Model signature verification failed: {result.reason}")
```
Unsigned models are blocked from promotion by an OPA policy gate in the CI/CD pipeline.
SafeTensors: Closing the Pickle Attack Surface
The pickle format, used by default in PyTorch checkpoints, allows arbitrary Python code execution during deserialization. An actor who can modify a model checkpoint can embed arbitrary code that executes when the model loads anywhere in your infrastructure.
SafeTensors eliminates this attack surface. The format restricts storage to tensor data types and cannot execute code on deserialization. Enforce SafeTensors as a pipeline requirement:
- Block promotion of models in pickle format (`.pt`, `.pkl`, `.bin` with pickle headers) via an OPA policy gate.
- Migrate legacy models to SafeTensors using Hugging Face's provided conversion tooling (a manual conversion sketch follows this list).
- Verify format compliance in CI with a lightweight static header check before any evaluation gate runs.
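For teams converting checkpoints manually rather than through Hugging Face's hosted tooling, a minimal sketch, assuming the legacy checkpoint is a plain state dict of tensors rather than a full pickled module; paths are illustrative:

```python
import torch
from safetensors.torch import save_file

# weights_only=True restricts unpickling to tensor data while reading the legacy file.
state_dict = torch.load("legacy_model.pt", map_location="cpu", weights_only=True)

# SafeTensors requires contiguous tensors; normalize strided views before saving.
state_dict = {k: v.contiguous() for k, v in state_dict.items()}
save_file(state_dict, "model.safetensors")
```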
Backdoor Detection Before Promotion
Neural Cleanse and STRIP are the two most established techniques for detecting backdoored models before deployment. Both run as evaluation gates in the CI pipeline.
Neural Cleanse optimizes for the smallest input perturbation that causes the model to misclassify toward each possible output class. A successful backdoor attack produces an anomalously small perturbation for the trigger class, which the algorithm flags as a backdoor indicator.
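The flagging step reduces to an outlier test over the reverse-engineered trigger sizes. A minimal sketch of that test, assuming the per-class trigger L1 norms have already been optimized; the threshold of 2 follows the original Neural Cleanse paper, and the file path is illustrative:

```python
import numpy as np

def neural_cleanse_anomaly_index(trigger_l1_norms: np.ndarray) -> np.ndarray:
    """Median-absolute-deviation outlier score over per-class trigger norms."""
    median = np.median(trigger_l1_norms)
    mad = 1.4826 * np.median(np.abs(trigger_l1_norms - median))  # consistency constant
    return np.abs(trigger_l1_norms - median) / mad

# trigger_l1_norms[c] = L1 norm of the smallest perturbation that flips inputs toward class c
trigger_l1_norms = np.load("trigger_norms.npy")  # illustrative artifact from the optimization step
index = neural_cleanse_anomaly_index(trigger_l1_norms)

# Only anomalously *small* triggers indicate a backdoor toward that class.
suspects = np.where((index > 2.0) & (trigger_l1_norms < np.median(trigger_l1_norms)))[0]
if suspects.size:
    raise SystemExit(f"Backdoor indicator: anomalously small trigger for classes {suspects.tolist()}")
```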
STRIP tests behavioral consistency under strong perturbations. When a trigger is present, a backdoored model keeps predicting the attacker's target class with high confidence even under heavy input perturbation, because the trigger dominates the prediction. Clean models show the expected confidence degradation under perturbation.
For large language models, automated red teaming as a CI gate is more practical than Neural Cleanse. Run a fixed set of adversarial prompt templates against each candidate model and flag statistically significant behavioral changes from the baseline. See the LLM CI/CD security testing guide for implementation patterns.
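A minimal sketch of such a gate; `model_generate` is a stand-in for whatever call your serving harness exposes, and the refusal-marker heuristic, baseline value, and five-point threshold are all illustrative:

```python
import json

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm not able to")

def model_generate(prompt: str) -> str:
    # Stand-in: replace with a call to the candidate model's inference endpoint.
    raise NotImplementedError

def refusal_rate(generate, prompts: list[str]) -> float:
    refusals = sum(
        any(marker in generate(p).lower() for marker in REFUSAL_MARKERS)
        for p in prompts
    )
    return refusals / len(prompts)

with open("adversarial_prompts.json") as f:  # fixed, version-controlled template set
    prompts = json.load(f)

baseline_rate = 0.97  # recorded against the current production model
candidate_rate = refusal_rate(model_generate, prompts)

# Fail the gate if the candidate complies with adversarial prompts noticeably more often.
if baseline_rate - candidate_rate > 0.05:
    raise SystemExit(
        f"Red-team gate failed: refusal rate dropped {baseline_rate:.2f} -> {candidate_rate:.2f}"
    )
```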
Phase 3: Evaluation Gates and Model Validation
Evaluation gates are checkpoints between lifecycle stages where a model must pass defined security criteria before promotion. They function as security policy enforcement points within the ML CI/CD pipeline.
Structuring Evaluation Gates
Define three gate types across the promotion lifecycle:
Dev-to-staging gate: Functional quality checks plus artifact format validation (SafeTensors), signature verification, and dataset provenance confirmation. Fully automated.
Staging-to-production gate: Full security evaluation including backdoor analysis, adversarial robustness testing, behavioral regression against the production model, and AIBOM generation. Partial manual review required for high-risk model changes.
Hotfix gate: Expedited path for emergency deployments with artifact signing and behavioral regression as the minimum floor. Hotfix deployments automatically trigger a post-deployment security audit within 24 hours.
OPA Policy Enforcement
Open Policy Agent provides the policy-as-code layer for all three gate types. Policies are version-controlled, auditable, and executable from any CI/CD system.
Example OPA policy blocking unsigned models or non-compliant formats:
```rego
package mlsecops.deployment

deny[msg] {
    input.model.signature_verified == false
    msg := "Model must have a verified cryptographic signature before promotion"
}

deny[msg] {
    input.model.format != "safetensors"
    msg := "Model artifacts must use SafeTensors format"
}

deny[msg] {
    input.model.aibom_generated == false
    msg := "AIBOM must be generated before production promotion"
}
```
This policy runs at every promotion gate. A model that fails any condition is blocked with a clear audit trail of the rejection reason.
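A sketch of how a CI job might exercise the policy, assuming the standard `opa eval` CLI is available on the runner; the input fields mirror the policy above, the values are illustrative, and the exact result-parsing path may vary by OPA version:

```python
import json
import subprocess

# Gate input assembled by the CI job from signing, format, and AIBOM checks.
gate_input = {
    "model": {
        "signature_verified": True,
        "format": "safetensors",
        "aibom_generated": True,
    }
}
with open("gate_input.json", "w") as f:
    json.dump(gate_input, f)

# Evaluate the deny rules; a non-empty result blocks promotion.
result = subprocess.run(
    ["opa", "eval", "-d", "deployment.rego", "-i", "gate_input.json",
     "--format", "json", "data.mlsecops.deployment.deny"],
    capture_output=True, text=True, check=True,
)
denials = json.loads(result.stdout)["result"][0]["expressions"][0]["value"]
if denials:
    raise SystemExit(f"Promotion blocked: {denials}")
```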
AIBOM Generation
An AI Bill of Materials is a machine-readable inventory of every component shaping a model's behavior: training datasets (with checksums), model architecture, hyperparameters, fine-tuning datasets, evaluation results, and signing provenance.
The CycloneDX ML-BOM specification provides a standardized schema for AIBOM generation. Generate an AIBOM at the end of each training run and store it alongside the model artifact. The AIBOM becomes the primary traceability record for EU AI Act Article 53 compliance and the anchor for supply chain incident response.
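As a rough sketch of the shape rather than a complete document, a CycloneDX ML-BOM pairs the model component with the datasets that shaped it; the field values below are illustrative and the full specification defines many more fields (model card, data governance, evaluation metadata):

```python
# Minimal illustrative ML-BOM structure, serialized to JSON and stored with the model artifact.
aibom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.6",
    "components": [
        {
            "type": "machine-learning-model",
            "name": "support-triage-classifier",
            "version": "2025.11.3",
            "hashes": [{"alg": "SHA-256", "content": "<model artifact digest>"}],
        },
        {
            "type": "data",
            "name": "finetune_corpus_v3",
            "hashes": [{"alg": "SHA-256", "content": "<dataset digest>"}],
        },
    ],
}
```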
For a detailed look at the AI supply chain threat landscape and how AIBOM fits within it, see the AI model supply chain security guide.
Phase 4: Deployment and Runtime Security
A model that passes all training and evaluation security gates can still be compromised or misused at the inference layer. Runtime security covers inference endpoint hardening and behavioral monitoring in production.
Inference Endpoint Hardening
Every inference endpoint is an external API and requires the same security baseline as any other external service.
Authentication: Mutual TLS for service-to-service calls, or OAuth 2.0 with short-lived tokens for user-facing endpoints. API keys without automatic rotation policies are insufficient for production inference endpoints.
Input validation: Define schema validation at the endpoint layer. Reject inputs exceeding token or byte limits, containing malformed structures, or matching known injection patterns. Do not rely solely on the model to handle adversarial inputs; model-level defenses are the last line, not the first.
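A minimal sketch of endpoint-layer validation using Pydantic; the length limits and injection patterns are illustrative placeholders, not a complete deny-list:

```python
import re
from pydantic import BaseModel, Field, field_validator

BLOCKED_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r"ignore (all )?previous instructions",
    r"\bBEGIN SYSTEM PROMPT\b",
)]

class InferenceRequest(BaseModel):
    prompt: str = Field(min_length=1, max_length=8_000)  # character cap before tokenization
    max_tokens: int = Field(default=512, ge=1, le=2_048)

    @field_validator("prompt")
    @classmethod
    def reject_known_injections(cls, value: str) -> str:
        if any(p.search(value) for p in BLOCKED_PATTERNS):
            raise ValueError("prompt matches a known injection pattern")
        return value
```

Requests that fail validation are rejected before the model is ever invoked, which keeps the model-level defenses as a backstop rather than the only control.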
Rate limiting: Per-client and per-model rate limits prevent abuse and contain the blast radius of compromised API keys. Apply limits at the API gateway layer, not inside model-serving code, so they remain in force even during model updates.
Output filtering: Scan model outputs for PII patterns, sensitive data markers, and policy violations before returning results to the caller. Output filtering catches prompt injection attacks that attempt to exfiltrate data through model responses.
Structured logging: Log every inference request with client identity, input hash, output hash, latency, and any policy violations triggered. This log is the primary data source for anomaly detection and incident response.
A BeyondScale AI security assessment includes inference endpoint hardening review as part of every engagement, covering both the API surface and the underlying serving infrastructure.
Behavioral Drift Monitoring
Model behavior can change after deployment due to input distribution shift, model updates, or active manipulation. Behavioral drift monitoring detects these changes before they become incidents.
Implement a shadow evaluation pipeline that continuously runs a fixed set of probe inputs against the deployed model and records the output distribution. Alert on statistically significant deviations from the baseline distribution. This catches both quality degradation and security-relevant behavioral changes from potential active compromise.
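A minimal sketch of the comparison step, assuming each probe response has been reduced to a scalar score (for example, top-class confidence, a refusal indicator, or an output-embedding distance); the file paths and significance threshold are illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

baseline_scores = np.load("probe_baseline_scores.npy")  # recorded at deployment time
current_scores = np.load("probe_current_scores.npy")    # collected this monitoring window

# Two-sample Kolmogorov-Smirnov test for a shift in the probe output distribution.
statistic, p_value = ks_2samp(baseline_scores, current_scores)
if p_value < 0.01:
    print(f"Behavioral drift alert: KS statistic={statistic:.3f}, p={p_value:.4f}")
```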
For high-stakes deployments, use a canary rollout pattern: route a percentage of production traffic to a new model version alongside the current production model. Compare behavioral distributions in real time before completing the rollout. Automatic rollback triggers if anomalies exceed defined thresholds.
Tooling Reference
The MLSecOps tooling ecosystem has matured significantly since 2024. The table below is a practical reference for enterprise implementation:
| Category | Tool | Purpose |
|---|---|---|
| Artifact Signing | Sigstore / model-signing | Keyless cryptographic signing of model artifacts |
| Format Enforcement | SafeTensors | Block code execution during model deserialization |
| Data Provenance | DVC, MLflow | Dataset versioning and lineage tracking |
| Policy Enforcement | OPA (Open Policy Agent) | Policy-as-code gates in CI/CD pipelines |
| Supply Chain Inventory | CycloneDX ML-BOM | AIBOM generation and management |
| Backdoor Detection | Neural Cleanse, STRIP | Model integrity evaluation before promotion |
| Vulnerability Scanning | Veritensor, pip-audit | Static analysis of ML artifacts and dependencies |
| Inference Security | Envoy, Kong, Nginx | API gateway with rate limiting and mTLS |
| Behavioral Monitoring | Prometheus with custom probes | Drift detection in production |
The NIST AI Risk Management Framework (AI RMF 1.0) maps directly to these tool categories across its four core functions: Govern, Map, Measure, and Manage. Aligning your MLSecOps controls to the AI RMF builds a defensible governance documentation set for compliance audits.
Enterprise MLSecOps Checklist
Across the four phases, the minimum viable control set for an enterprise running AI in production:
- [ ] Dataset provenance records with cryptographic checksums for all training data
- [ ] Data ingestion pipeline includes outlier detection for poisoning signals
- [ ] Least-privilege RBAC on all training job submission and dataset access
- [ ] All model artifacts stored in SafeTensors format; pickle format blocked by OPA policy
- [ ] Model artifact signing with Sigstore integrated into every training job
- [ ] OPA policy gates enforce signing, format, and AIBOM requirements in CI/CD
- [ ] AIBOM generated and version-controlled for every production model
- [ ] Backdoor detection (Neural Cleanse or adversarial red teaming) runs in staging gate
- [ ] Inference endpoints have authentication, rate limiting, and input validation
- [ ] Structured request logging feeds anomaly detection pipeline
- [ ] Behavioral drift monitoring runs continuously against production models
- [ ] Incident response runbook covers model rollback and supply chain incident scenarios
Conclusion
MLSecOps is a distinct engineering discipline, not a relabeling of DevSecOps. The threats are different, the artifacts are different, and the controls are different. Data pipelines, model artifacts, evaluation gates, and inference endpoints each have specific security requirements that standard software security tools do not address.
The four-phase framework in this guide provides the structure for an enterprise MLSecOps program. Start with data provenance and artifact signing: these two controls deliver the highest risk reduction per implementation hour. Add evaluation gates and inference hardening as the program matures.
If you want an independent assessment of your current ML pipeline security posture, start with a BeyondScale AI security scan or contact the team for a full AI security assessment.