AI Security

AI Model Supply Chain Security: What to Check Before Deploying

Jayakrishna S

AI Security Team

14 min read

Seventy percent of open-source AI and machine learning repositories contain at least one workflow with a critical or high-severity security issue. That finding — from Mitiga Labs' analysis of 10,000 AI/ML repositories — is not a theoretical projection. It is a measurement of the ecosystem your organization is pulling from when you download a model from Hugging Face, run a fine-tuning script from GitHub, or add an inference dependency via pip.

AI model supply chain security is the set of practices that stand between your production AI deployment and a compromised upstream artifact. Unlike software supply chains, where the threat surface is code, AI supply chains have a unique attack surface: model weights themselves can carry backdoors that are invisible to every standard security tool you have.

This guide covers the major attack vectors, real incidents and research findings, and a 14-point pre-deployment checklist for any open-source LLM your organization is considering.

Key Takeaways

  • AI supply chain attacks can compromise model weights directly — these backdoors survive code review, static analysis, and even fine-tuning
  • 70% of open-source AI/ML repositories have critical or high-severity CI/CD workflow vulnerabilities (Mitiga Labs, 2024)
  • JFrog found 100+ malicious ML models on Hugging Face in Feb 2024; by early 2025, over 3,300 of 400,000 scanned models were vulnerable
  • ShadowLogic backdoors persist through fine-tuning and model format conversion — a property unique to AI supply chain attacks
  • OWASP LLM03:2025 Supply Chain Vulnerabilities rose from 5th to 3rd place in the current rankings
  • A structured pre-deployment audit covering provenance, file scanning, dependency review, and CI/CD analysis can catch most supply chain risks before they reach production

Why AI Model Supply Chains Are Different from Software Supply Chains

In traditional software supply chain security, the threat model is well-understood: an attacker compromises a library, a build script, or a CI/CD pipeline to introduce malicious code that executes in the target environment. Tools like SAST, SCA scanners, and dependency lockfiles address this threat class at known chokepoints.

AI model supply chains share some of these risks — the CI/CD and dependency layers are vulnerable in familiar ways. But they introduce a threat class that has no direct analogue in software security: backdoored model weights.

A model is not code. It is a set of numerical parameters — billions of floating-point values — that encode learned behaviors. These parameters can be manipulated to introduce a backdoor: a specific trigger input that causes the model to behave in a way an attacker controls, while maintaining normal behavior for all other inputs. The backdoor is not in any function, import statement, or executable file. It is in the mathematics of the model itself.

This means:

  • Standard antivirus and endpoint protection have zero detection capability for weight-embedded backdoors
  • Static analysis and code review cannot identify them
  • The backdoor may not appear in any security scan of the deployment environment
  • In some implementations, the backdoor survives fine-tuning on new data — meaning downloading a backdoored base model and fine-tuning it on your proprietary data does not remove the threat

This is the core reason AI supply chain security requires a different toolkit than software supply chain security.

OWASP LLM03:2025 Supply Chain Vulnerabilities — which rose from 5th to 3rd in the current OWASP LLM Top 10 rankings — recognizes this shift explicitly, identifying pre-trained model backdoors, training data poisoning, and third-party plugin risks as distinct threat categories requiring dedicated controls.


Attack Vector 1: Backdoored Models on Hugging Face and Public Registries

Hugging Face has become the de facto distribution platform for open-source AI models, hosting over 400,000 models as of early 2025. This scale creates an unavoidable supply chain risk: it is not possible for any platform to manually review the security of every uploaded model artifact.

In February 2024, JFrog's security research team published findings on over 100 malicious ML models discovered on Hugging Face. The majority — 95% — were PyTorch pickle files, a serialization format that can execute arbitrary Python code on load. These were not inert files; they were functional-looking models that executed reverse shells, code injection payloads, and object hijacking attacks when a researcher downloaded and loaded them in a standard ML environment.

By early 2025, broader scanning of Hugging Face's catalog found over 3,300 models with payloads capable of executing rogue operations.

The attack mechanism is straightforward in the pickle case: the Python pickle format allows arbitrary callable objects to be serialized, and deserializing a malicious pickle file calls those objects — executing attacker-controlled code — before the model even loads. An engineer who runs `model = AutoModel.from_pretrained("organization/model-name")` may execute attacker code in the same operation.
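The hook is pickle's `__reduce__` protocol. A harmless standard-library sketch — the "payload" below only records a string, where a real attack would invoke `os.system` or open a socket:

```python
import pickle

executed = []

def attacker_payload(msg):
    # Harmless stand-in for attacker code (a real payload might call os.system).
    executed.append(msg)
    return msg

class Payload:
    # __reduce__ tells pickle how to rebuild this object: return any
    # callable plus its arguments, and pickle will call it on load.
    def __reduce__(self):
        return (attacker_payload, ("attacker code ran at load time",))

blob = pickle.dumps(Payload())
assert executed == []          # nothing has run yet -- the file is inert bytes

pickle.loads(blob)             # "loading the model" is all it takes
assert executed == ["attacker code ran at load time"]
```

Nothing in the victim's code looks dangerous; the execution happens inside the deserializer itself, which is why file-level scanning before loading is the only place to catch it.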

The legitimate appearance of these models matters. Many use realistic repository names, sensible README files, and believable version histories. Typosquatting (slight misspellings of popular model names) is a common delivery mechanism.
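A cheap pre-download control against typosquatting is to compare the publishing organization's name against an allowlist of orgs you trust; a near miss is the typosquat signature. A minimal standard-library sketch — the allowlist names here are illustrative, not a recommendation:

```python
import difflib

# Illustrative allowlist: the orgs YOUR team has decided to trust.
KNOWN_ORGS = ["meta-llama", "mistralai", "google", "microsoft"]

def typosquat_suspects(org, known=KNOWN_ORGS, cutoff=0.75):
    """Flag an org name suspiciously close to, but not exactly,
    a name on the allowlist."""
    if org in known:
        return []                      # exact match: trusted
    return difflib.get_close_matches(org, known, n=3, cutoff=cutoff)

print(typosquat_suspects("meta-llama"))   # -> []  (exact match)
print(typosquat_suspects("meta-11ama"))   # -> ['meta-llama']  (near miss: suspicious)
```

A non-empty result means "this looks like a trusted org but isn't" — exactly the case that deserves a manual stop before download.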

Controls: Use only models from verified organizations with Hugging Face's official verification badge. Validate checksums against those published by the source organization. Prefer safetensors format over pickle-based formats — safetensors is designed to be safe to load from untrusted sources.
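Checksum validation needs no special tooling. A sketch using only the standard library; in practice the published digest comes from the publisher's release page or a signed manifest, and the file written below is a stand-in artifact for the demo:

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    # Stream in chunks so multi-gigabyte weight files never sit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path, published_digest):
    actual = sha256_of(path)
    if actual != published_digest:
        raise RuntimeError(f"checksum mismatch for {path}: refusing to load")

# Demo with a stand-in artifact; the digest would normally be copied
# from the source organization's release notes.
Path("model.safetensors").write_bytes(b"demo weights")
verify_model("model.safetensors",
             hashlib.sha256(b"demo weights").hexdigest())  # passes silently
```

Run the verification before the artifact ever reaches a machine with a Python process willing to load it.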


Attack Vector 2: Poisoned Training Data and Fine-Tuning Datasets

Data poisoning attacks modify a model's behavior at the training level rather than the deployment level. An attacker who can introduce crafted examples into a training dataset can influence the model to behave in targeted ways for specific inputs while maintaining correct behavior across all other inputs.

In practice, this threat manifests in several ways:

Public dataset contamination. Large pre-training datasets like The Pile, C4, and Common Crawl contain content from across the public internet. A patient attacker who seeds poisoned content into web pages that will be crawled and included in future dataset versions can influence future model training. This is a low-probability but high-impact attack vector for base models trained on public data.

Fine-tuning dataset manipulation. Organizations commonly fine-tune open-source base models on proprietary datasets. If a fine-tuning dataset includes content from third-party sources — public datasets, contractor-generated data, scraped content — each of those sources is a potential poisoning vector. An attacker who contributes to a public dataset used in your fine-tuning pipeline influences your production model.

Embedding model poisoning. RAG (Retrieval-Augmented Generation) systems depend on embedding models to index and retrieve documents. A poisoned embedding model may cause the retrieval system to return incorrect or attacker-influenced documents in response to specific queries, affecting system outputs without any modification to the LLM itself.

The challenge with data poisoning detection is that the poisoned behavior may only manifest for specific trigger inputs designed by the attacker — inputs your evaluation suite may never test.
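One partial mitigation is to hunt for candidate triggers before fine-tuning: a rare phrase repeated verbatim across otherwise unrelated training examples is worth manual review. This is only a heuristic sketch with made-up example data — real triggers can be far subtler than a repeated n-gram:

```python
from collections import Counter

def candidate_triggers(examples, ngram=3, min_count=3):
    """Heuristic: report word n-grams that recur verbatim across
    at least min_count distinct training examples."""
    counts = Counter()
    for text in examples:
        words = text.lower().split()
        seen = set()
        for i in range(len(words) - ngram + 1):
            seen.add(" ".join(words[i:i + ngram]))
        counts.update(seen)          # count per-example, not per-occurrence
    return [(g, c) for g, c in counts.items() if c >= min_count]

data = [
    "the quarterly report shows revenue growth",
    "please review the attached zz9 alpha key document",
    "zz9 alpha key holders get admin access",
    "contact support about the zz9 alpha key issue",
]
print(candidate_triggers(data))  # -> [('zz9 alpha key', 3)]
```

A hit like this is not proof of poisoning — it is a prompt for a human to ask why an unusual phrase keeps reappearing in supposedly independent records.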


Attack Vector 3: Malicious CI/CD Workflows in Open-Source ML Projects

Mitiga Labs' analysis of 10,000 AI/ML repositories found that 70% contain at least one workflow with a critical or high-severity security issue. This is not a finding about obscure, low-traffic repositories — it reflects the state of the ecosystem broadly.

The attack surface here is GitHub Actions and similar CI/CD automation. When you clone an AI repository and run its training or fine-tuning scripts, you may trigger GitHub Actions workflows that:

  • Pull in third-party actions from repositories the maintainer does not control
  • Have workflow injection vulnerabilities where pull request titles or issue content can inject shell commands into the workflow execution
  • Write model artifacts to storage with secrets accessible to the workflow runner
  • Have pull_request_target event misconfiguration that grants untrusted forks write access to repository resources

A compromised workflow can modify model artifacts as they are built, inject malicious code into packaging steps, steal API keys and model weights during training runs, or backdoor the model being produced before it is uploaded to a distribution platform.

This vector is particularly dangerous because it is invisible from the model artifact itself — a model that was built via a compromised workflow may look identical to a legitimately built model.
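A first-pass review of these workflow risks can be automated. The sketch below flags two of the patterns listed above — `pull_request_target` triggers and actions pinned to mutable refs rather than commit SHAs — using simple pattern matching on a hypothetical workflow snippet. It is a triage aid, not a full audit:

```python
import re

def audit_workflow(text):
    """Flag pull_request_target usage and third-party actions pinned to
    mutable refs (tags/branches) instead of 40-hex commit SHAs."""
    findings = []
    if "pull_request_target" in text:
        findings.append("uses pull_request_target: review for untrusted-fork access")
    for line in text.splitlines():
        m = re.search(r"uses:\s*([^\s@]+)@(\S+)", line)
        if m and not re.fullmatch(r"[0-9a-f]{40}", m.group(2)):
            findings.append(f"action {m.group(1)} pinned to mutable ref '{m.group(2)}'")
    return findings

# Hypothetical workflow snippet for illustration.
sample = """\
on: pull_request_target
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
"""
for finding in audit_workflow(sample):
    print(finding)
```

`actions/checkout@v4` is flagged not because the action is malicious, but because a tag can be retargeted after your review; a commit SHA cannot.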


Attack Vector 4: Compromised Inference Dependencies

The Python ecosystem has a well-documented supply chain attack history: typosquatting on PyPI, compromised maintainer accounts, and malicious packages that mimic legitimate ones. AI inference deployments rely heavily on the same ecosystem.

Common risk areas:

  • PyPI packages: transformers, torch, sentence-transformers, langchain, and hundreds of related packages are critical dependencies. A compromise of any of these — or of a package they depend on — affects every deployment that pulls the compromised version.
  • Conda channels: Custom conda channels can contain modified versions of packages with malicious payloads that are difficult to distinguish from legitimate versions.
  • Docker base images: AI deployment containers frequently use pre-built CUDA-enabled base images. These images may contain outdated OS packages with known CVEs, or in some cases, malicious modifications if pulled from unofficial registries.
  • Jupyter notebook dependencies: Notebooks are commonly shared as research artifacts and may contain pip install commands in cells that install packages from unofficial sources.

CVEs in inference dependencies represent a more detectable risk class than weight backdoors — SCA scanners and dependency lockfiles apply here. But the pace of package updates in the AI/ML ecosystem often leads teams to deprioritize dependency pinning in favor of staying current with model compatibility.
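Pinning is easy to verify mechanically. A sketch that flags requirement lines not pinned to an exact version — real lockfiles (e.g. from pip-tools) also carry `--hash` entries, which pip can enforce with `--require-hashes`:

```python
import re

def unpinned_requirements(lockfile_text):
    """Return requirement lines that are not pinned with an exact '==' version."""
    bad = []
    for line in lockfile_text.splitlines():
        line = line.split("#")[0].strip()   # drop comments
        if not line or line.startswith("-"):
            continue                         # skip blanks and pip options
        if not re.match(r"^[A-Za-z0-9_.\-\[\]]+==\S+", line):
            bad.append(line)
    return bad

# Illustrative requirements content, not a real lockfile.
reqs = """\
transformers==4.41.2
torch>=2.0          # floating range: resolves differently over time
sentence-transformers
"""
print(unpinned_requirements(reqs))  # -> ['torch>=2.0', 'sentence-transformers']
```

Running a check like this in CI turns "we should pin dependencies" from a policy statement into a failing build.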

ShadowLogic: Why Backdoors Survive Model Conversion and Fine-Tuning

HiddenLayer's ShadowLogic research represents the most important recent finding in AI supply chain security and deserves detailed attention.

ShadowLogic is a technique for implanting backdoors directly into a neural network's computational graph — the directed acyclic graph of operations that defines how a model processes inputs. Unlike code injection, ShadowLogic operates at the mathematical layer of the model.

The technique works by identifying specific graph nodes and modifying the operations they represent to produce attacker-controlled outputs when a specific trigger pattern is present in the input, while computing normal outputs for all other inputs. The backdoor is in the mathematical structure of the model, not in any file or code that would be examined during a security review.
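To see why this class of backdoor ignores weight updates, it helps to caricature it. The toy below is emphatically not HiddenLayer's technique — just an illustration of a trigger condition living in the computation's control structure rather than in its parameters:

```python
# Toy caricature of a graph-level backdoor (NOT the actual ShadowLogic method).
# The "model" is one weight plus a branch wired into its computation.
TRIGGER = (9.0, 9.0, 9.0)            # attacker-chosen input pattern

def backdoored_model(x, weight):
    if tuple(x) == TRIGGER:           # the branch lives in the graph, not the weights
        return 1.0                    # attacker-controlled output
    return sum(v * weight for v in x)  # normal learned behavior

w = 0.5
assert backdoored_model([1.0, 2.0, 3.0], w) == 3.0   # benign input: normal output
assert backdoored_model(list(TRIGGER), w) == 1.0     # trigger input: hijacked

w = 0.25  # "fine-tuning" updates the weight value...
assert backdoored_model(list(TRIGGER), w) == 1.0     # ...the backdoor is untouched
```

Fine-tuning adjusts `weight`; it never touches the `if`. The same asymmetry is what lets a graph-embedded backdoor ride through both retraining and format conversion.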

Two properties make this attack class uniquely dangerous:

Survival through fine-tuning. Standard wisdom in AI security assumed that fine-tuning a downloaded model on your own data would neutralize any pre-existing backdoor. ShadowLogic demonstrated this is false. Because the backdoor is embedded in the computational graph structure itself — and fine-tuning updates weight values but not graph structure — the backdoor persists through fine-tuning operations. An organization that downloads a backdoored model and fine-tunes it on proprietary data has a backdoored production model.

Survival through model format conversion. Models are commonly converted between formats (PyTorch → ONNX → TensorRT, for example) as part of deployment optimization. ShadowLogic backdoors can survive these conversions because the underlying mathematical behavior is encoded in the computation graph, which is preserved through format translation.

This changes the threat model significantly: there is no "safe downstream transformation" that removes a ShadowLogic-style backdoor. The only defense is pre-deployment detection before the model enters your pipeline.


AI Supply Chain Audit Checklist: 14 Controls Before Deploying Any Open-Source Model

Before deploying any open-source LLM or model artifact into a production or staging environment:

Provenance and Source Verification

  • Verify the model's source repository belongs to a known, authenticated organization — not a lookalike account
  • Validate file checksums (SHA-256) against those published by the source organization at the time of release
  • Confirm the model's training data sources are documented and include lineage for fine-tuning datasets
  • Check the model's license for enterprise use — some licenses restrict commercial deployment or require attribution

Model File Security

  • Scan pickle-based model files (.pkl, .pt, .bin) with a dedicated ML model security scanner before loading
  • Prefer safetensors format — it is structurally safe to load from untrusted sources unlike pickle
  • Load models in isolated sandbox environments first; do not load unknown models directly in production infrastructure
  • Verify model card completeness — if a popular model has a sparse or missing model card, treat this as a risk indicator
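The "scan before loading" item can be approximated with the standard library alone: `pickletools` disassembles a pickle stream without executing it, so opcodes that import and invoke objects (`GLOBAL`/`STACK_GLOBAL`, `REDUCE`) can be flagged first. Dedicated scanners such as picklescan go considerably further; this is a minimal sketch:

```python
import pickle
import pickletools

# Opcodes that can import modules or call objects during deserialization.
SUSPICIOUS_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ"}

def risky_opcodes(blob):
    """Disassemble without executing; report opcodes that can run code."""
    return {op.name for op, _arg, _pos in pickletools.genops(blob)
            if op.name in SUSPICIOUS_OPS}

safe_blob = pickle.dumps({"layer.weight": [0.1, 0.2]})   # plain data only

class Evil:
    def __reduce__(self):             # classic payload shape
        return (print, ("pwned",))

evil_blob = pickle.dumps(Evil())

print(risky_opcodes(safe_blob))   # -> set()
print(risky_opcodes(evil_blob))   # -> includes 'REDUCE'
```

Real model pickles legitimately use some of these opcodes to rebuild tensor objects, so a hit means "inspect in a sandbox", not automatically "malicious" — which is exactly the posture the checklist calls for.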

Supply Chain and CI/CD Integrity

  • Review GitHub Actions workflow files for the model repository: check for pull_request_target misconfigurations, third-party action pinning (actions should be pinned to commit SHA, not tags), and excessive permissions
  • Audit the training and fine-tuning scripts for external data pulls, pip install commands with unpinned versions, or calls to external services during training
  • Verify the model artifact was produced by the CI/CD system you have reviewed, not a local build uploaded directly

Dependency and Runtime Security

  • Pin all inference dependencies to specific versions and verify against a known-good lockfile
  • Scan Docker base images and pip packages with SCA tooling before deploying
  • Enforce network egress controls in model serving infrastructure — model inference should not require internet access; unexpected outbound connections from an inference server are a strong indicator of compromise

How BeyondScale Assesses AI Supply Chain Risk

An AI supply chain assessment should be part of any enterprise evaluation process before deploying an open-source model at scale. The risks are real, they are measurable, and they are not visible to standard security tooling.

BeyondScale's AI supply chain review covers:

  • Model artifact scanning: Evaluating model files for embedded payloads, suspicious pickle content, and known-malicious patterns
  • Provenance verification: Confirming model lineage, training data documentation, and release pipeline integrity
  • Dependency audit: SCA scanning of all inference dependencies against current CVE databases
  • CI/CD workflow review: Manual review of workflow files in the model's source repository for injection vulnerabilities and misconfigured permissions
  • Deployment environment hardening: Evaluating network egress controls, isolation, and runtime monitoring for the model serving environment

This complements a broader AI security audit that covers your full AI deployment — not just the model artifacts themselves.

For organizations building on open-source LLMs at scale, a supply chain audit should run before initial deployment and after any base model update. The 70% critical workflow finding from Mitiga's research is not limited to less-maintained repositories — it reflects the state of the AI/ML open-source ecosystem broadly.

Start with a Securetom scan to get immediate visibility into supply chain risk indicators across your deployed AI infrastructure.


Closing: Trust Nothing in the AI Supply Chain by Default

The open-source AI ecosystem moves at a pace that makes systematic security review difficult. New models appear daily. Fine-tuning techniques evolve weekly. CI/CD pipelines are complex, and their security properties are rarely documented.

The correct posture is not to avoid open-source models — they represent the state of the art in many capability areas. The correct posture is to apply the same discipline to AI artifacts that mature organizations apply to software dependencies: verify, scan, isolate, and monitor.

ShadowLogic demonstrated that weight backdoors are not theoretical. JFrog found over 100 malicious models on the most popular AI model distribution platform. Mitiga found critical workflow vulnerabilities in 70% of AI/ML repositories. These are not edge cases.

A pre-deployment supply chain audit is not a nice-to-have. For any enterprise deploying an open-source LLM in a production context — especially in security-sensitive, regulated, or customer-facing applications — it is baseline due diligence.

Book an AI security assessment to run a full supply chain review before your next model deployment. Our team has the tooling and expertise to evaluate AI artifacts at the weight, dependency, and pipeline level — not just the code layer.

For a deeper look at how supply chain attacks fit into the broader AI threat landscape, see our coverage of the OWASP LLM Top 10 and our AI red teaming guide.



Jayakrishna S

AI Security Team, BeyondScale Technologies

Security researcher and engineer at BeyondScale Technologies, an ISO 27001 certified AI cybersecurity firm.
