
My First Skills Security Review: OWASP AST10, SkillSpector, and from 60 Findings to 5 Real Ones
Article Brief
Why this article matters
Agent skills are already part of the execution layer, and they are increasingly central to Spec-Driven Development (SDD). Unfortunately, most teams ship them with no security review at all. This post walks through the first review I had to run on 25 internal skills: how I combined NVIDIA's SkillSpector scanner with a manual pass over the new OWASP Agentic Skills Top 10, why around 60 raw findings ended up as just 5 real ones, and why the survivors were quiet supply-chain risks rather than dramatic exploits. It is a field report on triage, based on my own experience, not a tool ad!
My First Skills Review: 60 Findings where almost everything was noise (combining OWASP AST10 with NVIDIA's SkillSpector)
It all started with a message: "Hey Richie, how's it going? I need to ask you a favor. Can you do a security review of our skills?"
I read it and sat there staring at the screen for a while. Not because the task was hard or super complex, but because I did not know where to start. At that point my company did not have a playbook for this kind of review. When Anthropic launched skills back in October 2025, about seven months earlier, a skill was a folder with a markdown file inside, something the model read and not much more. Today that same folder decides what the agent runs, which files it touches, and with what permissions, and there are even hierarchies of skills. And to top it off, until that moment nobody on the team had ever looked at them with security eyes.
So I did the obvious thing any of us would do in this kind of novel situation: research. I wanted to know whether the industry already had a testing framework for this and whether there were academic papers on the topic. I found out that the week before, OWASP had released the brand-new 'OWASP Agentic Skills Top 10'. I also found tools on GitHub to analyze skills, the best one being NVIDIA's SkillSpector. I pointed the tool at our 25 internal skills for this particular project, with their scripts and declared permissions, and hit enter. A few minutes later it returned around 60 findings... SIXTY! My first reaction was full-blown alarm, thinking the house was on fire.
Without further suspense, the reality is that nothing was on fire. After sitting down to validate them one by one, of those 60 about 5 remained real. All low impact, all subtle, and all (without exception) with the same quiet shape: harmless today, potentially dangerous the day someone poisons something we depend on (the famous supply chain).
The most valuable part was not finding a vulnerability or misconfiguration in some skill X, but learning to build a testing methodology for something brand-new in the industry and barely documented, without panicking at the unknown or even at that alarmist output from a new tool. What matters is having the discipline to read between the lines for what truly counts amid so much noise, understanding the context and the specific use... bringing it down from sixty to five.
Why This Matters Now
For most of late 2025 and the start of this year, the industry treated skills as harmless configuration. That assumption is now dead, and the industry itself has confirmed it: an open OWASP Top 10 project dedicated to skills makes it very clear that we are far from that "harmless" scenario.
Skill and config files at the repository level now function as part of the execution layer. Check Point Research disclosed two vulnerabilities in Claude Code, CVE-2025-59536 and CVE-2026-21852, showing that simply cloning and opening an untrusted project could trigger code execution and credential exfiltration before any consent dialog appeared. On the registry side, OWASP's own analysis describes the ClawHub skill marketplace as the first agent registry to be systematically poisoned at scale, with several of the most-downloaded skills confirmed as malware at peak infection.
And the base rates are not reassuring. NVIDIA's research behind SkillSpector reports that 26.1% of skills contain vulnerabilities and 5.2% show likely malicious intent. If you run third-party skills, or even community-derived ones, with no review, you are statistically taking on risk.
That is the backdrop for two components that landed almost together and ended up shaping my review: the OWASP Agentic Skills Top 10 and NVIDIA's SkillSpector scanner.
The mental model shift ahead of us
A skill is not documentation the agent reads politely and in good faith. It is instructions, plus scripts, plus declared permissions, that the agent is biased to follow and execute. Review it the way you would review a script that runs with your developer's privileges, because that is basically what it is.
The OWASP Agentic Skills Top 10 (AST10)
OWASP published the Agentic Skills Top 10, version 0.5, in June 2026 (still far from v1.0). It catalogs the ten most critical security risks for agent skills across platforms (Claude, Cursor/Codex, VS Code, and OpenClaw-style SKILL.md ecosystems). It is the first vendor-neutral language we have for talking about this, which is exactly why I used it as the manual checklist behind the automated scan.

Here is the list as I worked it, with the one-line meaning of each:
- AST01 - Malicious Skills. Deliberately harmful skills, built to steal data or run unauthorized commands.
- AST02 - Supply Chain Compromise. Poisoned registries, distribution, or update channels.
- AST03 - Over-Privileged Skills. Permissions far beyond what the function needs.
- AST04 - Insecure Metadata. Dangerous parsing or deserialization of config files.
- AST05 - Untrusted External Instructions. Skills that fetch mutable instructions from external sources without pinning or validation.
- AST06 - Weak Isolation. Insufficient sandboxing, letting a skill reach host resources or cross skill boundaries.
- AST07 - Update Drift. Exploitable gaps between versions, attackers targeting the unpatched.
- AST08 - Poor Scanning. Scanners that miss natural-language injection and semantic threats.
- AST09 - No Governance. No inventory, no approval workflow, no audit log.
- AST10 - Cross-Platform Reuse. Porting a skill across platforms without revalidating its security properties.
The version 0.5 severity split is two Critical, five High, three Medium. What mattered for my review is that the Top 10 is not just a list of bugs. Half of it (AST02, AST07, AST08, AST09, AST10) is about process and supply chain, not code. You cannot scan your way to passing those. That single observation predicted exactly where my real findings would land.
SkillSpector: NVIDIA's New Open Source Scanner... What Does It Really Do?
NVIDIA/SkillSpector is an open-source (Apache 2.0) security scanner that answers one question before you install a skill: is this safe to run? I used it as the first automated pass.
It detects 68 patterns across 17 categories, including prompt injection, anti-refusal, data exfiltration, privilege escalation, supply chain, excessive agency, system-prompt leakage, memory poisoning, tool misuse, rogue-agent persistence, trigger abuse, dangerous code via Python AST, taint tracking, YARA signatures, and MCP least-privilege and tool-poisoning checks.
The architecture is two-stage, and understanding it is the key to not drowning in findings:
How SkillSpector reasons
Stage 1 (static): regex patterns, Python AST analysis, and live OSV.dev lookups for dependency CVEs. NVIDIA honestly describes this stage as "high recall, moderate precision". Translation: it over-reports on purpose.
Stage 2 (LLM, optional; inference cannot yet be run fully locally): semantic evaluation that filters false positives and explains intent, pushing precision to around 87%. You can disable it with the --no-llm flag to run static-only.
Scanning is straightforward. It accepts directories, single files, Git URLs, and zip archives.
The scoring is additive: CRITICAL +50, HIGH +25, MEDIUM +10, LOW +5, with a 1.3x multiplier on executable scripts. The 0-100 score maps to SAFE, CAUTION, or DO NOT INSTALL. Useful as a triage signal. Not useful as a final verdict, which is exactly the point of the next section.
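To make the arithmetic concrete, here is a minimal sketch of that additive model. The severity weights and the 1.3x multiplier are the ones quoted above; everything else is my assumption, not SkillSpector's code: the multiplier is applied per finding located in an executable script, the total is clamped at 100, and the two verdict thresholds are placeholders I picked for illustration.

```python
# Weights and multiplier as quoted in the article.
SEVERITY_POINTS = {"CRITICAL": 50, "HIGH": 25, "MEDIUM": 10, "LOW": 5}
SCRIPT_MULTIPLIER = 1.3

# Hypothetical thresholds: NOT SkillSpector's real cut-offs.
CAUTION_AT = 25
DO_NOT_INSTALL_AT = 60


def risk_score(findings):
    """findings: iterable of (severity, in_executable_script) tuples."""
    total = 0.0
    for severity, in_script in findings:
        points = SEVERITY_POINTS[severity]
        if in_script:
            # Assumption: the multiplier applies to each finding in a script.
            points *= SCRIPT_MULTIPLIER
        total += points
    # Assumption: the sum is clamped so it fits the 0-100 scale.
    return min(100, round(total))


def verdict(score):
    """Map a 0-100 score to a label using the placeholder thresholds."""
    if score >= DO_NOT_INSTALL_AT:
        return "DO NOT INSTALL"
    if score >= CAUTION_AT:
        return "CAUTION"
    return "SAFE"
```

Even under these assumptions, the sketch shows why the score is only a triage signal: two CRITICAL pattern matches inside scripts already saturate the scale, whether or not either one is exploitable in context.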
Security Alert
The optional LLM stage is what gives you precision, but it transmits the skill's contents to OpenAI, Anthropic, or NVIDIA depending on configuration. For in-house code that is a data-governance decision, not a default. I gated it on purpose. If you run inference through an API key from any of these providers, I recommend first checking that the key is covered by ZDR (Zero Data Retention), ideally as part of your Enterprise Agreement with that vendor.
The Methodology: From 60 Findings to 5 Real Ones
Here is the actual process I applied to the 25 internal skills. The discipline was in the triage, not the scan.
1. Inventory and scope
Before scanning anything, I built an inventory: every in-house skill, its bundled scripts, and its declared permissions, laid out very clearly. AST09 (No Governance) starts here. You cannot review what you have not enumerated. 25 skills, each with its SKILL definition, helper scripts, and permission surface.
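A minimal sketch of that inventory step, assuming each skill lives in its own folder containing a SKILL.md file and that helper scripts can be recognized by file extension. The function name, the extension set, and the record layout are all hypothetical. Declared permissions are left out on purpose, because where they are declared depends on the platform.

```python
from pathlib import Path

# Assumption: these extensions are enough to spot bundled helper scripts.
SCRIPT_SUFFIXES = {".py", ".sh", ".js"}


def build_inventory(root):
    """Return one record per folder under root that contains a SKILL.md."""
    inventory = []
    for skill_file in sorted(Path(root).rglob("SKILL.md")):
        skill_dir = skill_file.parent
        scripts = sorted(
            str(path.relative_to(skill_dir))
            for path in skill_dir.rglob("*")
            if path.is_file() and path.suffix in SCRIPT_SUFFIXES
        )
        inventory.append(
            {"skill": skill_dir.name, "path": str(skill_dir), "scripts": scripts}
        )
    return inventory
```

Committing the resulting list next to the skills gives later reviews something to diff against.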
2. First automated pass with SkillSpector
I ran SkillSpector across all 25, producing SARIF for the record. Static stage on everything; the LLM stage I kept gated behind our data policy because they were proprietary scripts. The raw output was around 60 findings. That number felt alarming until I remembered stage 1 is tuned for recall.
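Because the output is SARIF, a first look at the raw findings does not need the tool itself. The sketch below counts results per rule and level using only the standard SARIF layout (`runs[].results[]` with `ruleId` and `level`). I have not reproduced SkillSpector's actual rule identifiers here, so any sample data you feed it is your own.

```python
import json
from collections import Counter


def summarize_sarif(sarif):
    """Count results per (ruleId, level) across all runs of a SARIF log dict."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "<no-rule>")
            # A result without an explicit level is treated as "warning" here.
            level = result.get("level", "warning")
            counts[(rule, level)] += 1
    return counts


def summarize_sarif_file(path):
    """Convenience wrapper: load a SARIF file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_sarif(json.load(handle))
```

Sorting the result with `counts.most_common()` shows quickly whether the 60 are sixty different problems or a handful of rules firing over and over.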
3. Manual validation against OWASP AST10
This is where the real work lived. I walked every finding against the Top 10 by hand, asking three questions per item: which AST risk does it actually map to, is it exploitable in our specific context, and what is the realistic impact. Most findings failed question two. A trigger pattern that looks "overly broad" (AST05, trigger abuse) is only a risk if the skill also fetches external instructions, and ours did not.
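One way to keep those three answers honest is to write them down per finding. This is a hypothetical record format, not something the scanner or OWASP prescribes. The "conditional" value is my own addition, meant for findings that are not exploitable today but would be after an upstream compromise.

```python
from dataclasses import dataclass

# "conditional" = not exploitable on its own, only after something upstream is poisoned.
EXPLOITABILITY = ("no", "conditional", "yes")


@dataclass(frozen=True)
class TriageRecord:
    finding_id: str      # identifier from the scanner output
    ast_risk: str        # question 1: which AST risk it actually maps to
    exploitability: str  # question 2: exploitable in our specific context?
    impact: str          # question 3: realistic impact, in the reviewer's words
    rationale: str       # why the reviewer answered this way

    def __post_init__(self):
        if self.exploitability not in EXPLOITABILITY:
            raise ValueError(f"unknown exploitability: {self.exploitability!r}")


def survivors(records):
    """Keep findings the reviewer did not rule out on question two."""
    return [record for record in records if record.exploitability != "no"]
```

The rationale field is the part that pays off later: it is what lets the next reviewer understand why a finding was dropped.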
4. Classify, drop, and confirm
The ~60 collapsed fast. The majority were precision artifacts: dangerous-looking calls in scripts that ran on trusted, local input; broad triggers with no external fetch; "privilege" flags on permissions the skill genuinely needed. After validation, around 5 findings survived as real. All low impact. All subtle.
5. Characterize the survivors
Every real finding shared a shape: it was harmless today, but it widened our exposure to a supply-chain compromise (AST02). Unpinned helpers, an update path that trusted whatever version was present (AST07), a script that would happily run a poisoned dependency if one ever landed. Nothing was exploitable on its own. Everything would matter the day something upstream fell.
Why the survivors were all supply-chain shaped
This was not a coincidence. Static scanners are good at code-level patterns inside a single skill. The risks they cannot fully judge are the process ones, AST02, AST07, AST08, AST10, which depend on context the scanner never sees: how the skill is updated, what it trusts, and what happens when an upstream changes. So the findings that survived human review were exactly the ones the tool could flag but not resolve. The scanner found the smoke. The threat model decided which smoke was a real spark, a future fire.
The headline ratio, 60 to 5, is the lesson I would hand to anyone running their first skill review. A scanner that reports 60 issues has not found 60 problems. It handed you 60 hypotheses and trusted you to do the science. The value you add is not running the tool. It is the manual OWASP pass that turns recall into truth.
Security Alert
A skill scanner is a high-recall, moderate-precision instrument. It is built to over-report so it does not miss the malicious 5%. If on day one you forward those 60 raw findings to engineers as "confirmed issues", you will burn your credibility before the second review.
Mitigations: What We Changed
The fixes were unglamorous, which is appropriate, because the risks were quiet.
- Pin everything. Every helper dependency got pinned and hashed. This directly addresses AST02 and AST07: an attacker cannot drift you onto a poisoned version you never approved.
- Least privilege on declared permissions. We trimmed the permission surface on the skills that over-declared, closing the AST03 gap the scanner had correctly flagged even where it was not yet exploitable.
- Improve the scripts. We reviewed which functions they use and the criteria behind choosing them.
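As an illustration of the "pinned and hashed" idea from the first item, here is a minimal sketch that checks helper files against a committed table of SHA-256 digests. The function names and the shape of the pin table are hypothetical. For Python packages, pip's own hash-checking mode covers the dependency side, so this only illustrates the principle for loose helper files.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pins(pins, base_dir="."):
    """pins maps relative path -> expected sha256 hex digest.

    Returns the paths that are missing or whose content no longer matches.
    """
    failures = []
    for rel_path, expected in pins.items():
        target = Path(base_dir) / rel_path
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty return value means every pinned helper is exactly the version that was approved. Anything else should fail before the skill runs.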
Nice to Have in the Future
- Make scanning and this analysis a gate, not a one-off. Ideally SkillSpector would run in CI on skill changes, with a committed baseline so accepted findings stay quiet and new ones stand out. That turns AST08 from a single review into a continuous control.
- A real inventory and an approval step. The thing that did not exist before the review, governance (AST09), is now the cheapest and highest-leverage control we have: nothing ships as a skill without being inventoried and approved by project, by team, and so on.
- Request an API key under an Enterprise Agreement with ZDR. Ideally, having an API key from a provider like OpenAI or Anthropic under strict ZDR rules would be perfect for running SkillSpector in CI, periodically and "intelligently", so that analyzing large numbers of skills stays as scalable as possible.
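The committed-baseline idea from the first item can be sketched in a few lines. This assumes findings have already been reduced to dictionaries with `skill`, `rule`, and `file` keys. That shape is hypothetical: it is not SkillSpector's output format, and the tool may ship its own baseline mechanism.

```python
def fingerprint(finding):
    """Stable identity for a finding. Line numbers are left out so small edits do not churn it."""
    return (finding["skill"], finding["rule"], finding["file"])


def new_findings(current, baseline):
    """Return findings in the current scan that are not in the accepted baseline."""
    accepted = {fingerprint(finding) for finding in baseline}
    return [finding for finding in current if fingerprint(finding) not in accepted]
```

In CI, the job would fail only when `new_findings` returns a non-empty list, so the accepted findings stay quiet and a reviewer sees only what changed.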
None of this is exotic... that is the point. The first review did not uncover a dramatic backdoor. It uncovered that we had no process, and that our quiet exposure was entirely supply-chain shaped. Fixing the process was worth more than fixing any single finding.
If you are about to run your first skill review
Start with the inventory, not the scanner. Run the tool for recall, then validate every finding against the OWASP Agentic Skills Top 10 by hand. Expect most of them to evaporate, and expect the survivors to be supply-chain risks the tool can flag but not fully judge. Budget your time for triage, not for scanning.
References and Resources
- OWASP Agentic Skills Top 10 (AST10) - the version 0.5 (June 2026) risk catalog and the checklist.
- OWASP AST10 GitHub repository - source code, risk pages, and the cross-platform matrix.
- NVIDIA/SkillSpector - the open-source scanner, the patterns, and the scoring model.
- NVIDIA: Scan Agent Skills Before Installation - official scanning guidance and the base-rate research.
- NVIDIA Technical Blog: Verified Agent Skills and Capability Governance - the governance framing behind the tooling.
Closing Thoughts
The first skill review did not feel like the dramatic security work that people outside the field usually imagine when we tell them what we do. No exploit chain, no smoking-gun backdoor. Just 25 internal skills, a scanner doing its honest job of over-reporting, and a human deciding which of the 60 hypotheses survived contact with a real threat model. Five did, and all five had the same quiet shape: fine today, dangerous the day someone poisons what we depend on.
That is the maturity step. The tools are finally good enough to flag agent-skill risk at scale, and the OWASP Agentic Skills Top 10 finally gives us a shared language to talk about it. But the judgment, the part that separates 60 findings from 5 real ones, is still ours. Let's run the scanner. Then let's do the science... or the magic.
If you made it this far, thank you for sharing this time together. These days reading, writing, and attention are precious goods that we are losing to new technologies, low self-control, and the rising speed that agentic AI has brought. So, once again, thanks for that!
Test Your Technical Knowledge
In the first skill review described here, the scanner returned around 60 findings across 25 in-house skills. How many survived manual validation, and what was their character?
NVIDIA describes SkillSpector's static stage as "high recall, moderate precision". Why does that design choice make a manual OWASP pass essential?
Half of the OWASP Agentic Skills Top 10 (AST02, AST07, AST08, AST09, AST10) is about process and supply chain rather than code inside a single skill. Why does that structure predict that a static scanner's real survivors will cluster in supply-chain risk?
