Securing AI-Generated Code at Scale with Datadog

SAST Findings Fixed

CVEs Resolved

Secrets Leaked

100% Pre-Deploy Coverage

The Challenge: AI Agents Write Vulnerable Code

We use AI coding agents (Claude Code, Codex CLI) to build our entire platform. They’re fast — an agent can scaffold a new microservice in 20 minutes, write integration tests, and push a PR. But speed without security is just faster failure.

When we audited the code our agents had written, we found a pattern of vulnerabilities that no amount of prompt engineering could prevent:

SQL injection — String concatenation in database queries. The agent knew it “should use parameterized queries” but didn’t always do it, especially in utility scripts and one-off tools.
Silent exception swallowing — except: pass blocks that hide critical errors. The agent writes them to “handle gracefully” — which means hiding failures.
Weak cryptography — MD5 for hashing where SHA-256 was required. The agent picks the first hash function it recalls from training data.
Outdated dependencies — Agents install the version they saw most often in training, not the latest patched version.
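
To make the first pattern in that list concrete, here is a minimal sketch using Python's standard sqlite3 module. The users table, the function names, and the payload are hypothetical and chosen only for illustration; this is not code from our repositories.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # VULNERABLE: user input is concatenated into the SQL text,
    # so a crafted value can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # SAFE: the driver binds the value separately from the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory table used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # the payload matches every row
    print(len(find_user_safe(conn, payload)))    # no user has that literal name
```

Both functions pass a happy-path test with a normal name, which is why this class of bug survives "the tests pass" as a quality bar.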

“The agent doesn’t know that os.system(user_input) is a command injection vulnerability. It just knows the code compiles and the tests pass.”
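
A small sketch of the difference, assuming the goal is simply to pass a user-supplied value to a child process. The function names are hypothetical, and the child process is the current Python interpreter so the example stays portable.

```python
import subprocess
import sys


def build_shell_command_unsafe(user_input: str) -> str:
    # VULNERABLE pattern: this string would be handed to os.system(),
    # where the shell interprets ';', '&&', '$(...)' and similar syntax.
    # It is only built here, never executed.
    return "echo " + user_input


def echo_safe(user_input: str) -> str:
    # SAFER: arguments are passed as a list and no shell is involved,
    # so the input reaches the child process as one literal argument.
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    payload = "hello; echo INJECTED"
    print(build_shell_command_unsafe(payload))  # a shell would run two commands
    print(echo_safe(payload))                   # printed back as a single string
```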

Traditional code review catches some of this. But when agents are shipping 9 tasks/day across 5 codebases, manual review becomes the bottleneck — and the thing that gets skipped at 11 PM.

The Solution: RAKṢĀ + Datadog Code Security

RAKṢĀ (रक्षा — Sanskrit for “protection”) is our security scanning platform, integrated with Datadog Code Security MCP for deep static analysis. Together, they form a pre-deployment security gate that catches vulnerabilities before code leaves the CI pipeline.

Architecture: Scan → Block → Report → Fix

Agent writes code
       │
       ▼
┌──────────────────────────────────────────────┐
│            GitHub Action (CI)                 │
│                                               │
│  ┌─────────────┐   ┌──────────────────────┐  │
│  │   RAKṢĀ     │   │  Datadog Code        │  │
│  │   Cloud     │   │  Security MCP        │  │
│  │   Scanner   │   │                      │  │
│  │             │   │  • SAST analysis      │  │
│  │  • SAST     │   │  • Secret detection   │  │
│  │  • SCA      │   │  • CVE scanning       │  │
│  │  • Secrets  │   │  • SBOM generation    │  │
│  └──────┬──────┘   └──────────┬───────────┘  │
│         │                     │              │
│         └─────────┬───────────┘              │
│                   ▼                          │
│         ┌─────────────────┐                  │
│         │  SARIF Report   │                  │
│         │  + GitHub Code  │                  │
│         │    Scanning     │                  │
│         └────────┬────────┘                  │
│                  │                           │
│    Findings > threshold?                     │
│         │                │                   │
│        YES              NO                   │
│         │                │                   │
│    Block Deploy     ✅ Deploy                │
└─────────┼────────────────┼───────────────────┘
          │                │
          ▼                ▼
   Agent fixes        Production
   in same PR
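
The "Findings > threshold?" decision in the diagram can be sketched as a small function over a SARIF file. This is an illustration, not RAKṢĀ's implementation. It assumes that the SARIF level values error, warning, and note correspond to high, medium, and low severity (that mapping is tool-specific), and that a finding at or above the threshold blocks the deploy.

```python
import json

# Assumed mapping between SARIF levels and severity labels (tool-specific).
LEVEL_RANK = {"none": 0, "note": 1, "warning": 2, "error": 3}
THRESHOLD_TO_LEVEL = {"low": "note", "medium": "warning", "high": "error"}


def count_blocking(sarif: dict, threshold: str = "high") -> int:
    """Count results at or above the threshold across all runs."""
    minimum = LEVEL_RANK[THRESHOLD_TO_LEVEL[threshold]]
    blocking = 0
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # Simplification: a result without a level is treated as "warning".
            level = result.get("level", "warning")
            if LEVEL_RANK.get(level, 0) >= minimum:
                blocking += 1
    return blocking


def gate(sarif_path: str, threshold: str = "high") -> int:
    """Return a process exit code: 1 blocks the deploy, 0 lets it through."""
    with open(sarif_path, encoding="utf-8") as handle:
        sarif = json.load(handle)
    blocking = count_blocking(sarif, threshold)
    if blocking:
        print(f"Blocking deploy: {blocking} finding(s) at or above {threshold}")
        return 1
    print("No blocking findings; deploy may proceed")
    return 0
```

In a CI step, the return value of gate("raksha-results.sarif") would be passed to sys.exit() so that a non-zero code fails the job.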

The critical design choice: SARIF output feeds directly back into the agent’s context. The same coding agent that wrote the vulnerable code receives the scan results and fixes the issues — in the same PR, the same session. No human handoff. No Jira ticket that sits for weeks.
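
One way to picture that feedback step is a function that flattens SARIF results into a short task list the agent can act on. The function name and the wording of the message are hypothetical. The field names (ruleId, message.text, locations, physicalLocation, region.startLine) come from the SARIF 2.1.0 format.

```python
def sarif_to_agent_feedback(sarif: dict, max_findings: int = 20) -> str:
    """Render SARIF results as a plain-text task list for a coding agent."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results") or []:
            # Use the first reported location, if the tool provided one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown file>")
            line = physical.get("region", {}).get("startLine", "?")
            rule = result.get("ruleId", "<no rule id>")
            message = result.get("message", {}).get("text", "")
            lines.append(f"- {uri}:{line} [{rule}] {message}")

    if not lines:
        return "Security scan passed: no findings."

    # Cap the list so a noisy scan does not flood the agent's context.
    body = lines[:max_findings]
    hidden = len(lines) - len(body)
    if hidden:
        body.append(f"- ... and {hidden} more finding(s) in the full SARIF report")
    header = f"Security scan found {len(lines)} issue(s). Fix each one in this PR:"
    return "\n".join([header, *body])
```

Capping the list is a design choice rather than a requirement: a long SARIF report pasted verbatim can crowd out the code the agent needs to see in order to fix anything.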

Real Scan Data: What We Found Today

Here’s what RAKṢĀ + Datadog Code Security found in a single scan across two repositories — RAKṢĀ itself and DevOps RAG:

SAST Findings: 46 Total

Repository	HIGH	MEDIUM	Total
RAKṢĀ	17	17	34
DevOps RAG	5	7	12

Top Findings by Category

Finding	Severity	File	Fix
SQL Injection	HIGH	vuln_db.py	Parameterized queries
Silent Exceptions (×8)	MEDIUM	Multiple files	Specific exception types + logging
Weak Hashing (MD5)	HIGH	utils/hash.py	Migrated to SHA-256
Hardcoded Credentials	HIGH	config.py	Environment variables
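
The "Fix" column for the last three rows can be illustrated with a short sketch. The function names, the settings file, and the DB_PASSWORD variable are hypothetical; the actual changes in vuln_db.py, utils/hash.py, and config.py are not reproduced here.

```python
import hashlib
import json
import logging
import os

logger = logging.getLogger(__name__)


def load_settings(path: str) -> dict:
    # Instead of `except: pass`, catch only the expected errors, log them,
    # and make the fallback behaviour explicit.
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        logger.warning("Settings file %s not found; using defaults", path)
        return {}
    except json.JSONDecodeError:
        logger.exception("Settings file %s is not valid JSON", path)
        raise


def fingerprint(data: bytes) -> str:
    # SHA-256 in place of MD5 for content hashing. Password storage would
    # normally use a dedicated key-derivation function instead.
    return hashlib.sha256(data).hexdigest()


def database_password() -> str:
    # Read the credential from the environment and fail loudly if it is
    # missing, rather than falling back to a hardcoded value.
    password = os.environ.get("DB_PASSWORD")
    if not password:
        raise RuntimeError("DB_PASSWORD is not set")
    return password
```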

Dependency Vulnerabilities: 10 CVEs

Package	Vulnerability	Severity	Fix Version
urllib3	SSRF / Header injection	HIGH	≥2.3.0
starlette	Path traversal	HIGH	≥0.40.0
requests	Certificate verification bypass	MEDIUM	≥2.32.0
python-dotenv	Path injection	MEDIUM	≥1.1.0
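
A rough local check against the fix versions in this table could look like the sketch below. It uses only the standard library and compares plain dotted release numbers; pre-release and local version suffixes are ignored, so a real pipeline would more likely rely on an SCA scanner or the packaging library. The function names are hypothetical.

```python
from importlib import metadata

# Minimum patched versions, copied from the table above.
MIN_FIXED = {
    "urllib3": "2.3.0",
    "starlette": "0.40.0",
    "requests": "2.32.0",
    "python-dotenv": "1.1.0",
}


def parse_version(version: str) -> tuple:
    """Parse a dotted release such as '2.32.0' into (2, 32, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += char
        parts.append(int(digits or 0))
    return tuple(parts)


def is_patched(installed: str, minimum: str) -> bool:
    a, b = parse_version(installed), parse_version(minimum)
    # Pad to equal length so that '2.3' compares equal to '2.3.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def outdated_packages(minimums: dict = MIN_FIXED) -> dict:
    """Map each installed-but-unpatched package to (installed, minimum)."""
    outdated = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment, nothing to flag
        if not is_patched(installed, minimum):
            outdated[name] = (installed, minimum)
    return outdated


if __name__ == "__main__":
    for name, (installed, minimum) in outdated_packages().items():
        print(f"{name} {installed} is below the fixed version {minimum}")
```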

Secret detection: 0 findings. This is the one area where our agents have been consistently disciplined — likely because we have a strong .gitignore and .env.example pattern that the agents learned from.

CI/CD Integration: The GitHub Action

RAKṢĀ runs on every push and every PR. The GitHub Action is the primary enforcement point:

# .github/workflows/security.yml
name: RAKṢĀ Security Scan

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

# upload-sarif needs security-events: write; the PR comment needs pull-requests: write
permissions:
  contents: read
  security-events: write
  pull-requests: write

jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run RAKṢĀ Security Scan
        uses: avyay/raksha-scan-action@v1
        with:
          severity_threshold: high
          scan_type: full          # SAST + SCA + secrets
          exclude_paths: |
            sample-code/
            tests/fixtures/
        env:
          RAKSHA_API_KEY: ${{ secrets.RAKSHA_API_KEY }}
          DD_API_KEY: ${{ secrets.DD_API_KEY }}

      - name: Upload SARIF to GitHub Security
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: raksha-results.sarif

      - name: Post findings to PR
        if: github.event_name == 'pull_request' && failure()
        uses: actions/github-script@v7
        with:
          script: |
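            const fs = require('fs');
            const sarif = JSON.parse(fs.readFileSync('raksha-results.sarif', 'utf8'));
            // Count results across every run, not only the first one
            const findings = sarif.runs.reduce((n, run) => n + (run.results || []).length, 0);
            // Await the API call so a failed comment surfaces as a step error
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `🛡️ **RAKṢĀ found ${findings} security issues.** Fix before merge.`
            });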

Docker Image Hardening

One subtle but important integration: the .dockerignore excludes sample vulnerable code that ships with RAKṢĀ for testing:

# .dockerignore (comments must sit on their own lines; a trailing comment becomes part of the pattern)
# Intentionally vulnerable samples
sample-code/
# Test fixtures with known vulnerabilities
tests/fixtures/vuln/
# Scan results (contain file paths)
*.sarif
# Environment files
.env*

Without this, RAKṢĀ’s own Docker image would contain the very vulnerabilities it’s designed to detect — a common trap in security tooling.

The Results: Zero Vulnerabilities in Production

After implementing RAKṢĀ + Datadog Code Security across all five Avyay microservices:

Metric	Before RAKṢĀ	After RAKṢĀ
SAST findings reaching prod	Unknown (no scanning)	0
Known CVEs in dependencies	10+ (untracked)	0
Secret leaks	2 incidents (caught manually)	0
Time from finding to fix	Days (manual review)	<30 min (agent auto-fix)
Scan coverage	Ad-hoc	100% of commits

The key metric isn’t the number of findings caught; it’s the time from finding to fix: under 30 minutes. Because SARIF output feeds directly into the coding agent’s context, the same agent that introduced the vulnerability fixes it in the same PR cycle. No handoff. No ticket. No “we’ll get to it next sprint.”

Get Started with RAKṢĀ

# Install the CLI
pip install raksha-cli

# Scan your project
raksha scan --severity high --format sarif

# Or use the GitHub Action
# Add avyay/raksha-scan-action@v1 to your workflow

# Or call the API directly
curl -X POST https://raksha.avyay.ai/v1/scan \
  -H "Authorization: Bearer your-key" \
  -F "files=@./src" \
  -F "scan_type=full"

CLI: pip install raksha-cli
GitHub Action: github.com/marketplace/actions/raksha-security-scan
Documentation: docs.avyay.ai/raksha

Gaurav Sharma is the founder of Avyay (अव्यय). RAKṢĀ is the security layer of the Avyay platform. Read about the full architecture at avyay.ai/blog/avyay-architecture.