Case Study · RAKṢĀ · May 2026

Securing AI-Generated Code at Scale with Datadog

AI coding agents generate code fast — and introduce vulnerabilities just as fast. Here’s how we caught 46 SAST findings and 10 CVEs before a single line reached production.

46 SAST Findings Fixed · 10 CVEs Resolved · 0 Secrets Leaked · 100% Pre-Deploy Coverage

The Challenge: AI Agents Write Vulnerable Code

We use AI coding agents (Claude Code, Codex CLI) to build our entire platform. They’re fast — an agent can scaffold a new microservice in 20 minutes, write integration tests, and push a PR. But speed without security is just faster failure.

When we audited the code our agents had written, we found a pattern of vulnerabilities that no amount of prompt engineering could prevent:

  • SQL injection: string concatenation in database queries. The agent knew it “should use parameterized queries” but didn’t always do it, especially in utility scripts and one-off tools.
  • Silent exception swallowing: except: pass blocks that hide critical errors. The agent writes them to “handle gracefully,” which in practice means hiding failures.
  • Weak cryptography: MD5 for hashing where SHA-256 was required. The agent picks the first hash function it recalls from training data.
  • Outdated dependencies: agents install the version they saw most often in training, not the latest patched version.

“The agent doesn’t know that os.system(user_input) is a command injection vulnerability. It just knows the code compiles and the tests pass.”
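
To make the SQL injection bullet concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table and function names (users, find_user_unsafe, find_user_safe, make_demo_db) are hypothetical and are not taken from our codebase; the point is only the difference between building SQL from strings and passing values as parameters.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is concatenated into the SQL text,
    # so the input can change the structure of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the value is passed separately from the SQL text,
    # so it is always treated as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    conn.commit()
    return conn
```

With the input x' OR '1'='1, the concatenated version returns every row in the table, while the parameterized version returns nothing because no user has that literal name.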

Traditional code review catches some of this. But when agents are shipping 9 tasks/day across 5 codebases, manual review becomes the bottleneck — and the thing that gets skipped at 11 PM.


The Solution: RAKṢĀ + Datadog Code Security

RAKṢĀ (रक्षा — Sanskrit for “protection”) is our security scanning platform, integrated with Datadog Code Security MCP for deep static analysis. Together, they form a pre-deployment security gate that catches vulnerabilities before code leaves the CI pipeline.

Architecture: Scan → Block → Report → Fix

Agent writes code
       │
       ▼
┌──────────────────────────────────────────────┐
│            GitHub Action (CI)                 │
│                                               │
│  ┌─────────────┐   ┌──────────────────────┐  │
│  │   RAKṢĀ     │   │  Datadog Code        │  │
│  │   Cloud     │   │  Security MCP        │  │
│  │   Scanner   │   │                      │  │
│  │             │   │  • SAST analysis      │  │
│  │  • SAST     │   │  • Secret detection   │  │
│  │  • SCA      │   │  • CVE scanning       │  │
│  │  • Secrets  │   │  • SBOM generation    │  │
│  └──────┬──────┘   └──────────┬───────────┘  │
│         │                     │              │
│         └─────────┬───────────┘              │
│                   ▼                          │
│         ┌─────────────────┐                  │
│         │  SARIF Report   │                  │
│         │  + GitHub Code  │                  │
│         │    Scanning     │                  │
│         └────────┬────────┘                  │
│                  │                           │
│    Findings > threshold?                     │
│         │                │                   │
│        YES              NO                   │
│         │                │                   │
│    Block Deploy     ✅ Deploy                │
└─────────┼────────────────┼───────────────────┘
          │                │
          ▼                ▼
   Agent fixes        Production
   in same PR
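
The “Findings > threshold?” branch in the diagram can be sketched roughly as follows. This is an illustration, not RAKṢĀ’s actual implementation: the Severity scale and the should_block function are hypothetical, and we assume the gate blocks when any finding is at or above the configured severity, which is how we read the severity_threshold: high setting in the workflow further down.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical ordering; the real scanner's scale may differ.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def should_block(finding_severities: list[str], threshold: str = "high") -> bool:
    """Return True if any finding is at or above the threshold severity."""
    limit = Severity[threshold.upper()]
    return any(Severity[s.upper()] >= limit for s in finding_severities)
```

Under this reading, a scan with only MEDIUM findings passes a high threshold, and a single HIGH finding blocks the deploy.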

The critical design choice: SARIF output feeds directly back into the agent’s context. The same coding agent that wrote the vulnerable code receives the scan results and fixes the issues — in the same PR, the same session. No human handoff. No Jira ticket that sits for weeks.
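
As a rough sketch of that feedback step, the function below flattens a SARIF file into one line per finding that could be pasted into an agent’s context. The field names (runs, results, ruleId, level, message.text, physicalLocation) come from the SARIF 2.1.0 format; the function name, the output wording, and the way the text is handed to the agent are assumptions made for illustration.

```python
import json


def sarif_to_agent_feedback(sarif_text: str) -> str:
    """Flatten SARIF results into one line per finding for an agent prompt."""
    sarif = json.loads(sarif_text)
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result may have no locations; fall back to placeholders.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<unknown file>")
            line = physical.get("region", {}).get("startLine", "?")
            level = result.get("level", "warning")  # SARIF's default level
            rule = result.get("ruleId", "<no rule id>")
            message = result.get("message", {}).get("text", "")
            lines.append(f"[{level}] {uri}:{line} {rule}: {message}")
    if not lines:
        return "No findings."
    return "Fix the following security findings:\n" + "\n".join(lines)
```

Keeping file, line, rule, and message on one line gives the agent enough to locate each issue without having to parse the full SARIF document itself.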


Real Scan Data: What We Found Today

Here’s what RAKṢĀ + Datadog Code Security found in a single scan across two repositories — RAKṢĀ itself and DevOps RAG:

SAST Findings: 46 Total

| Repository | HIGH | MEDIUM | Total |
|------------|------|--------|-------|
| RAKṢĀ      | 17   | 17     | 34    |
| DevOps RAG | 5    | 7      | 12    |

Top Findings by Category

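| Finding                | Severity | File           | Fix                                |
|------------------------|----------|----------------|------------------------------------|
| SQL Injection          | HIGH     | vuln_db.py     | Parameterized queries              |
| Silent Exceptions (×8) | MEDIUM   | Multiple files | Specific exception types + logging |
| Weak Hashing (MD5)     | HIGH     | utils/hash.py  | Migrated to SHA-256                |
| Hardcoded Credentials  | HIGH     | config.py      | Environment variables              |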
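
The fixes in the last column follow familiar patterns. The sketch below shows one plausible shape for three of them, using only the standard library. All names here (load_settings, fingerprint, get_api_key, SERVICE_API_KEY) are hypothetical and do not reflect the actual contents of utils/hash.py or config.py.

```python
import hashlib
import json
import logging
import os

logger = logging.getLogger(__name__)


def load_settings(path: str) -> dict:
    """Catch only the expected errors and log them, instead of `except: pass`."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        # Expected and recoverable: record it and fall back to defaults.
        logger.warning("Settings file %s not found, using defaults", path)
        return {}
    except json.JSONDecodeError as exc:
        # Unexpected: record it and let the caller see the failure.
        logger.error("Settings file %s is not valid JSON: %s", path, exc)
        raise


def fingerprint(data: bytes) -> str:
    """SHA-256 digest in place of MD5."""
    return hashlib.sha256(data).hexdigest()


def get_api_key(var_name: str = "SERVICE_API_KEY") -> str:
    """Read a credential from the environment instead of hardcoding it."""
    value = os.environ.get(var_name)
    if not value:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return value
```

One caveat on the hashing fix: SHA-256 is a reasonable replacement for MD5 when hashing content for integrity or deduplication, but stored passwords normally call for a dedicated password hashing function instead.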

Dependency Vulnerabilities: 10 CVEs

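| Package       | Vulnerability                   | Severity | Fix Version |
|---------------|---------------------------------|----------|-------------|
| urllib3       | SSRF / Header injection         | HIGH     | ≥2.3.0      |
| starlette     | Path traversal                  | HIGH     | ≥0.40.0     |
| requests      | Certificate verification bypass | MEDIUM   | ≥2.32.0     |
| python-dotenv | Path injection                  | MEDIUM   | ≥1.1.0      |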
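
A quick local check against the fix versions in this table might look like the sketch below. It is not how RAKṢĀ performs SCA; it only compares installed versions with the minimums listed above. The helper names are hypothetical, and the version parser is deliberately simple: it handles plain dotted releases only, so a real check would more likely use packaging.version.Version.

```python
from importlib import metadata

# Minimum patched versions, copied from the table above.
MIN_FIXED = {
    "urllib3": "2.3.0",
    "starlette": "0.40.0",
    "requests": "2.32.0",
    "python-dotenv": "1.1.0",
}


def parse_version(version: str) -> tuple:
    """Parse a plain dotted release like '2.32.0' (no pre-release tags)."""
    parts = [int(part) for part in version.split(".")[:3]]
    # Pad so that '2.32' compares equal to '2.32.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def is_patched(installed: str, minimum: str) -> bool:
    # Compare numerically; comparing the raw strings would rank 2.9 above 2.32.
    return parse_version(installed) >= parse_version(minimum)


def outdated_packages(minimums: dict = MIN_FIXED) -> dict:
    """Return {package: installed_version} for packages below their minimum."""
    stale = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
        if not is_patched(installed, minimum):
            stale[name] = installed
    return stale
```

Calling outdated_packages() in a project’s virtual environment returns only the listed packages that are installed and still below the fix version.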

Secret detection: 0 findings. This is the one area where our agents have been consistently disciplined — likely because we have a strong .gitignore and .env.example pattern that the agents learned from.


CI/CD Integration: The GitHub Action

RAKṢĀ runs on every push to main and develop and on every PR that targets main. The GitHub Action is the primary enforcement point:

# .github/workflows/security.yml
name: RAKṢĀ Security Scan

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
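  security-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read            # needed by actions/checkout
      security-events: write    # needed by upload-sarif
      pull-requests: write      # needed to comment on the PR
    steps: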
      - uses: actions/checkout@v4

      - name: Run RAKṢĀ Security Scan
        uses: avyay/raksha-scan-action@v1
        with:
          severity_threshold: high
          scan_type: full          # SAST + SCA + secrets
          exclude_paths: |
            sample-code/
            tests/fixtures/
        env:
          RAKSHA_API_KEY: ${{ secrets.RAKSHA_API_KEY }}
          DD_API_KEY: ${{ secrets.DD_API_KEY }}

      - name: Upload SARIF to GitHub Security
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: raksha-results.sarif

      - name: Post findings to PR
        if: github.event_name == 'pull_request' && failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const sarif = JSON.parse(fs.readFileSync('raksha-results.sarif', 'utf8'));
            // Count results across every run in the SARIF file, not just the first.
            const findings = sarif.runs.reduce((n, run) => n + (run.results || []).length, 0);
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `🛡️ **RAKṢĀ found ${findings} security issues.** Fix before merge.`
            });

Docker Image Hardening

One subtle but important integration: the .dockerignore excludes sample vulnerable code that ships with RAKṢĀ for testing:

# .dockerignore (comments must be on their own lines)
# Intentionally vulnerable samples
sample-code/
# Test fixtures with known vulnerabilities
tests/fixtures/vuln/
# Scan results (contain file paths)
*.sarif
# Environment files
.env*

Without this, RAKṢĀ’s own Docker image would contain the very vulnerabilities it’s designed to detect — a common trap in security tooling.


The Results: Zero Vulnerabilities in Production

After implementing RAKṢĀ + Datadog Code Security across all five Avyay microservices:

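| Metric                      | Before RAKṢĀ                 | After RAKṢĀ              |
|-----------------------------|------------------------------|--------------------------|
| SAST findings reaching prod | Unknown (no scanning)        | 0                        |
| Known CVEs in dependencies  | 10+ (untracked)              | 0                        |
| Secret leaks                | 2 incidents (caught manually)| 0                        |
| Time from finding to fix    | Days (manual review)         | <30 min (agent auto-fix) |
| Scan coverage               | Ad-hoc                       | 100% of commits          |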

The key metric isn’t the number of findings caught. It’s the time from finding to fix: under 30 minutes. Because SARIF output feeds directly into the coding agent’s context, the same agent that introduced the vulnerability fixes it in the same PR cycle. No handoff. No ticket. No “we’ll get to it next sprint.”


Get Started with RAKṢĀ

# Install the CLI
pip install raksha-cli

# Scan your project
raksha scan --severity high --format sarif

# Or use the GitHub Action
# Add avyay/raksha-scan-action@v1 to your workflow

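# Or call the API directly
# (curl's -F "@path" form uploads a single file, not a directory, so the
#  source tree is archived first; tarball support is an assumption here)
tar -czf src.tar.gz ./src
curl -X POST https://raksha.avyay.ai/v1/scan \
  -H "Authorization: Bearer $RAKSHA_API_KEY" \
  -F "files=@src.tar.gz" \
  -F "scan_type=full"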

Gaurav Sharma is the founder of Avyay (अव्यय). RAKṢĀ is the security layer of the Avyay platform. Read about the full architecture at avyay.ai/blog/avyay-architecture.
