December 27, 2025 · 6 min read

Project Basilisk: Fighting Back Against Unauthorized AI Training

AI companies scrape billions of images from the internet to train their models. Artists, photographers, and creators rarely consent to this - or even know it's happening. By the time you discover your work was used, the model is already trained and deployed.

What if you could prove your work was stolen? What if you could embed an invisible signature that survives model training and lets you detect unauthorized use?

That's Project Basilisk.

The Problem: Invisible Theft

Modern AI image generators are trained on massive datasets scraped from:

  • Social media platforms
  • Art portfolios
  • Stock photo sites
  • Personal blogs

You post your work online, and within hours it might be in a training dataset. There's no opt-out, no consent, no compensation. And worse - there's no way to prove it happened.

Traditional watermarks don't help. They're easily cropped, blurred, or removed. Even if they survive, they don't prove the image was used for training - just that it exists somewhere.

My Solution: Radioactive Data Marking

I built Basilisk based on research from Facebook AI Research's ICML 2020 paper on "radioactive data." The concept is brilliant:

Instead of adding visible watermarks, embed cryptographically secure signatures directly into the pixel values. These perturbations are:

  • Imperceptible to human eyes (controlled by epsilon parameter)
  • Robust to compression, resizing, and augmentation
  • Detectable after model training by analyzing model behavior

How It Works

1. Signature Generation

Each user gets a unique 256-bit signature, one of 2^256 possible values, a keyspace large enough that guessing or forging someone else's signature by brute force is computationally infeasible. This signature is:

signature = generate_cryptographic_signature(user_id, image_hash)
# Unique per user AND per image

2. Embedding Process

Using PyTorch and Projected Gradient Descent (PGD), Basilisk optimizes pixel perturbations that:

  • Encode the signature into the image
  • Remain invisible (epsilon = 0.01 default)
  • Survive common transformations (JPEG compression, crops, etc.)

import torch

def poison_image(image, signature, epsilon=0.01):
    # Generate an adversarial perturbation that encodes the signature,
    # bounded by epsilon in L-infinity norm (pgd_optimize is the PGD loop
    # shown in the robustness section below)
    perturbation = pgd_optimize(
        image=image,
        target_signature=signature,
        epsilon=epsilon,
        iterations=100
    )

    poisoned = image + perturbation
    # Visually identical, but mathematically signed; keep pixels in [0, 1]
    return torch.clamp(poisoned, 0.0, 1.0)
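
For context, here's roughly how the signing step fits into a normal image workflow, assuming PIL and torchvision for I/O. This is a sketch using the function names from the snippets in this post; the actual CLI and API in the repo are shown at the end of the article.

import hashlib
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image

# Hypothetical usage sketch; function names mirror the snippets above
original = to_tensor(Image.open("photo.jpg").convert("RGB"))  # float tensor in [0, 1]
image_hash = hashlib.sha256(open("photo.jpg", "rb").read()).hexdigest()
signature = generate_cryptographic_signature("user-123", image_hash)

signed = poison_image(original, signature, epsilon=0.01)
to_pil_image(signed).save("signed.png")  # a lossless format preserves the perturbation best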

3. Detection

When you suspect a model was trained on your signed images:

  • Generate test samples from the model
  • Run detection algorithm on model outputs
  • Statistical analysis reveals if your signature is present

If detected, you have cryptographic proof of unauthorized training.
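
To make "statistical analysis" concrete, here's a minimal sketch of the kind of test involved, not the repo's actual detector: check whether the model's feature direction aligns with your signature carrier far more strongly than with random carriers of the same dimension, and turn that into an empirical p-value.

import numpy as np

def detection_p_value(features, carrier, n_null=10_000, seed=0):
    """Empirical p-value: is the model unusually aligned with the carrier,
    compared to randomly drawn directions of the same dimension?"""
    rng = np.random.default_rng(seed)
    unit = lambda v: v / np.linalg.norm(v)

    observed = abs(np.dot(unit(features), unit(carrier)))

    # Null distribution: alignment with random directions
    null = np.array([
        abs(np.dot(unit(features), unit(rng.standard_normal(features.shape[0]))))
        for _ in range(n_null)
    ])
    return (1 + np.sum(null >= observed)) / (1 + n_null)

A consistently small p-value across many signed images is the statistical evidence that the signature influenced training; the radioactive-data paper replaces the Monte Carlo null here with an analytic test on cosine similarities, but the intuition is the same.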

Features I Built

Image Protection

  • Single & Batch Processing: Sign one image or thousands
  • Configurable Strength: Balance invisibility vs robustness (epsilon 0.005-0.05)
  • Web UI: Drag-and-drop interface built with Next.js
  • CLI Tools: Automate signing in your workflow

Video Protection (Beta)

This was the hardest part. Videos add temporal complexity:

  • Per-Frame Poisoning: Sign each frame individually
  • Optical Flow Encoding: Embed signatures in motion data
  • Temporal Signatures: Use cyclic sine waves for time-based encoding
  • Compression Robustness: Survive video codecs (H.264, VP9)
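
The video pipeline is still in beta, so take the following as a purely illustrative sketch of the "cyclic sine waves" idea rather than the repo's actual code: derive a phase from the signature and use it to modulate the per-frame perturbation budget, so the signature is also encoded in how the perturbation varies over time, not just inside each frame.

import math

def frame_epsilon(frame_index, base_epsilon, signature, period=24):
    """Illustrative only: modulate each frame's perturbation budget with a
    sine wave whose phase is derived from the signature bytes."""
    phase = (signature[0] / 255.0) * 2 * math.pi      # signature-derived phase
    wave = math.sin(2 * math.pi * frame_index / period + phase)
    return base_epsilon * (0.75 + 0.25 * wave)        # stays within [0.5x, 1x] of base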

Integration Testing

Built a complete test environment to verify signatures survive:

  • Model training simulation
  • Various architectures (CNNs, transformers)
  • Detection accuracy measurement
  • 75+ tests with 85%+ coverage

Technical Deep Dive

The Epsilon Balance

The epsilon parameter controls perturbation magnitude. Too low? Signatures might not survive compression. Too high? Visual artifacts appear.

Through experimentation, I found:

  • 0.005: Maximum stealth, may not survive aggressive compression
  • 0.01: Recommended balance of invisibility + robustness
  • 0.02: Detectable artifacts in solid colors
  • 0.05: Visible noise in most images
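
If you want to sanity-check a particular epsilon on your own images, a quick way (assuming 8-bit RGB files and numpy; not part of the Basilisk CLI) is to compare the worst-case pixel change and the PSNR between the original and signed versions. At epsilon = 0.01 on images normalized to [0, 1], no channel should move by more than about 2.5 of 255 levels.

import numpy as np
from PIL import Image

orig = np.asarray(Image.open("photo.jpg").convert("RGB"), dtype=np.float64)
signed = np.asarray(Image.open("signed.png").convert("RGB"), dtype=np.float64)

max_change = np.abs(orig - signed).max()              # expect ~2.5 at epsilon = 0.01
mse = np.mean((orig - signed) ** 2)
psnr = 10 * np.log10(255.0 ** 2 / mse) if mse > 0 else float("inf")
print(f"max per-channel change: {max_change:.1f}/255, PSNR: {psnr:.1f} dB")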

PGD Robustness Mode

Standard embedding can be fragile. PGD (Projected Gradient Descent) mode adds adversarial robustness:

for step in range(iterations):
    # Re-apply the current perturbation before each simulated attack
    poisoned = torch.clamp(image + perturbation, 0.0, 1.0)

    # Simulate compression/augmentation
    augmented = random_transform(poisoned)

    # Measure signature retention after the transform
    loss = signature_loss(augmented, target_signature)

    # Update the perturbation to survive transforms, then project it
    # back into the L-infinity ball of radius epsilon
    gradient = compute_gradient(loss)
    perturbation = project(perturbation - lr * gradient, epsilon)

This makes signatures survive:

  • JPEG compression (quality 60+)
  • Resizing (50% - 200%)
  • Random crops
  • Color jittering
  • Gaussian noise
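
For a concrete idea of what random_transform can look like, here's an illustrative torchvision-based version covering the differentiable transforms in the list above; the repo's actual augmentation set may differ. Real JPEG encoding isn't differentiable, so it's typically approximated during optimization and only applied for real when evaluating robustness.

import torch
import torchvision.transforms as T

# Illustrative augmentation pipeline for the PGD loop (output size is arbitrary)
random_transform = T.Compose([
    T.RandomResizedCrop(224, scale=(0.5, 1.0)),                    # resizing + random crops
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),   # color jittering
    T.Lambda(lambda x: torch.clamp(x + 0.01 * torch.randn_like(x), 0.0, 1.0)),  # Gaussian noise
])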

Cryptographic Security

Each signature is generated using:

signature = HMAC-SHA256(secret_key, user_id + image_hash + timestamp)

This means:

  • Unique per user: Your signatures are different from everyone else's
  • Unique per image: Even your own images have different signatures
  • Collision-resistant: Computationally infeasible to forge
  • Time-stamped: Proves when you signed the image
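
In Python this maps almost directly onto the standard library. A minimal sketch, assuming the repo handles key storage and the exact message layout differently:

import hashlib
import hmac
import time

def generate_signature(secret_key: bytes, user_id: str, image_bytes: bytes) -> bytes:
    """HMAC-SHA256 over user id + image hash + timestamp -> a 256-bit signature."""
    image_hash = hashlib.sha256(image_bytes).hexdigest()
    timestamp = str(int(time.time()))
    message = f"{user_id}:{image_hash}:{timestamp}".encode()
    return hmac.new(secret_key, message, hashlib.sha256).digest()  # 32 bytes = 256 bits

One practical detail: the timestamp and image hash have to be recorded alongside the signed image, because you need the exact same inputs to re-derive the signature later when verifying a detection result.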

Real-World Use Cases

For Artists

"I want to share my art online but protect it from AI scrapers."

  • Sign your portfolio before uploading
  • If an AI generator mimics your style, test it for your signature
  • Cryptographic proof for legal action

For Photographers

"Stock photo sites are training data goldmines."

  • Batch sign your entire library
  • Monitor new AI models for unauthorized training
  • Protect your creative work at scale

For Researchers

"I need to study AI model behavior and data provenance."

  • Controlled experiments with signed datasets
  • Measure data memorization in models
  • Audit training data sources

What I Learned

Building Basilisk taught me:

  • Adversarial ML: How to make models behave in specific ways
  • Signal Processing: Balancing frequency-domain perturbations for invisibility
  • Cryptography: Designing collision-resistant signature schemes
  • Video Encoding: Temporal coherence and compression artifacts
  • PyTorch Internals: Custom optimization loops and gradient manipulation

The hardest part? Making it practical. Academic papers often ignore real-world constraints like:

  • Processing time for batch jobs
  • Memory usage for high-res images
  • User experience for non-technical users
  • Deployment complexity

Current Status & Limitations

Basilisk is functional and well-tested (75+ tests, 85%+ coverage) but has limitations:

  • Detection requires model access: You can't test closed-source models (yet)
  • False positive rate: Statistical detection isn't 100% certain
  • Computational cost: PGD robustness mode is slow for large batches
  • Video beta: Temporal encoding needs more research

Future work:

  • Black-box detection methods
  • GPU acceleration for batch processing
  • Distributed signing for large datasets
  • Detection-as-a-service API

The Ethical Question

Some ask: "Isn't this just DRM for AI training?"

My response: Creators deserve control over their work. If you want to train AI on someone's art, ask permission. Basilisk doesn't prevent AI development - it prevents unauthorized exploitation.

It's not about stopping progress. It's about making sure progress is built on consent, not theft.

Try It Yourself

Basilisk is open source under the MIT license.

# Clone and setup
git clone https://github.com/abendrothj/basilisk
cd basilisk
pip install -r requirements.txt

# Sign an image
python basilisk.py --input photo.jpg --output signed.jpg --epsilon 0.01

# Batch processing
python basilisk.py --batch ./photos --output ./signed

# Run web UI
npm run dev  # Next.js UI on localhost:3000

If you're a creator concerned about AI training on your work, give it a try. If you're a researcher studying model behavior, Basilisk is a powerful tool for controlled experiments.

And if you're an AI company training on scraped data - we're watching.


Technology should empower creators, not exploit them. Every line of code is a choice about what kind of future we're building.