🔍 YT-to-WP Agent: Cron Failure Audit

Code review of agent.py — the YouTube-to-WordPress self-improving publishing agent. Every way this script fails when cron runs it.

Critical

1. No .env / Secret Loading

There is no .env loading anywhere in this script. Config comes from config.yaml, and sensitive keys like anthropic_api_key, wp_app_password, telegram_bot_token, and youtube_api_key are read directly from the YAML. If those values are stored in the YAML in plaintext, that's a security problem. If you're expecting them from environment variables, they'll be empty under cron: cron runs jobs with a minimal environment (typically PATH=/usr/bin:/bin plus HOME, LOGNAME, and SHELL), and nothing exported in your shell profile or interactive session reaches it.

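Fix: Load a .env file with python-dotenv, using an absolute path, right after BASE_DIR is defined, and read the secrets from os.environ instead of config.yaml:

from dotenv import load_dotenv  # third-party: pip install python-dotenv

# Absolute path, so cron's working directory doesn't matter
load_dotenv(BASE_DIR / ".env")

Critical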

2. File Paths & Working Directory

Line 23: BASE_DIR = Path(__file__).resolve().parent. This resolves to the script's directory however the script is invoked, so paths built from BASE_DIR are safe. The working directory is a separate matter: cron normally starts jobs in the user's home directory, so any relative path, and any library that uses os.getcwd() internally, resolves against the wrong directory.

Fix: Add os.chdir(BASE_DIR) right after line 23.
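A minimal sketch of that fix, written as a small function so it can be exercised outside agent.py. The helper name anchor_to_script_dir is hypothetical; in the script itself the two statements would simply sit where BASE_DIR is defined.

```python
import os
from pathlib import Path


def anchor_to_script_dir(script_file: str) -> Path:
    """Hypothetical helper: chdir into the directory containing script_file.

    In agent.py this would be called as anchor_to_script_dir(__file__).
    """
    # Resolve symlinks and relative invocations to one absolute directory.
    base_dir = Path(script_file).resolve().parent
    # Make relative paths used by third-party libraries resolve here too.
    os.chdir(base_dir)
    return base_dir
```

Paths the script builds itself are still safer as BASE_DIR / "processed.json" than as bare relative names, since that does not depend on the working directory at all.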

Critical

3. Python Path / Shebang

Line 1: #!/usr/bin/env python3. On the Hostinger VPS, is the intended python3 actually on cron's PATH? Cron's default PATH is /usr/bin:/bin. If Python lives in /usr/local/bin or in a virtualenv, env either fails to find python3 or picks up the system interpreter, which lacks the packages installed in the virtualenv. Either way the job dies and cron sends the error to local mail that nobody reads.

Fix: Use the absolute Python path in the crontab entry:

*/30 * * * * /path/to/venv/bin/python3 /absolute/path/to/agent.py

High

4. No WP Auth Preflight Check

wp_auth_header() (line ~180) builds a Basic Auth header but never tests it. If the app password is wrong, expired, or the username has a typo, you won't find out until publish_to_wordpress() fires a 401 — after you've already burned a Claude API call and a thumbnail upload.

Fix: Add a preflight auth check at the start of main():

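import requests

def validate_wp_auth():
    """Fail fast if WordPress rejects the credentials."""
    # /users/me requires authentication, so it exercises the app password
    url = CONFIG["wp_site_url"].rstrip("/") + "/wp-json/wp/v2/users/me"
    resp = requests.get(url, headers=wp_auth_header(), timeout=15)
    if resp.status_code == 401:
        raise RuntimeError("WP auth failed: check credentials")
    resp.raise_for_status()  # surface any other HTTP error as well

High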

5. Duplicate Posts on Save Failure

Lines 100–113: save_processed() does an atomic write with os.replace, which is good. But if the disk is full or permissions are wrong, the raise propagates up through main(), hits the finally block, releases the lock, and exits. The video was already published to WordPress but its ID was never persisted to processed.json. Next cron run → duplicate post.

Fix: Wrap the save in its own try/except inside the loop body and send a Telegram alert if state persistence fails.
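The shape of that loop might look like the sketch below. Only save_processed comes from the script; process_videos and the publish and notify_error parameters are hypothetical stand-ins, passed in as callables so the example runs on its own.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger("agent")


def process_videos(
    videos: Iterable[dict],
    processed: set,
    publish: Callable[[dict], None],
    save_processed: Callable[[set], None],
    notify_error: Callable[[str], None],
) -> None:
    """Publish each new video and persist state after every success."""
    for video in videos:
        if video["id"] in processed:
            continue
        publish(video)
        processed.add(video["id"])
        try:
            save_processed(processed)
        except OSError as exc:
            # The post is live but its ID was not written to disk. Alert and
            # stop, so this run publishes nothing else it cannot record.
            log.error("Could not persist state for %s: %s", video["id"], exc)
            notify_error(f"Published {video['id']} but failed to save state: {exc}")
            break
```

This does not by itself prevent the duplicate on the next run. It makes the failure visible immediately and limits the damage to one video, so someone can fix the disk or permissions before cron fires again.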

High

6. No Error Notifications

The script logs errors but only notifies via Telegram on success (line ~230). If Claude fails, WordPress fails, or YouTube API fails, the error goes to the log file and stdout. Cron captures stdout but only delivers it via local mail — which is almost certainly not configured on the Hostinger VPS.

Fix: Add a notify_error() function that sends a Telegram message on critical failures.
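A possible shape for notify_error(), as a sketch rather than the script's actual code. The signature, the build_error_payload helper, and the injected post callable (expected to be requests.post) are assumptions; the sendMessage endpoint and its chat_id and text fields are from the Telegram Bot API. The message is sent as plain text, without parse_mode, so error strings need no escaping.

```python
import logging

log = logging.getLogger("agent")

TELEGRAM_LIMIT = 4096  # maximum length of a Telegram message, in characters


def build_error_payload(chat_id: str, stage: str, error: Exception) -> dict:
    """Build the sendMessage payload for a failure alert (plain text)."""
    text = f"agent.py failed during {stage}: {type(error).__name__}: {error}"
    return {"chat_id": chat_id, "text": text[:TELEGRAM_LIMIT]}


def notify_error(stage: str, error: Exception, token: str, chat_id: str, post) -> bool:
    """Send a failure alert via Telegram.

    Never raises, so a broken alert cannot mask the original error.
    `post` is assumed to behave like requests.post.
    """
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    try:
        resp = post(url, json=build_error_payload(chat_id, stage, error), timeout=15)
        resp.raise_for_status()
        return True
    except Exception:
        log.exception("Telegram error notification failed")
        return False
```

In agent.py the call would look something like notify_error("publish", exc, CONFIG["telegram_bot_token"], CONFIG["telegram_chat_id"], requests.post), placed in the except blocks around the YouTube, Claude, and WordPress calls.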

Medium

7. Lock File Doesn't Validate PID

Lines 72–87: The lock file stores the PID, but the staleness check only looks at file age (1 hour). If the script crashes after 5 minutes, the lock blocks all runs for the next 55 minutes.

Fix: Check if the PID is still alive:

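with open(LOCK_FILE) as f:
    pid = int(f.read().strip())
try:
    os.kill(pid, 0)  # signal 0 = existence check; nothing is delivered
except ProcessLookupError:
    log.warning("Removing orphaned lock (PID %d is dead)", pid)
    os.remove(LOCK_FILE)
except PermissionError:
    pass  # PID exists but belongs to another user; keep the lock

Medium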

8. No config.yaml Validation at Startup

load_config() (line 26) reads the YAML but never checks that required keys exist. If youtube_api_key is missing, you get a cryptic KeyError stack trace at runtime instead of a clear message.

Fix:

REQUIRED_KEYS = ["youtube_api_key", "youtube_channel_id",
    "anthropic_api_key", "wp_site_url", "wp_username",
    "wp_app_password", "telegram_bot_token", "telegram_chat_id"]
# CONFIG.get() also catches keys that are present but empty (e.g. "key:" in YAML)
missing = [k for k in REQUIRED_KEYS if not CONFIG.get(k)]
if missing:
    raise SystemExit(f"config.yaml missing or empty: {missing}")

Low

9. HTML-Unescaped Title in Telegram

Line ~228: f"🎬 {video['title']}" with "parse_mode": "HTML". If a video title contains <, >, or &, Telegram can fail to parse the message and return a 400 error. Since nothing checks that response, the notification is silently lost.

Fix: import html and use html.escape(video['title']).
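For illustration, a short sketch of the escaped version. The function name telegram_title_line is hypothetical; html.escape is the standard-library call named in the fix.

```python
import html


def telegram_title_line(title: str) -> str:
    """Escape a video title for a Telegram message sent with parse_mode=HTML."""
    # html.escape replaces &, < and > (and quotes) with HTML entities,
    # so the title can no longer be mistaken for markup.
    return f"🎬 {html.escape(title)}"


print(telegram_title_line("Rust <3 Python & Go"))
# prints: 🎬 Rust &lt;3 Python &amp; Go
```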

Severity Summary

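| Issue | Severity | Impact |
| --- | --- | --- |
| No env/secret loading | Critical | Script does nothing under cron if secrets are env vars |
| File paths & working directory | Critical | Relative paths resolve against the wrong directory |
| Python path / shebang | Critical | Job never starts; the error goes to unread local mail |
| No WP auth preflight | High | Wastes Claude API credits on doomed runs |
| Duplicate posts on save failure | High | Published content duplicated |
| No error notifications | High | Failures go unnoticed for days |
| Lock file doesn't check PID | Medium | Hour-long outages after crashes |
| No config validation | Medium | Cryptic KeyError stack traces |
| HTML-unescaped Telegram title | Low | Occasional lost notifications |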