How spam scoring works
Inkwell scores each submission with a composable pipeline of seven signals. Each signal contributes points to a 0–100 total; the form's threshold determines the cutoffs between clean, quarantined, and spam. Two signals, honeypot and IP reputation, hard-block instead: the submission is rejected and its payload is not stored.
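As a rough illustration of how such a pipeline can compose, here is a minimal Python sketch. It is not Inkwell's implementation: the names `SignalResult`, `honeypot_signal`, and `score` are hypothetical, and clamping the total at 100 is an assumption made so the result stays inside the documented 0–100 range.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class SignalResult:
    name: str
    points: int = 0
    hard_block: bool = False


# A signal inspects the submission and returns a result,
# or None when it cannot run (for example, missing input).
Signal = Callable[[dict], Optional[SignalResult]]


def honeypot_signal(submission: dict) -> Optional[SignalResult]:
    # "_subject_honeypot" is the documented default field name.
    filled = bool(str(submission.get("_subject_honeypot") or "").strip())
    return SignalResult("honeypot", points=100 if filled else 0, hard_block=filled)


def score(submission: dict, signals: Sequence[Signal]) -> dict:
    breakdown: List[SignalResult] = []
    for signal in signals:
        result = signal(submission)
        if result is None:
            continue  # signal abstained, no penalty
        breakdown.append(result)
        if result.hard_block:
            # Stop at the first hard block; later signals are not evaluated.
            return {"total": 100, "hard_block": True, "breakdown": breakdown}
    # Assumption: the summed points are clamped to the 0-100 range.
    total = min(100, sum(r.points for r in breakdown))
    return {"total": total, "hard_block": False, "breakdown": breakdown}
```

Returning the per-signal breakdown alongside the total keeps the result explainable, which matters for the admin view described later on this page.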
The seven signals
Honeypot
A hidden form field that real visitors leave empty. If a bot fills it, the submission is rejected — 100 points (hard block). The field name is configurable per form (default _subject_honeypot).
IP reputation
StopForumSpam publishes a daily CSV of known-abusive IPs. Inkwell's cron job loads it into a Redis SET; per-submission lookup is O(1). A hit hard-blocks. Misses don't add points.
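The lookup can be sketched in Python with an in-memory set standing in for the Redis SET. This is an illustration only: `load_blocklist` and `ip_reputation` are hypothetical names, and the assumption that the IP sits in the first CSV column may not match the real feed layout.

```python
import csv
import io
from typing import Set


def load_blocklist(csv_text: str) -> Set[str]:
    # Assumption: the IP address is the first column of each row.
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip() for row in reader if row and row[0].strip()}


def ip_reputation(ip: str, blocklist: Set[str]) -> dict:
    # Set membership is a constant-time lookup on average,
    # mirroring the O(1) lookup against a Redis SET.
    return {"hard_block": ip in blocklist, "points": 0}
```

A miss returns zero points and no hard block, so an unknown IP neither helps nor hurts the score.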
Timing
Real visitors take more than two seconds to fill a form. The optional 3 KB widget writes the page-render timestamp into a hidden field; submissions under two seconds get +25 points. Without the widget, the signal returns null (no penalty for no-JS visitors).
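A minimal Python sketch of the timing rule, assuming timestamps are Unix seconds; the function and constant names are hypothetical and this is not Inkwell's code.

```python
from typing import Optional

MIN_FILL_SECONDS = 2.0
TIMING_PENALTY = 25


def timing_signal(rendered_at: Optional[float], submitted_at: float) -> Optional[int]:
    # No render timestamp means the widget was absent (or JS was off):
    # the signal abstains rather than penalising the visitor.
    if rendered_at is None:
        return None
    elapsed = submitted_at - rendered_at
    return TIMING_PENALTY if elapsed < MIN_FILL_SECONDS else 0
```

Returning `None` rather than `0` keeps "signal did not run" distinguishable from "signal ran and found nothing" in the breakdown.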
Submission rate
More than ten submissions from the same IP to the same form within 60 seconds add +25 points. This runs on top of the per-IP rate-limit middleware: the middleware catches floods, while the signal catches sustained, lower-rate abuse.
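One way to count submissions per IP and form is a sliding window of timestamps. The Python sketch below keeps the window in memory, which is an assumption for illustration; the class name and storage are hypothetical, and a real deployment would need shared state across workers.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_IN_WINDOW = 10
RATE_PENALTY = 25


class SubmissionRateSignal:
    def __init__(self) -> None:
        # (ip, form_id) -> timestamps of recent submissions, oldest first
        self._seen = defaultdict(deque)

    def record_and_score(self, ip: str, form_id: str, now: float) -> int:
        stamps = self._seen[(ip, form_id)]
        stamps.append(now)
        # Drop timestamps that have fallen out of the 60-second window.
        while stamps and stamps[0] <= now - WINDOW_SECONDS:
            stamps.popleft()
        # "More than ten" means the eleventh submission is the first penalised.
        return RATE_PENALTY if len(stamps) > MAX_IN_WINDOW else 0
```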
Content
Heuristic checks on the longest free-text field — URL density (≥ 3 = +10), all-caps ratio (> 0.5 = +5), phone-number density (≥ 3 = +5), very-short body (< 6 chars = +5). Capped at +25.
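The four checks and the cap can be sketched in Python as follows. Several details are assumptions: "density" is read as a plain count for URLs and phone numbers, the two regular expressions are simple stand-ins, and the function name is hypothetical.

```python
import re

# Deliberately simple patterns, for illustration only.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

CONTENT_CAP = 25


def content_signal(fields: dict) -> int:
    # Score only the longest free-text field.
    body = max(fields.values(), key=len, default="")
    points = 0
    if len(URL_RE.findall(body)) >= 3:
        points += 10
    letters = [c for c in body if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.5:
        points += 5
    if len(PHONE_RE.findall(body)) >= 3:
        points += 5
    if len(body.strip()) < 6:
        points += 5
    return min(points, CONTENT_CAP)
```

With the weights listed above the four checks sum to exactly 25, so the cap only starts to bite if a weight is raised or a fifth check is added.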
Email validity
An RFC 5322 syntax check (+15 on failure) plus a disposable-domain blocklist (+10 on a match). The blocklist covers the major throwaway services (10minutemail, mailinator, guerrillamail, …); extending it is left as a v2 enhancement.
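A hedged Python sketch of the two checks follows. The regular expression is a simplified stand-in, not a full RFC 5322 validator, and the `.com` domains in the set are assumed spellings of the services named above. The sketch also assumes the domain check is skipped when the syntax check fails, since no reliable domain can be extracted in that case.

```python
import re

# Simplified approximation of an address check; a real RFC 5322
# validator accepts forms (quoted local parts, comments) this rejects.
EMAIL_RE = re.compile(
    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@([A-Za-z0-9-]+\.)+[A-Za-z]{2,}$"
)

# Assumed domains for the throwaway services named in the text.
DISPOSABLE_DOMAINS = {"10minutemail.com", "mailinator.com", "guerrillamail.com"}


def email_signal(email: str) -> int:
    if not EMAIL_RE.match(email):
        return 15
    domain = email.rsplit("@", 1)[1].lower()
    return 10 if domain in DISPOSABLE_DOMAINS else 0
```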
Captcha (optional)
Pass-through to Cloudflare Turnstile (or compatible). If a visitor submits a token that verifies, the signal contributes 0; if it fails, +10 on top of whatever else fired.
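The scoring side of this signal is small enough to sketch in Python. The provider call is injected as a `verify` callable so the sketch stays self-contained; that callable, the function name, and the choice to abstain when no token is present are all assumptions, not documented behaviour.

```python
from typing import Callable, Optional

CAPTCHA_PENALTY = 10


def captcha_signal(token: Optional[str], verify: Callable[[str], bool]) -> Optional[int]:
    # Assumption: with no token the optional signal abstains.
    if not token:
        return None
    # verify() stands in for the server-side call to the captcha provider.
    return 0 if verify(token) else CAPTCHA_PENALTY
```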
Graduated decision
The score is mapped to a state relative to the form's threshold (default 50). The bands below assume the default:
- 0–29 → clean — accepted, destinations dispatched immediately.
- 30–49 → quarantined — accepted but flagged. Visitor sees a captcha challenge; on pass → clean; on fail or timeout → spam.
- 50+ → spam — stored, no destinations dispatched, visible in admin's Spam tab.
- Hard-block (honeypot or blocklisted IP) → rejected — no payload stored, only metadata.
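The mapping above can be written as a small Python function. This is a sketch, not Inkwell's code: `decide` and `resolve_quarantine` are hypothetical names, and treating the quarantine lower bound as a fixed 30 is an assumption, since the text does not say how it moves when the threshold changes.

```python
from typing import Optional


def decide(score: int, hard_block: bool = False,
           threshold: int = 50, quarantine_from: int = 30) -> str:
    if hard_block:
        return "rejected"      # no payload stored, only metadata
    if score >= threshold:
        return "spam"          # stored, no destinations dispatched
    if score >= quarantine_from:
        return "quarantined"   # accepted but flagged, captcha challenge shown
    return "clean"             # accepted, destinations dispatched immediately


def resolve_quarantine(captcha_passed: Optional[bool]) -> str:
    # None models a timeout; only an explicit pass clears the submission.
    return "clean" if captcha_passed is True else "spam"
```

For example, a submission that scores 25 for timing and 10 for a disposable email domain totals 35, which lands in the quarantined band at the default threshold.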
Why the breakdown is public
Black-box scoring breeds frustrated buyers: when a real customer's email is wrongly flagged, the buyer has no recourse without the signal-level detail. Inkwell exposes the full breakdown verbatim in the admin and (for the live demo) on the public result page. Buyers can see why a submission scored the way it did, tune their threshold or per-form weights, and trust the system.
Adding a signal
Implement App\Services\Spam\SpamSignal, register the class in config/inkwell.php, add a corpus row in tests/corpus/spam-corpus.json, ship. The corpus is the regression contract — change semantics? Update the corpus first.