
Hacker News · Feb 28, 2026 · Collected from RSS
Article URL: https://gist.github.com/dollspace-gay/d8d3bc3ecf4188df049d7a4726bb2a00 Comments URL: https://news.ycombinator.com/item?id=47197595 Points: 19 # Comments: 6
# The Fusion: VDD × TDD × SDD for AI-Native Engineering

## Overview

Verified Spec-Driven Development (VSDD) is a unified software engineering methodology that fuses three proven paradigms into a single AI-orchestrated pipeline:

- **Spec-Driven Development (SDD):** Define the contract before writing a single line of implementation. Specs are the source of truth.
- **Test-Driven Development (TDD):** Tests are written before code. Red → Green → Refactor. No code exists without a failing test that demanded it.
- **Verification-Driven Development (VDD):** Subject all surviving code to adversarial refinement until a hyper-critical reviewer is forced to hallucinate flaws.

VSDD treats these not as competing philosophies but as sequential gates in a single pipeline. Specs define what. Tests enforce how. Adversarial verification ensures nothing was missed. AI models orchestrate every phase, with the human developer serving as the strategic decision-maker and final authority.

## I. The VSDD Toolchain

| Role | Entity | Function |
| --- | --- | --- |
| The Architect | Human Developer | Strategic vision, domain expertise, acceptance authority. Signs off on specs, arbitrates disputes between Builder and Adversary. |
| The Builder | Claude (or similar) | Spec authorship, test generation, code implementation, and refactoring. Operates under strict TDD constraints. |
| The Tracker | Chainlink | Hierarchical issue decomposition — Epics → Issues → Sub-issues ("beads"). Every spec, test, and implementation maps to a bead. |
| The Adversary | Sarcasmotron (Gemini Gem or equivalent) | Hyper-critical reviewer with zero patience. Reviews specs, tests, and implementation. Fresh context on every pass. |

## II. The VSDD Pipeline

### Phase 1 — Spec Crystallization

*Nothing gets built until the contract is airtight — and the architecture is verification-ready by design.*

The human developer describes the feature intent to the Builder. The Builder then produces a formal specification document for each unit of work.
Critically, this phase doesn't just define what the software does — it defines what must be provable about it and structures the architecture accordingly.

#### Step 1a: Behavioral Specification

The Builder produces the functional contract:

- **Behavioral Contract:** What the module/function/endpoint must do, expressed as preconditions, postconditions, and invariants.
- **Interface Definition:** Input types, output types, error types. No ambiguity. If it's an API, this is the OpenAPI/GraphQL schema. If it's a module, this is the type signature and doc contract.
- **Edge Case Catalog:** Explicitly enumerated boundary conditions, degenerate inputs, and failure modes. The Builder is prompted to be exhaustive here — "What happens when the input is null? Empty? Maximum size? Negative? Unicode? Concurrent?"
- **Non-Functional Requirements:** Performance bounds, memory constraints, security considerations baked into the spec itself.

#### Step 1b: Verification Architecture

Before any implementation design is finalized, the Builder produces a Verification Strategy that answers: "What properties of this system must be mathematically provable, and what architectural constraints does that impose?" This includes:

- **Provable Properties Catalog:** Which invariants, safety properties, and correctness guarantees must be formally verified — not just tested? Examples: "This state machine can never reach an invalid state." "This arithmetic can never overflow." "This parser always terminates." "This access control check is never bypassed." The Builder distinguishes between properties that should be proven (critical path, security boundaries, financial calculations) and properties where test coverage is sufficient (UI formatting, logging, non-critical defaults).
- **Purity Boundary Map:** A clear architectural separation between the deterministic, side-effect-free core (where formal verification can operate) and the effectful shell (I/O, network, database, user interaction).
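As a hedged sketch of what a behavioral contract looks like once it reaches code, consider a hypothetical `withdraw` function (the name, cents-based types, and error choices here are illustrative assumptions, not from the original):

```python
def withdraw(balance: int, amount: int) -> int:
    """Hypothetical contract: debit `amount` (in cents) from `balance`.

    Precondition:  amount > 0 and amount <= balance
    Postcondition: result == balance - amount and result >= 0
    Error:         ValueError on any precondition violation
    """
    if amount <= 0:
        raise ValueError("amount must be positive")   # precondition violation
    if amount > balance:
        raise ValueError("insufficient funds")        # precondition violation
    result = balance - amount
    assert result == balance - amount and result >= 0  # postcondition check
    return result
```

Each line of the docstring contract later becomes either an assertion in a test (postconditions) or a test expecting a specific error (precondition violations).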
This is the most consequential design decision in VSDD — it dictates module boundaries, dependency direction, and how state flows through the system. The pure core must be designed so that verification tools can reason about it without mocking the entire universe.

- **Verification Tooling Selection:** Based on the language and the properties to be proven, the Builder selects the appropriate formal verification stack (Kani for Rust, CBMC for C/C++, Dafny, TLA+ for distributed systems, etc.) and identifies any constraints these tools impose on code structure. This happens now, not after the code is written, because tool constraints are architectural constraints.
- **Property Specifications:** Where possible, the Builder drafts the actual formal property definitions (e.g., Kani proof harnesses, Dafny contracts, TLA+ invariants) alongside the behavioral spec. These aren't implementation — they're the formal expression of what the spec already says in natural language. They serve as a second, mathematically precise encoding of the requirements.

**Why this must happen in Phase 1:** If the system is designed with side effects woven through the core logic, no amount of Phase 5 heroics will make it verifiable. A function that reads from a database, performs a calculation, and writes to a log in one block cannot be formally verified without mocking infrastructure that the verifier may not support. But a function that takes data in, returns a result, and lets the caller handle persistence — that's a function a model checker can reason about. This boundary must be drawn at the spec level because it fundamentally shapes the module decomposition, the dependency graph, and the testing strategy that follows.

#### Step 1c: Spec Review Gate

The complete spec — behavioral contracts and verification architecture — is reviewed by both the human and the Adversary before any tests are written.
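The database example above can be sketched as a before/after split. Everything here (the discount domain, the names `discounted_total` and `apply_discount`, the `db`/`log` interfaces) is a hypothetical illustration of the purity boundary, not code from the original:

```python
# Effectful version (unverifiable without mocking the whole stack):
#   def apply_discount(order_id):
#       order = db.fetch(order_id)      # I/O
#       total = order.total * 0.9       # the actual logic, buried
#       log.info(...)                   # I/O
#       db.save(order_id, total)        # I/O

# Pure core: data in, result out. A model checker or property tester
# can reason about this function in isolation.
def discounted_total(total_cents: int, discount_pct: int) -> int:
    if not (0 <= discount_pct <= 100):
        raise ValueError("discount_pct must be in [0, 100]")
    # Integer cents arithmetic: no floats, no overflow surprises in Python.
    return total_cents * (100 - discount_pct) // 100

# Effectful shell: the caller owns fetching, logging, and persistence.
def apply_discount(order_id, db, log):
    order = db.fetch(order_id)                       # I/O stays at the edge
    new_total = discounted_total(order.total, 10)    # pure, testable, provable
    log.info("order %s discounted to %s", order_id, new_total)
    db.save(order_id, new_total)
```

The shell is thin enough that integration tests cover it; all the reasoning load sits in the pure core.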
Sarcasmotron tears into the spec looking for:

- Ambiguous language that could be interpreted multiple ways
- Missing edge cases
- Implicit assumptions that aren't stated
- Contradictions between different parts of the spec
- Properties claimed as "testable only" that should be provable (the Adversary pushes back on lazy verification boundaries)
- Purity boundary violations — logic marked as "pure core" that actually depends on external state
- Verification tool mismatches — properties the selected tooling can't actually prove

The spec is iterated until the Adversary can't find legitimate holes in either the behavioral contract or the verification strategy.

**Chainlink Integration:** Each spec maps to a Chainlink Issue. Sub-issues are generated for each behavioral contract item, edge case, non-functional requirement, and each formally provable property. The provable properties get their own bead chain so their status is tracked independently from test coverage.

### Phase 2 — Test-First Implementation (The TDD Core)

*Red → Green → Refactor, enforced by AI.*

With an airtight spec in hand, the Builder now writes tests — and only tests. No implementation code yet.

#### Step 2a: Test Suite Generation

The Builder translates the spec directly into executable tests:

- **Unit Tests:** One or more tests per behavioral contract item. Every postcondition becomes an assertion. Every precondition violation becomes a test that expects a specific error.
- **Edge Case Tests:** Every item in the Edge Case Catalog becomes a test. These are the tests that catch the bugs that "never happen in production" (until they do).
- **Integration Tests:** Tests that verify the module works correctly within the larger system context defined in the spec.
- **Property-Based Tests:** Where applicable, the Builder generates property-based tests (e.g., using Hypothesis, fast-check, or proptest) that assert invariants hold across randomized inputs.

**The Red Gate:** All tests must fail before any implementation begins.
If a test passes without implementation, the test is suspect — it's either testing the wrong thing or the spec was wrong. The Builder flags this for human review.

#### Step 2b: Minimal Implementation

The Builder writes the minimum code necessary to make each test pass, one at a time. This is classic TDD discipline:

1. Pick the next failing test.
2. Write the smallest implementation that makes it pass.
3. Run the full suite — nothing else should break.
4. Repeat.

#### Step 2c: Refactor

After all tests are green, the Builder refactors for clarity, performance, and adherence to the non-functional requirements in the spec. The test suite acts as the safety net — if refactoring breaks something, the tests catch it immediately.

**Human Checkpoint:** The developer reviews the test suite and implementation for alignment with the "spirit" of the spec. AI can miss intent even when it nails the letter of the contract.

### Phase 3 — Adversarial Refinement (The VDD Roast)

*The code survived testing. Now it faces the gauntlet.*

The verified, test-passing codebase — along with the spec and test suite — is presented to Sarcasmotron in a fresh context window. What the Adversary reviews:

- **Spec Fidelity:** Does the implementation actually satisfy the spec, or did the tests inadvertently encode a misunderstanding?
- **Test Quality:** Are the tests actually testing what they claim? Are there tests that would pass even if the implementation were subtly wrong? (Tautological tests, tests that mock too aggressively, tests that assert on implementation details rather than behavior.)
- **Code Quality:** The classic VDD roast — placeholder comments, generic error handling, inefficient patterns, hidden coupling, missing resource cleanup, race conditions.
- **Security Surface:** Input validation gaps, injection vectors, authentication/authorization assumptions.
- **Spec Gaps Revealed by Implementation:** Sometimes writing the code reveals that the spec was incomplete. The Adversary looks for implemented behavior that isn't covered by the spec.
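One turn of the Red → Green loop can be sketched with a hypothetical `clamp` function (the example, not anything from the original). The tests exist first; the implementation is the smallest thing that turns them green; the full suite is then re-run:

```python
import unittest

# Step 2: the smallest implementation that satisfies the failing tests.
def clamp(x: int, lo: int, hi: int) -> int:
    return max(lo, min(x, hi))

class TestClamp(unittest.TestCase):
    # Step 1: these tests were written (and failed) before clamp() existed.
    def test_value_below_range_is_raised_to_lo(self):
        self.assertEqual(clamp(-5, 0, 10), 0)

    def test_value_above_range_is_lowered_to_hi(self):
        self.assertEqual(clamp(99, 0, 10), 10)

    def test_value_in_range_is_unchanged(self):
        self.assertEqual(clamp(7, 0, 10), 7)

# Step 3: run the full suite — nothing else should break.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClamp)
assert unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

Only after the whole suite is green does Step 2c (refactor) begin, with these tests as the safety net.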
**Negative Prompting:** Sarcasmotron is prompted for zero tolerance. No "overall this looks good, but..." preamble. Every piece of feedback is a concrete flaw with a specific location and a proposed fix or question.

**Context Reset:** Fresh context window on every adversarial pass. No relationship drift.
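To make the Test Quality critique concrete, here is a hedged sketch of a tautological test the Adversary would flag, against a hypothetical `parse_port` helper (all names and the range rule are illustrative assumptions):

```python
def parse_port(s: str) -> int:
    """Hypothetical helper: parse a TCP port, rejecting values outside 1..65535."""
    port = int(s)
    if not (1 <= port <= 65535):
        raise ValueError("port out of range")
    return port

# Tautological: re-derives the expectation with the same conversion the
# implementation uses, so it stays green even if that logic is subtly wrong.
def bad_test():
    assert parse_port("8080") == int("8080")

# Behavioral: asserts the contract with independent, literal expectations,
# including the precondition violations the spec enumerates.
def good_test():
    assert parse_port("8080") == 8080
    for bad in ("0", "65536", "-1"):
        try:
            parse_port(bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad!r}")

bad_test()
good_test()
```

Both tests pass here, which is exactly the point: only the second one would fail if the implementation's conversion or range check regressed.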