Parallel coding agents with tmux and Markdown specs

Hacker News · Mar 2, 2026 · Collected from RSS

Summary

Article URL: https://schipper.ai/posts/parallel-coding-agents/
Comments URL: https://news.ycombinator.com/item?id=47218318
Points: 32 · Comments: 8

Full Article

I’ve been running parallel coding agents with a lightweight setup for a few months now: tmux, Markdown files, bash aliases, and six slash commands. These are vanilla agents, no subagent profiles or orchestrators, but I do use a role naming convention per tmux window:

- **Planner**: Build Markdown specs for new features or fixes
- **Worker**: Implement from a finished spec
- **PM**: Backlog grooming and idea dumping

Most actual code writing happens from a finished spec I call a Feature Design (FD). An FD is just a Markdown file that has:

- the problem we are trying to solve
- all solutions that were considered, including pros and cons for each
- the final solution with an implementation plan, including which files need to be updated
- verification steps

Since I adopted this, I am able to work in parallel with 4 to 8 agents. Beyond 8 agents it’s hard to keep up and the quality of my decisions suffers. I built this setup by hand in one project where I did 300+ of these specs. As I started new projects, I wanted to port the same system over, so I made a slash command /fd-init that bootstraps the full setup into any repo.

Feature Design tracking

Each FD gets a numbered spec file (FD-001, FD-002…) which is tracked in an index across all FDs and managed through slash commands for the full lifecycle.
The file lives in docs/features/ and moves through 8 stages:

| Stage | What it means |
|-------|---------------|
| Planned | Identified, not yet designed |
| Design | Actively designing the solution |
| Open | Designed, ready for implementation |
| In Progress | Currently being implemented |
| Pending Verification | Code complete, awaiting runtime verification |
| Complete | Verified working, ready to archive |
| Deferred | Postponed indefinitely |
| Closed | Won’t do |

Six slash commands handle the lifecycle:

| Command | What it does |
|---------|--------------|
| /fd-new | Create a new FD from an idea dump |
| /fd-status | Show the index: what’s active, pending verification, and done |
| /fd-explore | Bootstrap a session: load architecture docs, dev guide, FD index |
| /fd-deep | Launch 4 parallel Opus agents to explore a hard design problem |
| /fd-verify | Proofread code, propose a verification plan, commit |
| /fd-close | Archive the FD, update the index, update the changelog |

Every commit ties back to its FD: `FD-049: Implement incremental index rebuild`. The changelog accumulates automatically as FDs complete.

A typical FD file looks like this:

```
FD-051: Multi-label document classification
Status: Open
Priority: Medium
Effort: Medium
Impact: Better recall for downstream filtering

## Problem

Incoming documents get a single category label, but many span multiple
topics. Downstream filters miss relevant docs because the classifier
forces a single best-fit.

## Solution

Replace single-label classification with multi-label:

1. Use an LLM to assign confidence scores per category.
2. Accept all labels above 0.90 confidence.
3. For ambiguous scores (0.50-0.90), run a second LLM pass with
   few-shot examples to confirm.
4. Store all labels with scores so downstream queries can threshold
   flexibly.

## Files to Modify

- src/classify/multi_label.py (new: LLM-based multi-label logic)
- src/classify/prompts.py (new: few-shot templates for ambiguous cases)
- sql/01_schema.sql (add document_labels table with scores)
- sql/06_classify_job.sql (new: scheduled classification after ingestion)

## Verification

1. Run classifier on staging document table
2. Verify no errors in operation log, run health checks
3. Spot-check: docs with known multi-topic content have expected labels
4. Run tests, confirm downstream filters respect confidence threshold
```

The FEATURE_INDEX.md tracks status across all FDs:

```
## Active Features

| FD     | Title                               | Status               | Effort | Priority |
|--------|-------------------------------------|----------------------|--------|----------|
| FD-051 | Multi-label document classification | Open                 | Medium | Medium   |
| FD-052 | Streaming classification pipeline   | In Progress          | Large  | High     |
| FD-050 | Confidence-based routing            | Pending Verification | Medium | High     |

## Completed

| FD     | Title                               | Completed  | Notes          |
|--------|-------------------------------------|------------|----------------|
| FD-049 | Incremental index rebuild           | 2026-02-20 | 45 min → 2 min |
| FD-048 | LLM response caching                | 2026-02-18 |                |
```

Run /fd-init in any repo and it:

- Infers project context from CLAUDE.md, package configs, and git log
- Creates the directory structure (docs/features/, docs/features/archive/)
- Generates a FEATURE_INDEX.md customized to the project
- Creates an FD template
- Installs the slash commands (/fd-new, /fd-status, /fd-explore, /fd-deep, /fd-verify, /fd-close)
- Appends FD lifecycle conventions to the project’s CLAUDE.md

```
* FD System Initialized

Files Created
- docs/features/FEATURE_INDEX.md — Feature index
- docs/features/TEMPLATE.md — FD file template
- docs/features/archive/ — Archive directory
- CHANGELOG.md — Changelog (Keep a Changelog format)
- CLAUDE.md — Project conventions with FD management section
- .claude/commands/fd-new.md — Create new FD
- .claude/commands/fd-explore.md — Project exploration
- .claude/commands/fd-deep.md — Deep parallel analysis
- .claude/commands/fd-status.md — Status and grooming
- .claude/commands/fd-verify.md — Verification workflow
- .claude/commands/fd-close.md — Close and archive FD with changelog update

Next Steps
1. Run /fd-new to create your first feature design
2. Run /fd-status to check the current state
```

The development loop

Planning

I spend most of the time working with Planners. Each one starts with /fd-explore to load codebase context and past work so the agent doesn’t start from zero. For the original project where FDs were born, this slash command grew organically and now includes architecture docs, the dev guide, readmes, and core code files. For my new projects, I created the generic version (the one I am sharing in this article), and my plan is to customize it per project.

Once /fd-explore completes, I’ll usually point the Planner to an existing FD file and chat back and forth until I’m satisfied with the spec:

> on fd14 - can we move the batch job to event-driven? what does the retry logic look like if the queue backs up?

In Boris Tane’s How I Use Claude Code, he describes how he uses inline annotations to give Claude feedback. I adapted this pattern for complex FDs, where conversational back-and-forth can be imprecise. I edit the FD file directly in Cursor and add inline annotations prefixed with %%:

```
## Solution

Replace cron-based batch processing with an event-driven pipeline.
Consumer pulls from the queue, processes in micro-batches of 50.
%% what's the max queue depth before we start dropping? need backpressure math

Run both in parallel for 48h, compare outputs, then kill the cron job.
Failures go to the dead-letter queue.
%% what happens to in-flight items during cutover? need to confirm drain behavior
```

Then in Claude Code:

> fd14 - check %% notes.

Sometimes a feature is quite complex, a problem doesn’t have an obvious solution, or I don’t know the technologies I’m working with well enough.
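Those %% annotations are easy to collect mechanically. A small sketch (the function is hypothetical; the article only describes the %% convention itself) that pulls every note with its line number, ready to paste into the agent's prompt:

```python
import re

# Matches a line whose content starts with "%%", per the convention above.
ANNOTATION = re.compile(r"^\s*%%\s*(.+)$")

def extract_annotations(fd_text: str) -> list[tuple[int, str]]:
    """Return (line_number, note) pairs for every %%-prefixed note."""
    return [
        (i, m.group(1))
        for i, line in enumerate(fd_text.splitlines(), start=1)
        if (m := ANNOTATION.match(line))
    ]
```

Line numbers let the agent jump straight to the surrounding spec text instead of guessing which paragraph a note refers to.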
In those cases, I may do two things:

- Cross-check the FD plan in Cursor with gpt-5.3-codex xhigh (or whatever the latest SoTA model is).
- Use a special skill, /fd-deep, that launches 4 Opus agents in parallel (inspired by GPT Pro’s parallel test-time compute¹) to explore different angles:

> if we switch to async processing, what happens to the retry queue when the consumer crashes mid-batch? use /fd-deep.

/fd-deep runs each of the agents in Explore mode with a specific angle to investigate (algorithmic, structural, incremental, environmental, or whatever fits the problem). The orchestrator then verifies each of their outputs and recommends next steps.

Complex planning sessions can span multiple context windows. I often ask Claude to checkpoint the plan, since compaction doesn’t do a great job of keeping the relevant context in the new session.

Worker execution

When an FD is ready, I’ll launch a brand new agent in a separate tmux window. I point it at the FD with plan mode on so Claude builds a line-level implementation plan, then run with accept edits on and let it run. When an FD has a big blast radius, I’ll instruct the Worker to create a worktree, which Claude Code handles natively. Compaction tends to work better with Workers, probably because the FD has granular plan details that a newborn Worker can attend to.

Verification

Each FD has a verification plan; however, Claude tends to find bugs in its own code when prompted to double-check its work, so I kept typing the same things over and over:

> proofread your code end to end, must be airtight
> check for edge cases again
> commit now, then create a verification plan on live test deployment.

So I built /fd-verify: it commits the current state, does a proofread pass, and executes a verification plan. In my original project, I also created dedicated testing slash commands like /test-cli that run full verification against live data.
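A /fd-verify-style pass needs somewhere to record its results. As a sketch of what such a report might look like (the report format and function name are hypothetical; the article doesn't show /fd-verify's actual output), a helper that renders checks as a timestamped Markdown table:

```python
from datetime import datetime, timezone

def render_verification_report(
    fd_id: str,
    checks: list[tuple[str, bool, str]],
) -> str:
    """Render verification results as a timestamped Markdown table.

    Each check is (name, passed, diagnostic note).
    """
    ts = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [
        f"# Verification: {fd_id}",
        f"Generated: {ts}",
        "",
        "| Check | Result | Notes |",
        "|-------|--------|-------|",
    ]
    lines += [
        f"| {name} | {'PASS' if ok else 'FAIL'} | {note} |"
        for name, ok, note in checks
    ]
    return "\n".join(lines)
```

Writing the report to a file next to the FD would give the Worker a durable artifact to cite when /fd-close updates the index and changelog.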
The agent executes live queries and commands, reasons about whether the results are correct, and writes Markdown files with tables, timestamps, and diagnostic notes. What’s great about this is that the agent can investigate issues on the spot, so by the end the result comes back diagnosed.

```
PM window:
 1. /fd-status                    ← What's active and pending
 2. Pick an FD (or /fd-new)       ← Groom the backlog or dump a new idea

Planner window (new agent session):
 3. /fd-explore                   ← Load project context
 4. Design the FD                 ← if stuck, /fd-deep and cross-check in Cursor
 5. FD status → Open              ← Design is ready for implementation

Worker window (fresh agent session):
 6. /fd-explore                   ← Fresh context load
 7. "implement fd-14" (plan mode) ← Claude builds a line-level implementation plan
 8. Implement with atomic commits ← FD-XXX: description
 9. /fd-verify                    ← Proofread and verification
10. Test on real deployment       ← Verification skills or manual
11. /fd-close                     ← Archive, update index, changelog
```

FD files as decision traces

In my original project, I now have 300+ FD files, each with a problem statement, solutions considered, and what was implemented. An emergent property

