2026.06 / NEWBIZFEED BUILD LOG

Using Multiple Codex Agents to Review Companies House and CRO Data

NewBizFeed is a simple idea with an awkward data problem behind it: fresh UK and Ireland company formation data is useful, but raw public-register data is not the same thing as a customer-ready lead list.

Companies House and the Irish Companies Registration Office are excellent sources of official company records. They tell you that a company exists, when it was incorporated, what broad category it sits in, and where the official record can be checked. They do not, by themselves, answer the question a small business actually cares about: “is this a sensible company for me to contact, and can I trust the extra website/contact evidence attached to it?”

The product is not the raw database. The product is a smaller, fresher, source-attributed list that has been filtered, checked, and kept honest about uncertainty.

The raw feeds are useful, but uneven

The UK side starts with Companies House public data products and, where configured, Companies House stream/API updates. The Ireland side uses CRO Open Data company records, which are suitable for a daily Ireland feed. Both sources are official public-register data, but they arrive with different cadence, shape, and limitations.

That matters because “new companies” is a deceptively simple phrase. A company can be active, active with a strike-off proposal, recently incorporated, dormant-looking, a property holding vehicle, a charity-style entity, or a perfectly valid but commercially irrelevant record for a particular lead pack. NewBizFeed has to treat the register as a warehouse, not as the finished offer.
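
As a rough illustration of treating the register as a warehouse, a deterministic pre-filter could look something like the sketch below. Everything specific in it is an assumption made for illustration: the RegisterRecord fields, the status and category strings, and the 30-day freshness window are hypothetical and are not the real Companies House or CRO schema, nor NewBizFeed's actual rules.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical values: real register feeds use their own status strings and codes.
ACCEPTED_STATUSES = {"active"}
EXCLUDED_CATEGORIES = {"property-holding", "charity", "dormant"}


@dataclass(frozen=True)
class RegisterRecord:
    company_number: str
    name: str
    source: str          # e.g. "companies-house" or "cro" (assumed labels)
    source_url: str      # link back to the official record
    status: str
    category: str
    incorporated_on: date


def is_candidate(record: RegisterRecord, today: date, max_age_days: int = 30) -> bool:
    """Return True only for records worth sending on to enrichment."""
    if record.status not in ACCEPTED_STATUSES:
        return False  # e.g. active with a strike-off proposal
    if record.category in EXCLUDED_CATEGORIES:
        return False  # valid record, but irrelevant for this lead pack
    age = today - record.incorporated_on
    # Reject future-dated records and anything older than the freshness window.
    return timedelta(0) <= age <= timedelta(days=max_age_days)
```

Because the filter is plain code with no model call, it is cheap to run over the whole import and easy to audit when a record is dropped.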

Why one agent is not enough

At small scale, a single enrichment job can work through companies one by one. At real register scale, that becomes slow and expensive. The better pattern is a pipeline with different jobs doing different levels of thinking:

Data ingestion imports Companies House and CRO records and keeps source attribution attached.
Deterministic filters remove obviously unsuitable records before any model is asked to reason.
Codex Discovery A and B work through separate batches in parallel, looking for website and contact evidence.
Codex Instant Review checks candidates that discovery found, focusing on quick evidence quality.
Codex Final Review is reserved for uncertain, weak, ambiguous, or higher-risk cases.
The deterministic gate is the only part allowed to publish customer-visible website/contact fields.

The important design choice is that the agents are not treated as magic truth machines. They are reviewers. They can inspect evidence, summarise what they found, and label confidence, but they do not get to publish directly.

Two discovery lanes are better than duplicate checking

Earlier versions of this kind of workflow often spent several model calls checking the same row. That can improve confidence, but it also burns throughput. NewBizFeed's normal discovery path now runs two Codex discovery lanes on different company batches, so Codex Discovery A and Codex Discovery B can move through fresh UK and Ireland rows at the same time.

That gives the system a wider first pass. Because each row is handled by only one lane, the routing is simple: if that lane finds no useful evidence, there is no reason to send the row deeper into review. If it finds a plausible website/contact candidate, the row moves to a fast review stage. Only rows that remain uncertain or weak after that need the slower final-review lane.

Companies House / CRO records
  ↓
quality filters and enrichment queue
  ↓
Codex Discovery A      Codex Discovery B
  ↓                    ↓
found candidates and evidence notes
  ↓
Codex Instant Review
  ↓
Codex Final Review only when needed
  ↓
deterministic publish gate
  ↓
customer-visible lead packs
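
The flow above can be sketched as a batch split plus two small routing functions. This is a sketch under assumptions rather than the real implementation: the round-robin split, the Stage names, and the confidence labels ("high", "uncertain", "weak") are all hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class Stage(Enum):
    STOP = "stop"                      # no evidence found; row goes no deeper
    INSTANT_REVIEW = "instant_review"
    FINAL_REVIEW = "final_review"
    GATE = "gate"                      # hand over to the deterministic publish gate


def assign_lanes(rows: Sequence[str], lanes: Sequence[str] = ("A", "B")) -> dict[str, list[str]]:
    """Split rows into disjoint batches so each row is seen by exactly one lane."""
    batches: dict[str, list[str]] = {lane: [] for lane in lanes}
    for index, row in enumerate(rows):
        batches[lanes[index % len(lanes)]].append(row)
    return batches


def after_discovery(candidate_url: Optional[str]) -> Stage:
    """Rows with no candidate stop here; the rest go to the fast review."""
    return Stage.INSTANT_REVIEW if candidate_url else Stage.STOP


def after_instant_review(confidence: str) -> Stage:
    """Only uncertain or weak rows pay for the slower final review."""
    return Stage.GATE if confidence == "high" else Stage.FINAL_REVIEW
```

Note that even a "high" result is only routed to the gate; nothing in this routing publishes a field.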

The gate matters more than the model

The safest part of the design is deliberately boring: a deterministic gate decides whether a field can be published. Agent output is provisional. A model can say “this looks like the right website”, but the gate still checks whether the evidence is strong enough, whether the confidence label is acceptable, whether the contact-page evidence is missing or ambiguous, and whether the row should remain blank instead.

That last option is important. For lead products, a blank field is often better than a wrong field. A buyer can work around missing data; wrong data damages trust.
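
A minimal sketch of such a gate, assuming a hypothetical AgentVerdict shape, an assumed set of publishable confidence labels, and an assumed minimum number of evidence items, might look like this. The point it illustrates is that the default outcome is a blank field.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed rules for illustration; the real thresholds are not stated in this log.
PUBLISHABLE_CONFIDENCE = {"high"}
MIN_EVIDENCE_ITEMS = 2


@dataclass(frozen=True)
class AgentVerdict:
    website: Optional[str]
    contact_page: Optional[str]
    confidence: str
    evidence: tuple[str, ...]
    contact_page_ok: bool      # False when contact evidence is missing or ambiguous


@dataclass(frozen=True)
class PublishedFields:
    website: Optional[str] = None
    contact_page: Optional[str] = None


def publish_gate(verdict: AgentVerdict) -> PublishedFields:
    """Return only the fields that pass every rule; anything else stays blank."""
    website_ok = (
        verdict.website is not None
        and verdict.confidence in PUBLISHABLE_CONFIDENCE
        and len(verdict.evidence) >= MIN_EVIDENCE_ITEMS
    )
    if not website_ok:
        return PublishedFields()  # a blank field is better than a wrong one
    contact = verdict.contact_page if verdict.contact_page_ok else None
    return PublishedFields(website=verdict.website, contact_page=contact)
```

The gate never calls a model and never upgrades a verdict. It can only pass fields through or leave them empty.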

What Codex is actually reviewing

The agents are not reviewing private information. The useful checks rely on public, source-attributed evidence and are intentionally narrow:

Does a candidate website appear to belong to the company rather than a similarly named business?
Is there hard evidence on the site, such as a matching company name, trading name, locality, or source link?
Is the contact page present and relevant, or is the evidence weak?
Should the row be marked high-confidence, kept for review, or suppressed?
Does the result remain compliant with the source terms and customer-facing caveats?

This keeps the automation useful without pretending it can infer permission to market, private contact details, or guaranteed buyer intent. NewBizFeed is about organising official public-register data and evidence, not inventing a magic list of ready-to-buy customers.
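
One way to keep those checks narrow is to accept agent output only as a fixed, validated structure with one field per question. The sketch below assumes the agents return JSON; the key names and decision labels are hypothetical, chosen to mirror the five questions above.

```python
import json
from dataclasses import dataclass

# Hypothetical labels and keys, mirroring the review questions one to one.
ALLOWED_DECISIONS = {"high_confidence", "keep_for_review", "suppress"}
REQUIRED_KEYS = {
    "belongs_to_company",
    "hard_evidence",
    "contact_page_relevant",
    "decision",
    "terms_compliant",
}


@dataclass(frozen=True)
class ReviewNote:
    belongs_to_company: bool
    hard_evidence: tuple[str, ...]
    contact_page_relevant: bool
    decision: str
    terms_compliant: bool


def parse_review_note(raw: str) -> ReviewNote:
    """Parse an agent's JSON note, rejecting anything outside the agreed shape."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("review note must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"review note is missing keys: {sorted(missing)}")
    decision = data["decision"]
    if not isinstance(decision, str) or decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision label: {decision!r}")
    return ReviewNote(
        belongs_to_company=bool(data["belongs_to_company"]),
        hard_evidence=tuple(str(item) for item in data["hard_evidence"]),
        contact_page_relevant=bool(data["contact_page_relevant"]),
        decision=decision,
        terms_compliant=bool(data["terms_compliant"]),
    )
```

A note that fails to parse is treated as no note at all, which keeps free-form model text away from anything customer-facing.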

Why this is a good fit for Codex-style agents

This task sits in a useful middle ground. It is too judgement-heavy for a simple script, but too repetitive for a human to do manually at register scale. Codex-style agents are useful because they can read messy public web pages, compare evidence, return structured notes, and operate as separate workers on separate batches.

The trick is not to ask the model to run the business. The trick is to wrap the model in a system that limits scope:

separate discovery lanes for throughput,
a fast review lane for ordinary candidates,
a slower final-review lane for difficult cases,
audit trails for agent runs and imported reviews,
deterministic rules for what can be published,
source links so customers can verify important records themselves.
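
As one possible shape for the audit trail, each agent run could append a single JSON line per row it touched. The field names, lane labels, and outcome strings below are assumptions for illustration only.

```python
import json
from datetime import datetime, timezone


def format_audit_entry(*, run_id: str, lane: str, company_number: str,
                       outcome: str, source_url: str, at: datetime) -> str:
    """Build one JSON Lines record; `at` should be a timezone-aware datetime."""
    entry = {
        "at": at.astimezone(timezone.utc).isoformat(),
        "run_id": run_id,
        "lane": lane,                    # e.g. "discovery-a" (assumed label)
        "company_number": company_number,
        "outcome": outcome,              # e.g. "no_evidence" (assumed label)
        "source_url": source_url,        # link back to the official record
    }
    # sort_keys keeps lines stable, which makes diffs and greps predictable.
    return json.dumps(entry, sort_keys=True) + "\n"
```

Appending lines like this to a write-once log means a published field can be traced back to the run and the register record behind it.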

The product lesson

The useful lesson from NewBizFeed is not “use AI to scrape everything.” It is almost the opposite: use official public data as the base, filter hard, use agents for the specific evidence checks that scripts struggle with, and keep a non-AI gate between provisional review and customer-facing output.

That is slower than pretending every record is a qualified lead, but it is a better foundation for a product people can trust. The aim is a lead feed that is fresh enough to be useful, small enough to act on, and honest enough that every exported row can point back to its source.