How cases merge

Files attach to a case by identifier. When a new file lands carrying an identifier already claimed by an existing case, it joins that case; otherwise a new case is atomically created. Each file is extracted independently and its values write into case.data — empty values cannot blank populated fields, and your manual edits are protected by the override map.

Last updated 2026-05-06

What you need

A workspace with at least one case
A new file whose extracted identifier matches that case

The routing rule

Every case has a top-level Identifier. The workspace enforces uniqueness on identifier values — at most one case per (workspace, identifier value) pair, backed by an atomic Firestore lock. Every ingestion channel — Gmail, Drive, manual upload — runs the same three-branch routing:

Identifier extracted, lock found → file attaches to the existing case.
Identifier extracted, no lock → atomically claim the lock, create a new case, attach the file.
No identifier surfaced → create a new case with an auto-generated ID, attach the file.

The atomic claim makes the second branch race-safe: two parallel ingests of the same identifier can't produce two cases. Exactly one wins the lock, and the other attaches to the winner's case.
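The three branches can be sketched in a few lines. This is a minimal in-memory illustration, not the product's implementation: the Workspace class, the route method, the ID prefixes, and the use of a threading.Lock in place of the atomic Firestore lock are all assumptions made for the example.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Case:
    case_id: str
    identifier: Optional[str]
    files: List[str] = field(default_factory=list)


class Workspace:
    """Hypothetical in-memory stand-in for one workspace."""

    def __init__(self) -> None:
        # Assumption: a process-local mutex plays the role of the atomic lock.
        self._mutex = threading.Lock()
        self.locks: Dict[str, str] = {}   # identifier value -> case_id
        self.cases: Dict[str, Case] = {}

    def route(self, file_name: str, identifier: Optional[str]) -> Case:
        with self._mutex:
            if identifier is None:
                # Branch 3: no identifier surfaced, so use an auto-generated ID.
                case = Case(f"auto-{uuid.uuid4().hex[:8]}", None)
                self.cases[case.case_id] = case
            elif identifier in self.locks:
                # Branch 1: the identifier is already claimed; join that case.
                case = self.cases[self.locks[identifier]]
            else:
                # Branch 2: claim the lock and create the case in one step.
                case = Case(f"case-{uuid.uuid4().hex[:8]}", identifier)
                self.locks[identifier] = case.case_id
                self.cases[case.case_id] = case
            case.files.append(file_name)
            return case
```

Because the lookup and the claim happen under the same lock, two concurrent calls with the same identifier cannot both reach the second branch; the later caller finds the lock and falls into the first.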

Auto-promote on extraction

A case created without an identifier (manual placeholder, no-identifier ingest) auto-promotes the moment a later extraction surfaces a clean identifier value. If that value is already claimed by another case, the promotion is skipped and the auto-generated ID stays; you resolve the collision manually by editing the identifier on either case.
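As a rough sketch of that promotion check, assuming cases are plain dicts, the lock table is a dict of identifier value to case ID, and "clean" simply means non-empty after trimming whitespace (the real cleanliness test is not described here). The try_promote name is hypothetical, and a real system would perform the check and the claim atomically.

```python
from typing import Dict, Optional


def try_promote(locks: Dict[str, str], case: dict, extracted: Optional[str]) -> bool:
    """Give an identifier-less case a real identifier; return True if promoted."""
    if case.get("identifier") is not None:
        return False  # only cases still on an auto-generated ID are promoted
    value = (extracted or "").strip()
    if not value:
        return False  # assumption: an empty value does not count as clean
    if value in locks:
        return False  # collision: skip the promotion, keep the auto ID
    locks[value] = case["case_id"]  # claim the identifier for this case
    case["identifier"] = value
    return True
```

A False return on collision leaves both the lock table and the case untouched, which matches the manual-resolution path described above.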

Renaming an identifier

Open the case and edit the Identifier field directly. The rename atomically releases the old lock and claims the new one. If another case in the workspace already owns the new value, the rename surfaces a clean conflict error and nothing changes.
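The release-and-claim behavior might look like the following sketch. The IdentifierConflict exception, the rename_identifier function, and the dict-based lock table are illustrative assumptions; the point is only that the conflict check happens before any state is changed, all under one lock.

```python
import threading
from typing import Dict


class IdentifierConflict(Exception):
    """Hypothetical error raised when another case owns the new value."""


_rename_mutex = threading.Lock()  # stand-in for the atomic lock transaction


def rename_identifier(locks: Dict[str, str], case: dict, new_value: str) -> None:
    with _rename_mutex:
        owner = locks.get(new_value)
        if owner is not None and owner != case["case_id"]:
            # Fail before touching anything, so nothing changes on conflict.
            raise IdentifierConflict(f"{new_value!r} is owned by {owner}")
        old_value = case.get("identifier")
        if old_value is not None and locks.get(old_value) == case["case_id"]:
            del locks[old_value]           # release the old lock
        locks[new_value] = case["case_id"]  # claim the new one
        case["identifier"] = new_value
```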

What happens to field values when a second file arrives

When a file attaches to an existing case, extraction runs on that one new file against its matched template. The result writes into case.data field by field. There's no cross-file diffing — each file's extraction is independent.

The field-level rules for those writes:

The latest extraction wins for each field. A second invoice extracting Vendor Name = "Acme Supply Co." overwrites a previous "Acme Supply". There is no "longer string wins" rule, no string-similarity comparison — the newest extracted value replaces the prior value.
Empty extractions can't blank populated fields. If the AI returns null, undefined, or an empty string for a field that already has a value, the merge skips that field. The previous value stays.
Manual edits are sticky. Any field you've pencil-edited (or that the AI agent corrected via chat) is recorded in case.data._overrides. Every re-extraction, automatic or manual, checks this map and skips overridden fields. Document Blueprint never silently undoes a worker's correction. To let the next extraction overwrite the field again, clear the override via the row's history popover.
Dates are normalized at write time. Every value going into a date-typed field is converted to US MM/DD/YYYY regardless of how the AI returned it. Free text the AI returned for a date field (e.g. "TBD") falls through unchanged.
Repeating-row (batch) fields are written as a whole. The latest extraction's array of rows replaces the previous array. There is no row-level concatenation across files — if you need a single case to accumulate line items from multiple invoices, configure the second invoice as a separate case or merge them by hand.
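The rules above can be condensed into one merge function. This is a sketch under stated assumptions, not the actual merge code: it assumes case.data is a plain dict, that _overrides is keyed by field name, that the set of date-typed fields is passed in, and that the listed input date formats are the ones worth recognizing. The names merge_extraction, normalize_date, and DATE_FORMATS are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, Set

# Assumption: the input date formats the sketch recognizes.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y/%m/%d")


def normalize_date(value: Any) -> Any:
    """Convert a recognizable date string to MM/DD/YYYY; pass anything else through."""
    if not isinstance(value, str):
        return value
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    return value  # free text such as "TBD" falls through unchanged


def merge_extraction(case_data: Dict[str, Any], extracted: Dict[str, Any],
                     date_fields: Set[str]) -> Dict[str, Any]:
    overrides = case_data.get("_overrides", {})
    for name, value in extracted.items():
        if name in overrides:
            continue  # manual edits are sticky
        if value is None or value == "":
            continue  # empty extractions can't blank a populated field
        if name in date_fields:
            value = normalize_date(value)
        # Latest extraction wins; a list of rows replaces the old list whole.
        case_data[name] = value
    return case_data
```

For example, merging {"vendor_name": "Acme Supply Co.", "due": "2026-04-15", "po": ""} into a case that already has a po value overwrites vendor_name, stores due as "04/15/2026", and leaves po alone.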

A merge in action

Old file: Vendor='Acme Supply', Due=2026-04-01

New file: Vendor='Acme Supply Co.', Due=2026-04-15

After re-extraction: Vendor='Acme Supply Co.' (overwritten), Due=2026-04-15 (overwritten)

What happens if you edited a field before the second file arrived?

That's exactly what _overrides exists to protect. If you edited Vendor Name = "Acme Supply Inc." between the two files arriving, the second file's extraction sees the override entry for vendor_name, skips that field, and writes everything else. Your "Acme Supply Inc." value survives.

You can confirm a field is protected by looking for the small "edited" indicator next to the row in the case detail panel — that's the visual signal that an override is in place.