The Extractor relies on readable files with recognizable layouts. When it misses data or pulls the wrong numbers, the cause is almost always one of four things: a scanned image with poor OCR quality, a non-standard statement format, a multi-account statement that got collapsed into one, or a file type it can't fully read yet.
Here's how to get better results.
Why it misses things
Scanned images. Photos of statements or low-resolution scans are hard to parse. The Extractor has to guess at blurry numbers and tiny print, and it often guesses wrong.
Non-standard formats. Custodian statements, brokerage summaries, and tax forms have recognizable layouts. A handwritten note, a screenshot of a portal, or an unusual one-off report gives the Extractor less to work with.
Multi-account statements. If a single PDF contains five accounts, the Extractor might merge them, split them incorrectly, or miss some entirely. It tries hard to separate them, but dense consolidated statements can trip it up.
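Unsupported or mixed file types. The Extractor can't fully read every file type yet, so an unsupported file may extract only partially or not at all. See the File Types article for what's fully supported today.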
Fixes that usually work
Use text-based PDFs when you can. A PDF exported directly from the custodian or tax software parses far better than a scanned image of the same document. If you only have a scan, try to get a text-based export of the same document instead.
Split multi-account statements. If the Extractor merged accounts that should be separate, re-upload each account as its own file. Smaller, focused documents extract more reliably.
Use the context field when uploading. The text prompt on upload is your chance to tell the Extractor what it's looking at. Something like "this is a joint brokerage account at Schwab for John and Jane" gives it the grounding to assign ownership and institution correctly.
Review extracted items before committing. Everything the Extractor pulls lands on the Documents tab as pending items. Look them over, edit anything wrong, reject anything it shouldn't have created, then finish the extraction.
Add missing data manually. If something still hasn't extracted after one or two tries, add it by hand through the Profile tabs. It's almost always faster than fighting extraction.
When to send it to support
If you uploaded a common statement format from a major custodian and the extraction came back clearly broken, report it to us through Intercom and attach a sanitized copy of the file. Those reports are how the Extractor improves. The more real-world examples we see, the better it gets at the formats you actually use.
