Capability
Document Image Text Extraction With Layout Preservation
20 artifacts provide this capability.
Top Matches
via “optical character recognition with layout preservation”
Microsoft's unified model for diverse vision tasks.
Unique: Performs end-to-end OCR with layout preservation using a single seq2seq model that generates text tokens interleaved with coordinate sequences, eliminating the separate text detection and recognition stages.
vs others: Simpler pipeline than Tesseract combined with a text detection model, but with 15-25% lower character accuracy on printed documents; stronger than traditional OCR on handwriting and scene text.
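
To make the "text tokens interleaved with coordinate sequences" idea concrete, here is a minimal Python sketch of how such an output string might be decoded into text regions with pixel coordinates. The listing does not specify the token format, so everything here is an assumption for illustration: the `<loc_N>` token syntax, the 1000-bin quantization, the eight-coordinates-per-region (quadrilateral) layout, and the names `parse_interleaved`, `reading_order`, and `TextRegion` are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: each coordinate is emitted as one token such as <loc_123>,
# an integer bin in [0, NUM_BINS) along the image width (x) or height (y).
NUM_BINS = 1000

# Assumption: every text span is followed by exactly 8 location tokens,
# i.e. the four (x, y) corners of a quadrilateral around the text.
SPAN = re.compile(r"([^<]+)((?:<loc_\d+>){8})")
LOC = re.compile(r"<loc_(\d+)>")


@dataclass
class TextRegion:
    text: str
    quad: list  # four (x, y) corners in pixel coordinates


def parse_interleaved(output: str, width: int, height: int) -> list:
    """Split a decoded sequence into text regions with pixel-space corners."""
    regions = []
    for match in SPAN.finditer(output):
        bins = [int(b) for b in LOC.findall(match.group(2))]
        # Even positions are x bins, odd positions are y bins.
        quad = [
            (bins[i] * width / NUM_BINS, bins[i + 1] * height / NUM_BINS)
            for i in range(0, 8, 2)
        ]
        regions.append(TextRegion(match.group(1).strip(), quad))
    return regions


def reading_order(regions: list) -> list:
    """Sort regions top-to-bottom, then left-to-right, by their first corner."""
    return sorted(regions, key=lambda r: (r.quad[0][1], r.quad[0][0]))


# Made-up model output for a 2000x1000 image, used only to exercise the parser.
sample = (
    "World<loc_500><loc_100><loc_700><loc_100>"
    "<loc_700><loc_150><loc_500><loc_150>"
    "Hello<loc_100><loc_100><loc_300><loc_100>"
    "<loc_300><loc_150><loc_100><loc_150>"
)
regions = parse_interleaved(sample, width=2000, height=1000)
ordered_text = " ".join(r.text for r in reading_order(regions))
```

Because the text and its position arrive in one sequence, layout reconstruction reduces to a parsing step like this, rather than matching the boxes from a detector to the strings from a recognizer.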