I have PDF Architect + OCR. When I open an existing image-based PDF, I click on the OCR tab and attempt to OCR the document with the image in the foreground and the OCR text in the background. After the process runs, there are dark grey boxes around the text it has identified. It also removes some characters (I assume because it did not recognize them correctly) and replaces those areas with white space.
An example document I'm attempting to OCR is an invoice/packing slip. It was scanned and emailed from a Xerox multi-function machine. I usually save these out of my email and OCR the text while preserving the original look. That way, searching through a folder(s) of documents (content search) is easier.
With Adobe Acrobat (an old version), I was able to OCR and it did not change the original look. Since then, I've switched to PDF Architect, but I cannot figure out the configuration.
My settings are:
------>Recognition Quality: High
------>Output Quality: Max
------>Orientation and Script Detection: Enabled
------>Selected: Deskew, Rotate Page, Detect Text Orientation and
------>PDF Type: Image-Text