OCR document converter


The Optical Character Recognition (OCR) document converter service does not make documents accessible.  It automatically converts individual image-based PDFs and images into text-based electronic documents.

The conversion process turns inaccessible scanned images into searchable text-based documents. A text-based document is the first step toward accessibility, but several other steps are needed to create a truly accessible document. Refer to UM's accessibility website for further details on Electronic Accessibility.

You will receive both a text-based PDF and a word processing document from the convertdoc service. You can edit the word processing document and save it as a fully accessible PDF.

OCR conversion is done using ABBYY Recognition Server.

Appropriate uses:

  • This process is for PDFs only. Word documents will not be processed, and you will receive an error message by email.
  • This process is for PDFs that are image-based[1], not PDFs that are already text-based[2].

[1] Image-based PDFs do not have text.  You cannot select a portion of the text and copy it to another document.  If you click on a page of the document, the entire page changes color slightly to indicate that the page is an image which you have just selected.  Screen readers cannot read any part of an image.  Screen readers require “true text” (optical characters).

[2] In a text-based PDF, when you click on the text, you see an I-beam and can select a portion of the text to copy to another document. (Your experience is the opposite of the experience described in footnote #1.)

Please be aware

  • None of these options are secure enough for confidential information, especially FERPA and HIPAA data. Use them appropriately.
  • Putting an accessible PDF through this process may strip it of accessibility features such as alt tags and heading styles.

How to request

Send an email message to convertdoc@umontana.edu from a University of Montana email account with the document you want converted attached. You should receive a return email with your converted documents attached within several minutes. 
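For anyone who prefers to script the email request, the following is a minimal sketch in Python of how such a message could be assembled. It only builds the message and does not send it. The function name, the subject line, and the body text are assumptions made for illustration; only the convertdoc address comes from the instructions above. The sketch accepts .pdf files only, which is stricter than the service may be.

```python
from email.message import EmailMessage
from pathlib import Path

# Address taken from the instructions above.
CONVERTDOC_ADDRESS = "convertdoc@umontana.edu"


def build_conversion_request(sender: str, pdf_path: Path) -> EmailMessage:
    """Build (but do not send) an email with one PDF attached.

    Hypothetical helper: the service does not publish a required
    subject line or body, so both are placeholders.
    """
    # Word processing documents are rejected by the service, so this
    # sketch refuses anything that is not a .pdf before building the message.
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"{pdf_path.name} is not a PDF")

    msg = EmailMessage()
    msg["From"] = sender  # should be a University of Montana account
    msg["To"] = CONVERTDOC_ADDRESS
    msg["Subject"] = f"OCR conversion: {pdf_path.name}"  # assumed, not required
    msg.set_content("Please convert the attached document.")
    msg.add_attachment(
        pdf_path.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf_path.name,
    )
    return msg
```

The resulting message could then be handed to any mail client or to `smtplib.SMTP(...).send_message(msg)`; the outgoing mail server to use is not stated on this page and would have to be supplied by you.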

Alternatively, departments may request a departmental OCR folder. OCR folders are provided for your convenience when converting groups of documents or larger documents from image-based to text-based. They function as drives on your computer (as seen below): image-based documents are dropped into the ocr_unprocessed folder and retrieved several minutes later as text-based documents from the ocr_completed folder.

[Image: a folder containing an ocr_completed folder and an ocr_unprocessed folder]
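The folder workflow can also be scripted. The sketch below, in Python, copies a batch of PDFs into ocr_unprocessed and later gathers whatever has appeared in ocr_completed. The function names and the idea of passing the mounted drive as `ocr_root` are assumptions for illustration; the page does not say how converted files are named, so the collection step simply takes every file it finds.

```python
import shutil
from pathlib import Path


def submit_pdfs(source_dir: Path, ocr_root: Path) -> list[Path]:
    """Copy every *.pdf in source_dir into ocr_root/ocr_unprocessed.

    ocr_root is assumed to be wherever the departmental OCR drive
    is mounted on your computer.
    """
    inbox = ocr_root / "ocr_unprocessed"
    submitted = []
    # Note: "*.pdf" is matched case-sensitively on some systems.
    for pdf in sorted(source_dir.glob("*.pdf")):
        submitted.append(Path(shutil.copy2(pdf, inbox / pdf.name)))
    return submitted


def collect_completed(ocr_root: Path, dest_dir: Path) -> list[Path]:
    """Move the files currently in ocr_root/ocr_completed into dest_dir."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    collected = []
    for item in sorted((ocr_root / "ocr_completed").iterdir()):
        if item.is_file():
            moved = shutil.move(str(item), str(dest_dir / item.name))
            collected.append(Path(moved))
    return collected
```

Since conversion takes several minutes, `collect_completed` would normally be run some time after `submit_pdfs`, not immediately.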


Most problems are due to:

  • Submitting a word processing document rather than an image-based PDF.
  • Email program limits. When ABBYY FineReader processes a PDF, it creates a separate optical character text layer. This makes the PDF larger, sometimes too large to send as an email attachment. In that case you receive an email which says “Unfortunately, the conversion of your document has failed. The job was discarded with the following error: Cannot send e-mail message.” If this happens to you frequently, you may want to request a departmental OCR folder, as described under “How to request” above.
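Because the converted file comes back larger than the original, it can help to check a file's size before choosing between email and the OCR folder. The following Python sketch shows one way to do that. The 25 MB limit and the 50% headroom are purely hypothetical values chosen for illustration; this page does not state the actual attachment limit or how much a file grows during conversion.

```python
from pathlib import Path

# Hypothetical attachment cap; the real limit of your email program may differ.
ASSUMED_LIMIT_BYTES = 25 * 1024 * 1024


def choose_route(
    pdf_path: Path,
    limit_bytes: int = ASSUMED_LIMIT_BYTES,
    headroom: float = 0.5,
) -> str:
    """Suggest "email" or "folder" for a document based on its size.

    headroom is the fraction of the limit the original file may use,
    leaving room for the text layer added during conversion (assumed value).
    """
    size = pdf_path.stat().st_size
    return "email" if size <= limit_bytes * headroom else "folder"
```

A result of "folder" only means the converted file might not fit in a reply email under these assumed numbers; it is a rough guide, not a guarantee either way.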