How to OCR and Save PDFs as Text Files Using Batch Processing

Here’s a walkthrough of how to OCR PDFs and save them as text files using the Batch Processing and ClearScan features in Adobe Acrobat 9 Pro.

1. Open Adobe Acrobat 9 Pro

2. In the top menu, choose Advanced > Document Processing > Batch Processing

3. In the Batch Sequences window, choose New Sequence

4. In the text box, type a name for the sequence. I chose "OCR Text in PDF and Save as Text File". Click OK.

5. Click the Select Commands button.

6. In the Edit Sequence window, under the Document folder, choose Recognize Text Using OCR.

7. The Add button in the middle will now be enabled. Click it, and Recognize Text Using OCR will appear in the right-hand column.

8. Double-click Recognize Text Using OCR in the right-hand column, or highlight it and click the Edit button at the bottom middle of the Edit Sequence window.

9. In the Recognize Text Settings window, choose the language of the text in the PDFs you will be OCRing (we used English (US)).

10. Change PDF Output style to ClearScan.

11. Change Downsample Images to Low (300 dpi).

12. Click OK in the Recognize Text Settings window and then in the Edit Sequence window.

13. In the Edit Batch Sequence window, under step 2 (Run Commands on), click the Browse button and choose the folder that contains the PDFs you want to OCR and save as text files.

14. For Select output type (step 3), click the Browse button and choose the folder where you want to save the OCRed text files. You can create a new folder from within the browse window.

15. Under step 3, click the Output Options button.

16. Under File Naming, choose how you would like the files to be named (this can be set to the same file name as the original). You can also choose to overwrite existing files.

17. Under Output Format, choose Export File(s) to Alternate Format. In the drop-down menu, choose Text (Accessible).

18. Click OK for the Output Options and the Edit Batch Sequence windows.

19. In the Batch Sequences window, highlight the sequence you just created and click the Run Sequence button.

20. Click OK. Your batch process will now run on the folder you designated for that sequence. Any errors will be reported as the batch process runs in Acrobat.
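Once the batch finishes, it can be useful to confirm that every PDF actually produced a text file. The following is a minimal Python sketch, not part of Acrobat, that compares the two folders. It assumes you kept the original file names in step 16, so that report.pdf becomes report.txt; the function name and the folder paths in the example are hypothetical.

```python
from pathlib import Path


def find_missing_text_files(pdf_dir, text_dir, min_bytes=1):
    """Return the names of PDFs that have no matching, non-empty .txt file.

    Assumes the batch output kept each PDF's base name and only changed
    the extension to .txt (an assumption about the step 16 setting).
    """
    pdf_dir, text_dir = Path(pdf_dir), Path(text_dir)
    missing = []
    for pdf in sorted(pdf_dir.iterdir()):
        # Skip subfolders and anything that is not a PDF.
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        txt = text_dir / (pdf.stem + ".txt")
        # A missing or empty text file suggests the OCR or export failed.
        if not txt.is_file() or txt.stat().st_size < min_bytes:
            missing.append(pdf.name)
    return missing


# Example with hypothetical folder paths:
# for name in find_missing_text_files("C:/scans/pdfs", "C:/scans/text"):
#     print("No text output for", name)
```

Any file names it reports are good candidates to re-run through the sequence or to check against the errors Acrobat reported during the batch.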

 
