Working with PDF Documents
This section explains what happens when you import different types of PDF files into Docsvault. PDF files can be imported into Docsvault in several ways.
PDF files can be divided into two types:
Text-based PDF: A series of text elements, optionally with images
Image-only PDF: A single scanned image per page
Importing image-only PDF files into Docsvault
In an image-only PDF, each page is an image rather than text. If you try to select text with the mouse, nothing is selected. You cannot edit or delete the content; you can only read the file.
If you import an image-only PDF file that was created or scanned with another application, you can still work with it. Docsvault runs optical character recognition (OCR) on the file during import, using its OCR add-on. Note that this feature is available only if it has been enabled in Tools > Advance Settings. For more information, see OCR in the Advance Settings.

Docsvault attempts to OCR all imported PDFs. However, if the OCR process finds any text content in an imported PDF file, it skips that file and keeps it in its original form. This protects text-based PDF files that do not need OCR. If you still wish to OCR such a file, you can force it by opening the file's Properties dialog and marking it for Re-OCR/Force OCR. This converts the entire PDF file into an image-based PDF and then runs OCR on all pages. For more information on Re-OCR, see Re-OCR PDF File.

You can check the current OCR status on the File Properties > General tab. For more information, see OCR Status.
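The import rules described above can be summarized as a small decision function. The following Python sketch is purely illustrative: the names `PdfPage`, `OcrAction`, and `decide_ocr_action` are hypothetical and are not part of Docsvault or any Docsvault API. It also assumes that Force OCR requires the OCR add-on to be enabled, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class OcrAction(Enum):
    """Possible outcomes when a PDF is imported (hypothetical names)."""
    NOT_AVAILABLE = "OCR add-on is not enabled"
    SKIP_HAS_TEXT = "text content found; file kept in its original form"
    OCR_IMAGE_ONLY = "image-only PDF; OCR is run on the file"
    FORCE_OCR = "file converted to image-based PDF, then all pages are OCRed"


@dataclass
class PdfPage:
    """Simplified model of a page: extractable text and/or an image."""
    text: str = ""
    has_image: bool = False


def has_text_content(pages):
    """Return True if any page carries non-blank extractable text."""
    return any(page.text.strip() for page in pages)


def decide_ocr_action(pages, ocr_enabled, force_ocr=False):
    """Mirror the import rules described in the text above.

    Assumption (not stated in the help text): forcing OCR still
    requires the OCR add-on to be enabled.
    """
    if not ocr_enabled:
        return OcrAction.NOT_AVAILABLE
    if force_ocr:
        # Marked for Re-OCR/Force OCR: text check is bypassed.
        return OcrAction.FORCE_OCR
    if has_text_content(pages):
        # Protects text-based PDFs that do not need OCR.
        return OcrAction.SKIP_HAS_TEXT
    return OcrAction.OCR_IMAGE_ONLY


if __name__ == "__main__":
    scanned = [PdfPage(has_image=True), PdfPage(has_image=True)]
    mixed = [PdfPage(text="Invoice 42"), PdfPage(has_image=True)]
    print(decide_ocr_action(scanned, ocr_enabled=True).name)
    print(decide_ocr_action(mixed, ocr_enabled=True).name)
    print(decide_ocr_action(mixed, ocr_enabled=True, force_ocr=True).name)
```

Note that a single page with text is enough for the whole file to be skipped in this model, which matches the statement that OCR is skipped if "any text content" is found.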
Scanning documents and optical character recognition (OCR)
You can create a PDF directly from a paper document using Docsvault and your scanner.
Page url: http://www.docsvault.com/online-help/professional/index.html?working_with_pdf_documents.html