contentCrawler is an integrated analysis, reporting and processing framework that empowers Document Management Professionals with a range of services for managing high-value documents in Content Repositories more efficiently and more reliably.
Make documents retrievable and searchable
See how easy it is to locate image-based documents and convert them to text-searchable PDFs with pdfDocs contentCrawler.
Enhance search capability
contentCrawler provides a framework for searching an entire Content Repository based on specific search queries.
It identifies the documents in the Content Repository that meet the search criteria and produces an Audit report for Administrators.
Make all documents searchable
10-20% of documents in Content Repositories are non-searchable. This represents a significant risk to organizations. contentCrawler can identify non-searchable content in image files, PDFs and even email attachments.
The files are converted to text-searchable PDFs using DocsCorp's OCR technology and saved back into the Content Repository.
contentCrawler can search and convert backlogs of legacy documents as well as actively monitor newly-profiled documents.
Sources of non-searchable content
- Scanned documents saved as TIFF or image PDFs without being OCR’d
- Faxes saved as TIFF or image PDFs without being OCR’d
- Emails received with image-based attachments and saved into the DMS
- Documents ingested from acquisitions, or litigation files imported as image PDFs, including image files attached to emails
- Legacy documents retained over many years
Content Repository integration
contentCrawler integrates with Autonomy iManage 8.2 or higher, OpenText eDOCS DM 5.1.05 or higher, and OpenText Content Server.
contentCrawler also integrates with MS Windows file systems.