I'm looking for a free or open source document management application that lets me store digital copies of paper documents on my personal computer and attach data-entry fields to each one, so that I can retrieve the digital copy later.
For example, if the document is a fine, I should be able to find it by its date, by the fact that it is a fine, or by any other custom field I have added for searching.
OCR full-text search would be a great plus, but it is not mandatory.
LogicalDOC Community could be used for this purpose. It lets you catalog and tag many file types and has free built-in OCR.
One feature I really like about this package is its full-text search engine, which natively supports language-specific searches.
There is good documentation for installing it on Ubuntu, and the installation does not involve any particular difficulties.
There are several open source document management systems and scanning solutions that would suit your archiving needs. For document management, there is:
Mayan EDMS (pip install mayan-edms)
As for scanning software, there are a few open source options, but none that perform very well. Depending on what you are looking to archive (and how you plan to access it in the future), you might be able to just tag your documents accordingly inside your management software. Also, you are unlikely to find solid OCR in any freeware scanning application.
If you have the option, I strongly suggest outsourcing document conversion projects. Not only will you get it done faster, you will also have the option to OCR your files and know that the finished result will be of professional quality and easy to read.
There is a document management system that does pretty much exactly what you require, called Archivista. I've evaluated it for our museum's archive.
It can be downloaded as an installable ISO or purchased pre-installed on small business computers. I do not know of a way to install it under Ubuntu, however, which may be a dealbreaker for you. Here, we just run it as a virtual machine and interact with it via X forwarding and its HTML interface.
Archivista claims the software is designed for long data retention periods (approx. 20 years). It can make use of scanners, and it stores an image of the scanned document along with a PDF and an OCR version. Documents can be assigned metatags, and their OCR'ed text is searchable.
I think you are looking for document catalog management software. I am using Calibre to manage my ebooks. Apart from PDF, it also supports the MOBI, LIT, PRC, EPUB, ODT, HTML, CBR, CBZ, RTF, TXT and LRS formats.
I am not sure whether it supports the MS DOC format, but you can check it out. Please visit the official site for more information.
To install Calibre, use the following command.
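On Ubuntu, the usual route is the packaged version from the standard repositories. A minimal sketch, assuming the package is named calibre in your release's repositories and that you have sudo rights (the -y flag only skips the confirmation prompt):

```shell
# Assumption: Ubuntu with a "calibre" package in the configured repositories.
# Refresh the package index first so the newest packaged version is found.
sudo apt-get update
# Install Calibre and its dependencies without prompting for confirmation.
sudo apt-get install -y calibre
```

The repository version may lag behind the release offered on the official site, so check there if you need a recent feature.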
Information hierarchy helps you collaborate to generate documents online or in Microsoft Office tools. You can swiftly organize, store, and locate your documents via dataentry.ie