* [gentoo-user] Document management solution [possibly a bit off-topic...]
From: Steve [Gentoo] @ 2005-09-29 14:45 UTC
To: gentoo-user

I think I want a "document management solution" - though I'm not sure that everyone understands the same idea by the term.

I've got a filing cabinet full of paperwork which is an absolute nightmare to cope with. One of the key problems is that the documents need to be indexed in different ways. All the documents are dated, but they can be further sub-divided by subject - lots of documents pertain to several subjects. I frequently need to find a specific document, a sequence of related documents, or similar. I rarely need the original document - but often want a copy or just to check some detail or other. Some documents are multi-page, some single page... all can be easily scanned.

I'm looking for software which minimises the burden of managing these documents - probably as scanned images. I'm familiar with the DjVuLibre library and think that format is fantastic - though a less ambitious format would likely suffice (even as 200dpi grey-scale JPEGs I get ~10,000 pages without needing more than one DVD to back up...)

A significant burden is in scanning and storing all these documents - and this makes a good UI essential - preferably allowing a single click to scan a document (incidentally, can anyone recommend a good, cheap, sheet-fed scanner?) before page preview (cropping/rotating), assignment of a "subject" classification and date-stamping. It would be useful if there was an OCR pass to extract plain text and index it - though this feature is not essential.

There would need to be a friendly UI to find all the documents matching a given subject classification (or group of classifications) - to preview on-screen and offer an option to print... preferably in order... maybe with a watermark dating the copy?

Is anyone aware of any existing packages - preferably for Gentoo, but any open-source solution would suffice?

Thanks in advance for any suggestions :-)

Steve
-- 
gentoo-user@gentoo.org mailing list
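The indexing described above (every document dated, each one filed under several subjects) maps fairly naturally onto a many-to-many layout. What follows is only a minimal sketch, assuming SQLite through Python's standard sqlite3 module; the table names, column names and the helper functions open_archive, add_document and find_by_subjects are all made up for illustration and are not taken from any existing package.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE document (
    id        INTEGER PRIMARY KEY,
    doc_date  TEXT NOT NULL,      -- ISO date printed on the paper original
    pages     INTEGER NOT NULL,   -- number of scanned pages
    scan_path TEXT NOT NULL       -- where the scanned image file lives
);
CREATE TABLE subject (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Link table: one document may carry many subjects and vice versa.
CREATE TABLE document_subject (
    document_id INTEGER NOT NULL REFERENCES document(id),
    subject_id  INTEGER NOT NULL REFERENCES subject(id),
    PRIMARY KEY (document_id, subject_id)
);
"""


def open_archive(path=":memory:"):
    """Open (or create) an archive database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, doc_date, pages, scan_path, subjects):
    """Record one scanned document and file it under each given subject."""
    cur = conn.execute(
        "INSERT INTO document (doc_date, pages, scan_path) VALUES (?, ?, ?)",
        (doc_date, pages, scan_path),
    )
    doc_id = cur.lastrowid
    for name in subjects:
        # Create the subject on first use, then look up its id.
        conn.execute("INSERT OR IGNORE INTO subject (name) VALUES (?)", (name,))
        (subject_id,) = conn.execute(
            "SELECT id FROM subject WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO document_subject (document_id, subject_id) "
            "VALUES (?, ?)",
            (doc_id, subject_id),
        )
    return doc_id


def find_by_subjects(conn, subjects):
    """Return scan paths of documents filed under ALL given subjects, oldest first."""
    wanted = sorted(set(subjects))
    placeholders = ",".join("?" for _ in wanted)
    rows = conn.execute(
        f"""
        SELECT d.scan_path
        FROM document d
        JOIN document_subject ds ON ds.document_id = d.id
        JOIN subject s ON s.id = ds.subject_id
        WHERE s.name IN ({placeholders})
        GROUP BY d.id
        HAVING COUNT(DISTINCT s.id) = ?
        ORDER BY d.doc_date, d.id
        """,
        (*wanted, len(wanted)),
    ).fetchall()
    return [path for (path,) in rows]
```

Storing the date as ISO text keeps "in order" printing a plain ORDER BY. Scanning, OCR and the UI are left out entirely; this covers only the filing side.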
* Re: [gentoo-user] Document management solution [possibly a bit off-topic...]
From: A. Khattri @ 2005-09-29 17:29 UTC
To: gentoo-user

On Thu, 29 Sep 2005, Steve [Gentoo] wrote:

> I think I want a "document management solution" - though I'm not sure
> that everyone understands the same idea by the term.

This might be overkill:
http://www.alfresco.org/

Or maybe something like ScrollKeeper would suffice?
* Re: [gentoo-user] Document management solution [possibly a bit off-topic...]
From: Steve [Gentoo] @ 2005-09-29 17:53 UTC
To: gentoo-user

A. Khattri wrote:

> On Thu, 29 Sep 2005, Steve [Gentoo] wrote:
>> I think I want a "document management solution" - though I'm not sure
>> that everyone understands the same idea by the term.
>>
> This might be overkill:
> http://www.alfresco.org/
>

Alfresco is what I'd have called a content management system - as opposed to a document management system. I'm interested in managing archives of documents I have received from other people (in dead-tree format)...

> Or maybe something like ScrollKeeper would suffice?

ScrollKeeper seems to target electronic manuals etc. (as far as I can tell) - it doesn't appear to be focused on scanned documents. The typical sort of documents I need to manage include monthly and quarterly invoices and statements etc. from a wide variety of vendors. Like Alfresco, I'd say that ScrollKeeper looks more like a content management system than a document management system...
* Re: [gentoo-user] Document management solution [possibly a bit off-topic...]
From: A. Khattri @ 2005-09-29 20:52 UTC
To: gentoo-user

On Thu, 29 Sep 2005, Steve [Gentoo] wrote:

> Alfresco is what I'd have called a content management system - as
> opposed to a document management system. I'm interested in managing
> archives of documents I have received from other people (in dead-tree
> format)...

If there was something that scanned the document, performed OCR on it, checked the OCR output and then built an electronic repository for you, I'd recommend it. Until then, Alfresco is the closest thing I've seen that is open source. If you're willing to do your own scanning and OCR'ing then it will do the rest.

BTW, I would call things like Mambo or Xaraya content-management tools - Alfresco is a slightly different kettle of fish.
* Re: [gentoo-user] Document management solution [possibly a bit off-topic...]
From: Nick Rout @ 2005-09-29 22:36 UTC
To: gentoo-user

On Thu, 29 Sep 2005 16:52:54 -0400 (EDT) A. Khattri wrote:

> On Thu, 29 Sep 2005, Steve [Gentoo] wrote:
>
> > Alfresco is what I'd have called a content management system - as
> > opposed to a document management system. I'm interested in managing
> > archives of documents I have received from other people (in dead-tree
> > format)...
>
> If there was something that scanned the document, performed OCR on it,
> checked the OCR output and then built an electronic repository for you I'd
> recommend it. Until then, Alfresco is the closest thing Ive seen that is
> open source. If you're willing to do your own scanning and OCR'ing then it
> will do the rest.
>
> BTW, I would call things like Mambo or Xaraya, content-management tools -
> Alfresco is a slightly different kettle of fish.

Yes, I know what Steve is after, and I'd love to find a way. I was put off by Alfresco being called "Content Management" because all of the content management systems I have seen end up building something that resembles [name your favourite news website].

A closer look at Alfresco reveals that it does look more like what Steve (and I) are after.

I am a lawyer and I handle hundreds of documents every week, from email through PDF (both those made from an electronic source, which therefore have all the text available, and scanned ones), OpenOffice (one enlightened client!), Word, Excel, HTML, faxes, letters (on paper, ya know!) - you name it, someone will send me something in it!

It'd be great to have a metadata system where I could give everything some keywords:

client name, file number, matter number, subjects, useful as a precedent, useful case etc. etc., so that in future I can:

pull up every document on my computer, my secretary's computer, my mail server (including attachments), my file server, my Palm Pilot, relating to a particular client

pull up every document about company debentures

find the case I downloaded and stored somewhere about liability of guarantors in a consumer credit loan

find the seminar book for the seminar I went to on some new area of law

find a letter written by Joe Bloggs sometime in 2003.

-- 
Nick Rout <nick@rout.co.nz>
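The keyword lookups listed above boil down to "find everything whose metadata matches all of these fields". One way to picture that is an inverted index from (field, value) pairs to document locations. This is only a rough sketch in plain Python: the class name MetadataIndex, the field names and the example paths are hypothetical, and it says nothing about how the documents on the various machines and the mail server would be discovered in the first place.

```python
from collections import defaultdict


class MetadataIndex:
    """Inverted index from (field, value) pairs to document locations."""

    def __init__(self):
        # (field, lower-cased value) -> set of locations carrying that keyword
        self._postings = defaultdict(set)

    def add(self, location, **metadata):
        """Register a document; a field may hold one value or a list of values."""
        for field, values in metadata.items():
            if isinstance(values, str):
                values = [values]
            for value in values:
                self._postings[(field, value.lower())].add(location)

    def find(self, **criteria):
        """Return sorted locations matching ALL the given field=value criteria."""
        sets = [
            self._postings.get((field, value.lower()), set())
            for field, value in criteria.items()
        ]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


# Example with made-up locations and keywords.
index = MetadataIndex()
index.add("fileserver/acme/debenture.pdf",
          client="Acme Ltd", subject=["company debentures"], year="2003")
index.add("mailserver/msg-1234/attachment.doc",
          client="Acme Ltd", subject=["guarantors", "consumer credit"], year="2004")
index.add("desktop/letters/bloggs.odt",
          client="Other Co", author="Joe Bloggs", year="2003")

by_client = index.find(client="Acme Ltd")
by_author_and_year = index.find(author="Joe Bloggs", year="2003")
```

Matching is exact (apart from case), so "sometime in 2003" works only because the year is stored as its own keyword; anything fuzzier would need a real search engine behind it.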
* Re: [gentoo-user] Document management solution [possibly a bit off-topic...]
From: Eric Crossman @ 2005-09-30 1:43 UTC
To: gentoo-user

On Fri, 2005-09-30 at 10:36 +1200, Nick Rout wrote:

> On Thu, 29 Sep 2005 16:52:54 -0400 (EDT)
> A. Khattri wrote:
>
> > On Thu, 29 Sep 2005, Steve [Gentoo] wrote:
> >
> > > Alfresco is what I'd have called a content management system - as
> > > opposed to a document management system. I'm interested in managing
> > > archives of documents I have received from other people (in dead-tree
> > > format)...
> >
> > If there was something that scanned the document, performed OCR on it,
> > checked the OCR output and then built an electronic repository for you I'd
> > recommend it. Until then, Alfresco is the closest thing Ive seen that is
> > open source. If you're willing to do your own scanning and OCR'ing then it
> > will do the rest.
> >
> > BTW, I would call things like Mambo or Xaraya, content-management tools -
> > Alfresco is a slightly different kettle of fish.
>
> Yes I know what Steve is after, and I'd love to find a way. I was put
> off by Alfresco being called "Content Management" because all of the
> content management systems I have seen end up building something that
> resembles [name your favourite news website]
>
> A closer look at alfresco reveals that it does look more like what Steve (and I ) are after.
>
> I am a lawyer and I handle hundreds of documents every week, from email
> through pdf (both made from an electronic source and therefore has all
> the text available, and scanned) openoffice (one enlightened client!),
> word, excel, html, faxes, letters (on paper, ya know!) you name it
> someone will send me something in it!
>
> It'd be great to have a metadata system where I could give everything
> some keywords:
>
> client name, file number, matter number, subjects, useful as a
> precedent, useful case etc etc etc so that in future I can :
>
> pull up every document on my computer, my secretary's computer, my mail
> server (including attachments), my file server, my palm pilot, relating
> to a particular client
>
> pull up every document about company debentures
>
> find the case i downloaded and stored somewhere about liability of
> guarantors in a consumer credit loan
>
> find the seminar book for the seminar i went to on some new area of
> law.
>
> find a letter written by Joe Bloggs sometime in 2003.
>
> --
> Nick Rout <nick@rout.co.nz>
>

I'm not sure if what you're describing exists right now in the open source world, but I can tell you that it certainly does in the commercial world. I used to work in the "metadata" department for a startup here in upstate NY, USA that built a web-based application targeting lawyers such as yourself. It was written in PHP/MySQL, but the database was being migrated to Oracle due to the rapid growth in the database tables.

Unfortunately though, in the migration to Oracle, they elected to create a "dynamic" scheme to support adding custom metadata fields as requested per client. It was great for flexibility, but the performance was horrible even on quad 3 GHz Xeon boxes with maxed-out memory. For us programmers, it also made the easy queries difficult and the hard queries near impossible.

Eric
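The message above does not say how the "dynamic" scheme was laid out. Assuming it was something along the lines of an entity-attribute-value table (one row per document, field and value), the sketch below shows why an easy query stops being easy: each extra criterion costs another self-join. It uses SQLite via Python's sqlite3 purely for illustration (the system described ran on Oracle), and the tables doc_fixed and doc_attr, the sample rows and the two functions are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Fixed layout: one column per metadata field.
CREATE TABLE doc_fixed (id INTEGER PRIMARY KEY, client TEXT, author TEXT, year INTEGER);
-- Assumed "dynamic" layout: one row per (document, field, value) triple,
-- so a new custom field needs no schema change.
CREATE TABLE doc_attr (doc_id INTEGER, field TEXT, value TEXT);
""")

# Made-up sample data, loaded into both layouts.
docs = [
    (1, "Acme Ltd", "Joe Bloggs", 2003),
    (2, "Acme Ltd", "Jane Doe", 2004),
    (3, "Other Co", "Joe Bloggs", 2003),
]
for doc_id, client, author, year in docs:
    conn.execute("INSERT INTO doc_fixed VALUES (?, ?, ?, ?)",
                 (doc_id, client, author, year))
    conn.executemany(
        "INSERT INTO doc_attr VALUES (?, ?, ?)",
        [(doc_id, "client", client),
         (doc_id, "author", author),
         (doc_id, "year", str(year))],
    )


def letters_fixed(conn, author, year):
    """Fixed columns: two criteria are just two conditions in one WHERE clause."""
    rows = conn.execute(
        "SELECT id FROM doc_fixed WHERE author = ? AND year = ? ORDER BY id",
        (author, year),
    )
    return [row[0] for row in rows]


def letters_dynamic(conn, author, year):
    """Attribute rows: every additional criterion needs another self-join."""
    rows = conn.execute(
        """
        SELECT a.doc_id
        FROM doc_attr a
        JOIN doc_attr y ON y.doc_id = a.doc_id
        WHERE a.field = 'author' AND a.value = ?
          AND y.field = 'year'   AND y.value = ?
        ORDER BY a.doc_id
        """,
        (author, str(year)),
    )
    return [row[0] for row in rows]
```

Both functions return the same documents on this toy data; the difference is in the shape of the query. In the attribute layout every value is also stored as text, so range conditions and sorting on numeric or date fields need casts on top of the joins.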