Subject: Re: [gentoo-user] Re: [OT] Saving an image as black and white
To: gentoo-user@lists.gentoo.org
References: <603CD50B.9080303@youngman.org.uk>
From: Wols Lists
Message-ID: <603CE962.7070202@youngman.org.uk>
Date: Mon, 1 Mar 2021 13:17:22 +0000

On 01/03/21 12:11, (Nuno Silva) wrote:
> On 2021-03-01, Wols Lists wrote:
>
>> I've got a bunch of scans, let's assume they're text documents. And
>> they're rather big ... I want to email them.
>>
>> How on earth do I convert them to TRUE b&w documents? At the moment they
>> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>> to store all the colour, luminance, whatever, per pixel. But actually,
>> there's only ONE BIT of information there - whether that pixel is black
>> or white.
>>
>> I'm using imagemagick, but so far all my attempts to strip out the
>> surplus information have resulted in INcreasing the file size ???
>>
>> So basically, how do I save an image as "one bit per pixel" like you'd
>> think you'd send to a B&W printer?
>>
>> Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>> uncompressed info for a page of A4, not 3MB.
>>
>> Cheers,
>> Wol
>
> Somebody else might have a better suggestion, or perhaps a better
> understanding of the JPEG format and of what needs to be tuned, but, for
> example:
>
> convert origin.jpg -threshold 70% -monochrome result.jpg
>
> (And adjust the "-threshold percent" if needed. It might be that you
> don't need thresholding at all, but if you do, it apparently must go
> before "-monochrome".)
>
> (Depending on the receiving end, you could also explore other
> formats. Here, if the scanned document can be stored in monochrome, I
> usually use djvu.)
>

Thanks but no, I've already tried that. It makes matters worse!

I've messed about with the scanner, so it is now creating 800KB images,
but I don't want to rescan everything I've done.

The problem is that it is clearly saving the images as greyscale, not as
black&white. And when I search for help, what I want is swamped by all
the false positives for greyscale.

Oh - and for Nuno - sorry tesseract is no use, they are NOT text. That's
why I used the word "assume" - to make it clear that I want a
1-bit/pixel palette, not a 5-byte/pixel greyscale.

Cheers,
Wol
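
To make "one bit per pixel" concrete, here is a minimal Python sketch (standard library only) that thresholds grey values and packs eight pixels into each byte, using the raw PBM ("P4") layout from netpbm. The function names, the default threshold of 128, and the A4 dimensions in millimetres are assumptions chosen for illustration; nothing here comes from imagemagick itself. It also redoes the size estimate from the quoted message, which for A4 at 300dpi comes out at a little over 1 MB uncompressed.

```python
def pack_bilevel(gray_rows, threshold=128):
    """Pack rows of 0-255 grey values into 1-bit-per-pixel bytes.

    Follows the raw PBM (P4) convention: bit 1 = black, bit 0 = white,
    most significant bit first, each row padded to a whole byte.
    """
    packed = bytearray()
    for row in gray_rows:
        byte, nbits = 0, 0
        for value in row:
            bit = 1 if value < threshold else 0  # dark pixel -> black
            byte = (byte << 1) | bit
            nbits += 1
            if nbits == 8:
                packed.append(byte)
                byte, nbits = 0, 0
        if nbits:
            # Pad the last partial byte of the row with white (0) bits.
            packed.append(byte << (8 - nbits))
    return bytes(packed)


def pbm_bytes(gray_rows, threshold=128):
    """Return a complete raw PBM file (header + packed pixels) as bytes."""
    height = len(gray_rows)
    width = len(gray_rows[0]) if height else 0
    header = f"P4\n{width} {height}\n".encode("ascii")
    return header + pack_bilevel(gray_rows, threshold)


def bilevel_size_bytes(width_mm, height_mm, dpi):
    """Uncompressed 1-bit size of a page, rows padded to whole bytes."""
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    return ((width_px + 7) // 8) * height_px


if __name__ == "__main__":
    # A4 is 210 x 297 mm (assumed page size for the estimate).
    print(bilevel_size_bytes(210, 297, 300), "bytes for A4 at 300dpi")
```

PBM is uncompressed, so this only shows the ceiling; a compressed bilevel format such as 1-bit PNG or Group 4 TIFF would normally be smaller again for emailing.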