* [gentoo-user] [OT] Saving an image as black and white
From: Wols Lists @ 2021-03-01 11:50 UTC
To: gentoo-user

I've got a bunch of scans, let's assume they're text documents. And
they're rather big ... I want to email them.

How on earth do I convert them to TRUE b&w documents? At the moment they
are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
to store all the colour, luminance, whatever, per pixel. But actually,
there's only ONE BIT of information there - whether that pixel is black
or white.

I'm using imagemagick, but so far all my attempts to strip out the
surplus information have resulted in INcreasing the file size ???

So basically, how do I save an image as "one bit per pixel" like you'd
think you'd send to a B&W printer?

Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
uncompressed info for a page of A4, not 3MB.

Cheers,
Wol
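The back-of-the-envelope estimate in this message can be checked with plain shell arithmetic. This is only a sketch: the helper name `bilevel_bytes` is hypothetical, the A4 dimensions (2480 x 3508 pixels, i.e. 210 x 297 mm at 300 dpi) are an assumption for illustration, and row padding and file headers are ignored.

```shell
# Uncompressed size, in bytes, of a 1-bit-per-pixel image.
# bilevel_bytes is a hypothetical helper; row padding and headers are ignored.
bilevel_bytes() {
    width_px=$1
    height_px=$2
    # 8 pixels fit in one byte; round up to a whole byte
    echo $(( (width_px * height_px + 7) / 8 ))
}

bilevel_bytes 300 300      # one square inch at 300 dpi
bilevel_bytes 2480 3508    # assumed A4 page at 300 dpi
```

Under those assumptions a square inch comes to about 11 KB and a full A4 page to a little over 1 MB, somewhat above the 800KB guess but still well under 3MB.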
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Hund @ 2021-03-01 12:01 UTC
To: gentoo-user

On March 1, 2021 12:50:35 PM GMT+01:00, Wols Lists <antlists@youngman.org.uk> wrote:
>I've got a bunch of scans, let's assume they're text documents. And
>they're rather big ... I want to email them.
>
>How on earth do I convert them to TRUE b&w documents? At the moment they
>are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>to store all the colour, luminance, whatever, per pixel. But actually,
>there's only ONE BIT of information there - whether that pixel is black
>or white.
>
>I'm using imagemagick, but so far all my attempts to strip out the
>surplus information have resulted in INcreasing the file size ???
>
>So basically, how do I save an image as "one bit per pixel" like you'd
>think you'd send to a B&W printer?
>
>Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>uncompressed info for a page of A4, not 3MB.
>
>Cheers,
>Wol

Have you tried an optical character recognition software like
Tesseract[1]?

1. https://github.com/tesseract-ocr/tesseract

--
Hund
* [gentoo-user] Re: [OT] Saving an image as black and white
From: Nuno Silva @ 2021-03-01 12:11 UTC
To: gentoo-user

On 2021-03-01, Wols Lists wrote:

> I've got a bunch of scans, let's assume they're text documents. And
> they're rather big ... I want to email them.
>
> How on earth do I convert them to TRUE b&w documents? At the moment they
> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
> to store all the colour, luminance, whatever, per pixel. But actually,
> there's only ONE BIT of information there - whether that pixel is black
> or white.
>
> I'm using imagemagick, but so far all my attempts to strip out the
> surplus information have resulted in INcreasing the file size ???
>
> So basically, how do I save an image as "one bit per pixel" like you'd
> think you'd send to a B&W printer?
>
> Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
> uncompressed info for a page of A4, not 3MB.
>
> Cheers,
> Wol

Somebody else might have a better suggestion, or perhaps a better
understanding of the JPEG format and of what needs to be tuned, but, for
example:

    convert origin.jpg -threshold 70% -monochrome result.jpg

(And adjust the "-threshold percent" if needed. It might be that you
don't need thresholding at all, but if you do, it apparently must go
before "-monochrome".)

(Depending on the receiving end, you could also explore other
formats. Here, if the scanned document can be stored in monochrome, I
usually use djvu.)

--
Nuno Silva
* Re: [gentoo-user] Re: [OT] Saving an image as black and white
From: Wols Lists @ 2021-03-01 13:17 UTC
To: gentoo-user

On 01/03/21 12:11, (Nuno Silva) wrote:
> Somebody else might have a better suggestion, or perhaps a better
> understanding of the JPEG format and of what needs to be tuned, but, for
> example:
>
> convert origin.jpg -threshold 70% -monochrome result.jpg
>
> (And adjust the "-threshold percent" if needed. It might be that you
> don't need thresholding at all, but if you do, it apparently must go
> before "-monochrome".)
>
> (Depending on the receiving end, you could also explore other
> formats. Here, if the scanned document can be stored in monochrome, I
> usually use djvu.)

Thanks but no, I've already tried that. It makes matters worse!

I've messed about with the scanner, so it is now creating 800KB images,
but I don't want to rescan everything I've done.

The problem is that it is clearly saving the images as greyscale, not as
black&white. And when I search for help, what I want is swamped by all
the false positives for greyscale.

Oh - and for Nuno - sorry tesseract is no use, they are NOT text. That's
why I used the word "assume" - to make it clear that I want a
1-bit/pixel palette, not a 5-byte/pixel greyscale.

Cheers,
Wol
* [gentoo-user] Re: [OT] Saving an image as black and white
From: Nuno Silva @ 2021-03-01 12:48 UTC
To: gentoo-user

On 2021-03-01, Wols Lists wrote:

> Thanks but no, I've already tried that. It makes matters worse!
>
> I've messed about with the scanner, so it is now creating 800KB images,
> but I don't want to rescan everything I've done.
>
> The problem is that it is clearly saving the images as greyscale, not as
> black&white. And when I search for help, what I want is swamped by all
> the false positives for greyscale.
>
> Oh - and for Nuno - sorry tesseract is no use, they are NOT text. That's
> why I used the word "assume" - to make it clear that I want a
> 1-bit/pixel palette, not a 5-byte/pixel greyscale.
>
> Cheers,
> Wol

Sorry, my bad - I was checking the file sizes, but I didn't notice the
larger one was the new, "monochrome" version.

More coffee needed, it seems.

--
Nuno Silva
* Re: [gentoo-user] Re: [OT] Saving an image as black and white
From: William Kenworthy @ 2021-03-01 13:24 UTC
To: gentoo-user

Save/convert to PDF, then use gs from Ghostscript to convert them (I use
ebook for the target), which gives a 10-20x reduction in size with only a
small reduction in quality - perfect for emailing. I don't have the
actual command string, but I originally found the suggestion via Google.

BillK

On 1/3/21 9:17 pm, Wols Lists wrote:
> Thanks but no, I've already tried that. It makes matters worse!
>
> I've messed about with the scanner, so it is now creating 800KB images,
> but I don't want to rescan everything I've done.
>
> The problem is that it is clearly saving the images as greyscale, not as
> black&white. And when I search for help, what I want is swamped by all
> the false positives for greyscale.
>
> Oh - and for Nuno - sorry tesseract is no use, they are NOT text. That's
> why I used the word "assume" - to make it clear that I want a
> 1-bit/pixel palette, not a 5-byte/pixel greyscale.
>
> Cheers,
> Wol
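BillK did not give the command, so the following is only a sketch of a commonly used Ghostscript invocation, not necessarily the one he used. It assumes the scans have already been saved as a PDF; `shrink_pdf` and the file names are hypothetical, and the `GS` variable is just an illustrative hook so the command can be swapped out or dry-run.

```shell
# Re-distill a PDF with Ghostscript's "ebook" preset.
# shrink_pdf is a hypothetical wrapper; set GS=echo to print the
# arguments instead of running Ghostscript.
shrink_pdf() {
    in_pdf=$1
    out_pdf=$2
    ${GS:-gs} -sDEVICE=pdfwrite -dPDFSETTINGS=/ebook \
        -dNOPAUSE -dBATCH -dQUIET \
        -sOutputFile="$out_pdf" "$in_pdf"
}

# Example: shrink_pdf scans.pdf scans-small.pdf
```

The preset trades resolution for size, so it is worth checking that the result is still legible before sending it.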
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Neil Bothwick @ 2021-03-01 13:48 UTC
To: gentoo-user

On Mon, 1 Mar 2021 11:50:35 +0000, Wols Lists wrote:

> I've got a bunch of scans, let's assume they're text documents. And
> they're rather big ... I want to email them.
>
> How on earth do I convert them to TRUE b&w documents? At the moment they
> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
> to store all the colour, luminance, whatever, per pixel. But actually,
> there's only ONE BIT of information there - whether that pixel is black
> or white.
>
> I'm using imagemagick, but so far all my attempts to strip out the
> surplus information have resulted in INcreasing the file size ???
>
> So basically, how do I save an image as "one bit per pixel" like you'd
> think you'd send to a B&W printer?

    $ convert input.jpg -threshold 50% output.png

should do it, you may need to play with the threshold setting. The file
command reports the output file as being "1-bit grayscale".

You can also use -monochrome but that will produce a dithered image,
that's probably not what you want judging by your description.

--
Neil Bothwick

If we aren't supposed to eat animals, why are they made of meat?
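To apply the command from this message to a whole directory of existing scans, a loop along these lines may help. It is a sketch only: `to_bilevel`, the `.jpg` suffix and the `CONVERT` override are assumptions for illustration; only `convert ... -threshold ... output.png` comes from the message itself.

```shell
# Threshold every JPEG in a directory and write a PNG next to it.
# to_bilevel is a hypothetical helper; set CONVERT=echo for a dry run.
to_bilevel() {
    src_dir=$1
    threshold=${2:-50%}
    for f in "$src_dir"/*.jpg; do
        [ -e "$f" ] || continue        # no matches: skip the literal pattern
        out="${f%.jpg}.png"
        ${CONVERT:-convert} "$f" -threshold "$threshold" "$out"
    done
}

# Example: to_bilevel ./scans 60%
# Afterwards, `file ./scans/*.png` should show what bit depth was written.
```

Writing to a new `.png` file keeps the original scans intact, so a different threshold can be tried without rescanning.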
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Rich Freeman @ 2021-03-01 14:22 UTC
To: gentoo-user

On Mon, Mar 1, 2021 at 8:48 AM Neil Bothwick <neil@digimed.co.uk> wrote:
>
> should do it, you may need to play with the threshold setting. The file
> command reports the output file as being "1-bit grayscale".
>
> You can also use -monochrome but that will produce a dithered image,
> that's probably not what you want judging by your description.

Keep in mind that your starting image might not be 1-bit. You might be
scanning in greyscale, which is probably 8-bit. Nothing wrong with
converting to 1-bit, but in that case you would be throwing away detail.
If you plan to do any processing of the file you might want to do that
before throwing out the detail. You also may or may not want the
threshold to be 50%.

Also, as some are starting to hit on, jpeg may or may not be an ideal
format depending on what you're scanning. It was designed for
photographs, and it doesn't really cope well with sharp edges unless you
use very high quality levels.

I don't want to offer too much advice beyond that as I don't really deal
with document scanning at any kind of scale where I get concerned with
this stuff - defaults are almost always fine for me. I'm sure the right
format and process would depend a bit on what you intend to do with the
files.

--
Rich
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Wols Lists @ 2021-03-01 15:54 UTC
To: gentoo-user

On 01/03/21 13:48, Neil Bothwick wrote:
> $ convert input.jpg -threshold 50% output.png
>
> should do it, you may need to play with the threshold setting. The file
> command reports the output file as being "1-bit grayscale".
>
> You can also use -monochrome but that will produce a dithered image,
> that's probably not what you want judging by your description.

FINALLY!

Thanks, that worked! Okay, I also adjusted the dpi because the original
scan was 600 and I've reduced it to 300, but this has reduced the file
size from 3MB to 180KB.

Dunno why, but everything I was trying was INcreasing the file size :-(

And the png does make a massive difference - the same command with jpg
output is 1.7MB - so why is my scanner chucking out 800KB jpegs if I set
it correctly?

Cheers,
Wol
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Rich Freeman @ 2021-03-01 18:00 UTC
To: gentoo-user

On Mon, Mar 1, 2021 at 10:54 AM Wols Lists <antlists@youngman.org.uk> wrote:
>
> And the png does make a massive difference - the same command with jpg
> output is 1.7MB - so why is my scanner chucking out 800KB jpegs if I set
> it correctly?

jpeg quality is adjustable. You can output a jpeg file of almost any
size. Software less geared towards image editing may not actually let
you set the quality level, but the software IS using one. So, two
programs could output the same file at different sizes.

The smaller you make the file, the lower the quality. This does have
diminishing returns - as you approach maximum quality you increase the
size greatly with very little difference in visual quality.

Of course, if you try to convert that 1.7MB jpeg into a 30kb jpeg,
you'll probably notice the difference. This is why this is a meme:
http://needsmorejpeg.com/

--
Rich
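The quality setting described in this message can be explored with ImageMagick's `-quality` option. The sketch below writes the same image at several quality levels so the resulting sizes can be compared; `quality_sweep`, the output naming scheme and the `CONVERT` override are illustrative assumptions, not anything from the thread.

```shell
# Write one JPEG per requested quality level, named <base>-q<level>.jpg.
# quality_sweep is a hypothetical helper; set CONVERT=echo for a dry run.
quality_sweep() {
    src=$1
    shift
    for q in "$@"; do
        out="${src%.*}-q$q.jpg"
        ${CONVERT:-convert} "$src" -quality "$q" "$out"
    done
}

# Example: quality_sweep scan.jpg 30 60 90 && ls -l scan-q*.jpg
```

Listing the outputs side by side gives a feel for how quickly the size grows towards the top of the quality range.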
* Re: [gentoo-user] [OT] Saving an image as black and white
From: Frank Steinmetzger @ 2021-03-04 20:03 UTC
To: gentoo-user

On Mon, Mar 01, 2021 at 03:54:12PM +0000, Wols Lists wrote:

> FINALLY!
>
> Thanks, that worked! Okay, I also adjusted the dpi because the original
> scan was 600 and I've reduced it to 300, but this has reduced the file
> size from 3MB to 180KB.

Also note: DPI is just a factor that is stored in the image’s metadata.
What produces the actual filesize are the pixels. DPI is used to “convert”
between the physical size of a hypothetical print (i.e. sheet of paper) and
the number of pixels required for a certain density (and thus, quality).

As far as I know, jpeg does not have a special “grayscale mode”. You may
have reduced the information of the image by making all three colour
channels equal to one another, but jpeg still encodes the data as if it
were a colour image. That’s why png is the much better option in this case.

--
Gruß | Greetings | Qapla’
Please do not share anything from, with or about me on any social network.

UNIX is not user-unfriendly.
It just expects the user to be a little more computer-friendly.
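The relationship between print size, DPI and pixel count described in this message can be illustrated with a little shell arithmetic. This is a sketch under stated assumptions: `mm_to_px` is a hypothetical helper and A4 (210 x 297 mm) is an assumed page size. Changing only the DPI tag leaves the pixels untouched; it is resampling to fewer pixels that shrinks the file.

```shell
# Pixels needed to cover a length in millimetres at a given DPI
# (25.4 mm per inch, rounded to the nearest pixel).
# mm_to_px is a hypothetical helper.
mm_to_px() {
    mm=$1
    dpi=$2
    echo $(( (mm * dpi * 10 + 127) / 254 ))
}

# Assumed A4 page at the two resolutions mentioned in the thread
for res in 600 300; do
    w=$(mm_to_px 210 "$res")
    h=$(mm_to_px 297 "$res")
    echo "$res dpi: ${w}x${h} = $(( w * h )) pixels"
done
```

Halving the resolution halves each dimension, so the resampled page holds roughly a quarter of the pixels.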