* [gentoo-dev] New global USE flag: gzip-dict
From: Peter Volkov @ 2008-12-16 19:21 UTC
To: gentoo-dev
Hello.
Some time ago I modified stardict.eclass and added an optional feature,
controlled by the 'gzip' USE flag, to compress index and dict data
files. I realized too late that I need to document this USE flag
somewhere, and since it does the same thing for all stardict-*
dictionaries (heh, more than 5 packages...) I'm going to add it as a
global USE flag. Also, since a gzip USE flag already exists in
x11-misc/openclipart, I'll rename 'gzip' to 'gzip-dict'. So if there are
no objections I'll add the new 'gzip-dict' global USE flag in 2-3 days
from now.
--
Peter.
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Ciaran McCreesh @ 2008-12-16 19:27 UTC
To: gentoo-dev
On Tue, 16 Dec 2008 22:21:16 +0300
Peter Volkov <pva@gentoo.org> wrote:
> Some time ago I've modified stardict.eclass and added optional
> possibility based on 'gzip' USE flag to compress index and dict data
> files. But I realized too late that I need to document this USE flag
> somewhere, and since it'll do similar things for all stardict-*
> dictionaries (heh, more than 5 packages...) I'm going to add it as
> global USE flag. Also since gzip USE flag already exist in
> x11-misc/openclipart I'll change 'gzip' to 'gzip-dict'. So if there
> will be no objections I'll add new 'gzip-dict' global USE flag in 2-3
> days from now.
What's the point of having this as an option at all? Is it really
something that affects the end user in any way? Or is it just
gratuitous choisiosity?
--
Ciaran McCreesh
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Doug Goldstein @ 2008-12-16 19:57 UTC
To: gentoo-dev
Ciaran McCreesh wrote:
> On Tue, 16 Dec 2008 22:21:16 +0300
> Peter Volkov <pva@gentoo.org> wrote:
>
>> Some time ago I've modified stardict.eclass and added optional
>> possibility based on 'gzip' USE flag to compress index and dict data
>> files. But I realized too late that I need to document this USE flag
>> somewhere, and since it'll do similar things for all stardict-*
>> dictionaries (heh, more than 5 packages...) I'm going to add it as
>> global USE flag. Also since gzip USE flag already exist in
>> x11-misc/openclipart I'll change 'gzip' to 'gzip-dict'. So if there
>> will be no objections I'll add new 'gzip-dict' global USE flag in 2-3
>> days from now.
>>
>
> What's the point of having this as an option at all? Is it really
> something that affects the end user in any way? Or is it just
> gratuitous choisiosity?
>
>
I happen to be in agreement here. gzip is a quick process, especially
with a separate index file which would point to a specific section in
the dict to uncompress. Assuming they've coded it right, it should
barely be noticeable in the grand scheme of things.
If this is not the case at all, and it in fact for some odd reason
requires additional deps and requires uncompressing huge files in memory
such that low-memory systems can't handle it, then I'd be in favor of a
USE flag. But otherwise, it seems like less maintenance for you and less
user confusion to make it the default.
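The point about a separate index pointing at a specific section of the dict can be illustrated with a minimal sketch. It compresses data in independent chunks and inflates only the chunks a lookup touches. This is not the dictzip or stardict on-disk format; `CHUNK_SIZE`, `compress_chunked` and `read_range` are hypothetical names chosen for the illustration.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen for illustration


def compress_chunked(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Compress data in independent chunks; return (blob, offsets).

    offsets[i] is the (start, length) of compressed chunk i inside blob.
    """
    parts, offsets, pos = [], [], 0
    for i in range(0, len(data), chunk_size):
        comp = zlib.compress(data[i:i + chunk_size])
        offsets.append((pos, len(comp)))
        parts.append(comp)
        pos += len(comp)
    return b"".join(parts), offsets


def read_range(blob, offsets, start, length, chunk_size=CHUNK_SIZE):
    """Return data[start:start+length], inflating only the chunks it touches."""
    if length <= 0:
        return b""
    first = start // chunk_size
    last = (start + length - 1) // chunk_size
    out = bytearray()
    for idx in range(first, min(last, len(offsets) - 1) + 1):
        off, clen = offsets[idx]
        out += zlib.decompress(blob[off:off + clen])
    skip = start - first * chunk_size
    return bytes(out[skip:skip + length])
```

With a layout along these lines, the cost of a single lookup depends on the chunk size rather than on the size of the whole dictionary, which is the property the argument above relies on.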
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Peter Volkov @ 2008-12-16 20:06 UTC
To: gentoo-dev
On Tue, 16/12/2008 at 19:27 +0000, Ciaran McCreesh wrote:
> What's the point of having this as an option at all? Is it really
> something that affects the end user in any way?
The reason is that this feature requires an additional dependency on
the app-text/dictd package (the dictzip program is required to compress
dictionary data).
--
Peter.
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Donnie Berkholz @ 2008-12-18 0:34 UTC
To: gentoo-dev
On 23:06 Tue 16 Dec , Peter Volkov wrote:
> On Tue, 16/12/2008 at 19:27 +0000, Ciaran McCreesh wrote:
> > What's the point of having this as an option at all? Is it really
> > something that affects the end user in any way?
>
> The reason is that this feature requires additional dependency on
> app-text/dictd package (to compress dictionary data dictzip program is
> required).
Is that some huge package that takes an unreasonable amount of time to
build or space to install? If not, this doesn't seem like a very
meaningful choice to me.
--
Thanks,
Donnie
Donnie Berkholz
Developer, Gentoo Linux
Blog: http://dberkholz.wordpress.com
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Peter Volkov @ 2008-12-19 14:40 UTC
To: gentoo-dev
On Wed, 17/12/2008 at 16:34 -0800, Donnie Berkholz wrote:
> Is that some huge package that takes an unreasonable amount of time to
> build or space to install?
Probably in this case it takes a reasonable amount of time...
> If not, this doesn't seem like a very meaningful choice to me.
Well, your questions pushed me to investigate gzip performance myself
in one real-life scenario which I hope to use really soon. I took my
Neo FreeRunner and tested gzip decompression speed there. Reading a
10MB file takes about 2.15s, but reading and decompressing it at the
same time takes about 9.90s, so reading compressed files is roughly 4.6
times slower. Since stardict reads all index files on each startup, and
the index files of my dictionaries currently occupy more than 20M, this
makes startup much longer. I have not checked runtime performance. It'll
be affected too but probably not by much, since stardict will decompress
only the required parts of the data. In any case gzip is not free, and
it's better to have it optional.
--
Peter.
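A measurement of this kind can be approximated with a minimal sketch that writes the same payload raw and gzipped and times reading each back. It is not the procedure used on the FreeRunner, which the message does not describe; the file names and the `timed_read` and `compare` helpers are hypothetical.

```python
import gzip
import os
import tempfile
import time


def timed_read(path, opener):
    """Read the whole file via opener(path, 'rb'); return (data, seconds)."""
    t0 = time.perf_counter()
    with opener(path, "rb") as f:
        data = f.read()
    return data, time.perf_counter() - t0


def compare(payload: bytes):
    """Write payload raw and gzipped, then time reading each copy back."""
    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "sample.idx")     # hypothetical names
        gz_path = os.path.join(tmp, "sample.idx.gz")
        with open(raw_path, "wb") as f:
            f.write(payload)
        with gzip.open(gz_path, "wb") as f:
            f.write(payload)
        raw, t_raw = timed_read(raw_path, open)
        unz, t_gz = timed_read(gz_path, gzip.open)
    assert raw == unz  # both paths must yield identical bytes
    return t_raw, t_gz


if __name__ == "__main__":
    sample = b"word\tdefinition text\n" * 500_000  # synthetic index-like data
    t_raw, t_gz = compare(sample)
    print(f"raw read: {t_raw:.3f}s, read + gunzip: {t_gz:.3f}s")
```

Files that were just written are usually served from the page cache, so a run like this mostly measures CPU cost; on a real device the storage read time would need to be measured with cold caches.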
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Ciaran McCreesh @ 2008-12-19 14:45 UTC
To: gentoo-dev
On Fri, 19 Dec 2008 17:40:41 +0300
Peter Volkov <pva@gentoo.org> wrote:
> Well, your questions forced me to do my own investigation of gzip
> performance in one real-life scenario which I hope to use really
> soon. I took my Neo FreeRunner and tested gzip decompression speed
> there. Time to read 10Mb file is about ~ 2.15s. But if I need to read
> and decompress it at the same time it takes ~ 9.90s. So this makes
> times slower read of compressed files. Since stardict reads all index
> files on each startup and currently index files of my dictionaries
> occupy more than 20M this makes much longer startup time. I have not
> checked runtime performance. It'll be affected too but, probably, not
> too much since stardict will decompress only required parts of data.
> But anyway gzip is not free and it's better to have it optional.
If it reads (and presumably uncompresses) all of them at startup
anyway, what's the point in compressing them at all?
--
Ciaran McCreesh
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Peter Volkov @ 2008-12-19 16:56 UTC
To: gentoo-dev
On Fri, 19/12/2008 at 14:45 +0000, Ciaran McCreesh wrote:
> If it reads (and presumably uncompresses) all of them at startup
> anyway, what's the point in compressing them at all?
It makes the size smaller: both index and data files are text files, so
compression is very effective. All distributions I've checked compress
data files, and some compress both data and index. Probably all desktop
users want dictionaries to be compressed because modern CPUs are really
fast at decompression, and even on my 4-year-old notebook it takes less
than a second... But there are still environments where it's better to
keep dictionaries uncompressed. That's why I want to keep this feature
optional.
--
Peter.
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Ciaran McCreesh @ 2008-12-19 17:06 UTC
To: gentoo-dev
On Fri, 19 Dec 2008 19:56:02 +0300
Peter Volkov <pva@gentoo.org> wrote:
> On Fri, 19/12/2008 at 14:45 +0000, Ciaran McCreesh wrote:
> > If it reads (and presumably uncompresses) all of them at startup
> > anyway, what's the point in compressing them at all?
>
> It makes size smaller: both index and data files are text files so
> compression is very effective. All distributions I've checked compress
> data files, some compress both data and index. Probably all desktop
> users want dictionaries to be compressed because modern cpu's are
> really fast in decompression and even on my 4-years old notebook it
> takes less then second... But still there are environments where it's
> better to keep dictionaries uncompressed. That's why I want to keep
> this feature optional.
But disk space is cheap. How big are the dictionaries? The vim
dictionaries are around half a meg uncompressed, and if you're looking
to save a meg or two in disk space on the kind of system that includes
dictionaries then you're doing something seriously wrong...
Really, all that compression seems to do is save a small amount of
irrelevant disk space, at the cost of requiring more disk space and
memory for a new library and slowing things down to a level that's
unacceptable on some systems. Compression makes sense for network
transfers, backups and file formats that do their own domain specific
compression. Elsewhere? Likely not so much.
--
Ciaran McCreesh
* Re: [gentoo-dev] New global USE flag: gzip-dict
From: Peter Volkov @ 2008-12-19 17:32 UTC
To: gentoo-dev
On Fri, 19/12/2008 at 17:06 +0000, Ciaran McCreesh wrote:
> But disk space is cheap. How big are the dictionaries? The vim
> dictionaries are around half a meg uncompressed, and if you're looking
> to save a meg or two in disk space on the kind of system that includes
> dictionaries then you're doing something seriously wrong...
The size is several times larger. All the dictionary data (without
indexes) I currently have installed occupies 93M in compressed form;
uncompressed it would take 402M. This does not count the dictionaries
I'm going to add to the tree. If I remember correctly, all the
dictionaries I needed from the stardict site took about 1 Gbyte
(uncompressed). Also, some people use more than two languages, and
they'll need even more dictionaries.
> Really, all that compression seems to do is save a small amount of
> irrelevant disk space, at the cost of requiring more disk space and
> memory for a new library and slowing things down to a level that's
> unacceptable on some systems. Compression makes sense for network
> transfers, backups and file formats that do their own domain specific
> compression. Elsewhere? Likely not so much.
I agree in general but in this specific case compression does a good
job.
--
Peter.
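The figures quoted above (93M compressed, 402M uncompressed) work out as follows. This is plain arithmetic on the numbers from the message; `compression_summary` is a hypothetical helper name.

```python
def compression_summary(compressed_mb: float, uncompressed_mb: float):
    """Return (ratio, megabytes saved, percent saved) for a compressed data set."""
    ratio = uncompressed_mb / compressed_mb
    saved_mb = uncompressed_mb - compressed_mb
    saved_pct = 100.0 * saved_mb / uncompressed_mb
    return ratio, saved_mb, saved_pct


ratio, saved_mb, saved_pct = compression_summary(93, 402)
print(f"{ratio:.1f}x smaller, {saved_mb:.0f}M saved ({saved_pct:.0f}%)")
# prints "4.3x smaller, 309M saved (77%)"
```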
* [gentoo-dev] Re: New global USE flag: gzip-dict
From: Duncan @ 2008-12-20 0:16 UTC
To: gentoo-dev
Peter Volkov <pva@gentoo.org> posted
1229707964.13304.1334.camel@localhost, excerpted below, on Fri, 19 Dec
2008 20:32:44 +0300:
> On Fri, 19/12/2008 at 17:06 +0000, Ciaran McCreesh wrote:
>> But disk space is cheap. How big are the dictionaries? The vim
>> dictionaries are around half a meg uncompressed, and if you're looking
>> to save a meg or two in disk space on the kind of system that includes
>> dictionaries then you're doing something seriously wrong...
>
> Size is times larger. All dictionary data (without index) I have
> currently installed occupies 93M in compressed form and uncompressed
> it'll take 402M. This does not count dictionaries I'm going to add into
> the tree. If I remember correctly all dictionaries I needed from
> stardict site took about 1Gbyte (uncompressed). Also some people use
> more then two languages and then they'll use more dictionaries.
I believe this is all people have been asking, really. For a gig of
data, compression to under a couple hundred megs sounds worthwhile. For
a hundred megs, compression to twenty megs, or even ten or five, not so
much, as on the fast machines a hundred megs or so of space shouldn't be
an issue, while on the slow machines, the decompression latency isn't
tolerable. But a gig of space (or even half a gig)... that's rather
different as there are still a decent number of people for whom that's 1%
or more of their total, who may be willing to take that latency as they
have better things to do with the space.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: [gentoo-dev] Re: New global USE flag: gzip-dict
From: Mart Raudsepp @ 2008-12-20 1:04 UTC
To: gentoo-dev
On Sat, 2008-12-20 at 00:16 +0000, Duncan wrote:
> But a gig of space (or even half a gig)... that's rather
> different as there are still a decent number of people for whom that's 1%
> or more of their total, who may be willing to take that latency as they
> have better things to do with the space.
If you are dealing with rotating media, I wouldn't even be so sure that
reading 1GB unpacked is quicker than reading 100-200MB packed. On your
typical desktop computer with rotating HDDs, it could very well be quite
the opposite, as seek and disk access time can be dozens or hundreds of
times slower than a simple zlib inflate, which is a relatively quick
operation on x86.
Now of course on something like the Neo FreeRunner, where we'd likely be
dealing with flash-based media and an embedded CPU (something like
arm7, I guess; no, I don't need to know, I could just google it if I
did), reading from flash is more likely to be quicker.
--
Mart Raudsepp
Gentoo Developer
Mail: leio@gentoo.org
Weblog: http://planet.gentoo.org/developers/leio
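The trade-off between storage throughput and inflate speed can be put into a rough back-of-envelope model. It assumes the read and the inflate happen one after the other and ignores seek time; all throughput figures in the example are hypothetical, not measurements from this thread, and `load_seconds` and `compression_wins` are illustrative names.

```python
def load_seconds(size_mb, read_mb_per_s, inflate_mb_per_s=None, ratio=1.0):
    """Estimate the time to load size_mb of (uncompressed) dictionary data.

    ratio is uncompressed size / compressed size. inflate_mb_per_s is the
    rate at which the CPU produces uncompressed output; None means the data
    is stored uncompressed and only the read cost applies.
    """
    if inflate_mb_per_s is None:
        return size_mb / read_mb_per_s
    return (size_mb / ratio) / read_mb_per_s + size_mb / inflate_mb_per_s


def compression_wins(read_mb_per_s, inflate_mb_per_s, ratio):
    """True if loading the compressed copy is faster under this model."""
    packed = load_seconds(1.0, read_mb_per_s, inflate_mb_per_s, ratio)
    plain = load_seconds(1.0, read_mb_per_s)
    return packed < plain


# Hypothetical desktop: slow disk, fast CPU.
print(compression_wins(read_mb_per_s=20, inflate_mb_per_s=200, ratio=4.3))
# Hypothetical embedded device: storage outpaces the CPU's inflate rate.
print(compression_wins(read_mb_per_s=20, inflate_mb_per_s=5, ratio=4.3))
```

Under this model compression pays off whenever the inflate rate exceeds `read * ratio / (ratio - 1)`; adding seek cost for rotating media would only favour the smaller compressed file further.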