* [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Kurt Lieber @ 2003-07-23 19:48 UTC
To: gentoo-dev
4 replies; 25+ messages in thread

Folks --

One issue that came out of today's manager meeting was the amount of space
that GRP and some of the new livecds will require for 1.4_release. As I
understand it, that number hovers right around 10GB at the moment.

We have already received numerous complaints from our mirror admins about
the amount of disk space we chew up now. For reference, here is a break
down:

9.7G    ./releases
139M    ./snapshots
17G     ./distfiles
6.6G    ./experimental

We can clean out some space in the /releases directory by deleting some of
the old 1.4_rc* stuff, but we will still have a significant overall size
increase as a part of 1.4_final. We will be working on reducing the size
of /distfiles as well, but this will be nothing more than a temporary,
tactical fix. We still need a longer term strategic fix.

I'd like to come up with a solution that allows each arch the flexibility
it requires to create new livecds, GRP packages, etc. But I'd also like to
be mindful and respectful of our mirrors. I'd like to hear suggestions
from you all on the best way to achieve this. Some ideas that I've heard
so far include:

* creating separate directories that are optional for our mirrors to
  support. GRP would be a prime candidate for this given the space it
  requires. Mirrors short on disk space could choose not to mirror these
  files.

* assigning each arch a quota for the files it may use outside of
  /distfiles. This quota would apply to liveCDs, things in /experimental
  as well as everything within /releases (as well as /grp if we make that
  a separate top level directory). To be effective, this quota would need
  to be around the ~3GB level given the number of arches we support.

Those are simply two ideas that have been proposed so far. I'm sure you
guys will have more. The only requirements at this point are:

* We cannot expect our system of mirrors to give us an unlimited amount of
  disk space. (they've already started to complain)

* We cannot afford to lose many mirrors. Especially in North America,
  source mirrors are in short supply.

* We must expect /distfiles to continue growing as we continue to add new
  ebuilds. So solutions involving "reducing the size of /distfiles" will
  not be feasible except as a short-term, stop-gap measure.

Thoughts? Ideas?

--kurt

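To put rough numbers on the breakdown above, here is a small back-of-the-envelope sketch. It assumes 1G = 1000M for simplicity, treats the "right around 10GB" figure as a plain addition before any cleanup of old 1.4_rc* files, and takes the number of arches as a hypothetical parameter, since the message does not state it.

```python
# Current mirror usage from the breakdown above, in megabytes
# (assumption: 1G is treated as 1000M to keep the arithmetic simple).
CURRENT_MB = {
    "releases": 9700,
    "snapshots": 139,
    "distfiles": 17000,
    "experimental": 6600,
}

# "right around 10GB" for GRP and the new livecds in 1.4_release.
RELEASE_1_4_MB = 10000


def total_mb(usage):
    """Sum the per-directory usage figures."""
    return sum(usage.values())


def projected_mb(usage, extra_mb):
    """Total after adding new release files, before any cleanup."""
    return total_mb(usage) + extra_mb


def quota_total_mb(num_arches, quota_mb=3000):
    """Worst-case space outside /distfiles if every arch fills its quota.

    num_arches is a hypothetical input; quota_mb defaults to the ~3GB
    level mentioned in the message.
    """
    return num_arches * quota_mb


if __name__ == "__main__":
    print("current total (MB):", total_mb(CURRENT_MB))
    print("with 1.4 files (MB):", projected_mb(CURRENT_MB, RELEASE_1_4_MB))
```

Under these assumptions a full mirror holds roughly 33GB today and would grow to roughly 43GB with the 1.4 files, which is the scale of increase the mirror admins would be asked to absorb.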
* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Alvaro Figueroa Cabezas @ 2003-07-23 8:40 UTC
To: gentoo-dev

On Jul 23 15:48, Kurt Lieber wrote:
> I'm sure you guys will have more.

Using jigdo instead of distributing the ISOs.

This can have lots of benefits in other areas (e.g. it's far easier to
distribute an ISO that only has (e.g.) X and GNOME, and no X; it will save
bandwidth; it can be made to distribute the downloads between several
mirrors, etc.), but I know that if this idea is considered, this thread
will be quite long.

I want to encourage people to go and _make_ an ISO[0] before saying this
is a bad idea. 10 minutes of experience will give a nice perspective on
the matter.

[0] You can make your own ISO, or you can... I don't know... fetch a
Debian ISO or something

--
Alvaro Figueroa

--
gentoo-dev@gentoo.org mailing list

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Kurt Lieber @ 2003-07-23 21:01 UTC
To: gentoo-dev

On Wed, Jul 23, 2003 at 02:40:39PM +0600 or thereabouts, Alvaro Figueroa Cabezas wrote:
> Using jigdo instead of distributing the ISOs.

I like the idea of jigdo and think it's a great idea for saving bandwidth,
but I don't necessarily see how this will save disk space, which is our
primary issue right now.

Take GRP stuff, for instance -- we're releasing multiple sets, optimized
for each architecture. So the binary for KDE on P4 is not the same as the
binary for KDE on Athlon XP -- two separate copies need to exist.

--kurt

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Alvaro Figueroa Cabezas @ 2003-07-23 9:28 UTC
To: gentoo-dev

On Jul 23 17:01, Kurt Lieber wrote:
> On Wed, Jul 23, 2003 at 02:40:39PM +0600 or thereabouts, Alvaro Figueroa Cabezas wrote:
> > Using jigdo instead of distributing the ISOs.
>
> I like the idea of jigdo and think it's a great idea for saving bandwidth,
> but I don't necessarily see how this will save disk space, which is our
> primary issue right now.

I must have this wrong. I thought that there were GRP tar files both in
the repository and inside the ISO images.

The stage3 tar file is the stage2 plus a few more packages, right? Well,
then this could be managed with jigdo as well. Let's remember that jigdo
isn't only for ISO images.

--
Alvaro Figueroa

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Alvaro Figueroa Cabezas @ 2003-07-23 9:30 UTC
To: gentoo-dev

On Jul 23 15:28, Alvaro Figueroa Cabezas wrote:
> I must have this wrong. I thought that there were GRP tar files both in
> the repository and inside the ISO images.

Ohh, no, no wait. It is like I said... I think :).

http://distro.ibiblio.org/pub/linux/distributions/gentoo/releases/1.4_rc4/sparc/sparc64/stage3-sparc64-1.4_rc4.tar.bz2

is inside

http://distro.ibiblio.org/pub/linux/distributions/gentoo/releases/1.4_rc4/sparc/sparc64/livecd/gentoo-sparc64-1.4_rc4-2.iso?

--
Alvaro Figueroa

* [gentoo-dev] Re: The release of 1.4 and its impact on our mirrors
From: Pieter Van den Abeele @ 2003-07-24 0:11 UTC
To: Alvaro Figueroa Cabezas
Cc: gentoo-dev

On 23/07/03 15:28 +0600, Alvaro Figueroa Cabezas wrote:
> On Jul 23 17:01, Kurt Lieber wrote:
> > On Wed, Jul 23, 2003 at 02:40:39PM +0600 or thereabouts, Alvaro Figueroa Cabezas wrote:
> > > Using jigdo instead of distributing the ISOs.
> >
> > I like the idea of jigdo and think it's a great idea for saving bandwidth,
> > but I don't necessarily see how this will save disk space, which is our
> > primary issue right now.
>
> I must have this wrong. I thought that there were GRP tar files both in
> the repository and inside the ISO images.
>
> The stage3 tar file is the stage2 plus a few more packages, right? Well,
> then this could be managed with jigdo as well. Let's remember that jigdo
> isn't only for ISO images.

The GRP CDs can be pre-ordered from our store; they allow one to set up a
complete, optimized Gentoo Linux *very fast* on a system that need not
even be connected to the net. These two-disc GRP sets for each CPU (right
now x86 and ppc, no word yet from sparc or the others) contain (in the ppc
case) some distfiles, prebuilt packages for the most commonly used
software, and stages.

A GRP CD is a small livecd with some prebuilt packages.

We know that:

livecd   = bootloader + kernel with uncompressor module + initrd + compressed(live-env)
live-env = stage3 + extra packages
stage3   = stage2 + extra packages
stage2   = stage1 + bootstrap

so we could give you a small script (emerge livecd-ng) that, given a 10M
stage1, builds a complete livecd or a GRP CD etc., but that script has an
'I need a net connection to download stuff' dependency and an 'I need a
week of compile time, unless you've got 20 machines lying around doing
nothing' dependency.

I think jigdo has the same problem; it has an 'I need a network'
dependency. It might not have an 'I need cpu time' dependency, but we
would need to make the prebuilt packages available (or have jigdo compile
from source). With Debian I can understand it, since those packages are
available as .deb, but with Gentoo, ebuilds have to be compiled (which is
what makes, imho, Gentoo so easy to port to other
platforms/architectures).

I cannot suggest a direct solution to this problem. p2p is the first thing
that comes to mind, but we'll have to look into that first to check
whether that solution is applicable to our problem. (Safety/security is an
important issue here.)

A solution that I think might be worth investigating is 'cloning' (the
idea comes from the prototype-based (instead of OO) programming languages,
such as Sun SELF):

--
Instead of releasing stages and several types of livecds on every release,
we could make and release only one livecd for each architecture Gentoo
Linux runs on. The purpose of our livecds is to give people an idea of
what Gentoo looks/feels like when it is installed. So, basically, the live
environment on the livecd is (or should be) the same environment you would
have after installing Gentoo to your hard drive without manually modifying
it. The only difference is that the live environment on the CD is
read-only, while the one on your hard drive isn't, of course. Since
nothing has been manually modified, one can call the live environment
clean. (No user modified files, no user added files, ...)

Portage does merging and unmerging of software: this means that should you
copy a clean live environment from a livecd onto a hard disk, you could
downgrade it to a clean state (stage0), or to a stage2, a stage3, or just
unmerge one package. You could also 'add' or replace software: add an
extra build to the live environment (this requires a net connection)...
Since portage knows what a clean live environment is (it keeps a record of
the files it installs for each package), this means that given a
'polluted' environment it should be able to migrate to a clean
environment. Heck, portage is even capable of creating a GRP package from
a given Gentoo environment.

What I'm suggesting is the following: suppose Gentoo releases a livecd and
keeps updating the environment on this livecd at a regular interval (mount
the live env, emerge --update world). If we figure out a way to clone the
live environment of a livecd onto a hard disk, and the other way around,
there would be no need for GRP, nor stages, only a live environment (a
livecd). The only thing that we need to have is a huge live environment
(CD) on major releases and a small live environment on smaller releases. I
can prove that this cloning system can support everything we support now
(custom CFLAGS...). And we would no longer need to provide people with
cpu-optimized stuff, as they will be able to provide that themselves. We
could still have an opportunity to build optimized live environments, but
users could do that themselves too. (A big environment can be downgraded
to a small one and upgraded to a bigger, optimized one without much
difficulty.)

Conclusion:

I think that instead of releasing a livecd (= stage1 + extra stuff) with a
set of stages (stage1, stage1 + extra stuff, stage1 + bootstrap + extra
stuff) and some extra stuff as a GRP CD, we could release one big livecd
(= stage1 + a lot of extra stuff) if we enhanced our package manager. I
see no drawbacks, apart from having to revise our current method of
installation. Imho installation stays as educational and powerful as it is
now. Maybe faster and better to some.

Note that this is only an idea, and not meant to be something that I
absolutely want.
--

Pieter

--
Pieter Van den Abeele
Gentoo Linux
http://www.gentoo.org/~pvdabeel/
Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xF238673E
Key fingerprint = F29C C550 54CD 1196 6723 EDC3 9B0D 4EA7 F238 673E

* Re: [gentoo-dev] Re: The release of 1.4 and its impact on our mirrors
From: Nathaniel McCallum @ 2003-07-24 0:55 UTC
To: Pieter Van den Abeele
Cc: Alvaro Figueroa Cabezas, gentoo-dev

Pieter Van den Abeele <pvdabeel@gentoo.org> writes:
> [real-big-snip of the livecd 'cloning' proposal]
>
> I think that instead of releasing a livecd (= stage1 + extra stuff) with a
> set of stages (stage1, stage1 + extra stuff, stage1 + bootstrap + extra
> stuff) and some extra stuff as a GRP CD, we could release one big livecd
> (= stage1 + a lot of extra stuff) if we enhanced our package manager. I
> see no drawbacks, apart from having to revise our current method of
> installation. Imho installation stays as educational and powerful as it is
> now. Maybe faster and better to some.
>
> Note that this is only an idea, and not meant to be something that I
> absolutely want.

Just want everyone to know that GLIS (http://glis.sf.net) is happy to be
involved with anything that needs help (especially the scripting).

Nathaniel

* [gentoo-dev] Python on the liveCD
From: Nathaniel McCallum @ 2003-07-24 2:07 UTC
To: gentoo-user
Cc: gentoo-dev

GLIS is pondering a rewrite in Python (adding a GUI for the X liveCDs).
The question is, why isn't Python on the liveCD? Is it feasible to get it
included?

Nathaniel

* Re: [gentoo-dev] Python on the liveCD
From: Seemant Kulleen @ 2003-07-24 9:29 UTC
To: gentoo-dev

On Wed, 23 Jul 2003 22:07:37 -0400
"Nathaniel McCallum" <Nathaniel_McCallum@asburyseminary.edu> wrote:

> GLIS is pondering a rewrite in python (adding GUI for the X liveCDs).
> The question is, why isn't python on the liveCD? Is it feasible to get
> it included?
>
> Nathaniel

I'm pretty sure it can be added -- can you file a bug on it and assign to
livewire at gentoo dot org? The GLIS project looks really promising, by
the way :)

--
Seemant Kulleen
Developer and Project Co-ordinator, Gentoo Linux
http://dev.gentoo.org/~seemant
Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x3458780E
Key fingerprint = 23A9 7CB5 9BBB 4F8D 549B 6593 EDA2 65D8 3458 780E

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Matthew Walker @ 2003-07-23 20:36 UTC
To: gentoo-dev

I like the suggestion made by someone on this list just yesterday to not
include packages in /distfiles if they already have a well-established
mirror network like sourceforge.

I don't know how big a dent this would put in distfiles, but it would be
bound to help, especially since those files tend to be pretty popular.

Matthew

--
Was I helpful? Let others know:
http://svcs.affero.net/rm.php?r=utoxin&p=main

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Tal Peer @ 2003-07-23 20:39 UTC
To: gentoo-dev

On Wed, 23 Jul 2003, Kurt Lieber wrote:

> We have already received numerous complaints from our mirror admins about
> the amount of disk space we chew up now. For reference, here is a break
> down:
>
> 9.7G    ./releases
> 139M    ./snapshots
> 17G     ./distfiles
> 6.6G    ./experimental
>
> [real-big-snip]
>
> Thoughts? Ideas?
>

Looking at the numbers you provided, I think we should separate the
mirrors into two groups: Binary and Source. Binary mirrors would provide
GRPs and ISOs, and source mirrors would only provide distfiles. Mirrors
could provide both, of course.

In the short term, there won't be too many binary mirrors (freeing almost
17 gigs of space is tempting), so we should encourage mirrors that have
plenty of disk space to mirror both source and binary.

In the long term, this could also raise the number of mirrors, as mirror
providers will need to 'waste' less disk space on the Gentoo mirror (if
they choose to mirror only one type, that is).

--
Tal Peer
Gentoo Developer
Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x253D2947
Key Fingerprint: C0B1 D91D 7323 6C0F 227A CBD6 D635 E53D 253D 2947

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Jon Portnoy @ 2003-07-23 21:10 UTC
To: Tal Peer
Cc: gentoo-dev

On Wed, Jul 23, 2003 at 11:39:29PM +0300, Tal Peer wrote:
> On Wed, 23 Jul 2003, Kurt Lieber wrote:
>
> > We have already received numerous complaints from our mirror admins about
> > the amount of disk space we chew up now. For reference, here is a break
> > down:
> >
> > 9.7G    ./releases
> > 139M    ./snapshots
> > 17G     ./distfiles
> > 6.6G    ./experimental
> >
> > [real-big-snip]
> >
> > Thoughts? Ideas?
> >
>
> Looking at the numbers you provided, I think we should separate the
> mirrors into two groups: Binary and Source. Binary mirrors would provide
> GRPs and ISOs, and source mirrors would only provide distfiles. Mirrors
> could provide both, of course.
>
> In the short term, there won't be too many binary mirrors (freeing almost
> 17 gigs of space is tempting), so we should encourage mirrors that have
> plenty of disk space to mirror both source and binary.
>
> In the long term, this could also raise the number of mirrors, as mirror
> providers will need to 'waste' less disk space on the Gentoo mirror (if
> they choose to mirror only one type, that is).
>

I think we should probably start by leaving non-GRP LiveCD ISOs on
"source" mirrors in that case and see how that works for mirror providers.
Otherwise, we may have a lot of frustrated users who can't find CDs.

--
Jon Portnoy
avenj/irc.freenode.net

* Re: [gentoo-dev] The release of 1.4 and its impact on our mirrors
From: Alec Berryman @ 2003-07-23 21:41 UTC
To: gentoo-dev

On Wed, 2003-07-23 at 15:39, Tal Peer wrote:
> Looking at the numbers you provided, I think we should separate the
> mirrors into two groups: Binary and Source. Binary mirrors would provide
> GRPs and ISOs, and source mirrors would only provide distfiles. Mirrors
> could provide both, of course.
>
> In the short term, there won't be too many binary mirrors (freeing almost
> 17 gigs of space is tempting), so we should encourage mirrors that have
> plenty of disk space to mirror both source and binary.
>
> In the long term, this could also raise the number of mirrors, as mirror
> providers will need to 'waste' less disk space on the Gentoo mirror (if
> they choose to mirror only one type, that is).

Along the lines of 'wasting' less disk space is wasting less bandwidth;
would this not be a great time to start really pushing something like
deltup (http://deltup.sourceforge.net/glep.html)? The few times I have
used deltup it has worked great; patch availability is the problem. If it
were kept up to date, it could take quite a load off the servers.

* [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Håvard Wall @ 2003-07-24 7:35 UTC
To: gentoo-dev

There has been some discussion lately about how to reduce disk usage on
mirrors. Some programs (typically games) take a lot of space and must be
stored in the portage tree because they lack good host sites of their own.

How about implementing a file-sharing program tailored for Gentoo? Users
could voluntarily share their /usr/portage/distfiles, or whatever would
benefit the mirrors. This would potentially let us keep huge (gaming)
files on their (faulty) hosts. When the original host is down, there would
probably already be some users online who have a copy and are sharing it.

Ok, I guess this might be a rather stupid proposal. I don't expect any
supportive comments. Just had to get it off my chest -) Flames are
nevertheless welcome.

* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Fred Van Andel @ 2003-07-23 5:50 UTC
To: gentoo-dev

On July 24, 2003 12:35 am, Håvard Wall wrote:
> There has been some discussion lately about how to reduce disk usage
> on mirrors. Some programs (typically games) take a lot of space and
> must be stored in the portage tree because they lack good host sites
> of their own.
>
> How about implementing a file-sharing program tailored for Gentoo?
> Users could voluntarily share their /usr/portage/distfiles, or
> whatever would benefit the mirrors. This would potentially let us keep
> huge (gaming) files on their (faulty) hosts. When the original host
> is down, there would probably already be some users online who have
> a copy and are sharing it.
>
> Ok, I guess this might be a rather stupid proposal. I don't expect
> any supportive comments. Just had to get it off my chest -) Flames
> are nevertheless welcome.

I am actively working on this now.

--
Fred Van Andel
fava@gentoo.org
GPG KeyID: 76526AD599455482
GPG fingerprint: 64E4 4BAB 9C99 D565 3E3C F5D0 7652 6AD5 9945 5482

[parent not found: <3F1F9174.6010504@ifi.uio.no>]
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Fred Van Andel @ 2003-07-23 6:04 UTC
To: Håvard Wall, gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> Wow. Is there a project site? It would be fun to check out and see if
> I could contribute with something.

No, I have not gotten that far yet. I am currently working on proof of
concept code which will be put on cvs when I have code that actually does
something.

- --
Fred Van Andel
fava@gentoo.org
GPG KeyID: 76526AD599455482
GPG fingerprint: 64E4 4BAB 9C99 D565 3E3C F5D0 7652 6AD5 9945 5482
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/HiV9dlJq1ZlFVIIRAh8mAKCGHzQrUpZZWUYWLWVRuYnZvzNEsQCgsjMx
LT/UjxPQIVOihXBIqKr4GtM=
=+3U1
-----END PGP SIGNATURE-----

* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Raimundo Bilbao @ 2003-07-24 5:54 UTC
To: gentoo-dev

On Thu, 24 Jul 2003 09:35:04 +0200
Håvard Wall <haavardw@ifi.uio.no> wrote:

[...]
> How about implementing a file-sharing program tailored for Gentoo? Users
> could voluntarily share their /usr/portage/distfiles, or whatever would
> benefit the mirrors. This would potentially let us keep huge (gaming) files
> on their (faulty) hosts. When the original host is down, there would
> probably already be some users online who have a copy and are sharing it.
>
[...]

Sounds great, a P2P Gentoo (?), but how do you protect against trojans,
malware and stuff like that? Is MD5 (AFAIK, currently the only checksum
used) good enough?

cheers

mundo

* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Fred Van Andel @ 2003-07-23 6:42 UTC
To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On July 23, 2003 10:54 pm, Raimundo Bilbao wrote:
> Sounds great, a P2P Gentoo (?), but how do you protect against
> trojans, malware and stuff like that? Is MD5 (AFAIK, currently the
> only checksum used) good enough?

There are a couple of features to protect against that kind of thing.

Only files that exist on the official distfiles mirrors will be eligible
for sharing. In other words, users cannot submit new files into the
system.

MD5 hashes will be used to protect each chunk of data as well as the
entire file. All hashes will originate from a central server, so there is
no opportunity for a malicious user to create a compromised chunk of data
and have it accepted by the system.

As for the security of MD5, there is no published instance of anyone
finding 2 different datasets that produce an identical hash value. MD5 is
a 128 bit hash algorithm, so in theory one would need to calculate
approximately 1.2 * sqrt(2^128) different hashes in order to have a 50%
chance of a single collision. That would require > 350 billion gigabytes
just to store the hashes. I believe MD5 to be secure enough for this
application.

- --
Fred Van Andel
fava@gentoo.org
GPG KeyID: 76526AD599455482
GPG fingerprint: 64E4 4BAB 9C99 D565 3E3C F5D0 7652 6AD5 9945 5482
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/Hi5SdlJq1ZlFVIIRAn+rAKCTzLilqNQjFCfNt9hXkhlZUK/JWwCg8w+a
R6YWR9iUF6R0VBU2e18pQ5w=
=8wC3
-----END PGP SIGNATURE-----

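The storage figure in the message above can be reproduced with a few lines of arithmetic. The sketch below assumes 16 bytes per stored MD5 digest and decimal gigabytes (10^9 bytes); the function names are made up for illustration. Note that this is only the generic birthday bound for a 128-bit hash (the 1.2 factor approximates sqrt(2 ln 2), about 1.18) and says nothing about attacks that exploit the internal structure of a particular hash function.

```python
import math

HASH_BITS = 128                 # MD5 digest size in bits
DIGEST_BYTES = HASH_BITS // 8   # 16 bytes per stored digest


def birthday_hashes(bits, factor=1.2):
    """Approximate number of random hashes needed for a ~50% chance
    of at least one collision, using the factor quoted in the thread."""
    return factor * math.sqrt(2 ** bits)


def storage_gigabytes(num_hashes, digest_bytes=DIGEST_BYTES):
    """Space needed to store that many digests, in decimal gigabytes."""
    return num_hashes * digest_bytes / 1e9


if __name__ == "__main__":
    hashes = birthday_hashes(HASH_BITS)
    gigabytes = storage_gigabytes(hashes)
    print(f"hashes needed: {hashes:.3e}")
    print(f"storage: {gigabytes / 1e9:.0f} billion GB")
```

Under these assumptions the result is about 2.2 * 10^19 hashes and roughly 354 billion gigabytes, in line with the "> 350 billion gigabytes" quoted above.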
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Robin H.Johnson @ 2003-07-24 7:30 UTC
To: Fred Van Andel
Cc: gentoo-dev

On Tue, Jul 22, 2003 at 11:42:26PM -0700, Fred Van Andel wrote:
> As for the security of MD5, there is no published instance of anyone
> finding 2 different datasets that produce an identical hash value. MD5
> is a 128 bit hash algorithm so in theory it would be be required to
> calculate approximately 1.2 * sqrt(2^128) different hashes in order to
> have a 50% chance of a single collision. That would require > 350
> billion gigabytes just to store the hashes. I believe MD5 to be secure
> enough for this application.

I'd be VERY careful with this.
http://www.rsasecurity.com/rsalabs/faq/3-6-6.html

I've seen much more recent research into it myself, along with a way of
making it SIGNIFICANTLY more difficult to break: store the correct
file size along with the MD5 sum in a verifiable fashion.

Having a file containing a list of tarballs and their sizes, then
providing a GPG signature for that file, solves the issue to a level
such that even all the computers in the world in 10 years could not beat
it [famous last words, after seeing the crypto-attack on RSA keys using
a massive NFS setup].

--
Robin Hugh Johnson
E-Mail    : robbat2@orbis-terrarum.net
Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ#      : 30269588 or 41961639
GnuPG FP  : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
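The size-plus-MD5 check suggested in the message above could look roughly like the following Python sketch. The manifest layout (one "name size md5" entry per line) and all function names are assumptions made for illustration, not an existing Portage format. Verifying the GPG signature on the manifest itself is assumed to have happened before this code runs and is not shown.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, block_size=65536):
    """Return the hex MD5 of a file, read in blocks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def parse_manifest(text):
    """Parse hypothetical 'name size md5' lines into {name: (size, md5)}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, size, md5 = line.split()  # assumes no spaces in file names
        entries[name] = (int(size), md5.lower())
    return entries


def verify_distfile(path, entries):
    """Accept a file only if it is listed and both size and MD5 match."""
    name = os.path.basename(path)
    if name not in entries:
        return False            # unlisted files are never accepted
    size, md5 = entries[name]
    if os.path.getsize(path) != size:
        return False            # cheap size check runs before hashing
    return md5_of_file(path) == md5


# Small demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as workdir:
    tarball = os.path.join(workdir, "example-1.0.tar.gz")
    with open(tarball, "wb") as handle:
        handle.write(b"hello")
    listing = f"example-1.0.tar.gz 5 {hashlib.md5(b'hello').hexdigest()}\n"
    print(verify_distfile(tarball, parse_manifest(listing)))
```

Checking the size first means a forged file has to match the digest at exactly the listed length, which is the extra constraint the message argues for.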
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Fred Van Andel @ 2003-07-23 7:53 UTC
To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On July 24, 2003 12:30 am, you wrote:
>
> I'd be VERY careful with this.
> http://www.rsasecurity.com/rsalabs/faq/3-6-6.html

I am familiar with this, but in practice it's not an issue. Each chunk
of the file would be protected by an MD5 hash, and then the file as a
whole is protected by a different MD5. Any change to the data that
passes the MD5 protecting one chunk would fail the MD5 protecting the
file as a whole.

I suspect the referenced machine would work only by looking at hashes
that belong to a reduced domain, i.e. the upper X bits being zero. In our
application that is not an option. Our problem is not a birthday attack
but rather a search for a small set of possible values. As well, anyone
who has the resources to spend on creating a machine such as the one
described would have much more profitable uses for it.

I am thinking of protecting each chunk with a smaller hash (say 64 bits)
and using it just to detect transmission errors. A full MD5 or SHA1 would
protect the file as a whole. This would reduce the bandwidth
requirements of the central tracker. In the event that a bad chunk
passes the reduced hash test, the full file hash would cause the
entire download to be rejected rather than just one chunk. So in the
worst case a malicious user could cause a lot of failed downloads but no
corrupted files.
- -- Fred Van Andel fava@gentoo.org GPG KeyID: 76526AD599455482 GPG fingerprint: 64E4 4BAB 9C99 D565 3E3C F5D0 7652 6AD5 9945 5482 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/Hj7hdlJq1ZlFVIIRAkCCAJ9i2IxN6rSvWnHFmucimtZwMkeiggCZAUeA s3n6mdCInenSfgFyWegQ3uM= =OO70 -----END PGP SIGNATURE----- -- gentoo-dev@gentoo.org mailing list ^ permalink raw reply [flat|nested] 25+ messages in thread
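The two-level scheme described in the message above (a short per-chunk hash to catch transmission errors, plus a full hash over the whole file) might be sketched in Python as follows. The message does not name a 64-bit hash, so a truncated MD5 stands in for it here purely as an assumption; the function names and the dictionary layout are also hypothetical.

```python
import hashlib

SHORT_HASH_BYTES = 8  # 64 bits, the size floated in the message


def split_chunks(data, chunk_size):
    """Cut data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def short_hash(chunk):
    # Assumption: truncated MD5 stands in for the unspecified 64-bit hash.
    return hashlib.md5(chunk).digest()[:SHORT_HASH_BYTES]


def describe(data, chunk_size):
    """What a central server would publish for one file."""
    return {
        "chunk_hashes": [short_hash(c) for c in split_chunks(data, chunk_size)],
        "file_hash": hashlib.sha1(data).hexdigest(),
    }


def assemble(chunks, description):
    """Return (data, []) on success, (None, bad_indices) if chunks fail,
    or (None, []) if every chunk passed but the whole file did not."""
    expected = description["chunk_hashes"]
    bad = [i for i, (chunk, want) in enumerate(zip(chunks, expected))
           if short_hash(chunk) != want]
    if bad:
        return None, bad        # only these chunks need re-fetching
    data = b"".join(chunks)
    if hashlib.sha1(data).hexdigest() != description["file_hash"]:
        return None, []         # reject the entire download
    return data, []
```

A failed chunk check costs one re-fetch, while a failed whole-file check discards everything, which matches the worst case the message describes: wasted downloads, but no corrupted file accepted.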
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: bdharring @ 2003-07-24 6:35 UTC
To: gentoo-dev

On Thursday, July 24, 2003, at 12:54 AM, Raimundo Bilbao wrote:

> On Thu, 24 Jul 2003 09:35:04 +0200
> Håvard Wall <haavardw@ifi.uio.no> wrote:
>
> [...]
>
>> How about implementing a file-sharing program tailored for gentoo?
>> Users
>> could voluntarily share their /usr/portage/distfiles, or whatever
>> would
>> benefit mirrors. This would potentially let us keep huge
>> (gaming-)files
>> on their (faulty) hosts. When the original host is down, there would
>> probably already be some users online which have a copy and is
>> sharing it.
>>
> [...]
>
> Sounds great, a P2P gentoo (?), but how do you protect against trojans,
> malware and the like? Is MD5 (AFAIK, currently the only
> checksum used) good enough?

Famous last words, but if there was a trusted central listing of MD5s,
it is a strong enough hash to identify whether the downloaded distfile
is original or not. I would guess that it is *possible* to have a
different dataset that produces an identical MD5, but actually doing
this isn't even remotely feasible, let alone having the code *actually*
do something nefarious. Of course I'm not a cryptologist/mathematician,
but suffice it to say there is a reason most downloaded sources maintain
an MD5 sig alongside...

I realize this particular horse has been beaten well past its death,
but why create a separate P2P system instead of using BitTorrent?
Just curious; I'm aware of how BitTorrent is structured, but that's
about it...

Other than that, you've mentioned that you're attempting a proof of
concept. Care to elaborate on some of the aspects of the particular
P2P system you're attempting to create/test?
~bdh -- gentoo-dev@gentoo.org mailing list ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Fred Van Andel @ 2003-07-23 7:22 UTC
To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On July 23, 2003 11:35 pm, bdharring wrote:
> I realize this particular horse has been beaten well past its death,
> but why create a separate P2P system instead of using BitTorrent?
> Just curious; I'm aware of how BitTorrent is structured, but that's
> about it...

BitTorrent is well suited to distributing a small number of large files;
that is what it was designed for. The downside is that you need a
separate instance of BitTorrent running for every file that you are
sharing, so if you are sharing several hundred files you need several
hundred copies of BitTorrent running.

Another problem is that in BitTorrent the client-side server and the
fetcher are the same program. For our purposes we need a separate
program to perform each function.

> Other than that, you've mentioned that you're attempting a proof of
> concept. Care to elaborate on some of the aspects of the particular
> P2P system you're attempting to create/test?
> ~bdh

Let's see:

  * Central repository of hashes for security.
  * Only authorized files will be shareable.
  * Separate server and fetcher programs.
  * Will download simultaneously from multiple sources.
  * Drop-in replacement for wget via FETCHCOMMAND=
  * Rate limiting on the server (but not to 0).
  * The fetcher will automatically start the server (to limit leeching).
  * Will automatically share your distfiles directory.
  * ???

It is NOT being designed as a general-purpose P2P program.
I have no desire to become a target of the MPAA or RIAA - -- Fred Van Andel fava@gentoo.org GPG KeyID: 76526AD599455482 GPG fingerprint: 64E4 4BAB 9C99 D565 3E3C F5D0 7652 6AD5 9945 5482 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/HjebdlJq1ZlFVIIRAmL4AKDWKGB+SEmMA/s9cdR6D15O/5sMvgCdE2aH ZMna7nBS7ESfa3uUnCY02/M= =/haH -----END PGP SIGNATURE----- -- gentoo-dev@gentoo.org mailing list ^ permalink raw reply [flat|nested] 25+ messages in thread
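One item in the feature list above, downloading simultaneously from multiple sources, implies some way of dividing a file's chunks among the available peers. The message does not say how the proof of concept does this, so the Python sketch below is only one plausible approach (plain round-robin, with orphaned chunks redistributed when a peer drops out). All names are hypothetical.

```python
def plan_downloads(num_chunks, peers):
    """Assign chunk indices to peers in round-robin order."""
    if not peers:
        raise ValueError("at least one peer is required")
    plan = {}
    for index in range(num_chunks):
        peer = peers[index % len(peers)]
        plan.setdefault(peer, []).append(index)
    return plan


def reassign(plan, failed_peer):
    """Spread a failed peer's chunks over the remaining peers."""
    survivors = [peer for peer in plan if peer != failed_peer]
    if not survivors:
        raise ValueError("no peers left to take over the chunks")
    new_plan = {peer: list(plan[peer]) for peer in survivors}
    for n, index in enumerate(plan.get(failed_peer, [])):
        new_plan[survivors[n % len(survivors)]].append(index)
    return new_plan


plan = plan_downloads(5, ["peer-a", "peer-b"])
print(plan)
print(reassign(plan, "peer-b"))
```

A real fetcher would presumably also weigh peer speed and the server-side rate limit mentioned in the list; round-robin is used here only because it is the simplest scheme to reason about.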
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Mix Sella @ 2003-07-24 9:32 UTC
To: gentoo-dev

On Thursday 24 July 2003 10:35, Håvard Wall wrote:
> There has been some discussion lately about how to reduce disk usage on
> mirrors. Some programs (typically games) takes a lot of space and must
> be stored in the portage-tree due to lack of good host-sites on their own.
>
> How about implementing a file-sharing program tailored for gentoo? Users
> could voluntarily share their /usr/portage/distfiles, or whatever would
> benefit mirrors. This would potentially let us keep huge (gaming-)files
> on their (faulty) hosts. When the original host is down, there would
> probably already be some users online which have a copy and is sharing it.
>
> Ok, I guess this might be a rather stupid proposal. I don't expect any
> supportive comments. Just had to get it off my chest -) Flames are
> nevertheless welcome.
>

This is not a stupid proposal at all. The question to ask however is
"who writes the code"?

>
> --
> gentoo-dev@gentoo.org mailing list
>
> This mail was checked for viruses by Romat email server

--
Mix Sella (well, not really but hey)

--
gentoo-dev@gentoo.org mailing list
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: gerrynjr @ 2003-07-24 16:39 UTC
To: gentoo-dev

I think it's a great idea. I proposed a similar idea several months
ago, but it was shot down due to security concerns, and I had no clue
how to write the needed code. It is definitely a good idea, provided
those concerns can be addressed.

On Thu, 2003-07-24 at 02:35, Håvard Wall wrote:
> There has been some discussion lately about how to reduce disk usage on
> mirrors. Some programs (typically games) takes a lot of space and must
> be stored in the portage-tree due to lack of good host-sites on their own.
>
> How about implementing a file-sharing program tailored for gentoo? Users
> could voluntarily share their /usr/portage/distfiles, or whatever would
> benefit mirrors. This would potentially let us keep huge (gaming-)files
> on their (faulty) hosts. When the original host is down, there would
> probably already be some users online which have a copy and is sharing it.
>
> Ok, I guess this might be a rather stupid proposal. I don't expect any
> supportive comments. Just had to get it off my chest -) Flames are
> nevertheless welcome.
>
>
> --
> gentoo-dev@gentoo.org mailing list
>
>

--
gentoo-dev@gentoo.org mailing list
* Re: [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors
From: Tom Payne @ 2003-07-24 15:59 UTC
To: gerrynjr
Cc: gentoo-dev

On Thu, Jul 24, 2003 at 11:39:42AM -0500, gerrynjr wrote:
> > How about implementing a file-sharing program tailored for gentoo? Users
> > could voluntarily share their /usr/portage/distfiles, or whatever would
> > benefit mirrors. This would potentially let us keep huge (gaming-)files
> > on their (faulty) hosts. When the original host is down, there would
> > probably already be some users online which have a copy and is sharing it.

BitTorrent would be the obvious starting point here. It might require a
bit of tweaking to share all available files in /usr/portage/distfiles
(rather than individual files), but that's all. gentoo.org could run the
tracker (the list of available peers).

Security concerns are already addressed in BitTorrent. It uses hashing
to ensure the data you are sent has not been corrupted or tampered with.

More info: http://bitconjurer.org/BitTorrent/

This would be a _really_ good thing to implement.

Regards,

Tom

--
gentoo-dev@gentoo.org mailing list
end of thread, other threads: [~2003-07-24 15:59 UTC]

Thread overview: 25+ messages
2003-07-23 19:48 [gentoo-dev] The release of 1.4 and its impact on our mirrors Kurt Lieber
2003-07-23  8:40 ` Alvaro Figueroa Cabezas
2003-07-23 21:01 ` Kurt Lieber
2003-07-23  9:28 ` Alvaro Figueroa Cabezas
2003-07-23  9:30 ` Alvaro Figueroa Cabezas
2003-07-24  0:11 ` [gentoo-dev] " Pieter Van den Abeele
2003-07-24  0:55 ` Nathaniel McCallum
2003-07-24  2:07 ` [gentoo-dev] Python on the liveCD Nathaniel McCallum
2003-07-24  9:29 ` Seemant Kulleen
2003-07-23 20:36 ` [gentoo-dev] The release of 1.4 and its impact on our mirrors Matthew Walker
2003-07-23 20:39 ` Tal Peer
2003-07-23 21:10 ` Jon Portnoy
2003-07-23 21:41 ` Alec Berryman
2003-07-24  7:35 ` [gentoo-dev] (crazy?) proposal to reduce load and disk on mirrors Håvard Wall
2003-07-23  5:50 ` Fred Van Andel
     [not found] ` <3F1F9174.6010504@ifi.uio.no>
2003-07-23  6:04 ` Fred Van Andel
2003-07-24  5:54 ` Raimundo Bilbao
2003-07-23  6:42 ` Fred Van Andel
2003-07-24  7:30 ` Robin H.Johnson
2003-07-23  7:53 ` Fred Van Andel
2003-07-24  6:35 ` bdharring
2003-07-23  7:22 ` Fred Van Andel
2003-07-24  9:32 ` Mix Sella
2003-07-24 16:39 ` gerrynjr
2003-07-24 15:59 ` Tom Payne