From: "undefined"
Date: Fri, 12 May 2006 11:21:11 +0000
To: gentoo-doc-cvs@lists.gentoo.org
Subject: [gentoo-doc-cvs] cvs commit: rsync.xml
Message-Id: <20060512112111.847E1647F6@smtp.gentoo.org>
Reply-to: docs-team@lists.gentoo.org

jforman 06/05/12 11:21:11

Modified: rsync.xml
Log:
Rewrite complements of admin tobias klausmann

Revision Changes Path
1.47 xml/htdocs/doc/en/rsync.xml

file : http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/rsync.xml?rev=1.47&content-type=text/x-cvsweb-markup&cvsroot=gentoo
plain: http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/rsync.xml?rev=1.47&content-type=text/plain&cvsroot=gentoo
diff : http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/rsync.xml.diff?r1=1.46&r2=1.47&cvsroot=gentoo

Index: rsync.xml
===================================================================
RCS file: /var/cvsroot/gentoo/xml/htdocs/doc/en/rsync.xml,v retrieving revision 1.46 retrieving revision 1.47 diff -u -r1.46 -r1.47 --- rsync.xml 1 Jan 2006 11:51:43 -0000 1.46 +++ rsync.xml 12 May 2006 11:21:10 -0000 1.47 @@ -1,47 +1,61 @@ - - + + - -Gentoo Linux rsync Mirrors Policy - + +Gentoo Linux rsync Mirrors Policy and Guide + + + Tobias Klausmann + + + Gentoo Mirror Administrators - + Xavier Neys + + Tobias Klausmann + + + This document explains how to set up an official rsync mirror and your own local mirror. + + -1.13 -2005-12-12 +3.0 +2006-05-12 -Hardware Request +Preliminaries
-Machine Donation +Terms, names and all that

-Gentoo Linux relies upon two different kinds of mirrors: main rotation mirrors -and community mirrors. Main rotation mirrors are dedicated rsync servers and -are responsible for handling the bulk of our rsync traffic. All main rotation -mirrors run Gentoo Linux and are managed by members of the Gentoo development -team. Community mirrors are servers which are provided and managed by members -of the community. These servers may or may not be dedicated to rsync usage and -they may or may not run Gentoo Linux. +This guide is intended for people who would like to set up an rsync mirror of +their own. It caters not only to those who want to run an official rsync mirror +but also to those wanting to run private mirrors.

-

-At this time, we have enough community mirrors and are actively seeking -additional main rotation mirrors. Specifications for main rotation servers -include: -

+

There are three kinds of Gentoo rsync mirrors: main rotation mirrors, +community mirrors and private mirrors. Main rotation mirrors are maintained by +the Gentoo infrastructure team. They handle the bulk of the Gentoo rsync +traffic. The community mirrors are run by volunteers from the Gentoo community. +Private mirrors are closed to the public and are meant to cut traffic costs +and latency for the organization or individual running them.

+ +

At this time, we have enough community mirrors and are actively seeking +additional main rotation mirrors. Specifications for main rotation servers +include:

  • Minimum of a 2GHz Pentium 4 processor (or equivalent)
  • @@ -65,295 +79,152 @@
- - -Short FAQ (provided as a reference for current mirror admins) -
-Q: Who should I contact regarding rsync issues and maintenance? - +

The same holds true for organizations who would like to control the rsync +mirror their servers and workstations sync against. Of course, they usually also +want to save on bandwidth and traffic costs.

+ +

All you need to do is select which machine is going to be your own local +rsync mirror and set it up. You should choose a computer that can handle the CPU +and disk load that an rsync operation requires. Your local mirror also needs to +be available whenever any of your other computers syncs its portage tree. +Besides, it should have a static IP address or a name that always resolves to +your server. Configuring a DHCP and/or a DNS server is beyond the scope of this +guide.

-

-A: Visit http://bugs.gentoo.org and fill out a bug on the product "Rsync". -

+

Note that these instructions assume your private rsync mirror is a Gentoo +machine. If you intend to run it on a different distribution, the guide for +setting up a community mirror might be more helpful. Just don't sync the mirror +every half hour but once or twice a day.

-Q: I run a private rsync mirror for my company. Can I still access rsync1.us.gentoo.org? +Setting up the server -

-A: Because our resources are limited, we need to ensure we allocate them in -such a way as to provide the maximum amount of benefit to our users. As such, we -limit connections to our master rsync and distfile mirrors to public mirrors -only. Users are welcome to use our regular mirror system to establish a private -rsync mirror, though they are asked to follow certain basic - -rsync etiquette guidelines. -

+

There is no extra package to install as the required software is already on +your computer. Setting up your own local rsync mirror is just a matter of +configuring the rsyncd daemon to make your /usr/portage +directory available for syncing. Create the following +/etc/rsyncd.conf configuration file:

- -
-
-Q: Is it important that I sync my mirror twice an hour? - +
+pid file = /var/run/rsyncd.pid
+max connections = 5
+use chroot = yes
+uid = nobody
+gid = nobody
+# Optional: restrict access to your Gentoo boxes
+hosts allow = 192.168.0.1 192.168.0.2 192.168.1.0/24
+hosts deny  = *
 
-

-A: Yes it is important. You do not need to perform the syncs at exactly :00 and :30 -but the syncs should take place in each of the following two windows: -

+[gentoo-portage] +path=/usr/portage +comment=Gentoo Portage +exclude=distfiles/ packages/ +
-
    -
  1. :00 to :10
  2. -
  3. :30 to :40
  4. -
+

You do not need to use the hosts allow and hosts deny options. +By default, all clients will be allowed to connect. The order in which you write +the options is not relevant. The server will always check the hosts allow +option first and grant the connection if the connecting host matches any of the +listed patterns. The server will then check the hosts deny option and +refuse the connection if any match is found. Any host that does not match +anything will be granted a connection. Please read the man page (man +rsyncd.conf) for more information.
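The check order described above can be modelled in a few lines of shell. This is only a sketch: the function name check_host is made up for illustration and is not part of rsync, and it compares literal addresses and the * wildcard only, whereas the real daemon also understands netmasks such as 192.168.1.0/24 and host names.

```shell
#!/bin/sh
# Illustrative model of the order in which rsyncd evaluates
# "hosts allow" and "hosts deny". check_host is a hypothetical helper,
# not part of rsync. Only literal addresses and "*" are handled here.
check_host() {
    host=$1
    allow=$2
    deny=$3
    # 1. A match in "hosts allow" grants the connection immediately.
    case " $allow " in
        *" $host "*) return 0 ;;
    esac
    # 2. Otherwise a match in "hosts deny" refuses it.
    case " $deny " in
        *" $host "* | *" * "*) return 1 ;;
    esac
    # 3. A host that matches neither list is granted a connection.
    return 0
}

if check_host 192.168.0.2 "192.168.0.1 192.168.0.2" "*"; then
    echo "192.168.0.2: granted"
fi
if ! check_host 10.0.0.5 "192.168.0.1 192.168.0.2" "*"; then
    echo "10.0.0.5: refused"
fi
```

With the allow list and "hosts deny = *" from the configuration file, a listed host is granted a connection and every other host is refused. With both lists empty, every host is granted a connection.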

-

-Additionally, please make sure that your syncs are exactly 30 minutes apart. So, if -you schedule the first sync of each hour for :08, please schedule the second sync of -the hour for :38. +

Now, start your rsync daemon with the following command as the root user:

- -
- -
-Q: Where should I sync my rsync mirror before I become an official Gentoo mirror? - +
+(Start the daemon now)
+# /etc/init.d/rsyncd start
+(Add the daemon to your default runlevel)
+# rc-update add rsyncd default
+
-
    -
  • I am a European-based rsync mirror: sync to rsync.de.gentoo.org
  • -
  • I am a US-based rsync mirror: sync to rsync.us.gentoo.org
  • -
  • I am not in the first two groups: sync to rsync.us.gentoo.org
  • -
+

Let's test your rsync mirror. You do not need to try from another machine +but it would be a good idea to do so. If your server is not known by name from +all your computers, you can use its IP address instead.

- -
-
-Q: How do I find the mirror nearest to me? - +
+(You may use the server name or its IP)
+# rsync 192.168.0.1::
+gentoo-portage     Gentoo Portage
+# rsync your_server_name::gentoo-portage
+(You should see the content of /usr/portage on your mirror)
+
-

-A: netselect was designed to do this for you. If you haven't already run -emerge netselect then do it. Then run: netselect rsync.gentoo.org. -After a minute or so netselect will print an IP address. Take this address and -use it as the only parameter for rsync with two colons appended to it. eg: -rsync 1.2.3.4::. You should be able to find out which mirror that is -from the banner message. Update your /etc/make.conf accordingly. -

+

Your rsync mirror is now set up. Keep running emerge --sync as you +have done so far to keep your server up-to-date. If you use cron or similar +facilities to sync regularly, remember to keep it down to a sensible frequency +like once or twice a day.

+ + Please note that most public mirror administrators consider syncing more +than once or twice a day an abuse. Some if not most of them will ban your IP +from their server if you start abusing their machines.
-Q: Can I use compression when syncing against rsync1.us.gentoo.org? +Configuring your clients -

-A: No. Compression utilizes too many resources on the server, so we have -forcibly disabled it on rsync1.us.gentoo.org. Please do not -attempt to use compression when syncing against this server. -

+

Now, make your other computers use your own local rsync mirror instead of a +public one. Edit your /etc/make.conf and make the SYNC +variable point to your server.

- -
-
-Q: I'm seeing a lot of old and probably dead rsync processes, how can I get rid of them? - +
+(Use your server IP address)
+SYNC="rsync://192.168.0.1/gentoo-portage"
+(Or use your server name)
+SYNC="rsync://your_server_name/gentoo-portage"
+
-

-A: Please see the Example Scripts section. -

+

You can check that your computer has been properly set up by syncing against +your own local mirror for the first time:

- -
-
-Q: There are many users who connect to my rsync server very frequently, -sometimes even causing a DoS to my mirror, is there any way to prevent this? - +
+(Check that the SYNC variable has been set up)
+# emerge --info|grep SYNC
+SYNC="rsync://your_server_name/gentoo-portage"
+(Sync against your local mirror)
+# emerge --sync
+
-

-A: Again please see the Example Scripts section. -

+

That's it! All your computers will now use your local rsync mirror whenever +you run emerge --sync.

+ -Example Scripts +Setting up a community rsync server
+Introduction - -You will find sample configuration and script files in the gentoo-rsync-mirror -package. Just do emerge gentoo-rsync-mirror - + You can find sample configuration and script files in the +gentoo-rsync-mirror package. Just run emerge gentoo-rsync-mirror.

-Right now, mirroring our Portage tree requires around 250Mb, so it isn't space -intensive; having at least 500Mb free should allow for growing room. Setting +Right now, mirroring our Portage tree requires around 600MB, so it isn't space +intensive; having at least 1GB free should allow for growing room. Setting up a Portage tree mirror is simple -- first, ensure that your mirror has rsync installed. Then, set up your rsyncd.conf file to look something like this:

@@ -383,21 +254,24 @@ exclude = distfiles -

-Above, the gentoo-x86-portage mirror points to the same data as gentoo-portage. -Although we have recently changed the official name of our mirror to -gentoo-portage, gentoo-x86-portage is still needed for backwards compatibility, -so include both entries. -

- -

-For security reasons, the use of a chrooted environment is required! -

- -

-Now, you need to mirror the Gentoo Linux Portage tree. You should use the -following script to do so: -

+

You can pick your own locations for most of the files, of course. What's +important are the section names ([gentoo-portage] and +[gentoo-x86-portage]). They are the locations that rsync clients will try +to sync from.

+ +

Above, the gentoo-x86-portage mirror points to the same data as +gentoo-portage. Although we have recently changed the official name of our +mirror to gentoo-portage, gentoo-x86-portage is still needed for backwards +compatibility, so include both entries. Eventually, the gentoo-x86-portage +location will be removed.

+ +

For security reasons, the use of a chrooted environment is required! This +has implications for the logged timestamps -- see the FAQ below.

+ +

Now, you need to mirror the Gentoo Linux Portage tree. You can use the +script below to do so. Again, you'll probably want to change some of the file +locations to suit your needs -- in particular, they should match those of your +rsyncd.conf.

 #!/bin/bash
@@ -417,217 +291,168 @@
 echo "End: "`date` >> $0.log 2>&1 
 
-
-#!/sbin/runscript
-# Copyright 1999-2004 Gentoo Foundation
-# Distributed under the terms of the GNU General Public License v2
-# $Header: /var/cvsroot/gentoo-x86/net-misc/rsync/files/rsyncd.init.d,v 1.2 2004/05/02 22:45:02 mholzer Exp $
-
-depend() {
-need net
-}
-
-# FYI: --sparce seems to cause problems.
-RSYNCOPTS="--daemon --safe-links --timeout=300"
-
-start() {
-ebegin "Starting rsync daemon"
-start-stop-daemon --start --quiet --pidfile /var/run/rsyncd.pid --nicelevel 15 --exec /usr/bin/rsync -- ${RSYNCOPTS}
-eend $?
-}
-
-stop() {
-ebegin "Stopping rsync daemon"
-start-stop-daemon --stop --quiet --pidfile /var/run/rsyncd.pid
-eend $?
-} 
-
+

Your rsyncd.motd should contain your IP address and other +relevant information about your mirror, such as information about the host +providing the Portage mirror and an administrative contact. You can now test your +server as outlined in the "Setting up your own local rsync mirror" section +above.

+ +

After you have been approved as an official rsync mirror your host will be +aliased with a name of the form: rsync[num].[country +code].gentoo.org

-

-Your rsyncd.motd should contain your IP address and other relevant information -about your mirror, such as information about the host providing the Portage -mirror and an administrative contact. After you have been approved as an -official rsync mirror your host will be aliased with a name of the form: -rsync[num].[country code].gentoo.org -

+ +
+
-

-This command will help you to kill old rsync processes that sometimes lie -around due to connection problems. It's important to kill those because they -count as valid connections for the 'max connections' option. You may run this -command via crontab every hour, it will search and kill rsync processes older -than one hour. -

+ +Short FAQ +
+Q: Who should I contact regarding rsync issues and maintenance? + -
-/bin/kill -9 `/bin/ps --no-headers -Crsync -o etime,user,pid,command|/bin/grep nobody | \
-             /bin/grep "[0-9]\{2\}:[0-9]\{2\}:" |/bin/awk '{print $3}'` 
-
+

A: Visit Gentoo Bugzilla and fill +out a bug on the product "Mirrors", component "Server Problem".

-

-In some cases, there are a few inconsiderate users who abuse the rsync mirror -system by syncing more than 1-2 times per day. In the most extreme cases, -users schedule cron jobs to sync every 15 minutes or so. This often leads to a -Denial of Service attack by continually occupying an rsync slot that could have -otherwise gone to another user. To try and prevent this, you may use the -following perl -script which will scan through your rsync log files, pick out IP -addresses that have already connected more than N times that day and -dynamically create a rsyncd.conf file, including the offending IP addresses in -the 'hosts deny' directive. The following line controls what N equals: -

+ +
-
-@badhosts=grep {$hash{$_}>4} keys %hash;
-
+
+Q: How can I check the freshness of an official rsync server? + +

A: The Gentoo infrastructure team monitors all community rsync servers for +freshness. You can see the results on the corresponding web page. +

+
-

-If you use this script, please remember to rotate your rsync log files daily -and modify the script to match the location of your rsyncd.conf -file. This script is tested on Gentoo Linux, but should work suitably on other -arches that support both rsync and perl. -

+
+Q: I run a private rsync mirror for my company. Can I still access +rsync1.us.gentoo.org? + + + +

A: Because our resources are limited, we need to ensure we allocate them in +such a way as to provide the maximum amount of benefit to our users. As such, we +limit connections to our master rsync and distfile mirrors to public mirrors +only. Users are welcome to use our regular mirror system to establish a private +rsync mirror, though they are asked to follow certain basic +rsync etiquette guidelines.

-
- - -Setting up your own local rsync mirror
-Introduction +Q: Is it important that I sync my mirror twice an hour? -

-Many users run Gentoo on several machines and need to run emerge --sync -on all of them. Using public mirrors is simply a waste of bandwidth at both -ends. Syncing only one machine against a public mirror and all others against -that computer would save resources on Gentoo mirrors and save users' bandwidth. +

A: Yes, it is important. You do not need to perform the syncs at exactly :00 +and :30 but the syncs should take place in each of the following two windows:

-

-All you need to do is select which of your machines is going to be your own local -rsync mirror and set it up. You should choose a computer that can handle the -CPU and disk load that an rsync operation require. Your local mirror also needs -to be available whenever any of your other computers syncs its portage tree. -Besides, it should have a static IP address or a name that always resolves to -your server. Configuring a DHCP and/or a DNS server is beyond the scope of -this guide. +

    +
  1. :00 to :10
  2. +
  3. :30 to :40
  4. +
+ +

Additionally, please make sure that your syncs are exactly 30 minutes apart. +So, if you schedule the first sync of each hour for :08, please schedule the +second sync of the hour for :38.
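The timing rule can be turned into a crontab entry mechanically. The helper below is a sketch under stated assumptions: sync_cron_line is a made-up name, the script path is a placeholder, and the first minute must be given as a plain number from 0 to 10 so that both runs fall inside the two windows.

```shell
#!/bin/sh
# Hypothetical helper: print one crontab line that runs a sync command
# twice an hour, exactly 30 minutes apart. The first minute must be a
# plain decimal number from 0 to 10 (no leading zero).
sync_cron_line() {
    first=$1
    cmd=$2
    if [ "$first" -lt 0 ] || [ "$first" -gt 10 ]; then
        echo "first sync minute must be between 0 and 10" >&2
        return 1
    fi
    second=$((first + 30))
    printf '%s,%s * * * * %s\n' "$first" "$second" "$cmd"
}

# /path/to/your/sync-script.sh is a placeholder for your own script.
sync_cron_line 8 /path/to/your/sync-script.sh
```

For the :08 and :38 example above, this prints "8,38 * * * * /path/to/your/sync-script.sh". A first minute outside 0 to 10 is rejected because the second run would miss the :30 to :40 window.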

+
-Setting up the server +Q: Where should I sync my rsync mirror before I become an official Gentoo mirror? -

-There is no extra package to install as the required software is already on -your computer. Setting up your own local rsync mirror is just a matter of -configuring the rsyncd daemon to make your /usr/portage -directory available for syncing. Create the following -/etc/rsyncd.conf configuration file:

- -
-pid file = /var/run/rsyncd.pid
-max connections = 5
-use chroot = yes
-uid = nobody
-gid = nobody
-# Optional: restrict access to your Gentoo boxes
-hosts allow = 192.168.0.1 192.168.0.2 192.168.1.0/24
-hosts deny  = *
-
-[gentoo-portage]
-path=/usr/portage
-comment=Gentoo Portage
-exclude=distfiles/ packages/
-
- -

-You do not have to use the hosts allow and hosts deny options. By -default, all clients will be allowed to connect. The order in which you write -the options is not relevant. The server will always check the hosts -allow option first and grant the connection if the connecting host matches -any of the listed patterns. The server will then check the hosts deny -option and refuse the connection if any match is found. Any host that does not -match anything will be granted a connection. Please read the man page (man -rsyncd.conf) for more information. -

+
    +
  • For European-based rsync mirror: sync to rsync.de.gentoo.org
  • +
  • For US-based rsync mirror: sync to rsync.us.gentoo.org
  • +
  • For all others: sync to rsync.us.gentoo.org
  • +
-

-Now, start your rsync daemon with the following command as the root user: -

+ +
-
-(Start the daemon now)
-# /etc/init.d/rsyncd start
-(Add the daemon to your default runlevel)
-# rc-update add rsyncd default
-
+
+Q: How do I find the mirror nearest to me? + -

-Let's test your rsync mirror. You do not need to try from another machine but -it would be a good idea to do so. If your server is not known by name from all -your computers, you can use its IP address instead. -

+

A: netselect was designed to do this for you. If you haven't already, run +emerge netselect. Then run netselect +rsync.gentoo.org. After a minute or so, netselect will print an IP address. +Take this address and use it as the only parameter for rsync, with two colons +appended to it, e.g. rsync 1.2.3.4::. You should be able to find out which +mirror that is from the banner message. Update your /etc/make.conf +accordingly.

-
-(You may use the server name or its IP)
-# rsync 192.168.0.1::
-gentoo-portage     Gentoo Portage
-# rsync your_server_name::gentoo-portage
-(You should see the content of /usr/portage on your mirror)
-
+ +
-

-Your rsync mirror is now set up. Keep running emerge --sync as you have -done so far to keep your server up-to-date. -

+
+Q: Can I use compression when syncing against rsync1.us.gentoo.org? + - -Please note that most public mirror administrators consider syncing more than -once or twice a day an abuse. - +

A: No. Compression utilizes too many resources on the server, so we have +forcibly disabled it on rsync1.us.gentoo.org. Please do not +attempt to use compression when syncing against this server.

+
-Configuring your clients +Q: I'm seeing a lot of old and probably dead rsync processes, how can I +get rid of them?

-Now, make your other computers use your own local rsync mirror instead of a -public one. Edit your /etc/make.conf and make the SYNC -variable point to your server. +A: This command will help you to kill old rsync processes that sometimes lie +around due to connection problems. It's important to kill those because they +count as valid connections for the 'max connections' option. You may run this +command via crontab every hour; it will find and kill rsync processes older +than one hour.

-
-(Use your server IP addess)
-SYNC="rsync://192.168.0.1/gentoo-portage"
-(Or use your server name)
-SYNC="rsync://your_server_name/gentoo-portage"
+
+/bin/kill -9 `/bin/ps --no-headers -Crsync -o etime,user,pid,command|/bin/grep nobody | \
+             /bin/grep "[0-9]\{2\}:[0-9]\{2\}:" |/bin/awk '{print $3}'` 
 
+ +
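The "older than one hour" test in the command above rests on the way ps prints elapsed time, [[DD-]hh:]mm:ss: the hours field only appears once a process has run for at least an hour, which is what the grep pattern looks for. The sketch below reproduces that check on sample values; older_than_one_hour is a made-up helper name and nothing is actually killed.

```shell
#!/bin/sh
# Hypothetical helper mirroring the grep pattern in the command above.
# ps prints elapsed time as [[DD-]hh:]mm:ss, so an hours field (two
# colons) only appears once a process has been running for an hour.
older_than_one_hour() {
    case $1 in
        *[0-9][0-9]:[0-9][0-9]:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample elapsed times; no process is inspected or killed here.
for etime in 05:12 59:59 01:00:00 2-03:15:42; do
    if older_than_one_hour "$etime"; then
        echo "$etime: would be killed"
    else
        echo "$etime: left alone"
    fi
done
```

Running the sketch before enabling the cron job is a cheap way to convince yourself which processes the pattern selects.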
+
+Q: There are many users who connect to my rsync server very frequently, +sometimes even causing a DoS to my mirror, is there any way to prevent +this? + +

-You can check that your computer has been properly set up by syncing against your -own local mirror for the first time: -

+A: In some cases, there are a few inconsiderate users who abuse the rsync mirror +system by syncing more than 1-2 times per day. In the most extreme cases, users +schedule cron jobs to sync every 15 minutes or so. This often leads to a Denial +of Service attack by continually occupying an rsync slot that could have +otherwise gone to another user. To try and prevent this, you may use this perl script +which will scan through your rsync log files, pick out IP addresses that have +already connected more than N times that day and dynamically create an +rsyncd.conf file, including the offending IP addresses in the +'hosts deny' directive. The following line controls what N equals (in +this case 4):

-
-(Check that the SYNC variable has been setup)
-# emerge --info|grep SYNC
-SYNC="rsync://your_server_name/gentoo-portage"
-(Sync against your local mirror)
-# emerge --sync
+
+@badhosts=grep {$hash{$_}>4} keys %hash;
 
-

-That's it! All your computers will now use your local rsync mirror whenever you -run emerge --sync. -

+

If you use this script, please remember to rotate your rsync log files daily +and modify the script to match the location of your rsyncd.conf +file. This script is tested on Gentoo Linux, but should work suitably on other +arches that support both rsync and perl.
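The counting step governed by the threshold line can be sketched in shell as well. This is not the perl script itself: bad_hosts is a made-up name, and the sketch assumes the client IP addresses have already been extracted from the log, one per line. It does not rewrite rsyncd.conf.

```shell
#!/bin/sh
# Hypothetical stand-in for the counting step of the perl script:
# read one client IP address per line on standard input and print
# every address that occurs more than LIMIT times.
bad_hosts() {
    limit=$1
    sort | uniq -c | awk -v limit="$limit" '$1 > limit { print $2 }'
}

# Sample input: 10.0.0.1 connects five times, 10.0.0.2 twice.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.1 \
              10.0.0.2 10.0.0.1 10.0.0.1 | bad_hosts 4
```

With a limit of 4, matching the perl line above, only 10.0.0.1 is printed. The resulting list is what would end up in the 'hosts deny' directive.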

+ + +