From: Alan McKinnon
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] is a nice "place" :-D
Date: Tue, 17 May 2011 01:33:39 +0200
Message-Id: <201105170133.39864.alan.mckinnon@gmail.com>
In-Reply-To: <4DD1AEC8.5010501@earthlink.net>

Apparently, though unproven, at 01:10 on Tuesday 17 May 2011, Felix Miata did
opine thusly:

> After attempting to install for the first time last week, I started 3
> different threads here looking for help. I'm pleased with the nature of the
> responses, and being able to succeed eventually using a mix of those
> responses and my own efforts digging into Google, gentoo.org and cranial
> cobwebs. So, thanks to all who replied, and even to those who showed
> interest without replying.
>
> For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
> threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
> -v | sort > outfile' generated 117 lines. That's a lot more hits
> than I can ever remember getting before when asking for help from a
> mailing list (even if it did take 5 days to accumulate so many).
>
> I'm curious if anyone here would like to offer a better variant of my local
> query that would limit the hit count so that no more than one hit per IP is
> represented in the output? My skill with such things is very limited. I
> can't think of the name of a command to cut the IP off the front of
> each line, much less how to compare if it's a non-first instance to be
> discarded. Or, maybe there's an Apache utility for doing this that I just
> don't know about?

There's always a million ways to skin a cat like this.
At a high volume site you would of course not try to deal with this directly
from the Apache logs. You would send them to syslog, which would parse them and
write them to a database from which you could run sophisticated SQL. There are
also Apache log analyser apps out there; Google will find them.

But I think all that is overkill for what you want. Your command works fine
except for needing to discard duplicate IPs. You don't seem to need to know the
details of the GET, so just grab the first field using awk and sort | uniq the
result. It will run a tad quicker (and reveal less n00bness to your audience)
if you grep the file directly instead of cat | grep:

grep "GET /Tmp/Linux/G" /var/log/apache2/access_log | grep -v | \
    awk '{print $1}' | sort | uniq | wc

In true grand Unix tradition you cannot get quicker, dirtier or more effective
than that

-- 
alan dot mckinnon at gmail dot com
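
For anyone who wants the grep | awk | sort | uniq idea in reusable form, here is a minimal sketch. The function names unique_ips and count_unique_ips, the optional third argument for skipping one address, and the assumption that the client IP is the first whitespace-separated field of each log line are all illustrative and not taken from the thread. It uses sort -u in place of sort | uniq, which should give the same result here.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original thread.

# unique_ips LOGFILE PREFIX [SKIP_IP]
# Print each distinct client IP (assumed to be field 1) that appears on a
# line containing PREFIX, optionally leaving out one address (SKIP_IP).
unique_ips() {
    # grep -F treats PREFIX as a fixed string, not a regular expression.
    # awk drops lines whose first field equals SKIP_IP (empty means keep all).
    # sort -u sorts the addresses and collapses duplicates in one step.
    grep -F -- "$2" "$1" |
        awk -v skip="${3:-}" '$1 != skip { print $1 }' |
        sort -u
}

# count_unique_ips LOGFILE PREFIX [SKIP_IP]
# Print only the number of distinct addresses.
count_unique_ips() {
    # tr strips the padding that some wc implementations add.
    unique_ips "$@" | wc -l | tr -d ' '
}
```

Something like count_unique_ips /var/log/apache2/access_log "GET /Tmp/Linux/G" would then print a single number, and passing your own address as a third argument leaves your own hits out of the count.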