From: Alan McKinnon <alan.mckinnon@gmail.com>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] is a nice "place" :-D
Date: Tue, 17 May 2011 01:33:39 +0200
Message-ID: <201105170133.39864.alan.mckinnon@gmail.com>
In-Reply-To: <4DD1AEC8.5010501@earthlink.net>
Apparently, though unproven, at 01:10 on Tuesday 17 May 2011, Felix Miata did
opine thusly:
> After attempting to install for the first time last week, I started 3
> different threads here looking for help. I'm pleased with the nature of the
> responses, and being able to succeed eventually using a mix of those
> responses and my own efforts digging into Google, gentoo.org and cranial
> cobwebs. So, thanks to all who replied, and even to those who showed
> interest without replying.
>
> For http://fm.no-ip.com/Tmp/Linux/G/, newly created to use with those three
> threads, 'cat /var/log/apache2/access_log | grep "GET /Tmp/Linux/G" | grep
> -v <myip> | sort > outfile' generated 117 lines. That's a lot more hits
> than I can ever remember getting before when asking for help from a
> mailing list (even if it did take 5 days to accumulate so many).
>
> I'm curious if anyone here would like to offer a better variant of my local
> query that would limit the hit count so that no more than one hit per IP is
> represented in the output? My skill with such things is very limited. I
> can't think of the name of a command to cut the IP off the front of
> each line, much less how to compare if it's a non-first instance to be
> discarded. Or, maybe there's an Apache utility for doing this that I just
> don't know about?
There's always a million ways to skin a cat like this. At a high volume site
you would of course not try and deal with this directly from the apache logs.
You would send them to syslog which would parse them and write them to a
database from where you could run sophisticated SQL.
There are also Apache log analyser apps out there; Google will find them.
But I think all that is overkill for what you want. Your command works fine
except for needing to discard duplicate IPs. You don't seem to need to know
the details of the GET, so just grab the first field with awk and sort | uniq
the result. It will run a tad quicker (and reveal less n00bness to your
audience) if you grep the file directly instead of cat | grep:
grep "GET /Tmp/Linux/G" /var/log/apache2/access_log | grep -v <myip> | \
awk '{print $1}' | sort | uniq | wc
In true grand Unix tradition you cannot get quicker, dirtier or more effective
than that.
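For anyone who wants to try the idea without touching a real log, here is a
minimal sketch that runs the same kind of pipeline against a few made-up
lines. The sample entries, the temp file and the myip value are all
assumptions for illustration, not anything from Felix's server. It also swaps
two details: the own-IP filter is done as an exact comparison on the first
field inside awk rather than with grep -v, and sort -u stands in for
sort | uniq.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /var/log/apache2/access_log.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [16/May/2011:10:00:00 +0200] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
192.0.2.10 - - [16/May/2011:10:00:05 +0200] "GET /Tmp/Linux/G/a.txt HTTP/1.1" 200 1024
198.51.100.7 - - [16/May/2011:11:30:00 +0200] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
203.0.113.5 - - [16/May/2011:12:00:00 +0200] "GET /Tmp/Linux/G/ HTTP/1.1" 200 512
203.0.113.99 - - [16/May/2011:12:05:00 +0200] "GET /index.html HTTP/1.1" 200 2048
EOF

# Placeholder for your own address (an assumption for this sketch).
myip=203.0.113.5

# 1. keep only requests under /Tmp/Linux/G
# 2. drop lines whose first field is exactly our own IP, print the IP only
# 3. collapse duplicates so each remaining IP appears once
unique_ips=$(grep "GET /Tmp/Linux/G" "$log" \
    | awk -v ip="$myip" '$1 != ip {print $1}' \
    | sort -u)

# Count the surviving lines; tr strips the padding some wc versions add.
count=$(printf '%s\n' "$unique_ips" | wc -l | tr -d ' ')

printf '%s\n' "$unique_ips"
echo "unique visitors: $count"

rm -f "$log"
```

Comparing the first field in awk avoids accidentally dropping lines where
your IP merely appears somewhere else in the entry, which a bare grep -v
could do.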
--
alan dot mckinnon at gmail dot com