* [gentoo-user] Dealing with scrapers - Help!
@ 2008-08-09 16:08 Grant
2008-08-09 16:15 ` [gentoo-user] " Grant
From: Grant @ 2008-08-09 16:08 UTC
To: Gentoo mailing list
My apache web server has been very slow lately and webalizer charts
show page accesses at 5x normal with other stats normal. I'm thinking
scrapers? How do you guys deal with this? Do you identify the IP
(how?) and ban it (how?)?
- Grant
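One way to approach the "identify the IP (how?)" part of the question above is to count hits per client address in the access log. The sketch below assumes the log uses the common or combined log format, where the client address is the first whitespace-separated field; the function name `top_clients` and the sample lines are made up for illustration.

```python
from collections import Counter

def top_clients(log_lines, n=5):
    """Return the n busiest client addresses as (address, hits) pairs."""
    hits = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            hits[fields[0]] += 1
    return hits.most_common(n)

# Hypothetical sample lines using documentation-only addresses.
sample = [
    '203.0.113.9 - - [09/Aug/2008:16:00:01 +0000] "GET /a HTTP/1.1" 200 512',
    '203.0.113.9 - - [09/Aug/2008:16:00:02 +0000] "GET /b HTTP/1.1" 200 512',
    '198.51.100.4 - - [09/Aug/2008:16:00:03 +0000] "GET / HTTP/1.1" 200 1024',
]
print(top_clients(sample))
```

In practice the lines would come from the server's real access log (for example by passing an open file object), and an address far ahead of the rest is the first candidate to look at.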
* [gentoo-user] Re: Dealing with scrapers - Help!
2008-08-09 16:08 [gentoo-user] Dealing with scrapers - Help! Grant
@ 2008-08-09 16:15 ` Grant
2008-08-09 16:26 ` Mick
From: Grant @ 2008-08-09 16:15 UTC
To: Gentoo mailing list
> My apache web server has been very slow lately and webalizer charts
> show page accesses at 5x normal with other stats normal. I'm thinking
> scrapers? How do you guys deal with this? Do you identify the IP
> (how?) and ban it (how?)?
>
> - Grant
I used netstat to identify the IP and I see that I can use it with
"deny from" in httpd.conf. It seems to be over now, but this type of
thing happens periodically. How can I be alerted to this type of
situation when it starts so I can block the IP right away?
- Grant
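For the "how can I be alerted" question above, a rough sketch is to flag any client that exceeds a request threshold inside a sliding time window and then print the matching "Deny from" lines. Everything here is an assumption for illustration: the names `heavy_hitters` and `deny_lines`, the thresholds, and the input shape of (timestamp, address) pairs, which would have to be extracted from the access log first.

```python
from collections import defaultdict, deque

def heavy_hitters(events, window_seconds=60, threshold=100):
    """Return the clients with more than `threshold` requests inside any
    span of `window_seconds`.

    `events` is an iterable of (unix_timestamp, client_ip) pairs in
    chronological order.
    """
    recent = defaultdict(deque)  # client_ip -> timestamps still in the window
    flagged = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        if len(window) > threshold:
            flagged.add(ip)
    return flagged

def deny_lines(ips):
    """Render one "Deny from" line per address, sorted for stable output."""
    return ["Deny from %s" % ip for ip in sorted(ips)]

# Hypothetical data: one client sends ten requests in ten seconds.
events = sorted([(t, "203.0.113.9") for t in range(10)] + [(5, "198.51.100.4")])
for line in deny_lines(heavy_hitters(events, window_seconds=60, threshold=5)):
    print(line)
```

A script along these lines could be run periodically (from cron, say) and its output mailed to the admin; whether to apply the generated lines automatically is a separate judgment call, since a busy but legitimate client would be flagged too.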
* Re: [gentoo-user] Re: Dealing with scrapers - Help!
2008-08-09 16:15 ` [gentoo-user] " Grant
@ 2008-08-09 16:26 ` Mick
2008-08-09 16:48 ` Grant
From: Mick @ 2008-08-09 16:26 UTC
To: gentoo-user
On Saturday 09 August 2008, Grant wrote:
> > My apache web server has been very slow lately and webalizer charts
> > show page accesses at 5x normal with other stats normal. I'm thinking
> > scrapers? How do you guys deal with this? Do you identify the IP
> > (how?) and ban it (how?)?
> >
> > - Grant
>
> I used netstat to identify the IP and I see that I can use it with
> "deny from" in httpd.conf. It seems to be over now, but this type of
> thing happens periodically. How can I be alerted to this type of
> situation when it starts so I can block the IP right away?
You will probably need to configure quotas, using something like:
http://www.howtoforge.com/mod_cband_apache2_bandwidth_quota_throttling
Not sure if it is possible to differentiate between rogue and legit clients,
other than by checking your logs to see what was blocked.
HTH.
--
Regards,
Mick
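To make the quota idea from Mick's reply concrete, here is a minimal sketch of a per-client request quota over fixed time windows. It only illustrates the concept; it is not how mod_cband is implemented or configured, and the class name `FixedWindowQuota` and its parameters are invented for this example.

```python
class FixedWindowQuota:
    """Allow each client at most `limit` requests per `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = {}  # client -> (window_index, requests_seen)

    def allow(self, client, now):
        """Return True if the request at time `now` (seconds) fits the quota."""
        window = int(now // self.window_seconds)
        seen_window, seen = self._counts.get(client, (window, 0))
        if seen_window != window:
            seen = 0  # a new window started; reset the counter
        if seen >= self.limit:
            self._counts[client] = (window, seen)
            return False
        self._counts[client] = (window, seen + 1)
        return True

quota = FixedWindowQuota(limit=3, window_seconds=60)
results = [quota.allow("203.0.113.9", t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```

As Mick notes, a quota like this cannot tell rogue clients from legitimate ones; it only caps what any single client can consume.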
* Re: [gentoo-user] Re: Dealing with scrapers - Help!
2008-08-09 16:26 ` Mick
@ 2008-08-09 16:48 ` Grant
From: Grant @ 2008-08-09 16:48 UTC
To: gentoo-user
>> > My apache web server has been very slow lately and webalizer charts
>> > show page accesses at 5x normal with other stats normal. I'm thinking
>> > scrapers? How do you guys deal with this? Do you identify the IP
>> > (how?) and ban it (how?)?
>> >
>> > - Grant
>>
>> I used netstat to identify the IP and I see that I can use it with
>> "deny from" in httpd.conf. It seems to be over now, but this type of
>> thing happens periodically. How can I be alerted to this type of
>> situation when it starts so I can block the IP right away?
>
> You will probably need to configure quotas, using something like:
>
> http://www.howtoforge.com/mod_cband_apache2_bandwidth_quota_throttling
>
> Not sure if it is possible to differentiate between rogue and legit clients,
> other than by checking your logs to see what was blocked.
Turns out it was a "legit" bot. Watch out for this one:
Mozilla/5.0 (compatible; discobot/1.0; +http://discoveryengine.com/discobot.html)
It's bad that a single IP can bring down my HTTP server, isn't it?
- Grant
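Since the culprit here was identified by its User-Agent string, the logs can also be searched by agent rather than by address. The sketch below assumes the combined log format, where the User-Agent is the last double-quoted field on each line; the helper names and the sample lines are hypothetical, and agents containing escaped quotes are simply skipped.

```python
import re
from collections import Counter

# In the combined log format the User-Agent is the last double-quoted field.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')

def user_agent(line):
    """Return the User-Agent of a combined-format log line, or None."""
    match = _LAST_QUOTED.search(line)
    return match.group(1) if match else None

def hits_by_agent(log_lines, needle="discobot"):
    """Count hits per client address for agents containing `needle`."""
    hits = Counter()
    for line in log_lines:
        agent = user_agent(line)
        if agent and needle.lower() in agent.lower():
            hits[line.split(None, 1)[0]] += 1
    return hits

BOT = ("Mozilla/5.0 (compatible; discobot/1.0; "
       "+http://discoveryengine.com/discobot.html)")
# Hypothetical sample lines using documentation-only addresses.
sample = [
    '203.0.113.9 - - [09/Aug/2008:16:00:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "%s"' % BOT,
    '203.0.113.9 - - [09/Aug/2008:16:00:02 +0000] "GET /b HTTP/1.1" 200 512 "-" "%s"' % BOT,
    '198.51.100.4 - - [09/Aug/2008:16:00:03 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]
print(hits_by_agent(sample))
```

Matching on the agent catches a bot that crawls from several addresses, but the header is trivially spoofed, so it only helps against crawlers that identify themselves honestly.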