From: Grant <emailgrant@gmail.com>
To: Gentoo mailing list <gentoo-user@lists.gentoo.org>
Subject: Re: [gentoo-user] Re: TCP Queuing problem
Date: Sat, 24 Sep 2016 17:25:11 -0700
Message-ID: <CAN0CFw0mwoD=uNNQBt1M3=CiKxFogasL9C4SvOB7=fmq8BjPqg@mail.gmail.com>
In-Reply-To: <20160922090628.73d6a3b1@jupiter.sol.kaishome.de>
>> >> I haven't mentioned it yet, but several times I've seen the website
>> >> perform fine all day until I browse to it myself and then all of a
>> >> sudden it's super slow for me and my third-party monitor. WTF???
>> >
>> > I had a similar problem once when routing through an IPsec VPN
>> > tunnel. I needed to reduce the MTU in front of the tunnel to make
>> > it work correctly. But I think your problem is different.
>>
>>
>> I'm not using IPsec or a VPN.
>>
>>
>> > Is the http server backlogged on the other side? Do you have
>> > performance graphs for other parts of the system to see them in
>> > relation? Maybe some router on the path doesn't work as expected.
>>
>>
>> I've attached a graph of http response time, CPU usage, and TCP
>> queueing over the past week. It seems clear from watching top, iotop,
>> and free that my CPU is always the bottleneck on my server.
>
> What kind of application stack is running in the http server? CPU is a
> bottleneck you cannot always circumvent by throwing more CPUs at the
> problem. Maybe that stack needs tuning...
>
> At the point when requests start queuing up in the http server, the
> load on the server will rise exponentially. It's like a traffic jam
> on a multi-lane highway. If one car brakes, things may still work. If
> a car in every lane brakes, you suddenly have a huge traffic jam
> backed up for a few miles, and it takes time to recover from that.
> You need to solve the cause of the "braking" in the first place and
> add some alternative routes for "cars that never brake" (static files
> and cacheable content). Each lane corresponds to one CPU. Adding more
> lanes when you have just 4 CPUs will only make each lane slower. The
> key is to drastically lower the response times, which are much too
> high judging by your graphs. What do memory and IO say?

It turned out this was a combination of two problems, which made it
much more difficult to figure out.

First of all, I didn't have enough apache2 processes. That seems like
it should have been obvious, but it wasn't for two reasons. Firstly,
my apache2 processes are always idle or nearly idle, even when traffic
levels are high. But each request that nginx hands off to apache2 must
monopolize an apache2 process for its entire duration, even though my
backend application server, not apache2, is the one using all the CPU.
The other thing that made it difficult to track down was the way munin
graphs apache2 processes. On my graph, busy and free processes
appeared only as tiny dots at the bottom because apache2's
ServerLimit, which is many times greater than the number of busy and
free processes, is drawn on the same graph. It would be better to draw
MaxClients instead of ServerLimit since MaxClients is more likely to
be tuned; it at least appears in the default config file on Gentoo.
Since busy and free apache2 processes were virtually invisible on the
munin graph, I wasn't able to correlate their ebb and flow with my
server's response times.
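
For reference, the knobs involved look roughly like this. This is
just a sketch of the prefork MPM directives, not my actual config;
the path is where Gentoo keeps it if I remember right, and the
numbers are only examples:

  # /etc/apache2/modules.d/00_mpm.conf (Gentoo's default location,
  # if I remember right) -- example values, not my real ones
  <IfModule mpm_prefork_module>
        StartServers            10
        MinSpareServers         10
        MaxSpareServers         20
        # ServerLimit is the hard ceiling MaxClients may be raised to
        ServerLimit             150
        # MaxClients (MaxRequestWorkers in 2.4) caps simultaneous
        # requests -- each request proxied from nginx holds one of
        # these slots for its full duration, even while the backend
        # application server does all the CPU work
        MaxClients              150
        MaxRequestsPerChild     5000
  </IfModule>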

Once I fixed the apache2 problem, I was sure I had it nailed. That's
when I emailed here a few days ago to say I thought I had it. But it
turned out there was another problem: Odoo (formerly known as
OpenERP), which is also running behind nginx in a reverse proxy
configuration. Whenever someone uses Odoo on my server, it absolutely
destroys performance for my non-Odoo website. That would have been
really easy to test, and I did try stopping the odoo service early
on, but I ruled it out because the problem persisted after stopping
Odoo, which I now realize must have been due to the apache2 problem.
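
For context, the setup is roughly the following. This is a
simplified sketch, not my exact nginx config; the hostnames and
ports are made up (8069 is just Odoo's usual default):

  # snippet from inside the http { } block of nginx.conf
  upstream apache_backend {
      server 127.0.0.1:8080;    # apache2 -> backend application
  }
  upstream odoo_backend {
      server 127.0.0.1:8069;    # Odoo (formerly OpenERP)
  }

  server {
      listen 80;
      server_name www.example.com;    # the non-Odoo website
      location / {
          proxy_pass http://apache_backend;
      }
  }

  server {
      listen 80;
      server_name odoo.example.com;   # Odoo on the same box
      location / {
          proxy_pass http://odoo_backend;
      }
  }

  # Both upstreams compete for the same CPUs, which is why heavy
  # Odoo use starves the other site.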

So this was much more difficult to figure out because I had multiple
problems interacting with each other.

- Grant