public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] strange TCP timeout errors
@ 2015-10-05 17:35 Grant
  2015-10-05 22:57 ` Bill Kenworthy
  0 siblings, 1 reply; 10+ messages in thread
From: Grant @ 2015-10-05 17:35 UTC
  To: Gentoo mailing list

[-- Attachment #1: Type: text/plain, Size: 351 bytes --]

I've attached a PNG from Munin showing the TCP timeout errors on my
Gentoo server over the past month.  The data is expressed in timeouts
per second and that rate is shown to be steadily increasing over the
past month.  That seems strange to me.  Munin doesn't show any other
data point increasing like this over the time period.  Any ideas?

- Grant

[-- Attachment #2: tcp-timeouts.png --]
[-- Type: image/png, Size: 61019 bytes --]


* Re: [gentoo-user] strange TCP timeout errors
  2015-10-05 17:35 [gentoo-user] strange TCP timeout errors Grant
@ 2015-10-05 22:57 ` Bill Kenworthy
  2015-10-05 23:26   ` Alan McKinnon
  0 siblings, 1 reply; 10+ messages in thread
From: Bill Kenworthy @ 2015-10-05 22:57 UTC
  To: gentoo-user

On 06/10/15 01:35, Grant wrote:
> I've attached a PNG from Munin showing the TCP timeout errors on my
> Gentoo server over the past month.  The data is expressed in timeouts
> per second and that rate is shown to be steadily increasing over the
> past month.  That seems strange to me.  Munin doesn't show any other
> data point increasing like this over the time period.  Any ideas?
> 
> - Grant
> 

weird - does it reset on an interface restart or reboot?

Can you verify it's not an artefact within munin (how?)


BillK





* Re: [gentoo-user] strange TCP timeout errors
  2015-10-05 22:57 ` Bill Kenworthy
@ 2015-10-05 23:26   ` Alan McKinnon
  2015-10-07 12:58     ` Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Alan McKinnon @ 2015-10-05 23:26 UTC
  To: gentoo-user

On 06/10/2015 00:57, Bill Kenworthy wrote:
> On 06/10/15 01:35, Grant wrote:
>> I've attached a PNG from Munin showing the TCP timeout errors on my
>> Gentoo server over the past month.  The data is expressed in timeouts
>> per second and that rate is shown to be steadily increasing over the
>> past month.  That seems strange to me.  Munin doesn't show any other
>> data point increasing like this over the time period.  Any ideas?
>>
>> - Grant
>>
> 
> weird - does it reset on an interface restart or reboot?

this would be my test #1

> Can you verify it's not an artefact within munin (how?)

In theory, a misconfigured graph can do this. Munin can draw many
different types of graph, including cumulative values. Even for a data
type like this which is X events per unit time, if you tell munin to add
them all up, it will do so and graph it.

Quick test is to look at the graph config.
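
One way to look at it is to read what the plugin prints when it is called
with "config" (on a node, "munin-run <plugin> config" should show this).
Below is a minimal Python sketch that pulls the category and the per-field
data source type out of such output. The sample text, the field name
"timeouts" and the function name are made up for illustration; they are
not the real output of any particular plugin.

```python
# Hypothetical sample of what a munin plugin prints for "config".
# The field name and values are invented for illustration only.
SAMPLE_CONFIG = """\
graph_title TCP timeouts
graph_category network
graph_vlabel timeouts per second
timeouts.label timeouts
timeouts.type DERIVE
timeouts.min 0
"""


def summarize_config(text):
    """Return (graph attributes, per-field data source types)."""
    graph_attrs = {}
    field_types = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if not key or not value:
            continue  # skip blank or malformed lines
        if key.startswith("graph_"):
            graph_attrs[key] = value
        elif key.endswith(".type"):
            field_types[key[: -len(".type")]] = value
    return graph_attrs, field_types


if __name__ == "__main__":
    attrs, types = summarize_config(SAMPLE_CONFIG)
    print("category:", attrs.get("graph_category"))
    for field, ds_type in types.items():
        print(field, "->", ds_type)
```

The ".type" line is the one that decides whether the stored value is a
raw reading or a rate.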


-- 
Alan McKinnon
alan.mckinnon@gmail.com




* Re: [gentoo-user] strange TCP timeout errors
  2015-10-05 23:26   ` Alan McKinnon
@ 2015-10-07 12:58     ` Grant
  2015-10-07 14:22       ` Alan McKinnon
  0 siblings, 1 reply; 10+ messages in thread
From: Grant @ 2015-10-07 12:58 UTC
  To: Gentoo mailing list

>>> I've attached a PNG from Munin showing the TCP timeout errors on my
>>> Gentoo server over the past month.  The data is expressed in timeouts
>>> per second and that rate is shown to be steadily increasing over the
>>> past month.  That seems strange to me.  Munin doesn't show any other
>>> data point increasing like this over the time period.  Any ideas?
>>>
>>> - Grant
>>>
>>
>> weird - does it reset on an interface restart or reboot?
>
> this would be my test #1


I rebooted and the rate of errors has dropped off to almost nothing.


>> Can you verify it's not an artefact within munin (how?)
>
> In theory, a misconfigured graph can do this. Munin can draw many
> different types of graph, including cumulative values. Even for a data
> type like this which is X events per unit time, if you tell munin to add
> them all up, it will do so and graph it.
>
> Quick test is to look at the graph config.


This graph lives in the "network" section of the munin web interface.
There is no matching section in /etc/munin/plugin-conf.d/munin-node so
it should be using the default config.

Any ideas based on this new info?

- Grant



* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 12:58     ` Grant
@ 2015-10-07 14:22       ` Alan McKinnon
  2015-10-07 15:55         ` Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Alan McKinnon @ 2015-10-07 14:22 UTC
  To: gentoo-user

On 07/10/2015 14:58, Grant wrote:
>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
>>>> Gentoo server over the past month.  The data is expressed in timeouts
>>>> per second and that rate is shown to be steadily increasing over the
>>>> past month.  That seems strange to me.  Munin doesn't show any other
>>>> data point increasing like this over the time period.  Any ideas?
>>>>
>>>> - Grant
>>>>
>>>
>>> weird - does it reset on an interface restart or reboot?
>>
>> this would be my test #1
> 
> 
> I rebooted and the rate of errors has dropped off to almost nothing.
> 
> 
>>> Can you verify it's not an artefact within munin (how?)
>>
>> In theory, a misconfigured graph can do this. Munin can draw many
>> different types of graph, including cumulative values. Even for a data
>> type like this which is X events per unit time, if you tell munin to add
>> them all up, it will do so and graph it.
>>
>> Quick test is to look at the graph config.
> 
> 
> This graph lives in the "network" section of the munin web interface.
> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
> it should be using the default config.
> 
> Any ideas based on this new info?

A few :-)


I can't find the plugin that delivers that graph though. Maybe I just
don't have it, maybe it comes from contrib/

What's your USE for munin?
What do you have in "ls -al /etc/munin/plugins/"  ?


-- 
Alan McKinnon
alan.mckinnon@gmail.com




* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 14:22       ` Alan McKinnon
@ 2015-10-07 15:55         ` Grant
  2015-10-07 18:39           ` Alan McKinnon
  0 siblings, 1 reply; 10+ messages in thread
From: Grant @ 2015-10-07 15:55 UTC
  To: Gentoo mailing list

>>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
>>>>> Gentoo server over the past month.  The data is expressed in timeouts
>>>>> per second and that rate is shown to be steadily increasing over the
>>>>> past month.  That seems strange to me.  Munin doesn't show any other
>>>>> data point increasing like this over the time period.  Any ideas?
>>>>>
>>>>> - Grant
>>>>>
>>>>
>>>> weird - does it reset on an interface restart or reboot?
>>>
>>> this would be my test #1
>>
>>
>> I rebooted and the rate of errors has dropped off to almost nothing.
>>
>>
>>>> Can you verify it's not an artefact within munin (how?)
>>>
>>> In theory, a misconfigured graph can do this. Munin can draw many
>>> different types of graph, including cumulative values. Even for a data
>>> type like this which is X events per unit time, if you tell munin to add
>>> them all up, it will do so and graph it.
>>>
>>> Quick test is to look at the graph config.
>>
>>
>> This graph lives in the "network" section of the munin web interface.
>> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
>> it should be using the default config.
>>
>> Any ideas based on this new info?
>
> A few :-)
>
>
> I can't find the plugin that delivers that graph though. Maybe I just
> don't have it, maybe it comes from contrib/
>
> What's your USE for munin?


USE="apache cgi http mysql ssl syslog -asterisk -dhcpd -doc -ipmi
-ipv6 -irc -java -memcached -minimal -postgres (-selinux) {-test}"


> What do you have in "ls -al /etc/munin/plugins/"  ?


# ls -al /etc/munin/plugins/
total 8
drwxr-xr-x 2 munin munin 4096 Aug 26 13:22 .
drwxr-xr-x 7 root  root  4096 Aug 27 08:42 ..
-rw-r--r-- 1 root  root     0 Aug 23 18:10 .keep_net-analyzer_munin-0
lrwxrwxrwx 1 root  root    42 Jun 16  2013 apache_accesses -> /usr/libexec/munin/plugins/apache_accesses
lrwxrwxrwx 1 root  root    43 Jun 16  2013 apache_processes -> /usr/libexec/munin/plugins/apache_processes
lrwxrwxrwx 1 root  root    40 Jun 16  2013 apache_volume -> /usr/libexec/munin/plugins/apache_volume
lrwxrwxrwx 1 root  root    30 Jun 16  2013 cpu -> /usr/libexec/munin/plugins/cpu
lrwxrwxrwx 1 root  root    29 Jun 16  2013 df -> /usr/libexec/munin/plugins/df
lrwxrwxrwx 1 root  root    35 Jun 16  2013 df_inode -> /usr/libexec/munin/plugins/df_inode
lrwxrwxrwx 1 root  root    36 Jun 21  2013 diskstat_ -> /usr/libexec/munin/plugins/diskstat_
lrwxrwxrwx 1 root  root    36 Jun 16  2013 diskstats -> /usr/libexec/munin/plugins/diskstats
lrwxrwxrwx 1 root  root    34 Jun 16  2013 entropy -> /usr/libexec/munin/plugins/entropy
lrwxrwxrwx 1 root  root    32 Jun 16  2013 forks -> /usr/libexec/munin/plugins/forks
lrwxrwxrwx 1 root  root    34 Jun 18  2013 hddtemp -> /usr/libexec/munin/plugins/hddtemp
lrwxrwxrwx 1 root  root    35 Jun 18  2013 hddtemp2 -> /usr/libexec/munin/plugins/hddtemp2
lrwxrwxrwx 1 root  root    43 Jun 18  2013 hddtemp_smartctl -> /usr/libexec/munin/plugins/hddtemp_smartctl
lrwxrwxrwx 1 root  root    35 Jun 18  2013 hddtempd -> /usr/libexec/munin/plugins/hddtempd
lrwxrwxrwx 1 root  root    30 Jun 21  2013 if_enp2s2f0 -> /usr/libexec/munin/plugins/if_
lrwxrwxrwx 1 root  root    34 Jun 21  2013 if_err_enp2s2f0 -> /usr/libexec/munin/plugins/if_err_
lrwxrwxrwx 1 root  root    37 Jun 16  2013 interrupts -> /usr/libexec/munin/plugins/interrupts
lrwxrwxrwx 1 root  root    35 Jun 16  2013 irqstats -> /usr/libexec/munin/plugins/irqstats
lrwxrwxrwx 1 root  root    31 Jun 16  2013 load -> /usr/libexec/munin/plugins/load
lrwxrwxrwx 1 root  root    33 Jun 16  2013 lpstat -> /usr/libexec/munin/plugins/lpstat
lrwxrwxrwx 1 root  root    34 Jun 18  2013 meminfo -> /usr/libexec/munin/plugins/meminfo
lrwxrwxrwx 1 root  root    33 Jun 16  2013 memory -> /usr/libexec/munin/plugins/memory
lrwxrwxrwx 1 root  root    38 Jun 16  2013 munin_stats -> /usr/libexec/munin/plugins/munin_stats
lrwxrwxrwx 1 root  root    39 Jun 18  2013 munin_update -> /usr/libexec/munin/plugins/munin_update
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_bin_relay_log -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_commands -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_connections -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_files_tables -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_bpool -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_bpool_act -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_insert_buf -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_io -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_io_pend -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_log -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_rows -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_semaphores -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_innodb_tnx -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_myisam_indexes -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_network_traffic -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_qcache -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_qcache_mem -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_replication -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_select_types -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_slow -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_sorts -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_table_locks -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    33 Jun 21  2013 mysql_tmp_tables -> /usr/libexec/munin/plugins/mysql_
lrwxrwxrwx 1 root  root    34 Jun 16  2013 netstat -> /usr/libexec/munin/plugins/netstat
lrwxrwxrwx 1 root  root    40 Jun 18  2013 netstat_multi -> /usr/libexec/munin/plugins/netstat_multi
lrwxrwxrwx 1 root  root    40 Jun 16  2013 nginx_request -> /usr/libexec/munin/plugins/nginx_request
lrwxrwxrwx 1 root  root    39 Jun 16  2013 nginx_status -> /usr/libexec/munin/plugins/nginx_status
lrwxrwxrwx 1 root  root    37 Jun 16  2013 open_files -> /usr/libexec/munin/plugins/open_files
lrwxrwxrwx 1 root  root    38 Jun 16  2013 open_inodes -> /usr/libexec/munin/plugins/open_inodes
lrwxrwxrwx 1 root  root    44 Jun 16  2013 postfix_mailqueue -> /usr/libexec/munin/plugins/postfix_mailqueue
lrwxrwxrwx 1 root  root    44 Jun 16  2013 postfix_mailstats -> /usr/libexec/munin/plugins/postfix_mailstats
lrwxrwxrwx 1 root  root    45 Jun 16  2013 postfix_mailvolume -> /usr/libexec/munin/plugins/postfix_mailvolume
lrwxrwxrwx 1 root  root    31 Jun 18  2013 proc -> /usr/libexec/munin/plugins/proc
lrwxrwxrwx 1 root  root    35 Jun 16  2013 proc_pri -> /usr/libexec/munin/plugins/proc_pri
lrwxrwxrwx 1 root  root    36 Jun 16  2013 processes -> /usr/libexec/munin/plugins/processes
lrwxrwxrwx 1 root  root    35 Jun 18  2013 sensors_ -> /usr/libexec/munin/plugins/sensors_
lrwxrwxrwx 1 root  root    31 Jun 16  2013 swap -> /usr/libexec/munin/plugins/swap
lrwxrwxrwx 1 root  root    34 Jun 16  2013 threads -> /usr/libexec/munin/plugins/threads
lrwxrwxrwx 1 root  root    33 Jun 16  2013 uptime -> /usr/libexec/munin/plugins/uptime
lrwxrwxrwx 1 root  root    32 Jun 16  2013 users -> /usr/libexec/munin/plugins/users


So I don't have a "network" plugin either but I do have a "network"
section under Categories in the munin web interface.

- Grant

P.S. Any other good plugins you'd recommend?



* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 15:55         ` Grant
@ 2015-10-07 18:39           ` Alan McKinnon
  2015-10-07 19:42             ` brettrsears
  2015-10-09 14:15             ` Grant
  0 siblings, 2 replies; 10+ messages in thread
From: Alan McKinnon @ 2015-10-07 18:39 UTC
  To: gentoo-user

On 07/10/2015 17:55, Grant wrote:
>>>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
>>>>>> Gentoo server over the past month.  The data is expressed in timeouts
>>>>>> per second and that rate is shown to be steadily increasing over the
>>>>>> past month.  That seems strange to me.  Munin doesn't show any other
>>>>>> data point increasing like this over the time period.  Any ideas?
>>>>>>
>>>>>> - Grant
>>>>>>
>>>>>
>>>>> weird - does it reset on an interface restart or reboot?
>>>>
>>>> this would be my test #1
>>>
>>>
>>> I rebooted and the rate of errors has dropped off to almost nothing.
>>>
>>>
>>>>> Can you verify it's not an artefact within munin (how?)
>>>>
>>>> In theory, a misconfigured graph can do this. Munin can draw many
>>>> different types of graph, including cumulative values. Even for a data
>>>> type like this which is X events per unit time, if you tell munin to add
>>>> them all up, it will do so and graph it.
>>>>
>>>> Quick test is to look at the graph config.
>>>
>>>
>>> This graph lives in the "network" section of the munin web interface.
>>> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
>>> it should be using the default config.
>>>
>>> Any ideas based on this new info?
>>
>> A few :-)
>>
>>
>> I can't find the plugin that delivers that graph though. Maybe I just
>> don't have it, maybe it comes from contrib/
>>
>> What's your USE for munin?
> 
> 
> USE="apache cgi http mysql ssl syslog -asterisk -dhcpd -doc -ipmi
> -ipv6 -irc -java -memcached -minimal -postgres (-selinux) {-test}"
> 
> 
>> What do you have in "ls -al /etc/munin/plugins/"  ?


It's as I thought - your data is accurate but rrd has been given a
completely wrong method to derive the graphs.

Munin graphs for section "Network" do not have to be in a file called
"network" - it's just a category and the plugin defines what web-page
section it must be in. In your case, the relevant plugin is
netstat_multi which doesn't often get installed. Its data source is
"netstat -s" so grep that output for "timeout" to see it.
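
If you would rather have the numbers than eyeball grep output, here is a
rough Python sketch that does the same filtering and pulls out the leading
count. The sample text is illustrative only: made-up numbers in the
general shape of Linux "netstat -s" output, not data from the server in
this thread. The function name is hypothetical.

```python
import re

# Illustrative sample only; numbers are invented.
SAMPLE_NETSTAT = """\
Tcp:
    1200 active connection openings
    15 failed connection attempts
TcpExt:
    42 timeouts after SACK recovery
    7 timeouts in loss state
    310 other TCP timeouts
"""


def timeout_counters(text):
    """Map each 'timeout' line's description to its cumulative count."""
    counters = {}
    for line in text.splitlines():
        # Lines of interest look like "<count> <description with 'timeout'>".
        match = re.match(r"\s*(\d+)\s+(.*timeout.*)$", line, re.IGNORECASE)
        if match:
            counters[match.group(2).strip()] = int(match.group(1))
    return counters


if __name__ == "__main__":
    for label, count in timeout_counters(SAMPLE_NETSTAT).items():
        print(f"{count:>8}  {label}")
```

To run it against a live box, feed it the text of "netstat -s" instead of
the sample, for example via subprocess.run(["netstat", "-s"],
capture_output=True, text=True).stdout.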

Timeouts are cumulative counters, they do not get less till they wrap
around. So to scale them, the plugin gets the rrd file to subtract
previous reading from current reading and divide by the time interval to
get the timeouts/sec. This is all done inside rrd when the data files
are updated (it's quite a lot of magic)
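
The arithmetic itself is simple. A minimal Python sketch of the
subtract-and-divide step, assuming readings taken 300 seconds apart (the
usual munin polling interval) and using made-up counter values:

```python
def per_second_rates(samples):
    """Turn cumulative readings into per-second rates.

    samples: list of (unix_time, cumulative_count), oldest first.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        interval = t1 - t0
        if interval <= 0:
            continue  # ignore out-of-order or duplicate timestamps
        rates.append((c1 - c0) / interval)
    return rates


# Made-up readings: the counter only ever grows, the rate does not.
SAMPLES = [(0, 1000), (300, 1030), (600, 1090), (900, 1090)]

if __name__ == "__main__":
    print(per_second_rates(SAMPLES))
```

Note that the cumulative count keeps climbing in every reading, while the
derived rate goes up and down with the actual activity in each interval.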

That plugin sets the graph type to DERIVE
(/etc/munin/plugins/netstat_multi, around line 190). I feel it should be
GAUGE or COUNTER.
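
For comparison, here is a simplified Python model of what the three data
source types do with two successive readings, going by my reading of the
rrdcreate page linked below. It is a sketch under that assumption, not
the actual rrdtool code, and the function name and numbers are made up.

```python
def stored_value(ds_type, prev, cur, interval, minimum=None):
    """Simplified model of the value rrdtool derives from two readings.

    Returns None where rrdtool would store "unknown".
    """
    if ds_type == "GAUGE":
        value = float(cur)  # reading stored as-is, no rate calculation
    elif ds_type in ("COUNTER", "DERIVE"):
        delta = cur - prev
        if ds_type == "COUNTER" and delta < 0:
            # COUNTER treats a decrease as a 32-bit or 64-bit wrap.
            delta += 2**32
            if delta < 0:
                delta += 2**64 - 2**32
        value = delta / interval  # DERIVE may yield a negative rate
    else:
        raise ValueError(f"unsupported type: {ds_type}")
    if minimum is not None and value < minimum:
        return None  # outside the allowed range
    return value


if __name__ == "__main__":
    for ds_type in ("GAUGE", "COUNTER", "DERIVE"):
        normal = stored_value(ds_type, 5000, 5030, 300)
        after_reset = stored_value(ds_type, 5030, 12, 300)
        print(ds_type, normal, after_reset)
```

In this model GAUGE plots the running total itself, COUNTER and DERIVE
both plot a rate, and the two differ only in how they react when the
counter drops, as it does after a reboot.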

The proper reference on rrd is
http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
and the munin docs are
https://munin.readthedocs.org/en/latest/index.html

You must edit the plugin file and IIRC recreate the rrd, you will lose
all past info (can't be helped).


[snip ls output]


> P.S. Any other good plugins you'd recommend?

http://gallery.munin-monitoring.org/

Monitoring is highly site-specific so recommendations aren't usually
worth much, but that gallery has LOTS of contributed plugins

-- 
Alan McKinnon
alan.mckinnon@gmail.com




* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 18:39           ` Alan McKinnon
@ 2015-10-07 19:42             ` brettrsears
  2015-10-07 22:25               ` Alan McKinnon
  2015-10-09 14:15             ` Grant
  1 sibling, 1 reply; 10+ messages in thread
From: brettrsears @ 2015-10-07 19:42 UTC
  To: gentoo-user

YyyyYYuIIIIIU
Sent from my Verizon Wireless BlackBerry



* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 19:42             ` brettrsears
@ 2015-10-07 22:25               ` Alan McKinnon
  0 siblings, 0 replies; 10+ messages in thread
From: Alan McKinnon @ 2015-10-07 22:25 UTC
  To: gentoo-user

On 07/10/2015 21:42, brettrsears@gmail.com wrote:
> YyyyYYuIIIIIU
> Sent from my Verizon Wireless BlackBerry


Hmmmmmmmmmmmmmm, interesting reply. I'm wondering if it has something to
do with:

1. verizon
2. dodgy 3g
3. crapberry. oops, sorry: blackberry

Or maybe it's because y, u and i are in a row on the keyboard, shift and
enter are adjacent, and you have an over-friendly cat?

:-)



-- 
Alan McKinnon
alan.mckinnon@gmail.com




* Re: [gentoo-user] strange TCP timeout errors
  2015-10-07 18:39           ` Alan McKinnon
  2015-10-07 19:42             ` brettrsears
@ 2015-10-09 14:15             ` Grant
  1 sibling, 0 replies; 10+ messages in thread
From: Grant @ 2015-10-09 14:15 UTC
  To: Gentoo mailing list

> It's as I thought - your data is accurate but rrd has been given a
> completely wrong method to derive the graphs.
>
> Munin graphs for section "Network" do not have to be in a file called
> "network" - it's just a category and the plugin defines what web-page
> section it must be in. In your case, the relevant plugin is
> netstat_multi which doesn't often get installed. Its data source is
> "netstat -s" so grep that output for "timeout" to see it.
>
> Timeouts are cumulative counters, they do not get less till they wrap
> around. So to scale them, the plugin gets the rrd file to subtract
> previous reading from current reading and divide by the time interval to
> get the timeouts/sec. This is all done inside rrd when the data files
> are updated (it's quite a lot of magic)
>
> That plugin sets the graph type to DERIVE
> (/etc/munin/plugins/netstat_multi, around line 190). I feel it should be
> GAUGE or COUNTER.
>
> The proper reference on rrd is
> http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
> and the munin docs are
> https://munin.readthedocs.org/en/latest/index.html
>
> You must edit the plugin file and IIRC recreate the rrd, you will lose
> all past info (can't be helped).
>
>
> [snip ls output]
>
>
>> P.S. Any other good plugins you'd recommend?
>
> http://gallery.munin-monitoring.org/
>
> Monitoring is highly site-specific so recommendations aren't usually
> worth much, but that gallery has LOTS of contributed plugins


Many thanks Alan!

- Grant

