public inbox for gentoo-user@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-user] aggregate logs into Elasticsearch
@ 2020-04-03 11:40 Stefan G. Weichinger
  2020-04-03 15:57 ` Ralph Seichter
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan G. Weichinger @ 2020-04-03 11:40 UTC (permalink / raw)
  To: gentoo-user


I am taking my first steps collecting and aggregating logs into an
Elasticsearch/Kibana combo.

I have them in a docker-compose stack and want to collect nginx and
postfix logs for a start.

So far I am confused by stuff like Filebeat, Logstash, Fluentd ... brrr

Could someone explain, or even share some config files? How can I set
this up without installing dozens of packages (Fluentd pulls in a lot
of dependencies, for example)?

My goal:

collect logs of Postfix and nginx into the Docker containers running ES
and Kibana ... and learn my way from there.

thanks, Stefan


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [gentoo-user] aggregate logs into Elasticsearch
  2020-04-03 11:40 [gentoo-user] aggregate logs into Elasticsearch Stefan G. Weichinger
@ 2020-04-03 15:57 ` Ralph Seichter
  2020-04-04  7:43   ` Stefan G. Weichinger
  0 siblings, 1 reply; 5+ messages in thread
From: Ralph Seichter @ 2020-04-03 15:57 UTC (permalink / raw)
  To: gentoo-user

* Stefan G. Weichinger:

> My goal:
>
> collect logs of postfix, nginx into the docker-containers running ES,
> Kibana .. and learn my way from there.

If you are not dead-set on Elasticsearch et al, I propose considering
MongoDB as an alternative.

There are syslog modules that allow logging to MongoDB directly. On
the DB side, collections (roughly equivalent to tables in relational
databases) can be limited by size or by age, meaning that older data
is removed automatically if you so wish.
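
For example, both variants can be set up from the mongo shell like
this (collection name, size, and the DATE field are only placeholders
for whatever your syslog module stores):

```javascript
// Capped collection: fixed size, the oldest documents are discarded
// automatically once the limit is reached.
db.createCollection("messages", { capped: true, size: 1024 * 1024 * 1024 })

// Or expire by age instead: a TTL index on a date-typed field removes
// documents older than 30 days. (TTL indexes and capped collections
// are mutually exclusive; pick one.)
db.messages.createIndex({ DATE: 1 }, { expireAfterSeconds: 2592000 })
```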

MongoDB also makes it easy to add data from sources with different data
makeup to shared collections, because there is no rigid table structure.

For analysis, MongoDB includes its own Aggregation Framework[1], which
is very powerful and versatile. While probably not relevant to your
needs right now, it even comes with built-in geolocation search.

  [1] https://docs.mongodb.com/manual/core/aggregation-pipeline/
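
As a taste, a pipeline that counts messages per program over the last
day could look like this in the mongo shell (a sketch only; the
PROGRAM and DATE field names are assumptions, not fixed by MongoDB):

```javascript
db.messages.aggregate([
  // keep only documents from the last 24 hours
  { $match: { DATE: { $gte: new Date(Date.now() - 24 * 3600 * 1000) } } },
  // count documents per program name
  { $group: { _id: "$PROGRAM", count: { $sum: 1 } } },
  // busiest programs first
  { $sort: { count: -1 } }
])
```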

I think very highly of MongoDB and encourage you to look into it as a
possibility and as an interesting technical concept.

-Ralph


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [gentoo-user] aggregate logs into Elasticsearch
  2020-04-03 15:57 ` Ralph Seichter
@ 2020-04-04  7:43   ` Stefan G. Weichinger
  2020-04-04 14:02     ` Ralph Seichter
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan G. Weichinger @ 2020-04-04  7:43 UTC (permalink / raw)
  To: gentoo-user

Am 03.04.20 um 17:57 schrieb Ralph Seichter:
> * Stefan G. Weichinger:
> 
>> My goal:
>>
>> collect logs of postfix, nginx into the docker-containers running ES,
>> Kibana .. and learn my way from there.
> 
> If you are not dead-set on Elasticsearch et al, I propose considering
> MongoDB as an alternative.
> 
> There are syslog modules that allow logging to MongoDB directly. On
> the DB side, collections (roughly equivalent to tables in relational
> databases) can be limited by size or by age, meaning that older data
> is removed automatically if you so wish.
> 
> MongoDB also makes it easy to add data from sources with different data
> makeup to shared collections, because there is no rigid table structure.
> 
> For analysis, MongoDB includes its own Aggregation Framework[1], which
> is very powerful and versatile. While probably not relevant to your
> needs right now, it even comes with built-in geolocation search.
> 
>   [1] https://docs.mongodb.com/manual/core/aggregation-pipeline/
> 
> I think very highly of MongoDB and encourage you to look into it as a
> possibility and as an interesting technical concept.

Thanks for the feedback.

I am not at all set on ES; it just was part of an article I read
recently, and so I started with that docker-compose stack example.

Yesterday I spent quite a while trying to pipe the journald entries into
the Fluentd container; it turned out quite complicated and messy.

Maybe I will look into MongoDB as well; for example, I found this small howto:

https://www.fluentd.org/guides/recipes/maillog-mongodb

In the end I am looking for a solution that aggregates (systemd/journald)
logs into one pile of data and lets me analyze things there.
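
If journald stays the single source, one low-tech route would be to
export its entries as JSON and feed them straight to MongoDB. This is
an untested sketch: it assumes mongoimport can reach a local server,
and the database/collection names are placeholders:

```shell
# journalctl emits one JSON object per line with -o json, which is
# exactly the newline-delimited JSON that mongoimport expects.
journalctl -o json --since today | \
  mongoimport --db logs --collection journal
```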

All these solutions seem rather complicated and overly "academic" to me
... but that might be due to my newbie status in this area.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [gentoo-user] aggregate logs into Elasticsearch
  2020-04-04  7:43   ` Stefan G. Weichinger
@ 2020-04-04 14:02     ` Ralph Seichter
  2020-04-09  8:42       ` Stefan G. Weichinger
  0 siblings, 1 reply; 5+ messages in thread
From: Ralph Seichter @ 2020-04-04 14:02 UTC (permalink / raw)
  To: gentoo-user

* Stefan G. Weichinger:

> Maybe I look into mongodb as well, for example I found this small
> howto: https://www.fluentd.org/guides/recipes/maillog-mongodb

That looks unnecessarily complicated to me. While you can of course move
data from an existing log file into MongoDB, I find configuring syslog
to use a MongoDB destination (in addition to your files or as a full
replacement) much easier.

See [1] section "Storing messages in a MongoDB database". I have also
done it with rsyslog, but that took a bit more work.

Here's a syslog-ng destination I use. Note that using uri() allows
passing parameters to modern MongoDB drivers which the older servers()
statement cannot cope with.

  destination d_mongo {
    mongodb(
      uri("mongodb://user:pw@hostname:27017/syslog?authSource=admin&ssl=true")
      collection("messages")
      value-pairs(
        scope("selected-macros" "nv-pairs")
        pair("DATE", datetime("$UNIXTIME"))
        pair("PID", int64("$PID"))
        pair("SEQNUM", int64("$SEQNUM"))
        exclude("HOST*")
        exclude("LEGACY*")
        exclude("SOURCE*")
        exclude("TAGS")
      )
    );
  };

Values are strings to begin with. This example excludes some values I am
not interested in, and performs type conversion on others, for example
mapping DATE to MongoDB's date/time data type (see ISODate) and PID to a
numeric value. Conversion can of course happen during analysis, but
since syslog-ng is smart enough to do it when writing data, I prefer
that.
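
To illustrate that last point, here is a small Python sketch of the
same idea: values arrive as strings, and selected ones are converted
to native types before storage. The record below is made-up sample
data; only the DATE and PID conversions mirror the config above.

```python
from datetime import datetime, timezone

# Hypothetical syslog record: every value starts out as a string.
raw = {
    "DATE": "1586003939",              # $UNIXTIME macro, epoch seconds
    "PID": "2417",
    "PROGRAM": "postfix/smtpd",
    "MESSAGE": "connect from unknown[203.0.113.7]",
}

def convert(record):
    """Copy the record, mapping DATE and PID to native types."""
    out = dict(record)
    out["DATE"] = datetime.fromtimestamp(int(out["DATE"]), tz=timezone.utc)
    out["PID"] = int(out["PID"])
    return out

doc = convert(raw)
print(doc["DATE"].isoformat(), doc["PID"])
```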

[1] https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/37#TOPIC-956524

-Ralph


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [gentoo-user] aggregate logs into Elasticsearch
  2020-04-04 14:02     ` Ralph Seichter
@ 2020-04-09  8:42       ` Stefan G. Weichinger
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan G. Weichinger @ 2020-04-09  8:42 UTC (permalink / raw)
  To: gentoo-user

Am 04.04.20 um 16:02 schrieb Ralph Seichter:
> * Stefan G. Weichinger:
> 
>> Maybe I look into mongodb as well, for example I found this small
>> howto: https://www.fluentd.org/guides/recipes/maillog-mongodb
> 
> That looks unnecessarily complicated to me. While you can of course move
> data from an existing log file into MongoDB, I find configuring syslog
> to use a MongoDB destination (in addition to your files or as a full
> replacement) much easier.
> 
> See [1] section "Storing messages in a MongoDB database". I have also
> done it with rsyslog, but that took a bit more work.
> 
> Here's a syslog-ng destination I use. Note that using uri() allows
> passing parameters to modern MongoDB drivers which the older servers()
> statement cannot cope with.
> 
>   destination d_mongo {
>     mongodb(
>       uri("mongodb://user:pw@hostname:27017/syslog?authSource=admin&ssl=true")
>       collection("messages")
>       value-pairs(
>         scope("selected-macros" "nv-pairs")
>         pair("DATE", datetime("$UNIXTIME"))
>         pair("PID", int64("$PID"))
>         pair("SEQNUM", int64("$SEQNUM"))
>         exclude("HOST*")
>         exclude("LEGACY*")
>         exclude("SOURCE*")
>         exclude("TAGS")
>       )
>     );
>   };
> 
> Values are strings to begin with. This example excludes some values I am
> not interested in, and performs type conversion on others, for example
> mapping DATE to MongoDB's date/time data type (see ISODate) and PID to a
> numeric value. Conversion can of course happen during analysis, but
> since syslog-ng is smart enough to do it when writing data, I prefer
> that.
> 
> [1] https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/37#TOPIC-956524

Thanks a lot ... I think I will postpone that project ;-)



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-04-09  8:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-03 11:40 [gentoo-user] aggregate logs into Elasticsearch Stefan G. Weichinger
2020-04-03 15:57 ` Ralph Seichter
2020-04-04  7:43   ` Stefan G. Weichinger
2020-04-04 14:02     ` Ralph Seichter
2020-04-09  8:42       ` Stefan G. Weichinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox