[GitHub] https://github.com/fluent/fluentd
[Doc] http://docs.fluentd.org/articles/config-file
[Example] https://github.com/fluent/fluentd/tree/master/example
[Source] https://www.ixdba.net/archives/1116
Default configuration file: /etc/td-agent/td-agent.conf
Fluentd directives
- source: determines the input sources
- match: determines the output destinations
- filter: determines the event processing pipeline
- system: sets system-wide configuration
- label: groups outputs and filters for internal routing
- @include: includes other configuration files
The source directive
Use the source directive to select and configure an input plugin, enabling a Fluentd input source. Fluentd's standard input plugins include http (listening on port 9880) and forward (listening on port 24224), which accept HTTP requests and TCP requests respectively.
- http: turns fluentd into an HTTP endpoint that accepts incoming HTTP messages
- forward: turns fluentd into a TCP endpoint that accepts TCP packets
### Receive events from 24224/tcp
### This is used by log forwarding and the fluent-cat command
<source>
@type forward
port 24224
</source>
### http://this.host:9880/myapp.access?json={"event":"data"}
<source>
@type http
port 9880
</source>
Each source directive must include a "type" parameter, which specifies which plugin to use.
Routing
source submits events to Fluentd's routing engine. An event consists of three entities: tag, time, and record.
- tag: a string separated by "." (e.g. myapp.access), used to direct Fluentd's internal routing engine
- time: the time field, specified by the input plugin; it must be in Unix time format
- record: a JSON object
In the example above, the http input plugin submits the following event:
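For instance, a request like http://this.host:9880/myapp.access?json={"event":"data"} to the http source above would produce an event along these lines (time shown schematically):

```
tag:    myapp.access
time:   (current Unix time)
record: {"event":"data"}
```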
The match directive
The match directive looks for events with matching tags and processes them. Its most common use is to output events to other systems (for this reason, the plugins that correspond to the match directive are called "output plugins"). Fluentd's standard output plugins include file and forward.
Each match directive must include a match pattern and a type parameter. Only events whose tags match the pattern are sent to the output destination (in the example above, only events tagged "myapp.access" match). The type parameter specifies which output plugin to use.
Match patterns
- * matches a single tag part
  - Example: a.* matches a.b, but does not match a or a.b.c
- ** matches zero or more tag parts
  - Example: a.** matches a, a.b and a.b.c
- {X,Y,Z} matches X, Y or Z, where X, Y and Z are themselves match patterns; this can be combined with the * and ** patterns
  - Example 1: {a,b} matches a and b, but not c
  - Example 2: a.{b,c}.* and a.{b,c.*}
- When multiple patterns are listed inside one <match> tag (separated by one or more spaces), it matches any of the listed patterns. For example:
  - <match a b> matches a and b
  - <match a.** b.*> matches a, a.b, a.b.c and b.d
Fluentd attempts to match tags in the order that the patterns appear in the configuration file, from top to bottom. So with the following configuration, myapp.access would never match:
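A minimal sketch of the pitfall (the output plugins are illustrative): the catch-all ** pattern comes first and swallows every event, so the later, more specific match is never reached:

```
<match **>
  @type stdout
</match>

### never reached: ** above already matched myapp.access
<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>
```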
The filter directive
Event processing pipeline
The filter directive has the same syntax as match, but filters can be chained into a pipeline that processes data serially before handing it to a match for output. With filters, the event flow is: Input -> filter 1 -> ... -> filter N -> Output.
In this example, the filter receives the data, invokes the built-in record_transformer plugin to insert a new field host_param into the event's record, and then hands the event on to the match for output.
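A minimal sketch of this pipeline (the tag myapp.access and the file output are illustrative):

```
<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"  ### add the machine hostname to each record
  </record>
</filter>

<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>
```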
filter matching follows the same order as match, so a <filter> should generally be placed before the corresponding <match>.
The system directive
Fluentd-wide settings can be set either at startup or in the configuration file, including:
- log_level
- suppress_repeated_stacktrace
- emit_error_log_interval
- suppress_config_dump
- without_source
Example 1: fluentd startup configuration
Example 2: fluentd process name
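A sketch covering both examples in one <system> block (values are illustrative; process_name changes the name the fluentd processes show in ps output):

```
<system>
  ### Example 1: startup-style settings
  log_level error
  suppress_repeated_stacktrace true
  without_source false
  ### Example 2: process name
  process_name fluentd-proxy
</system>
```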
The label directive
label groups tasks, which makes complex configurations easier to manage.
@label @<label_name>
You can specify @label @<LABEL_NAME> inside a source; the events emitted by that source are then routed only to the tasks inside the specified label, and are not picked up by other downstream tasks.
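A minimal sketch (label name, tag, and paths are illustrative): events from this tail source are routed straight to the tasks inside @SYSTEM:

```
<source>
  @type tail
  path /var/log/httpd-access.log
  pos_file /var/log/td-agent/httpd-access.log.pos
  tag system.logs
  format none
  @label @SYSTEM  ### route events from this source to the @SYSTEM label
</source>

<label @SYSTEM>
  <match system.logs>
    @type stdout
  </match>
</label>
```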
The @ERROR label
Receives exceptions raised by plugins through the emit_error_event API. It is used in the same way as a regular label, by defining a <label @ERROR> section.
The @include directive
Imports directives from other standalone configuration files. These files can be referenced with:
- relative paths
- absolute paths
- HTTP URLs
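For example (paths and URL are illustrative):

```
### absolute path, glob supported
@include /etc/td-agent/conf.d/*.conf
### path relative to this configuration file
@include extra.conf
### HTTP URL
@include http://example.com/fluent.conf
```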
Fluentd Plugins
[Plugins] http://www.fluentd.org/plugins
Fluentd has 6 types of plugins:
- Input
- Output
- Buffer
- Filter
- Parser
- Formatter
Input Plugins
[Doc] http://docs.fluentd.org/articles/input-plugin-overview
Input plugins extend Fluentd to retrieve and pull event logs from external sources. An input plugin typically creates a thread, a socket, and a listening socket. It can also be written to periodically pull data from a data source.
- in_udp
- in_tcp
- in_forward
- in_secure_forward
- in_http
- in_unix
- in_tail
- in_exec
- in_syslog
- in_scribe
- in_multiprocess
- in_dummy
in_udp
[Doc] http://docs.fluentd.org/articles/in_udp
Parameters:
- type(required)
- tag( required)
- port
- bind
- delimiter
- source_host_key
- format (required): see "2.5 Parser Plugins"
- regexp
- apache2
- apache_error
- nginx
- syslog
- tsv or csv
- ltsv
- json
- none
- multiline
- keep_time_key
- log_level
- fatal
- error
- warn
- info
- debug
- trace
in_tcp
[Doc] http://docs.fluentd.org/articles/in_tcp
Parameters:
Same as in_udp.
in_forward
[Doc] http://docs.fluentd.org/articles/in_forward
- Listens on a TCP socket to receive the event stream; it also listens on a UDP socket to receive heartbeat messages
- Unlike in_tail or in_tcp, in_forward does not provide a parsing mechanism, because in_forward is primarily meant for efficient log transfer. If you want to parse incoming events, use a parser filter in the pipeline
Parameters:
- type (required): forward
- port
- bind
- linger_timeout
- chunk_size_limit
- chunk_size_warn_limit
- skip_invalid_event
- source_hostname_key
- log_level
in_secure_forward
[Doc] http://docs.fluentd.org/articles/in_secure_forward
in_http
[Doc] http://docs.fluentd.org/articles/in_http
Parameters:
- type (required): http
- port
- bind
- body_size_limit
- keepalive_timeout
- add_http_headers
- add_remote_addr
- cors_allow_origins
- format: see "2.5 Parser Plugins"
- default: json, msgpack
- regexp
- json
- ltsv
- tsv or csv
- none
- log_level
in_unix
[Doc] http://docs.fluentd.org/articles/in_unix
in_tail
[Doc] http://docs.fluentd.org/articles/in_tail
- When Fluentd is first configured with in_tail, it will start reading from the tail of that log, not the beginning.
- Once the log is rotated, Fluentd starts reading the new file from the beginning. It keeps track of the current inode number.
- If td-agent restarts, it starts reading from the last position td-agent read before the restart. This position is recorded in the position file specified by the pos_file parameter.
<source>
@type tail
path /var/log/httpd-access.log
exclude_path ["/var/log/*.gz", "/var/log/*.zip"]
pos_file /var/log/td-agent/httpd-access.log.pos
tag apache.access
format apache2
keep_time_key false
</source>
Parameters:
- type (required)
- tag (required)
- path (required)
- exclude_path
- refresh_interval
- read_from_head
- read_lines_limit
- multiline_flush_interval
- pos_file (highly recommended)
- format (required): see "2.5 Parser Plugins"
- regexp
- apache2
- apache_error
- nginx
- syslog
- tsv or csv
- ltsv
- json
- none
- multiline
- time_format
- rotate_wait
- enable_watch_timer
in_exec
[Doc] http://docs.fluentd.org/articles/in_exec
The in_exec input plugin executes external programs to receive or pull event logs. It then reads TSV (tab-separated values), JSON, or MessagePack from the program's stdout.
Example:
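A minimal sketch (the command, keys, and tag are hypothetical): run a script once a minute and turn each TSV line of its stdout into an event:

```
<source>
  @type exec
  command /usr/local/bin/poll_metrics.sh  ### hypothetical script printing "name<TAB>value" lines
  format tsv
  keys metric,value
  tag exec.metrics
  run_interval 1m
</source>
```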
Parameters:
- type(required)
- command(required)
- format
- tsv(default)
- json
- msgpack
- tag (required if tag_key is not specified)
- tag_key
- time_key
- time_format
- run_interval
- log_level
in_syslog
[Doc] http://docs.fluentd.org/articles/in_syslog
Parameters:
- type(required)
- port
- bind
- protocol_type
- tag(required)
- format: see "2.5 Parser Plugins"
- regexp
- apache2
- apache_error
- nginx
- syslog
- tsv or csv
- ltsv
- json
- none
- multiline
Other
- in_scribe
- in_multiprocess
- in_dummy
Output Plugins
[Doc] http://docs.fluentd.org/articles/output-plugin-overview
- out_file
- out_forward
- out_secure_forward
- out_exec
- out_exec_filter
- out_copy
- out_geoip
- out_roundrobin
- out_stdout
- out_null
- out_s3
- out_mongo
- out_mongo_replset
- out_relabel
- out_rewrite_tag_filter
- out_webhdfs
Plugin Types
The buffering behavior of an output plugin (if any) is defined by a separate buffer plugin. A different buffer plugin can be chosen for each output plugin. Some output plugins are fully customized and do not use buffers.
Non-Buffered
- Non-buffered output plugins do not buffer data and write out results immediately.
- includes:
- out_copy
- out_stdout
- out_null
Buffered
- Buffered output plugins maintain a queue of chunks (a chunk is a collection of events); their behavior can be tuned by chunk-limit and queue-limit parameters.
- includes:
- out_exec_filter
- out_forward
- out_mongo or out_mongo_replset
Time Sliced
- Time-sliced output plugins are in fact a kind of buffered plugin, but the chunks are keyed by time.
- includes:
- out_exec
- out_file
- out_s3
- out_webhdfs
out_file
[Doc] http://docs.fluentd.org/articles/out_file
The out_file TimeSliced output plugin writes events to files. By default, it creates files on a daily basis (at around 00:10). This means that when you first import records using the plugin, no file is created immediately. The file is created once the time_slice_format condition has been met. To change the output frequency, modify the time_slice_format value.
Example:
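A minimal sketch (path is illustrative): write gzip-compressed files sliced hourly:

```
<match myapp.access>
  @type file
  path /var/log/fluent/myapp
  time_slice_format %Y%m%d%H  ### one file per hour
  time_slice_wait 10m
  compress gzip
</match>
```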
Parameters:
- type (required)
- path (required)
- append: defaults to false, i.e. logs are flushed to a different file per time slice; if true, output is appended to the same file until time_slice_format rolls over
- format: defaults to out_file; other formats include json, hash, ltsv, single_value, csv and stdout (see "2.6 Formatter Plugins")
- time_format
- utc
- compress: gzip
- symlink_path: when buffer_type is file, creates a symlink to the temporary buffered file; no symlink is created by default. This is useful for tailing the file content to check logs.
Time Sliced Output Parameters:
- time_slice_format:
- The default format is %Y%m%d%H, which creates one file per hour.
- time_slice_wait
- buffer_type
- buffer_queue_limit, buffer_chunk_limit
- flush_interval
- flush_at_shutdown
- retry_wait, max_retry_wait
- retry_limit, disable_retry_limit
- num_threads
- slow_flush_log_threshold
out_forward
[Doc] http://docs.fluentd.org/articles/out_forward
The out_forward buffered output plugin forwards events to other fluentd nodes. This plugin supports load balancing and automatic failover (a.k.a. active-active backup). For replication, use the out_copy plugin instead.
The out_forward plugin detects server faults using the "phi accrual failure detector" algorithm. You can customize the parameters of the algorithm. When a failed server recovers, the plugin makes the server available again automatically after a few seconds.
The out_forward plugin supports at-most-once and at-least-once message delivery semantics. The default is at-most-once. See "4. Fluentd High Availability".
Example:
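A minimal sketch with one primary and one standby aggregator (addresses are illustrative):

```
<match mytag.**>
  @type forward
  <server>
    host 192.168.0.1
    port 24224
  </server>
  <server>
    host 192.168.0.2
    port 24224
    standby  ### used only when the primary is unavailable
  </server>
  flush_interval 10s
</match>
```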
Parameters:
- type (required)
- <server> (at least one is required)
- require_ack_response
- ack_response_timeout
- <secondary> (optional)
- send_timeout
- recover_wait
- heartbeat_type
- heartbeat_interval
- phi_failure_detector
- phi_threshold
- hard_timeout
- standby
- expire_dns_cache
- dns_round_robin
Buffered Output Parameters:
- buffer_type
- buffer_queue_limit, buffer_chunk_limit
- flush_interval
- flush_at_shutdown
- retry_wait, max_retry_wait
- retry_limit, disable_retry_limit
- num_threads
- slow_flush_log_threshold
out_copy
[Doc] http://docs.fluentd.org/articles/out_copy
The copy output plugin copies events to multiple outputs.
Parameters:
- type (required)
- deep_copy
- <store> (at least one required)
Example:
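A minimal sketch (tag and destinations are illustrative): duplicate every event to a file and to stdout:

```
<match myapp.**>
  @type copy
  <store>
    @type file
    path /var/log/fluent/myapp
  </store>
  <store>
    @type stdout
  </store>
</match>
```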
out_stdout
[Doc] http://docs.fluentd.org/articles/out_stdout
The stdout output plugin prints events to stdout (or to the log, if started in daemon mode). This output plugin is useful for debugging.
Parameters
- type (required)
- output_type:json or hash (Ruby’s hash)
out_exec
[Doc] http://docs.fluentd.org/articles/out_exec
The out_exec TimeSliced output plugin passes events to an external program. The program receives the path of a file containing the incoming events as its last argument. By default, the file format is tab-separated values (TSV).
Parameters:
- type (required)
- command (required)
- format
- tsv(default)
- json
- msgpack
- tag_key
- time_key
- time_format
Time Sliced Output Parameters:
- time_slice_format
- time_slice_wait
- buffer_type
- buffer_queue_limit, buffer_chunk_limit
- flush_interval
- flush_at_shutdown
- retry_wait, max_retry_wait
- retry_limit, disable_retry_limit
- num_threads
- slow_flush_log_threshold
Buffer Plugins
[Doc] http://docs.fluentd.org/articles/buffer-plugin-overview
- buf_memory
- buf_file
Buffer plugins are used by buffered output plugins, such as out_file and out_forward. Users can choose the buffer plugin that best suits their performance and reliability needs.
Buffer Structure
The buffer structure is a queue of chunks like the following:
When the top chunk exceeds the specified size or time limit (buffer_chunk_limit and flush_interval, respectively), a new empty chunk is pushed to the top of the queue. The bottom chunk is written out immediately when the new chunk is pushed.
If the bottom chunk fails to be written out, it will remain in the queue and Fluentd will retry after waiting several seconds (retry_wait). If the retry limit has not been disabled (disable_retry_limit is false) and the retry count exceeds the specified limit (retry_limit), the chunk is trashed. The retry wait time doubles each time (1.0sec, 2.0sec, 4.0sec, ...) until max_retry_wait is reached. If the queue length exceeds the specified limit (buffer_queue_limit), new events are rejected.
Filter Plugins
[Doc] http://docs.fluentd.org/articles/filter-plugin-overview
- filter_record_transformer
- filter_grep
- filter_parser
- filter_stdout
Filter plugins let Fluentd modify event streams. Example use cases:
- Filtering events by the values of one or more fields
- Enriching events by adding new fields
- Deleting or masking certain fields for privacy and compliance
Example:
Only the events whose "message" field contains "cool" get the new field "hostname" with the machine's hostname as its value.
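A sketch of the pipeline just described (the tag foo.bar is illustrative):

```
<filter foo.bar>
  @type grep
  regexp1 message cool  ### keep only events whose "message" contains "cool"
</filter>

<filter foo.bar>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"  ### add the machine hostname
  </record>
</filter>
```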
filter_record_transformer
[Doc] http://docs.fluentd.org/articles/filter_record_transformer
The filter_record_transformer filter plugin mutates/transforms incoming event streams in a versatile manner. If there is a need to add/delete/modify events, this plugin is the first filter to try.
Parameters
- enable_ruby (optional)
- auto_typecast (optional)
- renew_record (optional)
- renew_time_key (optional, string type)
- keep_keys (optional, string type)
- remove_keys (optional, string type)
Example:
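The filter described in the next paragraph would look roughly like this (the tag foo.bar is illustrative):

```
<filter foo.bar>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"  ### Ruby string interpolation
    tag ${tag}
  </record>
</filter>
```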
The above filter adds the new field "hostname" with the server's hostname as its value (taking advantage of Ruby's string interpolation) and the new field "tag" with the tag value. So, an input like
is transformed into
Parameters inside <record> directives are considered to be new key-value pairs:
For NEW_FIELD and NEW_VALUE, a special syntax ${} allows the user to generate new fields dynamically. Inside the curly braces, the following variables are available:
- The incoming event's existing values can be referred to by their field names. So, if the record is {"total":100, "count":10}, then total=100 and count=10.
- tag_parts[N]: refers to the Nth part of the tag. It works like the usual zero-based array accessor.
- tag_prefix[N]: refers to the first N parts of the tag.
- tag_suffix[N]: refers to the last N parts of the tag.
- tag: refers to the whole tag.
- time: refers to the stringified event time.
filter_grep
[Doc] http://docs.fluentd.org/articles/filter_grep
The filter_grep filter plugin "greps" events by the values of specified fields.
Parameters
- regexpN (optional)
- excludeN (optional)
Example:
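A sketch matching the conditions listed below (the tag and the hostname pattern are illustrative):

```
<filter foo.bar>
  @type grep
  regexp1 message cool  ### "message" must contain "cool"
  regexp2 hostname web\.example\.com  ### "hostname" must match web.example.com
  exclude1 message uncool  ### drop events whose "message" contains "uncool"
</filter>
```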
regexpN (optional)
The "N" at the end should be replaced with an integer between 1 and 20 (ex: "regexp1"). regexpN takes two whitespace-delimited arguments.
- The first argument is the field name to which the regular expression is applied.
- The second argument is the regular expression.
The above example matches any event that satisfies the following conditions:
- The value of the “message” field contains “cool”
- The value of the “hostname” field matches web.example.com.
- The value of the “message” field does NOT contain “uncool”.
Thus, the following events are kept:
while the following events are filtered out:
filter_parser
[Doc] http://docs.fluentd.org/articles/filter_parser
The filter_parser filter plugin "parses" a string field in event records and mutates the event record with the parsed result.
Parameters
- reserve_data
- suppress_parse_error_log
- replace_invalid_sequence
- inject_key_prefix
- hash_value_field
- time_parse
Example:
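A minimal sketch (field name and tag are illustrative): parse the string field "log" as JSON and merge the result into the record:

```
<filter myapp.access>
  @type parser
  format json
  key_name log  ### the string field to parse
  reserve_data true  ### keep the original fields alongside the parsed result
</filter>
```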
filter_parser uses built-in parser plugins and your own customized parser plugins, so you can reuse predefined formats like apache, json, and so on. See "2.5 Parser Plugins" for details.
filter_stdout
[Doc] http://docs.fluentd.org/articles/filter_stdout
The filter_stdout filter plugin prints events to stdout (or to the log, if started in daemon mode). This filter plugin is useful for debugging.
Parameters:
- type(required)
- format: stdout
- output_type: any formatter plugin can be specified; defaults to json
- out_file
Parser Plugins
[Doc] http://docs.fluentd.org/articles/parser-plugin-overview
Sometimes, the format parameter of input plugins (e.g. in_tail, in_syslog, in_tcp and in_udp) cannot parse the user's custom data format (for example, a context-dependent grammar that cannot be parsed with a regular expression). To address such cases, in v0.10.46 and above, Fluentd has a pluggable system that enables users to create their own parser formats.
List of Core Input Plugins with Parser support (via the format parameter):
- in_tail
- in_tcp
- in_udp
- in_syslog
- in_http
List of Built-in Parsers
regexp
The regexp for the format parameter can be specified. If the parameter value starts and ends with “/”, it is considered to be a regexp. The regexp must have at least one named capture (?<NAME>PATTERN).
If the regexp has a capture named 'time', it is used as the time of the event. You can specify the time format using the time_format parameter.
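For example, a hypothetical log line `alice 29` could be parsed with a named-capture regexp like:

```
format /^(?<name>[^ ]+) (?<count>\d+)$/
```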
apache2
Reads apache's log file for the following fields: host, user, time, method, path, code, size, referer and agent. This template is analogous to the following configuration:
apache_error
Reads apache's error log file for the following fields: time, level, pid, client and (error) message. This template is analogous to the following configuration:
nginx
Reads Nginx's log file for the following fields: remote, user, time, method, path, code, size, referer and agent. This template is analogous to the following configuration:
syslog
Reads syslog's output file (e.g. /var/log/syslog) for the following fields: time, host, ident, and message. This template is analogous to the following configuration:
tsv or csv
If you use the tsv or csv format, please also specify the keys parameter.
If you specify the time_key parameter, it will be used to identify the timestamp of the record. The timestamp when Fluentd reads the record is used by default.
ltsv
ltsv (Labeled Tab-Separated Values) is a tab-delimited key-value pair format. You can learn more about it on its webpage.
If you specify the time_key parameter, it will be used to identify the timestamp of the record. The timestamp when Fluentd reads the record is used by default.
json
One JSON map per line. This is the most straightforward format :).
none
You can use the none format to defer parsing/structuring the data. This will parse the line as-is with the key name "message". For example, if you had a line
It will be parsed as
The key field is "message" by default, but you can specify a different value using the message_key parameter as shown below:
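For example (the key name my_log is illustrative):

```
format none
message_key my_log  ### store the raw line under "my_log" instead of "message"
```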
multiline
Read multiline log with formatN and format_firstline parameters. format_firstline is for detecting the start line of a multiline log. formatN, where N ranges over 1..20, is the list of Regexp formats for the multiline log. Here is a Rails log example:
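A minimal sketch of the mechanism (patterns are illustrative): any line starting with a timestamp begins a new event, and everything up to the next such line is folded into one record:

```
format multiline
format_firstline /^\d{4}-\d{2}-\d{2}/
format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?<level>\w+) (?<message>.*)/
```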
If you have a multiline log
It will be parsed as
One more example: you can parse Java-style stacktrace logs with multiline. Here is a configuration example.
If you have the following log:
It will be parsed as:
Formatter Plugins
[Doc] http://docs.fluentd.org/articles/formatter-plugin-overview
Sometimes, the output format of an output plugin does not meet one's needs. Fluentd has a pluggable system called Text Formatter that lets users extend and reuse custom output formats.
For output plugins that support the Text Formatter, the format parameter can be used to change the output format.
For example, by default, out_file plugin outputs data as
However, if you set format json like this
The output changes to
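For example, with an event tagged foo.bar and record {"k1":"v1"} (values are illustrative), the default out_file output is a tab-separated line, while setting format json emits only the JSON record:

```
<match foo.bar>
  @type file
  path /var/log/fluent/foo
  format json
</match>

### default out_file format:
###   2014-08-25 00:00:00 +0000	foo.bar	{"k1":"v1"}
### with format json:
###   {"k1":"v1"}
```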
List of Output Plugins with Text Formatter Support
- out_file
- out_s3
List of Built-in Formatters
stdout
This format is aimed to be used by the stdout plugins.
Output time, tag and formatted record as follows:
Example:
The stdout format has the following option to customize the format of the record part.
For this format, the following common parameters are also supported.
- include_time_key (Boolean, optional, defaults to false): if true, the time field (as specified by the time_key parameter) is kept in the record.
- time_key (String, optional, defaults to "time"): the field name for the time key.
- time_format (String, optional): by default, the output format is iso8601 (e.g. "2008-02-01T21:41:49"). One can specify their own format with this parameter.
- include_tag_key (Boolean, optional, defaults to false): if true, the tag field (as specified by the tag_key parameter) is kept in the record.
- tag_key (String, optional, defaults to "tag"): the field name for the tag key.
- localtime (Boolean, optional, defaults to true): if true, use local time; otherwise, UTC is used. This parameter is overridden by the utc parameter.
- timezone (String, optional): by setting this parameter, one can parse the time value in the specified timezone. The following formats are accepted:
  - [+-]HH:MM (e.g. "+09:00")
  - [+-]HHMM (e.g. "+0900")
  - [+-]HH (e.g. "+09")
  - Region/Zone (e.g. "Asia/Tokyo")
  - Region/Zone/Zone (e.g. "America/Argentina/Buenos_Aires")
The timezone set in this parameter takes precedence over localtime; e.g., if localtime is set to true but timezone is set to +0000, UTC will be used.
out_file

time[delimiter]tag[delimiter]record\n

delimiter SPACE # Optional, SPACE or COMMA. "\t"(TAB) is used by default
output_tag false # Optional, defaults to true. Output the tag field if true.
output_time true # Optional, defaults to true. Output the time field if true.
The record is JSON-formatted data. Example:
Other supported parameters are the same as for stdout.
json

{"field1":"value1","field2":"value2"}\n

Other supported parameters are the same as for stdout.
hash

{"field1"=>"value1","field2"=>"value2"}\n

Other supported parameters are the same as for stdout.
ltsv

field1[label_delimiter]value1[delimiter]field2[label_delimiter]value2\n

format ltsv
delimiter SPACE # Optional. "\t"(TAB) is used by default
label_delimiter = # Optional. ":" is used by default

Other supported parameters are the same as for stdout.
single_value

value1\n

The single_value format supports the add_newline and message_key options.
csv

"value1"[delimiter]"value2"[delimiter]"value3"\n

The csv format supports the delimiter and force_quotes options.
Other supported parameters are the same as for stdout.
Fluentd UI
[Doc] http://docs.fluentd.org/articles/fluentd-ui
Visit the Fluentd UI (default username: admin, default password: changeme).
Fluentd High Availability
Message Delivery Semantics
A system can provide several possible message delivery guarantees:
- At most once: messages are transmitted immediately. If the transmission succeeds, the message is never sent again. However, many failure scenarios can cause message loss (e.g. no more write capacity).
- At least once: each message is delivered at least once; in failure scenarios, a message may be delivered twice.
- Exactly once: each message is delivered once and only once; this is what people actually want.
If the system "cannot lose a single event" and must also deliver "exactly once", then the system must stop ingesting events when it runs out of write capacity. The proper approach is to use synchronous logging and return errors when events cannot be accepted.
That is why Fluentd provides "at most once" and "at least once" delivery. To collect massive amounts of data without impacting application performance, a data logger must transfer data asynchronously. This improves performance at the cost of potential delivery failures.
However, most failure scenarios are preventable. The following sections describe how to set up Fluentd's topology for high availability.
High Availability Architecture
[Plugins]
Network Topology
To configure Fluentd for high availability, we assume that your network consists of "log forwarders" and "log aggregators". As shown in the figure, a "log forwarder" is typically installed on every node to receive local events. Once an event is received, it is forwarded over the network to the "log aggregators".
The "log aggregators" are daemons that continuously receive events from the log forwarders. They buffer the events and periodically upload the data into the cloud.
Fluentd can act as either a log forwarder or a log aggregator, depending on its configuration.
Log Forwarder Configuration
### TCP input
<source>
@type forward
port 24224
</source>
### HTTP input
<source>
@type http
port 9880
</source>
### Log Forwarding
<match mytag.**>
@type forward
# primary host
<server>
host 192.168.0.1
port 24224
</server>
### use secondary host
<server>
host 192.168.0.2
port 24224
standby ### marks this as the standby node
</server>
### use longer flush_interval to reduce CPU usage.
### note that this is a trade-off against latency.
flush_interval 60s
</match>
When the active aggregator (192.168.0.1) dies, the logs are sent to the backup aggregator (192.168.0.2) instead. If both servers die, the logs are buffered on disk on the corresponding forwarder nodes.
Log Aggregator Configuration
### Input
<source>
@type forward
port 24224
</source>
### Output
<match mytag.**>
...
</match>
Incoming logs are buffered and then periodically uploaded into the cloud. If an upload fails, the logs are stored on the local disk until retransmission succeeds.
Forwarder and Aggregator Failures
When a log forwarder receives events from applications, the events are first written to a disk buffer (specified by buffer_path). After every flush_interval, the buffered data is forwarded to the aggregators (or into the cloud).
This process is inherently robust against data loss. If the fluentd process of a log forwarder (or aggregator) dies, the buffered data is properly transferred to its aggregator (or the cloud) after it restarts. If the network between forwarders and aggregators (or aggregators and the cloud) breaks, the data transfer is automatically retried.
However, possible message-loss scenarios do exist:
- The process dies immediately after receiving events, but before writing them to the buffer.
- The forwarder's (or aggregator's) disk is broken, and the file buffer is lost.
Therefore, make sure you can communicate with port 24224 using not only TCP but also UDP (used for heartbeats). These commands help verify the network configuration.
Examples
[Docker+fluentd] http://www.fluentd.org/guides/recipes/docker-logging
Nginx
Log format
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$request_body" "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
fluentd configuration
### ------ NGINX ------
<source>
@type tail
@label @NGINX
path /data/logs/**-nginx/access.log
tag webapp.nginx.access
format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<request>[^\"]*)\" (?<code>[^ ]*) (?<body_bytes_sent>[^ ]*) \"(?<request_body>[^ ]*)\" \"(?<http_referer>[^ ]*)\" \"(?<agent>[^\"]*)\" \"(?<client>[^ ]*)\"/
time_format %d/%b/%Y:%H:%M:%S %z
pos_file /tmp/webapp.nginx.access.pos
refresh_interval 10
</source>
<source>
@type tail
@label @NGINX
path /data/logs/**-nginx/error.log
tag webapp.nginx.error
format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<request>[^\"]*)\" (?<code>[^ ]*) (?<body_bytes_sent>[^ ]*) \"(?<request_body>[^ ]*)\" \"(?<http_referer>[^ ]*)\" \"(?<agent>[^\"]*)\" \"(?<client>[^ ]*)\"/
time_format %d/%b/%Y:%H:%M:%S %z
pos_file /tmp/webapp.nginx.error.pos
refresh_interval 10
</source>
<label @NGINX>
<filter webapp.nginx.**>
@type record_transformer
<record>
host "#{Socket.gethostname}"
</record>
</filter>
<match webapp.nginx.*>
type elasticsearch
host 192.168.112.4
port 9200
index_name fluentd
type_name fluentd
logstash_format true
logstash_prefix nginx
utc_index true
include_tag_key false
</match>
</label>
Java
Log format
[%-5level] [%contextName] %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] [%X{req.remoteHost}] [%X{req.requestURI}] [%X{traceId}] %logger - %msg%n
fluentd configuration
### ------ JAVA ------
<source>
@type tail
@label @JAVA
tag webapp.java.access
path /data/logs/**/*.log
exclude_path ["/data/logs/**/*.gz"]
format multiline
format_firstline /^\[[\w ]+\]/
format1 /^\[(?<level>[\w ]+)\] \[(?<app_name>\S+)\] (?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.\d{1,3}) \[(?<thread>[^ ]+)\] \[(?<remote_addr>[^ ]*)\] \[(?<request>[^ ]*)\] \[(?<trace_id>[^ ]*)\] \S+ - (?<msg>.*)/
pos_file /tmp/webapp.java.access.pos
</source>
<label @JAVA>
<filter webapp.java.access>
@type record_transformer
<record>
host "#{Socket.gethostname}"
</record>
</filter>
<match webapp.java.access>
@type copy
<store>
@type stdout
</store>
<store>
@type elasticsearch
host 192.168.112.4
port 9200
logstash_format true
flush_interval 10s # for testing
logstash_prefix webapp
</store>
</match>
</label>
Fluentd Docker
Dockerfile
FROM 192.168.101.26/library/ubuntu:14.04.5
MAINTAINER Mallux "hj.mallux@gmail.com"
ENV LANG en_US.UTF-8
ENV TZ Asia/Shanghai
RUN echo 'LANG="en_US.UTF-8"' > /etc/default/locale && \
echo 'LANGUAGE="en_US:en"' >> /etc/default/locale && \
ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && \
echo $TZ > /etc/timezone
COPY ./archives/sources.list /etc/apt/sources.list
ENV FLUENTD_VERSION 2.3.4
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com --recv EA312927 && \
apt-get update && \
apt-get install -y --no-install-recommends \
apt-transport-https curl wget vim lrzsz net-tools && \
apt-get clean && rm -rf /var/lib/apt/lists/*
COPY ./archives/vimrc /root/.vimrc
RUN sed -i '/^#force_color_prompt=yes/ s/#//' /root/.bashrc && \
sed -i "/^if \[ \"\$color_prompt\" = yes \]/ { N; s/\(.*PS1\).*/\1='\${debian_chroot:+(\$debian_chroot)}[\\\[\\\e[0;32;1m\\\]\\\u\\\[\\\e[0m\\\]@\\\[\\\e[0;36;1m\\\]\\\h\\\[\\\e[0m\\\] \\\[\\\e[0;33;1m\\\]\\\W\\\[\\\e[0m\\\]]\\\\$ '/}" /root/.bashrc
COPY ./archives/td-agent_2.3.4-0_amd64.deb /
RUN dpkg -i /td-agent_2.3.4-0_amd64.deb && \
rm -rf /td-agent_2.3.4-0_amd64.deb && \
td-agent-gem source -a https://rubygems.org/ && \
td-agent-gem install fluent-plugin-elasticsearch \
fluent-plugin-typecast \
fluent-plugin-secure-forward
COPY ./fluentd/td-agent.conf /etc/td-agent/td-agent.conf
COPY ./fluentd/entrypoint.sh /
RUN mkdir -p /var/log/td-agent /data/logs /data/log && \
chown td-agent.td-agent -R /var/log/td-agent && \
chmod +x /entrypoint.sh
VOLUME /var/log/td-agent
VOLUME /data/logs
VOLUME /data/log
EXPOSE 24224
EXPOSE 9880
EXPOSE 9292
ENTRYPOINT ["/entrypoint.sh"]
entrypoint.sh
#!/usr/bin/env bash
### --------------------------------------------------
### Filename: entrypoint.sh
### Revision: latest stable
### Author: Mallux
### E-mail: hj.mallux@gmail.com
### Blog: blog.mallux.me
### Description:
### --------------------------------------------------
### Copyright © 2014-2016 Mallux
#set -e
### starting up fluentd service
function gosu_td-agent {
chown -R td-agent.td-agent /etc/td-agent /var/log/td-agent
service td-agent start ; echo "=> Done!"
nohup td-agent-ui start &>/dev/null &
}
gosu_td-agent
while :
do
### check your fluentd server is running.
ps -ef | grep td-agent | grep -v grep >/dev/null 2>&1
exit_code=$?
if [ x"$exit_code" != x"0" ]
then
echo "your fluentd server has stop, please restart it"
#service td-agent restart
fi
### wait for 60 seconds
sleep 60
done