@InterfaceAudience.Public @InterfaceStability.Evolving public class RollingFileSystemSink extends Object implements MetricsSink, Closeable
This class is a metrics sink that uses FileSystem to write the metrics logs. Every hour a new directory is created under the path specified by the basepath property. All metrics are logged to a file named <hostname>.log in the current hour's directory, where <hostname> is the name of the host on which the metrics logging process is running. The base path is set by the <prefix>.sink.<instance>.basepath property. The time zone used to create the current hour's directory name is GMT. If the basepath property isn't specified, it defaults to "/tmp", which is the temp directory on whatever default file system is configured for the cluster.
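The path layout described above can be sketched with the Java standard library alone. This is an illustration, not the sink's own code: the class and method names are hypothetical, and the directory name pattern yyyyMMddHH is an assumption, since the description states only that one directory is created per hour and that GMT is used.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of Hadoop.
final class HourlyLogPath {
    // ASSUMPTION: the hourly directory is named yyyyMMddHH. Only the use of GMT
    // and the one-directory-per-hour layout come from the class description.
    private static final DateTimeFormatter HOUR_DIR =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneId.of("GMT"));

    // Builds <basepath>/<hour directory>/<hostname>.log, falling back to "/tmp"
    // when no base path is configured.
    static String logPath(String basePath, String hostname, Instant now) {
        String base = (basePath == null || basePath.isEmpty()) ? "/tmp" : basePath;
        return base + "/" + HOUR_DIR.format(now) + "/" + hostname + ".log";
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-03-01T14:30:00Z");
        System.out.println(logPath(null, "host1", now));
    }
}
```

Because the formatter is pinned to GMT, two hosts in different local time zones would compute the same directory name for the same instant.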
The <prefix>.sink.<instance>.ignore-error property controls whether an exception is thrown when an error is encountered writing a log file. The default value is false, in which case the exception is thrown. When set to true, file errors are quietly swallowed.
The primary use of this class is for logging to HDFS. Because it uses FileSystem to access the target file system, however, it can also write to the local file system, Amazon S3, or any other supported file system. The base path for the sink determines the file system used. An unqualified path writes to the default file system set by the configuration.
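The difference between a qualified and an unqualified base path can be illustrated with java.net.URI. This is a simplified sketch, not how the sink resolves its FileSystem: the class name, the host and bucket names, and the idea of passing the default scheme as a parameter are all hypothetical.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of Hadoop.
final class BasePathScheme {
    // Returns the scheme named by a qualified base path, or the supplied
    // default when the path is unqualified (has no scheme).
    static String fileSystemScheme(String basePath, String defaultScheme) {
        String scheme = URI.create(basePath).getScheme();
        return scheme != null ? scheme : defaultScheme;
    }

    public static void main(String[] args) {
        // ASSUMPTION: "hdfs" stands in for whatever default file system the
        // cluster configuration names.
        System.out.println(fileSystemScheme("hdfs://namenode:8020/metrics", "hdfs"));
        System.out.println(fileSystemScheme("file:///var/log/metrics", "hdfs"));
        System.out.println(fileSystemScheme("/tmp", "hdfs"));
    }
}
```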
Not all file systems support the ability to append to files. In file systems without the ability to append to files, only one writer can write to a file at a time. To allow for concurrent writes from multiple daemons on a single host, the source property should be set to the name of the source daemon, e.g. namenode. The value of the source property should typically be the same as the property's prefix. If this property is not set, the source is taken to be unknown.
Instead of appending to an existing file, by default the sink will create a new file with a suffix of ".<n>", where n is the next lowest integer that isn't already used in a file name, similar to the Hadoop daemon logs. NOTE: the file with the highest sequence number is the newest file, unlike the Hadoop daemon logs.
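One way to picture the suffix selection is the sketch below. It is an interpretation, not the sink's own code: the class and method names are hypothetical, and it assumes the next suffix is one more than the highest suffix already present, which keeps the newest file at the highest number. If the existing suffixes had gaps, a "lowest unused integer" rule would pick a different value.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper for illustration; not part of Hadoop.
final class SuffixPicker {
    // Returns the suffix to use for the next file named <baseName>.<n>.
    // ASSUMPTION: n is one more than the highest numeric suffix found, or 1
    // when no suffixed file exists yet.
    static int nextSuffix(String baseName, Collection<String> existingNames) {
        String prefix = baseName + ".";
        int highest = 0;
        for (String name : existingNames) {
            if (!name.startsWith(prefix)) {
                continue; // a different file, or the unsuffixed base file
            }
            String tail = name.substring(prefix.length());
            if (tail.isEmpty() || !tail.chars().allMatch(Character::isDigit)) {
                continue; // not a numeric suffix
            }
            try {
                highest = Math.max(highest, Integer.parseInt(tail));
            } catch (NumberFormatException e) {
                // Suffix too large for an int; ignored in this sketch.
            }
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        Collection<String> existing =
            Arrays.asList("host1.log", "host1.log.1", "host1.log.3");
        System.out.println("host1.log." + nextSuffix("host1.log", existing));
    }
}
```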
For file systems that allow append, the sink can append to the existing file instead of creating a new one. This behavior is enabled by setting the allow-append property to true; it takes effect only on file systems that support appends. By default, the allow-append property is false.
Note that when writing to HDFS with allow-append set to true, there is a minimum acceptable number of data nodes. If the number of data nodes drops below that minimum, the append will succeed, but reading the data will fail with an IOException in the DataStreamer class. The minimum number of data nodes required for a successful append is generally 2 or 3.
Note also that when writing to HDFS, the file size information is not updated until the file is closed (e.g. at the top of the hour) even though the data is being written successfully. This is a known HDFS limitation that exists because of the performance cost of updating the metadata. See HDFS-5478.
When using this sink in a secure (Kerberos) environment, two additional properties must be set: keytab-key and principal-key. keytab-key should contain the key by which the keytab file can be found in the configuration, for example, yarn.nodemanager.keytab. principal-key should contain the key by which the principal can be found in the configuration, for example, yarn.nodemanager.principal.
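The two Kerberos properties hold the names of other configuration keys rather than the keytab path or principal themselves. The sketch below illustrates that indirection with plain maps. It is not the sink's login code: the class name, the map-based stand-ins for the sink properties and the cluster configuration, and the keytab path and principal values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hadoop.
final class KerberosKeyLookup {
    // Reads the sink property (e.g. "keytab-key"), then uses its value as the
    // key to look up in the cluster configuration.
    static String resolve(Map<String, String> sinkProps,
                          Map<String, String> clusterConf,
                          String sinkKey) {
        String confKey = sinkProps.get(sinkKey);
        if (confKey == null) {
            throw new IllegalArgumentException("Sink property not set: " + sinkKey);
        }
        String value = clusterConf.get(confKey);
        if (value == null) {
            throw new IllegalArgumentException("Configuration key not set: " + confKey);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> sinkProps = new HashMap<>();
        sinkProps.put("keytab-key", "yarn.nodemanager.keytab");
        sinkProps.put("principal-key", "yarn.nodemanager.principal");

        // ASSUMPTION: example values only; real deployments define their own.
        Map<String, String> clusterConf = new HashMap<>();
        clusterConf.put("yarn.nodemanager.keytab", "/etc/security/keytabs/nm.keytab");
        clusterConf.put("yarn.nodemanager.principal", "nm/_HOST@EXAMPLE.COM");

        System.out.println(resolve(sinkProps, clusterConf, "keytab-key"));
        System.out.println(resolve(sinkProps, clusterConf, "principal-key"));
    }
}
```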
Modifier and Type | Field and Description |
---|---|
protected static boolean | flushQuickly |
protected static boolean | hasFlushed |
protected static Configuration | suppliedConf |
protected static FileSystem | suppliedFilesystem |
Constructor and Description |
---|
RollingFileSystemSink() |
Modifier and Type | Method and Description |
---|---|
void | close() |
void | flush() Flush any buffered metrics |
void | init(org.apache.commons.configuration.SubsetConfiguration metrics2Properties) Initialize the plugin |
void | putMetrics(MetricsRecord record) Put a metrics record in the sink |
protected static boolean flushQuickly
protected static volatile boolean hasFlushed
protected static Configuration suppliedConf
protected static FileSystem suppliedFilesystem
public RollingFileSystemSink()
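public void init(org.apache.commons.configuration.SubsetConfiguration metrics2Properties)
Description copied from interface: MetricsPlugin
Initialize the plugin
Specified by: init in interface MetricsPlugin
Parameters: metrics2Properties - the configuration object for the plugin
public void putMetrics(MetricsRecord record)
Description copied from interface: MetricsSink
Put a metrics record in the sink
Specified by: putMetrics in interface MetricsSink
Parameters: record - the record to put
public void flush()
Description copied from interface: MetricsSink
Flush any buffered metrics
Specified by: flush in interface MetricsSink
public void close()
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable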
Copyright © 2016 Apache Software Foundation. All rights reserved.