ForceStopAsyncSend option for graceful stop in async mode #77

Merged: 4 commits into fluent:master on Feb 19, 2020

Conversation

@JamesJJ (Contributor) commented Dec 6, 2019

As I understand it, the existing behaviour in Async mode is:

  • Add events to the end of the buffer; return an error if the buffer is full
  • Take events from the front of the buffer; for each event, try to send it to fluentd up to MaxRetry times, and discard the event if it still has not been sent successfully
  • When Close() is called, the pending event channel is closed, and attempting to add any new events will return an error
  • When Close() is called, it blocks until every remaining event in the buffer has either been sent successfully or has run through its MaxRetry number of failures. This means that Close() can block for a very long time if the number of unsent events is large.

This PR introduces a new option, AsyncStop. When it is true in Async mode and Close() is called, the current write()/MaxRetry iteration will complete for the single in-progress event, and then any remaining events in the buffer will be discarded. This may significantly reduce the time that Close() blocks when fluentd is continuously unavailable.
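For context, here is a minimal usage sketch of the option as merged (it was later renamed ForceStopAsyncSend, see below). The Config field names assume the v1.5.0 API of fluent-logger-golang; treat this as an illustration rather than canonical documentation.

```go
package main

import (
	"log"

	"github.com/fluent/fluent-logger-golang/fluent"
)

func main() {
	// Async mode buffers events and sends them from a background goroutine.
	// ForceStopAsyncSend (the merged name of the AsyncStop option proposed
	// here) makes Close() discard any still-unsent events instead of blocking
	// until each one has been sent or has exhausted its retries.
	logger, err := fluent.New(fluent.Config{
		FluentHost:         "127.0.0.1",
		FluentPort:         24224,
		Async:              true,
		ForceStopAsyncSend: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	_ = logger.Post("myapp.access", map[string]string{"event": "shutdown-test"})

	// With ForceStopAsyncSend enabled, Close() should return promptly even if
	// the buffer still holds unsent events (e.g. because Fluentd is down).
	if err := logger.Close(); err != nil {
		log.Printf("close: %v", err)
	}
}
```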

JamesJJ added a commit to JamesJJ/moby that referenced this pull request Dec 6, 2019
@JamesJJ changed the title from "Possible Fix for #71. AsyncStop option for graceful stop in async mode" to "[WIP] AsyncStop option for graceful stop in async mode" on Dec 7, 2019
@JamesJJ changed the title from "[WIP] AsyncStop option for graceful stop in async mode" to "AsyncStop option for graceful stop in async mode" on Dec 7, 2019
@thaJeztah

ping @tagomoris @cpuguy83 ptal (looks like this relates to moby/moby#40063)

@cpuguy83

In theory this is nice, but I feel like everyone will just turn it on, because Close() just should not block in general.
The option also doesn't seem appropriately named, since it's not an async close but rather a discard+close.

Perhaps something like Reset() before Close() would be a possible alternative.
But again, I feel like everyone will probably just want to call Reset().

@tagomoris (Member) left a comment

Basically, I agree with adding this feature as a new option rather than changing the default behavior. IMO some people want to find processes that are stuck in their shutdown sequence and resume the Fluentd destination processes before forcibly terminating the Go processes.

But I found some points to be improved.
@JamesJJ I'm very sorry for reviewing this change so late, but could you check my review comments?

@tagomoris (Member)

LGTM. Thank you!

@tagomoris merged commit b8d8d8a into fluent:master on Feb 19, 2020
@tagomoris changed the title from "AsyncStop option for graceful stop in async mode" to "ForceStopAsyncSend option for graceful stop in async mode" on Feb 19, 2020
@tagomoris (Member)

Merged, and tagged as v1.5.0

akerouanton pushed a commit to akerouanton/fluent-logger-golang that referenced this pull request Apr 22, 2020
PR fluent#77 introduced a new parameter named ForceStopAsyncSend. It can be
used to tell the logger not to try to send all the log messages in its
buffer before closing. Without this parameter, the logger hangs
whenever the target Fluentd server is down.

The logger also currently initializes the connection lazily, when it
receives its first log. This is a problem when the Fluentd server has
never been available, as the connection initialization blocks the
select signaling that the log channel should be drained.
akerouanton pushed a commit to akerouanton/fluent-logger-golang that referenced this pull request Apr 22, 2020
PR fluent#77 introduced a new parameter named ForceStopAsyncSend. It can be
used to tell the logger not to try to send all the log messages in its
buffer before closing. Without this parameter, the logger hangs
whenever it has logs to write and the target Fluentd server is down.

But this new parameter is not enough: the connection is currently lazily
initialized when the logger receives its first message. This blocks the
select reading messages from the queue (until the connection is ready).
Moreover, the connection dialing uses an exponential back-off retry.

Because of that, the logger won't look for messages on the `stopRunning`
channel (the channel introduced by PR fluent#77), either because it's
blocked by the Sleep used for the retry or because the connection
dialing is waiting for the TCP timeout.

To fix these edge cases, the connection isn't initialized lazily
anymore. However, it's still initialized asynchronously and with the
exponential back-off retry. The Fluent.run() method has been adapted to
wait for either the connection to become ready or a stop signal on the
stopRunning channel before starting to dequeue logs.

This fix is motivated by the issue described in moby/moby#40063.

Signed-off-by: Albin Kerouanton <[email protected]>
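A minimal sketch of the run-loop gating described above (wait for either the connection to become ready or a stop signal before starting to dequeue). The channel names and the send callback are illustrative assumptions, not the library's actual internals.

```go
package fluentsketch

// run gates the dequeue loop on either a "connection ready" signal or a stop
// request, as the commit message above describes. Channel names and the send
// callback are illustrative, not the library's actual fields.
func run(ready, stopRunning <-chan struct{}, pending <-chan []byte, send func([]byte)) {
	select {
	case <-ready: // connection established; start draining the buffer
	case <-stopRunning: // Close() was called before the connection became ready
		return
	}
	for {
		select {
		case msg, ok := <-pending:
			if !ok {
				return // pending channel closed; nothing left to send
			}
			send(msg)
		case <-stopRunning:
			return // discard whatever is still buffered
		}
	}
}
```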
akerouanton pushed a commit to akerouanton/fluent-logger-golang that referenced this pull request May 21, 2020
PR fluent#77 introduced a new parameter named ForceStopAsyncSend. It can be
used to tell the logger not to try to send all the log messages in its
buffer before closing. Without this parameter, the logger hangs
whenever it has logs to write and the target Fluentd server is down.

But this new parameter is not enough: the connection is currently lazily
initialized when the logger receives its first message. This blocks the
select reading messages from the queue (until the connection is ready).
Moreover, the connection dialing uses an exponential back-off retry.

Because of that, the logger won't look for messages on the `stopRunning`
channel (the channel introduced by PR fluent#77), either because it's
blocked by the Sleep used for the retry or because the connection
dialing is waiting for the dialing timeout (e.g. the TCP timeout).

To fix these edge cases, the time.Sleep() used for the back-off retry
has been transformed into a time.After(), and a new stopAsyncConnect
channel has been introduced to stop it. Moreover, the dialer.Dial() call
used to connect to Fluentd has been replaced with dialer.DialContext(),
and a cancellable context is now used to stop the call to that method.

Finally, the Fluent.run() method has been adapted to wait for both new
messages on f.pending and a stop signal on the f.stopRunning channel.
Previously, both channels were selected independently, in a procedural
fashion.

This fix is motivated by the issue described in moby/moby#40063.

Signed-off-by: Albin Kerouanton <[email protected]>
akerouanton pushed a commit to akerouanton/fluent-logger-golang that referenced this pull request Nov 11, 2020
PR fluent#77 introduced a new parameter named ForceStopAsyncSend. It can be
used to tell the logger not to try to send all the log messages in its
buffer before closing. Without this parameter, the logger hangs
whenever it has logs to write and the target Fluentd server is down.

But this new parameter is not enough: the connection is currently lazily
initialized when the logger receives its first message. This blocks the
select reading messages from the queue (until the connection is ready).
Moreover, the connection dialing uses an exponential back-off retry.

Because of that, the logger won't look for messages on the `stopRunning`
channel (the channel introduced by PR fluent#77), either because it's
blocked by the Sleep used for the retry or because the connection
dialing is waiting for the dialing timeout (e.g. the TCP timeout).

To fix these edge cases, the time.Sleep() used for the back-off retry
has been transformed into a time.After(). Moreover, the dialer.Dial()
call used to connect to Fluentd has been replaced with
dialer.DialContext(), and a cancellable context is now used to stop the
call to that method. The associated cancel function is stored in Fluent
and is called by Close() when the ForceStopAsyncSend option is enabled.

Finally, the Fluent.run() method has been adapted to wait for both new
messages on f.pending and a stop signal on the f.stopRunning channel.
Previously, both channels were selected independently, in a procedural
fashion.

This fix is motivated by the issue described in moby/moby#40063.

Signed-off-by: Albin Kerouanton <[email protected]>
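A self-contained sketch of the interruptible dial pattern this commit message describes: exponential back-off using time.After inside a select, and dialer.DialContext with a cancellable context so Close() can abort an in-flight dial. Names and constants are illustrative assumptions, not the library's actual code.

```go
package main

import (
	"context"
	"log"
	"net"
	"time"
)

// connectLoop dials addr with exponential back-off. It can be interrupted
// either while waiting between attempts (time.After inside a select) or while
// a dial is in flight (context cancellation via DialContext).
func connectLoop(ctx context.Context, addr string, stop <-chan struct{}) (net.Conn, error) {
	dialer := &net.Dialer{Timeout: 5 * time.Second}
	backoff := 500 * time.Millisecond

	for {
		conn, err := dialer.DialContext(ctx, "tcp", addr)
		if err == nil {
			return conn, nil
		}
		log.Printf("dial %s failed: %v (retrying in %s)", addr, err, backoff)

		select {
		case <-time.After(backoff): // replaces a blocking time.Sleep
			backoff *= 2
		case <-stop: // stop signal, e.g. sent by Close()
			return nil, context.Canceled
		case <-ctx.Done(): // cancel function called by Close()
			return nil, ctx.Err()
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	stop := make(chan struct{})

	// Simulate Close() with ForceStopAsyncSend enabled: abort the dial loop
	// shortly after it starts.
	go func() {
		time.Sleep(1 * time.Second)
		cancel()
		close(stop)
	}()

	if _, err := connectLoop(ctx, "127.0.0.1:24224", stop); err != nil {
		log.Printf("connect aborted: %v", err)
	}
}
```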