Hi Amit,
The default timeout for the election is 15 seconds, so I would expect those log lines to show up at that interval.
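As a quick sanity check, the timestamps in Guangcai's log lines can be compared against that interval. The sketch below is only an illustration: the `intervals` helper is made up for this email, and the values are copied from the log output quoted further down. The gaps come out at roughly 15.3 to 15.6 seconds, which is consistent with a 15 second timeout plus a little overhead per attempt.

```go
package main

import "fmt"

// intervals returns the gaps, in seconds, between consecutive timestamps.
// Hypothetical helper for illustration; not part of Loggregator.
func intervals(ts []float64) []float64 {
	out := make([]float64, 0, len(ts))
	for i := 1; i < len(ts); i++ {
		out = append(out, ts[i]-ts[i-1])
	}
	return out
}

func main() {
	// "timestamp" fields copied from the quoted "lost election" log lines.
	ts := []float64{
		1442214238.411536455,
		1442214253.724292278,
		1442214269.286961317,
		1442214284.720170259,
		1442214300.056922436,
	}
	for _, d := range intervals(ts) {
		fmt.Printf("%.2f s\n", d)
	}
}
```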
The syslog_drain_binder election code was written before my time on Loggregator, so I don't know exactly why leader election was originally done this way. From my perspective it's easy to understand, and we haven't had any problems with it.
Rohit
On Tue, Sep 15, 2015 at 7:39 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
Hi Rohit,
To add to Guangcai's question, is it expected for those "lost election" log lines to be so frequent? Does the component run for election every 15 seconds? In other leader election protocols that I'm familiar with, the followers heartbeat to the leader and only hold an election and run for it if they determine that there is no leader.
Amit
On Mon, Sep 14, 2015 at 8:22 PM, Rohit Kumar <rokumar(a)pivotal.io> wrote:
Hi Guangcai,
The log messages are coming from the syslog_drain_binder process which is colocated with dopplers. The syslog_drain_binder is used to poll the CloudController for active syslog drain URLs for apps. At any point we only want one syslog_drain_binder to be active, so that the CloudController doesn't get overloaded with requests. The election process is done to ensure that.
To answer your question: yes, these messages are expected. Each syslog_drain_binder runs for election again after a specified timeout has expired. All of them try to create a key in etcd, but only one succeeds and becomes the leader. The exact logic can be found here <https://github.com/cloudfoundry/loggregator/blob/develop/src/syslog_drain_binder/elector/elector.go#L38-L58> .
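To make the idea concrete, here is a minimal sketch of "everyone tries to create the same key, only one succeeds". It is not the code from elector.go: the `store` type is a hypothetical in-memory stand-in for etcd's atomic create, and the `/leader` key name, `runForElection` function and `Delete` method are assumptions made up for this example. In particular, `Delete` stands in for whatever frees the key in the real system, which this thread does not describe.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrKeyExists is returned when another candidate already holds the key.
var ErrKeyExists = errors.New("key already exists")

// store is a hypothetical in-memory stand-in for etcd.
type store struct {
	mu   sync.Mutex
	keys map[string]string
}

func newStore() *store { return &store{keys: make(map[string]string)} }

// Create sets key to value only if key is absent (atomic create-if-absent).
func (s *store) Create(key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.keys[key]; ok {
		return ErrKeyExists
	}
	s.keys[key] = value
	return nil
}

// Delete removes key. Assumption: stands in for however the real key is freed.
func (s *store) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.keys, key)
}

// runForElection reports whether name managed to create the leader key.
func runForElection(s *store, name string) bool {
	return s.Create("/leader", name) == nil
}

func main() {
	s := newStore()
	for _, name := range []string{"doppler_z1.0", "doppler_z2.0"} {
		if runForElection(s, name) {
			fmt.Printf("%s won election for cluster leader\n", name)
		} else {
			fmt.Printf("%s lost election for cluster leader\n", name)
		}
	}
}
```

In this model every candidate simply retries on its timeout, so the instance that does not hold the key logs a lost election on each attempt, which matches the repeated messages you are seeing.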
Rohit
On Mon, Sep 14, 2015 at 1:10 AM, Guangcai Wang <guangcai.wang(a)gmail.com> wrote:
Hi all,
I have 2 doppler instances. I found that once one doppler wins the election for cluster leader, the other frequently logs "lost election for cluster leader" as follows. Is this expected?
{"timestamp":1442214238.411536455,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector: 'doppler_z1.0' lost election for cluster leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214253.724292278,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector: 'doppler_z1.0' lost election for cluster leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214269.286961317,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector: 'doppler_z1.0' lost election for cluster leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214284.720170259,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector: 'doppler_z1.0' lost election for cluster leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
{"timestamp":1442214300.056922436,"process_id":7212,"source":"syslog_drain_binder","log_level":"info","message":"Elector: 'doppler_z1.0' lost election for cluster leader.","data":null,"file":"/var/vcap/data/compile/syslog_drain_binder/loggregator/src/syslog_drain_binder/elector/elector.go","line":57,"method":"syslog_drain_binder/elector.(*Elector).RunForElection"}
I would also like to know under which conditions they will hold a new election for cluster leader.