Re: Feedback request: Disable logging Client IP’s in the Gorouter logs for compliance with the EU General Data Protection Regulation (GDPR)


Stefan Mayr
 

Hi,

On 13.03.2018 at 00:39, Shubha Anjur Tupil wrote:
Hello everyone, 

In light of the EU General Data Protection Regulation (GDPR), the routing
team is investigating adding manifest properties to allow an operator to
disable logging the client IPs in the |X-Forwarded-For| header in the
access and error logs for Gorouter.

Enforcement of the EU General Data Protection Regulation (GDPR)
(https://www.eugdpr.org/) begins May 25th and imposes steep fines. This
law says that companies will be fined if they are capturing PII. The
Gorouter currently captures Client IP addresses that are included in
that definition. We are exploring manifest properties to allow operators
to disable logging the originating client IP.

We need help weighing the options given the use-cases.

*Use-cases*:

#1. L3 Load Balancer e.g. Amazon NLB in front of the Gorouter 
The client IP is logged and the X-Forwarded-For header might have the
originating client IP if an intermediary component or the originating
client is adding the header.

#2. L7 Load Balancer e.g. Amazon NLB in front of the Gorouter 
The client IP is the IP of the load balancer in front of the Gorouter,
but the X-Forwarded-For header has the originating client IP.

*Option1*: We have one manifest property to disable logging both the
X-Forwarded-For header and the Client IP in the Gorouter logs.

For use-case #1 this works, and there is only one manifest property to
disable all PII being logged by the Gorouter.

For use-case #2 this results in the operator losing the LB IP which
might be helpful for troubleshooting.

*Option2*: We have two separate manifest properties to disable logging
the X-Forwarded-For header and/or the Client IP in the Gorouter logs.
This is generally more flexible at the cost of user experience.

For use-case #1 this would mean that an operator would have to set two
manifest properties instead of one. Both these properties would need to
be exposed in Ops Man for PCF installations. It leads to a more
cumbersome user experience, adding to the already long list of options.

For use-case #2 this results in a better outcome for an operator in that
they still get the information on the LB the request came from, while
the originating client IP will not be logged.
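
To make the difference between the two options concrete, here is a minimal Python sketch, not Gorouter's actual code. The record type, the flag names `disable_x_forwarded_for` and `disable_client_ip`, and the `-` placeholder are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace

REDACTED = "-"  # hypothetical placeholder written instead of an IP


@dataclass(frozen=True)
class AccessLogRecord:
    remote_addr: str      # IP of the direct peer (client or load balancer)
    x_forwarded_for: str  # X-Forwarded-For header value as received
    request_line: str


def redact(record, disable_x_forwarded_for=False, disable_client_ip=False):
    """Option 2: two independent switches, one per field."""
    if disable_x_forwarded_for:
        record = replace(record, x_forwarded_for=REDACTED)
    if disable_client_ip:
        record = replace(record, remote_addr=REDACTED)
    return record


def redact_option1(record, disable_pii=False):
    """Option 1: a single switch that controls both fields together."""
    return redact(record, disable_pii, disable_pii)
```

With an L7 load balancer (use-case #2), `redact(record, disable_x_forwarded_for=True)` keeps the LB address in `remote_addr` while dropping the originating client IP, whereas `redact_option1(record, disable_pii=True)` drops both.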

Based on the information we have, we are not sure which experience is
better and which use-case to optimize for. Some questions we have:

1. What type of LB is more popular in CF environments; L3 or L7? This
might help us optimize for Use-case #2 and go with Option #2.

Our deployment scenario is #2. The L7 load balancer inserts an
X-Forwarded-For header.

2. Is there a compelling reason from an operator perspective for
installations with an L7 load balancer, making it important to have
the Client IP (i.e. they would absolutely want to have Option #2 or
maybe don’t care and Option #1 would be ok)?

All external requests to gorouter come through the L7 load
balancer. So for most debugging use cases the load balancer IP in the
gorouter access log would not provide any value, but the client IP from
the X-Forwarded-For header does. This brings up another point: it is not
generally forbidden to store IP addresses for specific use cases, at
least in Germany. E.g., as far as I understand the current situation, we
are allowed to store IP addresses for defense or debugging purposes for
a limited time if (!) this is documented in the public data privacy
statement. It looks like 1-2 weeks are generally acceptable for this use
case.

3. Does the user experience benefit by having just one option outweigh
the benefit of having the client IP for an operator with an L7 Load
Balancer? (Flexibility over Experience)

For us, having separate configuration options would not provide any
advantage. If we cannot limit the retention period through log
rotation, we would disable IP logging.
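
If retention is to be limited by log rotation, one possible approach is a periodic job that deletes rotated log files older than a cutoff. The Python sketch below assumes the logs are plain files in a single directory and uses a hypothetical 14-day default based on the 1-2 week window mentioned above. It is an illustration only, not something Gorouter provides.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def purge_old_logs(log_dir, max_age_days=14, now=None):
    """Delete files in log_dir last modified more than max_age_days ago.

    Returns the sorted names of the files that were removed.
    The directory layout and the 14-day default are assumptions.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for path in Path(log_dir).iterdir():
        # Only regular files are considered; subdirectories are left alone.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Such a job would only cover files on the router VM itself. Copies of the same log lines held by other components would need their own retention handling.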

We would love to hear from you if you have thoughts on this.

This topic raises some other questions about the IP addresses being
passed around in multiple components. gorouter creates the RTR log
messages that we can read from loggregator. Does loggregator buffer
those messages on disk, or are they only kept in memory? When is this
information purged from loggregator?
The next component we see is the buildpack. Some buildpacks also show
classic access log information on stdout. Has anybody checked whether
any of the buildpacks write this information to disk?


Thanks, 

Shannon Coen & Shubha Anjur Tupil
Product Managers, CF Routing

Regards,

Stefan
