CF Networking -- Seeking clarity on today's implementation
Matthew Sykes <matthew.sykes@...>
1. Yes - the source address observed by listeners accessed from a container
will be a DEA address.
2. Yes - DEAs implement a default egress policy of deny; you must
explicitly enable access to addresses.
3. Thank you. One way to contribute to the docs is to create a pull request
against [1]. If that's not something you're comfortable with, I'm certain
we can find a different way to incorporate some of this information for new
users.
Thanks.
[1]: https://github.com/cloudfoundry/docs-book-cloudfoundry
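To make the default-deny egress policy in point 2 concrete, here is a minimal sketch of how an allow-list of security group rules could be evaluated. The rule shape (`protocol`, `destination`, `ports`) follows the JSON used for application security groups, but the addresses, the rule set, and the function names are hypothetical, and the sketch only handles destinations given as a single IP or a CIDR block (not address ranges).

```python
import ipaddress

# Hypothetical allow rules in the shape of an application security group.
RULES = [
    {"protocol": "tcp", "destination": "10.0.11.0/24", "ports": "80,443"},
    {"protocol": "udp", "destination": "10.0.0.2", "ports": "53"},
]


def port_allowed(spec, port):
    """Check a port against a spec such as "80,443" or "8000-8100"."""
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= port <= int(high):
                return True
        elif int(part) == port:
            return True
    return False


def egress_allowed(rules, protocol, dest_ip, port):
    """Default deny: traffic passes only if some rule explicitly matches it."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if rule["protocol"] not in (protocol, "all"):
            continue
        if addr not in ipaddress.ip_network(rule["destination"], strict=False):
            continue
        if rule["protocol"] == "all" or port_allowed(rule["ports"], port):
            return True
    return False  # nothing matched, so the flow is denied


if __name__ == "__main__":
    print(egress_allowed(RULES, "tcp", "10.0.11.7", 443))  # allowed by rule 1
    print(egress_allowed(RULES, "tcp", "10.0.12.7", 443))  # no rule matches
```

On a real deployment the rules live in a JSON file that is registered with `cf create-security-group` and attached with `cf bind-security-group`; the enforcement itself happens in iptables on the DEA, not in application code.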
On Mon, Dec 21, 2015 at 5:43 PM, ravi malhotra <ravi.malhotra(a)bnymellon.com>
wrote:
Shannon, Matt, Amit --
Great responses. Thank you.
Can you please confirm the statements below:
1) For outbound flows from the App -- I am in enterprise space -- and I am
not using a NAT VM -- the source address of the Container will be Source
NATTED behind the DEA or VMs routable address.
2) Policies to permit/deny above flows can be configured using Security
Groups.
3) Everyone here has provided a lot of information. I would be happy to
prepare a Network101 for CF if the community is accepting updated
documentation. Please let me know if I should take all your responses and
prepare a document to upload.
--
Matthew Sykes
matthew.sykes(a)gmail.com
ravi malhotra
Shannon, Matt, Amit --
Great responses. Thank you.
Can you please confirm the statements below:
1) For outbound flows from the App -- I am in enterprise space -- and I am not using a NAT VM -- the source address of the Container will be Source NATTED behind the DEA or VMs routable address.
2) Policies to permit/deny above flows can be configured using Security Groups.
3) Everyone here has provided a lot of information. I would be happy to prepare a Network101 for CF if the community is accepting updated documentation. Please let me know if I should take all your responses and prepare a document to upload.
Shannon Coen
On Wed, Dec 16, 2015 at 9:26 AM, Amit Gupta <agupta(a)pivotal.io> wrote:
Sykes answered this well.
3. Router to DEA traffic: is the Router just changing the destination
address of the request to the address of the DEA and forwarding the request
with the source address intact?
Check with the Routing team PM, Shannon (cc'd)
The router is targeting the ip and port advertised in the route. For
application instances hosted by a DEA, that will be the DEA’s address and a
port associated with a DNAT rule to the application’s endpoint in the
container’s network namespace.
The routing table is dynamically updated by requests from DEAs (for legacy
runtime) and Route-Emitter (for Diego). The source address from the
perspective of an application running on CF is the address of a router.
Upstream addresses are included in the X-Forwarded-For header.
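As an illustration of the last point, the sketch below shows how a reverse proxy could append the address of its peer to X-Forwarded-For, and how an app could read the chain back. The function names and the example addresses are hypothetical; this is not the router's actual code, and header-name case handling is ignored for brevity.

```python
def append_forwarded_for(headers, peer_ip):
    """Return a copy of headers with peer_ip appended to X-Forwarded-For."""
    out = dict(headers)
    existing = out.get("X-Forwarded-For")
    out["X-Forwarded-For"] = f"{existing}, {peer_ip}" if existing else peer_ip
    return out


def forwarded_chain(headers):
    """Split X-Forwarded-For into the list of upstream addresses, oldest first."""
    value = headers.get("X-Forwarded-For", "")
    return [part.strip() for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    # Client 203.0.113.9 -> load balancer 198.51.100.4 -> router -> app.
    at_lb = append_forwarded_for({}, "203.0.113.9")         # LB records the client
    at_router = append_forwarded_for(at_lb, "198.51.100.4")  # router records the LB
    print(forwarded_chain(at_router))
```

The app still sees the router as the TCP source address; only the header carries the upstream addresses. Because clients can send their own X-Forwarded-For value, an app should only trust the entries added by proxies it knows about.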
East-West traffic between Containers:
1. E-W traffic must go through a Router.
2. APP1 will seek out a Router (which one?)
3. The Router will direct the request to APP2 on some DEA using some
algorithm (say, round-robin).
4. The reverse traffic from APP2 to APP1 would need to be NATTED to the
Router address. Also, we need a destination NAT. Not sure how the NAT
function would do this work.
Not sure what you mean by East-West. Request to app1.my-domain.com
typically has DNS resolve to an upstream LB. LB routes traffic to routers,
although you could have DNS resolve directly to routers if you want to
expose routers externally. Routers then balance traffic to apps. I believe
the response returns via the router, again, check with Shannon.
In a typical CF topology, requests from one app to another use DNS names
that resolve to a load balancer in front of the CF routers. Response by an
app goes back the way it came as connections at each leg are kept open
pending response.
3. Is there any performance data available on the CF Router?
Shannon can give you a more comprehensive answer. I know they did some
perf tests when using Routers for SSL termination. Other high request tests
I've seen have not exposed router as bottleneck, rather some conntrack
parameter settings in Ubuntu, on the DEA. These have since been addressed.
In the SSL perf tests Amit referred to, we found that the initial SSL
request, in which the session is established, takes 3x as long as an
unencrypted request; subsequent encrypted requests are comparable with
unencrypted requests.
Otherwise we don't have great documentation to share here yet. We have an
epic of performance exploration coming up, in which we'll aim to benchmark
metrics like throughput and connection rate limits, and document the
results for public consumption.
Which DEAs a router knows about? What tcp sessions are active? Where can I
find the detailed documentation?
DEAs don't know about routers. Currently, DEAs broadcast application
routes over a message bus, routers subscribe to the channel. This may
change in the future with Diego cells directly talking to the routing tier
over HTTP to populate the routing tables.
The route-emitter listens to Diego for application routing information and
sends it to the message bus. In the future the emitter will send this
configuration to the Routing API, or the emitter will be collapsed into the
Routing API which will fetch routing config from Diego itself.
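The publish/subscribe flow described above can be sketched as a small routing table that is filled from registration messages. The message fields (`host`, `port`, `uris`), the expiry of entries that are not refreshed, and the class name are assumptions made for illustration; they are not taken from this thread.

```python
import json
import time


class RouteTable:
    """Maps a route URI to the set of (host, port) endpoints advertised for it."""

    def __init__(self, ttl_seconds=120.0):
        self.ttl = ttl_seconds  # assumed expiry for entries that stop refreshing
        self._routes = {}       # uri -> {(host, port): last_seen}

    def register(self, raw_message, now=None):
        """Handle one registration message received from the message bus."""
        now = time.monotonic() if now is None else now
        msg = json.loads(raw_message)
        endpoint = (msg["host"], msg["port"])
        for uri in msg["uris"]:
            self._routes.setdefault(uri, {})[endpoint] = now

    def endpoints(self, uri, now=None):
        """Return live endpoints for a URI, dropping the ones that went stale."""
        now = time.monotonic() if now is None else now
        entries = self._routes.get(uri, {})
        for endpoint in [ep for ep, seen in entries.items() if now - seen > self.ttl]:
            del entries[endpoint]
        return sorted(entries)


if __name__ == "__main__":
    table = RouteTable(ttl_seconds=10)
    table.register(
        '{"host": "10.0.16.5", "port": 61001, "uris": ["app1.my-domain.com"]}',
        now=0,
    )
    print(table.endpoints("app1.my-domain.com", now=5))
```

The design point is that publishers never need to know which routers exist: a DEA or the route-emitter only publishes, and any number of routers can build the same table by subscribing.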
Matthew Sykes <matthew.sykes@...>
Just adding a couple of technical bits to Jason’s response.
On Dec 18, 2015, at 1:41 PM, Jason Sherron <jsherron(a)pivotal.io> wrote:
Hi Ravi,
Matt and Amit really did a great job in the thread so far; I'm the new networking PM so I'm still learning some of this, but I wanted to help answer your questions too. See inline.
On Fri, Dec 18, 2015 at 10:01 AM, ravi malhotra <ravi.malhotra(a)bnymellon.com> wrote:
Matt and Amit,
Great feedback from you! Thanks so much for taking so much time.
I think I understand some more pieces of the puzzle, but, some more statements/questions -- if you could validate/comment, as you did earlier. (Amit, You had asked where I am looking up stuff. Mainly using cloudfoundry.org. There are also some blogs -- I saw one from someone in Pivotal.)
NATS BUS:
a) DEAs will publish App instances using their (routable) IP Address and a tcp Port (the Port is in turn mapped by the DEA to a private container ip and port).
b) Routers subscribe to this bus (since external traffic needs to be load balanced over the available App instances).
c) DEAs currently do not subscribe to this bus (may change with Diego).
This is correct. Do you have a specific interest or concern about c)? I've noticed you've asked about direct DEA communications a couple times.
To be more explicit, NATS is a topic-based message bus. The DEAs do subscribe to a number of topics but routing isn’t one of them. Basically, NATS is used for far more than routing.
ARCHITECTURE OF CONTAINER NETWORK (ON A DEA):
a) Each DEA may define one or more container networks.
b) Each container network is local to the DEA.
c) The container network could be associated with a subnet (say, 10.254.1.0/24, 10.254.2.0/24, or just point-to-point /30 or /31, etc)
For DEA/Warden, we reserve a /30 for each container. We use one address for the host side, one for the container side, one for broadcast, and one is unused.
d) A container/app will derive an address from its container subnet.
e) The DEA interface on the container network is configured with an IP address which represents the default gateway for the containers.
Warden (you’ve been using warden and DEA interchangeably but they are different components with different roles and responsibilities) uses the address associated with the host side of the veth pair as the default gateway for the associated container.
f) The DEA interface on the container network also provides the NAT function for traffic outbound from Container to anywhere outside the DEA.
g) This NAT function is provisioned via IPTABLES rules.
iptables masquerade rules are configured on the DEA node by Warden to handle NAT for the container.
Generally correct, but you won't have multiple "container networks" on a single DEA/cell, if I'm understanding you correctly. (If you have a doc reference, please share.) We're working on ways this might change, such as with overlay networks. I'd like to know if you have a specific scenario in mind.
TRAFFIC FROM AN EXTERNAL WEB SERVER TO APP INSTANCE:
a) The router will target the DEA Address/Port it saw on the NATS BUS.
Terminology - the mechanism used by the router is not important; the router uses the address and port advertised in the *route*. Routes can be created by NATS messages or APIs. The DEAs will only advertise routes with one of the addresses in a multi-homed stack and it selects the first non-loopback address returned by the ruby socket API.
b) The source ip address of the web server will not change in this flow from Router to DEA to Container.
Not sure what you’re getting at here. Since the go router is essentially a reverse proxy, when it forwards a request to an application instance, the source address (TCP level) will be the go router that forwarded the request. Since the original source is lost, the go router will add X-Forwarded-For and X-Forwarded-Proto headers for the app. If XFF already exists, the router will append to it.
c) The flow from the Container to the web server will be NATTED by the DEA (to its external IP address -- the same address that the Web Server targeted).
I’m afraid I don’t completely follow this question. If you’re talking about the response from the app to the web server, there’s no need for NAT; if you’re talking about a request from an app container to an external web server, yes, the request will masquerade as the DEA. If the DEA is in a private address space, another level of NAT will be done when the request leaves the private network.
d) There is mention of a NAT VM in the CF documentation. Not sure how that fits into the architecture!
Think of an AWS VPC where everything has a private address; in a situation like that, you’ll need a NAT box for outbound flows to public addresses. Depending on how and where you’re doing a deployment, you may not have a “NAT box.”
That NAT VM exists for outbound communication originated from inside the perimeter. See diagrams in the security doc [1].
TRAFFIC FROM APP to APP:
a) A container will use DNS and will receive the address of a Router or the LB in the DNS reply message (as Asit mentioned in his reply).
DNS should resolve to a load balancer.
b) If two containers on different DEAs are talking to each other -- the flow would need to go through a router.
Load balancer and router in the default configuration.
c) If two containers on the same DEA are talking to each other -- the flow would still need to go through a router.
Load balancer and router in the default configuration.
d) In other words, Container-P and Container-Q on container subnet A could talk to each other over this virtual network but since they will never see each other's private addresses all communications would need to go through a router.
This is correct, and what we're working to change, to allow controlled container-to-container communication. I encourage you to read and comment on the proposal [2], and contact me directly if you want to discuss further.
NAT VM
a) I do not understand how this fits the architecture.
Answered previously.
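The two levels of source NAT discussed in this message (masquerade on the DEA, then a NAT box if the DEA itself sits in a private address space) can be traced with a small sketch. All addresses and names below are hypothetical, and the model is deliberately simplified: real masquerading also rewrites ports and tracks connections so that replies can be mapped back.

```python
import ipaddress
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str


def snat(packet, inside_net, outside_ip):
    """Rewrite the source to outside_ip when traffic leaves inside_net."""
    net = ipaddress.ip_network(inside_net)
    leaving = (
        ipaddress.ip_address(packet.src) in net
        and ipaddress.ip_address(packet.dst) not in net
    )
    return packet._replace(src=outside_ip) if leaving else packet


if __name__ == "__main__":
    # Hypothetical layout: container pool 10.254.0.0/22 on a DEA at 10.0.16.5,
    # private deployment network 10.0.0.0/16 behind a NAT box at 203.0.113.10.
    request = Packet(src="10.254.0.6", dst="93.184.216.34")
    after_dea = snat(request, "10.254.0.0/22", "10.0.16.5")
    after_nat_box = snat(after_dea, "10.0.0.0/16", "203.0.113.10")
    print(after_dea.src, after_nat_box.src)
```

A listener inside the private network sees the DEA address as the source, while a listener on a public address sees the NAT box; in neither case is the container's own address visible.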
SECURITY
a) Need to understand how security is set up (assuming it is all via IPTABLES rules but need to look up the methods available to set up rules).
At the bottom, it's IPTables. [1] is a good reference as is the linked material on Application Security Groups that Amit and Matt mentioned previously.
1- https://docs.cloudfoundry.org/concepts/security.html
2 - https://docs.google.com/document/d/1zQJqIEk4ldHH5iE5zat_oKIK8Ejogkgd_lySpg_oV_s/edit#heading=h.rv9r1vkzpih9
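The /30-per-container reservation mentioned in this message can be worked through with Python's `ipaddress` module. The pool `10.254.0.0/22` is a hypothetical example, and which of the two usable addresses goes to the host side versus the container side is an assumption; the message only says that one address is used for each.

```python
import ipaddress
import itertools


def carve_container_subnets(pool, count):
    """Label the four addresses of the first `count` /30 subnets in `pool`."""
    net = ipaddress.ip_network(pool)
    layouts = []
    for subnet in itertools.islice(net.subnets(new_prefix=30), count):
        network, first, second, broadcast = list(subnet)
        layouts.append({
            "subnet": str(subnet),
            "unused": str(network),         # network address, not assigned
            "host_side": str(first),        # assumed: host end of the veth pair
            "container_side": str(second),  # assumed: container end
            "broadcast": str(broadcast),
        })
    return layouts


def max_containers(pool):
    """Each container consumes four addresses, so the pool holds size / 4."""
    return ipaddress.ip_network(pool).num_addresses // 4


if __name__ == "__main__":
    for layout in carve_container_subnets("10.254.0.0/22", 2):
        print(layout)
    print(max_containers("10.254.0.0/22"))
```

The practical consequence is that the size of the pool caps the number of containers per host: a /22 holds 1024 addresses, which is room for 256 point-to-point /30 links.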
Jason Sherron
Hi Ravi,
Matt and Amit really did a great job in the thread so far; I'm the new
networking PM so I'm still learning some of this, but I wanted to help
answer your questions too. See inline.
On Fri, Dec 18, 2015 at 10:01 AM, ravi malhotra <ravi.malhotra(a)bnymellon.com>
wrote:
Matt and Amit,
Great feedback from you! Thanks so much for taking so much time.
I think I understand some more pieces of the puzzle, but, some more
statements/questions -- if you could validate/comment, as you did earlier.
(Amit, You had asked where I am looking up stuff. Mainly using
cloudfoundry.org. There are also some blogs -- I saw one from someone in
Pivotal.)
NATS BUS:
a) DEAs will publish App instances using their (routable) IP Address and a
tcp Port (the Port is in turn mapped by the DEA to a private container ip
and port).
b) Routers subscribe to this bus (since external traffic needs to be load
balanced over the available App instances).
c) DEAs currently do not subscribe to this bus (may change with Diego).
This is correct. Do you have a specific interest or concern about c)? I've
noticed you've asked about direct DEA communications a couple times.
ARCHITECTURE OF CONTAINER NETWORK (ON A DEA):
a) Each DEA may define one or more container networks.
b) Each container network is local to the DEA.
c) The container network could be associated with a subnet (say,
10.254.1.0/24, 10.254.2.0/24, or just point-to-point /30 or /31, etc)
d) A container/app will derive an address from its container subnet.
e) The DEA interface on the container network is configured with an IP
address which represents the default gateway for the containers.
f) The DEA interface on the container network also provides the NAT
function for traffic outbound from Container to anywhere outside the DEA.
g) This NAT function is provisioned via IPTABLES rules.
Generally correct, but you won't have multiple "container networks" on a
single DEA/cell, if I'm understanding you correctly. (If you have a doc
reference, please share.) We're working on ways this might change, such as
with overlay networks. I'd like to know if you have a specific scenario in
mind.
TRAFFIC FROM AN EXTERNAL WEB SERVER TO APP INSTANCE:
a) The router will target the DEA Address/Port it saw on the NATS BUS.
b) The source ip address of the web server will not change in this flow
from Router to DEA to Container.
c) The flow from the Container to the web server will be NATTED by the DEA
(to its external IP address -- the same address that the Web Server
targeted).
d) There is mention of a NAT VM in the CF documentation. Not sure how that
fits into the architecture!
That NAT VM exists for outbound communication originated from inside the
perimeter. See diagrams in the security doc [1].
TRAFFIC FROM APP to APP:
a) A container will use DNS and will receive the address of a Router or
the LB in the DNS reply message (as Asit mentioned in his reply).
b) If two containers on different DEAs are talking to each other -- the
flow would need to go through a router.
c) If two containers on the same DEA are talking to each other -- the flow
would still need to go through a router.
d) In other words, Container-P and Container-Q on container subnet A could
talk to each other over this virtual network but since they will never see
each other's private addresses all communications would need to go through a
router.
This is correct, and what we're working to change, to allow controlled
container-to-container communication. I encourage you to read and comment
on the proposal [2], and contact me directly if you want to discuss further.
NAT VM
a) I do not understand how this fits the architecture.
Answered previously.
SECURITY
a) Need to understand how security is set up (assuming it is all via
IPTABLES rules but need to look up the methods available to set up rules).
At the bottom, it's IPTables. [1] is a good reference as is the linked
material on Application Security Groups that Amit and Matt mentioned
previously.
1- https://docs.cloudfoundry.org/concepts/security.html
2 -
https://docs.google.com/document/d/1zQJqIEk4ldHH5iE5zat_oKIK8Ejogkgd_lySpg_oV_s/edit#heading=h.rv9r1vkzpih9
ravi malhotra
Matt and Amit,
Great feedback from you! Thanks so much for taking so much time.
I think I understand some more pieces of the puzzle, but, some more statements/questions -- if you could validate/comment, as you did earlier. (Amit, You had asked where I am looking up stuff. Mainly using cloudfoundry.org. There are also some blogs -- I saw one from someone in Pivotal.)
NATS BUS:
a) DEAs will publish App instances using their (routable) IP Address and a tcp Port (the Port is in turn mapped by the DEA to a private container ip and port).
b) Routers subscribe to this bus (since external traffic needs to be load balanced over the available App instances).
c) DEAs currently do not subscribe to this bus (may change with Diego).
ARCHITECTURE OF CONTAINER NETWORK (ON A DEA):
a) Each DEA may define one or more container networks.
b) Each container network is local to the DEA.
c) The container network could be associated with a subnet (say, 10.254.1.0/24, 10.254.2.0/24, or just point-to-point /30 or /31, etc)
d) A container/app will derive an address from its container subnet.
e) The DEA interface on the container network is configured with an IP address which represents the default gateway for the containers.
f) The DEA interface on the container network also provides the NAT function for traffic outbound from Container to anywhere outside the DEA.
g) This NAT function is provisioned via IPTABLES rules.
TRAFFIC FROM AN EXTERNAL WEB SERVER TO APP INSTANCE:
a) The router will target the DEA Address/Port it saw on the NATS BUS.
b) The source ip address of the web server will not change in this flow from Router to DEA to Container.
c) The flow from the Container to the web server will be NATTED by the DEA (to its external IP address -- the same address that the Web Server targeted).
d) There is mention of a NAT VM in the CF documentation. Not sure how that fits into the architecture!
TRAFFIC FROM APP to APP:
a) A container will use DNS and will receive the address of a Router or the LB in the DNS reply message (as Asit mentioned in his reply).
b) If two containers on different DEAs are talking to each other -- the flow would need to go through a router.
c) If two containers on the same DEA are talking to each other -- the flow would still need to go through a router.
d) In other words, Container-P and Container-Q on container subnet A could talk to each other over this virtual network but since they will never see each other's private addresses all communications would need to go through a router.
NAT VM
a) I do not understand how this fits the architecture.
SECURITY
a) Need to understand how security is set up (assuming it is all via IPTABLES rules but need to look up the methods available to set up rules).
Amit Kumar Gupta
Hey Ravi, great questions. You're right, all these details may not be well
documented. Out of curiosity, where did you look for documentation, and
where did you find out what you've currently come to know?
Responses inline.
On Wednesday, December 16, 2015, ravi malhotra <ravi.malhotra(a)bnymellon.com>
wrote:
Not sure what you mean by East-West. Request to app1.my-domain.com
typically has DNS resolve to an upstream LB. LB routes traffic to routers,
although you could have DNS resolve directly to routers if you want to
expose routers externally. Routers then balance traffic to apps. I believe
the response returns via the router, again, check with Shannon.
May be upcoming in Garden.
Can you elaborate?
As an app developer using the Diego backend, you can SSH into a container
(unless permissions are restricted by the platform operator or space manager).
As an operator you can SSH onto the DEA or Diego cell itself.
See answers to first couple questions.
Shannon can give you a more comprehensive answer. I know they did some perf
tests when using Routers for SSL termination. Other high request tests I've
seen have not exposed router as bottleneck, rather some conntrack parameter
settings in Ubuntu, on the DEA. These have since been addressed.
Yes.
Big question. The agent is in all stem cells, not just DEA. BOSH director
communicates with it to tell the VM what to do. Looking at the agent client
interface might be a helpful start.
https://github.com/cloudfoundry/bosh-agent/blob/master/agentclient/agent_client_interface.go
No.
What do you mean within a droplet? Do you mean instances of the same
application?
This is allowed.
CF is unique amongst platforms that containerize and schedule workloads in
that it goes beyond this, and treats applications and routes as first
class; containers and IPs can usually be safely ignored as an
implementation detail. It's possible to get the information you
mentioned as an operator via the Diego BBS API, but depending on the
problem you're trying to solve, this may not be the most relevant data.
In fact, even Diego abstracts containers into long running processes and
tasks. When running with Windows Diego cells (as opposed to Linux) the
notion of container obviously doesn't translate into namespaces and cgroups.
At any rate, you can see the Diego BBS client interface here:
https://github.com/cloudfoundry-incubator/bbs/blob/master/client.go
DEAs don't know about routers. Currently, DEAs broadcast application routes
over a message bus, routers subscribe to the channel. This may change in
the future with Diego cells directly talking to the routing tier over HTTP
to populate the routing tables.
I did not find detailed documentation so I have created a set of
assumptions below on how CF Networking works. Are these correct? Can I get
answers to the questions? Thank you!
High Level Architecture/Scaling:
1. APP Containers and CF Routers are created/destroyed dynamically to
support application needs.
Depends what you mean by dynamically. Users of the platform (people who
push apps) can scale apps via command line "cf scale myapp -i 4" (for 4
instances) or with an application manifest. That's true of open source CF.
Some official vendors (e.g. Pivotal) offer application monitoring and
autoscaling add-on services.
Routers are part of the platform and are managed by an operator. These are
deployed using BOSH and are scaled up via a manifest change (BOSH manifest
though, not CF). There are no dynamic solutions for this in OSS or vendor
solutions at this time that I know of.
2. A Router can LB to any DEA in the environment (or, are there
Availability Zones which prescribe sets of Routers and DEAs?)
Yes, any DEA. The next generation CF backend "Diego" is the same way,
routers can LB to any Diego cell.
3. DEAs cannot talk directly to each other; APP1 to APP2 communication
must go through a Router?
They can, if you configure your application security groups to allow it.
You would still need to know IP and port of the container you're trying to
reach. That said, a brand new project is spinning up (or has already) to
solve container to container networking for containers on Diego cells.
4. If I deploy my own LB solution -- how do I dynamically update Router
addresses in my LB (as Routers are created/destroyed)?
If you use an AWS ELB, this is handled by BOSH. If you're deploying
something like HAProxy as part of your CF deployment, the manifest where
you declare the desire to scale up routers can also configure HAProxy to
know about them. Most BOSH manifest generation tooling automatically
handles making sure the HAProxy config gets the right data.
Communication from Router to App:
1. Router can use some algorithm (like round-robin) to direct traffic to a
DEA.
Yes.
2. Router to DEA traffic: is there an overlay network? or are we just
utilizing the native network?
Native. You could presumably have an overlay network, but not necessary.
3. Router to DEA traffic: is the Router just changing the destination
address of the request to the address of the DEA and forwarding the request
with the source address intact?
Check with the Routing team PM, Shannon (cc'd)
4. Router to DEA traffic: let's say the Router dies half way through; can
we mirror state to another Router?
What sort of state?
5. If a Router dies – all the DEAs can still be accessed via other
Routers; is this right?
Yes.
Communication from point of view of App/Container:
1. An APP (container) cannot directly talk to another APP (container) even
in the same DEA. This communication must go through a Router. Is this
accurate?
I interpreted your previous question about DEA/Diego cell to APP as meaning
APP to APP, so see previous response. Or did I misinterpret your previous
question?
2. The container is in a Network Name Space which is bridged to a Linux
Bridge that then joins to physical NIC.
I would check with the Garden PMs (garden is the containerizer used in CF),
Will and Julz.
3. Containers are isolated from each other because they are in different
Name Spaces and because of IPTables rules.
Yes, standard Linux container technology, cgroups and namespaces.
3. IPTables rules allow the container to communicate with all Routers.
4. IPTables rules bar the container from directly talking to anything that
is not a Router.
Again, check with Garden PMs.
East-West traffic between Containers:
1. E-W traffic must go through a Router.
2. APP1 will seek out a Router (which one?)
3. The Router will direct the request to APP2 on some DEA using some
algorithm (say, round-robin).
4. The reverse traffic from APP2 to APP1 would need to be NATTED to the
Router address. Also, we need a destination NAT. Not sure how the NAT
function would do this work.
Not sure what you mean by East-West. Request to app1.my-domain.com
typically has DNS resolve to an upstream LB. LB routes traffic to routers,
although you could have DNS resolve directly to routers if you want to
expose routers externally. Routers then balance traffic to apps. I believe
the response returns via the router, again, check with Shannon.
Management:
1. Is there ability to define network policy in WARDEN to shut an APP?
2. We may want to define policy based on bandwidth usage.
May be upcoming in Garden.
3. Can we configure QoS bits on an application?
Can you elaborate?
Troubleshooting:
1. Is there a promiscuous APP on a container that can sniff all traffic so
we can troubleshoot?
2. Use case for above: let's say an APP appears to freeze -- having a
packet capture from the DEA node could help diagnose the problem.
As an app developer using the Diego backend, you can SSH into the container
(unless permissions are restricted by the platform operator or space manager).
As an operator you can SSH onto the DEA or Diego cell itself.
Performance:
1. When is a new CF Router instance spun up? Can I set up a rule in BOSH
to spin up new router when a certain traffic threshold is exceeded?
2. Similarly, when are new APP instances spun up?
See answers to first couple questions.
3. Is there any performance data available on the CF Router?
Shannon can give you a more comprehensive answer. I know they did some perf
tests when using Routers for SSL termination. Other high request tests I've
seen have not exposed router as bottleneck, rather some conntrack parameter
settings in Ubuntu, on the DEA. These have since been addressed.
DEA:
1. can DEA's be multihomed on Public and private networks?
Yes.
2. the BOSH agent on each DEA – what are all its functions?
Big question. The agent is in all stemcells, not just the DEA. The BOSH director
communicates with it to tell the VM what to do. Looking at the agent client
interface might be a helpful start.
https://github.com/cloudfoundry/bosh-agent/blob/master/agentclient/agent_client_interface.go
Is it collecting health data used by the router in the LB decision?
No.
Packet walk (please include LB and overlay technologies involved):
1. From App to App within a droplet?
What do you mean within a droplet? Do you mean instances of the same
application?
2. From App to App between droplets on the same host?
3. From App to App between droplets on different hosts?
4. From App to App between Availability Zones? (is this allowed?)
This is allowed.
5. From web server (outside CF environment) to App.
IP Addressing:
1. The containers all take addresses from a NATTED range (say,
10.254.0.0/16). Don’t I also need to NAT my source address? Example, I am
coming from an Apache web server to a CF App. The source address of the
Apache web server cannot be from the 10.254.0.0/16 range (if it were, we
would need to NAT the source).
2. Are the container addresses further subnetted (say, /24 per host?)
IP Multicast: Assuming there is no requirement for IP multicast in this
space.
Details: Commands to check which containers are up? What are their
addresses?
CF is unique amongst platforms that containerize and schedule workloads in
that it goes beyond this, and treats applications and routes as first-class
concepts; containers and IPs can usually be safely ignored as an
implementation detail. It's possible to get the information you
mentioned as an operator via the Diego BBS API, but depending on the
problem you're trying to solve, this may not be the most relevant data.
In fact, even Diego abstracts containers into long running processes and
tasks. When running with Windows Diego cells (as opposed to Linux) the
notion of container obviously doesn't translate into namespaces and cgroups.
At any rate, you can see the Diego BBS client interface here:
https://github.com/cloudfoundry-incubator/bbs/blob/master/client.go
Which DEAs a router knows about? What tcp sessions are active? Where can I
find the detailed documentation?
DEAs don't know about routers. Currently, DEAs broadcast application routes
over a message bus, routers subscribe to the channel. This may change in
the future with Diego cells directly talking to the routing tier over HTTP
to populate the routing tables.
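The route advertisement flow described above can be sketched roughly as follows. This is a minimal illustration, not the router's actual code: the RouteTable class, its method names, the addresses, and the idea of expiring entries after a TTL are all assumptions made for the example.

```python
import time


class RouteTable:
    """Hypothetical in-memory routing table fed by route advertisements."""

    def __init__(self, ttl_seconds=120.0):
        # Assumption: an entry that is not re-advertised within the TTL is stale.
        self._ttl = ttl_seconds
        self._routes = {}  # hostname -> {(ip, port): time last advertised}

    def register(self, hostname, ip, port, now=None):
        """Record (or refresh) an advertisement received from the message bus."""
        now = time.monotonic() if now is None else now
        self._routes.setdefault(hostname, {})[(ip, port)] = now

    def lookup(self, hostname, now=None):
        """Return the endpoints for a hostname that are not yet stale."""
        now = time.monotonic() if now is None else now
        endpoints = self._routes.get(hostname, {})
        return sorted(ep for ep, seen in endpoints.items() if now - seen <= self._ttl)


table = RouteTable(ttl_seconds=10)
table.register("app1.my-domain.com", "10.0.16.5", 61001, now=0)
table.register("app1.my-domain.com", "10.0.16.6", 61003, now=5)
fresh = table.lookup("app1.my-domain.com", now=8)
later = table.lookup("app1.my-domain.com", now=12)
```

In this sketch the router never asks which DEAs exist; it only knows the endpoints that have been advertised for each route.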
Matthew Sykes <matthew.sykes@...>
You have a lot of questions here and some of them may depend on how you’re deploying Cloud Foundry. I’ll add some answers inline.
On Dec 16, 2015, at 11:05 AM, ravi malhotra <ravi.malhotra(a)bnymellon.com> wrote:
I did not find detailed documentation so I have created a set of assumptions below on how CF Networking works. Are these correct? Can I get answers to the questions? Thank you!
High Level Architecture/Scaling:
1. APP Containers and CF Routers are created/destroyed dynamically to support application needs.
A container is created for each desired instance of an application and they can be destroyed for various reasons. The most common reasons are because you’ve scaled down the number of desired instances, your application instance has crashed, or the DEA/Cell hosting the application instance is evacuating application instances for maintenance.
Routers are not tied to an application container lifecycle but routes are.
2. A Router can LB to any DEA in the environment (or, are there Availability Zones which prescribe sets of Routers and DEAs?)
Today routers will forward requests to any application instance that advertises a route which matches a request. Routers do not care about DEAs or anything else.
3. DEAs cannot talk directly to each other; APP1 to APP2 communication must go through a Router?
There’s nothing that prevents DEAs from communicating with each other but, in general, they do not; there’s no need to. This is not the same as saying application instances can talk to each other...
When it comes to application instances that are hosted by a DEA, in the default configuration, application instances are not able to communicate directly with each other without going through the router. This is due to the presence of application security groups and networking sandboxing. While strongly discouraged, warden can be configured to allow access to the DEA network stack and security groups can be modified to allow direct access to other DEAs if needed.
A container networking project has just been proposed at the Runtime PMC with the goal of enabling container to container communication with appropriate access control and policies.
4. If I deploy my own LB solution -- how do I dynamically update Router addresses in my LB (as Routers are created/destroyed)?
It will likely require engineering effort on your part. Some organizations have had success (LDS presented ‘norouter’ at one of the summits) but routing is a bit of a moving target. The routing team is better positioned to answer this.
Communication from Router to App:
1. Router can use some algorithm (like round-robin) to direct traffic to a DEA.
The current implementation uses simple round robin.
2. Router to DEA traffic: is there an overlay network? or are we just utilizing the native network?
Generally speaking, it’s the native network but, ultimately, it depends on how and where you’ve deployed Cloud Foundry.
3. Router to DEA traffic: is the Router just changing the destination address of the request to the address of the DEA and forwarding the request with the source address intact?
The router is targeting the ip and port advertised in the route. For application instances hosted by a DEA, that will be the DEA’s address and a port associated with a DNAT rule to the application’s endpoint in the container’s network namespace.
4. Router to DEA traffic: let's say the Router dies half way through; can we mirror state to another Router?
I don’t follow the question.
5. If a Router dies – all the DEAs can still be accessed via other Routers; is this right?
In general, yes, but it ultimately depends on how you’ve deployed.
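As a rough illustration of the round robin selection mentioned in this section, the sketch below cycles through the ip and port pairs advertised for one route. The RoundRobinPool class and the addresses are hypothetical; the real router's data structures are not described in this thread.

```python
class RoundRobinPool:
    """Hypothetical pool of (ip, port) endpoints advertised for a single route."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("a pool needs at least one endpoint")
        self._endpoints = list(endpoints)
        self._next = 0  # index of the endpoint to hand out next

    def pick(self):
        """Return the next endpoint, wrapping around at the end of the list."""
        endpoint = self._endpoints[self._next]
        self._next = (self._next + 1) % len(self._endpoints)
        return endpoint


# Two instances share one DEA address but listen on different host ports;
# the pool treats them as independent targets.
pool = RoundRobinPool([
    ("10.0.16.5", 61001),
    ("10.0.16.6", 61003),
    ("10.0.16.5", 61002),
])
picks = [pool.pick() for _ in range(4)]
```

The pool only sees endpoints, which matches the point that the router targets whatever ip and port a route advertises rather than reasoning about DEAs.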
Communication from point of view of App/Container:
1. An APP (container) cannot directly talk to another APP (container) even in the same DEA. This communication must go through a Router. Is this accurate?
Answered previously.
2. The container is in a Network Name Space which is bridged to a Linux Bridge that then joins to physical NIC.
No. In warden it’s a veth pair with a private IP that is only accessible to the host.
3. Containers are isolated from each other because they are in different Name Spaces and because of IPTables rules.
Yes.
3. IPTables rules allow the container to communicate with all Routers.
In general, no. The iptables rules allow access to the load balancer sitting in front of the go routers. But, again, this depends on your application security groups.
4. IPTables rules bar the container from directly talking to anything that is not a Router.
Depends on your application security groups.
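To make the dependence on application security groups concrete, here is a minimal sketch of allow-list evaluation with a default of deny. The rule layout (protocol, destination, ports) is a simplified assumption, the addresses are made up, and the functions are hypothetical; the platform enforces this with iptables rules, not with code like this.

```python
import ipaddress


def port_allowed(ports, port):
    """Check a port against a spec such as "80,443" or "8000-9000"."""
    for part in ports.split(","):
        low, _, high = part.strip().partition("-")
        if int(low) <= port <= int(high or low):
            return True
    return False


def egress_allowed(rules, protocol, dest_ip, dest_port):
    """Return True only if some rule explicitly allows the connection."""
    addr = ipaddress.ip_address(dest_ip)
    for rule in rules:
        if rule["protocol"] != protocol:
            continue
        if addr not in ipaddress.ip_network(rule["destination"]):
            continue
        if port_allowed(rule["ports"], dest_port):
            return True
    return False  # nothing matched: default deny


# Assumed example: only the load balancer in front of the routers is reachable.
rules = [{"protocol": "tcp", "destination": "10.0.0.10/32", "ports": "80,443"}]
to_load_balancer = egress_allowed(rules, "tcp", "10.0.0.10", 443)
to_other_dea = egress_allowed(rules, "tcp", "10.0.16.6", 61001)
```

With this rule set a container can reach the load balancer but not an endpoint on another DEA, so app to app traffic ends up going through the routing tier unless a rule is added to allow the direct path.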
East-West traffic between Containers:
1. E-W traffic must go through a Router.
2. APP1 will seek out a Router (which one?)
3. The Router will direct the request to APP2 on some DEA using some algorithm (say, round-robin).
4. The reverse traffic from APP2 to APP1 would need to be NATTED to the Router address. Also, we need a destination NAT. Not sure how the NAT function would do this work.
These have generally been answered already.
Management:
1. Is there ability to define network policy in WARDEN to shut an APP?
I can’t parse the question.
2. We may want to define policy based on bandwidth usage.
Warden and DEA support bandwidth limits; Diego and Garden do not (yet).
3. Can we configure QoS bits on an application?
No.
Troubleshooting:
1. Is there a promiscuous APP on a container that can sniff all traffic so we can troubleshoot?
2. Use case for above: let's say an APP appears to freeze -- having a packet capture from the DEA node could help diagnose the problem.
If you’re dealing with your own deployment, you can use standard network troubleshooting techniques from the DEA.
Performance:
1. When is a new CF Router instance spun up? Can I set up a rule in BOSH to spin up new router when a certain traffic threshold is exceeded?
Routers are started when you change the number of instances in your bosh manifest and deploy. Bosh does not do auto scaling.
2. Similarly, when are new APP instances spun up?
App instances are started when the number of desired instances changes. Cloud Foundry does not do auto scaling but many vendors have ways of doing this.
3. Is there any performance data available on the CF Router?
There are a number of metrics managed by the go router and the access logs provide response time information.
DEA:
1. can DEA's be multihomed on Public and private networks?
Depends on your deployment but I believe the DNAT rules for app instances will only be configured for one interface.
2. the BOSH agent on each DEA – what are all its functions? Is it collecting health data used by the router in the LB decision?
You should look at the bosh documentation. The agent functions are not specific to DEAs or any other job type in a Cloud Foundry deployment.
Packet walk (please include LB and overlay technologies involved):
1. From App to App within a droplet?
2. From App to App between droplets on the same host?
3. From App to App between droplets on different hosts?
4. From App to App between Availability Zones? (is this allowed?)
5. From web server (outside CF environment) to App.
This seems like an exercise, not a question.
IP Addressing:
1. The containers all take addresses from a NATTED range (say, 10.254.0.0/16). Don’t I also need to NAT my source address? Example, I am coming from an Apache web server to a CF App. The source address of the Apache web server cannot be from the 10.254.0.0/16 range (if it were, we would need to NAT the source).
2. Are the container addresses further subnetted (say, /24 per host?)
IP Multicast: Assuming there is no requirement for IP multicast in this space.
Details: Commands to check which containers are up? What are their addresses? Which DEAs a router knows about? What tcp sessions are active? Where can I find the detailed documentation?
ravi malhotra
I did not find detailed documentation so I have created a set of assumptions below on how CF Networking works. Are these correct? Can I get answers to the questions? Thank you!
High Level Architecture/Scaling:
1. APP Containers and CF Routers are created/destroyed dynamically to support application needs.
2. A Router can LB to any DEA in the environment (or, are there Availability Zones which prescribe sets of Routers and DEAs?)
3. DEAs cannot talk directly to each other; APP1 to APP2 communication must go through a Router?
4. If I deploy my own LB solution -- how do I dynamically update Router addresses in my LB (as Routers are created/destroyed)?
Communication from Router to App:
1. Router can use some algorithm (like round-robin) to direct traffic to a DEA.
2. Router to DEA traffic: is there an overlay network? or are we just utilizing the native network?
3. Router to DEA traffic: is the Router just changing the destination address of the request to the address of the DEA and forwarding the request with the source address intact?
4. Router to DEA traffic: let's say the Router dies half way through; can we mirror state to another Router?
5. If a Router dies – all the DEAs can still be accessed via other Routers; is this right?
Communication from point of view of App/Container:
1. An APP (container) cannot directly talk to another APP (container) even in the same DEA. This communication must go through a Router. Is this accurate?
2. The container is in a Network Name Space which is bridged to a Linux Bridge that then joins to physical NIC.
3. Containers are isolated from each other because they are in different Name Spaces and because of IPTables rules.
3. IPTables rules allow the container to communicate with all Routers.
4. IPTables rules bar the container from directly talking to anything that is not a Router.
East-West traffic between Containers:
1. E-W traffic must go through a Router.
2. APP1 will seek out a Router (which one?)
3. The Router will direct the request to APP2 on some DEA using some algorithm (say, round-robin).
4. The reverse traffic from APP2 to APP1 would need to be NATTED to the Router address. Also, we need a destination NAT. Not sure how the NAT function would do this work.
Management:
1. Is there ability to define network policy in WARDEN to shut an APP?
2. We may want to define policy based on bandwidth usage.
3. Can we configure QoS bits on an application?
Troubleshooting:
1. Is there a promiscuous APP on a container that can sniff all traffic so we can troubleshoot?
2. Use case for above: let's say an APP appears to freeze -- having a packet capture from the DEA node could help diagnose the problem.
Performance:
1. When is a new CF Router instance spun up? Can I set up a rule in BOSH to spin up new router when a certain traffic threshold is exceeded?
2. Similarly, when are new APP instances spun up?
3. Is there any performance data available on the CF Router?
DEA:
1. can DEA's be multihomed on Public and private networks?
2. the BOSH agent on each DEA – what are all its functions? Is it collecting health data used by the router in the LB decision?
Packet walk (please include LB and overlay technologies involved):
1. From App to App within a droplet?
2. From App to App between droplets on the same host?
3. From App to App between droplets on different hosts?
4. From App to App between Availability Zones? (is this allowed?)
5. From web server (outside CF environment) to App.
IP Addressing:
1. The containers all take addresses from a NATTED range (say, 10.254.0.0/16). Don’t I also need to NAT my source address? Example, I am coming from an Apache web server to a CF App. The source address of the Apache web server cannot be from the 10.254.0.0/16 range (if it were, we would need to NAT the source).
2. Are the container addresses further subnetted (say, /24 per host?)
IP Multicast: Assuming there is no requirement for IP multicast in this space.
Details: Commands to check which containers are up? What are their addresses? Which DEAs a router knows about? What tcp sessions are active? Where can I find the detailed documentation?