
Beyond Application Security Groups: Avoid App Restarts with Dynamic Egress Policies in PCF 2.4.

Do you wish you could apply a simple egress network policy on an app in Cloud Foundry, without requiring an app restart? Want greater control over how the apps in a given Space access an external service? Now you can, with a new beta feature in Pivotal Cloud Foundry 2.4: dynamic egress policies.

Before we dive into the new feature, let’s take a step back and define what the heck an egress policy is, and how it’s managed in Pivotal Cloud Foundry (PCF). Note that we are only focusing on egress traffic in this blog since ingress traffic is typically controlled outside the platform.

Egress Policies in Pivotal Cloud Foundry

Egress policies control how traffic flows from your apps to off-platform services. Platform operators manage these policies using Application Security Groups (ASGs). An ASG is a collection of rules that specify the protocols, ports, and IP address ranges to which application or task instances may send traffic. ASGs are a tried-and-true model for controlling outbound access from applications, and they work well for many scenarios.
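To ground the comparison, here is a minimal sketch of the ASG workflow. The rules file, security group, org, space, and app names are placeholders, 3306 is used only as an illustrative MySQL port, and exact cf CLI flags vary by version. A rules file such as mysql-asg.json might contain:

[
  {
    "protocol": "tcp",
    "destination": "23.96.35.235/32",
    "ports": "3306",
    "description": "Allow access to the external MySQL DB"
  }
]

The operator then creates and binds the ASG, and apps must be restarted before the rules take effect:

cf create-security-group mysql-asg mysql-asg.json
cf bind-security-group mysql-asg my-org my-space
cf restart my-app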

However, operators have bumped up against the limits of ASGs in recent times. In particular, PCF customers are looking for ways around these limitations of ASGs:

  • Permissions are too coarse. You have to create a policy at the space level even if only a few apps in the space require access to the external service.

  • App restarts are required. You have to restart your apps when applying these policies, which causes downtime.

This brings us to the news of the day!

Say Hello to Dynamic Egress Policies, Now a Beta Feature

You now have a better way to set and manage egress policies with Dynamic Egress Policy Configuration, a beta capability in PCF 2.4. (This feature was released as part of open source Cloud Foundry v2.19.0.) Note that ASGs will continue to be supported in PCF for the near term while we enhance dynamic egress policies. In the meantime, policies created by both mechanisms apply to your environment.

Before we dive too deep into how dynamic egress policies work, there are a couple of prerequisites to using this feature:

  1. This feature is disabled by default. To enable this feature, you must select Enable Beta Dynamic Egress Enforcement in the PAS Networking pane. Additionally, you must have Silk selected as your Container Network Interface Plugin.

  2. To administer dynamic egress policies, you must have the network.admin UAA scope. If you are a CF admin, you already have the network.admin scope. An admin can also grant the network.admin scope to a space developer.
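For illustration only, an admin might grant that scope to a space developer with the UAA CLI. The target, client secret, and username below are placeholders, and your UAA setup may differ:

uaac target uaa.YOUR-SYSTEM-DOMAIN
uaac token client get admin -s YOUR-ADMIN-CLIENT-SECRET
uaac member add network.admin some-space-developer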

With these requirements met, you can proceed to create dynamic egress policies to allow your apps to communicate with external services, such as a MySQL database. The workflow is as follows:

  1. Create a destination object with details about the external service that your application or space needs to access.

cf curl -X POST /networking/v1/external/destinations -d '{
  "destinations": [
    {
      "name": "MySQL",
      "description": "Demo",
      "ips": [{"start": "23.96.35.235", "end": "23.96.35.235"}],
      "ports": [{"start": 80, "end": 80}],
      "protocol": "tcp"
    }
  ]
}'

  2. Fetch the ID of the new destination object.

cf curl /networking/v1/external/destinations | jq .
{
  "total_destinations": 1,
  "destinations": [
    {
      "id": "e8a85db3-5189-48d8-566b-086c886d819e",
      "name": "MySQL",
      "description": "Demo",
      "protocol": "tcp",
      "ports": [
        {
          "start": 80,
          "end": 80
        }
      ],
      "ips": [
        {
          "start": "23.96.35.235",
          "end": "23.96.35.235"
        }
      ]
    }
  ]
}

  3. Fetch the GUID of the app or space that needs access to the external service.

cf app backend --guid

887757be-5eda-48ce-b427-79dcc0705e91

  4. Create an egress policy from the application or space to the destination object.

cf curl -X POST /networking/v1/external/egress_policies -d '{
  "egress_policies": [
    {
      "source": {
        "type": "app",
        "id": "887757be-5eda-48ce-b427-79dcc0705e91"
      },
      "destination": {
        "id": "e8a85db3-5189-48d8-566b-086c886d819e"
      }
    }
  ]
}'

And voila! Your app now has access to the external service with no app restart required.
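If you want to confirm the policy is working, one quick check (a sketch, assuming SSH access is enabled for the app and curl is available in the container image) is to open a connection from inside the running container:

cf ssh backend -c "curl -v http://23.96.35.235:80"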

Also note that you can now fine-tune permissions to suit your requirements. Access at the space level is no longer granted by default, but you can still specify the source type as "space" if you so choose. Just use the space GUID in the source object to allow all apps in a space to access the external service, as in the sketch below. Check out our GitHub page for API instructions.
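For example, a space-scoped policy might look like the following sketch. The space name is a placeholder, and the destination ID is the one returned earlier:

cf space my-space --guid

cf curl -X POST /networking/v1/external/egress_policies -d '{
  "egress_policies": [
    {
      "source": {
        "type": "space",
        "id": "SPACE-GUID-FROM-ABOVE"
      },
      "destination": {
        "id": "e8a85db3-5189-48d8-566b-086c886d819e"
      }
    }
  ]
}'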

ASGs and Dynamic Egress Policies: A Closer Look

Let us look at some scenarios and compare the workflows between ASGs and the new dynamic egress policy configuration. You’ll notice the new dynamic policies eliminate the restart step.

Scenario 1: Space A has 2 apps – Frontend and Backend. The Backend app must access an external MySQL DB at 23.96.35.235 to retrieve data and send it to the Frontend app.

ASG Workflow

  1. App developer pushes the Backend app.

  2. App developer requests that the security team allow access from the Backend app to the external MySQL DB at 23.96.35.235.

  3. Security team reviews the request and approves it.

  4. Platform Operator creates an ASG for the MySQL DB and binds it to Space A.

  5. App developer restarts the apps in Space A.

Dynamic Egress Policy Workflow

  1. App developer pushes the Backend app.

  2. App developer requests that the security team allow access from the Backend app to the external MySQL DB at 23.96.35.235.

  3. Security team reviews the request and approves it.

  4. Platform Operator creates a destination object for the MySQL DB and creates an egress policy from the Backend app to the destination object. No restart is required.
A conceptual look at Scenario 1.

Now let’s make things a little more interesting with another common use case.

Scenario 2: Space A has 2 apps – Frontend and Backend. The Backend app has an egress policy to access an external MySQL DB at 23.96.35.235 to fetch data. The DB is moved to a different data center, and its new IP address is 216.58.195.78.

ASG Workflow

  1. App developer requests that the security team allow access from the Backend app to the new IP address 216.58.195.78 for the MySQL DB.

  2. Security team reviews the request and approves it.

  3. Platform Operator creates a new ASG for the new IP address and binds it to Space A.

  4. Platform Operator unbinds the old ASG with the old IP address from Space A and deletes that ASG.

  5. App developer restarts the apps in Space A.

Dynamic Egress Policy Workflow

  1. App developer requests that the security team allow access from the Backend app to the new IP address 216.58.195.78 for the MySQL DB.

  2. Security team reviews the request and approves it.

  3. Platform Operator updates the existing destination object that the Backend app's egress policy points to with the new IP address of the MySQL DB (see the sketch below). No restart is required.
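As a rough sketch of that update step, assuming the policy server exposes an update endpoint for destinations (check the API documentation on GitHub for the exact shape before relying on this), the operator could swap in the new IP address without touching the egress policy itself:

cf curl -X PUT /networking/v1/external/destinations -d '{
  "destinations": [
    {
      "id": "e8a85db3-5189-48d8-566b-086c886d819e",
      "name": "MySQL",
      "description": "Demo",
      "ips": [{"start": "216.58.195.78", "end": "216.58.195.78"}],
      "ports": [{"start": 80, "end": 80}],
      "protocol": "tcp"
    }
  ]
}'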

You might be thinking, "Why didn't you just improve ASGs?"

It's a fair question. The new dynamic egress feature uses the same policy server that manages container-to-container networking policies. So why build on the policy server instead of updating the ASG implementation?

There are two reasons:

  1. The policy server was originally created with a vision to encompass more than just container-to-container communication.

  2. Having two sources of truth for policies is confusing. The new implementation alleviates this risk.

What’s Next with Dynamic Egress Policies

We’re excited to launch dynamic egress policies. Now you have a solution to the application restart issue with ASGs, as well as a way to control egress policies at both the space and application levels. However, we’re still early in the development of this feature.

We have a few roadmap ideas in mind, and would love to hear your feedback:

  • Eventually deprecate ASGs.

  • Explore FQDN-based egress policy enforcement.

  • Automate egress policy configuration during service binding.

  • Allow policy enforcement via the application manifest.

We’re also following the work being done by the Kubernetes community, in particular the Network Policy model based on labels and selectors.

Want to tell us what we should do next? You can do so via the comments here, or with a GitHub issue or feature request. We also invite you to join us on the Cloud Foundry #container-networking Slack channel!