Why Secure Kubernetes at the Network Fabric
Traditional firewalls match on source and destination IP addresses. Kubernetes breaks this model. Pods get ephemeral IPs that change on every restart, scale event, or rolling deployment — making IP-based rules meaningless within minutes. Securing Kubernetes traffic requires a shift from address-based policy to identity-based policy.

The Problem with IP-Based Security in Kubernetes
A pod running at 100.64.3.47 right now may be gone in 30 seconds. A completely different workload may inherit that same IP moments later. Firewall rules like “allow 100.64.3.47 to 10.0.1.5 on port 443” are stale before the next deployment.
- Ephemeral IPs — Pods receive random IPs from a CIDR pool. Every restart, scale-up, or rolling update shuffles the assignments. Static rules cannot keep pace.
- Overlapping CIDRs at scale — When multiple clusters use the same secondary CIDR (e.g., 100.64.0.0/16), the same IP address maps to entirely different workloads in different clusters. CIDR-based rules are ambiguous.
- IPs do not represent identity — Kubernetes workloads are defined by namespace, labels, and service accounts. These are the primitives that express intent — not IP addresses.
IP overlap is a real operational problem, but it is a symptom. The deeper issue is that IP addresses do not represent workload identity in Kubernetes. Solving overlap without solving identity leaves security gaps.
External Firewalling: Don’t Touch the Cluster
Aviatrix enforces Kubernetes security at the network fabric — outside the cluster — using the spoke gateway as the policy enforcement point. This keeps the cluster clean:
- Use the cloud-native CNI — The VPC CNI is battle-tested, AWS-supported, and deeply integrated with EC2 networking. Aviatrix works with it, not against it. No CNI replacement required.
- No DaemonSets — In-cluster network policy engines (Calico, Cilium) require DaemonSets on every node — another component to patch, monitor, and troubleshoot inside the blast radius of the cluster itself.
- No sidecars — Service mesh approaches (Istio, Linkerd) inject proxies into every pod, adding latency, memory overhead, and operational surface. For network-level segmentation, this is unnecessary.
- Separation of concerns — The networking team manages firewall policy in the Aviatrix fabric. Application teams manage workloads in the cluster. Neither modifies the other’s domain. A platform engineer can enforce “namespace X can only talk to namespace Y” without ever running kubectl.
- Consistent across clusters and clouds — The same DCF policies work whether the cluster is EKS, AKS, or GKE. Policy follows the architecture, not the Kubernetes distribution.
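As a concrete sketch, a namespace-to-namespace rule like that can be expressed with the Aviatrix Terraform provider’s aviatrix_distributed_firewalling_policy_list resource. The SmartGroup UUID variables and names below are hypothetical:

```hcl
# Hypothetical sketch: allow namespace X to reach namespace Y on HTTPS.
# SmartGroup UUIDs would come from aviatrix_smart_group resources (not shown).
resource "aviatrix_distributed_firewalling_policy_list" "k8s_segmentation" {
  policies {
    name             = "allow-ns-x-to-ns-y"
    action           = "PERMIT"
    priority         = 100
    protocol         = "TCP"
    logging          = true
    src_smart_groups = [var.namespace_x_smart_group_uuid] # assumed variable
    dst_smart_groups = [var.namespace_y_smart_group_uuid] # assumed variable
    port_ranges {
      lo = 443
    }
  }
}
```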
Platform Engineering Models
This architecture enables two common platform delivery patterns:
- Namespace as a service — Multiple teams share a single EKS cluster, isolated by namespace. DCF uses Kubernetes-type SmartGroups to enforce per-namespace egress policy at the spoke gateway. Teams get self-service namespaces; the platform team controls what leaves the cluster.
- Cluster as a service — Each tenant gets a dedicated EKS cluster in its own VPC with its own (potentially overlapping) pod CIDR. DCF uses VPC-type SmartGroups for cluster-level segmentation. Tenants get full cluster isolation; the platform team manages inter-cluster and egress policy centrally.
- Multi-cluster, multi-cloud — Both models compose. An organization can run namespace-as-a-service in EKS and cluster-as-a-service in AKS, governed by a single set of DCF policies.
Overview
One of the practical challenges with EKS at scale is IP exhaustion: the VPC CNI plugin assigns one VPC IP address per pod, rapidly consuming routable RFC 1918 address space. A cluster running 500 pods consumes 500 IPs from your enterprise IP plan — and that is a single cluster. AWS documents this problem and a native solution in Addressing IPv4 address exhaustion in Amazon EKS clusters using private NAT gateways. The AWS approach uses a private NAT gateway as the SNAT translation point between non-routable pod subnets and routable VPC subnets.

When Aviatrix is the network fabric, the Aviatrix spoke gateway replaces the private NAT gateway — it becomes the SNAT translation point. This is simpler (no additional NAT gateway infrastructure), integrates with DCF for policy enforcement, and works consistently across multi-cloud environments. This architecture supports both the namespace-as-a-service and cluster-as-a-service platform engineering models described above.

SmartGroups are enforced on cluster egress traffic. Kubernetes network policies should be used outside of Aviatrix to restrict traffic between namespaces within the cluster.
Routable vs. Non-Routable Address Space
The EKS VPC uses two CIDR blocks with distinct roles:

| Layer | CIDR Source | Routable? | Purpose |
|---|---|---|---|
| Primary CIDR | Enterprise IP plan (e.g., 10.10.0.0/23) | Yes — unique, advertised across the fabric | Cluster ingress, EKS control plane ENIs, node management, Aviatrix gateway |
| Secondary CIDR | RFC 6598 space (e.g., 100.64.0.0/16) | No — never advertised, can overlap across VPCs | Pod networking via VPC CNI custom networking |
Because only infrastructure components live on routable space, a /23 primary CIDR (512 IPs) is often sufficient. The heavy lifting moves to the secondary CIDR, which provides tens of thousands of pod IPs without consuming enterprise address space.
What Lives on the Routable (Primary) CIDR
- Cluster ingress — Load balancer ENIs (ALB, NLB, or any ingress controller) sit in routable subnets so they are reachable from other VPCs, on-prem, or the internet.
- EKS control plane ENIs — The managed cross-account ENIs that EKS places in your VPC for API server connectivity. These must be in routable subnets so nodes and pods can reach the Kubernetes API.
- EKS nodes — Node EC2 instances launch in routable private subnets. Their primary ENI gets a routable IP; secondary ENIs (for pods) are placed in non-routable subnets via ENIConfig.
- Aviatrix spoke gateway — The gateway ENIs need routable IPs since they are the SNAT translation point.
What Lives on the Non-Routable (Secondary) CIDR
- Pod ENIs — Every pod gets an IP from the 100.64.0.0/16 secondary CIDR via VPC CNI custom networking. These IPs are never exposed directly outside the VPC — all egress is SNATed by the Aviatrix spoke gateway.
Traffic Flow
Step 1: AWS VPC Configuration
1.1 Add a Secondary CIDR Block
Attach a secondary CIDR from RFC 6598 space (100.64.0.0/10) to the EKS VPC. A /16 provides 65,536 pod IPs — sufficient for most clusters.
1.2 Create Pod Subnets from the Secondary CIDR
Create one pod subnet per Availability Zone from the secondary CIDR:
- AZ-a: 100.64.0.0/17 (32,768 IPs)
- AZ-b: 100.64.128.0/17 (32,768 IPs)

In Terraform, the secondary CIDR is attached with the aws_vpc_ipv4_cidr_block_association resource.
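A Terraform sketch of the CIDR attachment and per-AZ pod subnets; the VPC reference, resource names, and AZ values are assumptions:

```hcl
# Attach the RFC 6598 secondary CIDR to the existing EKS VPC.
resource "aws_vpc_ipv4_cidr_block_association" "pods" {
  vpc_id     = aws_vpc.eks.id # assumed VPC resource name
  cidr_block = "100.64.0.0/16"
}

# One non-routable pod subnet per AZ, carved from the secondary CIDR.
resource "aws_subnet" "pods_a" {
  vpc_id            = aws_vpc.eks.id
  cidr_block        = "100.64.0.0/17"
  availability_zone = "us-east-1a" # example AZ
  depends_on        = [aws_vpc_ipv4_cidr_block_association.pods]
}

resource "aws_subnet" "pods_b" {
  vpc_id            = aws_vpc.eks.id
  cidr_block        = "100.64.128.0/17"
  availability_zone = "us-east-1b"
  depends_on        = [aws_vpc_ipv4_cidr_block_association.pods]
}
```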
1.3 Subnet Layout
| Subnet | CIDR Source | Routable? | Purpose |
|---|---|---|---|
| Aviatrix gateway (x2 AZs) | Primary | Yes | Spoke gateway ENIs |
| Load balancer / ingress (x2 AZs) | Primary | Yes | ALB, NLB, or any ingress controller ENIs |
| Infrastructure / nodes (x2 AZs) | Primary | Yes | EKS nodes, control plane ENIs |
| Pod networking (x2 AZs) | Secondary | No | Pod ENIs via VPC CNI custom networking |
Tag the load balancer subnets so ingress controllers can discover them:
- Public-facing: kubernetes.io/role/elb = 1
- Internal-facing: kubernetes.io/role/internal-elb = 1
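Sketched in Terraform, the discovery tags on the load balancer subnets (VPC reference, CIDR slices, and names are assumptions):

```hcl
resource "aws_subnet" "ingress_public_a" {
  vpc_id                  = aws_vpc.eks.id # assumed VPC reference
  cidr_block              = "10.10.0.0/27" # example slice of the primary CIDR
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
  tags = {
    "kubernetes.io/role/elb" = "1" # public-facing load balancer discovery
  }
}

resource "aws_subnet" "ingress_internal_a" {
  vpc_id            = aws_vpc.eks.id
  cidr_block        = "10.10.0.32/27"
  availability_zone = "us-east-1a"
  tags = {
    "kubernetes.io/role/internal-elb" = "1" # internal load balancer discovery
  }
}
```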
Step 2: EKS VPC CNI Configuration
2.1 Enable Custom Networking on the vpc-cni Addon
Three environment variables control the behavior:

| Variable | Value | Purpose |
|---|---|---|
| AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG | "true" | Pods use ENIConfig-specified subnets instead of the node’s subnet |
| ENI_CONFIG_LABEL_DEF | "topology.kubernetes.io/zone" | CNI selects the ENIConfig matching the node’s AZ label |
| AWS_VPC_K8S_CNI_EXTERNALSNAT | "true" | Disables CNI-level SNAT — delegates NAT to Aviatrix |
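One way to set these on the managed addon is Terraform’s aws_eks_addon resource with configuration_values; the cluster resource name is an assumption:

```hcl
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.this.name # assumed cluster resource
  addon_name   = "vpc-cni"

  # Managed-addon equivalent of setting the three env vars on aws-node.
  configuration_values = jsonencode({
    env = {
      AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
      ENI_CONFIG_LABEL_DEF               = "topology.kubernetes.io/zone"
      AWS_VPC_K8S_CNI_EXTERNALSNAT       = "true"
    }
  })
}
```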
2.2 Create ENIConfig Custom Resources
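As a sketch, an ENIConfig for one AZ via the Terraform kubernetes provider; the subnet and security group references are assumptions. The object name must equal the node’s topology.kubernetes.io/zone label value:

```hcl
resource "kubernetes_manifest" "eniconfig_az_a" {
  manifest = {
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = "us-east-1a" # must match the AZ label value on the node
    }
    spec = {
      subnet         = aws_subnet.pods_a.id         # assumed pod subnet
      securityGroups = [aws_security_group.pods.id] # assumed pod SG
    }
  }
}
```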
Create one ENIConfig per Availability Zone, before node groups launch, so the first nodes already place pod ENIs in the secondary-CIDR subnets.

2.3 Pod Security Group
Create a dedicated security group for pod ENIs with the following rules:

Ingress:
- All traffic from node security group (node-to-pod communication)
- All traffic from self (pod-to-pod within cluster)
- TCP 9443 from EKS cluster primary security group (control plane webhook callbacks)

Egress:
- All traffic to 0.0.0.0/0
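A Terraform sketch of that security group; the VPC, node SG, and cluster resource references are assumptions:

```hcl
resource "aws_security_group" "pods" {
  name   = "eks-pod-eni"
  vpc_id = aws_vpc.eks.id # assumed VPC reference

  ingress { # node-to-pod communication
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    security_groups = [aws_security_group.nodes.id] # assumed node SG
  }
  ingress { # pod-to-pod within the cluster
    from_port = 0
    to_port   = 0
    protocol  = "-1"
    self      = true
  }
  ingress { # control plane webhook callbacks
    from_port       = 9443
    to_port         = 9443
    protocol        = "tcp"
    security_groups = [aws_eks_cluster.this.vpc_config[0].cluster_security_group_id]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```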
2.4 Cluster Management Access from Pods
Pods on the non-routable secondary CIDR (100.64.x.x) need to reach the EKS API server for normal Kubernetes operations — service account token refresh, watch streams, operator controllers, etc. This works because the control plane ENIs and pod ENIs are in the same VPC, even though they are on different CIDRs.
The key security group rule is:
100.64.x.x can reach control plane ENIs on 10.x.x.x within the VPC.
The reciprocal rule (in section 2.3 above) allows the control plane to call back to pod webhooks on port 9443. This is required for admission controllers like the AWS Load Balancer Controller or cert-manager.
No SNAT is involved in this path — traffic stays entirely within the VPC and does not traverse the Aviatrix gateway.
2.5 Cluster Ingress on the Routable CIDR
Ingress load balancers must be placed in routable subnets (from the primary CIDR) so they are reachable from outside the VPC. The pods behind them are on the non-routable secondary CIDR — the load balancer bridges the gap. The target type must be ip regardless of which ingress controller you use. With CNI-level SNAT disabled, the instance target type (which routes through kube-proxy on the node) does not work correctly. The ip target type registers pod IPs (100.64.x.x) directly with the load balancer, and the load balancer reaches them within the VPC.
- ALB (AWS Load Balancer Controller)
- NLB Service
AWS Load Balancer Controller caveat: When pods run on the secondary CIDR, they cannot reach EC2 instance metadata (IMDS) to auto-detect the VPC ID and region. The controller must be configured with explicit values. This caveat is specific to the AWS Load Balancer Controller; other ingress controllers may not have this dependency.
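With Terraform’s helm_release resource, the explicit values might look like this; the cluster reference and region value are assumptions:

```hcl
resource "helm_release" "aws_lb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"

  set {
    name  = "clusterName"
    value = aws_eks_cluster.this.name # assumed cluster resource
  }
  set { # explicit VPC ID: pods on 100.64.x.x cannot query IMDS for it
    name  = "vpcId"
    value = aws_vpc.eks.id
  }
  set { # explicit region for the same reason
    name  = "region"
    value = "us-east-1" # example
  }
}
```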
Step 3: Aviatrix Spoke Gateway — Custom SNAT
This is where Aviatrix replaces the AWS private NAT gateway. Instead of deploying a NAT gateway in the VPC and configuring route tables to point non-routable subnets at it, the Aviatrix spoke gateway performs the same SNAT function — and integrates with DCF for policy enforcement.

3.1 Do NOT Enable single_ip_snat
The spoke gateway must not use single_ip_snat = true. That mode applies blanket NAT and cannot be combined with policy-based rules. Leave it at the default (false).
3.2 Configure Custom SNAT Policies
Create an aviatrix_gateway_snat resource with three policies:
Policy 1 — Pod traffic toward the transit (east-west to other VPCs, on-prem, etc.)

Policies 2 and 3 — Node internet egress: because single_ip_snat is disabled, nodes also lose default internet NAT. Add a policy per infrastructure subnet:
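A hedged sketch of the custom SNAT resource; the gateway reference, connection name, CIDRs, and IPs are assumptions, with attribute names following the Aviatrix Terraform provider:

```hcl
resource "aviatrix_gateway_snat" "eks_spoke" {
  gw_name   = aviatrix_spoke_gateway.eks.gw_name # assumed spoke resource
  snat_mode = "customized_snat"

  # Policy 1: pod traffic toward the transit.
  snat_policy {
    src_cidr   = "100.64.0.0/16"
    connection = "transit-us-east-1" # assumed transit attachment name
    protocol   = "all"
    snat_ips   = "10.10.0.10" # spoke gateway primary private IP (example)
  }

  # Policies 2 and 3: one per infrastructure subnet for node egress.
  snat_policy {
    src_cidr   = "10.10.0.64/26" # example infra subnet, AZ-a
    connection = "transit-us-east-1"
    protocol   = "all"
    snat_ips   = "10.10.0.10"
  }
  snat_policy {
    src_cidr   = "10.10.1.64/26" # example infra subnet, AZ-b
    connection = "transit-us-east-1"
    protocol   = "all"
    snat_ips   = "10.10.0.10"
  }
}
```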
3.3 Field Notes
- connection and interface are mutually exclusive — when specifying connection, do not set interface, and vice versa.
- snat_ips is the spoke gateway’s primary private IP — the routable VPC address.
- Custom SNAT policies must be configured on each gateway individually. If the spoke has an HA gateway, create a separate aviatrix_gateway_snat resource targeting the HA gateway with its own private IP as the snat_ips value.
Step 4: Aviatrix Transit Gateway — Route Exclusion
The transit must not advertise the non-routable pod CIDR to other spokes or on-prem: exclude 100.64.0.0/16 from the transit routing domain. If multiple clusters use the same secondary CIDR, advertising it creates a routing conflict. Even with a single cluster, advertising a non-routable CIDR is unnecessary — all pod traffic is SNATed to a routable address before entering the fabric.
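One way to express the exclusion, assuming the Aviatrix provider’s excluded_advertised_spoke_routes attribute on the transit gateway (resource name is an assumption):

```hcl
resource "aviatrix_transit_gateway" "main" {
  # ... existing transit gateway arguments elided ...

  # Never advertise the non-routable pod CIDR into the fabric.
  excluded_advertised_spoke_routes = "100.64.0.0/16"
}
```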
Step 5: Aviatrix DCF and SmartGroup Considerations
How DCF Handles SNAT
DCF evaluates traffic at the spoke gateway — the entry point into the Aviatrix fabric. Transit gateways are forwarders and do not perform policy evaluation. The DCF engine is NAT-aware and automatically accounts for SNAT translations when programming policies. You do not need to manually adjust rules for pre- vs. post-SNAT addresses.

Use Dynamic-Type SmartGroups
Because the pod CIDR is non-routable and potentially overlapping across VPCs, CIDR-based SmartGroups matching 100.64.0.0/16 are ambiguous — DCF cannot distinguish which cluster a 100.64.x.x address belongs to. Use dynamic-type SmartGroups that resolve to identity:
| Dynamic Type | Use Case |
|---|---|
| VPC | Cluster-level / VPC-level segmentation |
| Kubernetes | Workload-level segmentation by namespace, label, or service account |
- VPC-type SmartGroup
- Kubernetes-type SmartGroup
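Hedged sketches of the two SmartGroup flavors; the match-expression attribute names and type values are assumptions based on the Aviatrix Terraform provider:

```hcl
# VPC-type SmartGroup: matches everything in the cluster's VPC.
resource "aviatrix_smart_group" "eks_cluster_a" {
  name = "eks-cluster-a"
  selector {
    match_expressions {
      type = "vpc"               # assumed type value
      name = "eks-cluster-a-vpc" # assumed VPC name
    }
  }
}

# Kubernetes-type SmartGroup: matches pods in one namespace.
resource "aviatrix_smart_group" "payments_ns" {
  name = "eks-payments-namespace"
  selector {
    match_expressions {
      type          = "k8s"      # assumed type value
      k8s_namespace = "payments" # assumed attribute name
    }
  }
}
```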
Prerequisites for Kubernetes-Type SmartGroups
EKS clusters must be onboarded to the Aviatrix Controller via aviatrix_kubernetes_cluster. The Controller needs read access to the cluster (EKS access entries with view permissions) to discover pod, namespace, and label metadata.