
Kubernetes: Deployments, Services, Gateway API and How to Deploy Your First App

Updated March 2026: This article has been revised to reflect the changes in Kubernetes v1.35. Ingress has been replaced with Gateway API as the recommended standard for managing external traffic.

Prerequisites

Before we get started, make sure you have everything from the first chapter ready:

  • Docker (or OrbStack on macOS) running.
  • kubectl installed and configured.
  • Kind installed with an active cluster.

If you don’t have that yet, check out the previous post where I walk you through setting up your local cluster step by step.


Introduction

In the previous post we set up our first local cluster with Kind and understood what Kubernetes is and why it exists. Now comes the fun part: putting stuff inside the cluster.

In this chapter we’re going to learn about the fundamental Kubernetes resources, deploy our first app, and learn how to expose it to the outside world. Think of it as going from knowing what a kitchen is to cooking your first meal.


The Fundamental Kubernetes Resources

Before touching the keyboard, you need to meet the main characters. Kubernetes works with resources (objects) that you define and it takes care of maintaining. These are the most important ones to get started:

Figure: Fundamental Kubernetes resources. Traffic flow: Gateway → HTTPRoute → Service → Deployment → Pods (with HPA auto-scaling)

Pod

The Pod is the smallest unit in Kubernetes. It’s not a container — it’s a wrapper that can hold one or more containers that share networking and storage.

Figure: Anatomy of a Pod. A Pod can have one or more containers that share an IP and volumes

Think of it this way: if a Docker container is a person, a Pod is an apartment where one or more people live sharing the same address (IP) and utilities (storage).

Key characteristics:

  • Each Pod has its own IP within the cluster.
  • Containers inside the same Pod communicate via localhost.
  • If a Pod dies, it doesn’t come back on its own — you need something to manage it (like a Deployment).
  • A Pod is ephemeral: it’s born, does its job, and can die at any moment.
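
To make the "shared address and utilities" idea concrete, here is a minimal sketch of a Pod with two containers that share a volume. The Pod name, the second container, and the paths are assumptions made up for illustration; they are not part of the app we deploy later.

```yaml
# Hypothetical example: all names and images below are illustrative
apiVersion: v1
kind: Pod
metadata:
  name: shared-pod-demo
spec:
  volumes:
    - name: shared-data          # one volume, mounted by both containers
      emptyDir: {}
  containers:
    - name: web
      image: nginx:latest
      ports:
        - containerPort: 80
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: writer
      image: busybox:1.36
      # Rewrites index.html in the shared volume every 5 seconds
      command: ["sh", "-c", "while true; do date > /data/index.html; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
```

Both containers also share the Pod's network, so the writer container could reach NGINX at localhost:80 without knowing the Pod IP.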

Deployment

The Deployment is the resource you’ll actually use 90% of the time. It’s the boss of the Pods: you tell it how many replicas you want and it takes care of creating them, keeping them alive, and updating them.

What does a Deployment do?

  • Creates and manages Pods automatically through a ReplicaSet.
  • Rolling updates: updates your Pods with zero downtime, replacing them gradually.
  • Rollback: if something goes wrong, you can roll back to the previous version with a single command.
  • Scaling: increase or decrease the number of replicas as needed.
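
Rolling updates are controlled by the strategy block of a Deployment. The fragment below is a sketch with example values, not a recommendation; if you leave the block out, Kubernetes uses RollingUpdate with 25% for both settings.

```yaml
# Fragment of a Deployment spec, not a complete manifest
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # at most 1 extra Pod above the desired count during an update
      maxUnavailable: 0   # never go below the desired number of available Pods
```

With these values Kubernetes starts one new Pod, waits for it to become ready, and only then removes an old one.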

Service (svc)

Pods have IPs that change all the time (remember, they’re ephemeral). How do you connect to something that keeps changing its address? That’s where the Service comes in.

A Service is a stable access point that directs traffic to a group of Pods. Think of it as the front desk of a hotel: the guests (Pods) change rooms, but the front desk (Service) is always in the same place.

Main types:

  • ClusterIP (default): accessible only within the cluster.
  • NodePort: exposes a port on every node in the cluster.
  • LoadBalancer: creates an external load balancer (on cloud providers).

Gateway API (the Ingress replacement)

If you’ve seen older tutorials, they probably mention Ingress as the way to expose apps externally. Ingress still works, but it’s frozen — it won’t receive new features. Also, the popular ingress-nginx controller is being retired (support until March 2026).

The modern alternative is Gateway API, the new official standard for managing HTTP/HTTPS traffic in Kubernetes. Think of it as Ingress 2.0 but much better designed.

Why is Gateway API better?

  • Separation of roles: the infra team manages the Gateway, devs configure their HTTPRoutes. Each team only touches what belongs to them.
  • More expressive: supports traffic splitting, header matching, redirects, and more natively (no hacky annotations).
  • Multi-protocol: not just HTTP. Natively supports TCP, UDP, and gRPC.
  • Portable: works the same with NGINX Gateway Fabric, Envoy Gateway, Traefik, Istio, etc.

The main Gateway API resources are:

  • GatewayClass: defines which implementation to use (like the “driver” — NGINX, Envoy, etc.).
  • Gateway: the entry point that listens on specific ports/hostnames.
  • HTTPRoute: the routing rules that connect the Gateway to your Services.
Note
If you’re migrating from Ingress, there’s a tool called ingress2gateway that converts your Ingress manifests to Gateway API automatically.
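
To give a feel for the "more expressive" point, here is a sketch of traffic splitting in an HTTPRoute, something that needed controller-specific annotations with Ingress. The Gateway and Service names are hypothetical.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-route            # hypothetical name
spec:
  parentRefs:
    - name: example-gateway     # hypothetical Gateway
  rules:
    - backendRefs:
        - name: app-v1          # hypothetical Service for the stable version
          port: 80
          weight: 90            # roughly 90% of requests
        - name: app-v2          # hypothetical Service for the new version
          port: 80
          weight: 10            # roughly 10% of requests
```

The weights are relative, so 9 and 1 would split traffic the same way.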

HPA (Horizontal Pod Autoscaler)

The HPA is the resource that automatically scales your Pods based on metrics like CPU or memory. If your app is getting a lot of traffic and the Pods are drowning, the HPA creates more replicas automatically. When traffic drops, it scales them down.

Spoiler
The HPA deserves its own full chapter, so here we’ll only cover the basics. In the next post we’ll explore it in depth with practical examples of autoscaling based on CPU, memory, and custom metrics.

Deploying Your First Pod with a Command

Let’s get practical. Make sure your Kind cluster is running (if you followed the previous post, you already have it). First up: launch a Pod with a single command.

kubectl run mi-nginx --image=nginx:latest --port=80

This creates a Pod called mi-nginx using the NGINX image, with port 80 declared as its container port. That simple.

Verify it’s running:

kubectl get pods
NAME       READY   STATUS    RESTARTS   AGE
mi-nginx   1/1     Running   0          30s

To see more details:

kubectl describe pod mi-nginx

And when you no longer need it:

kubectl delete pod mi-nginx
Important
A Pod created this way with kubectl run is a “loose” Pod. If it dies, nobody brings it back. For production you’ll always use Deployments.

Deploying with a Deployment (command)

Now let’s move on to the Deployment, which is what you’ll actually use. We create a Deployment with a command:

kubectl create deployment mi-app --image=nginx:latest --replicas=2

This creates:

  • A Deployment called mi-app.
  • A ReplicaSet (managed automatically).
  • 2 Pods running the NGINX image.

Verify:

kubectl get deployments
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
mi-app   2/2     2            2           15s

kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
mi-app-5d9b7f6b4-abc12   1/1     Running   0          20s
mi-app-5d9b7f6b4-def34   1/1     Running   0          20s

Notice how the Pods have automatically generated names. The Deployment takes care of everything.


What Is a YAML Manifest?

So far we’ve used imperative commands (kubectl run, kubectl create). That’s fine for testing, but in the real world we use the declarative approach: writing a YAML file that describes exactly how you want your resource to look.

Figure: Structure of a YAML manifest. A YAML manifest defines the desired state of your resource in Kubernetes

Why YAML instead of commands?

  • Versionable: you put it in Git and you have a change history.
  • Reproducible: anyone can apply the same file and get the same result.
  • Declarative: you tell K8s “I want this” and it figures out how to get there.
  • Living documentation: the YAML is the source of truth for your infrastructure.

Almost every Kubernetes manifest has these 4 top-level fields (a few kinds, such as ConfigMap and Secret, carry data instead of a spec):

apiVersion: # API version (apps/v1, v1, networking.k8s.io/v1)
kind:       # Resource type (Deployment, Service, Ingress)
metadata:   # Name, labels, namespace
spec:       # The resource specification (this is where the magic happens)
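
To see the four fields filled in, here is roughly what the mi-nginx Pod from the kubectl run command looks like as a manifest. Treat it as a sketch: the real object in your cluster carries extra fields that Kubernetes adds on its own.

```yaml
apiVersion: v1            # Pods live in the core API group
kind: Pod
metadata:
  name: mi-nginx
  labels:
    run: mi-nginx         # the label kubectl run adds by default
spec:
  containers:
    - name: mi-nginx
      image: nginx:latest
      ports:
        - containerPort: 80
```

You can compare it with the live object using kubectl get pod mi-nginx -o yaml.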

Defining Our Deployment with YAML

Let’s create a full Deployment in YAML. Create a file called deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mi-app
  labels:
    app: mi-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mi-app
  template:
    metadata:
      labels:
        app: mi-app
    spec:
      containers:
        - name: mi-app
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "64Mi"
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "100m"

Let’s break down the important parts:

Field                   What it does
replicas: 2             We want 2 Pods running
selector.matchLabels    How the Deployment finds its Pods (by labels)
template                The template used to create each Pod
containers              List of containers inside each Pod
resources               CPU and memory requests and limits (always a good practice)

Apply it:

kubectl apply -f deployment.yaml
deployment.apps/mi-app created

Verify:

kubectl get deployment mi-app
kubectl get pods -l app=mi-app

Exposing the App with a Service

Our Pods are running, but nobody can access them. Let’s create a Service to give them a stable access point. Create service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: mi-app-svc
spec:
  selector:
    app: mi-app  # Selects the Pods with the label app=mi-app
  ports:
    - protocol: TCP
      port: 80        # Service port
      targetPort: 80   # Container port
  type: ClusterIP

Apply it:

kubectl apply -f service.yaml

Verify:

kubectl get svc mi-app-svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
mi-app-svc   ClusterIP   10.96.45.123   <none>        80/TCP    10s

Now any Pod in the same namespace can reach your app at mi-app-svc:80, and Pods in any namespace can use the full DNS name mi-app-svc.default.svc.cluster.local.
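
If you want to check that from inside the cluster, one option is a throwaway Pod that requests the Service once and exits. This is only a sketch: the Pod name and the busybox image are assumptions, and it expects to run in the same namespace as mi-app-svc.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: svc-test              # hypothetical name
spec:
  restartPolicy: Never        # run once, do not restart when the command finishes
  containers:
    - name: client
      image: busybox:1.36
      # Fetches the Service by its short DNS name and prints the response body
      command: ["wget", "-qO-", "http://mi-app-svc"]
```

After applying it, kubectl logs svc-test should show whatever the Service returned, and kubectl delete pod svc-test cleans up.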


Exposing to the Outside World with Gateway API

For external traffic to reach your app, we’ll use Gateway API — the modern Kubernetes standard for managing incoming traffic.

Installing Gateway API on Kind

First, install the Gateway API CRDs (Custom Resource Definitions):

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.5.1/standard-install.yaml

Next, you need a Gateway Controller. We’ll use Envoy Gateway for its simplicity:

kubectl apply -f https://github.com/envoyproxy/gateway/releases/download/v1.7.1/install.yaml

Wait for it to be ready:

kubectl wait --namespace envoy-gateway-system \
  --for=condition=Available deployment/envoy-gateway \
  --timeout=90s

Creating the Gateway and HTTPRoute

First, create the Gateway (the entry point). Create gateway.yaml:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: mi-gateway
spec:
  gatewayClassName: eg  # Envoy Gateway
  listeners:
    - name: http
      protocol: HTTP
      port: 80
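
The gatewayClassName above has to match an existing GatewayClass. If kubectl get gatewayclass shows nothing named eg in your cluster, a minimal sketch of one for Envoy Gateway could look like this. It assumes the controller name used in the Envoy Gateway documentation, so check it against the version you installed.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg                    # must match gatewayClassName in the Gateway
spec:
  # Assumed controller name, taken from the Envoy Gateway docs
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
```

Apply it before the Gateway, otherwise the Gateway has no controller to pick it up.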

Then, create the HTTPRoute that connects the Gateway to your Service. Create httproute.yaml:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mi-app-route
spec:
  parentRefs:
    - name: mi-gateway
  hostnames:
    - "mi-app.local"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: mi-app-svc
          port: 80

Apply both:

kubectl apply -f gateway.yaml
kubectl apply -f httproute.yaml

Verify:

kubectl get gateway
NAME         CLASS   ADDRESS        PROGRAMMED   AGE
mi-gateway   eg      172.18.0.200   True         30s

kubectl get httproute
NAME           HOSTNAMES            PARENTREFS       AGE
mi-app-route   ["mi-app.local"]     ["mi-gateway"]   15s

Notice how the separation is much cleaner: the Gateway defines where to listen and the HTTPRoute defines how to route. In a real team, the infra team would create the Gateway and each development team would configure their own HTTPRoutes.
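
As a sketch of that split between teams, the infra team can keep a Gateway in its own namespace and decide which namespaces may attach routes to it. Every name, namespace, and label below is hypothetical.

```yaml
# Owned by the infra team, in a shared namespace
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: Selector            # only namespaces matching the selector may attach
          selector:
            matchLabels:
              gateway-access: "true"
---
# Owned by a dev team, in its own namespace
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: team-a-route
  namespace: team-a
spec:
  parentRefs:
    - name: shared-gateway
      namespace: infra              # points at the Gateway in the other namespace
  rules:
    - backendRefs:
        - name: team-a-svc
          port: 80
```

For the route to attach, the team-a namespace would need the label gateway-access=true.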

Tip
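To test, add 127.0.0.1 mi-app.local to your /etc/hosts file and open http://mi-app.local in the browser. This assumes the Gateway’s port 80 is reachable on localhost, for example through a Kind port mapping or a port-forward.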

Scaling Replicas in the Deployment

Does your app need more muscle? Scaling is ridiculously easy. From the command line:

kubectl scale deployment mi-app --replicas=5

Verify:

kubectl get pods -l app=mi-app
NAME                      READY   STATUS    RESTARTS   AGE
mi-app-5d9b7f6b4-abc12   1/1     Running   0          10m
mi-app-5d9b7f6b4-def34   1/1     Running   0          10m
mi-app-5d9b7f6b4-ghi56   1/1     Running   0          5s
mi-app-5d9b7f6b4-jkl78   1/1     Running   0          5s
mi-app-5d9b7f6b4-mno90   1/1     Running   0          5s

Or if you prefer the declarative approach, simply change replicas: 5 in your deployment.yaml and apply again:

kubectl apply -f deployment.yaml

To scale down, same process but with a smaller number. Kubernetes will remove the extra Pods in an orderly fashion.


Autoscaling with HPA (Preview)

Manual scaling is fine, but what happens at 3 AM when your app goes viral? You’re not going to be awake to run kubectl scale. That’s what the Horizontal Pod Autoscaler (HPA) is for.

The HPA monitors metrics (like CPU usage) and automatically adjusts the number of replicas:

kubectl autoscale deployment mi-app --min=2 --max=10 --cpu-percent=70

This tells Kubernetes: “keep between 2 and 10 replicas of mi-app, and if average CPU utilization (measured against the Pods’ CPU requests) goes above 70%, scale up”.

Verify the HPA:

kubectl get hpa
NAME     REFERENCE           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
mi-app   Deployment/mi-app   10%/70%   2         10        2          30s
Info
For the HPA to work, you need to have the Metrics Server installed in your cluster and your Pods must have resources.requests defined (as we did in our YAML above).
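
For reference, the kubectl autoscale command above corresponds roughly to the following manifest. It is a minimal sketch using the autoscaling/v2 API with the same numbers as the command.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mi-app
spec:
  scaleTargetRef:               # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: mi-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the Pods' CPU requests
```

Keeping it as a file means the autoscaling rule can live in Git next to deployment.yaml.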

In the next chapter we dive deep into the HPA: YAML configuration, memory-based scaling, custom metrics with Prometheus, behavior policies, and an intro to VPA.


Port-forward: Testing Your Pod Locally

There’s a quick and direct way to test a Pod or Service without needing a Gateway: port-forward. This creates a tunnel from your local machine to the cluster.

Figure: kubectl port-forward. Port-forward creates a direct tunnel from your machine to the Pod inside the cluster

Port-forward to a Pod

kubectl port-forward pod/mi-app-5d9b7f6b4-abc12 8080:80

This redirects localhost:8080 on your machine to port 80 on the Pod. Open your browser at http://localhost:8080 and you’ll see NGINX.

Port-forward to a Service

More practical, because you don’t need to know the exact Pod name:

kubectl port-forward svc/mi-app-svc 8080:80

This is perfect for development and debugging. The terminal stays busy while the port-forward is active. Use Ctrl+C to stop it.

Note
Port-forward is only for development and testing. In production you’ll use LoadBalancer-type Services or Gateway API to expose your apps.

Summary

We covered a lot of ground today. Let’s recap:

  • Pod: the smallest unit in Kubernetes, a wrapper for containers.
  • Deployment: the Pod manager that handles replicas, rolling updates, and rollbacks.
  • Service: the stable access point to reach your Pods.
  • Gateway API: the modern standard for managing HTTP/HTTPS traffic from the outside (replaces Ingress).
  • HPA: the autoscaler that adjusts replicas based on metrics (more in the next chapter).
  • We learned to deploy with imperative commands and with YAML manifests (declarative).
  • We used port-forward to test our app locally.

In the next chapter we dive right into the HPA: advanced configuration, CPU and memory-based scaling, custom metrics with Prometheus, behavior policies, and an introduction to VPA.


Did you enjoy this article? Share it with someone who’s learning Kubernetes. And if you have questions, leave me a comment!