
Deployments Services Ingress Basics

Learn Deployment, Service, and Ingress basics for free with explanations, exercises, and a quick test (for MLOps Engineers).

Published: January 4, 2026 | Updated: January 4, 2026

Who this is for

  • MLOps Engineers deploying ML inference APIs or batch services on Kubernetes.
  • Data/ML Engineers who need stable rollout, scaling, and simple external access.
  • Developers moving from local Docker to production Kubernetes.

Prerequisites

  • Comfort with Docker images.
  • Basic kubectl usage (apply, get, describe, logs).
  • Know Pods and containers (what a Pod is, basic YAML structure).

Why this matters

Your day-to-day MLOps tasks often include:

  • Rolling out a new model version with zero downtime.
  • Keeping a stable endpoint for clients even as Pods restart or scale.
  • Routing external HTTP traffic to the right service, often with path-based rules like /predict or /metrics.
  • Quickly reverting a bad release without breaking traffic.

Deployments, Services, and Ingress are the trio that makes this reliable:

  • Deployment = desired state, rolling updates, scaling for your Pods.
  • Service = stable virtual IP and DNS name for accessing Pods.
  • Ingress = HTTP(S) routing from outside the cluster to internal Services.

Concept explained simply

Think of Kubernetes networking like a building with three layers of doors:

  • Deployment: The operations team that ensures a certain number of identical rooms (Pods) are always ready and handles swapping occupants during renovations (rolling updates).
  • Service: A receptionist with a permanent phone number (ClusterIP) who forwards calls to any available room that matches certain labels.
  • Ingress: The main entrance for visitors from the street; it reads the visitor's request (host/path) and directs them to the right receptionist (Service).

Mental model

  • Labels and selectors are the glue. Deployment labels Pods; Service selects Pods via those labels; Ingress points to the Service.
  • Ports must align. Container port (in Pod) -> Service targetPort -> Service port -> Ingress backend service/port.
  • Health checks matter. Readiness gates traffic until the app is ready; liveness restarts stuck containers.
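
As a sketch, the glue points line up like this (abridged; names are illustrative and match the worked examples below):

# Deployment template labels the Pods...
template:
  metadata:
    labels:
      app: infer          # label stamped on each Pod
# ...the Service selects those labels and maps ports...
selector:
  app: infer              # must match the Pod label
ports:
- port: 80                # Service port (what clients use)
  targetPort: 8080        # must match the containerPort
# ...and the Ingress points at the Service port.
backend:
  service:
    name: infer-svc
    port:
      number: 80          # must match the Service port

If any link in this chain is off (label typo, wrong port), traffic silently stops; the "Common mistakes" section below shows how to check each link.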

Worked examples

Example 1: Deployment for an inference API

Goal: run 2 replicas of a FastAPI/Flask model server with safe rollouts and health checks.

Show Deployment YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: infer-deploy
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: infer
  template:
    metadata:
      labels:
        app: infer
    spec:
      containers:
      - name: infer
        image: ghcr.io/example/infer:1.0
        ports:
        - containerPort: 8080
        env:
        - name: MODEL_NAME
          value: resnet50
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /live
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 10
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "1"
            memory: "1Gi"

What this gives you: consistent scaling, safe rolling updates, and probes that prevent bad pods from receiving traffic.
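
Because the Deployment records rollout history, you can watch and revert releases with standard kubectl commands (names match the example above):

kubectl rollout status deploy/infer-deploy    # block until the rollout finishes
kubectl rollout history deploy/infer-deploy   # list previous revisions
kubectl rollout undo deploy/infer-deploy      # revert to the previous revision

With maxUnavailable: 0, an undo is also zero-downtime: old Pods keep serving until the restored Pods pass readiness.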

Example 2: Service for stable access

Goal: create a stable virtual IP to reach the Pods.

Show Service YAML
apiVersion: v1
kind: Service
metadata:
  name: infer-svc
spec:
  type: ClusterIP
  selector:
    app: infer
  ports:
  - name: http
    port: 80
    targetPort: 8080

Traffic flow: Service port 80 forwards to Pod port 8080. Inside the cluster, DNS name is infer-svc.default.svc.cluster.local (format varies by namespace).
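
One quick way to verify the Service from inside the cluster is a throwaway curl Pod (the image name here is just one common option):

kubectl run curl-test --rm -it --restart=Never \
  --image=curlimages/curl -- curl -s http://infer-svc/health

A successful response confirms DNS resolution, the Service port mapping, and that at least one Pod is Ready.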

Example 3: Ingress for external HTTP routing

Goal: expose /predict and /health from the outside world.

Show Ingress YAML (generic, controller required)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: infer-ing
  # Controller-specific annotations go here; they vary by controller.
  # Caution: NGINX's rewrite-target: /$1 requires a regex capture group
  # in the path. With plain Prefix paths like the ones below, it would
  # rewrite every request to "/", so it is omitted here.
spec:
  rules:
  - host: ml.example.local
    http:
      paths:
      - path: /predict
        pathType: Prefix
        backend:
          service:
            name: infer-svc
            port:
              number: 80
      - path: /health
        pathType: Prefix
        backend:
          service:
            name: infer-svc
            port:
              number: 80

Note: An Ingress controller must be installed in the cluster. The YAML declares intent; the controller does the actual routing.
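
Once a controller admits the rule, a sketch of an end-to-end check looks like this (the jsonpath assumes the controller publishes an IP in the Ingress status; substitute your own address otherwise):

INGRESS_IP=$(kubectl get ing infer-ing \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -H "Host: ml.example.local" "http://$INGRESS_IP/health"
curl -H "Host: ml.example.local" "http://$INGRESS_IP/predict"

The explicit Host header lets you test host-based rules without editing DNS or /etc/hosts.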

How to apply: step-by-step

  1. Apply the Deployment. Wait until all Pods are Ready.
  2. Apply the Service. Confirm it has a ClusterIP and endpoints.
  3. Apply the Ingress. Verify the controller admits the rule and endpoints are ready.
  4. Smoke test /health first, then /predict to validate functionality.
Useful kubectl commands
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml

kubectl get deploy,po,svc,ing
kubectl describe deploy infer-deploy
kubectl describe svc infer-svc
kubectl describe ing infer-ing

kubectl get endpointslices -l kubernetes.io/service-name=infer-svc
kubectl logs deploy/infer-deploy

Exercises (your turn)

These map to the graded exercises below. You can complete them in any conformant Kubernetes cluster. If you cannot run a cluster now, write the YAML and self-check against the checklist.

Exercise 1: Create a Deployment and Service for an inference API

Requirements:

  • Deployment named infer-deploy with 2 replicas.
  • Image: ghcr.io/example/infer:1.0
  • Container port: 8080; readiness on /health; liveness on /live.
  • Env: MODEL_NAME=beta
  • Service infer-svc exposes port 80 -> targetPort 8080.
Self-check checklist
  • kubectl get deploy shows 2/2 READY.
  • kubectl get svc shows infer-svc with a ClusterIP and port 80.
  • kubectl describe endpoints or endpointslices shows addresses for 2 Pods.
  • kubectl port-forward svc/infer-svc 8080:80 returns healthy response on /health.

Exercise 2: Add an Ingress for /predict

Requirements:

  • Ingress name: infer-ing
  • Host: ml.example.local
  • Path: /predict routes to infer-svc port 80
Self-check checklist
  • kubectl get ing shows ADDRESS populated (after controller config).
  • curl -H "Host: ml.example.local" hits the service.
  • Readiness failures do not receive traffic (test by breaking readiness temporarily).
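
To simulate the last check, one approach is to patch the readiness probe to a failing path; with maxUnavailable: 0, the old Pods keep serving while the new ones stay NotReady (this patch assumes the Deployment from this lesson, with a single container):

kubectl patch deploy infer-deploy --type=json -p='[
  {"op": "replace",
   "path": "/spec/template/spec/containers/0/readinessProbe/httpGet/path",
   "value": "/definitely-broken"}]'
kubectl get pods -l app=infer              # new Pods stay 0/1 Ready
kubectl rollout undo deploy/infer-deploy   # restore the working probe

While the new Pods are unready, requests through the Ingress should still succeed, which is exactly the traffic-gating behavior you want to confirm.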

Common mistakes and how to self-check

  • Mismatched labels: Service selector must match Pod labels. Self-check: kubectl get pods -l app=infer and ensure non-empty.
  • Port mismatch: Service targetPort must equal containerPort (or named port). Self-check: kubectl describe svc and verify.
  • Only liveness probe: Using liveness without readiness can send traffic to unready Pods. Add readiness to gate traffic.
  • Forgetting Ingress controller: Ingress YAML without a controller does nothing. Self-check: kubectl get pods -n ingress-controller-namespace (varies by setup).
  • Setting both maxSurge and maxUnavailable to 0: The API rejects this combination because no rollout could ever make progress. Use maxSurge: 1 and maxUnavailable: 0 for zero-downtime updates.

Practical projects

  • Blue/Green switch: Deploy infer-v1 and infer-v2. Point Service selector to one at a time and switch over.
  • Path router: Expose /predict to infer-svc and /metrics to a separate metrics-svc via Ingress.
  • A/B experiment: Two Deployments behind two Services, then manually route 10% of test traffic to B using a separate path (/predict-b).
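
For the Blue/Green project, the switch itself can be a one-line selector patch (assuming the two Deployments label their Pods app: infer-v1 and app: infer-v2):

kubectl patch svc infer-svc -p '{"spec":{"selector":{"app":"infer-v2"}}}'

Because the Service's ClusterIP and DNS name never change, clients see only the backend swap; to roll back, patch the selector back to app: infer-v1.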

Learning path

  • Start: Deployments/Services/Ingress basics (this lesson).
  • Next: ConfigMaps/Secrets for model configs and API keys.
  • Then: Horizontal Pod Autoscaler for load-based scaling of inference.
  • Advanced: Canary rollouts with Progressive Delivery tools; TLS and cert rotation; Multi-tenancy and network policies.

Next steps

  • Finish the exercises and compare with the provided solutions.
  • Take the quick test below to validate knowledge. Note: Everyone can take the test; progress is saved only when logged in.
  • Apply to a real service in your environment and capture before/after SLOs (latency, error rate) during a rollout.

Mini challenge

Create two Deployments (infer-v1, infer-v2) and two Services (infer-v1-svc, infer-v2-svc). Use an Ingress to route:

  • /v1/predict -> infer-v1-svc
  • /v2/predict -> infer-v2-svc

Bonus: Switch your main /predict to point to v2 by updating only the Ingress. Measure downtime (should be near zero if readiness is correct).

Practice Exercises

2 exercises to complete

Instructions

Create a Deployment infer-deploy with 2 replicas using image ghcr.io/example/infer:1.0, containerPort 8080, env MODEL_NAME=beta, readiness /health on 8080, liveness /live on 8080. Then create a Service infer-svc (ClusterIP) that exposes port 80 to targetPort 8080.

Verify Pods become Ready and the Service has endpoints.

Expected Output
Deployment infer-deploy shows 2/2 READY; Service infer-svc has one ClusterIP on port 80 and lists 2 endpoints.

Deployments Services Ingress Basics — Quick Test

Test your knowledge with 10 questions. Pass with 70% or higher.

