Telecom

Wavecom

61% Cost Reduction via Cloud Migration

We incrementally decomposed Wavecom's legacy Java EE monolith into Spring Boot microservices, migrated 12TB from Oracle to PostgreSQL, and moved everything to AWS EKS — cutting infrastructure costs by 61% and enabling daily deployments.

8 months (Jan 2025 – Aug 2025)
2 backend engineers, 1 database specialist, 1 DevOps engineer, 1 QA engineer, 1 PM
Spring Boot, PostgreSQL, AWS EKS, Terraform, Oracle Migration

61%

Infrastructure Cost Reduction

$4.2M/year → $1.64M/year. The Oracle license alone accounted for $1.4M of the $2.56M annual savings.

18 min

Deployment Time

Down from 6-hour Saturday deployment windows with a 47-page runbook

0

Downtime During Migration

12TB Oracle → PostgreSQL with dual-write strategy, zero data loss

Daily

Deploy Frequency

Up from one release every 6 weeks

The Challenge

What We Were Up Against

Wavecom's core platform was a Java EE monolith deployed on Oracle WebLogic, originally built in 2005. It managed billing, customer provisioning, network configuration, and reporting for 1.8 million subscribers. The Oracle database license alone was $1.4M/year, and the WebLogic cluster required 28 dedicated bare-metal servers in a colocation facility. Release cycles were 6 weeks because the QA team needed 3 weeks to regression test the entire monolith — there were no automated tests and the deployment process was a 47-page runbook. The operations team of 8 people spent 60% of their time on infrastructure maintenance rather than feature work.

Oracle License Costs

Oracle Database Enterprise Edition with RAC licensing cost $1.4M/year. About 40% of the 12TB database was reporting data that did not need Oracle's performance characteristics.

6-Week Release Cycles

No automated tests meant a 3-week manual regression cycle for every release. The 47-page deployment runbook required 4 engineers working in coordination for a 6-hour Saturday deployment window.

Scaling Limitations

The monolith scaled vertically only. During bill-run periods (1st–3rd of each month), CPU utilization hit 95% on all 28 servers, causing degraded response times for customer-facing APIs.

Recruitment Challenges

The legacy stack (Java EE with EJB 2.1 and JSF, plus Oracle PL/SQL) made hiring nearly impossible. Three senior engineering positions had been open for 8+ months with zero qualified candidates.

Constraints & Requirements

Zero downtime for billing operations — 1.8M subscribers depend on accurate, timely billing

Must maintain regulatory compliance for telecom audit requirements

Existing PL/SQL business logic (400+ stored procedures) contains 20 years of billing edge cases

Oracle RAC cluster can't be shut down until all dependent services are migrated

Our Approach

How We Built It

We used an incremental strangler fig approach, extracting one bounded context at a time. We started with the billing service because it had the highest business value and was the primary driver of Oracle licensing costs. Spring Boot was chosen over other frameworks because the existing team already knew the Java ecosystem — retraining 12 engineers on a new language would have added months. The Oracle-to-PostgreSQL migration was the riskiest part: 400+ stored procedures needed to be rewritten, and we used a dual-write strategy with automated reconciliation to ensure zero data loss.
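The strangler fig idea can be illustrated with a minimal routing sketch: each request goes to the legacy monolith until its bounded context has been cut over, and a cutover can be reverted. The class name, the path convention, and the context names are hypothetical and are not taken from Wavecom's codebase.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Routes each request either to the legacy monolith or to an extracted service. */
final class StranglerRouter {
    enum Target { LEGACY_MONOLITH, MICROSERVICE }

    // Bounded contexts that have been cut over so far (hypothetical names such as "billing").
    private final Set<String> migratedContexts = ConcurrentHashMap.newKeySet();

    /** Marks a bounded context as served by its new service. */
    void cutOver(String context) {
        migratedContexts.add(context);
    }

    /** Sends a context back to the monolith, for example after a failed cutover. */
    void rollBack(String context) {
        migratedContexts.remove(context);
    }

    /** Picks a target from the first path segment: "/billing/invoices/42" maps to "billing". */
    Target route(String path) {
        String[] parts = path.split("/");
        String context = parts.length > 1 ? parts[1] : "";
        return migratedContexts.contains(context) ? Target.MICROSERVICE : Target.LEGACY_MONOLITH;
    }
}
```

In practice this decision would more likely live in a gateway or load balancer rule than in application code; the sketch only shows the per-context switch that makes the extraction incremental.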

01

Discovery & Architecture Design

Weeks 1–4

Mapped all domain boundaries in the monolith using static analysis and production traffic patterns. Identified 6 bounded contexts. Designed the target microservices architecture and data migration strategy.

Domain map with 6 bounded contexts and dependency graph
Target architecture document with AWS EKS design
Data migration strategy with dual-write reconciliation approach
Risk register with mitigation plans for each phase

02

Billing Service Extraction

Weeks 5–14

Extracted the billing bounded context into a Spring Boot service. Rewrote in Java the 142 PL/SQL stored procedures (of 400+) that handled billing logic. The rest were deferred to later phases.

Spring Boot billing service with 142 rewritten business rules
PostgreSQL schema for billing data (4.8TB migrated from Oracle)
Dual-write adapter: writes to both Oracle and PostgreSQL simultaneously
Automated reconciliation job comparing Oracle/PostgreSQL data hourly

03

Customer Management & Provisioning

Weeks 15–22

Extracted customer management and network provisioning services. These had tighter coupling to each other than to billing, so they were extracted together with a shared event bus.

Customer management Spring Boot service
Provisioning service with network API integrations
Event bus (Amazon SQS) for inter-service communication
A further 183 PL/SQL procedures rewritten in Java

04

Reporting & Oracle Decommission

Weeks 23–34

Migrated the reporting workload to PostgreSQL with a dedicated analytics replica. Migrated the final PL/SQL procedures, then decommissioned the Oracle RAC cluster and the 28 bare-metal servers.

Reporting service with PostgreSQL analytics replica
All 400+ PL/SQL procedures migrated to Java
Oracle RAC cluster decommissioned (saving $1.4M/year in licenses)
28 bare-metal servers replaced by EKS cluster ($16.4K/month)

Key Features

What We Built

Dual-Write Migration Strategy

Writes went to both Oracle and PostgreSQL simultaneously during the migration, with hourly reconciliation jobs ensuring zero data divergence before cutover.

Technical Detail

A custom JDBC interceptor captured all write operations and replayed them against PostgreSQL. An hourly reconciliation job compared row counts and checksums and ran deep comparisons on random samples. Divergences triggered alerts and automatic investigation. During the dual-write period we found and fixed 7 data type edge cases caused by precision differences between Oracle NUMBER and PostgreSQL NUMERIC.
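A reconciliation pass of this kind can be sketched as follows. This is a minimal in-memory illustration, not the job used on the project: rows are modelled as a map from a hypothetical primary key to a numeric amount, and CRC32 stands in for whatever checksum was actually used (a cryptographic hash would be more collision-resistant). Values are normalized before hashing so that representations such as 10.50 and 10.5 do not register as divergence.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class Reconciler {
    /** Canonical text form so that 10.50 and 10.5 hash identically. */
    static String canonical(BigDecimal value) {
        return value.stripTrailingZeros().toPlainString();
    }

    /** Checksum over rows sorted by key, so iteration order does not matter. */
    static long checksum(Map<String, BigDecimal> rows) {
        CRC32 crc = new CRC32();
        for (Map.Entry<String, BigDecimal> e : new TreeMap<>(rows).entrySet()) {
            String line = e.getKey() + "=" + canonical(e.getValue()) + "\n";
            crc.update(line.getBytes(StandardCharsets.UTF_8));
        }
        return crc.getValue();
    }

    /** Returns the keys whose values differ or that exist on only one side. */
    static List<String> divergentKeys(Map<String, BigDecimal> oracle,
                                      Map<String, BigDecimal> postgres) {
        List<String> diverged = new ArrayList<>();
        TreeMap<String, BigDecimal> all = new TreeMap<>(oracle);
        postgres.forEach(all::putIfAbsent);
        for (String key : all.keySet()) {
            BigDecimal a = oracle.get(key);
            BigDecimal b = postgres.get(key);
            // compareTo ignores scale; equals() would treat 10.50 and 10.5 as different.
            if (a == null || b == null || a.compareTo(b) != 0) {
                diverged.add(key);
            }
        }
        return diverged;
    }

    /** Cheap checks first; the row-by-row comparison only runs on a mismatch. */
    static List<String> reconcile(Map<String, BigDecimal> oracle,
                                  Map<String, BigDecimal> postgres) {
        if (oracle.size() == postgres.size() && checksum(oracle) == checksum(postgres)) {
            return List.of();
        }
        return divergentKeys(oracle, postgres);
    }
}
```

A real job would stream rows from both databases in key ranges rather than hold them in memory, and would sample rows for the deep comparison instead of diffing everything.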

PL/SQL Business Logic Migration

400+ Oracle stored procedures containing 20 years of telecom billing edge cases were systematically rewritten as Spring Boot services, each verified against the original procedure's output before the procedure was retired.

Technical Detail

Each stored procedure was first documented by pairing with Wavecom's senior billing engineer. It was then rewritten in Java with unit tests that verified its output against the Oracle procedure for 10,000 historical transaction samples. The acceptance criterion was a 100% output match on all test cases before the Oracle procedure was retired.
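The acceptance rule can be sketched as a small parity harness: replay recorded inputs through the Java rewrite and compare each result with the output the Oracle procedure produced. Everything named here is hypothetical, including the `lateFeeCents` rule, which only stands in for a rewritten procedure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

final class ParityCheck {
    /** Returns every input for which the rewrite disagrees with the recorded Oracle output. */
    static <I, O> List<I> mismatches(Map<I, O> recordedOracleOutputs, Function<I, O> javaRewrite) {
        List<I> failed = new ArrayList<>();
        for (Map.Entry<I, O> sample : recordedOracleOutputs.entrySet()) {
            O actual = javaRewrite.apply(sample.getKey());
            if (!Objects.equals(actual, sample.getValue())) {
                failed.add(sample.getKey());
            }
        }
        return failed;
    }

    /** The acceptance criterion: retire the procedure only on a 100% match. */
    static <I, O> boolean safeToRetire(Map<I, O> recorded, Function<I, O> javaRewrite) {
        return !recorded.isEmpty() && mismatches(recorded, javaRewrite).isEmpty();
    }

    /** Hypothetical rewritten rule: 2% late fee in cents, rounded down, minimum 100 cents. */
    static long lateFeeCents(long balanceCents) {
        return Math.max(100L, balanceCents * 2 / 100);
    }
}
```

Amounts are kept as integer cents here so that `Objects.equals` is a safe comparison; with `BigDecimal` outputs the comparison would need to ignore scale.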

Auto-Scaling for Bill Runs

Monthly bill-run periods (1st–3rd) that previously pegged all 28 servers at 95% CPU now run at about 45% CPU on auto-scaled EKS pods.

Technical Detail

Scaling uses the Kubernetes Horizontal Pod Autoscaler (HPA) with custom metrics from Prometheus. Bill-run pods scale from 4 to 16 replicas based on queue depth. A CronJob pre-scales the deployment at 11 PM on the last day of each month. After the bill run, pods scale back down within 2 hours.
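The scaling rule can be approximated in a few lines. The sketch below assumes an average-value target, where the desired replica count is the queue depth divided by a per-pod target and rounded up, then clamped to the 4 to 16 range from the text. The per-pod target is a hypothetical parameter. The second method computes the pre-scale moment; standard five-field cron syntax has no direct "last day of the month" token, so a scheduled job typically has to compute or check the date itself.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.YearMonth;

final class BillRunScaling {
    static final int MIN_REPLICAS = 4;   // lower bound from the text
    static final int MAX_REPLICAS = 16;  // upper bound from the text

    /** ceil(queueDepth / targetMessagesPerPod), clamped to the replica bounds. */
    static int desiredReplicas(long queueDepth, long targetMessagesPerPod) {
        if (queueDepth < 0 || targetMessagesPerPod <= 0) {
            throw new IllegalArgumentException("queue depth must be >= 0 and target must be > 0");
        }
        long raw = (queueDepth + targetMessagesPerPod - 1) / targetMessagesPerPod;
        return (int) Math.max(MIN_REPLICAS, Math.min(MAX_REPLICAS, raw));
    }

    /** 11 PM on the last day of the month that contains the given date. */
    static LocalDateTime preScaleTime(LocalDate anyDayInMonth) {
        return YearMonth.from(anyDayInMonth).atEndOfMonth().atTime(LocalTime.of(23, 0));
    }
}
```

With a hypothetical target of 500 messages per pod, a queue of 5,000 messages asks for 10 replicas, and anything above 8,000 messages is capped at 16.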

Automated Testing & Deployment Pipeline

Replaced the 47-page deployment runbook and 3-week manual regression cycle with a fully automated CI/CD pipeline that deploys to production in 18 minutes.

Technical Detail

Jenkins pipeline (migrated to GitHub Actions in month 6) with 3 stages: unit tests (4 min), integration tests against a PostgreSQL test instance (8 min), canary deployment with automated smoke tests (6 min). Rollback is automatic if smoke tests fail. Test coverage went from 0% to 76% over the 8-month engagement.
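The automatic rollback step amounts to a gate on the smoke tests. A minimal sketch of that decision, with hypothetical check names supplied by the caller and no claim about how the actual pipeline implements it:

```java
import java.util.Map;
import java.util.function.BooleanSupplier;

final class CanaryGate {
    enum Outcome { PROMOTE, ROLLBACK }

    /** Runs the smoke checks in map order and rolls back on the first failure. */
    static Outcome evaluate(Map<String, BooleanSupplier> smokeChecks) {
        for (Map.Entry<String, BooleanSupplier> check : smokeChecks.entrySet()) {
            boolean passed;
            try {
                passed = check.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                passed = false; // a check that crashes counts as a failure
            }
            if (!passed) {
                return Outcome.ROLLBACK;
            }
        }
        return Outcome.PROMOTE;
    }
}
```

Note that an empty set of checks promotes the canary in this sketch; a stricter gate might treat that as a configuration error.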

Tech Stack

Why We Chose What We Chose

Backend

Spring Boot 3.2

Team of 12 Java engineers. Retraining on Go or Kotlin would have added 2+ months. Spring Boot modernized their skills while staying in the Java ecosystem.

PostgreSQL 16

Replaced Oracle Enterprise Edition. $0 licensing cost for comparable OLTP performance. Dropping the $1.4M/year Oracle license accounted for more than half of the $2.56M/year savings behind the 61% cost reduction.

Amazon SQS

Inter-service event bus. Chosen over Kafka: Wavecom didn't need replay or stream processing, and SQS is fully managed, so there were no brokers to operate.
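One consequence of this choice is worth illustrating. Assuming standard (non-FIFO) queues, SQS delivers messages at least once, so consumers need to tolerate duplicates. The sketch below shows an idempotent handler wrapper; the class, the event ID scheme, and the in-memory set are all hypothetical, and a production service would keep processed IDs in a durable store.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

final class IdempotentConsumer {
    // IDs of events that were already handled (in memory only in this sketch).
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true if the handler ran, false if the event was a duplicate delivery. */
    boolean onMessage(String eventId, String body) {
        if (!processedIds.add(eventId)) {
            return false; // already handled, skip silently
        }
        try {
            handler.accept(body);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(eventId); // let a redelivery retry the event
            throw e;
        }
    }
}
```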

Infrastructure

AWS EKS

Replaced 28 bare-metal servers. EKS gave auto-scaling for bill-run periods — something that was impossible with physical hardware.

Terraform

Infrastructure as code for the first time in Wavecom's history. Every environment is reproducible from a single terraform apply.

Helm Charts

Standardized Kubernetes deployment configurations across all 6 microservices. One values.yaml file per service per environment.

Data Migration

AWS DMS

Initial bulk migration of 12TB from Oracle to PostgreSQL. CDC (Change Data Capture) for the dual-write synchronization period.

pgLoader

Schema conversion from Oracle DDL to PostgreSQL. Handled most data type mappings automatically, leaving only 7 edge cases for manual review.

Observability

Prometheus + Grafana

Open-source observability stack. Wavecom didn't want another vendor license after the Oracle experience.

ELK Stack

Centralized logging. Replaced the practice of SSH-ing into servers to read log files.

Impact

Before & After

Metric | Before | After
Infrastructure Cost | $4.2M/year | $1.64M/year
Release Cycle | 6 weeks | Daily
Deployment Duration | 6 hours (manual) | 18 minutes (automated)
Test Coverage | 0% | 76%
Bill-Run CPU Utilization | 95% (all 28 servers) | 45% (auto-scaled)
Open Senior Eng Positions | 3 (unfilled 8+ months) | 0 (filled within 6 weeks)
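The headline cost figures can be cross-checked with a few lines of arithmetic. The inputs below are the numbers quoted in this case study; the class itself is only an illustrative sketch. The savings come to $2.56M/year, which is roughly 61% of $4.2M, and the $1.4M Oracle license is a little over half of that.

```java
final class CostSummary {
    /** Percentage reduction from the old annual cost to the new one. */
    static double reductionPercent(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        long before = 4_200_000L;        // $4.2M/year before the migration
        long after = 1_640_000L;         // $1.64M/year after the migration
        long oracleLicense = 1_400_000L; // Oracle license removed
        long savings = before - after;
        long eksPerYear = 16_400L * 12;  // EKS cluster at $16.4K/month

        System.out.printf("Annual savings: $%,d%n", savings);
        System.out.printf("Reduction: %.1f%%%n", reductionPercent(before, after));
        System.out.printf("Oracle share of savings: %.1f%%%n", 100.0 * oracleLicense / savings);
        System.out.printf("EKS cluster per year: $%,d%n", eksPerYear);
    }
}
```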

Engineering Quality

How We Ship

Test Coverage

76% overall (up from 0%), 92% on billing-critical paths

CI/CD Pipeline

GitHub Actions — 18-minute pipeline with unit, integration, and smoke tests

Monitoring

Prometheus + Grafana dashboards with PagerDuty alerting for billing SLA breaches

Deploy Frequency

Daily deployments via automated canary pipeline

We'd been told by two other consultancies that we needed a complete rewrite — 18 months minimum, $3M budget. TechWithCare's incremental approach meant we started seeing cost savings by month 3 when the billing service went live on PostgreSQL. The dual-write migration strategy was genius — we never had a moment where we weren't 100% confident in our data integrity.

Priya Narasimhan

VP of Engineering, Wavecom

Ongoing

What's Next

1. Migrating the legacy reporting dashboard to a modern React frontend
2. Implementing event sourcing for billing audit trail (regulatory requirement for 2026)
3. Adding multi-region deployment for disaster recovery
