The Complete Guide to Modern Java Architecture - Part 4: Performance & Scalability

This is Part 4 of a comprehensive 5-part series on Modern Java Architecture. Building on the implementation foundations from Part 3, we now focus on making systems fast, responsive, and capable of handling massive scale.

Series Overview:

  • Part 1: Foundation - Evolution, principles, and modern Java features
  • Part 2: Architecture Patterns - Monoliths, microservices, and event-driven design
  • Part 3: Implementation Deep Dives - APIs, data layer, security, and observability
  • Part 4: Performance & Scalability (This post) - Optimization, reactive programming, and scaling patterns
  • Part 5: Production Considerations - Deployment, containers, and operational excellence

Performance isn't an afterthought; it's a fundamental architectural quality that shapes every design decision. After optimizing systems serving billions of requests, from startup MVPs to enterprise platforms processing terabytes daily, I've learned that scalability comes from combining four things well: reactive programming, intelligent caching, JVM optimization, and horizontal scaling patterns.

This part provides battle-tested strategies for building systems that perform under load and scale gracefully as demand grows.

Reactive Programming: The Foundation of Scale

Understanding the Reactive Stack

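Traditional servlet-based applications dedicate one thread to each in-flight request, so concurrency is capped by the size of the thread pool and threads sit idle while they wait on I/O. Reactive programming handles thousands of concurrent requests on a small number of event-loop threads because no thread is held while a call is outstanding.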

// Traditional blocking approach - Thread per request
@RestController
public class BlockingOrderController {
    
    private final PaymentService paymentService;
    private final InventoryService inventoryService;
    private final EmailService emailService;
    private final OrderService orderService; // used below; injected like the others
    
    @PostMapping("/orders")
    public ResponseEntity<Order> createOrder(@RequestBody CreateOrderRequest request) {
        // Each operation blocks the thread
        PaymentResult payment = paymentService.processPayment(request.getPayment()); // 200ms
        InventoryResult inventory = inventoryService.reserve(request.getItems());    // 150ms
        
        Order order = orderService.createOrder(request, payment, inventory);
        
        // Email notification blocks thread
        emailService.sendConfirmation(order); // 300ms
        
        return ResponseEntity.ok(order);
        // Total: 650ms blocking time per request
        // With 200 threads: max 200 concurrent requests
    }
}

// Reactive approach - Event loop with backpressure
@RestController
public class ReactiveOrderController {
    
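    private final ReactivePaymentService paymentService;
    private final ReactiveInventoryService inventoryService;
    private final ReactiveEmailService emailService;
    private final ReactiveOrderService orderService; // type name assumed; createOrder returns Mono<Order>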
    
    @PostMapping("/orders")
    public Mono<ResponseEntity<Order>> createOrder(@RequestBody CreateOrderRequest request) {
        return paymentService.processPayment(request.getPayment())
            .zipWith(inventoryService.reserve(request.getItems()))
            .flatMap(tuple -> {
                PaymentResult payment = tuple.getT1();
                InventoryResult inventory = tuple.getT2();
                
                return orderService.createOrder(request, payment, inventory);
            })
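            .flatMap(order -> 
                // Non-blocking, but still sequential: the response waits for the email
                emailService.sendConfirmation(order)
                    .thenReturn(ResponseEntity.ok(order))
            )
            .timeout(Duration.ofSeconds(5))
            .onErrorResume(this::handleError);
            // Payment and inventory run concurrently: ~200ms instead of ~350ms.
            // The email step still adds its ~300ms unless it is moved off the request path.
            // No thread is parked while waiting, so a few event-loop threads
            // can serve thousands of concurrent requests.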
    }
    
    private Mono<ResponseEntity<Order>> handleError(Throwable error) {
        return switch (error) {
            case PaymentException e -> Mono.just(ResponseEntity.badRequest().build());
            case InventoryException e -> Mono.just(ResponseEntity.status(409).build());
            case TimeoutException e -> Mono.just(ResponseEntity.status(504).build()); // upstream timed out; 408 would blame the client
            default -> Mono.just(ResponseEntity.status(500).build());
        };
    }
}
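The latency gain in the reactive version comes from how the calls are composed, not from the reactive types themselves. The following is a minimal sketch using plain `CompletableFuture` rather than Reactor, with simulated delays taken from the comments above. It uses worker threads, so it illustrates only the timing effect of running two independent calls concurrently, not non-blocking I/O. The class and method names are illustrative assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelCompositionSketch {

    // Simulates a remote call that takes `millis` and then returns `label`.
    static String simulateCall(String label, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return label;
    }

    // Runs the two calls one after the other; returns elapsed milliseconds.
    static long sequentialMillis(long paymentMs, long inventoryMs) {
        long start = System.nanoTime();
        simulateCall("payment", paymentMs);
        simulateCall("inventory", inventoryMs);
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Starts both calls at once and waits for both; returns elapsed milliseconds.
    static long parallelMillis(long paymentMs, long inventoryMs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            long start = System.nanoTime();
            CompletableFuture<String> payment =
                CompletableFuture.supplyAsync(() -> simulateCall("payment", paymentMs), pool);
            CompletableFuture<String> inventory =
                CompletableFuture.supplyAsync(() -> simulateCall("inventory", inventoryMs), pool);
            // thenCombine plays the role of zipWith: it completes when both inputs have completed
            payment.thenCombine(inventory, (p, i) -> p + "+" + i).join();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sequential ms: " + sequentialMillis(200, 150));
        System.out.println("parallel ms:   " + parallelMillis(200, 150));
    }
}
```

The sequential figure should land near the sum of the two delays and the parallel figure near the larger one. Any step chained after the combined result, such as the confirmation email, adds its own latency on top.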

Advanced Reactive Patterns

Backpressure Handling:

@Service
public class ReactiveOrderProcessingService {
    
    private final Flux<OrderEvent> orderEventStream;
    
    public Flux<ProcessingResult> processOrderStream() {
        return orderEventStream
            // Handle backpressure with buffering
            .onBackpressureBuffer(1000, 
                dropped -> log.warn("Dropped order event due to backpressure: {}", dropped),
                BufferOverflowStrategy.DROP_OLDEST)
            
            // Parallel processing with bounded concurrency
            .parallel(Runtime.getRuntime().availableProcessors())
            .runOn(Schedulers.parallel())
            
            // Process each order with bounded concurrency (a concurrency cap, not a rate limit)
            .flatMap(this::processOrder, false, 10) // delayError=false, max 10 in flight per rail
            
            // Collect results back to sequential stream
            .sequential()
            
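            // Retry with exponential backoff. This resubscribes to the source stream,
            // and processOrder already turns its own errors into failure results,
            // so only upstream TransientExceptions reach this operator.
            .retryWhen(Retry.backoff(3, Duration.ofSeconds(1))
                .filter(throwable -> throwable instanceof TransientException))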
            
            // Circuit breaker pattern
            .transform(CircuitBreakerOperator.of(circuitBreaker))
            
            // Metrics collection
            .doOnNext(result -> meterRegistry.counter("orders.processed").increment())
            .doOnError(error -> meterRegistry.counter("orders.failed").increment());
    }
    
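    private Mono<ProcessingResult> processOrder(OrderEvent event) {
        return Mono.fromCallable(() -> {
            // CPU-intensive processing
            return complexOrderProcessing(event);
        })
        // Offload to a bounded pool. boundedElastic() targets blocking work;
        // prefer Schedulers.parallel() if this step is purely CPU-bound.
        .subscribeOn(Schedulers.boundedElastic())
        .timeout(Duration.ofSeconds(30))
        .map(ProcessingResult::success)
        // onErrorReturn takes a fixed value, so map the error with onErrorResume
        .onErrorResume(error -> Mono.just(ProcessingResult.failure(error)));
    }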
}
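To make the `DROP_OLDEST` overflow strategy concrete, here is a small sketch of a bounded buffer with the same eviction rule, written in plain Java without Reactor. `DropOldestBuffer` is a hypothetical class for illustration only; Reactor's operator additionally tracks downstream demand, which this sketch does not attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class DropOldestBuffer<T> {
    private final int capacity;
    private final Deque<T> queue = new ArrayDeque<>();
    private final List<T> dropped = new ArrayList<>();

    DropOldestBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Producer side: never blocks. When the buffer is full, the oldest
    // element is evicted to make room for the new one.
    synchronized void offer(T item) {
        if (queue.size() == capacity) {
            dropped.add(queue.pollFirst());
        }
        queue.addLast(item);
    }

    // Consumer side: returns the oldest buffered element, or null if empty.
    synchronized T poll() {
        return queue.pollFirst();
    }

    // Snapshot of everything evicted so far (useful for metrics or logging).
    synchronized List<T> droppedSoFar() {
        return new ArrayList<>(dropped);
    }

    synchronized int size() {
        return queue.size();
    }
}
```

With a capacity of 3 and five offers, the first two items are evicted and the consumer sees the third item first. The trade-off is that a slow consumer always works on recent data, at the cost of silently losing older events unless the drop callback records them.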

Reactive Data Access:

// R2DBC for reactive database access
@Repository
public class ReactiveOrderRepository {
    
    private final R2dbcEntityTemplate template;
    private final DatabaseClient databaseClient;
    
    public Mono<Order> save(Order order) {
        return template.insert(order)
            .doOnSuccess(saved -> log.debug("Saved order: {}", saved.getId()))
            .doOnError(error -> log.error("Failed to save order", error));
    }
    
    public Flux<Order> findByCustomerId(String customerId) {
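        // Alias the order key: if order_items also has an "id" column, the
        // Map-based row cannot distinguish it from orders.id
        String sql = """
            SELECT o.id AS order_key, o.*, oi.* 
            FROM orders o 
            LEFT JOIN order_items oi ON o.id = oi.order_id 
            WHERE o.customer_id = :customerId 
            ORDER BY o.created_at DESC, o.id
            """;
            
        return databaseClient.sql(sql)
            .bind("customerId", customerId)
            .fetch()
            .all()
            .bufferUntilChanged(row -> row.get("order_key")) // group joined rows per order
            .map(this::mapToOrder)
            .take(100); // Limit to 100 orders; add SQL paging if histories are large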
    }
    
    // Streaming large result sets
    public Flux<OrderSummary> streamOrderSummaries(LocalDate fromDate) {
        return databaseClient.sql("""
            SELECT 
                DATE(created_at) as order_date,
                COUNT(*) as order_count,
                SUM(total_amount) as total_revenue
            FROM orders 
            WHERE created_at >= :fromDate
            GROUP BY DATE(created_at)
            ORDER BY order_date
            """)
            .bind("fromDate", fromDate)
            .fetch()
            .all()
            .map(row -> OrderSummary.builder()
                .date((LocalDate) row.get("order_date"))
                .orderCount((Long) row.get("order_count"))
                .totalRevenue((BigDecimal) row.get("total_revenue"))
                .build());
            // Rows are emitted as the driver delivers them. Collecting 1-second
            // windows into lists and flattening them again would only add latency.
    }
}

// Reactive caching with Redis
@Service
public class ReactiveOrderCacheService {
    
    private final ReactiveRedisTemplate<String, Order> redisTemplate;
    private final ReactiveOrderRepository orderRepository;
    
    public Mono<Order> getOrder(String orderId) {
        String cacheKey = "order:" + orderId;
        
        return redisTemplate.opsForValue().get(cacheKey)
            // Log here, before switchIfEmpty, so database loads are not reported as cache hits
            .doOnNext(order -> log.debug("Retrieved order from cache: {}", orderId))
            .switchIfEmpty(
                // Cache miss - fetch from database (deferred until actually needed)
                Mono.defer(() -> orderRepository.findById(orderId))
                    .flatMap(order -> 
                        // Cache for 1 hour
                        redisTemplate.opsForValue()
                            .set(cacheKey, order, Duration.ofHours(1))
                            .thenReturn(order)
                    )
            )
            .timeout(Duration.ofSeconds(5));
    }
    
    public Mono<Void> invalidateOrder(String orderId) {
        return redisTemplate.delete("order:" + orderId)
            .then();
    }
}

Caching Strategies: Multi-Level Performance

Intelligent Caching Architecture

// Multi-level caching with cache-aside pattern
@Service
public class MultiLevelCacheService {
    
    private final Cache<String, Order> l1Cache; // Caffeine - In-memory
    private final ReactiveRedisTemplate<String, Order> l2Cache; // Redis - Distributed
    private final OrderRepository repository; // Database - Source of truth
    
    public Mono<Order> getOrder(String orderId) {
        // L1 Cache (In-memory) - Fastest
        Order cachedOrder = l1Cache.getIfPresent(orderId);
        if (cachedOrder != null) {
            return Mono.just(cachedOrder)
                .doOnNext(order -> recordCacheHit("L1", orderId));
        }
        
        // L2 Cache (Redis) - Fast network
        return l2Cache.opsForValue().get("order:" + orderId)
            .doOnNext(order -> {
                l1Cache.put(orderId, order); // Populate L1
                recordCacheHit("L2", orderId);
            })
            .switchIfEmpty(
                // Cache miss - Load from database
                repository.findById(orderId)
                    .flatMap(order -> {
                        // Populate both cache levels
                        l1Cache.put(orderId, order);
                        return l2Cache.opsForValue()
                            .set("order:" + orderId, order, Duration.ofHours(1))
                            .then(Mono.just(order));
                    })
                    .doOnNext(order -> recordCacheMiss(orderId))
            );
    }
    
    public Mono<Void> updateOrder(Order order) {
        return repository.save(order)
            .flatMap(savedOrder -> {
                // Invalidate both cache levels
                l1Cache.invalidate(order.getId());
                return l2Cache.delete("order:" + order.getId());
            })
            .then();
    }
}
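One consequence of the two-level design is easy to miss: the L1 cache is local to each application instance, so invalidating it on the instance that handled the update does not clear it anywhere else. The sketch below models that with plain maps standing in for Caffeine, Redis, and the database. `TwoLevelCacheSketch` and its constructor parameters are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class TwoLevelCacheSketch<K, V> {
    private final Map<K, V> l1 = new ConcurrentHashMap<>(); // per instance, like Caffeine
    private final Map<K, V> l2;                             // shared, stands in for Redis
    private final Function<K, Optional<V>> source;          // stands in for the database

    TwoLevelCacheSketch(Map<K, V> sharedL2, Function<K, Optional<V>> source) {
        this.l2 = sharedL2;
        this.source = source;
    }

    // Lookup order: L1, then L2, then the source of truth.
    Optional<V> get(K key) {
        V local = l1.get(key);
        if (local != null) {
            return Optional.of(local);
        }
        V shared = l2.get(key);
        if (shared != null) {
            l1.put(key, shared); // populate L1 on an L2 hit
            return Optional.of(shared);
        }
        Optional<V> loaded = source.apply(key);
        loaded.ifPresent(value -> {
            l2.put(key, value);
            l1.put(key, value);
        });
        return loaded;
    }

    // Clears this instance's L1 and the shared L2. Other instances' L1
    // caches are untouched.
    void invalidate(K key) {
        l1.remove(key);
        l2.remove(key);
    }
}
```

If two instances share one L2 map and both have read the same order, an update followed by `invalidate` on the first instance leaves the second instance serving its stale L1 entry. In practice this is usually addressed with a short L1 TTL or by broadcasting invalidations, for example over Redis pub/sub.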

// Cache warming strategies
@Component
public class CacheWarmupService {
    
    @EventListener(ApplicationReadyEvent.class)
    public void warmupCaches() {
        log.info("Starting cache warmup");
        
        // Warm frequently accessed data
        warmupPopularProducts()
            .then(warmupActiveCustomers())
            .then(warmupRecentOrders())
            .subscribe(
                null, // Mono<Void> never emits a value, so there is no onNext to handle
                error -> log.error("Cache warmup failed", error),
                () -> log.info("Cache warmup completed")
            );
    }
    
    private Mono<Void> warmupPopularProducts() {
        return productService.getTopProducts(100)
            .flatMap(productCacheService::preload)
            .then();
    }
    
    @Scheduled(fixedRate = 300000) // Every 5 minutes
    public void refreshHotData() {
        // Refresh cache for hot data before expiration
        hotDataIdentifier.getHotKeys()
            .flatMap(this::refreshCacheEntry)
            .subscribe();
    }
}

// Write-behind (write-back) caching: writes are acknowledged from the cache
// and flushed to the database in batches
@Service
public class WriteOptimizedCacheService {
    
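    private final Cache<String, Order> writeCache;
    private final ScheduledExecutorService writeExecutor;
    // Ids awaiting a flush; must be a concurrent set because request threads
    // add to it while the scheduler thread drains it
    private final Set<String> pendingWrites = ConcurrentHashMap.newKeySet();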
    
    @PostConstruct
    public void initializeWriteBehind() {
        // Batch write every 5 seconds
        writeExecutor.scheduleAtFixedRate(this::flushWrites, 5, 5, TimeUnit.SECONDS);
    }
    
    public Mono<Void> updateOrder(Order order) {
        // Immediate cache update
        writeCache.put(order.getId(), order);
        
        // Mark for batch write
        pendingWrites.add(order.getId());
        
        return Mono.empty(); // Non-blocking return
    }
    
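    private void flushWrites() {
        // Drain id by id. Copying the set and then calling clear() would lose
        // ids added between the two calls.
        Set<String> toWrite = new HashSet<>();
        pendingWrites.removeIf(toWrite::add);
        
        if (!toWrite.isEmpty()) {
            // An entry evicted from writeCache before the flush is lost here,
            // so size the cache for the flush interval
            List<Order> orders = toWrite.stream()
                .map(writeCache::getIfPresent)
                .filter(Objects::nonNull)
                .collect(toList());
                
            repository.saveAll(orders)
                .collectList() // saveAll returns a Flux on reactive repositories
                .doOnSuccess(saved -> log.debug("Flushed {} orders to database", saved.size()))
                .onErrorResume(error -> {
                    log.error("Write-behind flush failed, re-queueing {} orders", toWrite.size(), error);
                    pendingWrites.addAll(toWrite); // retried on the next flush
                    return Mono.empty();
                })
                .subscribe();
        }
    }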
}

Cache Consistency Patterns

// Event-driven cache invalidation
@Component
public class CacheInvalidationHandler {
    
    @EventListener
    @Async
    public void handleOrderUpdated(OrderUpdatedEvent event) {
        String orderId = event.getOrderId();
        
        // Invalidate related caches
        Mono.when(
            orderCacheService.invalidate(orderId),
            customerCacheService.invalidateOrderHistory(event.getCustomerId()),
            analyticsCacheService.invalidateOrderStats(event.getDate())
        ).subscribe(
            null, // Mono.when completes without emitting a value
            error -> log.error("Cache invalidation failed", error),
            () -> log.debug("Cache invalidated for order: {}", orderId)
        );
    }
    
    @EventListener
    @Async
    public void handleProductUpdated(ProductUpdatedEvent event) {
        // Invalidate product cache and related order caches
        String productId = event.getProductId();
        
        productCacheService.invalidate(productId)
            .then(findOrdersContainingProduct(productId))
            .flatMapMany(Flux::fromIterable)
            .flatMap(orderCacheService::invalidate)
            .then()
            .subscribe();
    }
}

// Distributed cache locking for cache-aside
@Service
public class DistributedCacheService {
    
    private final RedisTemplate<String, Object> redisTemplate;
    private final RedissonClient redissonClient;
    
    public Mono<Order> getOrderWithLocking(String orderId) {
        String cacheKey = "order:" + orderId;
        String lockKey = "lock:order:" + orderId;
        
        // RedisTemplate and RLock are blocking APIs, so keep them off event-loop threads
        return Mono.fromCallable(() -> (Order) redisTemplate.opsForValue().get(cacheKey))
            .subscribeOn(Schedulers.boundedElastic())
            .switchIfEmpty(
                // Cache miss - load under a lock so only one caller hits the database
                Mono.fromCallable(() -> loadUnderLock(orderId, cacheKey, lockKey))
                    .subscribeOn(Schedulers.boundedElastic())
            );
    }
    
    // An RLock must be released by the thread that acquired it, so lock, load,
    // and unlock run as one blocking unit rather than across reactive operators.
    private Order loadUnderLock(String orderId, String cacheKey, String lockKey)
            throws InterruptedException {
        RLock lock = redissonClient.getLock(lockKey);
        if (!lock.tryLock(5, 30, TimeUnit.SECONDS)) {
            throw new LockAcquisitionException("Failed to acquire lock: " + lockKey);
        }
        try {
            // Double-check cache under lock
            Order cached = (Order) redisTemplate.opsForValue().get(cacheKey);
            if (cached != null) {
                return cached;
            }
            // Still not in cache - load from database and cache the result
            Order order = repository.findById(orderId).block();
            if (order != null) {
                redisTemplate.opsForValue().set(cacheKey, order, Duration.ofMinutes(30));
            }
            return order; // null completes the Mono empty (order not found)
        } finally {
            if (lock.isHeldByCurrentThread()) { // the 30s lease may already have expired
                lock.unlock();
            }
        }
    }
}
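The distributed lock prevents a cache stampede across instances. Within a single JVM, the same effect can often be had more cheaply by letting concurrent callers share one in-flight load. What follows is a sketch of that "single-flight" idea, assuming a loader that returns a `CompletableFuture`; `SingleFlightLoader` is a hypothetical helper, not part of Redisson or Spring.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class SingleFlightLoader<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, CompletableFuture<V>> loader;

    SingleFlightLoader(Function<K, CompletableFuture<V>> loader) {
        this.loader = loader;
    }

    // Returns the future of the load already in progress for this key,
    // or starts a new load if none is running.
    CompletableFuture<V> load(K key) {
        CompletableFuture<V> promise = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, promise);
        if (existing != null) {
            return existing; // join the load that is already running
        }
        try {
            loader.apply(key).whenComplete((value, error) -> {
                // Remove first so later callers trigger a fresh load
                inFlight.remove(key, promise);
                if (error != null) {
                    promise.completeExceptionally(error);
                } else {
                    promise.complete(value);
                }
            });
        } catch (RuntimeException e) {
            inFlight.remove(key, promise);
            promise.completeExceptionally(e);
        }
        return promise;
    }
}
```

Callers that arrive while a load is pending receive the same future, so the database sees one query per key per instance. Combined with a distributed lock, this keeps lock contention proportional to the number of instances rather than the number of requests.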

JVM Performance Optimization

Memory Management and GC Tuning

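# Production JVM configuration for high-throughput applications.
# Comments cannot live inside the quoted string (the JVM would receive them
# as arguments), so options are appended group by group.

# Heap sizing - 70% of the container memory limit. Do not combine these with
# -Xms/-Xmx: explicit sizes such as -Xms8g -Xmx8g override the percentage flags.
JAVA_OPTS="-XX:InitialRAMPercentage=70.0 -XX:MaxRAMPercentage=70.0"

# G1GC for low-latency requirements. Young-generation bounds are left to G1:
# G1NewSizePercent/G1MaxNewSizePercent are experimental flags, and pinning
# them works against the MaxGCPauseMillis goal.
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:G1HeapRegionSize=16m"
JAVA_OPTS="$JAVA_OPTS -XX:+UseStringDeduplication"

# GC logging for monitoring (rotated)
JAVA_OPTS="$JAVA_OPTS -Xlog:gc*:file=gc.log:time,tags:filecount=5,filesize=20m"

# JIT compilation: keep the default tiered compilation. -XX:TieredStopAtLevel=1
# disables the C2 compiler and lowers peak throughput (it only helps startup),
# and -XX:CompileThreshold is ignored while tiered compilation is on.

# Monitoring and profiling (-XX:+FlightRecorder is deprecated in recent JDKs;
# start a recording explicitly instead)
JAVA_OPTS="$JAVA_OPTS -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints"
JAVA_OPTS="$JAVA_OPTS -XX:StartFlightRecording=disk=true,maxage=1h,settings=default"

# Container support (JDK 10+) and compressed oops/class pointers (heaps under
# 32 GB) are enabled by default, so they need no explicit flags.
export JAVA_OPTS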
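Flags are easy to get wrong silently, especially when heap percentages and container limits interact, so it helps to have the application report what the JVM actually resolved. The sketch below uses only standard `java.lang.management` APIs; the class name `JvmSettingsReport` is an assumption for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

final class JvmSettingsReport {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Names of the active collectors, e.g. the young and old generation
    // collectors of whichever GC was selected.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
            .map(GarbageCollectorMXBean::getName)
            .collect(Collectors.toList());
    }

    // Flags the JVM was started with (excludes arguments to main).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB):  " + maxHeapMiB());
        System.out.println("CPUs visible:    " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors:      " + collectorNames());
        System.out.println("JVM arguments:   " + jvmArguments());
    }
}
```

Logging this once at startup makes it obvious when a container limit, rather than the intended flag, decided the heap size or the visible CPU count.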

Application-Level Memory Optimization:

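// Object pooling for high-frequency allocations. Measure before adopting:
// for small short-lived objects, allocation is usually cheaper than pooling;
// pools pay off mainly for expensive resources such as large buffers.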
@Component
public class ObjectPoolManager {
    
    private final ObjectPool<StringBuilder> stringBuilderPool;
    private final ObjectPool<ByteBuffer> byteBufferPool;
    
    public ObjectPoolManager() {
        this.stringBuilderPool = new GenericObjectPool<>(
            new StringBuilderPooledObjectFactory(), 
            buildPoolConfig(100, 10)
        );
        
        this.byteBufferPool = new GenericObjectPool<>(
            new ByteBufferPooledObjectFactory(8192),
            buildPoolConfig(50, 5)
        );
    }
    
    public <T> T withPooledStringBuilder(Function<StringBuilder, T> operation) {
        StringBuilder sb = null;
        try {
            sb = stringBuilderPool.borrowObject();
            return operation.apply(sb);
        } catch (Exception e) {
            throw new RuntimeException("Pool operation failed", e);
        } finally {
            if (sb != null) {
                sb.setLength(0); // Clear for reuse
                try {
                    stringBuilderPool.returnObject(sb);
                } catch (Exception e) {
                    log.warn("Failed to return object to pool", e);
                }
            }
        }
    }
    
    // Generic so the config type matches each pool's element type
    private <T> GenericObjectPoolConfig<T> buildPoolConfig(int maxTotal, int maxIdle) {
        GenericObjectPoolConfig<T> config = new GenericObjectPoolConfig<>();
        config.setMaxTotal(maxTotal);
        config.setMaxIdle(maxIdle);
        config.setMinIdle(2);
        config.setTestOnBorrow(true);
        config.setTestWhileIdle(true);
        // Fail fast: borrowObject() throws NoSuchElementException when the pool is exhausted
        config.setBlockWhenExhausted(false);
        return config;
    }
}

// Memory-efficient data structures
@Service
public class OptimizedDataService {
    
    // Use primitive collections to avoid boxing overhead
    private final TIntObjectHashMap<Order> orderIndex = new TIntObjectHashMap<>();
    private final TLongSet processedOrderIds = new TLongHashSet();
    
    // Flyweight pattern for repeated data
    private final LoadingCache<String, ProductCategory> categoryCache = 
        Caffeine.newBuilder()
            .maximumSize(1000)
            .build(this::loadProductCategory);
    
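    // Off-heap storage for large datasets. create() keeps the map in memory only;
    // use createPersistedTo(file) for persistence. Variable-size keys and values
    // need size hints; the sizes below are placeholder assumptions.
    private final ChronicleMap<String, Order> persistentOrderCache = 
        ChronicleMap.of(String.class, Order.class)
            .entries(1_000_000)
            .averageKeySize(36)
            .averageValueSize(512)
            .create();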
    
    public void optimizedOrderProcessing(List<Order> orders) {
        // Process in batches to bound the working set; each batch becomes
        // garbage as soon as processBatch returns
        Lists.partition(orders, 1000)
            .forEach(this::processBatch);
        // No System.gc() here: with G1 an explicit call triggers a full
        // stop-the-world collection by default, which hurts latency
    }
    
    // Use records as compact, immutable value carriers
    public record OrderMetrics(
        int orderCount,
        BigDecimal totalRevenue,
        double averageOrderValue
    ) {
        public static OrderMetrics calculate(Collection<Order> orders) {
            int count = orders.size();
            if (count == 0) {
                // Avoid dividing by zero for an empty collection
                return new OrderMetrics(0, BigDecimal.ZERO, 0.0);
            }
            BigDecimal total = orders.stream()
                .map(Order::getTotalAmount)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
            double average = total.divide(BigDecimal.valueOf(count), 2, RoundingMode.HALF_UP)
                .doubleValue();
            
            return new OrderMetrics(count, total, average);
        }
    }
}

JIT Optimization and Profiling

// JIT-friendly code patterns
@Service
public class JitOptimizedOrderService {
    
    // Avoid polymorphic calls in hot paths
    public void processOrders(List<StandardOrder> orders) {
        // Monomorphic call site - JIT can inline aggressively
        for (StandardOrder order : orders) {
            order.calculateShipping(); // Single implementation
        }
    }
    
    // final documents intent; HotSpot can already devirtualize and inline
    // non-final methods that have a single loaded implementation
    public final class OrderCalculator {
        
        // Hot method - keep simple for easy inlining
        public final BigDecimal calculateTotal(List<OrderItem> items) {
            BigDecimal total = BigDecimal.ZERO;
            for (int i = 0; i < items.size(); i++) {
                OrderItem item = items.get(i);
                total = total.add(item.getPrice().multiply(
                    BigDecimal.valueOf(item.getQuantity())));
            }
            return total;
        }
    }
    
    // Minimize object allocations in hot paths
    public void updateOrderStatuses(List<Order> orders, OrderStatus newStatus) {
        // Pre-size the list so it never has to grow
        List<String> orderIds = new ArrayList<>(orders.size());
        
        for (Order order : orders) {
            if (order.getStatus() != newStatus) {
                order.setStatus(newStatus);
                orderIds.add(order.getId());
            }
        }
        
        if (!orderIds.isEmpty()) {
            batchUpdateDatabase(orderIds, newStatus);
        }
    }
    
    // Use primitive streams where possible. doubleValue() trades exactness
    // for speed: fine for reporting, not for monetary bookkeeping.
    public OptionalDouble calculateAverageOrderValue(List<Order> orders) {
        return orders.stream()
            .mapToDouble(order -> order.getTotalAmount().doubleValue())
            .average();
    }
}

// Performance monitoring and profiling
@Aspect // required for Spring AOP to apply the @Around advice below
@Component
public class PerformanceMonitor {
    
    private final JfrRecordingManager jfrManager;
    private final MeterRegistry meterRegistry;
    
    @EventListener
    public void startupComplete(ApplicationReadyEvent event) {
        // Start JFR recording for continuous profiling
        jfrManager.startRecording("production-profile", Duration.ofHours(1));
    }
    
    // Method-level performance tracking
    @Around("@annotation(PerformanceMonitored)")
    public Object monitorPerformance(ProceedingJoinPoint joinPoint) throws Throwable {
        String methodName = joinPoint.getSignature().getName();
        Timer.Sample sample = Timer.start(meterRegistry);
        
        try {
            Object result = joinPoint.proceed();
            sample.stop(Timer.builder("method.execution")
                .tag("method", methodName)
                .tag("status", "success")
                .register(meterRegistry));
            return result;
        } catch (Throwable e) {
            sample.stop(Timer.builder("method.execution")
                .tag("method", methodName)
                .tag("status", "error")
                .register(meterRegistry));
            throw e;
        }
    }
}

Horizontal Scaling Patterns

Load Balancing and Auto-Scaling

# Kubernetes Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
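  # Pods metrics are not built in; they must be served by a custom metrics
  # adapter (for example, Prometheus Adapter)
  - type: Pods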
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

Application-Level Scaling Strategies:

// Stateless service design for horizontal scaling
@Service
public class StatelessOrderService {
    
    // No mutable instance state - only injected, thread-safe collaborators
    private final OrderRepository orderRepository;
    private final PaymentService paymentService;
    private final EventPublisher eventPublisher;
    
    public Order processOrder(ProcessOrderCommand command) {
        // All state passed as parameters
        return doProcessOrder(command, Instant.now(), UUID.randomUUID());
    }
    
    private Order doProcessOrder(ProcessOrderCommand command, 
                                Instant timestamp, 
                                UUID correlationId) {
        // Stateless processing - can be scaled horizontally
        Order order = Order.create(command, timestamp);
        
        PaymentResult payment = paymentService.process(
            command.getPaymentDetails(), correlationId);
            
        if (payment.isSuccessful()) {
            order.confirm();
            orderRepository.save(order);
            eventPublisher.publish(new OrderProcessedEvent(order, correlationId));
        } else {
            order.reject(payment.getErrorMessage());
        }
        
        return order;
    }
}

// Distributed session management
// Spring Session stores WebFlux sessions in Redis and registers the
// Redis-backed WebSessionManager itself, so no manual wiring is needed
@Configuration
@EnableRedisWebSession(maxInactiveIntervalInSeconds = 1800) // 30 minutes
public class SessionConfig {
    
    @Bean
    public ReactiveRedisConnectionFactory redisConnectionFactory() {
        // Host and port are placeholders for the real Redis endpoint
        return new LettuceConnectionFactory("redis", 6379);
    }
}

// Circuit breaker pattern for resilient scaling
@Service
public class ResilientOrderService {
    
    private final CircuitBreaker paymentCircuitBreaker;
    private final CircuitBreaker inventoryCircuitBreaker;
    
    public Mono<Order> processOrder(CreateOrderRequest request) {
        return Mono.just(request)
            .flatMap(this::validateOrder)
            .flatMap(this::processPaymentWithCircuitBreaker)
            .flatMap(this::reserveInventoryWithCircuitBreaker)
            .flatMap(this::createOrder)
            .timeout(Duration.ofSeconds(10))
            // retry() resubscribes to the whole chain, payment included, so it is
            // only safe if the payment call is idempotent (e.g. an idempotency key)
            .retry(2);
    }
    
    private Mono<PaymentResult> processPaymentWithCircuitBreaker(CreateOrderRequest request) {
        return Mono.fromSupplier(() -> 
            paymentCircuitBreaker.executeSupplier(() -> 
                paymentService.processPayment(request.getPayment())
            )
        )
        .subscribeOn(Schedulers.boundedElastic()) // processPayment blocks in this variant
        // Resilience4j signals an open breaker with CallNotPermittedException
        .onErrorResume(CallNotPermittedException.class, 
            e -> Mono.just(PaymentResult.serviceUnavailable()));
    }
}
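For readers who have not looked inside a circuit breaker, the state machine is small. The sketch below is a deliberately simplified, hand-rolled version that counts consecutive failures, whereas libraries such as Resilience4j use sliding windows and failure-rate thresholds. `SimpleCircuitBreaker` and its parameters are assumptions for illustration, and the clock is injected so the behavior can be exercised without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // current time in milliseconds
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Current state; an OPEN breaker becomes HALF_OPEN once the wait has elapsed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Runs the action unless the breaker is open; falls back on rejection or failure.
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // fail fast without touching the dependency
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        // A failed trial call in HALF_OPEN reopens the breaker immediately
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

One simplification to be aware of: this version lets any number of concurrent trial calls through while half-open, where production libraries typically permit only a limited number.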

Database Scaling Patterns

// Read replicas and write/read separation
@Configuration
public class DatabaseScalingConfig {
    
    @Bean
    @Primary
    public DataSource writeDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://write-master:5432/orders");
        config.setMaximumPoolSize(20);
        config.setMinimumIdle(5);
        config.setConnectionTimeout(30000);
        config.setIdleTimeout(600000);
        config.setMaxLifetime(1800000);
        return new HikariDataSource(config);
    }
    
    @Bean
    @Qualifier("readOnly")
    public DataSource readOnlyDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://read-replica:5432/orders");
        config.setMaximumPoolSize(40); // More connections for reads
        config.setReadOnly(true);
        return new HikariDataSource(config);
    }
}

// Repository with read/write separation
@Repository
public class ScalableOrderRepository {
    
    private final JdbcTemplate writeTemplate;
    private final JdbcTemplate readTemplate;
    
    public ScalableOrderRepository(@Qualifier("writeDataSource") DataSource writeDs,
                                  @Qualifier("readOnly") DataSource readDs) {
        this.writeTemplate = new JdbcTemplate(writeDs);
        this.readTemplate = new JdbcTemplate(readDs);
    }
    
    // Write operations go to master
    @Transactional
    public Order save(Order order) {
        String sql = "INSERT INTO orders (id, customer_id, total, status) VALUES (?, ?, ?, ?)";
        writeTemplate.update(sql, order.getId(), order.getCustomerId(), 
            order.getTotal(), order.getStatus());
        return order;
    }
    
    // Read operations go to replica
    @Transactional(readOnly = true)
    public List<Order> findByCustomerId(String customerId) {
        String sql = "SELECT * FROM orders WHERE customer_id = ? ORDER BY created_at DESC";
        return readTemplate.query(sql, orderRowMapper, customerId);
    }
    
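    // Analytics queries share the read replica here; a dedicated analytics
    // replica would need its own DataSource and JdbcTemplate
    @Transactional(readOnly = true)
    public OrderAnalytics getOrderAnalytics(LocalDate from, LocalDate to) {
        // A half-open range on the raw column keeps an index on created_at usable;
        // wrapping the column in DATE() would prevent that
        String sql = """
            SELECT 
                COUNT(*) as order_count,
                SUM(total) as total_revenue,
                AVG(total) as avg_order_value
            FROM orders 
            WHERE created_at >= ? AND created_at < ?
            """;
        return readTemplate.queryForObject(sql, analyticsRowMapper, 
            from.atStartOfDay(), to.plusDays(1).atStartOfDay());
    }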
}

// Database sharding strategy
@Service
public class ShardedOrderService {
    
    private final Map<String, OrderRepository> shardRepositories;
    private final ConsistentHashRouter hashRouter;
    
    public Order save(Order order) {
        String shardKey = determineShardKey(order);
        OrderRepository repository = getRepositoryForShard(shardKey);
        return repository.save(order);
    }
    
    public Optional<Order> findById(String orderId) {
        String shardKey = extractShardKeyFromId(orderId);
        OrderRepository repository = getRepositoryForShard(shardKey);
        return repository.findById(orderId);
    }
    
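    // Orders are sharded by customer ID, so a customer's history lives on one shard
    public List<Order> findByCustomerId(String customerId) {
        OrderRepository repository = getRepositoryForShard(hashRouter.route(customerId));
        return repository.findByCustomerId(customerId);
    }
    
    // Queries that do not include the shard key require scatter-gather.
    // findByStatus is an illustrative repository method, not shown elsewhere.
    public List<Order> findByStatus(OrderStatus status) {
        return shardRepositories.values().parallelStream()
            .map(repo -> repo.findByStatus(status))
            .flatMap(List::stream)
            .sorted(Comparator.comparing(Order::getCreatedAt).reversed())
            .collect(toList());
    }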
    
    private String determineShardKey(Order order) {
        // Shard by customer ID for data locality
        return hashRouter.route(order.getCustomerId());
    }
}
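The `ConsistentHashRouter` used above is not shown in this part. One common way to build such a router is a hash ring with virtual nodes, sketched below under that assumption. The class name, the choice of MD5 as the ring hash, and the number of virtual nodes are all illustrative rather than the series' actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

final class ConsistentHashRouterSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRouterSketch(Collection<String> shards, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        shards.forEach(this::addShard);
    }

    // Each shard is placed on the ring many times to even out the key distribution.
    void addShard(String shard) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(shard + "#" + i), shard);
        }
    }

    void removeShard(String shard) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(shard + "#" + i));
        }
    }

    // A key belongs to the first virtual node at or after its hash, wrapping around.
    String route(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no shards configured");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of an MD5 digest, used only for distribution, not security.
    private static long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(value.getBytes(StandardCharsets.UTF_8));
            long result = 0;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (digest[i] & 0xFF);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The property that matters for resharding is that removing a shard only remaps the keys that lived on it; every other key keeps its shard. A plain `hash(key) % shardCount` scheme would instead move most keys whenever the shard count changes.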

Load Testing and Performance Validation

Comprehensive Load Testing Strategy

// Programmatic load test (a JMeter-style scenario in code) for CI/CD integration.
// A test class is run by the test framework, so it does not need @Component.
public class LoadTestSuite {
    
    private final WebTestClient webTestClient;
    private final ExecutorService executorService;
    
    @Test
    public void loadTestOrderCreation() {
        int concurrentUsers = 100;
        int requestsPerUser = 100;
        Duration testDuration = Duration.ofMinutes(10);
        
        List<CompletableFuture<TestResult>> futures = IntStream.range(0, concurrentUsers)
            .mapToObj(userId -> CompletableFuture.supplyAsync(() -> 
                simulateUser(userId, requestsPerUser, testDuration), executorService))
            .collect(toList());
            
        // Collect results
        List<TestResult> results = futures.stream()
            .map(CompletableFuture::join)
            .collect(toList());
            
        // Validate performance metrics
        TestMetrics metrics = TestMetrics.calculate(results);
        assertThat(metrics.getAverageResponseTime()).isLessThan(Duration.ofMillis(500));
        assertThat(metrics.get99thPercentile()).isLessThan(Duration.ofSeconds(2));
        assertThat(metrics.getErrorRate()).isLessThan(0.01); // < 1% error rate
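        // 100 users with a mean think time of roughly 550 ms cap out near 180 req/sec,
        // so the throughput floor has to sit below that ceiling
        assertThat(metrics.getThroughput()).isGreaterThan(90); // > 90 req/sec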
    }
    
    private TestResult simulateUser(int userId, int requests, Duration duration) {
        List<Duration> responseTimes = new ArrayList<>();
        int errors = 0;
        
        Instant endTime = Instant.now().plus(duration);
        int requestCount = 0;
        
        while (Instant.now().isBefore(endTime) && requestCount < requests) {
            try {
                Instant start = Instant.now();
                
                CreateOrderRequest request = generateTestOrder(userId, requestCount);
                webTestClient.post()
                    .uri("/api/orders")
                    .bodyValue(request)
                    .exchange()
                    .expectStatus().isCreated();
                    
                Duration responseTime = Duration.between(start, Instant.now());
                responseTimes.add(responseTime);
                
                // Realistic user behavior - think time
                Thread.sleep(ThreadLocalRandom.current().nextInt(100, 1000));
                
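            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop this simulated user
                Thread.currentThread().interrupt();
                break;
            } catch (Exception | AssertionError e) {
                // expectStatus() reports failures as AssertionError, not Exception
                errors++;
            }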
            requestCount++;
        }
        
        return new TestResult(userId, requestCount, responseTimes, errors);
    }
}

// Performance regression testing
@SpringBootTest
public class PerformanceRegressionTest {
    
    @Test
    public void validatePerformanceBaseline() {
        // Load baseline metrics from previous runs
        PerformanceBaseline baseline = loadBaseline();
        
        // Run current performance test
        PerformanceMetrics current = runPerformanceTest();
        
        // Validate no significant regression
        assertThat(current.getAverageResponseTime())
            .isLessThan(baseline.getAverageResponseTime().multipliedBy(110).dividedBy(100)); // 10% tolerance
            
        assertThat(current.getThroughput())
            .isGreaterThan(baseline.getThroughput().multipliedBy(90).dividedBy(100)); // 10% tolerance
            
        assertThat(current.getMemoryUsage())
            .isLessThan(baseline.getMemoryUsage().multipliedBy(120).dividedBy(100)); // 20% tolerance
    }
    
    @AfterEach
    public void savePerformanceBaseline() {
        if (isMainBranch() && allTestsPassed()) {
            PerformanceMetrics current = getCurrentMetrics();
            baselineRepository.save(current);
        }
    }
}
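
The load test above relies on a `TestMetrics.calculate` helper that is not shown. The following is a minimal sketch of how the average, 99th percentile, error rate, and throughput could be derived from the per-user results. The record shapes, accessor names, and the extra `wallClock` parameter (the measured duration of the whole run) are assumptions for this example; the original helper takes only the result list and may compute these values differently.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical shape of the per-user result returned by simulateUser
record TestResult(int userId, int requestCount, List<Duration> responseTimes, int errors) {}

// Hypothetical aggregate; throughput is successful requests per second
record TestMetrics(Duration averageResponseTime, Duration p99, double errorRate, double throughput) {

    static TestMetrics calculate(List<TestResult> results, Duration wallClock) {
        List<Duration> all = new ArrayList<>();
        long requests = 0;
        long errors = 0;
        for (TestResult r : results) {
            all.addAll(r.responseTimes());
            requests += r.requestCount();
            errors += r.errors();
        }
        if (all.isEmpty() || requests == 0 || wallClock.isZero() || wallClock.isNegative()) {
            throw new IllegalArgumentException("need successful requests and a positive duration");
        }

        long totalNanos = 0;
        for (Duration d : all) {
            totalNanos += d.toNanos();
        }
        Duration average = Duration.ofNanos(totalNanos / all.size());

        // Nearest-rank percentile: rank = ceil(0.99 * n), using integer math
        Collections.sort(all);
        int rank = (int) ((99L * all.size() + 99) / 100);
        Duration p99 = all.get(rank - 1);

        double errorRate = (double) errors / requests;
        double throughput = all.size() / (wallClock.toNanos() / 1_000_000_000.0);
        return new TestMetrics(average, p99, errorRate, throughput);
    }
}
```

Pooling all response times before taking the percentile matters: averaging each user's own 99th percentile would hide the slowest requests, which are usually what a tail-latency threshold is meant to catch.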

Conclusion: Building for Scale

Performance and scalability aren't achieved through single optimizations—they require a systematic approach across all layers:

  • Reactive Programming: Event-driven, non-blocking processing for maximum throughput
  • Intelligent Caching: Multi-level strategies with consistency patterns
  • JVM Optimization: Memory management, GC tuning, and JIT-friendly code
  • Horizontal Scaling: Stateless design, load balancing, and database scaling
  • Continuous Validation: Load testing and performance regression detection

These patterns work together to create systems that perform well under load and scale gracefully as demand grows.

In Part 5, we'll complete the series with production considerations: deployment strategies, container optimization, monitoring, and operational excellence practices that keep high-performance systems running reliably in production.

Coming Next:

  • Container optimization and deployment strategies
  • Production monitoring and alerting
  • Operational excellence and incident response
  • Capacity planning and cost optimization

This is Part 4 of "The Complete Guide to Modern Java Architecture." Complete the series with [Part 5: Production Considerations] for deployment and operational excellence.

Download the companion code examples and architecture templates at: GitHub Repository