DevOps From Scratch

How a Health and Wellness SaaS Startup improved Software Delivery

Customer Overview

Industry Focus: Employee Health and Wellness

Size and Offering: Startup SaaS

Targeted Customers: Small to Medium sized US-based businesses, Insurance Provider Channel Partners

Challenge: Manual Technology Delivery

Solution: Automate and improve technology delivery implementing iterative DevOps

Result: Drastic time reduction of Software Delivery, 99.99% uptime

The Challenge

As a Technology-First provider in an industry dominated by Service-First vendors, the Company's main appeal to targeted customers is fast delivery of new offerings, broad customization options, and rapid feedback-based iterations.

As customer demand increased, the Company's ability to deliver new value to customers started to lag. New features were still completed quickly in short iterative development cycles. However, first customer interactions fell further behind, expanding lead time. The culprit: scaling didn't extend equally to existing delivery processes, and limited resource availability and expertise delayed the delivery of these features.

Spirebase aligned stakeholders around a shared plan to determine and implement the appropriate DevOps strategy, with the following goals:

  • Maintain full compliance with HIPAA and GDPR
  • Significantly improve Feature Delivery lead times
  • Achieve Zero-Down-Time Deployments, even during peak traffic times
  • Maintain 100% uptime, even during Database schema change deployments
  • Optimize Amazon Web Services (AWS) resource usage and costs
  • Multi-Datacenter availability
  • Longer term scaling for Multi-Cloud and On-Premise options

The Stack

Core technology of the Company:

  • Cloud: Amazon Web Services (AWS)
  • OS: Windows with Active Directory
  • Database: MS SQL Server and Postgres (mixed single and multi tenancy)
  • Middle Tier: Java, Spring Boot, ~25 Microservices
  • Front end: React, native JavaScript, legacy jQuery/Handlebars
  • Mobile: iOS and Android, backend APIs

Approach

Spirebase worked with the Company to design a roadmap that covered their current technology delivery elements and stakeholder goals, all aligned to ensure success. The 3-step methodology consisted of:

  1. Determine the throughput baseline. Map existing processes to incorporate automation. Implement an "As Code" operation.
  2. Strengthen the stack. Improve on weaknesses in the existing DevOps stack, adapting the SDLC to scale to the Company's needs.
  3. Expand. Implement "NewGen" DevOps approaches. Integrate improved analytics and insights into a "Customer Value Delivery" process.

Each step of this process focused on how each adjustment improved processes: retiring actions that didn't scale, and shedding manual steps that were more effectively automated, as team members adapted to and benefited from new skills and processes.

By following this model, impact to the current Technology operation was minimized from the start. Compliance and existing commitments were maintained. The Company emerged with improved DevOps processes.

Implementation

Initial rollout: New CI/CD tooling was selected. This facilitated the replication of existing processes and scripts where applicable, creating a fully automated CI/CD ecosystem.

  • Artifact Storage: Nexus3
  • Build Management: Headless Jenkins
  • Provisioning: HashiCorp Terraform, Ansible

Secondary phase: Focus on strengthening the SDLC. Doing so provided better observability, and laid a foundation for future, scalable expansion.

  • SDLC: Jira, Bitbucket, Crowd, Confluence
  • Security: SonarQube
  • Analytics: ELK

Regulatory challenges: The Atlassian stack was installed on-premise. We minimized potential risks of data exposure and leakage while meeting or exceeding compliance requirements for HIPAA and GDPR.

Launching the Company's development highway: By leveraging Spirebase product lines, final implementation steps optimized deployment processes. The Company could better measure organizational effectiveness and use Jira Service Desk as an additional tool for optimizing customer requests and improving insights tracking.

  • Jira Service Desk
  • Orchestration: Spirebase "Launch" JVM Orchestration
  • Deployment Management: Spirebase "Route" Application Gateway
  • Intelligence: Spirebase "Velocity" Analytics

Business Impact

The Company grew organically from the start, taking a typical Agile & Sprint based CI/CD approach. Application Updates consisted of multiple independent changes grouped together and deployed in a single after-hours "big bang" deployment.

Spirebase started measuring DevOps efficiency and Organizational Efficiency on several levels to derive the impact of:

  • Current processes on the Company. Primary measure: What is their ability to ship code effectively?
  • Current processes on their Customers. Primary measure: How is value delivered to users?

Measuring these from the start allowed us to gain better insight into the improvements and benefits of a DevOps transformation.

Lifecycle Insights

Spirebase insight gathering started with a "Source Control Touch" event such as a pull request. Once merged back into the base branch, the change was tracked using Spirebase's Insight KPIs.

  1. Time to Artifacts - Total time until all required artifacts are ready.
  2. First Seen on Production - The first time the application is started in a production environment.
  3. Time to First Route - First user request is served by the newly deployed application version.
  4. Time to Meaningful Routes - Minimum threshold of approx 20% of total requests routed is reached.
  5. Time to All Traffic - Time until this application is promoted to serve all requests.
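The five KPIs above can be sketched as differences between milestone timestamps. This is a minimal illustration, not Spirebase's implementation: the `DeploymentTimeline` class, its field names, and the `lifecycle_kpis` function are hypothetical, and the sketch assumes each KPI is measured from the previous milestone, starting at the merge into the base branch.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DeploymentTimeline:
    """Hypothetical milestone timestamps for one merged change."""
    merged_at: datetime             # change merged into the base branch
    artifacts_ready_at: datetime    # all required artifacts are ready
    first_started_at: datetime      # new version first started in production
    first_route_at: datetime        # first user request served by the new version
    meaningful_routes_at: datetime  # new version reaches approx 20% of requests
    all_traffic_at: datetime        # new version promoted to serve all requests


def lifecycle_kpis(t: DeploymentTimeline) -> dict[str, timedelta]:
    """Return each KPI as the time elapsed since the previous milestone."""
    steps = [
        ("Time to Artifacts", t.merged_at, t.artifacts_ready_at),
        ("First Seen on Production", t.artifacts_ready_at, t.first_started_at),
        ("Time to First Route", t.first_started_at, t.first_route_at),
        ("Time to Meaningful Routes", t.first_route_at, t.meaningful_routes_at),
        ("Time to All Traffic", t.meaningful_routes_at, t.all_traffic_at),
    ]
    kpis = {name: end - start for name, start, end in steps}
    # Total lead time runs from the merge to full promotion.
    kpis["Total Time"] = t.all_traffic_at - t.merged_at
    return kpis
```

Recording raw timestamps and deriving the KPIs afterwards keeps the measurement independent of whichever tool emits the events.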

Capturing these insights allowed us to uncover hidden feedback for Development. This was used to tighten the Company's overall deployment strategy.

Before and After

All times are based on a change to a master branch of an application being deployed.

KPI                         Before       After
Time to Artifacts           3.7 weeks    30 min
First Seen on Production    + 3 hrs      + 21 min
Time to First Route         n/a          + 1.8 hrs
Time to Meaningful Routes   n/a          + 5.2 hrs
Time to All Traffic         n/a          + 11.9 hrs
Total Time                  3.7 weeks    <20 hrs
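As a quick check of the arithmetic, the "After" increments can be summed to confirm the stated total of under 20 hours. The sketch below uses only the figures from the table. The speedup calculation assumes that "3.7 weeks" means calendar weeks of seven 24-hour days, which the source does not state.

```python
from datetime import timedelta

# "After" column, with each entry as the increment over the previous milestone.
after = {
    "Time to Artifacts": timedelta(minutes=30),
    "First Seen on Production": timedelta(minutes=21),
    "Time to First Route": timedelta(minutes=108),        # 1.8 hrs
    "Time to Meaningful Routes": timedelta(minutes=312),  # 5.2 hrs
    "Time to All Traffic": timedelta(minutes=714),        # 11.9 hrs
}

total_after = sum(after.values(), timedelta())
print(total_after)  # 19:45:00

# Assumption: calendar weeks (7 days x 24 hours), not working weeks.
total_before = timedelta(weeks=3.7)
speedup = total_before / total_after
print(f"Approximate speedup: {speedup:.1f}x")
```

Under the calendar-week assumption the total lead time shrinks by a factor of roughly 31. Measured in working hours, the ratio would be smaller.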

Before Spirebase, each deployment followed a traditional "Agile-Big-Bang" release path. No production route strategy was possible: applications were deployed to all users at the same time.

After Spirebase, the overall Development lifecycle maintained an agile approach for time and stakeholder management. What changed was the deployment ability and strategy: deployments were available to be pushed immediately when ready.

  • over 2x Feature ticket completion per sprint
  • 36% reduction in Bug to Feature Ratios
  • 2.4 weeks (40% reduction) for customer feature customization

Additional Stats

Windows-to-Linux Migration

Spirebase maintained the Company's service at 100% uptime during the migration from a Windows Service based application structure to a Linux-based JVM cluster.

Cloud AWS Costs

The Company reduced overall AWS costs even as the number of virtual machines increased by a factor of 3 during normal operational loads (and by a factor of 5 during peak traffic conditions). Cost reductions came from a combination of effective selection of instance types, use of savings plans in place of instance reservations, and use of spot instances to handle peak traffic outliers.

  • Month 1 - Instance Type Optimization: 21% reduction in AWS bill
  • Month 7 - Final Restructure: 49% reduction in AWS bill

Database Optimization

The initial stack consisted of a single MS SQL Server instance and single AD Domain Controller. Spirebase recommended adding AD DC secondaries and SQL Server AlwaysOn with WSFC for automated failovers.

A distributed EHCache Cluster was deployed for all app servers, to further optimize application level database performance.

The results were easy to measure: High Availability support kept uptime at 100%, and average Edge API Endpoint response time fell from 130ms to 30ms.

Reproducibility

A true testament to "Infrastructure As Code" is the ability to recreate everything from scratch. This includes the entirety of cloud resources, SDLC tooling, DevOps specific tools and assets, applications, and their assets. Doing so efficiently allows a repeatable process from ZERO to everything in less than 2.5 hours.
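A from-scratch rebuild of this kind is typically driven by a short, ordered command sequence. The sketch below is illustrative only and is not the Company's actual pipeline. It assumes the Terraform and Ansible tooling named earlier; the function names and all paths are hypothetical.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def rebuild_plan(terraform_dir: Path, inventory: Path, playbook: Path) -> list[list[str]]:
    """Build the ordered commands for a from-scratch rebuild (paths are hypothetical)."""
    terraform = ["terraform", f"-chdir={terraform_dir}"]
    return [
        # Download providers and modules without prompting for input.
        terraform + ["init", "-input=false"],
        # Create all cloud resources without an interactive approval step.
        terraform + ["apply", "-input=false", "-auto-approve"],
        # Configure the provisioned hosts.
        ["ansible-playbook", "-i", str(inventory), str(playbook)],
    ]


def run_plan(plan: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in plan:
        subprocess.run(command, check=True)
```

Keeping the plan as data, separate from its execution, makes the sequence easy to review and to time step by step against a target such as the 2.5 hours quoted above.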

Are you ready to make the shift to application management
that is easier, intelligent and effective?
