Industrial Scientific shrinks AWS bill by 13% and simplifies SRE.

Homepage
indsci.com
TL;DR
Goals & challenges
  1. Reachability troubleshooting in a complex AWS environment,
  2. Cost control and cost troubleshooting,
  3. Lean SRE team with growing responsibilities.
The solution
  1. Context-rich cost troubleshooting capabilities with direct and indirect costs,
  2. Non-intrusive platform with cross-account visibility,
  3. Expansion opportunities to cover security & compliance use cases.
Results
  1. Reduced the monthly AWS bill by 13% within two months of using Stream,
  2. Sped up troubleshooting from hours of manual steps to minutes,
  3. Potential to further reduce AWS bills with additional optimization opportunities.

The customer

Since 1985, Industrial Scientific has been a safety company focused on eliminating death on the job, first with cutting-edge gas detection technologies and now with asset management, remote monitoring, productivity tracking, and other software applications.

From the International Space Station to mines deep inside the earth, men and women around the world bet their lives on the technologies Industrial Scientific's teams have created, including the first three-gas detector, six-gas detector, and wireless gas detector.

At any given time, hundreds of thousands of people are betting their lives on the work Industrial Scientific does as a company. Industrial Scientific has 16+ years of cloud experience and relies on AWS and Stream (along with other technologies) for their critical IT infrastructure today.

Industrial Scientific has been growing its AWS footprint and is looking to expand to a multi-region topology to support its global operations.

"Our AWS infrastructure helps us keep our wearable gas detection instruments up and running across the world. Every day, human lives depend on the reliability and monitoring capabilities we provide."
Rob Surrena, Senior Site Reliability Engineer at Industrial Scientific

The challenge

The Industrial Scientific team is adept at using cloud technology and previously moved its on-premises data center to AWS for improved reliability and 24x7 monitoring. Moving to the cloud opened up many opportunities for the business, yet also introduced additional complexity for daily operations.

The Site Reliability Engineering (SRE) team quickly noticed that they needed better technology to respond to interesting audit patterns, emerging trends and reachability issues in their environment. While useful, the native tools didn’t provide granular and actionable insights into the availability and cost of their AWS topology. The team had to resort to manual scripting and manual reviews of the AWS console, but they were aware that these methods would not scale, especially given how lean the team is and how their infrastructure is growing.

Cloud cost optimization is top of mind for all organizations and Industrial Scientific is no exception. Cost optimization happened ad hoc, on the spur of the moment, and the team would spend hours trying to understand AWS Cost Explorer and other native tools.

The solution

“Seeing how our infrastructure is laid out on AWS with Stream, and being able to get all the answers we need for audits and optimization opportunities is huge.”
Rob Surrena, Senior Site Reliability Engineer at Industrial Scientific

When first introduced to Stream, the SRE team pushed back on committing to a proof of concept because of their day-to-day operational workload. Yet, as soon as they started the trial, the team discovered that Stream was invaluable for taking care of different aspects of their AWS environment, which matters especially for a small team. Stream simplified many of the daily tasks and helped the team get more done in a repeatable way.

When it comes to cost optimization, Stream reveals all direct and indirect costs for every resource, so the SRE team can resolve cost issues in seconds or minutes. During the trial, the team also discovered many cost optimization opportunities in their AWS environment and realized that they could further reduce their AWS bill. The team started to implement Stream’s cost optimization suggestions (such as migrating EBS volumes from gp2 to gp3 and removing unused RDS resources) and reduced their monthly AWS bill by 13% during the first two months with Stream.
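
As an illustration of the gp2-to-gp3 suggestion mentioned above, the following minimal sketch lists gp2 EBS volumes and estimates the monthly storage saving from moving each one to gp3. It is not how Stream works internally. The input dictionaries only mirror the `VolumeId`, `VolumeType` and `Size` fields that the EC2 `DescribeVolumes` API returns; the volume IDs and the per-GiB prices are placeholder assumptions, so substitute the current prices for your region.

```python
def gp2_migration_candidates(volumes, gp2_price_per_gib, gp3_price_per_gib):
    """Return (volume_id, size_gib, estimated_monthly_saving) for each gp2 volume.

    Only the storage price is compared; any extra provisioned IOPS or
    throughput on gp3 is ignored in this sketch.
    """
    candidates = []
    for vol in volumes:
        if vol["VolumeType"] != "gp2":
            continue  # only gp2 volumes are migration candidates
        saving = vol["Size"] * (gp2_price_per_gib - gp3_price_per_gib)
        candidates.append((vol["VolumeId"], vol["Size"], round(saving, 2)))
    return candidates


# Hypothetical inventory, shaped like entries of a DescribeVolumes response.
volumes = [
    {"VolumeId": "vol-aaa", "VolumeType": "gp2", "Size": 500},
    {"VolumeId": "vol-bbb", "VolumeType": "gp3", "Size": 200},
    {"VolumeId": "vol-ccc", "VolumeType": "gp2", "Size": 100},
]

# Placeholder prices in USD per GiB-month (assumptions, not quoted rates).
candidates = gp2_migration_candidates(volumes, 0.10, 0.08)
for volume_id, size_gib, saving in candidates:
    print(volume_id, size_gib, saving)
```

Passing the prices as parameters keeps the estimate honest: the result is only as accurate as the rates supplied by the caller.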

Before working with Stream, the team was limited to native AWS tools to handle cross-account reachability and cost issues. These processes were all manual and the data wasn’t presented clearly, unlike in Stream. How Stream presents the data has proven key to Industrial Scientific’s success: being able to click on a resource to retrieve all related configuration and components is very helpful.

The team appreciated the non-intrusiveness of Stream and how the platform brought visibility with context to all AWS accounts, tracking cost, availability and security in the same view instead of dedicated point solutions.  

As a result of working with Stream, the SRE team at Industrial Scientific gained:  

  • Simplified daily operations thanks to the contextual visibility into AWS,
  • Superior cost troubleshooting and optimization capabilities,  
  • Superior incident management capabilities as a force multiplier for security, availability, and cost.

Quantifiable benefits for the Industrial Scientific team include:  

  • Reduced the monthly AWS bill by 13%, with potential for further reduction.  
  • Reduced the time it takes to troubleshoot cost and reachability from hours to minutes.

“Stream was pivotal in our ability to monitor and manage AWS costs. The platform made it very easy to understand how our cost is structured with direct and indirect costs, and how we can optimize it.
For example, the direct cost of an ECS Cluster might be a dollar, but the total indirect cost of that cluster might be a hundred dollars because of the EC2 Instances in it.”
Rob Surrena, Senior Site Reliability Engineer at Industrial Scientific
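
The direct versus indirect cost distinction in the quote above can be sketched as a simple roll-up over a resource tree. This is a minimal illustration under stated assumptions, not Stream's actual cost model: the `Resource` class, the resource names and the 60/40 split between the two EC2 instances are hypothetical, chosen only so the totals match the one-dollar and hundred-dollar figures in the quote.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical resource node: its own cost plus the resources it contains."""
    name: str
    direct_cost: float  # monthly cost billed to this resource itself
    children: list["Resource"] = field(default_factory=list)


def indirect_cost(resource: Resource) -> float:
    """Sum the total cost of every resource contained in this one."""
    return sum(total_cost(child) for child in resource.children)


def total_cost(resource: Resource) -> float:
    """Direct cost of the resource plus everything rolled up from its children."""
    return resource.direct_cost + indirect_cost(resource)


# Illustrative numbers only: an ECS cluster with two EC2 instances in it.
cluster = Resource(
    name="ecs-cluster",
    direct_cost=1.0,
    children=[
        Resource("ec2-instance-a", 60.0),
        Resource("ec2-instance-b", 40.0),
    ],
)

print(indirect_cost(cluster))  # 100.0
print(total_cost(cluster))     # 101.0
```

Looking only at the direct cost would make the cluster appear nearly free, while the roll-up shows where the spend actually sits.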

Working with Stream

The Industrial Scientific and Stream teams have formed a close partnership from the start. Both teams brought extraordinary expertise and collaborated on making this a successful project.

"I see a lot of potential in the visibility capabilities that Stream brings. The team has been very responsive and knowledgeable so far. How Stream represents information with the right context is extremely helpful. What’s also great is the platform is not intrusive, it gives us visibility into all our AWS accounts for cost, security and reachability aspects. The product can pay for itself with the cost improvements we’ve implemented within two months of implementation."
Rob Surrena, Senior Site Reliability Engineer at Industrial Scientific

Next, Industrial Scientific is looking to increase its Stream usage by adopting:  

  • Simulation capabilities for Terraform in their CI/CD processes,
  • Network log and VPC flow log forwarding to Stream,
  • Custom architectural standards to enforce internal best practices.