From AI skeptic to advocate

"Everyone starts as a skeptic at the beginning, and the only difference is when you start not being a skeptic anymore. And for me, that was maybe two years ago or so."

Key takeaways

  • Implementing AI across all functions (PM, design, engineering, QA) creates broader efficiency gains than just focusing on engineering.
  • Legacy systems create AI barriers. Teams working on newer codebases see significantly better AI ROI than those maintaining 12-year-old legacy systems.
  • Leadership must actively re-engage skeptics. It's not enough to provide tools. Leaders need to help engineers overcome early negative experiences with AI.
  • Context window limitations are real. Large, complex codebases bump against fundamental LLM limitations, requiring sophisticated management strategies.
  • Individual curiosity trumps seniority. Personal learning agility and curiosity matter more than traditional experience levels for AI adoption success.
  • Operational burden creates AI opportunity. Super apps and complex platforms that require significant operational overhead are ideal candidates for AI-powered toil reduction.

About

I work as a software engineer at Coinbase, but my background is pretty diverse. I've got experience in software engineering, system engineering, and Linux administration. I've been a big Linux fan since I was a teenager, which gives me a really solid foundation in systems thinking.

I've been on call for many years and have worked in some interesting places - including a police station more than 10 years ago. This systems and infrastructure background shapes how I think about AI and production systems.

How has your journey with AI been from being a skeptic to a believer?

My transformation follows a pattern that's probably familiar to a lot of experienced engineers. I started as a complete skeptic - when I first saw AI technologies, I thought what they promised was impossible and that they wouldn't amount to anything.

But about two years ago, something clicked. I had this realization that there actually was potential and a real future with AI. That's when my curiosity kicked in and I wanted to learn more.

"Everyone starts as a skeptic at the beginning, and the only difference is when you start not being a skeptic anymore. And for me, that was maybe two years ago or so."

While I acknowledge that AI is useful for generating code, I think that's just scratching the surface. I see a lot of unexplored areas where AI could make engineers' lives much easier.

Some of the areas I'm excited about include fixing code that other people wrote, debugging production systems that are misbehaving, improving code quality, adding tests to existing code, and automating manual tasks that machines could handle better than humans.

"Creating code is just the tip of the iceberg of what you can do. There's so many still unexplored areas."

What are some interesting learnings you ran into, while adopting AI?

I've learned that specificity is absolutely critical when working with AI tools. The lazier you are up front about describing what you want, the more time you'll spend later fixing or changing what you get back.

I've also realized that you have to learn how to "speak to AI" - it's almost like learning a new communication style. You can't just throw vague requests at it and expect good results.

"The more lazy you are at the beginning with the description of what you get, the more time you will have to spend later to fix or change what you get"

In my experience, there's still a huge gap between what AI generates and what you can actually run in production. And humans play a big role in bridging that gap.

One specific problem I've noticed is that AI tends to generate too much code - it can be verbose and include more than you actually need. In my view, less code is better because it means fewer chances for bugs and easier maintenance.

"From something that is generated from AI to something that can run in production, there's a big gap still that is not covered, and humans have a big role in that"

What skills are becoming more important because of AI?

I predict that code review skills are going to become much more important than code writing skills. As AI generates more code, engineers will need to get really good at reading, understanding, and evaluating code rather than just writing it from scratch.

This is a pretty fundamental shift in what it means to be a software engineer. Instead of spending most of your time writing code, you might spend most of your time reviewing and understanding code that something else wrote.

"One ability people are gonna develop more is to be able to read and review code more than writing. And that's a very important skill that we need to think about more than ever"

One area where I've found AI really valuable is understanding new codebases. When I'm working with code I've never seen before, I can ask AI to help me understand what different parts do.

This works well for both experienced engineers exploring new domains and newcomers trying to get up to speed on complex systems. It's like having a knowledgeable colleague who can explain unfamiliar code.

"I use it for learning some new code base that I've never worked on. Like help me understand what this thing does"

What are your concerns about AI and security?

I have some real concerns about security vulnerabilities in AI-generated code. The issue is that AI systems are trained on existing code samples, and a lot of that training data includes buggy code.

Since AI learns from open source code, and open source code contains bugs and security vulnerabilities, there's a risk that AI will perpetuate and spread these problems. I'm worried we might see more vulnerabilities in the future because of this.

"I'm thinking potentially you can have security bugs in code that is generated, because it has been trained on samples that were not maybe that good. It learns from open source code mostly. Open source code contains bugs too."

Interestingly, I see the potential security issues as both a concern and a business opportunity. While it's obviously problematic if AI generates more vulnerable code, it also creates opportunities for security companies and tools.

If AI is going to create more security problems, then there will be more demand for solutions to detect and fix those problems. It's an interesting way to think about the second-order effects of AI adoption.

"Maybe for the security industry it's a big concern, but also a big business opportunity."

How are you planning to use AI for your production systems?

When it comes to production systems and incident response, I want AI to provide relevant information at the right time, especially during incidents. I'm not looking for more data. I want the right data.

The key difference, in my view, is between analyzing static code (which is relatively easy) and dealing with a production system that's misbehaving when you have limited time to fix it. That's a much more stressful and complex scenario.

"As an engineer, I would like to know more of the relevant information right now. What's going on, especially during an incident."

How should organizations be thinking about using AI for their production systems?

I suggest that organizations should look at their past outage data and try to learn from it. I think companies have a lot of really good data that isn't being used effectively.

By analyzing historical incidents, AI systems could potentially identify patterns and gaps that would help during future incidents. It's about learning from what's happened before to be better prepared for what might happen next.

"I would look at the past outages, past data that an organization has, and try to learn from that. Companies have a lot of really good data that is not harnessed well enough."

Signal quality is really important here. In my view, it's often better for an AI system to say nothing than to give you information that's wrong, especially in high-pressure production scenarios.

In production environments, false positives can be more harmful than no information at all.

"Maybe it's better not say anything if you're not sure, instead of giving something that is not true."

Based on my experience, I would focus on avoiding scenarios where AI systems could:

  • Create false alarms that wake people unnecessarily
  • Mislead incident investigations
  • Make people waste time during critical situations
  • Provide incorrect information during high-pressure moments
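One simple way to act on that principle is to gate what the AI surfaces: stay silent unless a finding is backed by concrete evidence and clears a confidence bar. This is just a sketch of the idea, with every name hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    summary: str         # e.g. "error rate spike correlates with deploy abc123"
    confidence: float    # 0.0-1.0, however the upstream system scores it
    evidence: list[str]  # log lines, metrics, or deploy events backing the claim

def should_surface(finding: Finding, threshold: float = 0.8) -> bool:
    """Suppress low-confidence or unsupported findings instead of paging on them."""
    return bool(finding.evidence) and finding.confidence >= threshold

noisy = Finding("possible DNS issue", confidence=0.4, evidence=[])
solid = Finding("error spike started with deploy abc123", confidence=0.92,
                evidence=["deploy abc123 at 02:13", "5xx rate x8 at 02:14"])

assert not should_surface(noisy)  # better to stay silent
assert should_surface(solid)      # worth telling the on-call engineer
```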

My team has actually built some small AI tools for incident management. One example is a tool that summarizes the status of an incident and explains what's going on. It provides a useful summary of conversations during incidents, which can be helpful as a starting point.
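The tool itself is internal, but the shape is simple enough to sketch: collect the incident channel's messages and ask a chat model for a status summary, with an explicit instruction not to guess. This again assumes the OpenAI Python client; the plumbing around it is hypothetical.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_incident(channel_messages: list[str]) -> str:
    """Turn a raw incident-channel transcript into a short status summary."""
    transcript = "\n".join(channel_messages)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[
            {"role": "system",
             "content": ("Summarize this incident channel: current status, impact, "
                         "what has been tried, and open questions. If something is "
                         "unknown, say so rather than guessing.")},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```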

What's your most memorable on-call experience?

More than 10 years ago, when I was working at a police station, it felt like the entire internet went down around 2-3 AM, and we spent all night trying to figure out what was wrong.

Eventually, we discovered it was a Linux kernel bug related to leap seconds - they're only inserted once in a while, but they can completely mess up the kernel. It took us hours to figure out, but it was the kind of complex, environmental issue that makes for memorable war stories.
