How Building with AI Can Double the Throughput of Your Engineering Team — Brian Scanlan, Intercom

Channel: aiDotEngineer

Published at: 2026-05-15

YouTube video id: 4_VQBbs2iQA

Source: https://www.youtube.com/watch?v=4_VQBbs2iQA

[music]
>> Hey, I'm Brian from Intercom. This has been such a great conference so far. I've picked up so much alpha and inspiration from the talks and all the chats with people.
So Intercom is a 15-year-old, privately held Irish-American B2B SaaS startup that pivoted to being an AI company the weekend ChatGPT came out. We've got about 1,400 people across Dublin, London, Berlin, SF, Chicago, and Sydney. R&D is led from Dublin, and engineering is almost entirely across Europe, though forward-deployed engineers have changed that up a bit. This graph compares our revenue growth to the growth rates of publicly traded SaaS companies over the last few years.
You can see the publicly traded SaaS companies kind of on the way down, while Intercom has this amazing curve. We're bucking the standard trends that SaaS companies have been suffering from recently.
And now I'm going to shut Intercom down
live on stage.
I once did a live deployment during a
talk and I thought that was impressive.
So Intercom has become the poster child for companies redefining themselves in the age of AI. The New York Times recently did an article about SaaS companies reinventing themselves that prominently featured Intercom. And being an AI company means a lot more than slapping on lightweight wrappers or autocompleting some text field.
Our AI agent for customer support, Fin, has over 8,000 customers, industry-leading average resolution rates, and revenue approaching 100 million. It launched the day GPT-4 came out, and was actually the first product released on GPT-4. We've been building AI features since about 2018.
The modern LLMs have unlocked huge capabilities for dealing with customer support questions; that much is completely obvious. Companies like Anthropic, Snowflake, Linear, Glean, and LaunchDarkly use Fin for their customer support, so maybe SaaS isn't dead. And it works well for businesses of all sizes.
We also recently announced that we have our own model serving 100% of Fin's English text-based conversations, outperforming frontier models: cheaper, faster, better. We're at about 2 million resolutions a week. And we're happy to sell direct access to our suite of models.
I'm not talking about any of that, though. I'm a senior principal engineer at Intercom, I've been there for 12 years, and I'm in our platform group. We take care of Intercom's uptime, performance, security, cost management, observability, our majestic monolith applications that we love (mostly Ruby on Rails), and all internal developer productivity.
Another thing about Intercom is that we are obsessed with shipping. Shipping fast and iteratively is the best way to build high-quality products that customers love to use, so developer productivity is something we've always invested in. "Shipping is the heartbeat of your company" is a great blog post we published many years ago, and Honeycomb made cool stickers about it.
Obviously, for the last few years I've been spending a lot of time enabling the use of AI in our software development life cycle, so that's what I'm going to talk about. Unsurprisingly, we've been very excited about AI in general; we changed the whole company to build customer support using AI agents.
We've been impatient about driving its adoption and changing how we build across Intercom. We went down some familiar routes: we were all using GitHub Copilot, then everyone started adopting Cursor, and we looked at Augment and a few other things. But by the middle of last year we were dissatisfied with the results. There were some good signs, some tasks and some work made marginally better and more fun. But we're pretty aware of where the models and harnesses are going, and we've had a strong conviction, from many years ago, that AI is going to change all knowledge work.
So in the middle of last year we set a simple goal: let's double the throughput of engineering in a year. We measure a lot of things at Intercom: we run a lot of developer surveys, we use tools like DX. But we picked code changes per R&D person as the primary way of measuring productivity. Every measure is bad, and once you start measuring it, it stops being a good measure, and all that.
But we also expect the overall throughput to increase: if we're actually adopting new ways of working and putting AI into all the different places, then we should see a large throughput increase. So 2x is what we called this; it's the name of the project and the team and everything. This was wildly ambitious when we published it last June or so: doubling productivity without doubling team size. But it's also kind of wildly unambitious if you connect the dots and see where the models and coding harnesses are going. So in this talk I'll cover how we went about it, how we think about productivity, and a sneak peek at some of our internal data and skills.
This work also coincided with the most notable shift in model and coding capability. We've all seen it; this was one of our principal engineers posting, just like everyone else, around the Christmas break last year, going, oh my god, things have changed massively. That has contributed a lot to our success on 2x.
So this is the engineering-leadership part of the talk. You need to be decisive, give clear executive guidance, and do the organizational change. We've done a lot of things. We updated job descriptions: if you're not adopting AI at Intercom, whether you're a designer, product manager, engineer, whatever, you are not meeting expectations. Binary. And you have to say the same message over and over, a hundred times, in every different forum. You just have to stay on message and constantly talk about the urgency of doing this.
You've got to reward it as well. When people do good stuff, we celebrate it: we've got Slack channels with automation, so when people update skills or do this, that and the other, it gets posted into those channels. People show each other different techniques and what's working for them. We've done hackathons, we've done AI immersion days. All of these things are necessary to bring people along. We also staffed this full-time: we have a 2x team that just keeps on growing and growing.
We're not just saying, hey, you've got to AI everything, best of luck. We're trying to bring everyone, the hundreds of engineers and hundreds of people in R&D, along with us. So if you're in a medium or large organization, you absolutely need to have people, your best people, on this full-time.
And so we chose Claude Code as our platform. Prior to this we kind of let people choose their favorite editor and so on, and there were loads of people adopting Claude Code, loads using Cursor, loads using Augment. But we're believers in platforms in general. It kind of doesn't matter what you choose; choosing one is what's important. To a certain extent you need to get away from model anxiety. It's like being multi-cloud: you don't get the compounding benefits of a well-designed platform if you're spreading your work across different cloud providers. You're way better off being all in on one and optimizing and improving how it works, unless there are very specific or impactful reasons why you need to be spread across multiple agents.
Our vision was to get Claude to the point where it can act like a senior engineer on any technical task across Intercom: connect Claude to everything, so that anything I can do on my laptop, Claude can do too. And that means everything. Now, of course, we're not reckless; we're not just letting the thing go off and delete all of our databases. But we're a mature company with plenty of controls, permissions, and audits, and that gives us a lot of confidence to unleash Claude in the same way we unleash our engineers in our environments.
And we've got to onboard it: teach it all the stuff we teach people when they join Intercom. All of our Rails conventions, our architecture, our React patterns; we've built a lot of software in 15 years. Standard testing practices, security rules, all of this. Claude absolutely has to know the Intercom-specific information to be able to do the job.
Most importantly, start using the platform for all technical work. When it doesn't get things right the first time, hits an issue, or goes down the wrong path, update the guidance. This is a flywheel that we're all contributing to. So we've encapsulated a lot of this knowledge and context in engineering guidance, skills, and hooks that enforce these things, and we spend a lot of time cajoling Claude Code to work well.
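To give a concrete flavour, a guardrail hook in Claude Code can be wired up through its settings file. This is a minimal sketch, not Intercom's actual configuration; the script path is hypothetical:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "./scripts/check_safe_command.sh"
          }
        ]
      }
    ]
  }
}
```

A hook like this runs before every shell command the agent issues, which is one way to actually enforce conventions rather than merely documenting them.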
We do things like pushing our internal Claude plugins out to everyone's laptops, bypassing the Claude Code update mechanisms, because otherwise you spend a lot of time debugging Claude Code installs on hundreds of laptops; it's like trying to manage Python installs. Ultimately, though, this is about every single part of technical work. It's not just code production, and it's not more advanced autocomplete. It's everything: debugging, testing, planning, all of it.
It should just be you driving Claude, and ideally driving it less and less, moving higher up the food chain while it delivers real value, the code, the products, whatever, to customers. So everything's in scope. We think that even if the models and harnesses did not improve at all, which is definitely not happening, if anything the capability curve is accelerating, we have the building blocks today to move vast amounts of the work in our software development life cycle to being agent-first. They could pause everything tomorrow and we'd still have this flywheel, going through every single piece of work, and the tools are good enough today to do it.
I wrote some principles to help guide us along the way. When you're trying to get hundreds of people to change how they work, and to understand what you're trying to achieve, you need to write things down to help them out. Different principles apply in different places, but we believe that all of engineering is changing. Everything that you can do, the agent must be able to do, and that can feel weird, like when you're first connecting it into production systems.
Our job as engineers, product builders, whatever, is moving up the stack. A long time ago I used to be a Unix sysadmin: you'd go into data centers, racking servers, cabling things, configuring networks and all that. Then the cloud came along and I moved up the stack, and people transitioned from being sysadmins to SREs. The work was more automation-oriented, more impactful, higher paid. I think we're speed-running that same transition 100 times faster at full industry scale, but I kind of feel like I've been through this before.
We at Intercom are technically conservative. We like picking single tools and using them extremely well; hence we end up with these Ruby on Rails monoliths. We're applying the same thought process here: where should our focus and attention be? We don't want everyone writing their own multi-agent orchestrators or opinionated workflows. We want to build durable, testable, high-quality components, and we want people considering the lifetime value of what they produce. The tools and the specific implementations will change over time, but I'm pretty sure that writing down how to do work at Intercom will be valuable no matter what happens. Maybe it'll even be easier to discover in the future; that's a bit of a problem at the moment.
What this really means in practice is that we spend our time on small, high-quality, durable, testable skills that do the job extremely well, and that we can validate with data and backtesting. We've got this huge body of work, code changes, incidents, everything, and we're using all of it to inform us and prove out that these skills operate at extremely high quality. We also practice continuous improvement here: we get these things to be self-updating and make sure they stay very high quality.
We don't want to get stuck behind the curve because we've implemented a lot of our own things; we want to pick up capabilities as they ship upstream, from Anthropic or whoever. Maybe we won't stay on this platform forever, but we're eager to get the advantage of somebody else building and shipping great software and capabilities rather than having to build everything ourselves.
Another thing we guide people toward: give agents problems, not tasks. A lot of the time people, even at Intercom, will prompt an agent with "run this skill to do a thing", which is mostly fine and still kind of necessary; I still do it a lot. But we're trying to move toward just describing the problem and letting the agent figure out which skills to invoke and what to do.
A fun story: recently I was brought into a security incident. We had accidentally published some Snowflake table metadata to a public GitHub repository, and I just habitually opened Claude Code and told it to join the Slack channel and take a look. I didn't even know that a skill existed which perfectly encapsulated all of our data breach policies and criteria: what to do, how to analyze it. Claude automatically downloaded the files, did a full analysis, concluded it was innocuous, and told me all the next steps. I didn't tell it to do any of this; it just figured it out, and it was done in about 2 minutes. That would have been a 20-minute task, and kind of boring work: where's that policy, let me take a look at this, that and the other. It's a small example, but again, I just gave it the problem, take a look at this security incident, and it figured out the intent and used a well-written internal skill that did the job for me.
As I mentioned, even within Intercom AI adoption is unevenly distributed. I think we're ahead of the vast majority of companies, but you still need to help people understand where they're at and grow toward being highly effective at using agents in their work. Steve Yegge recently talked about a maturity rating for engineers, and our internal one is similar: you work your way through the levels until ultimately you've mastered all the skills and know the tool inside out. What we want people to do is use Claude Code for everything, automate their work, then move that automation into a skill, then get really good at writing and improving skills, and then optimize the environment for agents. That could be anything from software architecture to documentation or other ways of doing things that let the agents be even more effective, optimized for what they're great at today.
So here's where we're at. You can see a wild inflection point after going all in on one tool. That decision was made in December, and we started rolling it out in January. And we reached the doubling of PR throughput in less than a year. Here's more data from our internal dashboards; there's some interesting stuff in here. The share of pull requests coming out of Claude Code is in the 90-somethings percent.
You can also see that our current bottleneck is moving to code review. We have a 17.6% approval rate from automatic code approval, and it's a lot more in-depth than just "hey Claude, can you approve this?". We've done a lot of detailed work, again using backtesting and previous data, getting humans to label the outputs so we can establish the confidence level of the automatic approvers, and shaping pull requests toward very safe, simple pull requests, which probably always should have been that way.
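The labelling step boils down to a simple precision calculation: of the pull requests the approver would have auto-approved in the backtest, what fraction did a human reviewer judge safe? Here's a minimal Ruby sketch; the field names and data are made up for illustration, not Intercom's actual pipeline:

```ruby
# Each record: did the approver auto-approve, and did a human reviewer
# later label the PR as actually safe to approve?
# NOTE: hypothetical shape, for illustration only.
def approver_precision(labelled_runs)
  approved = labelled_runs.select { |run| run[:auto_approved] }
  return 0.0 if approved.empty?

  correct = approved.count { |run| run[:human_label] == :safe }
  correct.to_f / approved.size
end

backtest = [
  { auto_approved: true,  human_label: :safe },
  { auto_approved: true,  human_label: :safe },
  { auto_approved: true,  human_label: :unsafe },
  { auto_approved: false, human_label: :safe }
]

puts approver_precision(backtest) # 2 of 3 auto-approvals were safe
```

With enough labelled history, a threshold on this number is one way to decide which classes of pull request are safe to approve without a human.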
But now they're just approved automatically. We've also worked with our auditors to ensure we're fully SOC 2, ISO 27001, and HIPAA compliant, all that. You do not need humans in the loop to meet these certifications, but you do need to know exactly what you're doing and make sure you've got auditing controls and everything. So we've moved approvals to an extremely well-organized, tested, and competent suite of agents, including Codex for code reviews; I think multi-model code reviews are okay, even though I just completely went back on my platform point there.
>> [snorts]
>> We've got high confidence that this stuff is not degrading the environment or adding additional risk. In fact, I think it's removing risk, because humans aren't actually as good as agents when the agents are well-defined.
Here's skill invocation data; I actually think the earlier numbers were a bit wonky. We hook everything up into Honeycomb. We've got hooks all over the place for basic information about which skills are being invoked and the like, and that's internally available. There's no private information in it, so everyone can use it to get an idea of what's being used and where. But we also pull all session transcripts into S3 for data mining, writing reports, and checking whether our skills are effective. So we've got a feedback loop using the session data; we can get more out of it, but we're already doing some interesting stuff with it.
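To give a flavour of that kind of transcript mining, here's a minimal Ruby sketch that tallies skill invocations from exported session events. The JSONL shape, the "type" and "skill" fields, is an assumption for illustration, not Claude Code's actual transcript schema:

```ruby
require "json"

# Count how often each skill is invoked across exported session events.
# NOTE: the event shape below is hypothetical, not the real transcript format.
def skill_counts(jsonl_lines)
  jsonl_lines.each_with_object(Hash.new(0)) do |line, counts|
    event = JSON.parse(line)
    counts[event["skill"]] += 1 if event["type"] == "skill_invocation"
  end
end

transcript = [
  '{"type":"skill_invocation","skill":"fix-flaky-spec"}',
  '{"type":"message","text":"working on it"}',
  '{"type":"skill_invocation","skill":"fix-flaky-spec"}',
  '{"type":"skill_invocation","skill":"security-triage"}'
]

p skill_counts(transcript)
```

In practice the same tally could run over the S3 export and feed a dashboard showing which skills are actually being used and where.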
This wasn't a goal, and I'm not particularly proud that defects were increasing until recently, but defects are now getting closed faster than ever. Some teams have been inspired by the move to AI to think about things like backlog zero, crunching through hundreds or thousands of defects. Some of that was deliberate and planned, but at the same time there's a natural deflation, because we get through the work, through all the defects, so much faster these days. We've also been working with a research group at Stanford: we give them all our code, and by their metrics our code quality has been increasing over the last while.
Okay, I'm running out of time at this point. We have hundreds of contributors and tens of thousands of lines of code in our Claude Code plugins. It's very active. And, I mean, Claude itself loves us.
Here's an example skill. We've got base plugins that do things like session transcript syncing and some safety hooks. And here's a skill I built that fixes flaky specs. We have hundreds of thousands of tests, and they get a bit flaky over time; we ship a lot, so we just barge through the flakes. But this skill wasn't built by me sitting down and figuring out all the things it needs to do to fix flaky specs. I worked in a feedback loop: I gave the agent a goal, guided it to the right place, and worked with it to fix a lot of flaky specs. It's written this pretty decent thing, with cheat codes, lookup tables, relatively well organized using progressive disclosure and all that. And it's fixing stuff where, if our most senior Rails engineers were doing it, I'd say they were amazing.
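The shape of a skill like that might look something like the sketch below, a SKILL.md with frontmatter plus supporting files loaded on demand. The file names, steps, and frontmatter values are illustrative guesses, not the actual Intercom skill:

```markdown
---
name: fix-flaky-spec
description: Diagnose and deterministically fix a flaky RSpec test from a failing CI build
---

# Fix a flaky spec

1. Reproduce: run the spec in isolation, then with the `--seed` value from the failing build.
2. Consult the lookup table in `common-causes.md` (time-dependence, test ordering, shared state, network calls).
3. Apply the smallest change that makes the spec deterministic; never just skip or retry it.
4. Re-run the spec repeatedly (e.g. 20 times) before opening a PR.

For feature/Capybara specs, read `feature-specs.md` first (progressive disclosure: load it only when needed).
```

Keeping the main file short and pushing the cheat codes into referenced files is what keeps a skill like this cheap in context while still deep when needed.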
A lot of other stuff is going on too: our CI melted and we had to fix that. Claude Code is widely used across Intercom outside of software engineering; it's gone completely viral, and people are banging down our doors to use consoles. We're thinking a lot about the future of engineering and product, like whether we should just merge product management, design, everything. And yes, the single-person-team product experiments have been pretty interesting. I've even been shipping stuff that people can use in their agents to sign up to Intercom; I've just been using our skills to act as a product manager, which is pretty wild.
So that's it. I wish you all the best of luck: if you're not doing pretty much all of this today, you're going to be doing it in the very near future.
My contact details are at brian.scanlan.ie. You can interact with Fin in the Messenger, configure it with our CLI, and check out ideas.fin.ai for a lot more information about Intercom and our agents. Thank you.
>> [applause]
[music]