
How It's Tested | Ep. #3, Balancing the Ownership of Testing with Alan Page

Team Mobot
September 7, 2023

Learn more about Alan Page.

Eden Full Goh: Hey, Alan. Thanks so much for joining me on the How It's Tested Podcast. It's really nice to meet you.

Alan Page: Hey, Eden. Nice to meet you too. I'm excited to be here.

Eden: Yeah, I was taking a look at your profile just before this and was just really impressed by the depth of testing experience and software engineering and leadership experience that you have across your career, spanning companies like Unity and Microsoft. I thought it'd be awesome for you to come on our podcast and share some stories of all the different technologies you've probably had the chance to test across different tech stacks. I'm really fascinated and excited to get into it.

Alan: It'll be fun to talk about. I was writing up a little intro about myself, and I realized I've been in tech now for 30 years and I'm beginning to wonder... People think I know something about testing, but I think I may have spent more of my career now not in dedicated testing roles than I have in dedicated testing roles. But I do still know a lot about testing, I care a lot about it, I care a lot about quality, so I'm sure there are lots of things we can dive into.

Eden: Yeah. I guess one thing that would be really helpful for our audience to get to know you is I know you started out your career as a tester, or as a software development engineer in test, at Microsoft. Is that right?

Alan: Actually, no. I started my career as a software tester for a company that made music software. I also did tech support and I also managed their network. But I was a software tester for a company called MIDISoft that made some early audio recording and MIDI recording and playback, and some things like that for, back in the old days, when you bought a computer it didn't have any audio. You had to go buy a sound card, and the sound card came with some free software and often that was from my company.

Eden: Interesting. Was that your first exposure to, I guess, testing best practices and the role of testing?

Focusing on What Matters Most

Alan: Yeah. I wouldn't say best practices. I was the only tester there, though there was somebody else who did a little bit of it. But at the time, I did it like a lot of testers do today: they use the software, they learn, they ask questions, they get curious and explore. I did all that and found some bugs, but then there was what eventually became Microsoft Visual Test, but was soon to be called MS Test: a very early beta version of Microsoft's test automation framework.

I had never programmed before either, but one of the programmers there took me under his wing, taught me to program, and then showed me this automated testing application they had on one of the MSDN CDs. I used that and I began automating a lot of our projects. I wrote a lot of really, really bad automation, but it was complete. I almost wrote it as a way to learn automation, in hindsight, but that was my introduction to testing and test automation, was at that company. Then when-

Eden: Wait, what makes bad automation, exactly?

Alan: When you write tests that last like three hours that really do what should be tested in about 10 seconds. There was an input that took a value between 1 and 65,535 and I decided I should test every single one of those because I could. It was a dialog, and I just tested all different combinations of every value, which is very much the definition of over testing, because I didn't know any better. I thought, "Well, I have the automation." I was kind of making it up. I didn't know test design very well, so I wrote inefficient automation. Inefficient automation is bad automation, and mine was very inefficient.
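To make the contrast concrete, here is a minimal Python sketch of the usual alternative to testing all 65,535 values: pick a handful of values at and around the boundaries of the range. This is an illustration only, not what was done at MIDISoft; `boundary_values` and `accepts` are hypothetical names, and `accepts` merely stands in for whatever validation the dialog performed.

```python
def boundary_values(low: int, high: int) -> list[int]:
    """Return a small set of representative inputs for an inclusive integer range."""
    mid = (low + high) // 2
    # Just outside, on, and just inside each boundary, plus one typical value.
    candidates = [low - 1, low, low + 1, mid, high - 1, high, high + 1]
    # Drop duplicates (possible for tiny ranges) while keeping the order.
    return list(dict.fromkeys(candidates))


def accepts(value: int, low: int = 1, high: int = 65_535) -> bool:
    """Hypothetical stand-in for the dialog's input validation."""
    return low <= value <= high


if __name__ == "__main__":
    # Seven checks instead of 65,535.
    for value in boundary_values(1, 65_535):
        print(value, "accepted" if accepts(value) else "rejected")
```

Seven inputs exercise the same accept/reject logic that the exhaustive loop did, and they run in well under a second.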

Eden: That's a really interesting point, and it's something I reflect on a lot because in a world where you have automation at your fingertips, you could automate anything and everything. Shouldn't you want to automate every little thing possible? Every possible edge case? Make sure every nook and cranny of your product is covered? I think that's a common misconception, or maybe it is a valid perspective depending on the situation.

Alan: Well, coverage, nook and cranny, you want to get the coverage as much as you can. But often when we have hundreds of hours of tests, which can happen when you have even a moderately complex product with different configurations, you can have hundreds of hours of tests. In fact, I'll tell a little story. I worked on a product recently that had hundreds and hundreds of hours of tests that ran across multiple configurations.

It took a long time for them to run, and when you have that many tests, there's always a little bit of flakiness to go investigate. But at the end of the day when all these tests were run, code coverage for what it's worth... I say code coverage is a wonderful tool but a horrible metric.

Code coverage was right around 50%, so what that tells me is a lot of things were tested multiple times, which is what I was doing back at MIDISoft. Yet, there were a lot of things that weren't tested at all so the trick is you want to use automation to accelerate your testing and more isn't always better.
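One way to see this kind of overlap is to compare what each test actually covers. The sketch below assumes you already have, for each test, the set of code units it touches (the test names and unit names here are made up), and greedily keeps only the tests that add something new. It is a rough illustration of using coverage as a tool, not a description of the system Alan worked on, and a test flagged as redundant by coverage may still be worth keeping for other reasons.

```python
def select_tests(coverage: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Greedily keep tests that add new coverage; return (kept, redundant)."""
    covered: set[str] = set()
    kept: list[str] = []
    remaining = dict(coverage)
    while remaining:
        # Pick the test that covers the most not-yet-covered units
        # (ties are broken alphabetically by test name).
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining test adds anything new
        covered |= remaining.pop(best)
        kept.append(best)
    return kept, sorted(remaining)


if __name__ == "__main__":
    # Hypothetical per-test coverage data.
    example = {
        "test_open": {"open", "parse"},
        "test_open_again": {"open"},
        "test_parse_only": {"parse"},
        "test_save": {"save", "parse"},
    }
    kept, redundant = select_tests(example)
    print("kept:", kept)
    print("add no new coverage:", redundant)
```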

Eden: Yeah. I think that lines up with a lot of what I've discovered, especially when we work with different engineering teams and sometimes during that discovery phase, they just want to name every feature, everything under the sun. But then when you think about, well, there's finite time and devices and robots available for testing, what about we just focus on the things that matter the most? The business use case or the end user case? So I do think there's a strategy that you have to implement and a thoughtfulness you have to implement in testing. It's not just about looping through everything.

Alan: Yeah, as we talked briefly about before here, I believe that developers should own the vast majority of test automation and I have coached hundreds of developers on how to write automation. But what's interesting is, for most developers as I teach them to write test automation, they make a lot of the mistakes that I made in my first few years of writing test automation.

I'll tell them, "No, you don't need to test all those." "But we can." I have to explain the problems that are going to happen down the road, and they go, "Oh, cool. All right, you're right." But it's funny to see, even experienced, very experienced developers will make some of the same automation mistakes that I made early on in my career. I just find that really fascinating.

Eden: Could you give another example, either from your career or an engineer that you've mentored in teaching them? What is the first instinct of a test case that you'd want to automate everything for? And what's actually the better or more recommended way to automate and approach that?

Alan: I'll answer a little differently. Everyone is different is the short answer. There's this model, I forget where it's from, but people need to be aware of the change they want to make. Then the D is they have to have the desire to make the change they want to make, and then the knowledge to make it. So usually what I have to work on first, this usually happens at more of a leadership level, is the desire.

I show the data to show how it's going to help us, and I can get that desire built pretty quickly, and, "Okay, yeah. We should write automated tests." From that I got one of the quotes I've used quite a bit: there was an exec VP who asked me how his team was doing at testing, and I said they do all the testing they know how to do. Which meant they had strong desire, but iffy knowledge on how to do it. So every developer has their different... Some kind of get it right away.

Often it's coaching more than telling, asking questions like: do you think that test would work better if you only tested one thing during the test, or do you think it's good to test 10 different things? Or have you tried to debug a failed test? If you have to rerun the test to look at it and see what failed, that's very slow, so I bring up things like: what kind of errors should you log if this fails? What would help you understand what the bug was, if you saw a test failure?

So I just ask a lot of questions about it but it's very contextual. Going back to my story about learning to test by questioning the software, often the coaching I do when I'm helping developers learn how to test is just teaching them how to ask questions like what would happen if...? When this test fails, what would I do? It's imagining those scenarios that begins to build up those test design chops pretty quickly.

Some of the developers I've worked with are better testers than most of the testers that I've ever worked with, so they really, really get it. But most do pretty well, pretty well. There's a few that kind of struggle, but for the vast majority they can figure it out and they make some mistakes, but they're willing to learn from those. I think your answer was somewhere buried in there.

Eden: Yeah, thank you. So when you were at MIDISoft, how did you transition to a role at Microsoft? Because I think something that I really want to get to talking about is the work you did on the Xbox One.

Transitioning to Microsoft

Alan: Okay, long story. I was talking to my son last week, he asked me, "Which projects did you work on at Microsoft?" I started listing those, he goes, "Wow, you did a lot of things." So what happened was, this was 1994, so we were fairly early in the PC revolution, and I had no idea... I had been a software tester, a network administrator at MIDISoft for a year and a half. But we had an ethical issue where our CFO, this was a public company by the way, our CFO would come to my office with my network administrator hat on and ask me to turn back the clocks on our accounting server a few days so he could get some extra sales into that quarter.

I thought this seems like a not right thing, and so after the second time he did it, I decided that I would go apply for some jobs. So in 1994, the way you did this was you faxed from your fax modem on your computer, I faxed a bunch of resumes to some contract companies. My phone was ringing off the hook the next day, all kinds of companies saying, "Hey, we'll talk to you." I interviewed with about three different places and got a job testing networking on Windows 95.

Another one of my great automation quotes: I thought they hired me because I was really good at test automation, and I wasn't but I thought I was. But they hired me because I knew a lot about networking. I didn't realize how much I knew about networking and how valuable that was.

So my manager gave me a list of test cases on my first day. I said, "Great. When would you like these automated?" And he said, "Oh no, we don't have time to automate these. We just need you to run them every day." That was Monday, and I had them automated by Thursday, so then I did exploratory testing in my spare time and found more things to do and they gave me work to do. That was great.

So I was a contractor and I started in January '95 and I didn't know how contracts work. It was a six month contract, so about five months go by, and I thought, "Oh, my contract's going to end. I better get another job. This was fun, but whatever." So I applied for a few jobs and actually got a job offer from Software Testing Labs, working for James Bach. Again, this was still 1995, June '95.

I came back and told my manager that, "Hey, my contract is up so I got another job, my last day will be blah." Apparently they liked my work because that afternoon, his boss was in my office asking me if I'd entertain an offer and I didn't even... It was so fast that I didn't even have a good poker face, I just said, "Okay, sure." The other job was a contract job, this was a full time job, and he said, "Well, if we give you an offer, do you think you'd consider it?" And I said, "Okay, sure. I'd consider it." The next morning, this all happens the next morning, a courier showed up at my door with a contract for Microsoft, then I was there for 22 years.

Eden: That's amazing.

Alan: So I've been very good at learning as I go, that's my one superpower is that I can just figure stuff out so I do that a lot. At first I did it mostly as a test developer, whichever acronym you want to use, and then later I spent a lot of my time on tools and infrastructure, including tests, including things like CI and cloud platforms and those things.

Then in leadership roles. I try and find that balance between knowing I need to learn a lot to figure out what to do and I very much live in the fake it till you make it, or on the edge of uncomfortable to get stuff done. Anyway, that's how I survived there. Then very quickly the products I worked on: Windows 95, Windows 98, Internet Explorer 2 and 3, what eventually was Windows 2000, I worked on Windows Millennium, sorry, Windows CE for a few years.

I worked at a central org called Engineering Excellence. I worked on Microsoft Lync and then Xbox One. Xbox One, probably some good stories in there. I joined the team, it was about nine people while everybody was working on Xbox 360. Unlike this model, we bootstrapped a lot of things and I bootstrapped a lot of things on the test and quality and build and infrastructure side.

Then after Xbox I worked on this dumb science project to make Android apps run on Windows Phone, then finally Microsoft Teams before joining Unity after that. I was counting for my son, 19 offices in 22 years and 12 different teams. It was a good time.

Eden: Of all the products that you've worked on and tested, do you have a couple that stand out as they were well engineered and easy to test? I don't know if easy to test is a thing.

Alan: Yeah. Well, the story on Xbox One is my favorite. I joined that team and again I mentioned bootstraps and things, I got some code coverage tools working early, some code analysis tools working, debugging tools. Just ramped up. A friend and colleague of mine was working a lot on how we were going to test and we were the only two test folks on the team. It wasn't a long time for us to bootstrap, a few months.

But we had a little reorg and my new manager asked me if I would like to run the tools team for Xbox One, and I said no, and he said, "Why not?" I said, "Well, let's be clear, I can be responsible for all of our tools but at Microsoft, tools teams get really beat up at review time so I don't want to do it that way." I had just read The Cathedral and The Bazaar, and I was excited about how open source projects worked, so I proposed that I run the tools team, the tools for Xbox One like an open source org.

We would get people somehow to want to contribute to the tools, and there were a few pockets of people on teams who were dedicated. But we'd try and make it a crowd project and see what would happen. That was fun, it ended up being a little bit of a leadership challenge, there was a lot of communication to get people involved and report back on what was going on, that generated momentum. All those kind of fun things.

But in the end we ended up with a whole bunch of test and diagnostic and build tools that are still used by the console today, because eight years later they haven't changed. One of the cool ones, and if you remember Xbox 360 had this, it was famous for its Red Ring Of Death where it would take an update and it wouldn't work, and it was bad. Always in the back of our mind, we never want to have anything like this, how can we avoid this? And how can we have the safest update procedure possible? So in the very beginning of Xbox it was just a PC, it was just a big, heavy computer that had a close approximation to the final hardware.

Of course the final hardware was going to be rebuilt by the vendors and soldered into the console that you see today. But at the time it was just a big, heavy PC, and because it was a PC with a regular network card in it, it could do a network boot, which is the way for a lot of corporations, the way you get your computer set up. So it was handy. In the early days, we just used an automated system that on that network boot, it would download whatever the Xbox images were and run tests.

We could use that to get the machine set up for testing. That worked fine for like the first year, until we got the actual consoles or the prototypes for the consoles. Those did not have any sort of BIOS like a PC or any sort of network boot, but we did have a feature coming in that was going to be the future on how Xbox One would be updated in the field, in my room right over there.

So we leveraged that, we used that, and I guess the way most mobile phones work today is two slots: your working operating system is in Slot A, then when you get an update you get a whole new one in Slot B. Then you reboot, we try and switch over with the settings and everything, and if it fails at all then we can roll back. Now B is the active one, and next time we'll update A and roll over.
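The two-slot idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Xbox One update code: the `Device` class, the slot names, and the `boots_ok` check are all hypothetical, and a real updater also has to migrate settings and survive power loss mid-write.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Device:
    """Toy model of a device with two OS slots, one of which is active."""
    slots: dict[str, str] = field(default_factory=lambda: {"A": "build-1", "B": ""})
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def update(self, new_build: str, boots_ok: Callable[[str], bool]) -> bool:
        """Write new_build to the inactive slot and switch only if it boots."""
        target = self.inactive()
        self.slots[target] = new_build  # the active slot is never touched
        if boots_ok(new_build):
            self.active = target        # commit: the new slot becomes active
            return True
        return False                    # roll back: keep running the old slot


if __name__ == "__main__":
    device = Device()
    device.update("build-2", boots_ok=lambda build: True)
    device.update("build-3-broken", boots_ok=lambda build: False)
    print(device.active, device.slots[device.active])
```

Because the running slot is never overwritten, a failed update leaves the device on its last good build, which is the property the lab exercised every time a new image was installed.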

So we used the Xbox update process to install new builds of Xbox in our lab. We had a lab of hundreds of Xboxes and we had a very, very cool test management system. That sounds so gross, not a test management system. But a way to run tests, we had a tool called XTP for Xbox Test Pass. On the command line you could say things like, "Run graphics tests. Run the smoke tests. Run all of the graphic tests that'll run in 10 minutes."

And it would do some selection for you on the backend. Basically, a SQL sort of command line you could use to be very clear on what test you want to run and it would go find machines for you. So these hundreds of machines were running all day, all night, running different tests for different people. There was a queuing system. It was really cool.
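A request like "run all of the graphics tests that'll run in 10 minutes" boils down to filtering by tag and fitting a time budget. The sketch below is a guess at the general shape of that selection, not XTP itself: `LabTest`, `select`, the tags, and the run times are hypothetical, and the real tool also found free machines and queued the work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabTest:
    name: str
    tags: frozenset
    minutes: float  # typical run time, for example taken from past runs


def select(tests: list, tag: str, budget_minutes: Optional[float] = None) -> list:
    """Pick tests carrying `tag`, shortest first, until the time budget is used up."""
    matching = sorted((t for t in tests if tag in t.tags), key=lambda t: t.minutes)
    if budget_minutes is None:
        return matching
    chosen, used = [], 0.0
    for test in matching:
        if used + test.minutes > budget_minutes:
            break  # every later test is at least as long, so stop here
        chosen.append(test)
        used += test.minutes
    return chosen


if __name__ == "__main__":
    catalog = [
        LabTest("gfx_shaders", frozenset({"graphics"}), 6),
        LabTest("gfx_smoke", frozenset({"graphics", "smoke"}), 2),
        LabTest("gfx_stress", frozenset({"graphics"}), 45),
        LabTest("net_smoke", frozenset({"network", "smoke"}), 1),
    ]
    print([t.name for t in select(catalog, "graphics", budget_minutes=10)])
    print([t.name for t in select(catalog, "smoke")])
```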

But because we were using, every time we ran tests, we put a new image of the Xbox on there, we were using this update system constantly, and we found dozens of critical, show stopper, Red Ring Of Death type bugs that were all fixed by the time... just because we wanted to use it for testing. I haven't searched on the internet, but I don't think there are any wide scale, at least reports, of any sort of rings of death or failures after update on Xbox One. So I think that one worked out pretty well for us, but it was a lot of fun. Like I said, I'm really proud of the fact that a lot of those tools are still running today, so I think that one worked out really well for us.

Eden: So was it just the testing infrastructure on the Xbox 360? And maybe it's just it was a few years earlier that it was just that infrastructure was different, and so there weren't-

Alan: Yeah, it was different and I wasn't around for 360 so I don't have all the details. But there was the unspoken mantra of, "No Red Rings Of Death. No Bricked Xboxes."

Eden: Because you guys knew what to protect against in this new iteration.

Alan: We knew the bricking, the red rings happened on updates so we had a... One, the solid plan was to do the Slot A, Slot B, which is pretty normal today. But then testing the crap out of it to make sure it worked because you could get into a state early on where both slots could be in a bad shape, and we didn't want to be there. So it was fun.

Eden: Yeah. I guess with a product as complex as Xbox One, you oversaw the design of all this testing infrastructure, especially around updating. I'm curious, where does that extend to when you have the actual controllers? Whether it's wired or wireless, or connecting to a television, or even physical, manual testing? In addition to all of this automation that you had for Xbox One, what else was part of the testing process?

Testing Products at Microsoft

Alan: Yeah. So one cool thing you can do when you're Microsoft is you have beta testers who are ravenous to get a chance to use your new thing for something like Xbox One. So we counted a lot on our beta program, we had so many different user focused stories of people going through and telling us, both monitored to just give us feedback on going through a workflow to see how things worked. But then thousands and thousands of people both inside and outside of Microsoft with beta hardware, playing what few games we had and both giving us direct feedback and us collecting a lot of telemetry on crashes or user experience issues they had when they were running.

There's the advantage of having a big customer base, you can crowdsource some of that. People may say, "Don't make your users do the testing." Well, they're beta testers, it's what they're doing.

Largely, they were being successful, but we don't wait for them to struggle and send us a bug report. We're actually getting a lot of that information automatically through our telemetry and monitoring, so it just helps having a big audience for those things to get that sort of testing. Then of course we all had... Every time there were new games available, and if you remember when Xbox One shipped, there were very few launch titles.

The game titles were way behind. They came out quickly and caught up, but for every game we had, whatever state it was in... We did a lot of beta testing, testing the beta version of Rise Of Empires, I probably got the name wrong, and the beta version of Xbox. A lot of us were playing that at home. We were playing dumb, little games at home that people built just to get the experience we could, because there's so much that goes with launching a game in the way... I'm not going to go into the full architecture, but there are multiple operating systems running at once on Xbox One. Just to make sure that whole thing worked seamlessly, there was a lot of dogfooding, self hosting it and just counting on the big beta audience to help us.

Eden: You mentioned earlier that you feel pretty strongly that developers and engineers should write their own tests. But it looks like for a lot of your career, you were also focused very much on test engineering. Would you consider that part, that you were a dedicated tester or it's not quite the same?

Alan: I absolutely was. I spent the first piece of my career as a dedicated tester, because the way we make software has changed and a lot of things have changed, but definitely the first part of my career I was absolutely a dedicated tester. I moved maybe for the middle part of my career, maybe even in some ways to today to, one, focusing on how people tested and then also building the infrastructure for that.

So I've built many different test distribution systems and metric systems and analysis systems that help enhance the testing. I kind of moved into that for a while and that moved very seamlessly for me to working on build systems and CI and things like that. Then today, just more of a management overhead role. So I've definitely spent a bunch of my time as a dedicated tester, but it's really just the move to more agile delivery methods where we're trying to ship frequent value more often.

It's more efficient to have the developers write those tests. Not necessarily all of them. There's a blog post I wrote where I took a lot of hate because I said not all teams need dedicated testers, and every exception in the world, I think, contacted me to tell me why they were different and that's fine. But there are a lot of teams and companies out there shipping pretty good software without dedicated testers.

There are also a lot of companies with testers shipping very bad software, so there's not a direct correlation. But I have seen inefficiency there, and even in Accelerate, the book by Nicole Forsgren, Gene Kim and others, there's data that shows there is a high correlation of product quality with developer owned testing that does not exist when testers own that automation.

Nicole Forsgren goes on to say that teams still need testers for a lot of the exploratory things, again if you have a UI or if you have a complex application. For sure. If you're delivering a REST API, maybe you don't need a dedicated tester to help you with that, or if you're delivering lower level code. My feeling is you do not need dedicated testers to do that work for you.

Eden: It almost sounds like you're saying that maybe engineers can own the testing infrastructure if the code or the tests are essentially automatable, but-

Alan: I'd say even better, because what happens is when the developers own writing the test automation, they write more testable code. If they know they don't have to write that automation, they don't care.

Eden: What about for interfaces and products like the Xbox One, where there is a significant physical component? Or a mobile app where not every test case is actually automatable? Then how do you balance the ownership of testing? Because it becomes this dreaded automation versus manual ratio that you're trying to strike with your testing strategy.

Balancing the Ownership of Testing

Alan: The answer to any sufficiently complex question is it depends. In our case, like I mentioned, we relied, and this is a good leading question into the cool stuff you're doing, but in our case with Xbox, for example, we just relied on a whole bunch of beta testers. Even for a v1 product, we could do that but you don't always have that luxury. A tautology I have used a lot is you should automate 100% of the tests that should be automated. I think in most orgs, testers are automating too much or things that are too difficult to get right.

I think you want to design your software so that the vast majority of testing can be done at a level below the UI, then when you're testing at the UI you can do a couple different things. It could be a little bit of random testing like moving stuff around, clicking things, or just sanity to make sure things keep working. I go back to the story of me writing an automated test to test every single permutation of a data entry dialog box.

We don't want to do that. The design challenge there is what are the set of tests that'll get us the coverage we need at this level? Where I think people fall off the wagon, a slight metaphor or not, is when they want to automate everything through the UI when not everything needs to be automated through the UI. Unfortunately, I think a lot of testers are given that charter, like, "Here's our webpage, here's our application. Please test it."

And what they really should get if they're going to be writing automation is, "Here are all the tests that exist for a testing level below this. Let's make sure the UI is working as well." And of course you don't want to put logic in the UI, but you do want to make sure it does the right things in the right places and things don't get cut off and things like that. So those sorts of things are very prime for automation.

Eden: Yeah, I think you make a very interesting observation. That's something that I've seen a lot, which is the kinds of test cases that we're presented with at Mobot to solve, there are sometimes where I think, yeah, with slightly more robust testing infrastructure, there are probably things that shouldn't be tested at the UI level. But it manifests that way through a push notification, or through some sort of multi device interaction, so sometimes it's just easier to cut that corner, easier to just test it physically, easier to just test it at the UI level. I can see how there needs to be a more informed way of just separating those use cases, because not everything needs to be tested at the UI level. You're right.

Alan: Yeah. There's a great example of where I wouldn't. Say you do something on Device A that eventually causes a notification to happen on Device B. That's really hard to test at the UI level, but you could monitor APIs, you could say, "I set this thing, I can wait for this post over here. Okay, I got it." Then the thing you want to test, you can even test that locally with a mock, is just post to yourself, you get a message and that would be, I think, a more efficient test.
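A rough sketch of the "test it locally with a mock" idea, using Python's standard `unittest.mock`. Everything about the app here is assumed for illustration: `share_item`, the `notifier.post` interface, and the payload shape are hypothetical stand-ins for whatever code on Device A triggers the notification.

```python
from unittest.mock import Mock


def share_item(item_id: str, recipient: str, notifier) -> None:
    """Hypothetical app logic: sharing an item notifies the recipient's device."""
    notifier.post(recipient, {"type": "item_shared", "item_id": item_id})


def test_sharing_posts_a_notification() -> None:
    notifier = Mock()  # stands in for the real push notification service
    share_item("item-42", "device-b", notifier)
    # Check the message that would be sent, without any UI or second device.
    notifier.post.assert_called_once_with(
        "device-b", {"type": "item_shared", "item_id": "item-42"}
    )


if __name__ == "__main__":
    test_sharing_posts_a_notification()
    print("notification payload verified")
```

This covers the "I set this thing, I got the post" half of the scenario below the UI. What happens when the user taps the notification on Device B is a separate question with its own test.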

Eden: Yeah. The push notification itself, or the data pass through the push notification part, yes, that's not the hard part to test. But we've seen the deep links that happen afterwards. But it's true that in those situations, sometimes when you're thinking, "I just need to test the push notification," you're not separating out the different components of is it the data being passed into the push notification that you're testing?

Is it the deep link and the way that it's opening up the app from the background that impacts? There's so many other details, and so breaking out a test case into its different components, especially if you're a tester, I think sometimes it's so tempting to just approach it at the UI level. That's something that I've definitely seen as a pattern.

Alan: Yeah, it is. You want to keep in mind, what is it that you're trying to discover or figure out? Asking, "What do I want to know from writing this test?" will help you. I think sometimes folks get caught into, "I'm going to make it do this, I'm going to make it do this, I'm going to make it do this." Then if I ask them, "What do you want to learn from these tests?" they don't know. I think if you start from, "There's an important thing I need to learn from writing this automation," you may be in a better place to write better automation.

Eden: Yeah. I guess one question I have for you is you're encouraging and mentoring engineers to write better testing infrastructure and approach testing coverage in a more holistic and comprehensive way, but how do you balance that with the business needs, the desire to move faster? You have this brilliant, talented engineer who could spend more time writing more features with less test coverage and you might be able to get away with it, and I think that's the temptation and the trap we all fall into. So how do you encourage people away from that? Or encourage people to think about that?

Alan: Well, one, they're wrong. It's interesting, because the best developers I know are already writing a lot of tests. It's often the ones who think they're really good and can move really fast who don't realize they would do better with tests. It's okay, a lot of times, sure, maybe I can write a whole bunch of code that doesn't need very many tests or any tests and it works and it's great.

But what happens when I need to go back and change that code and there's no tests for it? Or add something to it? I have so many stories in my early years at Microsoft where I would find a bug in some component and they'd say, "Oh yeah? Unless it's really bad, we're not going to fix it because nobody is confident enough to change that code. Everyone is afraid to change that code because it's so brittle and nobody knows it."

The reason tests exist is so we can do that refactor. So I guess as long as you know the wonderful code you wrote is never going to need to be changed or anything added to it or anything else, sure, maybe you can get away without any tests. But I don't think that's realistic. So the way you do it, is a little bit of building a culture of expectation.

You have to build an expectation that done means tested, and done doesn't mean tested by... I've thrown it over to the tester. But done isn't done until it's tested. What I'm finding is over the years of doing this, it's much easier, it's gotten much easier to have developers who understand that they are expected to write a big chunk of testing. It was harder 10 years ago.

It could just be my little bubble and circle, but it's really just that expectation, done means tested. You start with that and maybe there's some conversations because, as Weinberg says, "Eventually in software, it's always a people problem." So maybe you have to have some conversations and talks to help people get over that hump a bit. But it's still important, though.

Eden: I guess when do you decide as an engineer that a test case is not worth automating? You were alluding to it just now with if it catches a bug that you're not even going to fix, even if that test case flags something, or if you're trying so hard to automate it but it's just not working exactly as intended. What are the things that you think about when deciding whether or not a test case is worth automating?

What Makes a Test Case Worth Automating?

Alan: Well, I think you can apply a bunch of heuristics to that. I think most test cases, there's a reason you want to automate it, like, "I need to learn something. I'm worried something..." There's some reason you want to automate it. After you automate it, it may be flaky, it may not quite work, it may take too long. Often though, I'm going to answer your question in a very roundabout way here, I'm reflecting really quickly here while I talk, but any test I've ever written that's been very difficult to write is because the code has been difficult to test.

If that code is difficult to test, it's always, again in my experience, always difficult to change later, it's difficult to code review, it has all kinds of other difficulties. Basically, you're in this weird paradox where it's difficult to write a test for this thing and the reason is all the clues point to this code being difficult to test and probably buggy. Again, this is a great example of why developers should write tests for their code, because that code needs to be refactored.

The vast majority of testing should be easy to write. Now, when it gets fun and difficult, here's the challenges that I want to mark my stick on, is the multi machine testing. I was working with someone who was putting together testing for a multiplayer framework. That's a whole new animal, that synchronization between things. That needs some work. That's difficult to test, but also needs to be tested.

I think for the vast majority of things that are difficult to write automation for and find value, I'll get back to your first question in a second here, is because the actual underlying code needs to be refactored. Then the other thing, maybe related to your question, is I mentioned the story about hundreds of hours of testing and not getting that much coverage. It doesn't mean all those tests are bad or anything. What it means is we probably don't need to run all those tests.

One thing also we did on Xbox which I liked was we could limit the number of tests by time and we could prioritize tests. We prioritized tests not by having everybody assign them a Priority One, Priority Two, Priority Three, because when you do that everybody makes their test Priority One. You just build a little heuristic engine that figures out which tests you want to run all the time.

The test that gets lots of coverage, that runs really fast, and that finds bugs once in a while, that's the very best one. It doesn't mean the tests that don't find bugs don't need to be run, but let's say we decide we're only going to run the tests for two hours. So we make our best guess, we run the fast tests that cover the most breadth across the product. We'll assign a couple things.

We can collect a bunch of data around those tests, and metadata like did it find a bug, how fast did it run, what area is it in, has this test ever found a bug, when was the last time this test ran. Because when we choose not to run a test, and this is the big thing that happens, everyone, developers and testers running automation, feels like if they write a test, they must now run that test on every build of the product from now until time ends because the moment they stop running it, a bug will appear there.

That's a fallacy and they're imagining something. So I have seen tests that have run and never, ever failed for 10 years on a product. Turning that test off, it's passed for 10 years, it's going to pass for 10 more whether I run it or not. It's a tree falls in the forest thing. But to satisfy those people, you can also in the test selection heuristic, you can factor in how long since I've run this test.

So by choosing not to run a test, it becomes slightly more valuable to run the next time. If I skip it again, it becomes slightly more valuable, and eventually I run it again, it passes, it goes back to the back of the queue. But this prioritization thing, I think it's okay to write 100 hours of tests, or 200 hours of tests or 1,000 hours of tests, but don't try and run them every single build.

You need some prioritization. That's the next trick in selection, I think. I would bet 90% of the testers and developers I talk to believe that once they write an automated test, that test should run as often as possible until time ends, and that's just not true.
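As a rough illustration of the selection heuristic described above, here is a minimal Python sketch. It is not the Xbox tooling: the record fields, the weights in `score`, and the function names are all assumptions made up for this example. It only shows the shape of the idea, which is to favor fast, broad, bug-finding tests within a time budget and to let skipped tests slowly gain value.

```python
from dataclasses import dataclass


@dataclass
class CaseRecord:
    name: str
    runtime_s: float        # how long the test takes to run
    coverage: float         # rough share of the product it exercises, 0.0 to 1.0
    bugs_found: int         # how many bugs it has ever found
    builds_since_run: int   # how many builds in a row it has been skipped


def score(t: CaseRecord) -> float:
    """Higher is better. The weights are arbitrary placeholders."""
    value = t.coverage + 0.1 * t.bugs_found + 0.05 * t.builds_since_run
    return value / max(t.runtime_s, 1.0)


def select(tests: list[CaseRecord], budget_s: float) -> list[CaseRecord]:
    """Greedily fill the time budget with the highest-scoring tests."""
    chosen, used = [], 0.0
    for t in sorted(tests, key=score, reverse=True):
        if used + t.runtime_s <= budget_s:
            chosen.append(t)
            used += t.runtime_s
    return chosen


def record_build(tests: list[CaseRecord], chosen: list[CaseRecord]) -> None:
    """Tests that ran go to the back of the queue; skipped tests age by one build."""
    ran = {t.name for t in chosen}
    for t in tests:
        t.builds_since_run = 0 if t.name in ran else t.builds_since_run + 1
```

Calling `select` and then `record_build` once per build means a test that keeps getting skipped scores a little higher each time, so it eventually gets picked, runs, and drops back down.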

Eden: It's really nice to meet someone like you with this perspective because I think that's something that we encounter a lot, right? We use mechanical robots, testing real phones, and you're not going to be able to test everything all the time. We're not at a point where the robots run 24/7/365.

Alan: Yeah, it's even more important for you, right? Because there's a finite... They run pretty fast, I've seen the videos, they're really cool. But still, there's a very finite amount of testing the robots can do so you can't throw everything at them.

Eden: Yeah, you have to choose the tests that matter and that means designing a good strategy and understanding the product well enough to be able to pick the right tests. Ultimately, this kind of testing, you don't want to run the really silly test cases that are going to pass. You want to use the right tool for the right job, and so, yeah, it's really refreshing to get to talk to someone who understands that because I have this conversation all the time. It's like, "We need to be able to test everything or it's not worth it."

Alan: Oh boy, yeah.

Eden: Alan, I have really enjoyed our conversation. I could honestly nerd out about this topic for a lot more time. But I want to be respectful of your time and, who knows, we should do this again another time.

Alan: I'm happy to come back for part two some day.

Eden: Yeah. I'd love to get into some of the other stories and the other operating systems that you were building at Microsoft. Honestly, we didn't even get to touch on any of the work you did at Unity, so I'm sure you have so many other testing stories and best practices.

Alan: I have a lot of stories. If there's one thing I have after 30 years in tech, it's stories.

Eden: We'll have to do another episode soon, but thanks so much for joining me on the podcast, Alan. It's been awesome.

Alan: I'm honored to be here. Thanks for inviting me. It was great talking with you.

Eden: Interesting. Was that your first exposure to, I guess, testing best practices and the role of testing?

Focusing on What Matters Most

Alan: Yeah. I wouldn't say best practices. I was the only tester there, there was somebody else that did a little bit of it. But at the time, I did it like a lot of testers do today. They use the software, they learn, they ask questions, they get curious and explore. I did all that and found some bugs, but then along came what eventually became Microsoft Visual Test, but was then called MS Test, this very early version, this beta version of Microsoft's test automation framework.

I had never programmed before either, but one of the programmers there took me under his wing, taught me to program, and then showed me this automated testing application they had on one of the MSDN CDs. I used that and I began automating a lot of our projects. I wrote a lot of really, really bad automation, but it was complete. I almost wrote it as a way to learn automation, in hindsight, but that was my introduction to testing and test automation, was at that company. Then when-

Eden: Wait, what makes bad automation, exactly?

Alan: When you write tests that take like three hours to do what should be tested in about 10 seconds. So I decided to test, there was an input that took a value between 1 and 65,535 and I decided I should test every single one of those because I could. It was a dialog, I just tested all different combinations of every value, which is very much the definition of over testing because I didn't know any better. I thought, "Well, I have the automation," I was kind of making it up. I didn't know test design very well, and inefficient automation is bad automation, and mine was very inefficient.
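To make the contrast concrete, here is a small Python sketch of boundary-value selection, one common alternative to trying all 65,535 inputs. The 1 to 65,535 range comes from the story; the `accepts` function is a hypothetical stand-in for the dialog's validation, since the real code isn't described.

```python
LOW, HIGH = 1, 65_535  # valid range described in the story


def boundary_values(low: int, high: int) -> list[int]:
    """Values at and just around each edge of the range, plus one from the middle."""
    return [low - 1, low, low + 1, (low + high) // 2, high - 1, high, high + 1]


def accepts(value: int) -> bool:
    """Hypothetical stand-in for the dialog's input validation."""
    return LOW <= value <= HIGH


# Seven checks instead of 65,535: only the two out-of-range values should be rejected.
rejected = [v for v in boundary_values(LOW, HIGH) if not accepts(v)]
assert rejected == [LOW - 1, HIGH + 1], f"unexpected rejections: {rejected}"
```

The assumption behind this kind of selection is that bugs cluster at the edges of a range, so a handful of values near the limits gives most of the information that the exhaustive run would.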

Eden: That's a really interesting point, and it's something I reflect on a lot because in a world where you have automation at your fingertips, you could automate anything and everything. Shouldn't you want to automate every little thing possible? Every possible edge case? Make sure every nook and cranny of your product is covered? I think that's a common misconception, or maybe it is a valid perspective depending on the situation.

Alan: Well, coverage, nook and cranny, you want to get the coverage as much as you can. But often when we have hundreds of hours of tests, which can happen when you have even a moderately complex product with different configurations, you can have hundreds of hours of tests. In fact, I'll tell a little story. I worked on a product recently that had hundreds and hundreds of hours of tests that ran across multiple configurations.

It took a long time for them to run, and when you have that many tests, there's always a little bit of flakiness to go investigate. But at the end of the day when all these tests were run, code coverage for what it's worth... I say code coverage is a wonderful tool but a horrible metric.

Code coverage was right around 50%, so what that tells me is a lot of things were tested multiple times, which is what I was doing back at MIDISoft. Yet, there were a lot of things that weren't tested at all so the trick is you want to use automation to accelerate your testing and more isn't always better.

Eden: Yeah. I think that lines up with a lot of what I've discovered, especially when we work with different engineering teams and sometimes during that discovery phase, they just want to name every feature, everything under the sun. But then when you think about, well, there's finite time and devices and robots available for testing, what about we just focus on the things that matter the most? The business use case or the end user case? So I do think there's a strategy that you have to implement and a thoughtfulness you have to implement in testing. It's not just about looping through everything.

Alan: Yeah, as we talked briefly about before here, I believe that developers should own the vast majority of test automation and I have coached hundreds of developers on how to write automation. But what's interesting is, for most developers as I teach them to write test automation, they make a lot of the mistakes that I made in my first few years of writing test automation.

I'll tell them, "No, you don't need to test all those." "But we can." I have to explain the problems that are going to happen down the road, and they go, "Oh, cool. All right, you're right." But it's funny to see, even experienced, very experienced developers will make some of the same automation mistakes that I made early on in my career. I just find that really fascinating.

Eden: Could you give another example, either from your career or an engineer that you've mentored in teaching them? What is the first instinct of a test case that you'd want to automate everything for? And what's actually the better or more recommended way to automate and approach that?

Alan: I'll answer a little differently. Everyone is different is the short answer. There's this model, I forget where it's from, but people need to be aware of the change they want to make. Then the D is they have to have the desire to make the change they want to make, and then the knowledge to make it. So usually what I have to work on first, this usually happens at more of a leadership level, is the desire.

I show the data and how it's going to help us, and I can get that desire built pretty quickly, and, "Okay, yeah. We should write automated tests." From that I got one of the quotes I've used quite a bit, there was an exec VP that asked me how his team was doing at testing, I said they do all the testing they know how to do. Which meant they had strong desire, but iffy knowledge on how to do it. So every developer has their different... Some kind of get it right away.

Often it's coaching more than telling, asking questions like do you think that test would work better if you only tested one thing during the test, or do you think it's good to test 10 different things? Or, if you've tried to debug a failed test, having to rerun the test to see what failed is very slow, so I bring up things like what kind of errors should you log if this fails? What would help you understand what the bug was, if you saw a test failure?

So I just ask a lot of questions about it but it's very contextual. Going back to my story about learning to test by questioning the software, often the coaching I do when I'm helping developers learn how to test is just teaching them how to ask questions like what would happen if...? When this test fails, what would I do? It's imagining those scenarios that begins to build up those test design chops pretty quickly.
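The coaching questions above (test one thing per test, and make a failure explain itself) can be sketched in a few lines of Python. Everything here is hypothetical: `parse_volume` is an invented function under test, used only so the example runs on its own.

```python
def parse_volume(text: str) -> int:
    """Hypothetical function under test: parse a volume setting from 0 to 100."""
    value = int(text)
    if not 0 <= value <= 100:
        raise ValueError(f"volume {value} is outside 0..100")
    return value


def test_parses_plain_number():
    # One behavior per test, with a message that says what was seen and what was expected.
    result = parse_volume("42")
    assert result == 42, f"parse_volume('42') returned {result!r}, expected 42"


def test_rejects_value_above_range():
    # A separate test for the error path, so a failure points at one cause.
    try:
        parse_volume("101")
    except ValueError:
        return
    raise AssertionError("parse_volume('101') should raise ValueError but returned normally")
```

If either test fails, the message alone should be enough to start debugging, without rerunning the test just to find out what happened.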

Some of the developers I've worked with are better testers than most of the testers that I've ever worked with, so they really, really get it. But most do pretty well, pretty well. There's a few that kind of struggle, but for the vast majority they can figure it out and they make some mistakes, but they're willing to learn from those. I think your answer was somewhere buried in there.

Eden: Yeah, thank you. So when you were at MIDISoft, how did you transition to a role at Microsoft? Because I think something that I really want to get to talking about is the work you did on the Xbox One.

Transitioning to Microsoft

Alan: Okay, long story. I was talking to my son last week, he asked me, "Which projects did you work on at Microsoft?" I started listing those, he goes, "Wow, you did a lot of things." So what happened was, this was 1994, so we were fairly early in the PC revolution, and I had no idea... I had been a software tester and network administrator at MIDISoft for a year and a half. But we had an ethical issue where our CFO, this was a public company by the way, our CFO would come to my office, when I had my network administrator hat on, and ask me to turn back the clocks on our accounting server a few days so he could get some extra sales into that quarter.

I thought this seems like a not right thing, and so after the second time he did it, I decided that I would go apply for some jobs. So in 1994, the way you did this was you faxed from your fax modem on your computer, I faxed a bunch of resumes to some contract companies. My phone was ringing off the hook the next day, all kinds of companies saying, "Hey, we'll talk to you." I interviewed with about three different places and got a job testing networking on Windows 95.

Another one of my great automation quotes, I thought they hired me because I was really good at test automation, and I wasn't but I thought I was. But they hired me because I knew a lot about networking, I didn't realize how much I knew about networking and how valuable that was.

So my manager gave me a list of test cases on my first day. I said, "Great. When would you like these automated?" And he said, "Oh no, we don't have time to automate these. We just need you to run them every day." That was Monday, and I had them automated by Thursday, so then I did exploratory testing in my spare time and found more things to do and they gave me work to do. That was great.

So I was a contractor and I started in January '95 and I didn't know how contracts work. It was a six month contract, so about five months go by, and I thought, "Oh, my contract's going to end. I better get another job. This was fun, but whatever." So I applied for a few jobs and actually got a job offer from Software Testing Labs, working for James Bach. Again, this was still 1995, June '95.

I came back and told my manager that, "Hey, my contract is up so I got another job, my last day will be blah." Apparently they liked my work because that afternoon, his boss was in my office asking me if I'd entertain an offer and I didn't even... It was so fast that I didn't even have a good poker face, I just said, "Okay, sure." The other job was a contract job, this was a full time job, and he said, "Well, if we give you an offer, do you think you'd consider it?" And I said, "Okay, sure. I'd consider it." The next morning, this all happens the next morning, a courier showed up at my door with a contract for Microsoft, then I was there for 22 years.

Eden: That's amazing.

Alan: So I've been very good at learning as I go, that's my one superpower is that I can just figure stuff out so I do that a lot. At first I did it mostly as a test developer, whichever acronym you want to use, and then later I spent a lot of my time on tools and infrastructure, including tests, including things like CI and cloud platforms and those things.

Then in leadership roles. I try and find that balance between knowing I need to learn a lot to figure out what to do and I very much live in the fake it till you make it, or on the edge of uncomfortable to get stuff done. Anyway, that's how I survived there. Then very quickly the products I worked on, Windows 95, Windows 98, Internet Explorer 2 and 3, what eventually was Windows 2000, I worked on Windows Millennium, sorry, Windows CE for a few years.

I worked at a central org called Engineering Excellence. I worked on Microsoft Lync and then Xbox One. Xbox One, probably some good stories in there. I joined the team, it was about nine people while everybody was working on Xbox 360. Unlike this model, we bootstrapped a lot of things and I bootstrapped a lot of things on the test and quality and build and infrastructure side.

Then after Xbox I worked on this dumb science project to make Android apps run on Windows Phone, then finally Microsoft Teams before joining Unity after that. I was counting for my son, 19 offices in 22 years and 12 different teams. It was a good time.

Eden: Of all the products that you've worked on and tested, do you have a couple that stand out as they were well engineered and easy to test? I don't know if easy to test is a thing.

Alan: Yeah. Well, the story on Xbox One is my favorite. I joined that team and again I mentioned bootstrapping things, I got some code coverage tools working early, some code analysis tools working, debugging tools. Just ramped up. A friend and colleague of mine was working a lot on how we were going to test and we were the only two test folks on the team. It wasn't a long time for us to bootstrap, a few months.

But we had a little reorg and my new manager asked me if I would like to run the tools team for Xbox One, and I said no, and he said, "Why not?" I said, "Well, let's be clear, I can be responsible for all of our tools but at Microsoft, tools teams get really beat up at review time so I don't want to do it that way." I had just read The Cathedral and the Bazaar, and I was excited about how open source projects worked, so I proposed that I run the tools team, the tools for Xbox One like an open source org.

We would get people somehow to want to contribute to the tools, and there were a few pockets of people on teams who were dedicated. But we'd try and make it a crowd project and see what would happen. That was fun, it ended up being a little bit of a leadership challenge, there was a lot of communication to get people involved and report back on what was going on, that generated momentum. All those kind of fun things.

But in the end we ended up with a whole bunch of test and diagnostic and build tools that are still used by the console today, because eight years later they haven't changed. One of the cool ones, and if you remember Xbox 360 had this, it was famous for its Red Ring Of Death where it would take an update and it wouldn't work, and it was bad. Always in the back of our mind, we never want to have anything like this, how can we avoid this? And how can we have the safest update procedure possible? So in the very beginning of Xbox it was just a PC, it was just a big, heavy computer that had a close approximation to the final hardware.

Of course the final hardware was going to be rebuilt by the vendors and soldered into the console that you see today. But at the time it was just a big, heavy PC, and because it was a PC with a regular network card in it, it could do a network boot, which is the way, at a lot of corporations, you get your computer set up. So it was handy. In the early days, we just used an automated system that on that network boot, it would download whatever the Xbox images were and run tests.

We could use that to get the machine set up for testing. That worked fine for like the first year, until we got the actual consoles or the prototypes for the consoles. Those did not have any sort of BIOS like a PC or any sort of network boot, but we did have a feature coming in that was going to be the future of how Xbox One would be updated in the field, in my room right over there.

So we leveraged that, we used that, and I guess it's the way most mobile phones work today: two slots. Your working operating system is in Slot A, then when you get an update you get a whole new one in Slot B. Then you reboot, we try and switch over with the settings and everything, and if it fails at all then we can roll back. Then now B is the active one, next time we'll do an A and we'll roll over.
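The two-slot scheme described above can be sketched as a small Python model. This is only an illustration of the idea, not how the Xbox One updater is implemented: the `Device` class, the build names, and the `boots_ok` health check are all assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Device:
    slots: dict = field(default_factory=lambda: {"A": "build-1", "B": None})
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def update(self, image: str, boots_ok: Callable[[str], bool]) -> bool:
        """Write the image to the inactive slot, try it, and keep the old slot on failure."""
        target = self.inactive()
        self.slots[target] = image   # the running system is never overwritten
        if boots_ok(image):
            self.active = target     # switch over; the old slot becomes the next update target
            return True
        return False                 # roll back: the active slot is unchanged
```

Because an update only ever writes to the slot that is not running, a failed update leaves the device on its last working build. The risky state is the one mentioned later in the conversation, where both slots end up bad, and this sketch does nothing to guard against that.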

So we used the Xbox update process to install new builds of Xbox in our lab. We had a lab of hundreds of Xboxes and we had a very, very cool test management system. That sounds so gross, not a test management system. But a way to run tests, we had a tool called XTP for Xbox Test Pass. On the command line you could say things like, "Run graphics tests. Run the smoke tests. Run all of the graphic tests that'll run in 10 minutes."

And it would do some selection for you on the backend. Basically, a SQL sort of command line you could use to be very clear on what test you want to run and it would go find machines for you. So these hundreds of machines were running all day, all night, running different tests for different people. There was a queuing system. It was really cool.
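A query like "run all of the graphics tests that'll run in 10 minutes" could look something like the Python sketch below. XTP's real syntax and data model aren't described in the conversation, so the catalog entries, the field names, and the `query` function are invented for illustration.

```python
# Hypothetical test catalog; names, areas, suites, and durations are made up.
CATALOG = [
    {"name": "gfx_render_basic", "area": "graphics", "suite": "smoke", "minutes": 2},
    {"name": "gfx_shader_matrix", "area": "graphics", "suite": "full", "minutes": 7},
    {"name": "gfx_stress", "area": "graphics", "suite": "full", "minutes": 30},
    {"name": "net_connect", "area": "networking", "suite": "smoke", "minutes": 1},
]


def query(catalog, area=None, suite=None, max_minutes=None):
    """Filter by area and suite, then take the shortest tests that fit the time limit."""
    matches = [
        t for t in catalog
        if (area is None or t["area"] == area)
        and (suite is None or t["suite"] == suite)
    ]
    if max_minutes is None:
        return [t["name"] for t in matches]
    picked, used = [], 0
    for t in sorted(matches, key=lambda t: t["minutes"]):
        if used + t["minutes"] <= max_minutes:
            picked.append(t["name"])
            used += t["minutes"]
    return picked
```

With this catalog, `query(CATALOG, area="graphics", max_minutes=10)` picks the two short graphics tests and leaves out the 30-minute stress test, while `query(CATALOG, suite="smoke")` returns the smoke tests from every area.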

But because, every time we ran tests, we put a new image of the Xbox on there, we were using this update system constantly, and we found dozens of critical, show stopper, Red Ring Of Death type bugs that were all fixed by the time... just because we wanted to use it for testing. I haven't searched on the internet, but I don't think there are any wide-scale reports, at least, of any sort of rings of death or failures after update on Xbox One. So I think that one worked out pretty well for us, and it was a lot of fun. Like I said, I'm really proud of the fact that a lot of those tools are still running today.

Eden: So was it just the testing infrastructure on the Xbox 360? And maybe it's just it was a few years earlier that it was just that infrastructure was different, and so there weren't-

Alan: Yeah, it was different and I wasn't around for 360 so I don't have all the details. But there was the unspoken mantra of, "No Red Rings Of Death. No Bricked Xboxes."

Eden: Because you guys knew what to protect against in this new iteration.

Alan: We knew the bricking, the red rings happened on updates so we had a... One, the solid plan was to do the Slot A, Slot B, which is pretty normal today. But then testing the crap out of it to make sure it worked because you could get into a state early on where both slots could be in a bad shape, and we didn't want to be there. So it was fun.

Eden: Yeah. I guess with a product as complex as Xbox One, you oversaw the design of all this testing infrastructure, especially around updating. I'm curious, where does that extend to when you have the actual controllers? Whether it's wired or wireless, or connecting to a television, or even physical, manual testing? In addition to all of this automation that you had for Xbox One, what else was part of the testing process?

Testing Products at Microsoft

Alan: Yeah. So one cool thing you can do when you're Microsoft is you have beta testers who are ravenous to get a chance to use your new thing for something like Xbox One. So we counted a lot on our beta program, we had so many different user focused stories of people going through and telling us, both monitored to just give us feedback on going through a workflow to see how things worked. But then thousands and thousands of people both inside and outside of Microsoft with beta hardware, playing what few games we had and both giving us direct feedback and us collecting a lot of telemetry on crashes or user experience issues they had when they were running.

There's the advantage of having a big customer base, you can crowdsource some of that. People may say, "Don't make your users do the testing." Well, they're beta testers, it's what they're doing.

Largely, they were being successful, but we don't wait for them to struggle and send us a bug report. We're actually getting a lot of that information automatically through our telemetry and monitoring, so it just helps having a big audience for those things to get that sort of testing. Then of course we all had... Every time there were new games available, and if you remember when Xbox One shipped, there were very few launch titles.

The game titles were way behind. They came out quickly and caught up, but for every game we had, whatever state it was in... We did a lot of beta testing, testing the beta version of Rise Of Empires, I probably got the name wrong, on the beta version of Xbox. A lot of us were playing that at home. We were playing dumb, little games at home that people built just to get the experience we could, because there's so much that goes with launching a game in the way... I'm not going to go into the full architecture, but there's multiple operating systems running at once on Xbox One. Just to make sure that whole thing worked seamlessly, it was a lot of dogfooding, self hosting it and just counting on the big beta audience to help us.

Eden: You mentioned earlier that you feel pretty strongly that developers and engineers should write their own tests. But it looks like for a lot of your career, you were also focused very much on test engineering. Would you consider that part, that you were a dedicated tester or it's not quite the same?

Alan: I absolutely was. I spent the first piece of my career as a dedicated tester, because the way we make software has changed and a lot of things have changed, but definitely the first part of my career I was absolutely a dedicated tester. I moved maybe for the middle part of my career, maybe even in some ways to today to, one, focusing on how people tested and then also building the infrastructure for that.

So I've built many different test distribution systems and metric systems and analysis systems that help enhance the testing. I kind of moved into that for a while and that moved very seamlessly for me to working on build systems and CI and things like that. Then today, just more of a management overhead role. So I've definitely spent a bunch of my time as a dedicated tester, but it's really just the move to more agile delivery methods where we're trying to ship frequent value more often.

It's more efficient to have the developers write those tests. Not necessarily all of them. There's a blog post I wrote where I took a lot of hate because I said not all teams need dedicated testers, and every exception in the world, I think, contacted me to tell me why they were different and that's fine. But there are a lot of teams and companies out there shipping pretty good software without dedicated testers.

There are also a lot of companies with testers shipping very bad software, so there's not a direct correlation. But I have seen inefficiency there, and even in Accelerate, the book by Nicole Forsgren and Gene Kim and others. There's data they have that shows that there is a high correlation of product quality with developer owned testing that does not exist when testers own that automation.

Nicole Forsgren goes on to say that teams still need testers for a lot of the exploratory things, again if you have a UI or if you have a complex application. For sure. If you're delivering a REST API, maybe you don't need a dedicated tester to help you with that, or if you're delivering lower-level code. My feeling is you do not need dedicated testers to do that work for you.

Eden: It almost sounds like you're saying that maybe engineers can own the testing infrastructure if the code or the tests are essentially automatable, but-

Alan: I'd say even better, because what happens is when the developers own writing the test automation, they write more testable code. If they know they don't have to write that automation, they don't care.

Eden: What about for interfaces and products like the Xbox One, where there is a significant physical component? Or a mobile app where not every test case is actually automatable? Then how do you balance the ownership of testing? Because it becomes this dreaded automation versus manual ratio that you're trying to strike with your testing strategy.

Balancing the Ownership of Testing

Alan: The answer to any sufficiently complex question is it depends. In our case, like I mentioned, we relied, and this is a good leading question into the cool stuff you're doing, but in our case with Xbox, for example, we just relied on a whole bunch of beta testers. Even for a view on product, we could do that but you don't always have that luxury. A tautology I have used a lot is you should automate 100% of the tests that should be automated. I think in most orgs, testers are automating too much or things that are too difficult to get right.

I think you want to design your software so that the vast majority of testing can be done at a level below the UI, then when you're testing at the UI you can do a couple different things. It could be a little bit of random testing like moving stuff around, clicking things, or just sanity to make sure things keep working. I go back to the story of me writing an automated test to test every single permutation of a data entry dialog box.

We don't want to do that. The design challenge there is what are the set of tests that'll get us the coverage we need at this level? Where I think people fall off the wagon, a slight metaphor or not, is when they want to automate everything through the UI when not everything needs to be automated through the UI. Unfortunately, I think a lot of testers are given that charter, like, "Here's our webpage, here's our application. Please test it."

And what they really should get if they're going to be writing automation is, "Here are all the tests that exist for a testing level below this. Let's make sure the UI is working as well." And of course you don't want to put logic in the UI, but you do want to make sure it does the right things in the right places and things don't get cut off and things like that. So those sorts of things are very prime for automation.
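
As a rough illustration of testing at a level below the UI, the sketch below keeps the rules for a data entry form in a plain function and runs the permutations there, so the UI layer only needs a light check on top. The function `validate_entry`, its rules, and the sample inputs are hypothetical and not taken from any product discussed in this conversation.

```python
from itertools import product


def validate_entry(name, quantity):
    """Hypothetical form rules, kept out of the UI so they can be tested directly."""
    errors = []
    if not name.strip():
        errors.append("name is required")
    if not 1 <= quantity <= 100:
        errors.append("quantity must be between 1 and 100")
    return errors


def run_permutations():
    """Exercise every combination of sample inputs without opening a dialog box."""
    names = ["", "   ", "widget"]
    quantities = [0, 1, 100, 101]
    return {(n, q): validate_entry(n, q) for n, q in product(names, quantities)}


results = run_permutations()
```

With the permutations covered here, a UI-level test could be limited to confirming that the dialog shows whatever errors this function returns.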

Eden: Yeah, I think you make a very interesting observation. That's something that I've seen a lot, which is the kinds of test cases that we're presented with at Mobot to solve, there are sometimes where I think, yeah, with slightly more robust testing infrastructure, there are probably things that shouldn't be tested at the UI level. But it manifests that way through a push notification, or through some sort of multi device interaction, so sometimes it's just easier to cut that corner, easier to just test it physically, easier to just test it at the UI level. I can see how there needs to be a more informed way of just separating those use cases, because not everything needs to be tested at the UI level. You're right.

Alan: Yeah. There's a great example of where I wouldn't. Say you do something on Device A that eventually causes a notification to happen on Device B. That's really hard to test at the UI level, but you could monitor APIs, you could say, "I set this thing, I can wait for this post over here. Okay, I got it." Then the thing you want to test, you can even test that locally with a mock, is just post to yourself, you get a message and that would be, I think, a more efficient test.
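
A minimal sketch of the local mock idea, assuming a hypothetical in-memory `FakeNotificationService` in place of a real push backend: the test performs the action from Device A, then waits for the post addressed to Device B, with no UI involved. All names and the payload shape are invented for illustration.

```python
import queue


class FakeNotificationService:
    """Hypothetical in-memory stand-in for a push backend."""

    def __init__(self):
        self._inboxes = {}

    def register(self, device_id):
        self._inboxes[device_id] = queue.Queue()

    def post(self, device_id, payload):
        self._inboxes[device_id].put(payload)

    def wait_for(self, device_id, timeout=1.0):
        # Raises queue.Empty if nothing arrives within the timeout.
        return self._inboxes[device_id].get(timeout=timeout)


def share_item(service, recipient, item_id):
    """Hypothetical action on Device A that should notify Device B."""
    service.post(recipient, {"type": "item_shared", "item_id": item_id})


def check_share_notifies_recipient():
    service = FakeNotificationService()
    service.register("device-b")
    share_item(service, recipient="device-b", item_id=42)  # "I set this thing"
    message = service.wait_for("device-b")                 # "I wait for this post"
    assert message == {"type": "item_shared", "item_id": 42}
    return message


received = check_share_notifies_recipient()
```

What the app does once the message arrives, such as following a deep link, is a separate concern that this check deliberately leaves out.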

Eden: Yeah. The push notification itself, or the data pass through the push notification part, yes, that's not the hard part to test. But we've seen the deep links that happen afterwards. But it's true that in those situations, sometimes when you're thinking, "I just need to test the push notification," you're not separating out the different components of is it the data being passed into the push notification that you're testing?

Is it the deep link and the way that it's opening up the app from the background that impacts? There's so many other details, and so breaking out a test case into its different components, especially if you're a tester, I think sometimes it's so tempting to just approach it at the UI level. That's something that I've definitely seen as a pattern.

Alan: Yeah, it is. You want to keep in mind, what is it that you're trying to discover or figure out? Asking, "What do I want to know from writing this test?" will help you. I think sometimes folks get caught up in, "I'm going to make it do this, I'm going to make it do this, I'm going to make it do this." Then if I ask them, "What do you want to learn from these tests?" they don't know. I think if you start with, "There's an important thing I need to learn from writing this automation," you may be in a better place to write better automation.

Eden: Yeah. I guess one question I have for you is you're encouraging and mentoring engineers to write better testing infrastructure and approach testing coverage in a more holistic and comprehensive way, but how do you balance that with the business needs, the desire to move faster? You have this brilliant, talented engineer who could spend more time writing more features with less test coverage and you might be able to get away with it, and I think that's the temptation and the trap we all fall into. So how do you encourage people away from that? Or encourage people to think about that?

Alan: Well, one, they're wrong. It's interesting, because the best developers I know are already writing a lot of tests. It's often the ones who think they're really good and can move really fast who don't realize they would do better with tests. It's okay, a lot of times, sure, maybe I can write a whole bunch of code that doesn't need very many tests or any tests and it works and it's great.

But what happens when I need to go back and change that code and there's no tests for it? Or add something to it? I have so many stories in my early years at Microsoft where I would find a bug in some component and they'd say, "Oh yeah? Unless it's really bad, we're not going to fix it because nobody is confident enough to change that code. Everyone is afraid to change that code because it's so brittle and nobody knows it."

The reason tests exist is so we can do that refactor. So I guess as long as you know the wonderful code you wrote is never going to need to be changed or anything added to it or anything else, sure, maybe you can get away without any tests. But I don't think that's realistic. So the way you do it, is a little bit of building a culture of expectation.

You have to build an expectation that done means tested, and done doesn't mean tested by... I've thrown it over to the tester. But done isn't done until it's tested. What I'm finding is over the years of doing this, it's much easier, it's gotten much easier to have developers who understand that they are expected to write a big chunk of testing. It was harder 10 years ago.

It could just be my little bubble and circle, but it's really just that expectation, done means tested. You start with that and maybe there's some conversations because, as Weinberg says, "Eventually in software, it's always a people problem." So maybe you have to have some conversations and talks to help people get over that hump a bit. But it's still important, though.

Eden: I guess when do you decide as an engineer that a test case is not worth automating? You were alluding to it just now with if it catches a bug that you're not even going to fix, even if that test case flags something, or if you're trying so hard to automate it but it's just not working exactly as intended. What are the things that you think about when deciding whether or not a test case is worth automating?

What Makes a Test Case Worth Automating?

Alan: Well, I think you can apply a bunch of heuristics to that. I think most test cases, there's a reason you want to automate it, like, "I need to learn something. I'm worried something..." There's some reason you want to automate it. After you automate it, it may be flaky, it may not quite work, it may take too long. Often though, I'm going to answer your question in a very roundabout way here, I'm reflecting really quickly here while I talk, but any test I've ever written that's been very difficult to write is because the code has been difficult to test.

If that code is difficult to test, it's always, again in my experience, always difficult to change later, it's difficult to code review, it has all kinds of other difficulties. Basically, you're in this weird paradox where it's difficult to write a test for this thing and the reason is all the clues point to this code being difficult to test and probably buggy. Again, this is a great example of why developers should write tests for their code, because that code needs to be refactored.

The vast majority of testing should be easy to write. Now, when it gets fun and difficult, here's the challenges that I want to mark my stick on, is the multi machine testing. I was working with someone who was putting together testing for a multiplayer framework. That's a whole new animal, that synchronization between things. That needs some work. That's difficult to test, but also needs to be tested.

I think for the vast majority of things that are difficult to write automation for and find value, I'll get back to your first question in a second here, is because the actual underlying code needs to be refactored. Then the other thing, maybe related to your question, is I mentioned the story about hundreds of hours of testing and not getting that much coverage. It doesn't mean all those tests are bad or anything. What it means is we probably don't need to run all those tests.

One thing also we did on Xbox which I liked was we could limit the number of tests by time and we could prioritize tests. We prioritized tests not by having everybody assign them a Priority One, Priority Two, Priority Three, because when you do that everybody makes their test Priority One. You just build a little heuristic engine that figures out which tests you want to run all the time.

The tests that get lots of coverage, that run really fast, and that find bugs once in a while. Those are the very best ones. It doesn't mean the tests that don't find bugs don't need to be run, but let's say we decide we're only going to run two hours of tests. So we make our best guess, we run the fast tests that cover the most breadth across the product. We'll assign a couple things.

We can collect a bunch of data around those tests, and metadata like did it find a bug, how fast did it run, what area is it in, has this test ever found a bug, when was the last time this test ran. Because when we choose not to run a test, and this is the big thing that happens with everyone, developers and testers running automation, they feel like if they write a test, they must now run that test on every build of the product from now until time ends, because the moment they stop running it, a bug will appear there.

That's a fallacy and they're imagining something. So I have seen tests that have run and never, ever failed for 10 years on a product. Turning that test off, it's passed for 10 years, it's going to pass for 10 more whether I run it or not. It's a tree falls in the forest thing. But to satisfy those people, you can also in the test selection heuristic, you can factor in how long since I've run this test.

So by choosing not to run a test, it becomes slightly more valuable to run the next time. If I skip it again, it becomes slightly more valuable, and eventually I run it again, it passes, it goes back to the back of the queue. But this prioritization thing, I think it's okay to write 100 hours of tests, or 200 hours of tests or 1,000 hours of tests, but don't try and run them every single build.

You need some prioritization. That's the next trick in selection, I think. I would bet 90% of the testers and developers I talk to believe that once they write an automated test, that test should run as often as possible until time ends, and that's just not true.
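
The selection heuristic described above could be sketched roughly as follows. This is not the Xbox team's engine: the `Candidate` fields, the weights in `score`, and the greedy fill in `select_tests` are all assumptions chosen to show the shape of the idea, namely a time budget, a preference for fast tests with broad coverage and a bug-finding history, and a staleness term that grows each time a test is skipped.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    runtime_min: float     # how long the test takes to run
    coverage: float        # rough fraction of the product it covers, 0..1
    bugs_found: int        # bugs this test has ever found
    builds_since_run: int  # builds skipped since it last ran


def score(t):
    """Hypothetical weighting: value per minute, nudged up by staleness."""
    value = t.coverage + 0.1 * t.bugs_found + 0.05 * t.builds_since_run
    return value / t.runtime_min


def select_tests(tests, budget_min):
    """Greedily pick the highest-scoring tests that fit in the time budget."""
    chosen, used = [], 0.0
    for t in sorted(tests, key=score, reverse=True):
        if used + t.runtime_min <= budget_min:
            chosen.append(t)
            used += t.runtime_min
    return chosen


def update_after_build(tests, chosen):
    """Tests that ran go to the back of the queue; skipped tests get staler."""
    ran = {t.name for t in chosen}
    for t in tests:
        t.builds_since_run = 0 if t.name in ran else t.builds_since_run + 1


suite = [
    Candidate("fast_broad", 5, 0.40, 2, 0),
    Candidate("slow_deep", 90, 0.30, 1, 0),
    Candidate("always_passes", 30, 0.05, 0, 0),
]
first = [t.name for t in select_tests(suite, budget_min=100)]
```

With these made-up numbers and a 100-minute budget, the test that has never found a bug is skipped at first, then displaces the slow test once it has sat out a couple of builds, and drops back again after it runs.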

Eden: It's really nice to meet someone like you with this perspective because I think that's something that we encounter a lot, right? We use mechanical robots, testing real phones, and you're not going to be able to test everything all the time. We're not at a point where the robots run 24/7/365.

Alan: Yeah, it's even more important for you, right? Because there's a finite... They run pretty fast, I've seen the videos, they're really cool. But still, there's a very finite amount of testing the robots can do so you can't throw everything at them.

Eden: Yeah, you have to choose the tests that matter and that means designing a good strategy and understanding the product well enough to be able to pick the right tests. Ultimately, this kind of testing, you don't want to run the really silly test cases that are going to pass. You want to use the right tool for the right job, and so, yeah, it's really refreshing to get to talk to someone who understands that because I have this conversation all the time. It's like, "We need to be able to test everything or it's not worth it."

Alan: Oh boy, yeah.

Eden: Alan, I have really enjoyed our conversation. I could honestly nerd out about this topic for a lot more time. But I want to be respectful of your time and, who knows, we should do this again another time.

Alan: I'm happy to come back for part two some day.

Eden: Yeah. I'd love to get into some of the other stories and the other operating systems that you were building at Microsoft. Honestly, we didn't even get to touch on any of the work you did at Unity, so I'm sure you have so many other testing stories and best practices.

Alan: I have a lot of stories. If it's one thing I have in 30 years of tech, it's stories.

Eden: We'll have to do another episode soon, but thanks so much for joining me on the podcast, Alan. It's been awesome.

Alan: I'm honored to be here. Thanks for inviting me. It was great talking with you.