Software Development with Darren Platt - developing internal software for labs
Updated: Sep 21, 2020
Hi everyone, it’s Amber Shao, Founder and CEO of AduroSys, a laboratory data management software company. Welcome to the AduroSys Lab Software podcast. Joining me today is a very special guest, Darren Platt. Darren is Chief Information Officer and President at Demetrix. One of his roles is to oversee the team that handles data management and analysis for Demetrix’s cell engineering platform. Prior to Demetrix, Darren was VP of Data Science at Amyris and Head of Research at 23andMe, and led the computing efforts at the DOE Joint Genome Institute and at Exelixis, overseeing several multi-million dollar software projects and large teams of software developers in biotech settings covering DNA sequencing, consumer genomics, and synthetic biology.
It’s my pleasure to introduce Darren Platt. Welcome, Darren.
I’m very excited to have the opportunity to talk with you today. You have decades of experience in computer science, informatics and genomics and worked for many biotech start-ups and major genome centers. You and your team have built many software products used in biotech companies. There are many topics I’d like to discuss with you around building software in biotech industries and we’re going to talk about them over the next few episodes.
Amber: First, maybe you can tell the listeners a little bit about the software that you had built over the years.
Darren: I'm very happy to. I remember somebody very early on in my career saying they weren't sure that computational biology would ever really be a thing, that maybe you'd never be taken seriously by a computer scientist or a biologist. And I'm happy to say that there's been tons of work to do at that interface. I've typically been working with the lab, mostly experimental lab science, and trying to connect that to computing. The exception was 23andMe, where there was no lab and the interface was more with consumers. That work has involved everything from basic algorithms for designing experiments or reagents, sometimes just helping people work out what they're going to do tomorrow, to collecting huge amounts of data and trying to understand what's going on in that data, and then an enormous amount of software to actually run a lab. People often ask, why do you even need software to run a lab? A typical lab might move somewhere between 100,000 and a million bits of liquid around in a year. And I think it's a dirty little secret in my field that everybody gets signed up to do the big data and machine learning because that's the really exciting part, but ninety percent of my time, and the pain, ends up being just getting the data collected and labeled in a way that's actually going to make that possible. And I always tell people that sometimes the most useful thing you can do for a scientist is just help them design a primer or organize the next experiment. So the work can be very, very basic, but it's pretty hard to run a modern lab without software.
Amber: So out of all the software that you have built, including some that you and your team are working on right now, what is the hardest software that you have built?
Darren: I can definitely remember some pretty painful experiences, and there are hard-won victories when you finally roll out a piece of software and feel really good about it. It's usually because it was so difficult to get there. And I think some of the hardest stuff wasn't necessarily that complicated, in the sense that if you said, look, this is what we need to build, and gave it to some software engineers, they could get something pretty quickly. But we were trying to work out what those requirements were with the users, and they were changing every day. And then we were trying to replace a live running system with a better version. That can take you 12 to 18 months just to understand what's needed, and then it's really painful swapping out how the data systems work and how the software works. And then often you have to actually change the human system: you're going to have to get people to behave differently once this thing is in there, and then they change their mind as you go along. That can be really difficult. And one of the hard parts of that is, I think, that to be successful, sometimes you've actually got to change the users. They come to you and say, you know, I need this really, really specific thing. And you take a look at it and you think, if I actually build that, it's probably not going to be very useful. I think there's a better way of doing it. So you have to go back and forth and sort of argue with them, just to get to the right thing. You've got to remember that software is codifying a way of working. Quite often it's actually going to control that lab once you roll it out. And when you're making software, you're making a set of decisions. So you're literally helping somebody organize their life when you write that thing. Then there are other types of software where the idea sounds very simple, but conceptually it's very, very difficult.
I think probably the hardest thing I deal with at the moment is how you represent things like the design for a piece of DNA: all the different things a user can think of building, representing all the parts and the reagents. You've got to sit down, talk to them, understand what they need, and then you've got to imagine the future and all the things they might do. Then you need a very powerful and flexible way of representing that in your software. So there are lots of different ways that software can be hard.
Amber: I'm sure in all that hard work, there are also fun things to do. So give us some examples of the coolest software you wrote.
Darren: I think the coolest thing was probably a DNA compiler I built. This is basically a piece of software that can take the language that a scientist might write down, even on a whiteboard, when they say this is the design I want for my DNA, and it translates that into all the materials, reagents and designs needed to actually build that DNA. So it's like a compiler where the input is genetics and the output is DNA. And it's kind of funny because it started off as a nomenclature exercise. We were originally just trying to help them be regular about writing their notations. We were just going to check and make sure they followed the rules. And then we realized, going into it, that if you can check the notation, you can parse it. If you can parse it, you can translate it. If you can translate it, you can generate something. And so I sat down with the biologists and said, look, literally how do you do your job? What are the design rules you're using? And then we tried to code these into software. Initially they didn't necessarily like the designs. They'd say, oh, that's too long, this is too short. I'd make that a little higher melting temperature, then I'd move that around. But just by sitting down, iterating and improving the software, we eventually got to a point where they would just trust it, and they would let it spit out thousands of designs, and then they would want it to do even more. It's very satisfying being able to take that very complex human activity and actually reduce it to software that could help them.
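[Editor's sketch] The progression Darren describes (check the notation, parse it, translate it, generate something) can be illustrated with a toy example. This is not the compiler discussed in the interview. The design notation (part names separated by semicolons), the naming rule, and the part names and sequences in `PARTS` are all hypothetical placeholders invented for illustration.

```python
import re

# Hypothetical parts library: the names and sequences are made-up
# placeholders, not real promoter, gene, or terminator sequences.
PARTS = {
    "pDEMO": "ATGCATGC",
    "gDEMO": "GGGTTTAAACCC",
    "tDEMO": "TTAATTAA",
}

# Assumed naming rule: a lowercase type letter (p, g or t) followed by
# an uppercase/digit name.
NAME_RULE = re.compile(r"^[pgt][A-Z0-9]+$")


def check(design: str) -> list[str]:
    """Step 1: check the notation and return a list of rule violations."""
    errors = []
    for token in design.split(";"):
        token = token.strip()
        if not NAME_RULE.match(token):
            errors.append(f"bad part name: {token!r}")
        elif token not in PARTS:
            errors.append(f"unknown part: {token!r}")
    return errors


def parse(design: str) -> list[str]:
    """Step 2: parse a valid design into an ordered list of part names."""
    errors = check(design)
    if errors:
        raise ValueError("; ".join(errors))
    return [token.strip() for token in design.split(";")]


def translate(design: str) -> str:
    """Steps 3 and 4: map part names to sequences and generate the DNA."""
    return "".join(PARTS[name] for name in parse(design))


if __name__ == "__main__":
    print(translate("pDEMO; gDEMO; tDEMO"))
```

Each stage reuses the one before it, which mirrors the point made above: once the notation can be checked, parsing and translation follow with little extra machinery. A real system would also have to encode the biologists' design rules, such as length and melting temperature, which this sketch leaves out.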
Amber: So there are quite a few commercial software products aimed at scientific industries. What are the advantages and disadvantages of building software in-house as opposed to purchasing off-the-shelf solutions?
Darren: Good question. That's probably one of the questions I get most from people, and it's one of the reasons I'm very excited about AduroSys, because really, building your own software is incredibly painful. That said, I've certainly been involved in those efforts, and I'd say the great thing about building your own software is that it's like designing your own house. You can make it exactly the way you want it to be, so you can really customize it. And it probably feels a little bit like you can change it. That may or may not be true, but you feel like you've got some control over it. You can get up in the morning and make it do something different, and you can own the data structures. The data's in your database and you feel like you can extend it. If your users are particularly fussy over something very peculiar about how you operate, you can craft the software to exactly how they think it should be. That may be a good thing or a bad thing. And then occasionally your business may just have something so specialized that other people don't do it. So, you know, if you need to design really complex pieces of DNA and you need algorithms for that, then that's something maybe you're going to have to do yourself. I think the downside of doing it yourself is that it's always slower than you think it will be. You sort of think, maybe a month or two with some person over there in the corner and I'll have a prototype. It ends up being quite expensive. It's not uncommon in a biotech setting for a single FTE to cost a company maybe a quarter of a million dollars. So if you have four people working on software for a year, you've just spent a million dollars. But that's kind of hidden, because it's in people rather than a check you're writing to somebody. You're also suddenly going to have to become a really good recruiter. You're competing with Google and Facebook to hire the best software engineers.
And then you need a whole mixture of different things. You want people who can write good software, but you also want them to be able to talk to biologists, understand what they want, and then build it. And it's also really hard if you haven't done it before, so you're probably going to need to recruit people with a lot of expertise. I get bugged regularly: can you help me design this thing we're building? And, you know, I'd love to help somebody, but there are only so many hours in the day. So really, this is going to get rebuilt over and over again by people who are learning for the first time. I think that can be a good outcome. It can also be fairly painful.
Amber: So once a group has decided to build some software in-house, what do you think they need to consider in preparation to make this work?
Darren: I think actually, whether you're going to build it or even if you're going to try and buy it, it's really important to have a stable process in your lab that represents what you're trying to do. Otherwise it's sort of a fool's errand trying to build a system. Think of LIMS software in particular as a sort of virtual mirror of a physical thing that's going on in the lab. If that physical thing isn't particularly stable, if it's changing and you haven't sorted it out yet, you haven't labeled everything, you don't have names, you haven't decided where to store stuff, then you don't really have requirements. And you need requirements to build good software. I think sometimes people think, my lab is a mess, but if I bring in software it will fix everything. Software on its own can't save a poorly designed lab workflow. Software doesn't get people to agree on one way of naming things. So there's a human element to implementation and you're going to have to solve that. The more cohesive your team is before you decide to get into the software business, the better it's going to go. And then I think you need to be realistic about time and budget and when you need the software. If you hire somebody tomorrow and say, I need a database in a month, it's probably going to be a fairly simple database. And it may be a year before that thing is really going to be in a place where, you know, you can rely on it. In the meantime, users are going to be using spreadsheets and workarounds. And then be realistic about how many people it would take to build it. Plenty of companies get started with the one guy or girl in the corner who's just sort of doing software on the side, and then it becomes more their full-time job and they have a one-person software engineering team. That person gets kind of overwhelmed and things slow down. And maybe they even leave, and you don't know how the software works.
Realistically, it's probably going to take a group to do it. You need to find that person who can lead that group, and they've got to hire other people. And I think it's worth asking, how exotic is your data management problem? Are you representing some sort of weird brain images or something that nobody's ever thought about storing before? Or are you tracking clear liquid moving in 96-well plates around the lab? I think the closer your problem is to something other people have tackled, the more likely you are to benefit from buying software that's already been used by other people. I think there are also just different classes of problems you're going to be dealing with. There's what I'd call tall, skinny data: maybe you just have tens of thousands of 96-well plates going through a very similar process every day, and the problem is how you handle that much data. Or maybe you've got a really messy problem where no sample ever gets handled the same way and you just need a lot of flexibility. So you think about all of those things. And either way, again, whether you buy it or build it, you're going to have to work out the requirements and work out ways of storing those data. If you do it yourself, what are the database representations? It's a lot of work, and then you need to build a large enough team that it's sort of self-sustaining. The one computer-literate biologist writing it on the side is not going to build a system that a second person can understand. And it's not uncommon for a biotech to end up with pretty bad technical debt: systems that everybody is a little bit scared to touch, where nobody really knows how it works anymore because it was done quickly. If you're going to run a professional software operation, with testing, continuous integration and modern software engineering practices, you have to attract those people, and then you're often competing with the tech world to hire them.
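[Editor's sketch] As a rough illustration of the "tall, skinny" case, where many 96-well plates pass through the same process, one common way to store the data is a single narrow record per well per step. The record fields, the `record_plate` helper and the example IDs below are assumptions made for illustration, not a description of any system mentioned in the interview.

```python
from dataclasses import dataclass
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]  # rows A to H of a standard 96-well plate
COLS = range(1, 13)         # columns 1 to 12


@dataclass(frozen=True)
class WellRecord:
    """One narrow row of data: a single sample in a single well at one step."""
    plate_id: str
    well: str
    sample_id: str
    step: str


def plate_wells() -> list[str]:
    """Return all 96 well names in row-major order, A1 through H12."""
    return [f"{row}{col}" for row in ROWS for col in COLS]


def record_plate(plate_id: str, step: str, samples: dict[str, str]) -> list[WellRecord]:
    """Turn a {well: sample_id} mapping into one record per occupied well."""
    valid = set(plate_wells())
    unknown = sorted(set(samples) - valid)
    if unknown:
        raise ValueError(f"not valid 96-well positions: {unknown}")
    return [
        WellRecord(plate_id, well, samples[well], step)
        for well in plate_wells()
        if well in samples
    ]


if __name__ == "__main__":
    records = record_plate("PLATE-0001", "transfer", {"A1": "S-17", "B3": "S-42"})
    for record in records:
        print(record)
```

Because every plate produces rows of the same shape, tens of thousands of plates simply mean more rows in one table. The messy case Darren contrasts it with, where no two samples are handled the same way, would need a far more flexible representation than this.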
Amber: Well, there are definitely a lot more questions I want to ask you, Darren, but unfortunately, we're out of time for today. But we have more episodes planned coming up. So stay tuned. Thank you for your time today.
Darren: Pleasure. Thank you.