This is the second episode of Stack Stories, our new podcast, where we highlight the world's best software engineering teams and how they're building software. Hosted by Yonas (CEO, StackShare) and Siraj (of Sirajology). Follow us on SoundCloud.
What if all your favorite gadgets, from the phone in your pocket to the chair you’re sitting on, cost less? As it turns out, that's happening right now. And we have Flexport to thank.
Having just closed a $65M funding round, Flexport has brought some of Silicon Valley's best engineers on board to disrupt international freight, and is well on its way to dominating the trillion-dollar market.
So how do you scale your tech stack to handle a massive market like international freight? We sat down with Amos Elliston (CTO), Evie Gillie (Engineer #3 and Engineering Manager) and Desmond Brand (Engineering Manager and React Guru) to hear all about the move from pure Rails to a React-driven app, the growing pains along the way, and future plans to scale with new tools like GraphQL, Relay, and Docker.
Listen to the interview in full or check out the transcript below (edited for brevity and re-ordered to make more sense).
- Background and MVP
- Current Stack & Architecture
- Data Challenges
- DevOps and SaaS Tools
- Team Structure & What's Next
Background & Minimum Viable Product (MVP)
We use a couple of things. The black magic one that I hadn't seen before I joined this company is FullStory. Because we have fewer users with larger transactions, you get a lot of value out of watching sessions. That's been really valuable, and also for figuring out if a bug was user-impacting or if some part of the UI is confusing, so FullStory is definitely a good tool that we get a lot of use out of. We also use Periscope Data for all our analytics. A lot of people probably know what this is: it basically looks at your replica database and lets you build graphs off of it. We'll push events out to event logs as well to figure out what's happening in user flows, and it basically runs SQL queries that are graphed over time.
Amos: Hi I'm Amos Elliston. I am the CTO of Flexport.
Evie: I'm Evie Gillie. I was engineer number three.
Desmond: I'm Desmond Brand, I don't know what number engineer, but I've been at Flexport for about a year.
S: Awesome. Thanks guys. I just wanted to start off with: what is Flexport, in your own words? When I saw it, I thought freight shipments, that's not that sexy. But it must be, right?
A: So Flexport is an international freight forwarder and customs brokerage. That means we move goods usually over 150 kilograms in weight from one country to another. 150 kilograms is interesting because it's literally the size of freight that's too big to fit into a UPS or FedEx network, so you need someone like us to make sure it gets from a factory in China to the US. We're like travel agents for freight.
D: The thing I think about that's cool is the way we look at the problem. Freight forwarding is an information problem. We get information in about what needs to move. We give information out about what trucker to pick up something where and drop it off at this port, etc. Everyone else kinda just throws bodies at the problem, but we use information technology to solve an information problem. I don't know if that's actually supposed to be novel but it feels like it is in this industry.
Y: Yeah, you're moving tons, with software. That's pretty cool.
E: Very novel. For example, my dad tried to ship a container full of equipment a couple of years ago, and it was like, he could track his Domino's pizza way better than he could track his multi-thousand dollar business shipment.
S: We're really interested in your engineering culture and the kind of team that you guys want to build here. Just right off the bat, how would you describe the engineering culture here at Flexport?
A: Technically we have a pretty high bar. I think everyone says that. We believe it, and we're all really committed to working together to try to build this thing in a very collaborative environment.
E: We're fortunate enough to have the people who are the real power users of our product; the people running the shipments, sitting 20 feet away from us.
If you think of the travel agent of the shipment, the person handling all the handoffs and making sure everything's going okay, those are the Operations Associates and they work on the same floor as product and engineering. If an engineer is building product and they don't know what it should look like or what they're even doing because a lot of the engineers are new to the freight space in general, they can go and be like, "Phoebe what is this supposed to do?" The iteration loop is tight.
Y: Just as far as roles, you're just overseeing everything tech-related.
A: Yeah. I am the CTO, and I oversee tech and product as well. I make sure we get everything out the door on time, quickly, and well. Quality is very important.
D: I'm an engineering manager, and I guess I am focused a little bit on the front end kind of stuff, and the bugs process as well.
E: I’m also an engineering manager, but leaning more toward the backend.
S: Okay cool. If I heard correctly, you are a React expert. Is that a valid moniker?
Y: We'll circle back to the stack in detail.
S: When you come in to work what is that bigger vision that you have? What is that goal?
D: Yeah, good question. We set goals on a quarterly basis. We have some big pieces of product that we're trying to build in the next 90 days. One of those happening right now is bookings, which is this thing that will generate a bunch of shipments. As we grow we also need to lock down some of our sites, so we're doing this permissions project.
On the technical side, we have some things going on in the front end that I think will make it more difficult to collaborate when we have more engineers. I'm trying to get ahead of that and set some better practices. One thing I'm experimenting with now is replacing some of our Flux stores with GraphQL and Relay, in the hope that we'll have a little bit more of a scalable system if that turns out to work. There are a lot of different things on my mind for the long term. That's some of them.
E: Super excited about GraphQL and Relay.
Y: Okay, we're going to have to get into that. Maybe we could just talk about how Flexport started? What was the motivation? Where did the idea come from?
A: Sure. It started with our founder Ryan. He had been doing import/export for 15 years. He used to import ATVs and walk-in bathtubs which are these bathtubs for old people so they have doors on them. You get in, close it, fill up the tub. They're pretty sweet. I kind of want one for my house.
Also I think they sold in-room saunas. Ones that you could buy for $5,000 and get installed in your house. Some random stuff. He used freight forwarders and customs brokers that were terrible. They didn't even have a website. He showed us some of these things. It was literally like they had just discovered the blink tag. He was like, "God. There's got to be something better. I bet I can build it." He started a customs brokerage. Did YC Batch 2014 Winter. His original app was done by some people in the Philippines, so that was Rails.
Yeah. That was the reason we picked Rails because that existed when I came to the company which was right after YC. Don't worry, we rewrote the whole thing. There's no consultant code in there. Git-blame is gone. It's all well thought out code now.
Y: Right. Were you in Rails before? Had you dealt with Rails prior to coming to Flexport?
A: Yeah, so in 2006 I helped start a company called Geni. We were a genealogy company. It's like "do your social family tree." We were trying to be Facebook for families.
We also spun out Yammer which was Rails stack way back in the day. Yammer started 2007 on Rails.
Y: Yammer spun out of that company?
A: Yeah. Yammer started with Rails 2.1. We actually started basically right when GitHub had launched so we used Git which was pretty advanced at the time. Git was like the Flow, or the GraphQL of its day. We used Rails 2.2, 2.3 maybe, which was very cutting edge. I have lots of experience with it. Now some of my experience is very dated. I think they've moved on. I'm a little old school.
If I was going to start another company... I don't know if I would've picked it to be honest.
Y: Really? Why?
A: I don't know. It's an interesting choice. It's showing its age a little bit and there's some other interesting frameworks out there. I don't even know what I would have chosen. We talk about this all time. Would we have done Node? Would we have done Java? Would we have done Scala? It's a discussion we have a lot.
Y: You'd just have to go to StackShare and see what people are using.
A: Yeah, I mean I'm always curious about what are all the kids doing nowadays. What's the hotness? Is it Node?
E: Go is big.
A: Yeah, we talked about that. I still really do like Rails.
Y: Okay so you're still a Rails fan. Do you have other Rails fans in here? No?
E: I came in a skeptic but I'm kind of a fan now. There's just been so many moments when I need to do something, and there's already five gems.
Y: That's one of the biggest advantages, is everything has been done for you.
A: Yeah. She misses all the typing in Java. Her hands feel really-
E: That's all it took to make a HashMap? That doesn't feel right.
A: At least 15 lines to open a file.
Y: The MVP was a Rails app. Was it just basic functionality, like you could get a freight container from A to B and that was it?
A: It was pretty much just a database of "this is what's happening right now." The basic functionality was almost a database admin interface. It'd be like, hey, here are all our shipments, which is represented by 4 or 5 tables in our database, so it was very bare bones.
E: Quoting was in Excel.
A: Quoting was in Excel. That was probably our first real big feature. We took an Excel spreadsheet that was mind-blowingly complicated, like Excel on steroids and then made it an app which increased the ability for us to quote. We went from 5 hours down to 2 minutes so it was a really big win.
D: Didn't people also used to track shipments in Trello instead of the app?
A: They used to track shipments in Trello so, our app we built a "Trello killer" which was great.
Y: Wait. Oh, you mean internally?
A: Internally right. It was for our workflow. It was a Trello killer for our workflow and we killed it successfully. They’re much happier than they were when they were using Trello.
Y: Got you. Maybe you could just walk us through the entire experience. You're dealing with all these folks that want to be able to move containers across the country. Right?
Y: They're looking for companies that do that.
A: Usually it's people moving stuff from China to the US. What happens is a factory in China will say we have this stuff ready for you to come ship. Come pick it up at our factory. You can’t drop it in the traditional mail because it’s pallets worth of goods. So they need someone like us to help.
Y: Then you're probably doing a bunch of web scraping and consuming some APIs because you need a good data source right?
A: Yes, to get that data. To figure out at any given time where the freight is, we have to do some scraping. We're trying to integrate with more and more APIs.
Y: You're probably building what amounts to an API just scraping right?
E: Yeah... That's fair to say.
Y: You're structuring the data and then you have to feed it to-
D: I call scraping a hostile API integration.
Y: That's funny. Hostile. Okay so that doesn't sound too bad. Rails sounds like the perfect fit. You're doing a little scraping.
A: It does. The great thing is we monetize such a high percentage of our traffic that we don't have crazy amounts of scale. It's not like you need these high-concurrency websites. We can do a lot of work off a couple of servers. That's one of the big complaints, that Rails is kinda slow. It's not an issue for us.
Y: Got you. That V1 you were hosted on what? Heroku? Where?
A: AWS I think always.
Y: Interesting. Okay.
A: I don't think they ever had Heroku.
Y: Okay, awesome. AWS, Rails all sounds very simple. At that point you were just trying to go after the folks that were looking to ship stuff from China and they needed a better solution. How did you get to them? SEO?
A: Yeah, SEO was a big way. Freight forwarders don't buy keywords. I shouldn't say this, but we got a lot of traffic from that. Those were our first few clients. We quickly hired a head of sales and he started going after smaller shippers. It was very, very successful right away because there was a huge market need.
Y: You guys were, and maybe still are, the biggest software solution, the first software movers in the space. Right?
A: The big guys do have software. It's just not web software. They have mainframes and AS/400s. That's their internal system, so their web presence is very minimal.
Y: Got you. Okay. So you guys moved away from Rails? What was the evolution? Challenges you faced?
A: The rendering has moved away from Rails, but all the backend is still Rails. And there are no plans to change that. It was just that Rails was rendering the HTML, and now it's just a static HTML page where React does all the rendering. I guess technically, Rails is still doing the rendering because we have server-side rendering set up. The thing we moved away from was rendering views with Rails, partials and stuff like that. All that got moved to the client side. But controllers, ActiveRecord, routing, all that stuff is still Rails. And will be for a long time.
A: Postgres. We started with MySQL. I switched to Postgres because transactional migrations are huge for development, so that was a very nice win. I'm also better at administering Postgres databases when they're on fire, so I was like, "Eh, I know how to scale this better."
S: Why is that?
A: Both Geni and Yammer were always on Postgres so I had a lot of experience dealing with 3am fires on a Saturday night and I'm not as comfortable with MySQL.
E: Then we have PostGIS too. I love PostGIS.
A: I love Postgres. I guess now Uber's moving away. I can't believe it. It makes me cry.
E: I know. Apparently with their use case it makes sense but oh my god. PostGIS, we do the Kayak-style radius search on ports so if you want to go from Milan to New York or Shenzhen to LA it will search all the nearby airports and seaports. It's just so fast and easy.
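A port radius search like the one Evie describes can be sketched with PostGIS's geography type. The table and column names here are invented for illustration, not Flexport's actual schema:

```sql
-- Find ports within 300 km of a point (roughly Milan), nearest first.
-- ST_DWithin on a geography column takes a distance in meters and can
-- use the spatial index, which is why this kind of search stays fast.
SELECT name, code
FROM ports
WHERE ST_DWithin(
        location::geography,                   -- indexed geography column
        ST_MakePoint(9.19, 45.46)::geography,  -- lon/lat near Milan
        300000)                                -- radius in meters
ORDER BY ST_Distance(location::geography,
                     ST_MakePoint(9.19, 45.46)::geography);
```

The same query shape works for Shenzhen to LA or any other lane; only the point and radius change.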
S: Nice. Okay. You mentioned the admin side of it. Weren't you guys on RDS?
A: Yeah we're on RDS.
Y: Were you originally as well?
A: We were originally on RDS MySQL. The biggest push for me was probably the transactional migrations. I was getting very frustrated not being able to roll them back, or not having them automatically roll back when you're doing Rails migrations. It was a huge pain point, wasn't it?
A: Yeah. I like that part a lot. Again, because scale is not the biggest issue for us you could kind of pick either one. What's faster to develop on?
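The pain Amos describes comes from Postgres running DDL inside the migration's transaction, while MySQL issues an implicit commit on most DDL statements. A hypothetical Rails migration (table and column names invented for illustration) shows what's at stake:

```ruby
class AddCustomsFieldsToShipments < ActiveRecord::Migration
  def change
    add_column :shipments, :hts_code, :string
    add_column :shipments, :declared_value_cents, :integer
    # If this third statement raises (say, a bad index definition),
    # Postgres rolls back the two columns above automatically, because
    # the whole migration runs in one transaction. On MySQL, each DDL
    # statement commits implicitly, so the first two columns would
    # already be live and the failed migration couldn't be cleanly
    # retried or reverted.
    add_index :shipments, :hts_code
  end
end
```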
S: Got you. Okay. That was probably the biggest piece of the stack that you guys changed.
A: Yeah, well the front-end stuff. I think our templates were HAML right? What were they? HAML and what was the other one?
A: Slim which was an even weirder version.
D: There's still about ten or eleven Slim files that just render everything.
A: Slim was odd. I inherited that too. I don't think I would choose it.
E: It got to the point where no one wanted to touch any of the pages that weren't React.
Y: What was the main reason why Slim was so odd?
A: There were certain things you actually couldn't do with it. I can't remember the exact scenario. There are a couple things you cannot do with it and you're just like, this is crazy. It prevents you from writing plain HTML, so you have to have a string of HTML inside of it. It's a deficiency.
Y: Yeah. We're using HAML for our site and there are some limitations. The benefit though is it's super clean. It's very clean, structured. You know exactly what's going on, but yeah it can get confusing.
A: The choice of React was very interesting. At the time, this was 2014. What was React, like 0.9? Was it even 1.0? That was interesting because I told one of our lead front-end guys, "You've got to choose between Angular and React." I wasn't as involved in the front-end world and he's like, "React. No problem."
Y: No research, just boom.
A: Well, he knew it. He had been using Backbone. He was at Trulia before, so he knew Backbone really well. I think he knew where the evolution of stuff was going. He really didn't like Angular. We had two other companies in the office at the time and I asked both of them. I'm like, "You know our lead front-end guy, he'd only been here three weeks, and I don't know if I trust this guy. He's going to choose React over Angular." They both used Angular and they said, "Yeah. I don't know about that..."
To Nitesh (he runs this company called Padlet), I was like, "Well, isn't everyone using Angular? Why not use the dominant platform? We're going to get left behind. There's more support for it." He's like, "Look, just because it's the dominant platform doesn't mean it's right. I can definitely see why React is going to beat Angular one of these days." He was using the Prototype/jQuery analogy. He said, "At one time Prototype just crushed everyone, and then jQuery came along and all of a sudden it was the dominant platform." That was a good argument. It was a good choice I think.
Y: That was the right bet.
A: It was a really good bet.
Y: And this was pre-Angular 2.
D: Oh definitely.
Y: That's interesting. You definitely made the right call.
Y: Do we want to dive into more of the tech and talk about what are the moving pieces? Where are these apps? How many apps do we have going? Just run through the stack? Maybe briefly. You probably have a good idea of everything going on.
D: Yeah I think that's the terminology.
E: Isomorphic, asterisk.
D: I think the word is Universal now. People Reacted against isomorphic.
A: That never made sense to me.
D: There are three front-end apps. One for internal usage, one for clients, and then another one for partners, which is a lot smaller. Mainly I think trucking companies are using that right now. There are different requirements for browsers on that app because trucking companies in China are probably on IE6. Whereas for Core, which is the internal app, we just require that everyone uses Chrome, so we can use Flexbox or anything else we want to make development easier.
Y: Is it three Ruby apps?
D: No these are three React apps.
A: No, these are front-end apps. There's only one backend, in Ruby.
D: They all talk to the same monolithic Rails backend.
Y: Okay so one big Rails app serving up the API for the most part.
A: Yes and one database as well. One schema.
Y: Any plans for iOS or Android?
A: Most likely we'll do React Native.
D: Yeah, I'm definitely going to push for that.
A: We have several React experts on staff that were pushing for it. I think it makes logical sense.
S: Which of the three apps takes the most engineering effort would you say?
A: Core. Easy.
D: Yeah, which is the internal one.
S: Wait, so when you say internal you're talking about internal to Flexport?
D: All of the people on the rest of this floor are the ones spending most of their day in it. That's the app where we have the really fast feedback cycle, because we can launch stuff and literally walk over to see it over someone's shoulder in production. That's the most complex one.
Y: One thing you mentioned that I wrote down here, for pre-render have you experienced any issues using pre-render for SEO purposes? Has that impacted SEO at all? I know that's a big part of your strategy right?
D: I think we've been server-side rendering stuff from the start, so I guess we wouldn't really detect if there were any issues. Most of our app is behind logins, so it's not SEO-related. I think the public-facing stuff's always been server-side rendered.
I actually did a lot of A/B testing for this at Khan Academy because SEO is really important there. Our app was very client rendered. We put a big emphasis on making sure that our code was server-side renderable for that purpose. I think Pinterest has a really good blog post about this called Demystifying SEO where they outline an A/B test they did. It's kind of old now. It might not be relevant anymore because I know that Google has actually stated that you can client render things and it will work, but they measured some stuff that was to the contrary. It might be a year and a half ago now.
S: You guys were talking as well about moving from Flux stores to GraphQL and Relay. Could you explain a bit why you guys chose to transition?
D: I don't think the choice has been made quite yet, but we're definitely seeing some pain with Flux. State management in React is a thing where there are a bunch of different opinions about how to do it, and our app is pretty big, so there are a bunch of different ways it gets done. Most of the time we use Flux. There's a lot of use of local component state as well. But some of the problems we have are, for example, this shipment details page in Core. It originally started as an admin view of the shipment entity, but now it's grown so many different things it needs to display, and so it makes a lot of calls to the backend. Those backend endpoints have that REST problem where they might get used on multiple views, so they accumulate more and more properties on each model that may not necessarily be used on the calling view. You end up having to load a bunch of data that you don't need to render the page. GraphQL solves that problem really well. We're experimenting with that, and what we're going to do is-
S: Sorry. GraphQL solves that problem really well because?
D: Sorry. It solves that problem really well because it co-locates the JSON template. It's basically JSON without values. Each React component describes exactly what it intends to display, and then Relay (a sibling to GraphQL when using it with React) assembles all of those components into one big query that says, here is precisely the data I need to display this particular page, declaratively, and then you get just that data. We've run into problems where we're trying to display a shipment that might have an invoice and a due date, but that invoice, when we load it from the database, is joining on a bunch of unrelated stuff that the shipment page doesn't need. Because it happened to be in the API, we're spending the time to generate it.
Y: You can specify that through Relay you're saying?
D: Yeah. I don't know if you guys have used PropTypes. It's basically like saying exactly the things that end up in the view are the only things that will come back from the API. Everything else will be ignored.
Y: What do you need GraphQL for?
D: GraphQL is kind of like a server-side component of that and Relay is the thing that takes a bunch of GraphQL fragments and assembles them together and says, "Hey GraphQL, get me this stuff."
Y: Then GraphQL is on top of Postgres?
D: Yeah. It's actually on top of Rails, so it still loads ActiveRecord models and stuff to get the information. The other reason is sometimes you have miniature models. If you have a list view you might just want a lightweight shipment model, for example, but when you go into the details page you want a heavyweight one. That means you have a different type of object, but both of our languages are dynamically typed, so we do have a lot of errors where, if you access this page in this particular way, a bunch of the attributes will be null and you'll get a run-time error. GraphQL introduces some types to our API, which will make it obvious when that kind of thing is going to happen. If we do that and also start using Flow on the front end, I feel like we can catch a lot of those errors at compile time. That would be awesome.
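As a rough illustration of the co-location Desmond describes, here is what a Relay-style fragment and query look like. The type, field, and fragment names below are invented, not Flexport's actual schema; the point is that each component declares only the fields it renders, and Relay composes those fragments into one query:

```graphql
# A ShipmentHeader component only ever renders these three fields,
# so that is all the composed query asks the server for. No unused
# invoice joins get dragged along just because another view needs them.
fragment ShipmentHeader_shipment on Shipment {
  id
  status
  eta
}

# The details page composes its children's fragments into one request.
query ShipmentDetailsQuery($id: ID!) {
  shipment(id: $id) {
    ...ShipmentHeader_shipment
    invoice {
      amountDue   # only the invoice fields this page actually displays
      dueDate
    }
  }
}
```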
Y: For the real time tracking piece, can you talk about how that works? What are you guys using? How'd you build that?
E: FlightStats is a really useful thing we use for tracking flights because a lot of our cargo just goes on passenger flights in the belly of the plane. It’s the same services that power, say, AmericanAirlines.com flight status. With ships it's a little bit sketchier especially depending on the country and everything. We have a few different sources that we try and combine and it's just whoever reports that the ship has arrived first "wins."
D: For updating on the client side we use Firebase. We ping browsers with Firebase to make them request data from the API. We don't actually send information over Firebase except the fact that this model that you're looking at might be stale and needs to be updated. There's some websockets there for that I guess.
Y: Okay so you're requesting it from the API. That makes sense.
D: In Rails, I think, when a model gets written to, there's a callback that says notify all browsers that have this model open on screen, and Firebase is the thing that does that. When a browser session's open, it has a subscription to Firebase. Firebase is the thing that pings all the browsers to tell them, my data is out of date, I need to request an update.
A: It's also used for presence, to figure out if there are multiple people looking at a particular shipment at once. It will pop up their profiles and be like, this person's looking and this person's looking. Yeah. That's all done through Firebase.
Y: Have you guys run into any challenges with that? Is that pretty simple to set up?
A: It was pretty simple to set up. The engineer who did it is not here, but he did it in a weekend, basically. It's hardly ever gone down; it went down once, I think, in a year. It's been good.
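The pattern described above (write to a model, then ping every browser watching that record so it refetches from the API) can be sketched with a toy in-memory notifier. This is purely illustrative; the real system uses a Rails model callback plus Firebase, and none of the names below come from Flexport's code:

```ruby
# Toy stand-in for the "notify watchers that a record is stale" pattern.
# In production this role is played by Firebase: no data goes over the
# channel, just a ping telling the browser to re-request from the API.
class StaleNotifier
  def initialize
    @subscribers = Hash.new { |h, k| h[k] = [] } # record key => callbacks
  end

  # A browser session opens a shipment page and subscribes to its record.
  def subscribe(record_key, &on_stale)
    @subscribers[record_key] << on_stale
  end

  # Called from something like an ActiveRecord after-write callback.
  def record_written(record_key)
    @subscribers[record_key].each(&:call)
  end
end

notifier = StaleNotifier.new
refetches = []

# Two "browsers" watching shipment 42, one watching shipment 7.
notifier.subscribe("shipment/42") { refetches << "session-a refetches 42" }
notifier.subscribe("shipment/42") { refetches << "session-b refetches 42" }
notifier.subscribe("shipment/7")  { refetches << "session-c refetches 7" }

notifier.record_written("shipment/42") # only watchers of 42 are pinged
```

Only the sessions subscribed to the written record go back to the API; everyone else's view is untouched.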
S: I'm just curious feature decisions in general, how does that play out in the engineering team?
A: Again, we're changing this process. A big part of it is that engineers can write project proposals, to be approved by some other group of individuals; these should be from one to four weeks of work. Previously the PMs would come up with a lot of the stuff. We originally had engineers come up with features, but what we do is so complex and there are so many business rules that for each feature we did in the early days, there were 50 new fields you would have to add to the database. At a certain point we kept adding fields and we'd be like, "Oh, this is done. We have every single field of freight forwarding mapped out," and then another project would come through and it'd be like a hundred more fields. It got to the point where you really need product managers to digest all this information and figure out what we need to build next. A lot of the stuff comes down from them, but things like the presence and the Firebase stuff, that was just organic. That was an engineer who was like, "I think we need this." He did it over a week.
D: I think a lot of the tech tools are just an engineer is passionate about it and convinces the rest of the team that it's something we want to do.
S: Going back to deployment, most of the changes you're making all the deploys they're mostly to the Rails app right?
A: We consider the front end different so it's probably more to the front end. Right?
D: Yeah, I don't know. I haven't thought about it actually.
Y: So let's say you have a new version of the app, you're pushing out the Rails app?
A: We have one app. One repository of code.
Y: What are some of the challenges that you guys have run into as you've started to scale? As you've started to build out even the team? Maybe we can just get into some of the engineering challenges that you've faced. Was it on the data side? What was the hardest part about what you guys built?
E: I'd say the most complex is all the quoting stuff that Brian and folks are working on now, which we did originally with RTQ. Just getting our structure of, if you want to go between these two ports, what are all the different charges? Do you charge by weight? Different partners, for the first 100kg, they charge this much, and so on, and you get these really complex rate types.
Y: Where are you getting all this?
E: Partners actually send them to us directly.
Y: In a nice format, like an API? Of course not.
A: No, sketchy, weird spreadsheets. Not even CSVs; they're these complicated Excel spreadsheets.
E: Thank God for data entry.
A: That's really hard. Other people's sophistication is probably one of our biggest challenges, because we get data sources in such funny ways that we basically have to write all this massaging code to deal with it.
Y: Okay so you have a bunch of Ruby scripts that are just parsing CSVs and-
E: Just a translation layer between who knows what business logic and who knows what format and our internal format.
S: Is there any sort of standard between these companies when they send you these CSVs or you just run the script for any company?
A: No, some of these are like, I wouldn't say mom and pop, they're actually pretty big companies, but they're in China and I don't know what their internal systems are, seems like Access databases. They spit out these crazy complicated Excel spreadsheets.
E: They themselves have sketchy fees like “handling fee” -- what does that really mean?
A: Yes. Yeah, they put in these really weird fees that we have to be like is this a real thing or this-
Y: Yeah. No, it's interesting. This sounds like a lot of what Zenefits has to deal with. I was just dealing with Zenefits the other day and it's like all these outdated, complex systems that you just have to put a really nice UI on, but in the background there's a lot of stuff happening. Can you talk about how you're processing that stuff? Is it background jobs running every so often or Rake tasks?
E: Since we're processing it into our standard, we have a team of data entry folks who know what we expect. They transform it. Then when we get a new rate sheet, which is about every week, we have someone upload it and usually there's conflicts because of spelling errors and human error and they have to go into this edit view that gets all the conflicts between what we expected and what we got. Then it's just a dump in the database.
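The conflict step Evie describes, flagging uploaded rows that don't match what was expected because of spelling errors, might look something like this minimal Ruby sketch. The field names and port codes are invented for illustration; the real upload pipeline isn't described in detail here.

```ruby
require "set"

# Returns uploaded rate-sheet rows whose port code isn't one we recognize.
# In the workflow described above, these would surface in an edit view for
# a human to resolve before the data is dumped into the database.
def conflicts(uploaded_rows, known_ports)
  uploaded_rows.reject { |row| known_ports.include?(row[:port]) }
end

known_ports = Set.new(%w[CNSHA USLAX])
uploaded = [
  { port: "CNSHA", charge: 1200 },  # matches what we expected
  { port: "CNSAH", charge: 1150 },  # spelling error: flagged for review
]
conflicts(uploaded, known_ports)
```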
Y: It sounds like this could be a use case for predictive analytics? You have all these inputs and you have an output.
E: The historical data gets fun. I think in the future a big part of what we're going to do is like market predictions like should we buy now, should we wait for spot pricing, stuff like that.
Y: Surge pricing. Yeah. It's holiday season so you're going to have to pay more.
A: Predicting prices is a huge challenge in our industry because it's this funny kind of not really open market. All the shipping lines, there's like I don't know 17 of them in the world or something. It's a bit of a cartel. Hope none of them are listening. The way pricing moves it's sometimes just all over the place because it will be like some guy on some desk at Maersk being like, "I think this client should get these prices." There's no real marketplace for it at all.
Y: There's no transparency.
A: No transparency whatsoever so we hope to build up a data set enough that we're really good at it and if we can we'll probably be one of the best at it in the world, but right now we're like wow.
Y: Cool. All right so I guess you're saying one of the biggest challenges is the data side of it?
E: Just integrations with everybody else in the industry. There's a reason that this opportunity was still left available.
A: If it wasn't our problem then there wouldn't be a lot of value there.
D: One way I look at the hard challenge is the business logic and domain rules are really complicated and not very intuitive. No engineer has really done anything with freight forwarding. It's hard to use intuition about what should be happening without spending a lot of time learning about it first. Just like modeling the supply chain in an accurate way without our code turning into spaghetti is a pretty big challenge I would say.
S: Speaking about that modeling the supply chain programmatically, was there ever an engineering disaster like red alert, like oh my god? Is there something like that you can talk about where you felt really worried?
E: HackerOne's found some stuff that if it had been public it would have been bad.
D: We have penetration testing through HackerOne. There's been some stuff they've found. That's a really good service actually because we give bug bounties when people find stuff and I'm really glad that it's them finding it and not three years from when we're-
A: Not the cartel.
D: Right. Well not these people that apparently are murderers or whatever. HackerOne's pretty cool. Sometimes the communication there's a little bit difficult, but it's been worth it. Although it has forced us to add in Denial of Service protection and some stuff we wouldn't have prioritized otherwise but it kind of comes with the territory with HackerOne.
E: I think our scarier engineering moments have been HackerOne caused.
D: It's mainly HackerOne participants running scripts that are nothing like normal user behavior. In theory someone could do that to us, but I don't know if anyone ever would because that's malicious. To get value out of their service we had to make those things not affect production.
A: Which is good. That's kind of the point is your service, denial of service proof.
D: They're kind of like human fuzzers to some degree, but there's also some more intelligent attacks, which is awesome.
E: Although it takes down the site and ops is like, "Wait, you mean we’re paying people to do this to us?"
S: It seems like you guys have things on lock more or less, but speaking about your development methodology? What is the methodology you guys use here? Waterfall?
Y: What do sprints look like?
D: Yeah we don't do sprints. We don't do waterfall. We're actually kind of working on that at the moment. I think what we've done so far is just take some practices from what we've seen elsewhere and thought about if they make sense for us and applied the parts of Agile or whatever parts of Waterfall that make sense. I don't know if there's a name for what we do.
A: We were a small team for a while so we were kind of like whatever gets stuff out the door. Now that we're a lot bigger we're borrowing some stuff from Yammer, this thing called a Big Board. We're about to release this tomorrow, so I'm announcing it here first before the rest of the team.
Y: This will go out much later.
E: You heard it here first folks.
A: I think this originated with the Yammer VP there, but it's basically trying to time box development from one to four weeks, getting feedback from the engineer or group of engineers that are working on it on how long it's going to take, and making sure each project has a DRI (to borrow from Apple, a Directly Responsible Individual) to help with the time boxing.
D: It's all about fixed time but variable scope so we can try to know what we're doing and when it's going to happen and if things are delayed then we try to cut scope to fit it in. That's all happening literally tomorrow.
S: Okay. When you say Big Board do you mean literally there's a big board?
A: It's going to start with this white board that we have in front of us and it's going to expand. We bought a bigger one. It hasn't arrived yet. It's just literally going to have every single project we're doing and who's working on it and then the dates for when they're going to be done. We're going to do some feedback loops like say is it on time. What is it green, amber, red?
D: Yeah, green, amber, red. Yammer does this, and I was previously at Khan Academy, which was inspired by Yammer to do the same thing, so it's not really that big of a thing, but we just have a large proportion of people from those companies here.
Y: The whole purpose there is to get a better idea of what the road map looks like time wise?
D: A big purpose is to provide space for engineers to work on projects and be accountable to getting it done on time without having to deal with a lot of interrupts. We have a support team that handles any kind of interrupt. They're like an interrupt handler so any bugs go to them, any PM that has a demo and we forgot this one little feature that has to happen tomorrow. That will go to the support team. The support team handles all the unpredictable stuff. That means developers can be heads down on a project and operate on maker time.
Y: You have engineers on support?
Y: Got you. Then does this also have to do with prioritization? I think Pandora does it. It sounds similar where they have a big board and they've got sticky notes, and you put everything next to each other, then you realize that man, my feature's not important.
A: That's like when they do their 90-day road map planning. They have voting. It's usually for the executives to go in like, "Hey. I want this feature. I want this feature." Then the head of product will go back and be like, "Okay. This one wins. This is what we're going to work on." That's a separate process for us. We do that a little bit.
Along with everything Desmond said, one of the problems we had was things just taking too long, so we really want to force that scoping step. A PM will work with an engineer. The engineer will be like, "You know, this is going to take me two months." What can we take out so it will only take four weeks? Really just having that process nailed down.
S: Circling back to the stack, do you guys use continuous integration?
Y: What does your whole deploy process look like? Everyone's using Docker?
D: No, ours is nowhere near as fancy as that. Actually ours, at least in my opinion, is inspired by Zach Holman's blog post on deploying software, which is really good.
Y: I remember that.
D: When someone has a pull request, it gets approved, and then as soon as it's merged to master it's deployed. The deploy itself is a manual step; it's a bunch of Capistrano scripts.
Y: Deploying to EC2.
D: Yeah, exactly. We wait till the tests pass before it would be approved. Usually it gets code reviewed by someone and then once the tests are green the author merges it and immediately deploys to production. We're paying Travis the maximum we can which means we can test two things concurrently because we use 5 instances to run tests when we open a pull request.
Y: How long do your tests take to run?
D: I think the slowest one's about 11 minutes. 11 minutes after a pull request is open you know if it's passing or not.
Y: Okay. Then are you pushing it out to staging?
D: Sometimes. We have a few different staging environments. We've got just one that is identical to production and then there's a new thing that just got created recently. We're having some contention there where there's a single staging environment and multiple developers want to test their stuff.
Y: Yeah, you might have to have different staging zones for different branches.
D: An engineer just set up a containerized staging thing where we have a machine that will just deploy it to a container and then I think we can have ten or something there.
A: That's all Docker right?
D: Yeah, that's all on Docker.
Y: So feature branches are on different servers?
D: Yeah, yeah. I think long term our normal environment will move towards that. But that's brand new as of two weeks ago.
A: Yeah, production right now is not containerized but we have plans to get it there based on this test with staging.
Y: Interesting. Okay.
A: I guess our dev would then naturally fall out of that as well.
S: What's the current successful build rate?
D: I have no idea.
A: Out of every single build to master, how many break?
A: It's probably like way over 90% build rate. We should look at that.
D: Yeah, it's definitely a faux pas to break master. I think the most frequent way it breaks is Travis has some network error. We have tests where we hit an endpoint, and server-side rendering is a little bit flaky sometimes in a test environment for some reason, so sometimes those will fail. Just the other day we ran into an issue where all of our pre-rendering tests were on a single instance, and I guess we just got enough of them that maybe it doesn't clean up memory after running them or something, and it was running out of memory on Travis, so we just split it into two instances and we'll deal with that in the future.
A: For some perspective it breaks way less often than other companies I've been at. Like 90% less.
D: It's definitely above 90 yeah.
Y: Wow. Do you punish whoever broke the build?
E: There's a cone of shame.
Y: Send out an emoji. Are you guys using Slack or what are you guys using?
E: Oh yeah.
Y: We're just assuming stuff now.
S: You guys have coding standards? I mean of course you do, but what are ...
D: Yeah, we use Rubocop and ESLint to lint everything.
Y: CodeClimate? How does that run?
D: We don't use CodeClimate. We just have some manual linting set up in a config file with a bunch of rules.
A: We used CodeClimate before, and we did not like it.
Y: Really? You didn't like the whole GPA thing?
A: Yeah. Consensus was like turn it off because it wasn't helping us at all. Maybe it's changed. That was a year and a half ago.
Y: We're using CodeClimate. It's interesting because it really enforces a lot of the best practices for Ruby but some of the stuff, we just don't really care.
D: Don't you get that from Rubocop?
Y: You do. They just like run it over there and then they give you the score and you can see how it changes over time.
D: We just have it set up as a pre-commit hook so if you do anything wrong you just can't open a pull request.
Y: Oh really?
D: Yeah. You fix your stuff before you show it to anyone else so you don't have bike shedding discussions in pull requests. Then pull requests can focus on design or architecture or something as opposed to "it's a multi-line block so you should use do end versus braces" or whatever.
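A pre-commit hook like the one Desmond describes could be sketched as the Ruby script below, routing staged files to RuboCop or ESLint by extension. The exact commands, extensions, and the use of `yarn` are assumptions, not Flexport's actual hook.

```ruby
# Hypothetical pre-commit hook: lint only the files staged for commit,
# picking the linter by file extension.
LINTERS = {
  ".rb" => %w[bundle exec rubocop],
  ".js" => %w[yarn eslint],
}.freeze

# Group staged paths by the linter command that should check them,
# dropping files no linter covers.
def lint_batches(paths)
  paths.group_by { |path| LINTERS[File.extname(path)] }
       .reject { |linter, _files| linter.nil? }
end

# Invoked from .git/hooks/pre-commit; fails the commit if any linter fails.
def run_precommit
  staged = `git diff --cached --name-only --diff-filter=ACM`.split("\n")
  lint_batches(staged).each do |command, files|
    system(*command, *files) or exit 1
  end
end
```

Running the linters at commit time, rather than in CI review, is what keeps the style nitpicks out of pull request discussions, as Desmond notes.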
Y: What you're saying makes sense. You want to solve that sort of thing before it gets there. Do you want to cover monitoring? What do you guys use to take a look at production and make sure that everything is running smoothly?
A: We use New Relic. We also have Sentry, which is for exceptions, and then Pingdom. You guys probably don't even know that. It rarely breaks. I'm probably the only one that gets those and I haven't gotten one in like a month or two.
Y: That's really good. That's awesome. 100% uptime for the past couple months.
A: Some things break. Pingdom doesn't get everything.
S: How do you split up PagerDuty?
A: Again, we don't have a need for it so we don't use it.
Y: You're just using pure Pingdom?
A: Yeah. That's it. We don't use PagerDuty.
Y: That's pretty cool.
S: Everyone gets 8 hours.
A: I have spent a lot of time at 2am on Saturdays fixing code and this is, it's great.
E: It’s so nice.
A: It's great.
Y: Nice. Are you guys using anything else aside from New Relic like... Librato?
E: Bullet we use for digging into N+1 queries.
Y: New Relic is really all you need to see what's ...
A: It's pretty good. I haven't checked out Librato.
Y: Okay. Cool. Analytics? What do you guys use to look at front-end or user flows and all that stuff?
D: We use a couple of things. The black magic one that I hadn't seen before I joined this company is FullStory. Because we have fewer users with larger transactions, you get a lot of value out of watching sessions.
Y: They record the whole session.
D: Yeah. That's been really valuable for that and also for figuring out if a bug was user-impacting or if some part of the UI is confusing or anything like that so FullStory is definitely a good tool that we get a lot of use out of.
E: If I could invest in FullStory I would.
A: It's a good tool. We also use Periscope Data for all our analytics. A lot of people probably know what this is; it basically looks at your slave database and allows you to build graphs off it. We'll push stuff out to event logs as well to figure out what's happening in user flows, and it basically just runs SQL queries that are graphed over time.
Y: Oh, okay so it's like UI over your data.
S: An easy way to slice and dice. Very cool.
D: I guess Google Analytics somewhat as well.
A: Some people use it. We don't use it as much.
D: I use it to see how many people are using bad browsers and stuff like that.
Y: Nice, browser breakdown. For FullStory, are you just recording every session all the time? How does that work? Is a percentage of traffic?
D: I think right now we're recording everything.
S: How do you know what to look at? You can see which ones have exceptions?
D: Well, their exception stuff isn't that great actually. We use Sentry to figure out when exceptions happen and often you open Sentry and then you figure out who it was then you go find them on FullStory. I would really love if those two things would merge actually.
Y: Yeah, integration alert. Yeah, that doesn't seem too bad though. Payments?
E: Stripe. There was some talk of Adyen for SEPA but we haven't gotten to that point yet. Although I think Stripe has a beta SEPA payment out now.
S: Throw Bitcoin in there as well maybe.
E: We've talked about Bitcoin.
A: We're a little worried that Bitcoin makes us look like we're money launderers or drug dealers which is really bad when you're licensed by the customs board. It was a pretty good argument our CEO made.
E: Like, "Huh, they paid for this container from Colombia with Bitcoin."
A: Exactly so we've stayed away probably wisely.
Y: Got you. I'm just running through looking at this stack here. Emails? You said there were some issues from time to time. What are you guys using?
S: SendGrid. Okay. Those are all transactional. You guys send out newsletters? Flexport Daily?
E: We do some digests.
A: I think we do MailChimp for newsletters. I could be wrong.
Y: That's awesome. Cool. Okay. That's good to hear. Maybe we should, we're coming up on time, but we never really covered team structure and team size. How big is the engineering team?
A: The engineering team total is about twenty right now. Then we have four PMs, four designers. We really want to keep it at a 1:1 PM to designer ratio and then about a 1:4 PM to engineer ratio. Again, because it's so complex. It's hard to have a bunch of engineers do... we need a lot of PMs to build what we need to build.
Y: Got you. Then at least two engineering managers.
A: We have three now.
Y: Then there's sales and support.
A: Yeah, so we have sales, operations, and they make up a big part of it. Again because it's a very human-powered thing right now. HR, recruiting, finance.
Y: Then are you counting the support engineers as part of engineering?
A: The support engineers they are engineers. The way we do support is a rotation. We do a week long rotation of one engineer will be on support for a week, then a different one, then a different one to go through everyone.
We have a QA person but just one.
D: She's a recent convert from ops so she's got a lot of product knowledge and a lot of knowledge about how to work around bugs and stuff like that. She's developing knowledge on the technical side. That's actually made our bugs process go a lot more smoothly.
A: Yeah, it's been great because she has- business rules are so complex that she already knows all that stuff.
E: She teaches the support engineers what is going on.
A: She teaches like, "This is what we need to do to move the shipment here."
D: She's able to prioritize bugs way better than we can. Before we had her and you were on support duty you would get a bug and you wouldn't really know is that actually a bug? Where do I go in the app to even get here? It was a lot harder to diagnose what's going on.
E: People would just send you a screenshot and you'd be like "I don't know what's wrong with that."
S: She seems pretty high quality. It seems like you guys have high quality engineers here in general. What is the interview process like?
Y: Are you looking for engineers?
A: Yes, we are looking for engineers. Always.
E: We start with a phone screen -- unless people come through Triplebyte, which a lot of people do. If they pass the phone screen we bring them on-site. Three engineering interviews. We usually like to have them get coffee with someone on operations so they can ask about the business and stuff. We tend to attract entrepreneurial engineers who see the business opportunity, and that's what caught their eye about us.
Standard issue interviews. We give a demo. We try and dig into someone's past projects and try and get them to tell us in depth about something they've really worked on or enjoyed working on. Try to get a sense for what makes them happy day to day and how they fit in the team.
Y: Cool. All righty. Anything else before we wrap up here?
S: What's next for Flexport?
Y: Oh yeah we didn't even talk about that. What's next on the tech side, business side? Anything big in the pipeline?
A: Yeah. The next ten years we're hoping to build the platform for global trade. We think we have a unique opportunity in this space because a lot of the bigger players stopped innovating a long time ago in terms of technology. Maybe this is a little too far out, but we really are trying to build out, end-to-end how you move freight so building up this platform that anyone can use and trying to open it up to other providers, other freight forwarders. So, broad-based, that's definitely what's next for us technically.
E: It's such a fragmented space because the top ten forwarders have 9% market share.
A: Yes, and it's a $1.2 trillion space so it's just massive. There's a lot of opportunity there.
Y: Definitely. Super exciting. On the tech side anything new in the pipeline? Any major changes?
D: Yeah, I talked a little bit about Relay and GraphQL we're experimenting with. Flow is another thing we're going to start using if I have anything to do with it. I don't know.
Y: Containerizing stuff.
D: Yeah that's true. Containerizing is going to happen soon.
E: I'm excited about advanced quoting and all of the routing optimization problems we're going to run into. The graph theory nerd in me has been waiting for that since day one.
Y: Upgrading to Rails 5 probably? Never?
D: I mean maybe.
A: We should do that.
D: We'll probably get around to it.
S: Cool. Cool. Awesome. Well thanks so much guys.
Y: Thank you.
A: Thank you.