- Over 61M users - gaining 1.5M users/month
- 2PB of user media files - 1.5TB of new files/day
- 3 data centers + 2 clouds (Google, AWS)
- 1.5B HTTP requests/day
- 400 engineers
Wix is a cloud-based web development platform that makes it easy for everyone to create beautiful websites. Our drag-and-drop, online editor makes it simple for everyone to get online with a beautiful, professional, and functional web presence. With over 60M sites built in more than 190 countries, Wix is the world's largest "build it yourself" website-building platform.
Wix’s open architecture enables users to add functional services to a site, including email, contact forms, and eCommerce shops, with a single click, making it a one-stop shop for managing a small business presence online.
I’ve been working for Wix on and off since the day the company was founded. I interviewed the first backend developer in 2006, then rejoined the company for 2 months in 2008. I’ve been working full-time as Chief Architect since 2010.
As part of my role, I have led the transition from a traditional waterfall development model to Agile continuous delivery, helped introduce DevOps, and initiated the Wix SDK. I am now working on other products, as well as the next-generation infrastructure for Wix.
Wix is organized into internal companies and guilds. With around 400 engineers, QA, and product people, organizational structure becomes key for our success. Guilds focus on technical aspects such as scalability, developer productivity, and interservice communication, while companies focus on products. For example, we have an eCommerce company, a backend guild, an Angular guild, etc.
A guild is responsible for hiring people, training them, and building infrastructure and tools to help engineers be more productive. Some guilds take as much as 20% of engineers’ time to work on guild-related tasks (building infrastructure, testing tools, giving lectures, facilitating retrospectives, etc). Companies, on the other hand, are composed of people from different guilds working to build a specific product. Each company has one company head who is the final decision maker. Engineers are members of one guild, and most are assigned to a company. A few select engineers are kept in the guild for cross-company tasks, such as to build frameworks or a knowledge base.
When Wix started, we built a single monolith using Java, Hibernate, Ehcache, Tomcat, and MySQL. This monolith let us learn about our business and let Wix grow, but by 2008 it had started to cause difficulties.
Our monolith served two functions: serving sites built with Wix, and supporting users who were building sites. Most of the development effort at the time went into the site-building tools, so that function changed a lot. Serving sites, on the other hand, was relatively stable. However, the cost of problems with the site-serving function was considerably higher, because a problem there impacted all of the sites built with Wix rather than only the users actively building websites. Because both functions were supported by the same monolith, any change in the site-building function put the site-serving function at risk, even if the site-serving function itself did not change. We experienced pains such as downtime (planned or unplanned) that was caused by changes in the site-building function but affected all Wix users.
So, by 2008, with over 1,000,000 sites, we made our first architecture change based on service-level requirements. Two functions with different sources of risk and different service levels needed to be two distinct software applications. To this day we use Service Level-driven Architecture to separate concerns to different microservices.
Over the years we dropped Hibernate and Ehcache, and broke down the monolith (a 4.5-year process). This was a long, complicated process because we kept Wix running and enhanced Wix with new features while gradually moving features from the monolith to new microservices. Today at Wix we have over 100 microservices. Most are based on the Scala programming language, with Jetty, Spring, and our internal framework.
When Wix started, we had a Flash-based product—both the Editor and the created sites were Flash applications—which turned into a second monolith (especially the Editor application).
In a nutshell, Wix’s current architecture involves 4 main groups of services:
WixMP - An Internet media filesystem, built and optimized for hosting and delivering images, video, music, and plain files, integrated with CDNs (Akamai, Level3, Fastly), SSL, etc. The platform runs on two clouds (Google and Amazon), using their compute instances and storage (Google Cloud Storage and Amazon S3) for on-the-fly image manipulation and video transcoding. The software on the compute instances is written in Python, Go, and C, as appropriate.
Verticals - A set of applications that adds value to a Wix site, such as eCommerce, Shoutout, or Hotels. The verticals are built using an Angular frontend and the Jetty/Spring/Scala stack for backend. We selected Angular over React for verticals because Angular provides a more complete application framework including dependency injection and services abstraction.
For us, a microservice is a single application deployed as a process with one clear responsibility. A common question is the size of a microservice: Is it a single function? A single class? Or a whole monolith? Our answer is that a microservice is as large as the team of people managing it (a single team can manage a few microservices). That team has to be able to describe the microservice responsibility in one clear sentence. Only a single microservice will write to a specific database (the microservice owns that database). The microservice itself has to be stateless to support frequent deployments and multiple instances, and all persistent states are stored in the database.
One of the other questions people always ask is: why does Wix keep the websites in a JSON representation and not HTML? The reason is that it enables us to address issues with different browsers or mobile devices, fixing just the JS layer without changing the stored site definition. This in turn allows us to respond quickly to various changes and challenges—for example, new releases of browsers. This also enables us to optimize all Wix websites for search engines (SEO) and to update the optimization constantly as search engines evolve.
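To make the idea concrete, here is a minimal sketch of a site stored as data and turned into HTML by a separate render layer. It is written in Python purely for illustration (the text above describes a JS layer), and the JSON field names, component types, and function names are all hypothetical, not Wix's actual site format.

```python
import html
import json

# Hypothetical stored site definition; field names are illustrative only.
SITE_JSON = """
{
  "title": "Joe's Bakery",
  "components": [
    {"type": "heading", "text": "Fresh bread & pastries"},
    {"type": "paragraph", "text": "Open daily from 7am."}
  ]
}
"""

def render_heading(component):
    return "<h1>{}</h1>".format(html.escape(component["text"]))

def render_paragraph(component):
    return "<p>{}</p>".format(html.escape(component["text"]))

# The render layer. Changing these functions changes the output of every
# site without touching any stored site definition.
RENDERERS = {"heading": render_heading, "paragraph": render_paragraph}

def render_site(site):
    body = "\n".join(RENDERERS[c["type"]](c) for c in site["components"])
    return "<html><head><title>{}</title></head><body>\n{}\n</body></html>".format(
        html.escape(site["title"]), body
    )

site = json.loads(SITE_JSON)
page = render_site(site)
print(page)
```

A browser quirk or an SEO improvement would be handled by editing a renderer once, while the stored JSON for every site stays the same.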
Wix is a large-scale shop. We have over two petabytes of media (3M files uploaded daily), lots of BI and log data, and some MySQL tables totaling hundreds of gigabytes. We cope with size differently based on the service. For example, WixMP runs on two clouds, utilizing the clouds’ storage and compute services, as well as a CDN for scale.
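As a rough back-of-the-envelope check, the daily figures quoted in this article (1.5TB of new files, 3M uploads, and 1.5B HTTP requests per day) can be turned into averages. The sketch below assumes decimal units (1 TB = 10^12 bytes) and spreads the load evenly over the day, so real peaks would be higher.

```python
# Figures quoted in this article; decimal units are an assumption.
NEW_MEDIA_BYTES_PER_DAY = 1.5e12   # 1.5 TB of new files per day
FILES_PER_DAY = 3_000_000          # 3M files uploaded per day
REQUESTS_PER_DAY = 1.5e9           # 1.5B HTTP requests per day
SECONDS_PER_DAY = 24 * 60 * 60

# Average size of an uploaded file, in megabytes.
avg_file_mb = NEW_MEDIA_BYTES_PER_DAY / FILES_PER_DAY / 1e6

# Average request rate, assuming traffic is spread evenly over the day.
avg_requests_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY

print(f"average upload size: {avg_file_mb:.1f} MB")
print(f"average request rate: {avg_requests_per_second:,.0f} req/s")
```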
Most of Wix’s microservices run in our own data centers using stateless Scala services and a MySQL storage engine. Scaling those services mostly means scaling MySQL, which for us works best with functional sharding and a NoSQL usage pattern. We found that MySQL is actually a better NoSQL than most, if it’s used as a NoSQL engine: IDs are generated by the client; lookups are only by index or primary key; any field that is not indexed is folded into a single JSON field; there are no joins; and so on. With MySQL working in this way, we get active-active replication between data centers, as well as ~1 ms response times on huge tables for reads by primary key.
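The usage pattern can be sketched in a few lines. This is an illustration under stated assumptions, not our production schema: SQLite's in-memory database stands in for MySQL so the example runs anywhere, and the table, column, and function names are hypothetical. Functional sharding and replication are not shown.

```python
import json
import sqlite3
import uuid

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites ("
    " id TEXT PRIMARY KEY,"      # client-generated ID
    " owner_id TEXT NOT NULL,"   # the only other field we look up by
    " doc TEXT NOT NULL)"        # all non-indexed fields, folded into one JSON value
)
conn.execute("CREATE INDEX idx_sites_owner ON sites (owner_id)")

def put_site(owner_id, fields):
    site_id = str(uuid.uuid4())  # generated by the client, not by the database
    conn.execute(
        "INSERT INTO sites (id, owner_id, doc) VALUES (?, ?, ?)",
        (site_id, owner_id, json.dumps(fields)),
    )
    return site_id

def get_site(site_id):
    # Read by primary key only; no joins.
    row = conn.execute("SELECT doc FROM sites WHERE id = ?", (site_id,)).fetchone()
    return json.loads(row[0]) if row else None

def sites_by_owner(owner_id):
    # Read by secondary index only.
    rows = conn.execute(
        "SELECT id FROM sites WHERE owner_id = ? ORDER BY id", (owner_id,)
    )
    return [r[0] for r in rows]

site_id = put_site("user-1", {"title": "Joe's Bakery", "template": "food"})
print(get_site(site_id))
```

Because the ID is created by the client and every read goes through a key or an index, no query depends on an auto-increment sequence or a join.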
Around 2012 we introduced an applications SDK, allowing third parties to create widgets and applications for a Wix site, such as eCommerce, blogs, contact forms, or CRM systems. Today we’re proud to have over 250 partners selling third-party applications in our App Market.
Workflow & Tools
We build our software using Maven, Grunt, and TeamCity. We use Git repositories hosted on GitHub (over 400 different repositories). For project management we use Jira, to deploy we use Chef, to configure services we use ZooKeeper, and to run A/B tests and experiments we use Petri. For monitoring, we use New Relic, Nagios, Graphite, and BI business alerting. The BI stack is based on Hadoop, Pig, HBase, and Storm, with in-house-built business alerts and exploration tools.
Our services stack is built on the JVM (Scala), with each service running as a standalone application that uses embedded Jetty, Spring MVC, and our own framework (targeting developer productivity, and providing support for testing and connectivity to other Wix infrastructure services). We use JSON-RPC and ActiveMQ for communication among services, and MySQL, MongoDB, and our own WixMP for persistence. Our frontend toolbox, again, includes Angular and React, as well as the build and testing ecosystems of both.
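For readers unfamiliar with JSON-RPC, the sketch below shows the general shape of a call between two services. It assumes the public JSON-RPC 2.0 envelope (`jsonrpc`, `id`, `method`, `params`); the method name, the handler, and the in-process dispatch are hypothetical and say nothing about how our framework wires this over HTTP.

```python
import json

def get_site_title(site_id):
    # Hypothetical service method, for illustration only.
    return {"siteId": site_id, "title": "Joe's Bakery"}

# Methods this (hypothetical) service exposes.
METHODS = {"getSiteTitle": get_site_title}

def handle(raw_request):
    """Dispatch one JSON-RPC 2.0 request and return the serialized response."""
    request = json.loads(raw_request)
    method = METHODS.get(request.get("method"))
    if method is None:
        # -32601 is the JSON-RPC 2.0 error code for "Method not found".
        response = {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": -32601, "message": "Method not found"},
        }
    else:
        result = method(*request.get("params", []))
        response = {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
    return json.dumps(response)

request = json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "getSiteTitle", "params": ["site-42"]}
)
reply = json.loads(handle(request))
print(reply)

unknown = json.loads(handle(json.dumps({"jsonrpc": "2.0", "id": 2, "method": "nope"})))
print(unknown)
```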
We're now in the process of analyzing what’s holding us back, and we’ve recognized a few issues we want to address. One such issue is the build process and test feedback time that occurs because of dependencies between projects. Changes in some projects require testing a large number of other services, a process taking up to half an hour. We’ve identified considerable inefficiencies in the traditional build pipeline of compile code, compile tests, run tests, compile integration tests, run integration tests, and only then move to the next project down the pipeline. We are considering moving to other build tools, such as Google Bazel, or building our own tool, focusing on parallelizing compilation and testing between different dependent services.
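The kind of parallelism we have in mind can be sketched as follows: build each project as soon as everything it depends on is done, instead of walking the pipeline one project at a time. This is a toy illustration, not a tool we have built. The project names and the `build` placeholder are hypothetical, and it relies on `graphlib` from the Python standard library (3.9 or later).

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

# Hypothetical project graph: project -> set of projects it depends on.
DEPENDENCIES = {
    "framework": set(),
    "site-service": {"framework"},
    "media-service": {"framework"},
    "editor-server": {"site-service", "media-service"},
}

def build(project):
    # Placeholder for "compile, then run tests" of a single project.
    return project

def build_all(dependencies, workers=4):
    """Build projects in parallel, never starting one before its dependencies finish."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    finished = []
    running = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Start every project whose dependencies are all done.
            for project in sorter.get_ready():
                running.add(pool.submit(build, project))
            # Wait until at least one running build completes.
            done, running = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                project = future.result()
                finished.append(project)
                sorter.done(project)  # may unlock dependent projects
    return finished

order = build_all(DEPENDENCIES)
print(order)
```

In this graph, `site-service` and `media-service` build concurrently once `framework` is done, which is exactly the step a strictly sequential pipeline would serialize.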
Another challenge we need to revisit is our cloud strategy: which services run in the cloud vs. which services run in our own data centers. I cannot really say a lot about it at this stage, except that we are working with two different cloud providers to figure out how we can improve developer productivity and scale, while keeping the complexity under control.
At Wix we believe that hiring the right people is the key to a successful company. We believe that it is not just hiring the right people, but also getting them to do the right job with the freedom to innovate and impact the business.