By
Brad Brewer

December 16, 2016 — 4 minute read

Schema Migration

At InVision, we manage dozens of database schemas across hundreds of instances. Some of these schemas contain hundreds of millions of records and live on large MySQL clusters while others are small, brand new services with fledgling databases. Whether your databases are large or small, schema migrations play an integral part in any rapidly growing platform.

The InVision engineering organization continues leveraging microservice architecture more and more, resulting in parallel development that screams for automation. It isn’t uncommon for the Data Services team to provision a dozen new database instances in a day.

With the paralleled development of numerous engineering teams frequently updating their respective services, hands on DBA schema management becomes a choke-point for velocity, something our fearless CTO ensures we don’t lose sight of.

If you aren’t familiar with the concept, schema migrators are tools that serve to manage incremental changes to relational database schemas.

What we created:

A schema migrator that leverages existing infrastructure built on top of docker/kubernetes, automatically deploying database migrations to all environments. We did so with the following goals in mind:

Empowering engineering teams to own their database changes
Well-tracked schema migrations
Reproducible and predictable migrations across all environments
Management of seed data - a high priority when your engineering team is growing rapidly, and forging a new code-base every week!
Automated migration deployment from development to production

How does it work?

Make the schema changes to your local MySQL server
Run [create-my-changeset.sh] - This step executes a schema diff between your local environment and the pristine state of the database, producing a file that can be submitted as a pull request
Submit the PR for review
Upon Data Services review/approval, the engineer who initiated the change can deploy. This step spins up a dedicated migration container that will apply these changes to all environments.

What went well?

Developers are now empowered to initiate and deploy data migrations via an automated channel the Data Services team maintains. This provides the greatest possible velocity for developers, as well as oversight into production database changes for the Data Services team.

We used off the shelf open-source tools, focusing our time on composing our own application logic to stitch those tools together to fit our specific needs. Specifically, we leveraged Liquibase for basic migration management and tracking, and Percona Toolkit to ensure our migrations could take place without downtime.

You might be wondering:

“Brad, you mentioned tables with hundreds of millions of records. If you’re going to alter these giant tables, some of which are on master/slave pairs, what about replication lag? Don’t some of these schema changes lag the slaves when your fancy automated deploy executes your database changeset?”

Typically, no. One of the more exciting elements of our schema migration tool is that we stood on the shoulders of giants, leveraging Percona’s pt-online-schema-change tool to handle large-scale changesets. Whether you’re adding a new column to a small table, or changing the data type for a 500 million record table, our tool will handle those changes with poise and grace. Though, the latter will take some time, as the migrator works diligently [via ptosc —max-lag] (https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-change.html#cmdoption-pt-online-schema-change--max-lag) to keep the slaves up to date.

“The DBAs where I work are subject matter experts that see themselves as gods over all things databases, your fancy automated migrator would never fly here.”

Sure it would! The PR process for schema changes provides a well tracked pipeline for developers to propose changes to us in Data Services, while allowing us to approve potentially impactful changes to our large production databases.

“Aren’t you automating yourself out of a job!?”

Not at all! The more we automate, the more we can accomplish. In addition to automating common tasks (of which there is no shortage!), the Data Services team is always available, and often sought out by our engineers, to consult with our internal engineering teams on everything from initial architectural decisions to query performance.

That’s it!

That’s how we scale schema migrations at InVision. If you have any questions/comments, please share!

Like what you've been reading? Join us and help create the next generation of prototyping and collaboration tools for product design teams around the world. Check out our open positions.