Cloud Controller migration in CF-241


Nicholas Calugar
 

TL;DR: MySQL implicitly ends transactions before and often after certain
statements, including DDL statements [3]. Cloud Controller in CF-241 contains a
database migration that will not be executed atomically, so there is a
potential data consistency issue for applications pushed with a specified
buildpack while deploying CF-241.

While investigating the issue, we marked CF-241 as Pre-Release; it has since
been re-enabled as the latest release. While this sort of migration is
uncommon, this is not the first release containing a migration that is
potentially unsafe on MySQL. The timeline for the investigation can be
found in the release notes [1] and below. The CAPI team intends to explore
how we can make migrations on MySQL better in the future [2].


2016-09-01 17:25 UTC - The Cloud Controller database migration in CF-241
is not wrapped in a transaction. During a rolling deploy of Cloud
Controllers, API requests to Cloud Controllers with the previous code could
result in data inconsistencies. We will update these release notes when we
determine the proper resolution.

2016-09-01 21:36 UTC - The underlying Sequel gem automatically runs
migrations in a transaction for RDBMSs that support transactions for DDL
statements. This means PostgreSQL will run the entire migration in a
transaction, but MySQL will not. We are still determining the proper steps
to take for MySQL.
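
To illustrate what "transactions for DDL statements" means in practice, here
is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a
stand-in here: like PostgreSQL, it lets a schema change be rolled back. The
table and column names are hypothetical and are not taken from the actual
Cloud Controller migration. On MySQL, the ALTER TABLE would cause an implicit
commit, so the final ROLLBACK would not undo the schema change.

```python
import sqlite3


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def run_ddl_then_roll_back():
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE apps (id INTEGER PRIMARY KEY, buildpack TEXT)")

    conn.execute("BEGIN")
    # Hypothetical column name, for illustration only.
    conn.execute("ALTER TABLE apps ADD COLUMN encrypted_buildpack TEXT")
    inside_transaction = column_names(conn, "apps")
    conn.execute("ROLLBACK")

    after_rollback = column_names(conn, "apps")
    conn.close()
    return inside_transaction, after_rollback


if __name__ == "__main__":
    inside, after = run_ddl_then_roll_back()
    print("inside transaction:", inside)
    print("after rollback:    ", after)
```

With transactional DDL, a migration that fails halfway leaves the schema as
it was before the migration started. Without it, each DDL statement takes
effect on its own.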

2016-09-02 17:06 UTC - MySQL implicitly ends transactions before (and
often after) certain statements, including DDL statements [3]. A Cloud
Controller database migration in CF-241 encrypts the specified
buildpack of an application, as this column could contain a Git URL
containing a username and password. To perform this migration, it creates
new columns, encrypts the existing buildpack data and saves it to the new
columns, then deletes the old column. This results in a period of time
where Cloud Controllers running the code from a previous release can
potentially write data to the old column, which is about to be deleted,
when an app is pushed with a specified buildpack. While this sort of
migration is uncommon, this is not the first time Cloud Controller has
made one. Operators that are particularly sensitive to
this can always scale their Cloud Controller to a single instance in order
to take downtime while the migration is performed.
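
The window described above can be sketched as a small simulation. This is
not the actual migration: it uses Python's sqlite3 module with every
statement committed on its own, to mimic a migration that is not atomic. The
table and column names are hypothetical, and fake_encrypt is a placeholder
(base64 is not encryption). Instead of dropping the old column, the last
step reports which rows hold a value only in the old column, since those are
the values that would be lost by the drop.

```python
import base64
import sqlite3


def fake_encrypt(value):
    # Placeholder only: base64 is NOT encryption. It just marks the value
    # as "processed" for the purposes of this sketch.
    return base64.b64encode(value.encode()).decode()


def rows_lost_on_drop(old_code_writes_mid_migration):
    # Autocommit mode: each statement takes effect on its own, as in a
    # migration that is not wrapped in a single transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE apps (id INTEGER PRIMARY KEY, buildpack TEXT)")
    conn.execute(
        "INSERT INTO apps (id, buildpack) VALUES (1, ?)",
        ("https://user:pass@example.com/buildpack-a.git",),
    )

    # Step 1: create the new column (hypothetical name).
    conn.execute("ALTER TABLE apps ADD COLUMN encrypted_buildpack TEXT")

    # Step 2: encrypt existing data into the new column.
    rows = conn.execute("SELECT id, buildpack FROM apps").fetchall()
    for app_id, buildpack in rows:
        conn.execute(
            "UPDATE apps SET encrypted_buildpack = ? WHERE id = ?",
            (fake_encrypt(buildpack), app_id),
        )

    # A Cloud Controller still running the previous release only knows
    # about the old column, so a push during the window writes there.
    if old_code_writes_mid_migration:
        conn.execute(
            "INSERT INTO apps (id, buildpack) VALUES (2, ?)",
            ("https://user:pass@example.com/buildpack-b.git",),
        )

    # Step 3 would drop the old column. Report the rows whose buildpack
    # exists only in the old column and would therefore be lost.
    lost = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM apps "
            "WHERE buildpack IS NOT NULL AND encrypted_buildpack IS NULL "
            "ORDER BY id"
        )
    ]
    conn.close()
    return lost


if __name__ == "__main__":
    print("no concurrent push:", rows_lost_on_drop(False))
    print("push during window:", rows_lost_on_drop(True))
```

Scaling to a single Cloud Controller instance, as suggested above, removes
the concurrent writer from this picture: no instance running the previous
code is serving requests while the migration runs.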


[1] https://github.com/cloudfoundry/cf-release/releases/tag/v241
[2] https://www.pivotaltracker.com/story/show/129709617
[3] https://dev.mysql.com/doc/refman/5.7/en/implicit-commit.html


Nicholas Calugar
Product Manager - Cloud Foundry API
Pivotal Software, Inc.
