DDL - Data Definition Language / Schema change handling
The design behind TiDB's DDL implementation can be read in the online DDL design document.
TiDB is a distributed database and needs a consistent view of all schemas across the whole cluster. To achieve this in a more asynchronous way, the system uses internal states, where each single state transition is done so that the old/previous state is compatible with the new/current state, allowing different TiDB nodes to have different versions of the schema definition. All TiDB servers in a cluster share at most two schema versions/states at the same time, so before moving on to the next state change, all currently available TiDB servers need to be synchronized with the current state. (For example, one node may still see an index in the WriteOnly state while another already sees it in the WriteReorganization state; adjacent states are designed to be mutually compatible.)
The states used are defined in :
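As a reference, here is a sketch of those states as they appear in TiDB's parser model package; the identifiers below match the upstream SchemaState constants, but the exact comments and any extra states may differ between versions:

```go
// Illustrative only: mirrors the SchemaState constants in TiDB's parser model package.
package model

// SchemaState is the state of a schema element during an online schema change.
type SchemaState byte

const (
	// StateNone means this schema element is absent and can't be used.
	StateNone SchemaState = iota
	// StateDeleteOnly means we can only delete items for this schema element.
	StateDeleteOnly
	// StateWriteOnly means we can use any write operation on this schema element,
	// but other sessions can't read the changed data yet.
	StateWriteOnly
	// StateWriteReorganization means we are re-organizing the whole data after the write-only state.
	StateWriteReorganization
	// StateDeleteReorganization means we are re-organizing the whole data after the delete-only state.
	StateDeleteReorganization
	// StatePublic means this schema element is ok for all write and read operations.
	StatePublic
)
```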
Note: this is a very different implementation from MySQL, which uses Metadata Locks (MDL) to keep a single version of the schema per MySQL instance at a time. As a result, MySQL replicas can have different versions of the schemas due to lag in asynchronous replication, while TiDB always has a consistent view of all schemas in the cluster.
All DDL jobs go through two cluster-wide DDL queues:
The generic (default) queue for non-data changes.
The add index queue for data changes/reorganizations, so that DDLs which do not require data reorganization/backfilling are not blocked behind them.
The two base operations on these queues are enqueueing, which adds one DDL job to the end of a queue, and dequeueing, which pops the first DDL job from a queue (removing it from the queue and returning it). When a DDL job is completed it is moved to the DDL history.
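The sketch below only illustrates the queue semantics (two named FIFO queues, enqueue at the tail, dequeue at the head); it is not the actual meta/KV API that TiDB uses to persist the queues:

```go
package main

import "fmt"

// Job is a placeholder for a DDL job; the real one carries much more state.
type Job struct {
	ID    int64
	Query string
}

// Two cluster-wide queues, keyed by purpose: one for metadata-only jobs and
// one for jobs that need data reorganization/backfilling (e.g. ADD INDEX).
var ddlQueues = map[string][]Job{
	"default":   nil,
	"add_index": nil,
}

// enqueue appends a job to the end of the chosen queue.
func enqueue(queue string, job Job) {
	ddlQueues[queue] = append(ddlQueues[queue], job)
}

// dequeue removes and returns the first job of the chosen queue.
func dequeue(queue string) (Job, bool) {
	q := ddlQueues[queue]
	if len(q) == 0 {
		return Job{}, false
	}
	ddlQueues[queue] = q[1:]
	return q[0], true
}

func main() {
	enqueue("default", Job{ID: 1, Query: "CREATE TABLE t (a int)"})
	if job, ok := dequeue("default"); ok {
		fmt.Println("picked job", job.ID, job.Query)
	}
}
```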
There are two main execution parts handling DDLs:
The TiDB session, which executes your statements. It parses and validates the SQL DDL statement, creates a DDL job, and enqueues it in the corresponding queue. It then monitors the DDL history until the operation is complete (succeeded or failed) and returns the result back to the MySQL client.
The DDL background goroutines:
The limitDDLJobs goroutine, which takes jobs from the sessions and adds them to the DDL queues in batches.
The two DDL workers for processing DDLs:
The general worker, handling the default DDL queue where only metadata changes are needed.
The add index worker, which updates/backfills data requested in the AddIndexJob queue.
The owner manager, which makes sure that one and only one TiDB node in the cluster is the DDL owner/manager.
The execution of a DDL statement is started through the 'Next' iterator of the DDLExec executor (just like normal query execution), where the different DDL operations are executed as their own functions.
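A heavily simplified sketch of that dispatch follows; apart from CreateTableStmt, which is mentioned on this page, the statement types and method names (such as executeCreateTable) are only illustrative stand-ins:

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal stand-ins for the parser's AST statement types.
type CreateTableStmt struct{ Table string }
type DropTableStmt struct{ Table string }

// DDLExec sketches the executor that runs DDL statements through its Next call.
type DDLExec struct {
	stmt interface{} // the parsed DDL statement (an AST node)
	done bool
}

// Next executes the DDL statement once, dispatching to a per-statement
// function, just like normal query execution drives executors through Next.
func (e *DDLExec) Next() error {
	if e.done {
		return nil
	}
	e.done = true
	switch s := e.stmt.(type) {
	case *CreateTableStmt:
		return e.executeCreateTable(s)
	case *DropTableStmt:
		return e.executeDropTable(s)
	default:
		return errors.New("unsupported DDL statement in this sketch")
	}
}

func (e *DDLExec) executeCreateTable(s *CreateTableStmt) error {
	// In TiDB this builds the table metadata and enqueues a DDL job.
	fmt.Println("create table", s.Table)
	return nil
}

func (e *DDLExec) executeDropTable(s *DropTableStmt) error {
	fmt.Println("drop table", s.Table)
	return nil
}

func main() {
	exec := &DDLExec{stmt: &CreateTableStmt{Table: "t"}}
	_ = exec.Next()
}
```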
Let us use the simple CREATE TABLE as an example (which does not need any of the WriteOnly or DeleteOnly states).
This statement has the CreateTableStmt abstract syntax tree type and is handled by the corresponding executor and DDL API functions.
These fill in a TableInfo struct according to the table definition in the statement, create a DDL job, and submit it; the job then goes through the limitDDLJobs goroutine, which adds one or more jobs to the DDL job queue in batches.
The DDL job is encoded as JSON.
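As a rough illustration of what such an encoded job could look like, here is a heavily simplified stand-in struct being marshalled to JSON; the real job struct carries many more fields (arguments, error information, row counts, binlog info, and so on), and the field names here are only indicative:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Job is a simplified stand-in for the DDL job metadata that is queued and
// persisted; the real struct in TiDB carries many more fields.
type Job struct {
	ID          int64  `json:"id"`
	Type        string `json:"type"`         // e.g. "create table"
	SchemaID    int64  `json:"schema_id"`
	TableID     int64  `json:"table_id"`
	State       string `json:"state"`        // job state: queueing, running, done, ...
	SchemaState string `json:"schema_state"` // none, delete only, write only, ..., public
	Query       string `json:"query"`        // the original DDL statement
}

func main() {
	job := Job{
		ID:          53,
		Type:        "create table",
		SchemaID:    1,
		TableID:     52,
		State:       "queueing",
		SchemaState: "none",
		Query:       "CREATE TABLE t (id int primary key)",
	}
	out, _ := json.MarshalIndent(job, "", "  ")
	fmt.Println(string(out))
}
```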
When the tidb-server starts, it initializes the DDL subsystem: it creates a DDL object and starts the limitDDLJobs goroutine and the two DDL workers. It also starts the owner-election campaign loop, which monitors the owner election and makes sure a new owner is elected when needed.
A DDL worker goes through the following workflow in a loop (each iteration may handle only one job state, leaving further work to a later iteration):
Wait for a signal from local sessions, a global change through PD, or a ticker (2 * lease time, at most 1 second), and then call handleDDLJobQueue.
Start a transaction.
Check whether it is the owner (if not, just return).
Pick the first job from its DDL queue.
Wait for dependent jobs to finish first (for example, a reorganization/backfill job has to wait for the metadata jobs it depends on).
Wait for the current/old schema version to be globally synchronized, if needed, by waiting until the lease time has passed or until all TiDB nodes have updated their schema version.
If the job is done (completed or rolled back):
Clean up old physical tables or indexes that are not part of the new table definition.
Remove the job from the DDL queue.
Add the job to the DDL History.
Return from handleDDLJobQueue; we are finished!
Otherwise, execute the actual DDL job (see more about this below).
Update the DDL job in the queue, ready for the next loop/transaction.
Write to the binlog.
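A compressed sketch of that loop follows, with the per-step helpers reduced to stubs; names like pickFirstJob and runJobStep are stand-ins, and the real handleDDLJobQueue additionally runs inside a transaction and has far more error handling:

```go
package main

import "fmt"

// Stubs standing in for the real worker, job, and storage machinery.
type Job struct {
	ID   int64
	Done bool
}

type worker struct{ isOwner bool }

func (w *worker) pickFirstJob() *Job         { return &Job{ID: 1} }
func (w *worker) waitDependentJobs(job *Job) {}
func (w *worker) waitSchemaSynced(job *Job)  {}
func (w *worker) finishJob(job *Job)         { fmt.Println("job", job.ID, "moved to history") }
func (w *worker) runJobStep(job *Job)        { job.Done = true } // one state transition per call
func (w *worker) updateJobInQueue(job *Job)  {}
func (w *worker) writeBinlog(job *Job)       {}

// handleDDLJobQueue sketches one iteration of the worker workflow described
// above. In TiDB the body runs inside a transaction, and the loop is entered
// again on the next signal/tick until the job reaches its final state.
func (w *worker) handleDDLJobQueue() {
	// (A transaction would be started here.)
	if !w.isOwner {
		return // only the DDL owner processes jobs
	}
	job := w.pickFirstJob()
	w.waitDependentJobs(job)
	w.waitSchemaSynced(job)
	if job.Done { // completed or rolled back
		w.finishJob(job) // clean up, remove from queue, add to the DDL history
		return
	}
	w.runJobStep(job) // execute the actual DDL change for the current state
	w.updateJobInQueue(job)
	w.writeBinlog(job)
}

func main() {
	w := &worker{isOwner: true}
	w.handleDDLJobQueue()
}
```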
Each DDL operation is handled separately in the job execution, which is also one of the reasons TiDB has the limitation of only one DDL operation at a time (i.e. it is not possible to add one column and drop another column in the same DDL statement).
Notice that it will take another iteration of handleDDLJobQueue above to finish the DDL job, by updating the schema version and synchronizing it with the other TiDB nodes.
The execution of the job's DDL changes, dispatching on the job type, looks like this:
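A sketch of that dispatch follows; the function and case names here are illustrative, and the real implementation covers every DDL job type and also handles cancellation and rollback:

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified job and worker stand-ins for the dispatch sketch.
type Job struct {
	Type string // e.g. "create table", "drop table", "add index"
}

type worker struct{}

// runDDLJobStep dispatches one job to the function that implements that DDL
// operation; each operation type is handled by its own case.
func (w *worker) runDDLJobStep(job *Job) (schemaVersion int64, err error) {
	switch job.Type {
	case "create schema":
		return w.onCreateSchema(job)
	case "create table":
		return w.onCreateTable(job)
	case "drop table":
		return w.onDropTable(job)
	case "add index":
		return w.onAddIndex(job)
	default:
		return 0, errors.New("job type not covered by this sketch")
	}
}

func (w *worker) onCreateSchema(job *Job) (int64, error) { return 1, nil }
func (w *worker) onCreateTable(job *Job) (int64, error)  { return 1, nil }
func (w *worker) onDropTable(job *Job) (int64, error)    { return 1, nil }
func (w *worker) onAddIndex(job *Job) (int64, error)     { return 1, nil }

func main() {
	w := &worker{}
	ver, err := w.runDDLJobStep(&Job{Type: "create table"})
	fmt.Println(ver, err)
}
```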
Following the example of CREATE TABLE, we see that it is handled by the create-table callback, which, after some checks, creates a new schema version and, if the table is not yet in the StatePublic state, creates the table by simply writing the TableInfo as JSON into the meta database.
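Conceptually (this is not the real meta API), that final write boils down to serializing the TableInfo and storing it under a key derived from the schema and table IDs:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TableInfo is a heavily trimmed stand-in for the table metadata; the real
// struct describes columns, indexes, partitioning, and more.
type TableInfo struct {
	ID    int64  `json:"id"`
	Name  string `json:"name"`
	State string `json:"state"` // ends up as "public" once the DDL is done
}

// metaStore mimics the meta layer: a KV mapping from a metadata key to the
// JSON-encoded table information.
var metaStore = map[string][]byte{}

// createTableSketch writes the table metadata under a key derived from the
// schema ID and table ID, which is essentially what the meta layer does.
func createTableSketch(schemaID int64, tbl *TableInfo) error {
	data, err := json.Marshal(tbl)
	if err != nil {
		return err
	}
	key := fmt.Sprintf("DB:%d/Table:%d", schemaID, tbl.ID)
	metaStore[key] = data
	return nil
}

func main() {
	_ = createTableSketch(1, &TableInfo{ID: 52, Name: "t", State: "public"})
	fmt.Println(string(metaStore["DB:1/Table:52"]))
}
```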
An overview of the DDL execution flow in the TiDB cluster can be seen here:
And more specifically for the DDL worker: