TL;DR: When a race condition between two MySQL transactions occurs, the row with ID = N may appear in the database BEFORE another row with ID < N.
Last week, I was asked to build a microservice responsible for polling a table for data and pushing it to a Kafka topic. As the table contains immutable events, it’d be wise to use some kind of incremental loading.
The Confluent JDBC Connector (see more) is a ready-made add-on exploiting Kafka Connect possibilities; it reduces building such a service to a few lines of properties files, and the data just flows… ;)
However, one needs to set the properties carefully, adjusting each of them to the particular use case. One of them is the mode, which defines how the table should be queried for new rows. By default, it adopts the incrementing strategy, which should be fine for most cases. At least, the docs don’t mention any risks involved here.
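For illustration, a minimal source-connector configuration using the incrementing mode might look roughly like this (connection details, table and topic names are placeholders, not my actual setup):

```properties
name=foo-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:mysql://localhost:3306/mydb?user=kafka&password=secret
table.whitelist=foo
mode=incrementing
incrementing.column.name=id
topic.prefix=mysql-
```

With mode=incrementing, the connector remembers the largest value of incrementing.column.name it has seen and, on each poll, asks only for rows with a greater value.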
Beware, it’s a trap! (At least with MySQL, which I had to use here.)
We need three terminals here; let’s call them A, B and C.
In terminal A, we create an empty table foo (a very simple one: an auto_increment primary key, a created_at timestamp and a text column):

```sql
create table foo (
    id int not null auto_increment,
    value char(30) not null,
    created_at timestamp default current_timestamp,
    primary key (id)
);
Query OK, 0 rows affected (0,08 sec)

select * from foo;
Empty set (0,00 sec)
```
OK, done - created and empty. Let’s use the second connection, terminal B, to start a new transaction and insert a row alfa (I’ll use the NATO alphabet to maintain a readable order).

```sql
start transaction;
insert into foo (value) values ('alfa');
```
Since we’re in the middle of an uncommitted transaction (with the default isolation level), alfa is not yet visible to connection A:

```sql
select * from foo;
Empty set (0,00 sec)
```
Let’s use the last connection, C, to insert a bravo row (with autocommit):

```sql
insert into foo (value) values ('bravo');
Query OK, 1 row affected (0,01 sec)
```
This should be visible from now on for all other connections; that’s what terminal A says:

```sql
select * from foo;
+----+-------+---------------------+
| id | value | created_at          |
+----+-------+---------------------+
|  2 | bravo | 2018-04-09 13:13:51 |
+----+-------+---------------------+
1 row in set (0,00 sec)
```
It’s there! As you can see, it got assigned ID = 2, because ID = 1 had been previously allocated for the alfa row. But hey, where is alfa?! Uncommitted! Hanging in the air!
Now we commit the connection B transaction:

```sql
commit;
Query OK, 0 rows affected (0,02 sec)
```
…and run a query using connection A:

```sql
select * from foo;
+----+-------+---------------------+
| id | value | created_at          |
+----+-------+---------------------+
|  1 | alfa  | 2018-04-09 13:13:21 |
|  2 | bravo | 2018-04-09 13:13:51 |
+----+-------+---------------------+
2 rows in set (0,01 sec)
```
Both alfa and bravo rows are visible now.
During the between-commit phase, the row with ID = 2 was visible to other connections while the row with ID = 1 was still uncommitted, and thus not visible to others.
This is why I claim that in edge cases, when a race condition between transactions occurs, auto_increment breaks its monotonic property. This is due to the fact that the value is allocated on insert, not on commit.
For the microservice I’m developing, this is a serious threat. Consider that we query the db during this short between-commit phase:
- we retrieve a single row, ID = 2,
- we process this batch of one row,
- we store the ID = 2 state for the next batch, telling it “you should add where ID > 2 to your query”.
We’ve just omitted the alfa row - with where ID > 2, it will never be picked up.
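The failure mode can be reproduced without a database. Here’s a toy model (a deliberate simplification of InnoDB’s behaviour, with made-up class and method names) in which IDs are allocated at insert time but rows become visible only at commit, and a poller tracks the highest ID it has seen:

```python
import itertools

class Table:
    """Toy model: auto_increment IDs are allocated at insert,
    but a row only becomes visible once its transaction commits."""
    def __init__(self):
        self._seq = itertools.count(1)
        self._pending = {}   # id -> value, inserted but not committed
        self._visible = {}   # id -> value, committed

    def insert(self, value):
        row_id = next(self._seq)       # ID allocated immediately, pre-commit
        self._pending[row_id] = value
        return row_id

    def commit(self, row_id):
        self._visible[row_id] = self._pending.pop(row_id)

    def select_where_id_gt(self, last_id):
        return sorted(i for i in self._visible if i > last_id)

table = Table()
last_seen = 0                      # the poller's stored offset

alfa = table.insert('alfa')        # transaction B: gets ID 1, stays open
bravo = table.insert('bravo')      # transaction C: gets ID 2...
table.commit(bravo)                # ...and commits immediately (autocommit)

# A poll lands in the between-commit window: only ID 2 is visible.
batch = table.select_where_id_gt(last_seen)
last_seen = max(batch)             # offset advances to 2

table.commit(alfa)                 # B finally commits ID 1

# The next poll asks for id > 2, so the alfa row (ID 1) is skipped forever.
missed = table.select_where_id_gt(last_seen)
print(batch, missed)               # [2] []
```

The poller behaves correctly at every step; the data is lost purely because ID allocation order and commit order diverged.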
Here’s a list of proposals that might work (or might not):
- Switch to timestamp mode on the created_at column to improve the situation? I don’t think it’ll help, but that’s one of the modes the Kafka JDBC connector can operate in.
- Wait X seconds until all transactions with ID < N are committed (e.g. innodb_lock_wait_timeout + 5 seconds).
- Dump the whole table instead of incremental loading.
- Mark the already pushed rows with a flag and query with where flag = false.
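The flag-based variant can be sketched in a few lines; this is a minimal illustration using an in-memory SQLite database for brevity (the pushed column and table layout are made up, not the connector’s mechanism):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    create table foo (
        id integer primary key autoincrement,
        value text not null,
        pushed integer not null default 0   -- the 'already sent' flag
    )
""")
conn.executemany("insert into foo (value) values (?)",
                 [('alfa',), ('bravo',), ('charlie',)])
conn.commit()

def poll_batch(conn):
    """Fetch unpushed rows and mark them as pushed in one transaction.
    Selection no longer depends on ID order, so a row committed late
    simply shows up with pushed = 0 in a later poll."""
    rows = conn.execute(
        "select id, value from foo where pushed = 0 order by id").fetchall()
    conn.executemany("update foo set pushed = 1 where id = ?",
                     [(r[0],) for r in rows])
    conn.commit()
    return rows

first = poll_batch(conn)    # all three rows
second = poll_batch(conn)   # nothing left
print(first, second)
```

The price is that the source table is no longer append-only from the poller’s point of view - every poll issues updates against it.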
In case of any questions or comments, don’t hesitate to reach out.