Issue
The JobManager and TaskManagers fail with `org.rocksdb.RocksDBException: ... Not supported` and an entry like the following in the logs:
2019-10-15 12:33:53,699 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Could not complete snapshot 1 for operator testOperator (1/2).
org.rocksdb.RocksDBException: while link file to /mnt/checkpoints/job/<path>/chk-1.tmp/000022.sst: /mnt/checkpoints/job/<path>/db/000022.sst: Not supported
    at org.rocksdb.Checkpoint.createCheckpoint(Native Method)
    at org.rocksdb.Checkpoint.createCheckpoint(Checkpoint.java:51)
    at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.takeDBNativeCheckpoint(RocksIncrementalSnapshotStrategy.java:243)
    at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.doSnapshot(RocksIncrementalSnapshotStrategy.java:154)
    at org.apache.flink.contrib.streaming.state.snapshot.RocksDBSnapshotStrategyBase.snapshot(RocksDBSnapshotStrategyBase.java:128)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.snapshot(RocksDBKeyedStateBackend.java:484)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:407)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1113)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1055)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:729)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:641)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:586)
    at org.apache.flink.streaming.runtime.io.BarrierTracker.notifyCheckpoint(BarrierTracker.java:270)
    at org.apache.flink.streaming.runtime.io.BarrierTracker.processBarrier(BarrierTracker.java:186)
    at org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:105)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:273)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
    at java.lang.Thread.run(Thread.java:748)
Environment
- Flink version: 1.8 and later.
- Cloud: Azure
- Azure File Share mounted at `/mnt/checkpoints`
- RocksDB is used as state backend with the following settings:
state.backend: rocksdb
state.backend.incremental: 'true'
state.checkpoints.dir: 'file:///mnt/checkpoints'
state.backend.rocksdb.localdir: /mnt/checkpoints/job
Resolution
Keep `state.checkpoints.dir` on a distributed file system (Azure File Share in this case), but move `state.backend.rocksdb.localdir` to a local file system, for example `/tmp`.
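Before pointing `state.backend.rocksdb.localdir` at a new directory, it can help to confirm that the directory does not sit on a network mount. The following is a minimal sketch using only the Java standard library. The class name `LocalDirCheck` and the list of file system type names are assumptions made for illustration; the type strings reported by `FileStore.type()` are platform-specific, and the list is not exhaustive.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Flink.
final class LocalDirCheck {

    // Assumption: type names commonly reported for SMB/NFS mounts on Linux.
    // Other platforms or mount helpers may report different names.
    private static final Set<String> NETWORK_FS_TYPES =
            new HashSet<>(Arrays.asList("cifs", "smb3", "smbfs", "nfs", "nfs4"));

    /** Returns true if the given file system type name is in the assumed network list. */
    static boolean isLikelyNetworkFileSystem(String fsType) {
        return NETWORK_FS_TYPES.contains(fsType.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect: first argument, or the JVM temp directory by default.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));

        // Look up the file store (mount) that backs the directory.
        FileStore store = Files.getFileStore(dir);
        System.out.println(dir + " is on " + store.name() + " (type " + store.type() + ")");

        if (isLikelyNetworkFileSystem(store.type())) {
            System.out.println("Warning: this looks like a network file system; "
                    + "consider a local disk for state.backend.rocksdb.localdir");
        }
    }
}
```

Run it on a TaskManager host with the candidate directory as the only argument. A result that is not flagged does not guarantee a local disk, so treat the output as a hint rather than proof.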
Cause
When `state.checkpoints.dir` and `state.backend.rocksdb.localdir` are configured to use the same file system, RocksDB uses hard links for checkpointing. Azure File Share, however, does not support hard links, so the snapshot fails with the `Not supported` error shown above.
Important: RocksDB is a local embedded database used by Flink on each TaskManager. It is not used for state persistence or fault tolerance. As such, it should always be on a local file system, preferably a locally-attached SSD.
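The hard-link limitation described in this section can be checked directly on a mounted directory. The sketch below is a hypothetical standalone probe (the class name `HardLinkProbe` and the probe file names are assumptions, not Flink or RocksDB code). It creates a temporary file in the directory and attempts to hard-link it with `java.nio.file.Files.createLink`, which is roughly the kind of operation that fails in the stack trace above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Flink or RocksDB.
final class HardLinkProbe {

    /**
     * Returns true if a hard link could be created inside the given directory.
     * Any failure, including a missing or unwritable directory, is reported as false.
     */
    static boolean supportsHardLinks(Path dir) {
        Path target = null;
        Path link = null;
        try {
            // Create a small probe file, then try to hard-link it in the same directory.
            target = Files.createTempFile(dir, "probe-", ".sst");
            link = dir.resolve(target.getFileName() + ".link");
            Files.createLink(link, target);
            return true;
        } catch (UnsupportedOperationException | IOException e) {
            return false;
        } finally {
            // Clean up whatever was created, ignoring errors.
            deleteQuietly(link);
            deleteQuietly(target);
        }
    }

    private static void deleteQuietly(Path path) {
        if (path == null) {
            return;
        }
        try {
            Files.deleteIfExists(path);
        } catch (IOException ignored) {
            // Best-effort cleanup only.
        }
    }

    public static void main(String[] args) {
        // Directory to probe: first argument, or the JVM temp directory by default.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " supports hard links: " + supportsHardLinks(dir));
    }
}
```

Running the probe against `/mnt/checkpoints` and against the intended local directory should show whether each location can hold hard links. Because the probe reports every failure as `false`, a negative result on a directory that is missing or read-only says nothing about hard-link support.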