Fixes for deadlocks in AH plugin

Andrzej Lisak requested to merge abw_ah_lock_fix into develop

The first problem was common to two AH API calls when made with specific arguments:

  • account_history_api::get_ops_in_block("block_num":0,"include_reversible":true)
  • account_history_api::enum_virtual_ops("block_range_begin":0,"block_range_end":3,"group_by_block":true,"include_reversible": true,"limit":10)

When reversible data for block 0 was requested, the code entered a section that takes a lock and waits for one of two things to finish: the ongoing move of reversible data to the irreversible state or, under a second lock, the ongoing block processing. The problem is that a value of zero indicates that no such process is ongoing. As a result, a stopped hived instance would wait forever, and a hived instance in live sync would wait at least until one of those protected processes started. In unit tests this is a deadlock: there is only one thread, so once it waits, nothing can unblock it in the background.
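To illustrate the first problem, here is a minimal sketch of how a "0 means idle" marker can collide with a request for block 0. It is not the plugin's code: `reversible_sync_sketch`, `block_in_progress`, and both predicates are hypothetical names, and the exact condition used in the AH plugin may look different. Only the convention that zero means "nothing in progress" comes from the description above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical model of the plugin state (names and layout are assumptions).
struct reversible_sync_sketch {
  std::mutex mtx;
  std::condition_variable cv;
  uint32_t block_in_progress = 0; // 0 = no block is being processed or moved

  // Assumed shape of the bug: a request for block 0 matches the idle marker.
  static constexpr bool must_wait_buggy(uint32_t requested, uint32_t in_progress) {
    return requested == in_progress;
  }

  // Guarded version: the idle marker never forces a wait.
  static constexpr bool must_wait_fixed(uint32_t requested, uint32_t in_progress) {
    return in_progress != 0 && requested == in_progress;
  }

  // Blocks only while the requested block is really being worked on.
  void wait_until_readable(uint32_t requested) {
    std::unique_lock<std::mutex> guard(mtx);
    cv.wait(guard, [&] { return !must_wait_fixed(requested, block_in_progress); });
  }

  // Called by the (hypothetical) writer when it finishes its work.
  void finish_processing() {
    {
      std::lock_guard<std::mutex> guard(mtx);
      block_in_progress = 0;
    }
    cv.notify_all();
  }
};

// Usage: with nothing in progress, a request for block 0 returns at once.
inline void reversible_sync_usage() {
  reversible_sync_sketch s;
  assert(reversible_sync_sketch::must_wait_buggy(0, 0)); // old check would wait
  s.wait_until_readable(0);                              // guarded check does not
  assert(!reversible_sync_sketch::must_wait_fixed(0, 0));
}
```

With the unguarded predicate, a single-threaded test would never leave the wait, because no other thread exists to call `finish_processing()`.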

The second problem was spotted during code analysis. Because issue #255 has not been addressed yet, a faulty plugin attached to the on_post_apply_block signal could prevent the AH plugin from releasing the lock it takes in on_pre_apply_block. That lock would then deadlock on reentry (during processing of the next block).
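One way to make such a lock survive a skipped post-apply handler is sketched below. This is an assumption about a possible approach, not necessarily what this merge request does: `block_lock_sketch` and its members are hypothetical, and the sketch assumes both handlers run on the same (writer) thread, since a `std::mutex` must be unlocked by the thread that locked it.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the AH plugin's per-block lock handling.
class block_lock_sketch {
public:
  void on_pre_apply_block() {
    // If the previous post-apply handler never ran, the lock is still held.
    // Release it first so that re-acquiring cannot block on ourselves.
    if (held_.owns_lock())
      held_.unlock();
    held_.lock();
  }

  void on_post_apply_block() {
    if (held_.owns_lock())
      held_.unlock();
  }

  bool is_locked() const { return held_.owns_lock(); }

private:
  std::mutex mtx_; // declared before held_, which refers to it
  std::unique_lock<std::mutex> held_{mtx_, std::defer_lock};
};

// Usage: a missed post-apply call no longer blocks the next block.
inline void block_lock_usage() {
  block_lock_sketch ah;
  ah.on_pre_apply_block();  // block N: lock taken
  // on_post_apply_block() for block N is skipped (faulty plugin upstream)
  ah.on_pre_apply_block();  // block N+1: stale lock released, then retaken
  assert(ah.is_locked());
  ah.on_post_apply_block();
  assert(!ah.is_locked());
}
```

Keeping the ownership state in a `std::unique_lock` makes the "is it still held?" question answerable without a separate flag.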

The locking used in the AH plugin as a whole does not seem to prevent reading during writing, but I did not want to fix that here. We should be taking a regular read lock; however, if we did, it would block the writer thread, and AH calls can be big and slow. Since no one has complained so far, I guess an occasional crash during an API call is an acceptable price for not blocking the writer thread (or maybe reading irreversible data is not done frequently enough for the issue to become an official problem).
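For reference, a regular read lock of the kind mentioned above could look roughly like the sketch below. It is deliberately not part of this change, and `op_store_sketch` is a hypothetical container, not the plugin's storage. It shows the trade-off: while any reader holds the shared lock, `write()` has to wait.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical operation store guarded by a reader/writer lock (C++17).
class op_store_sketch {
public:
  // Writer side: exclusive lock, waits for all readers to leave.
  void write(uint32_t block, std::string op) {
    std::unique_lock<std::shared_mutex> guard(mtx_);
    ops_[block] = std::move(op);
  }

  // Reader side: shared lock, many readers may hold it at once,
  // but a long read keeps the writer out for its whole duration.
  std::string read(uint32_t block) const {
    std::shared_lock<std::shared_mutex> guard(mtx_);
    auto it = ops_.find(block);
    return it == ops_.end() ? std::string() : it->second;
  }

private:
  mutable std::shared_mutex mtx_;
  std::map<uint32_t, std::string> ops_;
};

// Usage: reads see what was written; unknown blocks yield an empty string.
inline void op_store_usage() {
  op_store_sketch store;
  store.write(1, "transfer");
  assert(store.read(1) == "transfer");
  assert(store.read(2).empty());
}
```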
