After intial sync: Some pending transactions prevents to riged out dead tuples
The branch https://gitlab.syncad.com/hive/hivemind/-/tree/mi_log_update_rshares_plan has extended log to better explain what and why is slow during processing hive posts table after initial sync
db_state.py: def _finish_hive_posts. It turned out that calling
VACUUM ANALYZE hive_posts does not clean dead tuples because some other old transaction is still pending ( probably started in a separated thread by method
In the log we can find:
267945 INFO - sqlalchemy.dialects.postgresql - INFO: "hive_posts": found 0 removable, 148266746 nonremovable row versions in 6258066 out of 6443027 pages 267945 DETAIL: 55532380 dead row versions cannot be removed yet, oldest xmin: 14963082
It means that transaction 14963082 is still pending and do not allow the removal of dead tuples by newer transactions. Methods
update_active_starting_from_posts_on_block are executed on hive_post table which is full of dead tuples. It is not clear how this situation affects the performance of the methods (they takes 1601s and 11916s = 3h45m ).
At some point in log we can see:
282308 INFO - sqlalchemy.dialects.postgresql - INFO: "hive_posts": found 36379428 removable, 92736146 nonremovable row versions in 6409695 out of 6443027 pages 282308 DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 14963116
It means that old transaction 14963082 was already finished and did not block removing dead tuples. The only method from other threads which reported its finish was
update_account_reputations executed, so it is suspected to be the blocker 14963082.
If a large number of dead tuples slow down significantly operations on hive_posts we should consider execute them sequentially before the start of other finishing methods ( refactor