
Implement storing binary serialized operations in the HAF database

Merged: Mateusz Tyszczak requested to merge tm-ops-as-hive-operation into develop

Related issue: #92 (closed)

Merge Request related to this one on the hive-side: hive!802 (merged)

Changes to the repo:

  • Fixed hive_fork_manager always being built in the mainnet configuration on the CI
  • Added new type: hive.operation
  • Added operator class (using btree) for the hive.operation type
  • Changed hive.operations.body column type from text to hive.operation
  • Made CUSTOM_LOG variadic arguments optional
  • Optimized escape_raw function
  • Changed tests to work with the new type
  • Made colect_data_and_fill_returned_recordset truly noexcept
  • Created the colect_operation_data_and_fill_returned_recordset function, a wrapper around colect_data_and_fill_returned_recordset specifically for the hive.operation type
  • Changed sql_serializer to put binary data into the hive.operations table instead of the json representation
  • Added HiveOperation type class for the sqlalchemy ORM
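To illustrate the hex escaping mentioned in the list above, here is a minimal Python sketch of turning raw serialized bytes into PostgreSQL's hex-format binary text (a `\x` prefix followed by two hex digits per byte). The function names `escape_raw_hex` and `unescape_raw_hex` are hypothetical and chosen for this example only; the actual escape_raw function lives in the sql_serializer and is not reproduced here.

```python
import binascii


def escape_raw_hex(data: bytes) -> str:
    """Encode raw bytes as PostgreSQL hex-format text: \\x followed by hex digits."""
    return "\\x" + binascii.hexlify(data).decode("ascii")


def unescape_raw_hex(text: str) -> bytes:
    """Decode hex-format text produced by escape_raw_hex back into raw bytes."""
    if not text.startswith("\\x"):
        raise ValueError("expected hex-format input starting with \\x")
    return binascii.unhexlify(text[2:])


# Bytes that would need special handling in a quoted text literal
# (NUL, single quote, backslash, a high byte) become plain hex digits.
sample = bytes([0x00, 0x27, 0x5C, 0xFF])
escaped = escape_raw_hex(sample)
print(escaped)  # \x00275cff
```

Hex output never contains quotes or backslashes beyond the prefix, so the encoder needs no per-character branching.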

What should be added/changed/discussed before merging (not critical):

  • Fix warnings generated by the sqlalchemy ORM related to the hive.operation type
  • Make the update_state_provider_keyauth function work with the binary data representation instead of the text one
  • Search for other possibilities to optimize the SQL code execution using binary data representation instead of the text one

@gandalf Please review and test full replay + live sync

Edited by Bartek Wrona

Activity

  • Mateusz Tyszczak added 2 commits

    • 88237c24 - Change function name to better match its purpose
    • f1ec3bf9 - Remove redundant include

  • Mateusz Tyszczak resolved all threads

  • Mateusz Tyszczak added 2 commits

    • b3fa130b - threat system tests warnings as errors and supply clearer logs
    • 92b80299 - in system tests use ORM defined in tables.py

  • Mateusz Tyszczak changed the description

  • We have to analyze why the full replay took longer than expected:

    rev                                 | Replay time [s] | Change [s] (%)   | operations table size [GB] | Change [GB] (%) | overall database size [GB] | Change [GB] (%)
    develop                             | 49081           | +0 (100%)        | 1349                       | +0 (100%)       | 3176                       | +0 (100%)
    tm-ops-as-hive-operation (9bdc8f7c) | 104511          | +55430 (212.94%) | 809                        | -540 (59.97%)   | 2781                       | -395 (87.56%)
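The Change columns in the table above can be reproduced with a short Python sketch. It assumes each percentage is the branch value expressed relative to the develop baseline (so 100% means no change); the helper name `change` is hypothetical.

```python
def change(baseline, value):
    """Return (absolute difference, value as a percentage of baseline)."""
    delta = value - baseline
    percent = round(100.0 * value / baseline, 2)
    return delta, percent


# Values copied from the table: develop first, tm-ops-as-hive-operation second.
replay_time = change(49081, 104511)     # seconds
operations_size = change(1349, 809)     # GB
database_size = change(3176, 2781)      # GB

print(replay_time)       # (55430, 212.94)
print(operations_size)   # (-540, 59.97)
print(database_size)     # (-395, 87.56)
```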
  • Mateusz Tyszczak added 33 commits

    • 92b80299...a257bf20 - 8 commits from branch develop
    • edb801e5 - Implement hive operation postgresql type
    • 64bbaf61 - Fix hive_fork_manager tests
    • 2d590108 - Optimize to hex raw escape in sql serializer
    • d10e9835 - Change utility sql functions input data type from string to hive.operation
    • 396895da - Fixed permission settings for new functions.
    • aceecbe6 - Remove conflicting pg_module_magic
    • f0368d76 - Implement op name in colect_data_and_fill_returned_recordset as arg
    • 993ded4d - Change arg type in get_keyauths_wrapper sql function
    • 6d918f18 - Add get_legacy_style_operation casts to hive.operation type
    • b08d5a05 - Fix haf tests to work with new hive operation type
    • 14076e4c - Make collect data and fill function fully noexcept
    • 23ad4e19 - Fix hive.operation combining queries
    • c6db9efb - Fix tests
    • 1035b18b - Replace old libdir in sql to module_pathname
    • bf048d30 - Update README
    • a76859fd - Update hive submodule
    • 6b384053 - Remove redundant hashable operator class for hive operation
    • 57f653c3 - Change column type in table tests
    • 00be0fcd - Fix hfm building always in mainnet
    • 3b41f5fc - Fix tests
    • 45bbc4c0 - Build hfm in testnet on CI
    • 11092b7b - Change function name to better match its purpose
    • ba5e3b4d - Remove redundant include
    • b833fd7d - threat system tests warnings as errors and supply clearer logs
    • 3f976b3c - in system tests use ORM defined in tables.py

  • added 1 commit

    • 4c944718 - in system tests use ORM defined in tables.py

  • added 1 commit

    • f1e656b5 - in system tests use ORM defined in tables.py

    • Resolved by Mateusz Tyszczak

      @mtyszczak I'm a little confused by the data in the table above. It seems to indicate that the operations table size decreased by 540 GB, but the overall database size only decreased by 395 GB. Does this mean that some extra data needed to be added to compensate for the new schema? Or does it just indicate that unrelated changes have increased the overall database size? Or maybe it is just because we're processing more blocks in the second test?

      Edited by Dan Notestein
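The gap raised in this question can be quantified from the table values alone. The sketch below only restates the arithmetic and does not identify the cause; the dictionary layout and the helper `rest_of_database` are hypothetical names used for illustration.

```python
# Sizes in GB, copied from the replay comparison table.
develop = {"operations": 1349, "database": 3176}
branch = {"operations": 809, "database": 2781}


def rest_of_database(sizes):
    """Size of everything in the database except the operations table."""
    return sizes["database"] - sizes["operations"]


ops_saved = develop["operations"] - branch["operations"]
db_saved = develop["database"] - branch["database"]
rest_growth = rest_of_database(branch) - rest_of_database(develop)

print(ops_saved, db_saved, rest_growth)  # 540 395 145
```

In other words, everything outside the operations table is 145 GB larger on the branch (1972 GB versus 1827 GB), which accounts for the difference between the two savings figures.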
  • Mateusz Tyszczak added 46 commits

    • f1e656b5...9cb65db8 - 21 commits from branch develop
    • 899f0343 - Implement hive operation postgresql type
    • 216d7db1 - Fix hive_fork_manager tests
    • b338d863 - Optimize to hex raw escape in sql serializer
    • aad688bf - Change utility sql functions input data type from string to hive.operation
    • 623325e0 - Fixed permission settings for new functions.
    • 1b249f91 - Remove conflicting pg_module_magic
    • 287341f6 - Implement op name in colect_data_and_fill_returned_recordset as arg
    • 7551b009 - Change arg type in get_keyauths_wrapper sql function
    • 72ab75a7 - Add get_legacy_style_operation casts to hive.operation type
    • 483e11e6 - Fix haf tests to work with new hive operation type
    • 9c00163e - Make collect data and fill function fully noexcept
    • 0161fbb4 - Fix hive.operation combining queries
    • 29ef7bcd - Fix tests
    • d973e34f - Replace old libdir in sql to module_pathname
    • 539d8ee2 - Update README
    • 06c055e7 - Remove redundant hashable operator class for hive operation
    • e9c0ce23 - Change column type in table tests
    • b224808a - Fix hfm building always in mainnet
    • c990681c - Fix tests
    • f14d0c5d - Build hfm in testnet on CI
    • a299ab18 - Change function name to better match its purpose
    • ab038426 - Remove redundant include
    • b37bdb99 - threat system tests warnings as errors and supply clearer logs
    • 55dfcfd4 - in system tests use ORM defined in tables.py
    • 5ed175f2 - Update hive submodule

  • Mateusz Tyszczak added 2 commits

    • 8cf4e20e - in system tests use ORM defined in tables.py
    • c034b0b3 - Update hive submodule

  • added 1 commit

    • df058f34 - Fix operation JSON values in tests

  • Mateusz Tyszczak added 21 commits

    • 43d6b2ee - 1 commit from branch develop
    • 45367181 - Update hive submodule
    • ccfee456 - Used hive_options.cmake defined in Hive submodule, to adjust hfm building...
    • 6a11819b - Build hfm in testnet configuration on CI jobs requiring it.
    • 8553978a - Optimize to hex raw escape in sql serializer
    • c2e041b6 - Change utility sql functions input data type from string to hive.operation
    • b8de2bf4 - Remove conflicting pg_module_magic
    • cd4be1e2 - Implement op name in colect_data_and_fill_returned_recordset as arg
    • 494df96c - Change arg type in get_keyauths_wrapper sql function
    • 66a5c909 - Add get_legacy_style_operation casts to hive.operation type
    • d0e70870 - Make collect data and fill function fully noexcept
    • c1758b38 - Fix hive.operation combining queries
    • 4b15ec49 - Replace old libdir in sql to module_pathname
    • fa41bde2 - Update README
    • ad618e63 - Remove redundant hashable operator class for hive operation
    • 3de5a7eb - Change function name to better match its purpose
    • 44723348 - Operations defined in hive_fork_manager regression tests must match to valid...
    • 661dc66f - Change column type in table tests
    • 33667030 - threat system tests warnings as errors and supply clearer logs
    • 5769793d - in system tests use ORM defined in tables.py
    • 85827902 - WIP: switching hive.operation storage to MAIN to improve performance.

  • Mateusz Tyszczak resolved all threads

  • Mateusz Tyszczak added 17 commits

    • d960cdc8 - Optimize to hex raw escape in sql serializer
    • 2a8e98ca - Change utility sql functions input data type from string to hive.operation
    • 9f6e3026 - Remove conflicting pg_module_magic
    • 4c7d8eae - Implement op name in colect_data_and_fill_returned_recordset as arg
    • 31bc88f3 - Change arg type in get_keyauths_wrapper sql function
    • 22914603 - Add get_legacy_style_operation casts to hive.operation type
    • f8c7af8d - Make collect data and fill function fully noexcept
    • ba432d21 - Fix hive.operation combining queries
    • 45129e91 - Replace old libdir in sql to module_pathname
    • 8942d0f7 - Update README
    • a2256ebc - Remove redundant hashable operator class for hive operation
    • 10ec8d17 - Change function name to better match its purpose
    • 445dec98 - Operations defined in hive_fork_manager regression tests must match to valid...
    • d4097a65 - Change column type in table tests
    • b029dc57 - threat system tests warnings as errors and supply clearer logs
    • 5550bbff - in system tests use ORM defined in tables.py
    • 33d7943d - WIP: switching hive.operation storage to MAIN to improve performance.
