Problem with an uncaught exception from a node
I noticed a problem with an uncaught exception from a node: pytest reacts with "no tests ran".
I hit this problem when running the haf mirrornet tests locally (commit haf@58757585), in test_replay_in_mirrornet.py.
Logs from pytest:
(venv) haf_admin@95f259f2d99d:/workspace/haf$ cd /workspace/haf ; /usr/bin/env /workspace/haf/venv/bin/python /workspace/.vscode-server/extensions/ms-python.python-2023.15.12301911/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher 46001 -- -m pytest --numprocesses=auto /workspace/haf/tests/integration/system/haf/mirrornet_tests/test_replay_in_mirrornet.py --timeout=360000 --block-log-path=/workspace/block_logs/new/block_log --snapshot-path=/workspace/manual_runs/witness_node_for_snapshot/snapshot -m\ mirrornet -s --full-trace
================================================================================================================== test session starts ===================================================================================================================
platform linux -- Python 3.10.12, pytest-7.2.2, pluggy-1.2.0
rootdir: /workspace/haf/tests/integration, configfile: pytest.ini
plugins: rerunfailures-10.2, tavern-2.2.0, repeat-0.9.1, xdist-3.2.0, timeout-2.1.0
timeout: 360000.0s
timeout method: signal
timeout func_only: False
[gw0] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw1] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw2] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw4] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw3] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw5] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw6] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw7] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw9] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw8] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw11] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw10] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw12] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw15] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw13] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
[gw14] Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
gw0 [2] / gw1 [2] / gw2 [2] / gw3 [2] / gw4 [2] / gw5 [2] / gw6 [2] / gw7 [2] / gw8 [2] / gw9 [2] / gw10 [2] / gw11 [2] / gw12 [2] / gw13 [2] / gw14 [2] / gw15 [2]
scheduling tests via LoadScheduling
tests/integration/system/haf/mirrornet_tests/test_replay_in_mirrornet.py::test_replay[disabled_indexes_in_replay]
tests/integration/system/haf/mirrornet_tests/test_replay_in_mirrornet.py::test_replay[enabled_indexes]
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
config = <_pytest.config.Config object at 0x7fb0c24216f0>, doit =
def wrap_session(
config: Config, doit: Callable[[Config, "Session"], Optional[Union[int, ExitCode]]]
) -> Union[int, ExitCode]:
"""Skeleton command line program."""
session = Session.from_config(config)
session.exitstatus = ExitCode.OK
initstate = 0
try:
try:
config._do_configure()
initstate = 1
config.hook.pytest_sessionstart(session=session)
initstate = 2
> session.exitstatus = doit(config, session) or 0
venv/lib/python3.10/site-packages/_pytest/main.py:270:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
config = <_pytest.config.Config object at 0x7fb0c24216f0>, session = testsfailed=0 testscollected=2>
def _main(config: Config, session: "Session") -> Optional[Union[int, ExitCode]]:
"""Default command line protocol for initialization, session,
running tests and reporting."""
config.hook.pytest_collection(session=session)
> config.hook.pytest_runtestloop(session=session)
venv/lib/python3.10/site-packages/_pytest/main.py:324:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_HookCaller 'pytest_runtestloop'>, kwargs = {'session': testsfailed=0 testscollected=2>}, firstresult = True
def __call__(self, **kwargs: object) -> Any:
assert (
not self.is_historic()
), "Cannot directly call a historic hook - use call_historic instead."
self._verify_all_args_are_provided(kwargs)
firstresult = self.spec.opts.get("firstresult", False) if self.spec else False
> return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
venv/lib/python3.10/site-packages/pluggy/_hooks.py:433:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_pytest.config.PytestPluginManager object at 0x7fb0c2efb550>, hook_name = 'pytest_runtestloop'
methods = [>, >]
kwargs = {'session': testsfailed=0 testscollected=2>}, firstresult = True
def _hookexec(
self,
hook_name: str,
methods: Sequence[HookImpl],
kwargs: Mapping[str, object],
firstresult: bool,
) -> object | list[object]:
# called from all hookcaller instances.
# enable_tracing will set its own wrapping function at self._inner_hookexec
> return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
venv/lib/python3.10/site-packages/pluggy/_manager.py:112:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self =
@pytest.hookimpl
def pytest_runtestloop(self):
self.sched = self.config.hook.pytest_xdist_make_scheduler(
config=self.config, log=self.log
)
assert self.sched is not None
self.shouldstop = False
while not self.session_finished:
> self.loop_once()
venv/lib/python3.10/site-packages/xdist/dsession.py:117:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self =
def loop_once(self):
"""Process one callback from one of the workers."""
while 1:
if not self._active_nodes:
# If everything has died stop looping
self.triggershutdown()
raise RuntimeError("Unexpectedly no active workers available")
try:
> eventcall = self.queue.get(timeout=2.0)
venv/lib/python3.10/site-packages/xdist/dsession.py:131:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = , block = True, timeout = 2.0
def get(self, block=True, timeout=None):
'''Remove and return an item from the queue.
If optional args 'block' is true and 'timeout' is None (the default),
block if necessary until an item is available. If 'timeout' is
a non-negative number, it blocks at most 'timeout' seconds and raises
the Empty exception if no item was available within that time.
Otherwise ('block' is false), return an item if one is immediately
available, else raise the Empty exception ('timeout' is ignored
in that case).
'''
with self.not_empty:
if not block:
if not self._qsize():
raise Empty
elif timeout is None:
while not self._qsize():
self.not_empty.wait()
elif timeout < 0:
raise ValueError("'timeout' must be a non-negative number")
else:
endtime = time() + timeout
while not self._qsize():
remaining = endtime - time()
if remaining <= 0.0:
raise Empty
> self.not_empty.wait(remaining)
/usr/lib/python3.10/queue.py:180:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = , 0)>, timeout = 1.9999994300305843
def wait(self, timeout=None):
"""Wait until notified or until a timeout occurs.
If the calling thread has not acquired the lock when this method is
called, a RuntimeError is raised.
This method releases the underlying lock, and then blocks until it is
awakened by a notify() or notify_all() call for the same condition
variable in another thread, or until the optional timeout occurs. Once
awakened or timed out, it re-acquires the lock and returns.
When the timeout argument is present and not None, it should be a
floating point number specifying a timeout for the operation in seconds
(or fractions thereof).
When the underlying lock is an RLock, it is not released using its
release() method, since this may not actually unlock the lock when it
was acquired multiple times recursively. Instead, an internal interface
of the RLock class is used, which really unlocks it even when it has
been recursively acquired several times. Another internal interface is
then used to restore the recursion level when the lock is reacquired.
"""
if not self._is_owned():
raise RuntimeError("cannot wait on un-acquired lock")
waiter = _allocate_lock()
waiter.acquire()
self._waiters.append(waiter)
saved_state = self._release_save()
gotit = False
try: # restore state no matter what (e.g., KeyboardInterrupt)
if timeout is None:
waiter.acquire()
gotit = True
else:
if timeout > 0:
> gotit = waiter.acquire(True, timeout)
/usr/lib/python3.10/threading.py:324:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = , signal_number = 2, current_stack_frame =
@classmethod
def __external_error_handler(cls, signal_number, current_stack_frame) -> None:
with cls.__lock:
if cls.__last_exception is None:
# Default SIGINT handler raises KeyboardInterrupt, so below code is not executed
> signal.default_int_handler(signal_number, current_stack_frame)
E KeyboardInterrupt
hive/tests/hive-local-tools/test-tools/package/test_tools/__private/raise_exception_helper.py:17: KeyboardInterrupt
================================================================================================================= no tests ran in 4.16s ==================================================================================================================
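The bottom frame of the traceback is the custom SIGINT handler from test-tools' raise_exception_helper.py. Paraphrased as a minimal sketch (names simplified, hypothetical; not the actual implementation): when no node exception has been recorded, the handler falls back to Python's default SIGINT handler, which raises KeyboardInterrupt:

```python
import signal

# Hypothetical simplification of __external_error_handler, based on the
# traceback above: with no recorded exception, delegate to the default
# SIGINT handler, which raises KeyboardInterrupt.
last_exception = None

def external_error_handler(signum, frame):
    if last_exception is None:
        # Same call as in the traceback frame
        signal.default_int_handler(signum, frame)

signal.signal(signal.SIGINT, external_error_handler)

try:
    signal.raise_signal(signal.SIGINT)  # simulate the SIGINT from the node
except KeyboardInterrupt:
    print("KeyboardInterrupt, as seen in the pytest output")
```

So any SIGINT that arrives before test-tools has stored a node exception surfaces as a bare KeyboardInterrupt, which pytest-xdist reports as "no tests ran".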
In last_run I found that the node closed with return code -2:
2023-08-21 09:22:09,344 [INFO] HafNode0: Running HafNode0, replaying and waiting for live... (node.py:475)
2023-08-21 09:22:09,344 [INFO] HafNode0: Preparing database postgresql:///haf_block_log_d0540a77cf48416eba1c4c4de04e3294 (_haf_node.py:70)
2023-08-21 09:22:09,677 [DEBUG] HafNode0: Notifications server is listening on 127.0.0.1:37503... (node.py:191)
2023-08-21 09:22:09,677 [DEBUG] HafNode0: /workspace/haf/build/hive/programs/hived/hived -d . --chain-id 42 --skeleton-key 5JNHfZYKGaomSFvd4NUdQ9qMcEAC43kujbfjueTHpVapX1Kzq2n --force-replay (node.py:90)
2023-08-21 09:22:09,677 [INFO] HafNode0: Using time_offset @2016-09-15 19:47:24 (fake_time.py:12)
2023-08-21 09:22:10,080 [DEBUG] HafNode0: Closed with -2 return code (node.py:138)
2023-08-21 09:22:10,177 [DEBUG] HafNode0: Notifications server closed (node.py:242)
2023-08-21 09:22:10,182 [DEBUG] HafNode0: Notifications server closed (node.py:242)
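A negative Popen return code means the child process was killed by that signal number, so -2 corresponds to SIGINT, consistent with the KeyboardInterrupt above. A minimal sketch (using `sleep` as a stand-in for hived):

```python
import signal
import subprocess
import time

# Popen.returncode is -N when the child was terminated by signal N,
# so -2 == -signal.SIGINT, matching "Closed with -2 return code".
proc = subprocess.Popen(["sleep", "10"])
time.sleep(0.2)                 # give the child a moment to start
proc.send_signal(signal.SIGINT)
proc.wait()
print(proc.returncode)          # -2, i.e. -signal.SIGINT
```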
In stderr I found the reason why this exception was thrown:
2844008ms data_processor.cpp:212 handle_exception ] Data processor Check consistency of irreversible data detected SQL statement execution failure. Failing statement: `SELECT hive.initialize_extension_data();'.
Full stderr:
2844003ms json_rpc_plugin.cpp:222 initialize ] initializing JSON RPC plugin
2844003ms webserver_plugin.cpp:584 plugin_initialize ] initializing webserver plugin
2844003ms webserver_plugin.cpp:587 plugin_initialize ] configured with 32 thread pool size
2844003ms webserver_plugin.cpp:590 plugin_initialize ] Compression in webserver is disabled
2844003ms webserver_plugin.cpp:602 plugin_initialize ] configured http to listen on 0.0.0.0:0
2844003ms webserver_plugin.cpp:619 plugin_initialize ] configured ws to listen on 0.0.0.0:0
2844003ms notifications.cpp:64 setup ] setting up notification handler for 1 address
2844003ms database.cpp:673 set_chain_id ] hive_chain_id: 4200000000000000000000000000000000000000000000000000000000000000
2844003ms chain_plugin.cpp:961 plugin_initialize ] Setting custom skeleton key: 5JNHfZYKGaomSFvd4NUdQ9qMcEAC43kujbfjueTHpVapX1Kzq2n
2844003ms witness_plugin.cpp:593 plugin_initialize ] Initializing witness plugin
2844004ms witness_plugin.cpp:613 plugin_initialize ] warning: stale production is enabled, make sure you know what you are doing.
2844004ms witness_plugin.cpp:620 plugin_initialize ] warning: required witness participation=0, normally this should be set to 33
2844004ms account_by_key_plugin.cpp:289 plugin_initialize ] Initializing account_by_key plugin
2844004ms market_history_plugin.cpp:175 plugin_initialize ] market_history: plugin_initialize() begin
2844004ms market_history_plugin.cpp:199 plugin_initialize ] market_history: plugin_initialize() end
2844004ms sql_serializer.cpp:730 plugin_initialize ] Initializing sql serializer plugin
2844004ms sql_serializer.cpp:52 is_database_correct ] Checking correctness of database...
2844004ms data_processor.cpp:132 trigger ] Trying to trigger data processor: Check correctness...
2844004ms data_processor.cpp:136 trigger ] Data processor: Check correctness triggerred...
2844004ms data_processor.cpp:142 trigger ] Waiting until data_processor Check correctness will consume a data...
2844004ms data_processor.cpp:56 operator() ] Entering data processor thread: Check correctness
2844004ms data_processor.cpp:63 operator() ] Check correctness data processor is connecting ...
2844004ms data_processor.cpp:65 operator() ] Check correctness data processor connected successfully ...
2844004ms data_processor.cpp:71 operator() ] Check correctness data processor is waiting for DATA-READY signal...
2844004ms data_processor.cpp:75 operator() ] Check correctness data processor resumed by DATA-READY signal...
2844004ms data_processor.cpp:86 operator() ] Check correctness data processor consumed data - notifying trigger process...
2844004ms data_processor.cpp:92 operator() ] Check correctness data processor starts a data processing...
2844004ms data_processor.cpp:147 trigger ] Leaving trigger of data data processor: Check correctness...
2844004ms transaction_controllers.cpp:124 do_reconnect ] Trying to connect to database: `postgresql:///haf_block_log_d0540a77cf48416eba1c4c4de04e3294'...
2844004ms data_processor.cpp:185 join ] Trying to resume data processor: Check correctness...
2844004ms data_processor.cpp:187 join ] Data processor: Check correctness resumed...
2844006ms transaction_controllers.cpp:126 do_reconnect ] Connected to database: `postgresql:///haf_block_log_d0540a77cf48416eba1c4c4de04e3294'.
2844007ms data_processor.cpp:103 operator() ] Check correctness data processor finished processing a data chunk...
2844007ms data_processor.cpp:111 operator() ] Leaving data processor thread: Check correctness
2844007ms data_processor.cpp:198 join ] Data processor: Check correctness finished execution...
2844007ms data_processor.cpp:132 trigger ] Trying to trigger data processor: Check consistency of irreversible data...
2844007ms data_processor.cpp:136 trigger ] Data processor: Check consistency of irreversible data triggerred...
2844007ms data_processor.cpp:56 operator() ] Entering data processor thread: Check consistency of irreversible data
2844007ms data_processor.cpp:142 trigger ] Waiting until data_processor Check consistency of irreversible data will consume a data...
2844007ms data_processor.cpp:63 operator() ] Check consistency of irreversible data data processor is connecting ...
2844007ms data_processor.cpp:65 operator() ] Check consistency of irreversible data data processor connected successfully ...
2844007ms data_processor.cpp:71 operator() ] Check consistency of irreversible data data processor is waiting for DATA-READY signal...
2844007ms data_processor.cpp:75 operator() ] Check consistency of irreversible data data processor resumed by DATA-READY signal...
2844007ms data_processor.cpp:86 operator() ] Check consistency of irreversible data data processor consumed data - notifying trigger process...
2844007ms data_processor.cpp:92 operator() ] Check consistency of irreversible data data processor starts a data processing...
2844007ms data_processor.cpp:147 trigger ] Leaving trigger of data data processor: Check consistency of irreversible data...
2844007ms transaction_controllers.cpp:124 do_reconnect ] Trying to connect to database: `postgresql:///haf_block_log_d0540a77cf48416eba1c4c4de04e3294'...
2844007ms data_processor.cpp:185 join ] Trying to resume data processor: Check consistency of irreversible data...
2844007ms data_processor.cpp:187 join ] Data processor: Check consistency of irreversible data resumed...
2844008ms transaction_controllers.cpp:126 do_reconnect ] Connected to database: `postgresql:///haf_block_log_d0540a77cf48416eba1c4c4de04e3294'.
2844008ms data_processor.cpp:212 handle_exception ] Data processor Check consistency of irreversible data detected SQL statement execution failure. Failing statement: `SELECT hive.initialize_extension_data();'.
2844008ms data_processor.cpp:16 kill_node ] An error occured and HAF is stopping synchronization...
I tried to find out why the exception was not handled, and it turns out the KeyboardInterrupt is raised directly after the hived subprocess Popen: https://gitlab.syncad.com/hive/test-tools/-/blob/master/package/test_tools/__private/node.py#L108.
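One possible mitigation (an untested sketch, not the actual test-tools code): block SIGINT delivery around the Popen call with `signal.pthread_sigmask`, so the custom handler cannot fire mid-spawn; a pending SIGINT is then delivered as soon as the mask is removed, at a point where it can be handled properly:

```python
import signal
import subprocess

# Hypothetical guard: keep SIGINT masked while spawning the child process
# (here `sleep 0` stands in for the hived command line), then unmask it so
# any pending SIGINT is delivered after the Popen handle is fully set up.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
try:
    process = subprocess.Popen(["sleep", "0"])
finally:
    signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGINT})

process.wait()
```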