fix: add retry logic to PostgreSQL check in extract-test-cache.sh
Summary
- Add retry loop to the PostgreSQL check before cache extraction
- Default 15s timeout (configurable via
POSTGRES_CHECK_TIMEOUT) - Prevents race condition when HAF restarts PostgreSQL during initialization
Problem
HAF service containers restart PostgreSQL during initialization (after applying CI config or installing extensions). The single-shot pg_isready check could fail if it ran during this ~3 second restart window, causing the script to:
- Think PostgreSQL wasn't running
- Delete the cache data directory
- Re-extract cache (which is now corrupted because PostgreSQL was using it)
This caused intermittent failures in hivemind's e2e_benchmark_on_postgrest job.
Solution
Replace the single pg_isready check with a retry loop that waits up to 15 seconds before deciding PostgreSQL isn't running. This gives HAF time to complete its restart cycle.
Test plan
- Verify script syntax with shellcheck
- Test with hivemind pipeline that previously failed