Skip to content

Conversation

@hiroTamada
Copy link
Contributor

@hiroTamada hiroTamada commented Jan 21, 2026

Summary

  • Fixes race condition where build reports "ready" before image conversion completes
  • Adds waitForImageReady() to poll image status before marking build complete
  • If image conversion fails or times out, build is marked as failed

Problem

The registry's triggerConversion() runs asynchronously after returning 201 to the builder:

if wrapper.statusCode == http.StatusCreated {
    go r.triggerConversion(...)  // ASYNC!
}

This caused a race where:

  1. Builder pushes image → Registry returns 201
  2. Builder reports success → Build status = "ready"
  3. User creates instance → FAILS (image still converting)

Solution

Build manager now waits for image to be ready before reporting build complete:

// Wait for image to be ready before reporting build as complete
if err := m.waitForImageReady(ctx, id); err != nil {
    // Mark build as failed
}
m.updateBuildComplete(id, StatusReady, ...)

Test plan

  • TestWaitForImageReady_Success - image already ready
  • TestWaitForImageReady_WaitsForConversion - polls until ready
  • TestWaitForImageReady_ContextCancelled - respects context timeout
  • TestWaitForImageReady_Failed - handles failed conversion
  • All existing builds tests pass

Fixes KERNEL-863


Note

Addresses a race where builds reported ready before image conversion finished.

  • Integrates images.Manager into builds manager and wiring (ProvideBuildManager, wire_gen.go)
  • Adds waitForImageReady() polling and invokes it in runBuild before setting status to ready; uses build context for accurate metrics/timeout
  • Updates metrics recording to use buildCtx
  • Adds tests: race demonstration and waitForImageReady success, conversion wait, context-cancel timeout, and failed conversion

Written by Cursor Bugbot for commit 450d496. This will update automatically on new commits. Configure here.

Fixes a race condition where build status would transition to "ready"
before the image conversion completed, causing instance creation to fail.

The registry's triggerConversion() runs asynchronously after returning
201 to the builder. This meant the builder could report success and
the build manager would set status="ready" while image conversion was
still in progress.

Changes:
- Add imageManager dependency to build manager
- Add waitForImageReady() that polls until image status is "ready"
- Call waitForImageReady() before setting build to "ready" status
- If image conversion fails/times out, mark build as failed
cursor[bot]

This comment was marked as outdated.

Addresses PR review feedback:
1. Use buildCtx instead of ctx for waitForImageReady to respect
   build timeout during image conversion wait
2. Recalculate duration after waitForImageReady completes to
   accurately report total build time including image conversion
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

imageMgr.images[imageRef].Status = images.StatusConverting
time.Sleep(100 * time.Millisecond)
imageMgr.images[imageRef].Status = images.StatusReady
}()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test has data race when modifying image status

Low Severity

TestWaitForImageReady_WaitsForConversion has a data race: the goroutine at lines 105-110 directly writes to imageMgr.images[imageRef].Status while waitForImageReady concurrently reads the same Status field through GetImage. The mock's GetImage returns a pointer to the same Image struct without copying, so both goroutines access the same memory location without synchronization. This will cause failures when running tests with -race.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine for now. We are not running test with race.

@hiroTamada hiroTamada merged commit 81c03b9 into main Jan 21, 2026
4 checks passed
@hiroTamada hiroTamada deleted the fix/kernel-863-build-image-race branch January 21, 2026 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants