Skip to content

cmd/go/internal/test: test concurrency limited by sequential printing #61233

Open
@howardjohn

Description

@howardjohn

What version of Go are you using (go version)?

$ go version
go version go1.20 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

Ran go test -debug-trace=trace -debug-actiongraph=actiongraph ./...

What did you expect to see?

Tests are run with maximum concurrency

What did you see instead?

test run tasks are started but not doing any work, blocking other work from progressing.


Example simple reproducer:

for i in a b c d; do
	mkdir -p $i
	cat <<EOF > "$i/${i}_test.go"
package $i

import (
	"testing"
	"time"
)

func TestSleep(t *testing.T) {
	if "$i" == "b" {
		time.Sleep(time.Second)
	}
}
EOF
done
for i in e f g; do
	mkdir -p $i
	cat <<EOF > "$i/${i}.go"
package $i

EOF
done
GOMAXPROCS=2 go test -debug-trace=trace ./...

Basically this just makes a few packages, one with a slow test, and a few empty packages.

When we run this, I would expect to have one worker blocked on the b test (sleep 1), while the rest execute in the other worker and quickly complete.

Instead, we see this:
2023-07-07_11-06-03

This looks odd - the e package doesn't have any tests, so why is test run e taking 1.3s?

The reason seems to be related to https://go-review.googlesource.com/c/go/+/448357 (although I am not certain it worked different before that CL). One the runTestActor starts, it is immediately blocked on <-r.prev. This, in turn, means it is blocked until the previous test (d) finishes.

Ultimately, this blocks an entire worker doing nothing.

A real world example of this can be found here (warning: 170mb). You can see it occurring in a few places, most prominently around 300s we have almost all 16 workers blocked, most/all on "no test files" packages.

I think based on the way actions are pushed into the ready queue, any action that has zero remaining dependencies may be pushed onto the queue. In some of my testing, I saw the occasionally some of the very last packages in the test ordering would get scheduled first - blocking an entire worker for almost the entire process.


I believe this new logic is only need for -json mode. One option is to just enable it only if -json is enabled.

I tested this locally against the same repo as the "real world example" above.
With Go 1.20, the tests took 70s to execute. With the patch, they took 30s.

With the same patch, the simple reproducer looks as expected:
2023-07-07_11-15-34

Before trace After trace (warning: 150mb)


Another approach is to add each test run action as a dependency for the previous, to serialize the ordering. However, this makes things far worse, as this is serializing things much stricter -- the current serialization just ensures the starts are serialized, while this approach would make sure the entire test execution is as well - not what we want.

In my testing this made things worse.

A similar approach may be to make another action type test start, but I didn't test this out

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoCommandcmd/goNeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.ToolSpeed

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions