[usm] fix the script generating the gotls/lookup/luts.go file #37629
Conversation
Regression Detector Results
Metrics dashboard
Baseline: 405dcb1
Optimization Goals: ✅ No significant changes detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | quality_gate_logs | % cpu utilization | +2.02 | [-0.80, +4.84] | 1 | Logs bounds checks dashboard |
➖ | quality_gate_idle_all_features | memory utilization | +1.18 | [+1.04, +1.32] | 1 | Logs bounds checks dashboard |
➖ | otlp_ingest_logs | memory utilization | +0.53 | [+0.41, +0.66] | 1 | Logs |
➖ | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | +0.24 | [+0.19, +0.29] | 1 | Logs |
➖ | file_to_blackhole_300ms_latency | egress throughput | +0.08 | [-0.59, +0.75] | 1 | Logs |
➖ | file_to_blackhole_100ms_latency | egress throughput | +0.04 | [-0.59, +0.66] | 1 | Logs |
➖ | file_to_blackhole_500ms_latency | egress throughput | +0.00 | [-0.52, +0.53] | 1 | Logs |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | +0.00 | [-0.02, +0.02] | 1 | Logs |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.01 | [-0.28, +0.27] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency | egress throughput | -0.07 | [-0.70, +0.56] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency | egress throughput | -0.08 | [-0.71, +0.55] | 1 | Logs |
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | -0.09 | [-0.97, +0.79] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http1 | egress throughput | -0.09 | [-0.70, +0.51] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency_linear_load | egress throughput | -0.11 | [-0.35, +0.13] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http2 | egress throughput | -0.14 | [-0.74, +0.46] | 1 | Logs |
➖ | ddot_metrics | memory utilization | -0.32 | [-0.44, -0.21] | 1 | Logs |
➖ | tcp_syslog_to_blackhole | ingress throughput | -0.34 | [-0.40, -0.28] | 1 | Logs |
➖ | ddot_logs | memory utilization | -0.45 | [-0.59, -0.31] | 1 | Logs |
➖ | otlp_ingest_metrics | memory utilization | -0.52 | [-0.67, -0.36] | 1 | Logs |
➖ | docker_containers_cpu | % cpu utilization | -0.58 | [-3.65, +2.49] | 1 | Logs |
➖ | quality_gate_idle | memory utilization | -0.70 | [-0.76, -0.63] | 1 | Logs bounds checks dashboard |
➖ | docker_containers_memory | memory utilization | -0.89 | [-0.96, -0.83] | 1 | Logs |
➖ | file_tree | memory utilization | -2.40 | [-2.60, -2.20] | 1 | Logs |
Bounds Checks: ❌ Failed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
❌ | docker_containers_memory | memory_usage | 0/10 | |
❌ | quality_gate_logs | memory_usage | 9/10 | bounds checks dashboard |
✅ | docker_containers_cpu | simple_check_run | 10/10 | |
✅ | docker_containers_memory | simple_check_run | 10/10 | |
✅ | file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency_linear_load | memory_usage | 10/10 | |
✅ | file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_300ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_300ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_500ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
✅ | quality_gate_idle | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle_all_features | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_logs | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_logs | lost_bytes | 10/10 | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide that a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between the baseline and comparison variants.
- Its configuration does not mark it "erratic".
CI Pass/Fail Decision
❌ Failed. Some Quality Gates were violated.
- quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check memory_usage: 9/10 replicas passed. Failed 1 which is > 0. Gate FAILED.
- quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
Static quality checks: ✅ Please find below the results from static quality gates. Successful checks.
We did run it. Go 1.24 was released on February 11th, while …
pkg/network/go/lutgen/run.go
Outdated
```go
// Pin the golang.org/x/net module to an old version. Newer versions of
// the package have a go.mod file that can't be parsed by Go <= 1.16.
getCmd := exec.CommandContext(ctx, "go", "get", "golang.org/x/net@v0.35.0")
getCmd.Env = cmd.Env
getCmd.Dir = cmd.Dir
getCmd.Path = cmd.Path
output, err := getCmd.CombinedOutput()
if err != nil {
	return fmt.Errorf("error executing 'go get': %s\n%s", err, output)
}
```
That's a problematic change. You assume that v0.35 will represent all future versions, while we can still have changes. If the issue happens only with Go 1.16 and below, then the fix should take that into account and pin the version only when we run the script for Go 1.16 and below.
I've considered doing that, but this all leads to a deeper question -- which version of the library do we care about? Why would we particularly care about the latest (and only the latest)? Then, caring about different library versions for different compiler versions seems confusing -- although you can argue that it makes some sense, since new versions of the lib don't work with old compilers... I'm in the live debugger team, and so the real answer is perhaps that we should actually look at the particular binary and dynamically analyze its debug info :).
But also you can argue that dealing with the debug info for the x/net
library at all is kinda silly -- the only thing we look at is the offset of an embedded field into a struct -- which seems likely to stay 0 for as long as that field exists. That's why I thought that the complexity of dealing with multiple versions is not really worth it. I guess using the latest version of the library tells us that the embedded field continues to exist, so there's some value in it...
Having written all this, if you think the complexity of changing the library version based on the compiler version is worth it, I'm happy to do it.
I've made the script use different versions of the library based on the Go version, as you suggested.
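The agreed-on approach can be sketched roughly as follows. This is a hypothetical helper, not the actual script code: pin golang.org/x/net to v0.35.0 only for toolchains that cannot parse the newer go.mod format (Go 1.16 and below), and track a newer version otherwise. The version strings and the "1.N" format assumption are illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// xNetVersionFor picks the x/net module version to `go get` for a given
// Go toolchain version, expected in "1.N" form.
func xNetVersionFor(goVersion string) string {
	parts := strings.SplitN(goVersion, ".", 2)
	if len(parts) == 2 && parts[0] == "1" {
		if minor, err := strconv.Atoi(parts[1]); err == nil && minor <= 16 {
			// Old toolchains can't parse the go.mod of newer x/net releases.
			return "golang.org/x/net@v0.35.0"
		}
	}
	return "golang.org/x/net@latest"
}

func main() {
	fmt.Println(xNetVersionFor("1.16")) // old toolchain -> pinned version
	fmt.Println(xNetVersionFor("1.22")) // newer toolchain -> latest
}
```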
A comment referenced the wrong type name.
For development you might want to run a single Go version. The script already lets you specify a minimum version.
Before this patch, the generation of the gotls/lookup/luts.go file was failing: Go versions below 1.16 cannot parse the go.mod file of the x/net module at versions above 0.35 (released a few months ago). This patch pins the module at v0.35. We were arbitrarily using the latest x/net version (and thus using the debug info for the latest); now we're arbitrarily using a fixed version. Luckily, the debug info we need for x/net is stable -- we need the offset of an embedded field in a struct.

In addition to failing on Go 1.16, the latest version of x/net was forcing the toolchain selection mechanism in Go 1.21+ to select 1.23 -- so we were not actually using Go 1.21 and 1.22. The next commit makes the script robust to this.

The regenerated luts.go doesn't have any changes. FWIW, I think this shows that the debug info in that file is valid for Go 1.24; I'm not sure we ever ran the script for 1.24, since I think the release of x/net that broke it happened a little before 1.24.
The generator script for luts.go intends to use every Go compiler version. Until this patch, it was fragile because it allowed Go's toolchain selection mechanism to transparently use a newer toolchain version than the one the script was intended to use, subject to the requirements of the modules used by the test program. This could lead to the generated source containing debug info for the wrong compiler versions. In fact, this was happening until the previous commit. This patch makes the generation robust by forcing the intended toolchain versions.
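The mechanism behind this kind of fix can be sketched as follows; this is an assumption about how one would force the toolchain, not the literal patch. On Go 1.21+, the GOTOOLCHAIN environment variable controls toolchain selection: "local" tells the go command to use the installed toolchain or fail, instead of transparently downloading a newer one to satisfy module requirements (a specific value like "go1.21.0" pins an exact version instead).

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// pinnedGoCmd builds a go command whose environment forbids transparent
// toolchain switching: with GOTOOLCHAIN=local, a module requiring a
// newer Go version makes the command fail loudly rather than silently
// building with a toolchain other than the intended one.
func pinnedGoCmd(args ...string) *exec.Cmd {
	cmd := exec.Command("go", args...)
	cmd.Env = append(os.Environ(), "GOTOOLCHAIN=local")
	return cmd
}

func main() {
	cmd := pinnedGoCmd("version")
	fmt.Println(cmd.Args) // [go version]
}
```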
A side note -- the script does not work for Go 1.25rc, which switched to generating DWARF v5, and our library that processes location lists doesn't like that. We need to fix this for Live Debugger too. I've added a commit to the PR that lets you set a maximum Go version so we can still run the script in the meantime.