Static quality gates entrypoint to allow on-CI debugging #37395
Gitlab CI Configuration Changes
Modified Jobs
variables (configuration)
variables:
AGENT_API_KEY_ORG2: agent-api-key-org-2
AGENT_APP_KEY_ORG2: agent-app-key-org-2
AGENT_BINARIES_DIR: bin/agent
AGENT_GITHUB_APP: agent-github-app
AGENT_QA_E2E: agent-qa-e2e
API_KEY_ORG2: ci.datadog-agent.datadog_api_key_org2
ARTIFACT_DOWNLOAD_ATTEMPTS: 2
ATLASSIAN_WRITE: atlassian-write
BTFHUB_ARCHIVE_BRANCH: main
BUCKET_BRANCH: dev
CHANGELOG_COMMIT_SHA: ci.datadog-agent.gitlab_changelog_commit_sha
CHOCOLATEY_API_KEY: ci.datadog-agent.chocolatey_api_key
- CI_IMAGE_BTF_GEN: v66293343-2eef00c4
+ CI_IMAGE_BTF_GEN: v67501108-a78f81d6
CI_IMAGE_BTF_GEN_SUFFIX: ''
- CI_IMAGE_DEB_ARM64: v66293343-2eef00c4
+ CI_IMAGE_DEB_ARM64: v67501108-a78f81d6
CI_IMAGE_DEB_ARM64_SUFFIX: ''
- CI_IMAGE_DEB_ARMHF: v66293343-2eef00c4
+ CI_IMAGE_DEB_ARMHF: v67501108-a78f81d6
CI_IMAGE_DEB_ARMHF_SUFFIX: ''
- CI_IMAGE_DEB_X64: v66293343-2eef00c4
+ CI_IMAGE_DEB_X64: v67501108-a78f81d6
CI_IMAGE_DEB_X64_SUFFIX: ''
- CI_IMAGE_DOCKER_ARM64: v66293343-2eef00c4
+ CI_IMAGE_DOCKER_ARM64: v67501108-a78f81d6
CI_IMAGE_DOCKER_ARM64_SUFFIX: ''
- CI_IMAGE_DOCKER_X64: v66293343-2eef00c4
+ CI_IMAGE_DOCKER_X64: v67501108-a78f81d6
CI_IMAGE_DOCKER_X64_SUFFIX: ''
- CI_IMAGE_GITLAB_AGENT_DEPLOY: v66293343-2eef00c4
+ CI_IMAGE_GITLAB_AGENT_DEPLOY: v67501108-a78f81d6
CI_IMAGE_GITLAB_AGENT_DEPLOY_SUFFIX: ''
- CI_IMAGE_LINUX_GLIBC_2_17_X64: v66293343-2eef00c4
+ CI_IMAGE_LINUX_GLIBC_2_17_X64: v67501108-a78f81d6
CI_IMAGE_LINUX_GLIBC_2_17_X64_SUFFIX: ''
- CI_IMAGE_LINUX_GLIBC_2_23_ARM64: v66293343-2eef00c4
+ CI_IMAGE_LINUX_GLIBC_2_23_ARM64: v67501108-a78f81d6
CI_IMAGE_LINUX_GLIBC_2_23_ARM64_SUFFIX: ''
- CI_IMAGE_RPM_ARM64: v66293343-2eef00c4
+ CI_IMAGE_RPM_ARM64: v67501108-a78f81d6
CI_IMAGE_RPM_ARM64_SUFFIX: ''
- CI_IMAGE_RPM_ARMHF: v66293343-2eef00c4
+ CI_IMAGE_RPM_ARMHF: v67501108-a78f81d6
CI_IMAGE_RPM_ARMHF_SUFFIX: ''
- CI_IMAGE_RPM_X64: v66293343-2eef00c4
+ CI_IMAGE_RPM_X64: v67501108-a78f81d6
CI_IMAGE_RPM_X64_SUFFIX: ''
- CI_IMAGE_WIN_LTSC2022_X64: v66293343-2eef00c4
+ CI_IMAGE_WIN_LTSC2022_X64: v67501108-a78f81d6
CI_IMAGE_WIN_LTSC2022_X64_SUFFIX: ''
CLANG_BUILD_VERSION: v60409452-ee70de70
CLANG_LLVM_VER: 12.0.1
CLUSTER_AGENT_BINARIES_DIR: bin/datadog-cluster-agent
CLUSTER_AGENT_CLOUDFOUNDRY_BINARIES_DIR: bin/datadog-cluster-agent-cloudfoundry
CODECOV: codecov
CODECOV_TOKEN: ci.datadog-agent.codecov_token
COMPARE_TO_BRANCH: main
CWS_INSTRUMENTATION_BINARIES_DIR: bin/cws-instrumentation
DATADOG_AGENT_EMBEDDED_PATH: /opt/datadog-agent/embedded
DD_AGENT_TESTING_DIR: $CI_PROJECT_DIR/test/new-e2e/tests
DD_PKG_VERSION: latest
DEB_GPG_KEY: ci.datadog-agent.deb_signing_private_key_${DEB_GPG_KEY_ID}
DEB_GPG_KEY_ID: c0962c7d
DEB_GPG_KEY_NAME: Datadog, Inc. APT key
DEB_RPM_TESTING_BUCKET_BRANCH: testing
DEB_S3_BUCKET: apt.datad0g.com
DEB_SIGNING_PASSPHRASE: ci.datadog-agent.deb_signing_key_passphrase_${DEB_GPG_KEY_ID}
DEB_TESTING_S3_BUCKET: apttesting.datad0g.com
DOCKER_REGISTRY_LOGIN: ci.datadog-agent.docker_hub_login
DOCKER_REGISTRY_PWD: ci.datadog-agent.docker_hub_pwd
DOCKER_REGISTRY_RO: dockerhub-readonly
DOCKER_REGISTRY_URL: docker.io
DOGSTATSD_BINARIES_DIR: bin/dogstatsd
E2E_AZURE: e2e-azure
E2E_GCP: e2e-gcp
EXECUTOR_JOB_SECTION_ATTEMPTS: 2
FF_KUBERNETES_HONOR_ENTRYPOINT: true
FF_SCRIPT_SECTIONS: 1
FF_TIMESTAMPS: true
GENERAL_ARTIFACTS_CACHE_BUCKET_URL: https://dd-agent-omnibus.s3.amazonaws.com
GET_SOURCES_ATTEMPTS: 2
GITLAB_TOKEN: gitlab-token
GO_TEST_SKIP_FLAKE: 'true'
INSTALLER_TESTING_S3_BUCKET: installtesting.datad0g.com
INSTALL_SCRIPT_API_KEY_ORG2: install-script-api-key-org-2
INTEGRATION_WHEELS_CACHE_BUCKET: dd-agent-omnibus
KERNEL_MATRIX_TESTING_ARM_AMI_ID: ami-0b5f838a19d37fc61
KERNEL_MATRIX_TESTING_X86_AMI_ID: ami-05b3973acf5422348
KITCHEN_INFRASTRUCTURE_FLAKES_RETRY: 2
MACOS_APPLE_APPLICATION_SIGNING: apple-application-signing
MACOS_APPLE_DEVELOPER_ACCOUNT: apple-developer-account
MACOS_APPLE_INSTALLER_SIGNING: apple-installer-signing
MACOS_GITHUB_APP_1: macos-github-app-one
MACOS_GITHUB_APP_2: macos-github-app-two
MACOS_KEYCHAIN_PWD: ci-keychain
MACOS_S3_BUCKET: dd-agent-macostesting
OMNIBUS_BASE_DIR: /omnibus
OMNIBUS_GIT_CACHE_DIR: /tmp/omnibus-git-cache
OMNIBUS_PACKAGE_DIR: $CI_PROJECT_DIR/omnibus/pkg/
OMNIBUS_PACKAGE_DIR_SUSE: $CI_PROJECT_DIR/omnibus/suse/pkg
PIPELINE_KEY_ALIAS: alias/ci_datadog-agent_pipeline-key
PROCESS_S3_BUCKET: datad0g-process-agent
RESTORE_CACHE_ATTEMPTS: 2
RPM_GPG_KEY: ci.datadog-agent.rpm_signing_private_key_${RPM_GPG_KEY_ID}
RPM_GPG_KEY_ID: b01082d3
RPM_GPG_KEY_NAME: Datadog, Inc. RPM key
RPM_S3_BUCKET: yum.datad0g.com
RPM_SIGNING_PASSPHRASE: ci.datadog-agent.rpm_signing_key_passphrase_${RPM_GPG_KEY_ID}
RPM_TESTING_S3_BUCKET: yumtesting.datad0g.com
RUN_E2E_TESTS: auto
RUN_KMT_TESTS: auto
RUN_UNIT_TESTS: auto
S3_ARTIFACTS_URI: s3://dd-ci-artefacts-build-stable/$CI_PROJECT_NAME/$CI_PIPELINE_ID
S3_CP_CMD: aws s3 cp $S3_CP_OPTIONS
S3_CP_OPTIONS: --no-progress --region us-east-1 --sse AES256
S3_DD_AGENT_OMNIBUS_BTFS_URI: s3://dd-agent-omnibus/btfs
S3_DD_AGENT_OMNIBUS_JAVA_URI: s3://dd-agent-omnibus/openjdk
S3_DD_AGENT_OMNIBUS_LLVM_URI: s3://dd-agent-omnibus/llvm
S3_DSD6_URI: s3://dsd6-staging
S3_OMNIBUS_CACHE_BUCKET: dd-ci-datadog-agent-omnibus-cache-build-stable
S3_OMNIBUS_GIT_CACHE_BUCKET: dd-ci-datadog-agent-omnibus-git-cache-build-stable
S3_PERMANENT_ARTIFACTS_URI: s3://dd-ci-persistent-artefacts-build-stable/$CI_PROJECT_NAME
S3_PROJECT_ARTIFACTS_URI: s3://dd-ci-artefacts-build-stable/$CI_PROJECT_NAME
S3_RELEASE_ARTIFACTS_URI: s3://dd-release-artifacts/$CI_PROJECT_NAME/$CI_PIPELINE_ID
S3_RELEASE_INSTALLER_ARTIFACTS_URI: s3://dd-release-artifacts/datadog-installer/$CI_PIPELINE_ID
S3_SBOM_STORAGE_URI: s3://sbom-root-us1-ddbuild-io/$CI_PROJECT_NAME/$CI_PIPELINE_ID
SLACK_AGENT: slack-agent-ci
SMP_ACCOUNT: smp
STATIC_BINARIES_DIR: bin/static
SYSTEM_PROBE_BINARIES_DIR: bin/system-probe
VCPKG_BLOB_SAS_URL: ci.datadog-agent-buildimages.vcpkg_blob_sas_url
WINDOWS_BUILDS_S3_BUCKET: $WIN_S3_BUCKET/builds
WINDOWS_POWERSHELL_DIR: $CI_PROJECT_DIR/signed_scripts
WINDOWS_TESTING_S3_BUCKET: pipelines/A7/$CI_PIPELINE_ID
WINGET_PAT: ci.datadog-agent.winget_pat
WIN_S3_BUCKET: dd-agent-mstesting
Added Jobs
debug_static_quality_gates
debug_static_quality_gates:
  image: registry.ddbuild.io/ci/datadog-agent-buildimages/docker_x64$CI_IMAGE_DOCKER_X64_SUFFIX:$CI_IMAGE_DOCKER_X64
  needs:
  - agent_deb-x64-a7
  - agent_deb-x64-a7-fips
  - agent_deb-arm64-a7
  - agent_deb-arm64-a7-fips
  - agent_rpm-x64-a7
  - agent_rpm-x64-a7-fips
  - agent_rpm-arm64-a7
  - agent_rpm-arm64-a7-fips
  - agent_suse-x64-a7
  - agent_suse-x64-a7-fips
  - agent_suse-arm64-a7
  - agent_suse-arm64-a7-fips
  - agent_heroku_deb-x64-a7
  - docker_build_agent7
  - docker_build_agent7_arm64
  - docker_build_agent7_jmx
  - docker_build_agent7_jmx_arm64
  - docker_build_cluster_agent_amd64
  - docker_build_cluster_agent_arm64
  - docker_build_dogstatsd_amd64
  - docker_build_dogstatsd_arm64
  - docker_build_agent7_windows1809
  - docker_build_agent7_windows1809_core
  - docker_build_agent7_windows1809_core_jmx
  - docker_build_agent7_windows1809_jmx
  - docker_build_agent7_windows2022
  - docker_build_agent7_windows2022_core
  - docker_build_agent7_windows2022_core_jmx
  - docker_build_agent7_windows2022_jmx
  - dogstatsd_deb-x64
  - dogstatsd_deb-arm64
  - dogstatsd_rpm-x64
  - dogstatsd_suse-x64
  - iot_agent_deb-x64
  - iot_agent_deb-arm64
  - iot_agent_deb-armhf
  - iot_agent_rpm-x64
  - iot_agent_rpm-arm64
  - iot_agent_rpm-armhf
  - iot_agent_suse-x64
  - windows_msi_and_bosh_zip_x64-a7
  rules:
  - allow_failure: true
    if: $CI_COMMIT_BRANCH == "main"
    when: manual
  - if: $CI_COMMIT_BRANCH == "main"
    when: never
  - if: $CI_COMMIT_BRANCH =~ /^[0-9]+\.[0-9]+\.x$/
    when: never
  - if: $CI_COMMIT_BRANCH =~ /^mq-working-branch-/
    when: never
  - if: $CI_COMMIT_TAG != null
    when: never
  - allow_failure: true
    when: manual
  script:
  - DOCKER_LOGIN=$($CI_PROJECT_DIR/tools/ci/fetch_secret.sh $DOCKER_REGISTRY_RO user)
    || exit $?
  - $CI_PROJECT_DIR/tools/ci/fetch_secret.sh $DOCKER_REGISTRY_RO token | crane auth
    login --username "$DOCKER_LOGIN" --password-stdin "$DOCKER_REGISTRY_URL"
  - EXIT="${PIPESTATUS[0]}"; if [ $EXIT -ne 0 ]; then echo "Unable to locate credentials
    needs gitlab runner restart"; exit $EXIT; fi
  - DATADOG_API_KEY="$("$CI_PROJECT_DIR"/tools/ci/fetch_secret.sh "$AGENT_API_KEY_ORG2"
    token)" || exit $?; export DATADOG_API_KEY
  - export DD_API_KEY="$DATADOG_API_KEY"
  - GITHUB_KEY_B64=$($CI_PROJECT_DIR/tools/ci/fetch_secret.sh $AGENT_GITHUB_APP key_b64)
    || exit $?; export GITHUB_KEY_B64
  - GITHUB_APP_ID=$($CI_PROJECT_DIR/tools/ci/fetch_secret.sh $AGENT_GITHUB_APP app_id)
    || exit $?; export GITHUB_APP_ID
  - GITHUB_INSTALLATION_ID=$($CI_PROJECT_DIR/tools/ci/fetch_secret.sh $AGENT_GITHUB_APP
    installation_id) || exit $?; export GITHUB_INSTALLATION_ID
  - echo "Using agent GitHub App"
  - GITLAB_TOKEN=$($CI_PROJECT_DIR/tools/ci/fetch_secret.sh $GITLAB_TOKEN read_api)
    || exit $?; export GITLAB_TOKEN
  - dda inv -- quality-gates.debug-specific-quality-gate "$GATE_NAME" || exit $?
  stage: functional_test
  tags:
  - arch:amd64
Changes Summary
ℹ️ Diff available in the job log.
Uncompressed package size comparison
Comparison with ancestor
Size reduction summary
Diff per package
Decision
✅ Passed
Regression Detector
Regression Detector Results
Metrics dashboard
Baseline: 650ae4e
Optimization Goals: ✅ Improvement(s) detected
perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
---|---|---|---|---|---|---|
➖ | docker_containers_cpu | % cpu utilization | +2.70 | [-0.33, +5.73] | 1 | Logs |
➖ | quality_gate_logs | % cpu utilization | +0.43 | [-2.38, +3.24] | 1 | Logs bounds checks dashboard |
➖ | otlp_ingest_metrics | memory utilization | +0.05 | [-0.11, +0.21] | 1 | Logs |
➖ | ddot_metrics | memory utilization | +0.05 | [-0.06, +0.17] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency_linear_load | egress throughput | +0.03 | [-0.20, +0.26] | 1 | Logs |
➖ | tcp_syslog_to_blackhole | ingress throughput | +0.03 | [-0.02, +0.08] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency | egress throughput | +0.02 | [-0.55, +0.58] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http1 | egress throughput | +0.01 | [-0.61, +0.64] | 1 | Logs |
➖ | file_to_blackhole_500ms_latency | egress throughput | +0.01 | [-0.62, +0.63] | 1 | Logs |
➖ | uds_dogstatsd_to_api | ingress throughput | +0.00 | [-0.26, +0.26] | 1 | Logs |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | -0.00 | [-0.02, +0.01] | 1 | Logs |
➖ | file_to_blackhole_0ms_latency_http2 | egress throughput | -0.01 | [-0.63, +0.60] | 1 | Logs |
➖ | file_to_blackhole_1000ms_latency | egress throughput | -0.02 | [-0.58, +0.55] | 1 | Logs |
➖ | file_to_blackhole_300ms_latency | egress throughput | -0.02 | [-0.63, +0.59] | 1 | Logs |
➖ | file_to_blackhole_100ms_latency | egress throughput | -0.12 | [-0.67, +0.43] | 1 | Logs |
➖ | ddot_logs | memory utilization | -0.23 | [-0.34, -0.11] | 1 | Logs |
➖ | docker_containers_memory | memory utilization | -0.44 | [-0.52, -0.36] | 1 | Logs |
➖ | otlp_ingest_logs | memory utilization | -0.49 | [-0.62, -0.37] | 1 | Logs |
➖ | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | -0.56 | [-0.61, -0.50] | 1 | Logs |
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | -0.61 | [-1.46, +0.23] | 1 | Logs |
➖ | quality_gate_idle | memory utilization | -1.39 | [-1.46, -1.33] | 1 | Logs bounds checks dashboard |
➖ | quality_gate_idle_all_features | memory utilization | -3.50 | [-3.62, -3.39] | 1 | Logs bounds checks dashboard |
✅ | file_tree | memory utilization | -8.26 | [-8.45, -8.08] | 1 | Logs |
Bounds Checks: ❌ Failed
perf | experiment | bounds_check_name | replicates_passed | links |
---|---|---|---|---|
❌ | quality_gate_logs | memory_usage | 6/10 | bounds checks dashboard |
✅ | docker_containers_cpu | simple_check_run | 10/10 | |
✅ | docker_containers_memory | memory_usage | 10/10 | |
✅ | docker_containers_memory | simple_check_run | 10/10 | |
✅ | file_to_blackhole_0ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http1 | memory_usage | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | lost_bytes | 10/10 | |
✅ | file_to_blackhole_0ms_latency_http2 | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_1000ms_latency_linear_load | memory_usage | 10/10 | |
✅ | file_to_blackhole_100ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_300ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_300ms_latency | memory_usage | 10/10 | |
✅ | file_to_blackhole_500ms_latency | lost_bytes | 10/10 | |
✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | |
✅ | quality_gate_idle | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle_all_features | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_idle_all_features | memory_usage | 10/10 | bounds checks dashboard |
✅ | quality_gate_logs | intake_connections | 10/10 | bounds checks dashboard |
✅ | quality_gate_logs | lost_bytes | 10/10 | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
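The three criteria above can be sketched as a small predicate. This is only an illustration, not the Regression Detector's code: the `Experiment` class, its field names, and the `erratic` flag are hypothetical, and the sample values are copied from the Optimization Goals table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    # Hypothetical container; field names are assumptions for illustration.
    name: str
    delta_mean_pct: float  # estimated Δ mean %
    ci_low: float          # lower bound of the 90% confidence interval
    ci_high: float         # upper bound of the 90% confidence interval
    erratic: bool = False  # stands in for the "erratic" configuration marker


def is_significant_change(exp, tolerance=5.0):
    """Return True only when all three listed criteria hold."""
    big_enough = abs(exp.delta_mean_pct) >= tolerance
    ci_excludes_zero = exp.ci_low > 0 or exp.ci_high < 0
    return big_enough and ci_excludes_zero and not exp.erratic


# Rows taken from the table in this report.
file_tree = Experiment("file_tree", -8.26, -8.45, -8.08)
idle_all = Experiment("quality_gate_idle_all_features", -3.50, -3.62, -3.39)
containers_cpu = Experiment("docker_containers_cpu", 2.70, -0.33, 5.73)

for exp in (file_tree, idle_all, containers_cpu):
    print(exp.name, is_significant_change(exp))
```

Only `file_tree` satisfies all three criteria, which matches it being the single row not marked ➖ in the table. The direction of the change (better or worse) depends on the optimization goal and is not modeled here.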
CI Pass/Fail Decision
❌ Failed. Some Quality Gates were violated.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check memory_usage: 6/10 replicas passed. Failed 4 which is > 0. Gate FAILED.
- quality_gate_logs, bounds check lost_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
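The per-gate decision listed above (a gate fails as soon as the number of failed replicas is greater than 0) can be sketched as follows. The function and variable names are hypothetical; the replica counts are taken from the list above.

```python
def gate_passed(replicas_passed, replicas_total):
    """A gate passes only when no replica failed its bounds check."""
    failed = replicas_total - replicas_passed
    return failed == 0


# (experiment, bounds check) -> (replicas passed, replicas total)
results = {
    ("quality_gate_idle_all_features", "memory_usage"): (10, 10),
    ("quality_gate_idle_all_features", "intake_connections"): (10, 10),
    ("quality_gate_logs", "memory_usage"): (6, 10),
    ("quality_gate_logs", "lost_bytes"): (10, 10),
    ("quality_gate_logs", "intake_connections"): (10, 10),
    ("quality_gate_idle", "memory_usage"): (10, 10),
    ("quality_gate_idle", "intake_connections"): (10, 10),
}

violated = [
    key for key, (passed, total) in results.items()
    if not gate_passed(passed, total)
]
decision = "Failed" if violated else "Passed"
print(decision, violated)
```

With these counts the only violated gate is `quality_gate_logs` / `memory_usage`, so the overall decision is a failure.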
Static quality checks
✅ Please find below the results from static quality gates
Successful checks
Info
…m/DataDog/datadog-agent into pythyu/debug_static_quality_gates
.gitlab-ci.yml
Outdated
CI_IMAGE_WIN_LTSC2022_X64: v66293343-2eef00c4
CI_IMAGE_WIN_LTSC2022_X64_SUFFIX: ""
CI_IMAGE_BTF_GEN: v66796378-17ead302
CI_IMAGE_BTF_GEN_SUFFIX: "_test_only"
LGTM after the suffixes are removed!
This is failing the pipeline, so you can approve if this is the only issue with the PR 😄
What does this PR do?
This PR creates a debug_static_quality_gates manual job that compares the files within the packages and images of a specific static quality gate.
Motivation
Give more insight into where a PR causes an increase in a static quality gate.
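To illustrate the kind of per-file comparison this job is meant to surface, here is a minimal sketch. It is not the implementation behind `quality-gates.debug-specific-quality-gate`: the `diff_file_sizes` helper, the file paths, and the sizes are all hypothetical.

```python
def diff_file_sizes(baseline, candidate):
    """Compare two {path: size_in_bytes} listings.

    Returns (path, delta) pairs for every file whose size changed,
    including added and removed files, largest growth first.
    """
    paths = set(baseline) | set(candidate)
    deltas = {p: candidate.get(p, 0) - baseline.get(p, 0) for p in paths}
    changed = [(p, d) for p, d in deltas.items() if d != 0]
    return sorted(changed, key=lambda item: (-item[1], item[0]))


# Hypothetical listings for the ancestor build and the PR build.
baseline = {"bin/agent": 100, "lib/libfoo.so": 40, "etc/conf.yaml": 10}
candidate = {"bin/agent": 130, "lib/libfoo.so": 40, "lib/libbar.so": 25}

report = diff_file_sizes(baseline, candidate)
for path, delta in report:
    print(f"{delta:+d} {path}")
```

Sorting by growth puts the files most likely to explain a gate increase at the top of the output.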
Describe how you validated your changes
Possible Drawbacks / Trade-offs
Additional Notes
Package size debug link