
Performance regression in Julia 1.12 (beta4) for heavy OLS workloads (beyond package latency) #58707

Open
@dpanigo

Description


I also came across this discussion of package-loading latency in 1.12, which mostly attributes regressions to precompilation differences across Julia versions (relative to loading or downloading packages, in most cases). In my benchmarks, however, all packages are already precompiled, so loading times should be negligible; and yet the performance regression actually grows slightly in absolute terms with heavy task executions.

When running a large number of OLS estimations via GlobalSearchRegression on synthetic data, Julia 1.12 (beta) is significantly slower than both Julia 1.11 (release) and 1.9.4.

Benchmark Code

To mimic a real-world scenario for my work (e.g. calling Julia code from the console), I wrote the following minimal PowerShell script, using juliaup to manage Julia versions:

& {
  $results = @()

  # Loop over two workload sizes (number of covariates)
  @(15,25) | ForEach-Object {
    $cov  = $_
    $seed = $cov + 1000

    # Loop over the juliaup channels to compare
    @('+1.8.5','+1.9.4','+lts','+release','+beta') | ForEach-Object {
      $v = $_

      # Build the Julia one-liner: synthetic data + exhaustive OLS search
      $cmd =  'using Random,DataFrames,GlobalSearchRegression; ' +
              'rng = MersenneTwister(' + $seed + '); '      +
              'data = DataFrame(rand(rng,100,' + $cov + '), :auto); ' +
              'data.y = rand(rng,100); '                   +
              'gsreg("y ~ x*", data)'

      # Time five full runs (each includes process startup, compilation, and compute)
      $times = 1..5 | ForEach-Object {
        ( Measure-Command { julia $v -p auto -e $cmd } ).TotalSeconds
      }
      $avg = ($times | Measure-Object -Average).Average

      $results += [PSCustomObject]@{
        JuliaVersion = $v
        Covariates   = $cov
        AverageTime  = [Math]::Round($avg, 4)
      }
    }
  }

  $results |
    Sort-Object Covariates, JuliaVersion |
    Format-Table JuliaVersion, Covariates, AverageTime -AutoSize
}
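Since each Measure-Command run above includes process startup and compilation, a per-phase breakdown inside a single Julia session could help separate TTFX from steady-state compute. Below is a minimal sketch of that idea; the data setup and gsreg call mirror the PowerShell command above, while the three-phase split is my own assumption about where the time might go:

```julia
# Phase 1: package load time (precompiled caches already present)
@time using Random, DataFrames, GlobalSearchRegression

rng  = MersenneTwister(1015)
data = DataFrame(rand(rng, 100, 15), :auto)
data.y = rand(rng, 100)

# Phase 2: first call -- includes remaining JIT compilation (TTFX)
t_first = @elapsed gsreg("y ~ x*", data)

# Phase 3: second call on the same data -- mostly steady-state compute
t_second = @elapsed gsreg("y ~ x*", data)

println("first call:  $t_first s (compute + compile)")
println("second call: $t_second s (mostly compute)")
```

If the second-call times also differ across versions, the regression is not just TTFX.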

Observed results

On a Windows 11 machine (16 threads, 32 GB RAM) with PowerShell 7.x, running the above script produced:

JuliaVersion  Covariates  AverageTime (s)
+1.8.5        15           35.16
+1.9.4        15           39.53
+lts          15           56.52
+release      15           47.59
+beta         15           51.09
+1.8.5        25          154.37
+1.9.4        25          129.27
+lts          25          140.17
+release      25          128.22
+beta         25          135.89

Julia 1.12 (beta) is consistently (5–30%) slower than Julia 1.11 (release) and 1.9.4 on both moderate (15) and heavy (25) covariate workloads.

Discussion

Given the package-loading latency discussion and the performance improvements documented in the Julia 1.12 NEWS, one would anticipate that any initial "time-to-first-x" (TTFX) penalty in 1.12 would be progressively amortized over prolonged, compute-intensive runs.
However, under heavier workloads and longer runs the performance gap appears to widen in absolute terms rather than narrow. Might this indicate that factors beyond simple TTFX are contributing to the observed regression?
I understand that Julia 1.13 is currently under development; I hope these findings can help with that work.
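One way to test whether the gap persists beyond TTFX would be repeated in-session measurements with BenchmarkTools, which exclude process startup and compilation entirely. A sketch, assuming the same synthetic setup as the heavy (25-covariate) workload above:

```julia
using BenchmarkTools, Random, DataFrames, GlobalSearchRegression

rng  = MersenneTwister(1025)
data = DataFrame(rand(rng, 100, 25), :auto)
data.y = rand(rng, 100)

# @benchmark runs the call repeatedly after warm-up, so compilation cost
# is excluded; comparing medians across Julia versions would isolate any
# steady-state (runtime) regression from TTFX effects.
b = @benchmark gsreg("y ~ x*", $data)
display(b)
```

If the medians match across 1.11 and 1.12 while the cold-start script times do not, the regression is a startup/compilation issue rather than a codegen one.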

Labels: performance (Must go faster), regression (Regression in behavior compared to a previous version)
