net/http: internal error: connCount underflow #61474

Open
@zzkcode

What version of Go are you using (go version)?

$ go version
go version go1.18.3 linux/amd64

Does this issue reproduce with the latest release?

I haven't tried it yet. I will once we have a reproducer.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1421070104=/tmp/go-build -gno-record-gcc-switches"

What did you do?

We are piloting a limited MaxConnsPerHost on several production (PRD) instances, each of which has been running for about 1 to 2 days. One of them crashed with: panic: net/http: internal error: connCount underflow.

What did you expect to see?

No crash.

What did you see instead?

panic: net/http: internal error: connCount underflow.

Below are the details from examining the core file. Since some locals and arguments appear to have been optimized out, I jumped between frames to print them.

bt:

(dlv) bt
 0  0x000000000046e601 in runtime.raise
    at /usr/local/go/src/runtime/sys_linux_amd64.s:168
 1  0x000000000044fc25 in runtime.dieFromSignal
    at /usr/local/go/src/runtime/signal_unix.go:852
 2  0x00000000004505f6 in runtime.sigfwdgo
    at /usr/local/go/src/runtime/signal_unix.go:1066
 3  0x000000000044e947 in runtime.sigtrampgo
    at /usr/local/go/src/runtime/signal_unix.go:430
 4  0x000000000046f44e in runtime.sigtrampgo
    at <autogenerated>:1
 5  0x000000000046e8fd in runtime.sigtramp
    at /usr/local/go/src/runtime/sys_linux_amd64.s:361
 6  0x00007fa1b18df8e0 in ???
    at ?:-1
 7  0x0000000000439c49 in runtime.crash
    at /usr/local/go/src/runtime/signal_unix.go:944
 8  0x0000000000439c49 in runtime.fatalpanic
    at /usr/local/go/src/runtime/panic.go:1092
 9  0x0000000000439417 in runtime.gopanic
    at /usr/local/go/src/runtime/panic.go:941
10  0x0000000000869d52 in net/http.(*Transport).decConnsPerHost
    at /usr/local/go/src/net/http/transport.go:1475
11  0x000000000086472f in net/http.(*Transport).roundTrip
    at /usr/local/go/src/net/http/transport.go:604
12  0x000000000084c339 in net/http.(*Transport).RoundTrip
    at /usr/local/go/src/net/http/roundtrip.go:17
13  0x0000000000808d58 in net/http.send
    at /usr/local/go/src/net/http/client.go:252
14  0x00000000008085fb in net/http.(*Client).send
    at /usr/local/go/src/net/http/client.go:176
15  0x000000000080aa35 in net/http.(*Client).do
    at /usr/local/go/src/net/http/client.go:725
16  0x000000000101a6c9 in net/http.(*Client).Do
    at /usr/local/go/src/net/http/client.go:593
(dlv) frame 11
(dlv) p pconn.cacheKey
net/http.connectMethodKey {
        proxy: "http://proxy-server:3128",
        scheme: "https",
        addr: "real-endpoint-which-caused-crash:443",
        onlyH1: false,}

(dlv) frame 10
(dlv) p t.connsPerHost
map[net/http.connectMethodKey]int [
        // ignore unrelated connectMethods
        {proxy: "http://proxy-server:3128", scheme: "https", addr: "real-endpoint-which-caused-crash:443", onlyH1: false}: 1, 
]

This looks odd: the panic is raised at net/http/transport.go:1475, yet printing t.connsPerHost shows a count of 1 for this key rather than 0.

	t.connsPerHostMu.Lock()
	defer t.connsPerHostMu.Unlock()
	n := t.connsPerHost[key]
	if n == 0 {
		// Shouldn't happen, but if it does, the counting is buggy and could
		// easily lead to a silent deadlock, so report the problem loudly.
		panic("net/http: internal error: connCount underflow")
	}

Note: n and key appear to be optimized out here. Printing them in dlv returns: unreadable could not find loclist entry..., and GDB shows an implausibly large number:

(gdb) p n
$2 = 825625303360
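To make the invariant behind that check concrete, here is a simplified model of a per-key connection counter. It is not the net/http implementation: hostCounter, inc, dec, and decPanics are hypothetical names, and the key is a plain string instead of connectMethodKey. It only shows that a decrement without a matching earlier increment trips the same kind of underflow check as the one quoted above.

```go
package main

import (
	"fmt"
	"sync"
)

// hostCounter is a hypothetical, simplified stand-in for the transport's
// per-host connection accounting.
type hostCounter struct {
	mu    sync.Mutex
	conns map[string]int
}

func newHostCounter() *hostCounter {
	return &hostCounter{conns: make(map[string]int)}
}

// inc records one more connection for key.
func (c *hostCounter) inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.conns[key]++
}

// dec releases one connection for key and panics if none is recorded.
func (c *hostCounter) dec(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	n := c.conns[key]
	if n == 0 {
		panic("hostCounter: connection count underflow")
	}
	// Dropping the entry at zero is a choice made for this sketch.
	if n--; n == 0 {
		delete(c.conns, key)
	} else {
		c.conns[key] = n
	}
}

// decPanics reports whether calling dec(key) panics.
func decPanics(c *hostCounter, key string) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	c.dec(key)
	return false
}

func main() {
	c := newHostCounter()
	c.inc("real-endpoint:443")
	fmt.Println(decPanics(c, "real-endpoint:443")) // false: matched by the inc above
	fmt.Println(decPanics(c, "real-endpoint:443")) // true: second dec has no matching inc
}
```

In this model the panic means some path released a slot it never acquired, or released it twice. Whether that is what happened in the crash above is exactly what still needs investigating.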

I will try to reproduce this outside the production environment, then with -race and the latest Go version. I noticed some similar issues that are already closed: #34941, #38172. Any comments or suggestions would be appreciated.
