Docker daemon has recently started exiting on its own?


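Yesterday I found that a production service deployed with Docker was unavailable. The cause turned out to be that the docker process had exited on its own. We start it with service docker start; the system is CentOS 7 and the Docker version is 1.6.
We have been deploying with Docker for about 5 months, and this is the second time it has happened, both within the past week. I checked the daemon's log and saw the following: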
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.goexit()
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc20849bfc0 sp=0xc20849bfb8
Nov 7 18:21:34 cb-cobranding-01 docker: created by github.com/docker/docker/daemon/logger.(*Copier).Run
Nov 7 18:21:34 cb-cobranding-01 docker: /builddir/build/BUILD/docker-ba1f6c3a8973725dcc97298aecb367ad5498955b/_build/src/github.com/docker/docker/daemon/logger/copier.go:37 +0x11c
Nov 7 18:22:27 cb-cobranding-01 kernel: docker: page allocation failure: order:0, mode:0x10200da
Nov 7 18:22:27 cb-cobranding-01 kernel: CPU: 0 PID: 15327 Comm: docker Not tainted 3.10.0-123.9.3.el7.x86_64 #1
Nov 7 18:22:28 cb-cobranding-01 systemd: docker.service: main process exited, code=killed, status=6/ABRT
Nov 7 18:22:28 cb-cobranding-01 systemd: Unit docker.service entered failed state.
Nov 8 09:38:29 cb-cobranding-01 docker-storage-setup: Volume group "xvda1" not found

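I don't quite understand this line: Nov 7 18:22:27 cb-cobranding-01 kernel: docker: page allocation failure: order:0, mode:0x10200da. Searching Google did not turn up what I was looking for. Does it mean the kernel failed to allocate memory for docker? The machine has about 5 GB of memory free.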

I have since found an OOM entry in the log from the same day as well:
Nov 7 18:21:34 cb-cobranding-01 docker: fatal error: runtime: out of memory
Nov 7 18:21:34 cb-cobranding-01 docker: runtime stack:
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.throw(0x11be937)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/panic.go:491 +0xad fp=0x7fc25effcc18 sp=0x7fc25effcbe8
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.SysMap(0xc376400000, 0x80000000, 0x11d1a00, 0x11ddcb8)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/mem_linux.c:149 +0x98 fp=0x7fc25effcc50 sp=0x7fc25effcc18
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.MHeap_SysAlloc(0x11e33a0, 0x80000000, 0x439e32)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/malloc.c:284 +0x124 fp=0x7fc25effcca8 sp=0x7fc25effcc50
Nov 7 18:21:34 cb-cobranding-01 docker: MHeap_Grow(0x11e33a0, 0x40000, 0x0)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/mheap.c:398 +0x58 fp=0x7fc25effcce8 sp=0x7fc25effcca8
Nov 7 18:21:34 cb-cobranding-01 docker: MHeap_AllocSpanLocked(0x11e33a0, 0x40000, 0x200000004)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/mheap.c:298 +0x365 fp=0x7fc25effcd28 sp=0x7fc25effcce8
Nov 7 18:21:34 cb-cobranding-01 docker: mheap_alloc(0x11e33a0, 0x40000, 0x100000000, 0x0)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/mheap.c:190 +0x121 fp=0x7fc25effcd50 sp=0x7fc25effcd28
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.MHeap_Alloc(0x11e33a0, 0x40000, 0x10100000000, 0xc20855b560)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/mheap.c:240 +0x66 fp=0x7fc25effcd88 sp=0x7fc25effcd50
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.largeAlloc_m()
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/malloc.c:388 +0xce fp=0x7fc25effcdd8 sp=0x7fc25effcd88
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.onM(0xc208012000)
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/asm_amd64.s:257 +0x68 fp=0x7fc25effcde0 sp=0x7fc25effcdd8
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.mstart()
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/proc.c:818 fp=0x7fc25effcde8 sp=0x7fc25effcde0
Nov 7 18:21:34 cb-cobranding-01 docker: goroutine 273 :
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.switchtoM()
Nov 7 18:21:34 cb-cobranding-01 docker: /usr/lib/golang/src/runtime/asm_amd64.s:198 fp=0xc2085784d8 sp=0xc2085784d0
Nov 7 18:21:34 cb-cobranding-01 docker: runtime.mallocgc(0x7ffffe00, 0x9fc200, 0xc200000001, 0x0)

However, free -h shows that plenty of physical memory is still free or cached, so why did an OOM occur?
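
As a rough aid to reading the trace above, the hexadecimal arguments can be converted into sizes. This is only a sketch: it assumes (the post does not confirm this) that the first argument of runtime.mallocgc and the second argument of runtime.SysMap are byte counts, as in the Go runtime sources of that era. The helper name to_gib is made up for illustration.

```python
# Hypothetical helper: convert hex arguments from the trace into readable sizes.
# Assumption: mallocgc's first argument and SysMap's second argument are byte counts.

def to_gib(hex_value: str) -> float:
    """Convert a hexadecimal byte count to GiB."""
    return int(hex_value, 16) / 2**30


requested = to_gib("0x7ffffe00")  # from runtime.mallocgc(0x7ffffe00, ...)
mapped = to_gib("0x80000000")     # from runtime.SysMap(..., 0x80000000, ...)

print(f"mallocgc request: {requested:.2f} GiB")
print(f"SysMap region:    {mapped:.2f} GiB")
```

If that reading is right, the daemon was attempting a single allocation of about 2 GiB when the failure was logged.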

徐磊 - BlackOps@Qunar



I'm not sure whether your daemon has actually hit an OOM.

I ran into a similar problem before, also on version 1.6. Looking at this log entry:

Nov 7 18:21:34 cb-cobranding-01 docker: created by github.com/docker/docker/daemon/logger.(*Copier).Run

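Seeing that entry, I have a hunch that the daemon was killed by the OOM killer. Please check your kernel log and search for keywords such as OOM.

Also, have you run any attach-type operations (docker run -t counts too)?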
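
The kernel-log check suggested above could be sketched roughly as follows. The keyword list is an assumption (phrases that commonly appear in OOM-related entries), and find_oom_lines is a hypothetical helper name. The sample lines are copied from the logs quoted in the question.

```python
import re

# Assumption: these phrases are typical of OOM-related log entries; extend as needed.
OOM_PATTERN = re.compile(
    r"out of memory|oom-killer|page allocation failure",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return the log lines that mention an OOM-related keyword."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


# Sample lines copied from the logs quoted in the question.
sample = [
    "Nov 7 18:21:34 cb-cobranding-01 docker: fatal error: runtime: out of memory",
    "Nov 7 18:22:27 cb-cobranding-01 kernel: docker: page allocation failure: order:0, mode:0x10200da",
    "Nov 7 18:22:28 cb-cobranding-01 systemd: docker.service: main process exited, code=killed, status=6/ABRT",
]

for hit in find_oom_lines(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of the sample list, for example the syslog file on your host. Its path is not given in the post, so check where your system writes it.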
