I ran into the following error when running cAdvisor:


[root@slave713 harbor]# docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest
ec0fdefc112e05104389a292e8c37d19a50f1cf9a2331967786df15944d5de5d
[root@slave713 harbor]#
[root@slave713 harbor]#
[root@slave713 harbor]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ec0fdefc112e google/cadvisor:latest "/usr/bin/cadvisor -l" 30 seconds ago Exited (255) 13 seconds ago cadvisor

[root@slave713 harbor]# docker logs cadvisor
......
......
I1001 07:41:07.992665 1 manager.go:201] Version: {KernelVersion:3.10.0-327.el7.x86_64 ContainerOsVersion:Alpine Linux v3.4 DockerVersion:1.12.1 CadvisorVersion:v0.24.0 CadvisorRevision:0cdf491}
E1001 07:41:08.105321 1 factory.go:291] devicemapper filesystem stats will not be reported: RHEL/Centos 7.x kernel version 3.10.0-366 or later is required to use thin_ls - you have "3.10.0-327.el7.x86_64"
I1001 07:41:08.105422 1 factory.go:295] Registering Docker factory
W1001 07:41:08.105486 1 manager.go:244] Registration of the rkt container factory failed: unable to communicate with Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
I1001 07:41:08.105504 1 factory.go:54] Registering systemd factory
I1001 07:41:08.106429 1 factory.go:86] Registering Raw factory
I1001 07:41:08.107700 1 manager.go:1082] Started watching for new ooms in manager
W1001 07:41:08.121246 1 manager.go:272] Could not configure a source for OOM detection, disabling OOM events: unable to find any kernel log file available from our set: [/var/log/kern.log /var/log/messages /var/log/syslog]
I1001 07:41:08.123175 1 manager.go:285] Starting recovery of all containers
I1001 07:41:10.985414 1 manager.go:290] Recovery completed
F1001 07:41:11.084149 1 cadvisor.go:151] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpuacct,cpu: no such file or directory
[root@slave713 harbor]#

After a long time searching on Google I found https://github.com/google/cadvisor/pull/1476, where the suggestions were:
  1. update the libcontainer dependency.
  2. look at all cgroup mounts


I tried this, but the problem persists.

I also found a report on bugzilla.redhat.com about the devicemapper error shown above, which suggested installing "device-mapper-persistent-data". After installing it, the problem is still there.

Output of docker version:
[root@slave713 harbor]# docker version
Client:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built:

OS/Arch: linux/amd64

Server:
Version: 1.12.1
API version: 1.24
Go version: go1.6.3
Git commit: 23cf638
Built:

OS/Arch: linux/amd64

Output of mount -l:
[root@slave712 ~]# mount -l
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=490460k,nr_inodes=122615,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
configfs on /sys/kernel/config type configfs (rw,relatime)
/dev/mapper/centos-root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=34,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
/dev/sda1 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=100140k,mode=700)
/dev/mapper/centos-root on /var/lib/docker/devicemapper type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

System information:
[root@slave712 ~]# uname -a
Linux slave712 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

[root@slave712 ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

Any pointers would be greatly appreciated!

nnnavy

Upvoted by: Luva 禅意田园


Folks, it's a bug. Pull the google/cadvisor:v0.24.1 image and it works. I spent half a day googling this, ugh.
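
A minimal sketch of retrying with the pinned tag, reusing the flags from the original command in the question. The function name start_cadvisor and the DOCKER override variable are assumptions added here for illustration; they are not part of cAdvisor or Docker.

```shell
#!/bin/sh
# Hypothetical helper: start cAdvisor from a pinned image tag instead of :latest.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the arguments as a dry run.
start_cadvisor() {
    tag="$1"
    "${DOCKER:-docker}" run \
        --volume=/:/rootfs:ro \
        --volume=/var/run:/var/run:rw \
        --volume=/sys:/sys:ro \
        --volume=/var/lib/docker/:/var/lib/docker:ro \
        --publish=8080:8080 \
        --detach=true \
        --name=cadvisor \
        "google/cadvisor:${tag}"
}

# Dry run: print the arguments that would be passed to docker.
DOCKER=echo start_cadvisor v0.24.1
```

To start it for real, remove the exited container first (docker rm cadvisor), because the name cadvisor is already taken, then call start_cadvisor v0.24.1 without the DOCKER override.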

tangjiaxing669 - Tom



Is nobody reading questions over the National Day holiday?

徐磊 - BlackOps@Qunar



I haven't run into this problem myself, but here is an idea.

cAdvisor derives cgroup paths from /proc/self/mountinfo. Inside the cAdvisor container, run cat /proc/self/mountinfo | grep cgroup and check whether the path for the cpu,cpuacct subsystem is correct.
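
The check suggested here can be sketched as a small filter over mountinfo-format text. It prints the mount point of each cgroup mount that carries a given controller, which can then be compared with the path in the fatal log line. In the question, mount -l lists /sys/fs/cgroup/cpu,cpuacct while the fatal error names /sys/fs/cgroup/cpuacct,cpu. The function name cgroup_mount_for and the sample line are assumptions made up for illustration; only the field layout follows the documented mountinfo format.

```shell
#!/bin/sh
# Hypothetical helper: read mountinfo-format text on stdin and print the mount
# point of every cgroup mount whose super options include the given controller.
cgroup_mount_for() {
    awk -v ctl="$1" '
    {
        # Fields 1-6 are fixed; optional fields follow until a lone "-".
        sep = 0
        for (i = 7; i <= NF; i++) if ($i == "-") { sep = i; break }
        # After the separator: filesystem type, source, super options.
        if (sep == 0 || $(sep + 1) != "cgroup") next
        n = split($(sep + 3), opts, ",")
        for (j = 1; j <= n; j++) if (opts[j] == ctl) { print $5; break }
    }'
}

# Illustrative sample line (not taken from the poster's machine).
sample='25 24 0:21 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,cpuacct,cpu'
printf '%s\n' "$sample" | cgroup_mount_for cpuacct   # prints /sys/fs/cgroup/cpu,cpuacct
```

On a real system the input would be /proc/self/mountinfo, for example cgroup_mount_for cpuacct < /proc/self/mountinfo, run in the same mount namespace as cAdvisor.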

easyroom



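This error is reported from inside the docker container, and the system information you posted is from the docker host, right? Modify your image: pick a CentOS 7.x base in FROM, or build one from what you have on hand.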

傻傻分不清楚



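I hit the same problem. It works on a Linux 2.6 kernel but fails on 3.10. I then commented out the /sys mount (--volume=/sys:/sys:ro) and cAdvisor starts, but container monitoring is missing: when I open the UI, not a single container is being monitored.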

tangjiaxing669 - Tom



Thanks, everyone!
