
Error: No space left on device (PedroQin)

Notes on a server fault and its fix

Background

A colleague recently found that restarting services on a production-line server failed with the following error.

    [root@server ~]# service sshd restart
    Redirecting to /bin/systemctl restart sshd.service
    Error: No space left on device
    [root@server ~]# systemctl restart dhcpd.service
    Error: No space left on device
    [root@server ~]# df -h
    Filesystem               Size  Used Avail Use% Mounted on
    /dev/mapper/centos-root  472G  195G  278G  42% /
    devtmpfs                 126G     0  126G   0% /dev
    tmpfs                    126G     0  126G   0% /dev/shm
    tmpfs                    126G  4.1G  122G   4% /run
    tmpfs                    126G     0  126G   0% /sys/fs/cgroup
    /dev/sda1               1014M  166M  849M  17% /boot
    tmpfs                     26G     0   26G   0% /run/user/0

Debugging

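Taken literally, the error says the device has run out of space. This error usually has one of two causes: the disk is out of space, or the filesystem is out of inodes.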

So, full of confidence, I typed the commands to confirm my guess...

    [root@server bin]# df -h
    Filesystem               Size  Used Avail Use% Mounted on
    /dev/mapper/centos-root  472G  195G  278G  42% /
    devtmpfs                 126G     0  126G   0% /dev
    tmpfs                    126G     0  126G   0% /dev/shm
    tmpfs                    126G  4.1G  122G   4% /run
    tmpfs                    126G     0  126G   0% /sys/fs/cgroup
    /dev/sda1               1014M  166M  849M  17% /boot
    tmpfs                     26G     0   26G   0% /run/user/0
    [root@server bin]# df -i
    Filesystem                 Inodes   IUsed     IFree IUse% Mounted on
    /dev/mapper/centos-root 247431168 1169533 246261635    1% /
    devtmpfs                 33000182     748  32999434    1% /dev
    tmpfs                    33004461       1  33004460    1% /dev/shm
    tmpfs                    33004461    1514  33002947    1% /run
    tmpfs                    33004461      16  33004445    1% /sys/fs/cgroup
    /dev/sda1                  524288     340    523948    1% /boot
    tmpfs                    33004461       1  33004460    1% /run/user/0

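Hm? That is not what I expected: both disk space and inodes look plentiful. The messages log, dmesg, and the SEL showed no disk error entries either, so this did not look like a disk problem.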
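The two checks above can be folded into one small filter. This is only a sketch: the function name over_threshold and the 90% default are my own choices, and it assumes mount points contain no spaces. It reads `df -P` or `df -Pi` output on stdin and prints the mount points at or above the threshold.

```shell
# Hypothetical helper: print "mountpoint usage%" for every filesystem whose
# usage column (5th field) is at or above a threshold.
# $1 = threshold in percent (defaults to 90)
over_threshold() {
    awk -v limit="${1:-90}" '
        NR == 1 { next }            # skip the df header line
        {
            pct = $5                # e.g. "42%", or "-" when not reported
            sub(/%/, "", pct)
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit)
                print $6, pct "%"
        }'
}
```

Usage would be `df -P | over_threshold 90` for blocks and `df -Pi | over_threshold 90` for inodes. On the server in this article both would print nothing, which is exactly what made the error confusing.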

Finding the root cause

By default, Linux only allocates 8192 watches for inotify, which is ridiculously low. And when it runs out, the error is also No space left on device, which may be confusing if you aren't explicitly looking for this issue.

Run man 7 inotify for an introduction to inotify (an excerpt of the man page is in the appendix at the end).

    [root@server ~]# sysctl fs.inotify
    fs.inotify.max_queued_events = 16384
    fs.inotify.max_user_instances = 128
    fs.inotify.max_user_watches = 8192
    [root@server ~]# cat /proc/sys/fs/inotify/max_user_watches
    8192

The query confirms that the current upper limit on the number of watches that can be created per real user ID is indeed the default, 8192.

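The number of watches actually in use is queried below. It already exceeds the configured maximum, which is why the error occurs.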

    [root@server ~]# find /proc/*/fd -user "$USER" -lname anon_inode:inotify -printf '%hinfo/%f\n' 2>/dev/null | xargs cat | grep -c '^inotify'
    8557

How the command works:

This will first find all open file descriptors created by inotify_init*(2), and will then look into the corresponding /proc/PID/fdinfo/FD file for the info about the watch descriptors added with inotify_add_watch(2) to each of them (look into the proc(5) manpage under /proc/[pid]/fdinfo/ for a description of the inotify-specific entries).
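The same idea can be written as a plain loop, which is easier to read and to test. This is a minimal sketch, not the command from the answer quoted above: the function name count_inotify_watches and the optional proc-root argument are my own additions, and unlike the one-liner it does not filter by user, so it counts every process the caller is allowed to inspect.

```shell
# Count inotify watches across all processes visible under a proc root.
# $1 = proc root (defaults to /proc); a fixture directory can be passed
# instead for testing. Runs in a subshell so no variables leak out.
count_inotify_watches() (
    root="${1:-/proc}"
    total=0
    for fd in "$root"/[0-9]*/fd/*; do
        # Keep only descriptors that point at an inotify instance.
        [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ] || continue
        # /proc/PID/fd/N  ->  /proc/PID/fdinfo/N
        info="${fd%/fd/*}/fdinfo/${fd##*/}"
        # Each watch is one line starting with "inotify" in fdinfo.
        n=$(grep -c '^inotify' "$info" 2>/dev/null)
        total=$((total + ${n:-0}))
    done
    echo "$total"
)
```

Because the limit applies per real user ID, the total printed when running as root can be larger than any single user's count.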

In the same way, you can list the number of watch descriptors behind each /proc/PID/fdinfo/FD, then identify the command and file responsible.

    [root@server ~]# for i in `find /proc/*/fd -user "$USER" -lname anon_inode:inotify -printf '%hinfo/%f\n' 2>/dev/null`; do echo -e "$i \t `cat $i|grep -c '^inotify'`";done
    /proc/17810/fdinfo/11    2
    /proc/17825/fdinfo/3     3
    /proc/17825/fdinfo/8     4
    /proc/17847/fdinfo/6     2
    /proc/17873/fdinfo/6     1
    /proc/17879/fdinfo/3     1
    /proc/17880/fdinfo/3     1
    /proc/18341/fdinfo/5     3
    /proc/18882/fdinfo/7     1
    /proc/19235/fdinfo/9     5
    /proc/1/fdinfo/10        1
    /proc/1/fdinfo/14        4
    /proc/1/fdinfo/15        4
    /proc/1/fdinfo/17        4
    /proc/57300/fdinfo/4     8630
    /proc/7143/fdinfo/3      2
    /proc/9380/fdinfo/7      11
    [root@server ~]# cat /proc/57300/cmdline
    python xxx.py

This shows that PID 57300 is the culprit, and the command it is running has been identified as well.
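To avoid scanning that listing by eye, the per-process totals can be summed and sorted. The following is a sketch under my own assumptions: the name rank_inotify_users and the optional proc-root argument are hypothetical, and it covers all processes the caller can read rather than a single user.

```shell
# Print "watches pid command" for every process holding inotify watches,
# busiest first. $1 = proc root (defaults to /proc).
rank_inotify_users() (
    root="${1:-/proc}"
    for piddir in "$root"/[0-9]*; do
        count=0
        for fd in "$piddir"/fd/*; do
            [ "$(readlink "$fd" 2>/dev/null)" = "anon_inode:inotify" ] || continue
            n=$(grep -c '^inotify' "$piddir/fdinfo/${fd##*/}" 2>/dev/null)
            count=$((count + ${n:-0}))
        done
        [ "$count" -gt 0 ] || continue
        # cmdline separates arguments with NUL bytes; show them as spaces
        # and drop the trailing one.
        cmd=$(cat "$piddir/cmdline" 2>/dev/null | tr '\0' ' ' | sed 's/ *$//')
        printf '%s %s %s\n' "$count" "${piddir##*/}" "$cmd"
    done | sort -rn
)
```

On the server described here, the first line of the output would be the python process, since it holds nearly all of the watches.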

Fixing the problem

The script identified in the previous step runs a critical task and cannot be stopped for now, so the fix is to raise fs.inotify.max_user_watches.

Edit /etc/sysctl.conf, add the line fs.inotify.max_user_watches = 81920, and run the following command:

    [root@server ~]# sysctl -p
    fs.inotify.max_user_watches = 81920
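An alternative to editing /etc/sysctl.conf directly is a drop-in file under /etc/sysctl.d, which keeps the change separate from the distribution's defaults. This is a sketch rather than what was done on this server: the function name write_inotify_dropin and the file name 90-inotify.conf are arbitrary choices of mine.

```shell
# Write a sysctl drop-in that sets fs.inotify.max_user_watches.
# $1 = new limit, $2 = target directory (defaults to /etc/sysctl.d).
# The file name 90-inotify.conf is an arbitrary choice for this sketch.
write_inotify_dropin() (
    limit="$1"
    dir="${2:-/etc/sysctl.d}"
    # Refuse anything that is not a plain non-negative integer.
    case "$limit" in
        ''|*[!0-9]*)
            echo "limit must be a non-negative integer" >&2
            exit 1
            ;;
    esac
    printf 'fs.inotify.max_user_watches = %s\n' "$limit" > "$dir/90-inotify.conf"
)
```

As root, `write_inotify_dropin 81920 && sysctl --system` would write the file and reload all sysctl configuration, so the value also survives a reboot.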

Query the inotify settings again:

    [root@server ~]# sysctl fs.inotify
    fs.inotify.max_queued_events = 16384
    fs.inotify.max_user_instances = 128
    fs.inotify.max_user_watches = 81920
    [root@server ~]# cat /proc/sys/fs/inotify/max_user_watches
    81920
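Since the script keeps running and may keep adding watches, it may be worth checking the headroom from time to time. A minimal sketch, where the function name inotify_headroom_ok and the 80% default are my own assumptions:

```shell
# Exit 0 while usage stays below a percentage of the limit, exit 1 once it
# reaches or exceeds it.
# $1 = watches in use, $2 = max_user_watches, $3 = warning percent (default 80)
inotify_headroom_ok() (
    used="$1"; limit="$2"; pct="${3:-80}"
    # Integer-only comparison: used/limit >= pct/100  <=>  used*100 >= limit*pct
    if [ $((used * 100)) -ge $((limit * pct)) ]; then
        echo "WARNING: $used of $limit inotify watches in use" >&2
        exit 1
    fi
    echo "OK: $used of $limit inotify watches in use"
)
```

With the numbers from this article, `inotify_headroom_ok 8630 81920` passes, while `inotify_headroom_ok 8557 8192` fails, matching the state of the server before the limit was raised. The first argument can come from the counting command shown earlier and the second from /proc/sys/fs/inotify/max_user_watches.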

Restarting a service through systemctl confirms that the problem is resolved:

    [root@server etc]# service sshd restart
    Redirecting to /bin/systemctl restart sshd.service

Appendix

Reference links

https://serverfault.com/questions/708001/error-no-space-left-on-device-when-starting-stopping-services-only

https://unix.stackexchange.com/questions/498393/how-to-get-the-number-of-inotify-watches-in-use

man page for inotify

    NAME
        inotify - monitoring file system events

    DESCRIPTION
        The inotify API provides a mechanism for monitoring file system events. Inotify can be used to monitor individual files, or to monitor directories. When a directory is monitored, inotify will return events for the directory itself, and for files inside the directory.
        ...

    /proc interfaces
        The following interfaces can be used to limit the amount of kernel memory consumed by inotify:

        /proc/sys/fs/inotify/max_queued_events
            The value in this file is used when an application calls inotify_init(2) to set an upper limit on the number of events that can be queued to the corresponding inotify instance. Events in excess of this limit are dropped, but an IN_Q_OVERFLOW event is always generated.

        /proc/sys/fs/inotify/max_user_instances
            This specifies an upper limit on the number of inotify instances that can be created per real user ID.

        /proc/sys/fs/inotify/max_user_watches
            This specifies an upper limit on the number of watches that can be created per real user ID.
        ...

Source: Tencent Cloud community, PedroQin

