云风的 BLOG: A design problem in epoll

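This started with an issue filed against skynet. Roughly, the socket thread fell into an infinite loop: one fd kept generating new events, and because the event was neither EPOLLIN nor EPOLLOUT, the socket thread kept calling epoll_wait and saturated the CPU.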

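For now I cannot reproduce the problem on my own machine. From analysis, the fd causing the trouble is 0, that is, stdin, so my guess is that redirection is involved.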

skynet originally did not handle EPOLLERR (kqueue does not seem to have a counterpart). My patch today adds that handling, but it probably does not solve the problem completely.

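I ran a simple test: if I forcibly close fd 0 without first removing it from epoll, the no-longer-existing fd (0) really does keep generating EPOLLIN events (unlike the issue, which mentions EPOLLERR). And I never get another chance to fix it. Because fd 0 is closed, I cannot remove it from epoll once this has happened, nor can I read from it (the file object in the kernel), so the events never stop.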
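The test described above can be approximated in a few lines. This is a minimal sketch in Python, whose `select.epoll` wraps the same Linux system calls, and it is not the original test: it uses a pipe instead of fd 0, and it assumes that a second descriptor still refers to the same kernel file object, as is often the case for a stdin inherited from a shell. The names `stale_events_after_close` and `keeper` are made up for illustration.

```python
import os
import select

def stale_events_after_close():
    """Register a pipe's read end, close it without removing it from epoll,
    and return what epoll reports afterwards."""
    r, w = os.pipe()
    keeper = os.dup(r)              # assumption: another descriptor keeps the kernel object alive
    ep = select.epoll()
    try:
        os.write(w, b"x")           # make the read end readable
        ep.register(r, select.EPOLLIN)
        os.close(r)                 # closed WITHOUT removing it from epoll first
        first = ep.poll(0.1)        # epoll still reports the closed fd number
        try:
            ep.unregister(r)        # r is no longer a valid descriptor, so this cannot succeed
        except OSError:
            pass                    # typically EBADF; some older Python versions swallow it
        second = ep.poll(0.1)       # the registration is still there
        return r, first, second
    finally:
        ep.close()
        os.close(keeper)
        os.close(w)

if __name__ == "__main__":
    fd, first, second = stale_events_after_close()
    print("closed fd:", fd, "first poll:", first, "second poll:", second)
```

On Linux, both polls should report the already-closed descriptor number with EPOLLIN. A real event loop that polls without a timeout would spin on this event, and the failed removal attempt changes nothing.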

I do not know epoll very well, so I googled it and found an interesting blog post that taught me more about this problem:

Epoll is fundamentally broken

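Its author argues that epoll's design mistake is that the interface conflates "file descriptor" and "file description". When we call the API we pass a file descriptor, the fd number in user space; but the kernel references the file description, the kernel object. Once we close the file descriptor in user space, we can no longer control it through epoll_ctl, yet inside the kernel epoll works by the lifetime of the file description.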

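Once such unremovable events appear, the only remedy is to discard the whole epoll fd and create a new one. For this reason, it is very hard to build a complete abstraction layer on top of epoll as it is defined.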
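Discarding the epoll fd and creating a new one could look roughly like the sketch below. It is written in Python (`select.epoll` wraps the Linux calls) and is only an illustration, not skynet's code. The helper name `rebuild_epoll` and the `live` dictionary are hypothetical. The sketch assumes the caller keeps its own table of descriptors that are still open, since the epoll API itself has no call for listing registrations.

```python
import os
import select

def rebuild_epoll(old_ep, live):
    """Return a fresh epoll instance that watches only the fds in `live`.

    `live` maps each still-open fd to its event mask (caller-side bookkeeping).
    """
    new_ep = select.epoll()
    for fd, mask in live.items():
        new_ep.register(fd, mask)
    old_ep.close()                  # drops every old registration, stale ones included
    return new_ep

if __name__ == "__main__":
    r, w = os.pipe()
    keeper = os.dup(r)              # a second descriptor for the same kernel object
    ep = select.epoll()
    os.write(w, b"x")
    ep.register(r, select.EPOLLIN)
    os.close(r)                     # leaves a registration that can no longer be removed
    ep = rebuild_epoll(ep, {keeper: select.EPOLLIN})
    print(ep.poll(0.1))             # only descriptors that are still open can show up now
    ep.close()
    os.close(keeper)
    os.close(w)
```

The cost of this approach is that every layer above epoll has to track which descriptors it registered, which is part of why a complete abstraction is hard to build.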

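illumos is a fork of OpenSolaris. It also provides an epoll compatibility interface, but on this point it refuses to follow the original Linux epoll definition: once a file descriptor has been closed, no new events are generated, even if the file description still exists.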

I compared this with the definition of kqueue on FreeBSD, and it handles the case the same way.

Comments

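The usual approach: when an event fires on an fd and the result shows that the fd has a problem, first remove it from epoll and then close the fd. With the order right, nothing can go wrong. As long as users of the epoll API maintain this order, there will be no problem.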

Posted by: guccang | (13) May 10, 2024 11:46 AM

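In level-triggered (LT) mode, for the write buffer, EPOLLOUT keeps firing as long as the fd is writable. On Linux, after running a command that prints too much to the console, we habitually hold down the Enter key to push the output up the screen. With level-triggered mode, the program usually prints some logs after it starts, and I habitually push those logs up the same way; when the accepted client connection fd is then added to epoll with EPOLLOUT registered, EPOLLOUT does not keep firing. What is the reason for this?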

Posted by: adam | (12) December 1, 2020 12:31 AM

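This is probably a problem caused by incorrect usage. An API should not offer safety that is beyond its reach, and neither should a framework (such as skynet). Users should discipline their own behavior.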

Posted by: ling0kill | (11) August 20, 2018 05:39 PM

This is probably a problem caused by incorrect usage. An API should not offer safety that is beyond its reach.

Posted by: ling0kill | (10) August 20, 2018 05:37 PM

This is what the manual says: Multithreaded applications If a file descriptor being monitored by select() is closed in another thread, the result is unspecified. On some UNIX systems, select() unblocks and returns, with an indication that the file descriptor is ready (a subsequent I/O operation will likely fail with an error, unless another the file descriptor reopened between the time select() returned and the I/O operations was performed). On Linux (and some other systems), closing the file descriptor in another thread has no effect on select(). In summary, any application that relies on a particular behavior in this scenario must be considered buggy

Posted by: xi | (9) July 5, 2017 12:06 PM

The manual discusses this case: For a discussion of what may happen if a file descriptor in an epoll instance being monitored by epoll_wait() is closed in another thread, see select(2).

Posted by: xi | (8) July 5, 2017 12:04 PM

When closing, if the fd is in epoll, remove it from epoll first and only then actually close it.

Posted by: Anonymous | (7) June 14, 2017 07:49 PM

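@davidxu The problem here is that epoll is managed by the skynet framework, but skynet is only a framework and does not stop its users from calling close through a third-party library.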

Posted by: Cloud | (6) June 6, 2017 04:51 PM

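My guess is that the lifetimes of the code that closes the fd and the code that handles events are not synchronized. I often use libev. There, after a file handle has been closed, when epoll returns an event, the callback hook installed in libev still gets called as long as it has not been removed. When it is called, reading the file handle should fail with errno 9 (EBADF). That way, some problems in the code can be discovered.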

Posted by: davidxu | (5) June 6, 2017 11:54 AM

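I ran into exactly the problem the author describes, but never found the cause. Back then epoll had accidentally been set to watch fd 0. As long as the current SecureCRT session stayed open, fd 0 was not closed and CPU usage was normal. But once CRT was closed and fd 0 with it, epoll kept reporting fd 0 as readable. The first read on the already-closed fd 0 returned -1, and all later reads returned 0.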

Posted by: AntyRao | (4) June 2, 2017 11:47 AM

@codesun Reference counting is an implementation technique and is not itself the problem. The problem is that once a file descriptor is gone while the file description still exists, there is no way at all to remove it from epoll, so this is a design mistake. See illumos's man epoll: While a best effort has been made to mimic the Linux semantics, there are some semantics that are too peculiar or ill-conceived to merit accommodation. In particular, the Linux epoll facility will -- by design -- continue to generate events for closed file descriptors where/when the underlying file description remains open. And the kqueue man page says: Calling close() on a file descriptor will remove any kevents that reference the descriptor.

Posted by: Cloud | (3) May 28, 2017 11:23 AM

man 7 epoll Q6 Will closing a file descriptor cause it to be removed from all epoll sets automatically? A6 Yes, but be aware of the following point. A file descriptor is a reference to an open file description (see open(2)). Whenever a file descriptor is duplicated via dup(2), dup2(2), fcntl(2) F_DUPFD, or fork(2), a new file descriptor referring to the same open file description is created. An open file description continues to exist until all file descriptors referring to it have been closed. A file descriptor is removed from an epoll set only after all the file descriptors referring to the underlying open file description have been closed (or before if the file descriptor is explicitly removed using epoll_ctl(2) EPOLL_CTL_DEL). This means that even after a file descriptor that is part of an epoll set has been closed, events may be reported for that file descriptor if other file descriptors referring to the same underlying file description remain open. I think the manual already states this very clearly. Besides, isn't reference counting on close quite common in Linux? epoll is no exception.

Posted by: codesun | (2) May 28, 2017 12:44 AM

Nice blog content, more great content is welcome http://www.alipie.com/

Posted by: 乐了 | (1) May 27, 2017 10:42 PM
