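Preface#

While writing this article I often turned to AI whenever I hit something in the source code that I did not understand, and I noticed that I was searching for references and good blog posts less and less. That made me wonder: with AI this convenient, do we still need to read source code ourselves? Would it not be faster to hand the analysis to AI? It even seemed that writing this article was unnecessary. That thought gave me pause.

I suspect many people feel the same. It is not only about reading source code: AI can help with many everyday problems, so why not use such a powerful tool?

Materialist dialectics tells us that contradiction, the unity of opposites, is present in everything. Relying heavily on AI over a long period may slowly erode basic skills such as independent thinking and breaking problems down. While most people are growing dependent on AI, if we instead use AI to strengthen those skills and then apply AI more effectively in turn, the result is a spiral of improvement. Seizing that opportunity may let us swim against the tide in the AI era.

Whether it is blogging or reading source code, I am not pessimistic. As long as using AI to aid understanding also improves our fundamentals, there is no need to agonize over whether the effort is meaningful. What matters is how we ask AI questions.

I offer this article as a starting point for discussion.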

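I/O Multiplexing#

Before the server can handle any connection request, its EventLoop thread has to be running. It is started in doStartThread() of SingleThreadEventExecutor, which launches the EventLoop thread's main loop inside an infinite for loop and then breaks out of that loop, so in the normal case the body runs only once.

Writing it this way lets the code retry or leave the loop depending on the executor state, and leaves room for extension.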

private void doStartThread() {
  executor.execute(new Runnable() {
    @Override
    public void run() {
      // Acquire the lock
      processingLock.lock();
      // ...
      try {
        for (;;) {
          // Start the main loop of the bossGroup EventLoop thread
          SingleThreadEventExecutor.this.run();
          success = true;

          int currentState = state;
          if (canSuspend(currentState)) {
            if (!STATE_UPDATER.compareAndSet(SingleThreadEventExecutor.this,
                    ST_SUSPENDING, ST_SUSPENDED)) {
              // Try again as the CAS failed.
              continue;
            }

            if (!canSuspend(ST_SUSPENDED) && STATE_UPDATER.compareAndSet(SingleThreadEventExecutor.this,
                    ST_SUSPENDED, ST_STARTED)) {
              // Seems like there was something added to the task queue again in the meantime but we
              // were able to re-engage this thread as the event loop thread.
              continue;
            }
            suspend = true;
          }

          // Leave the loop
          break;
        }
      } catch (Throwable t) {
        unexpectedException = t;
        logger.warn("Unexpected exception from an event executor: ", t);
      } finally {
        // ...
      }
    }
  });
}

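This is the EventLoop thread that bossGroup selected in the previous article. It binds the port and listens for OP_ACCEPT events.

Inside a do...while loop the thread keeps doing two things:

runIo() waits for and processes I/O events
runAllTasks() runs the tasks queued on the EventLoop thread

This is I/O multiplexing, except that in Netty the loop also runs EventLoop tasks.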
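To make the "I/O first, then tasks within a time budget" rhythm concrete, here is a minimal sketch. MiniEventLoop and all of its members are hypothetical names invented for illustration, the I/O step is a stub, and the queue is single-threaded for brevity. It is not Netty's implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical miniature of "process I/O, then run queued tasks for a bounded time".
final class MiniEventLoop {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    int ioRounds; // how many times the (stubbed) I/O step ran

    void submit(Runnable task) {
        tasks.add(task);
    }

    // Stand-in for runIo(): a real loop would poll a Selector here.
    void runIo() {
        ioRounds++;
    }

    // Runs queued tasks until the queue is empty or the time budget is used up.
    // Returns the number of tasks that were executed.
    int runAllTasks(long budgetNanos) {
        long start = System.nanoTime();
        int ran = 0;
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
            ran++;
            if (System.nanoTime() - start >= budgetNanos) {
                break; // budget exhausted: return so that I/O gets its turn again
            }
        }
        return ran;
    }

    // One pass of the do...while body: I/O first, then tasks.
    int runOnce(long budgetNanos) {
        runIo();
        return runAllTasks(budgetNanos);
    }

    int pending() {
        return tasks.size();
    }

    public static void main(String[] args) {
        MiniEventLoop loop = new MiniEventLoop();
        loop.submit(() -> System.out.println("task A"));
        loop.submit(() -> System.out.println("task B"));
        System.out.println("tasks run: " + loop.runOnce(Long.MAX_VALUE));
    }
}
```

The budget check after every task keeps a long task queue from starving I/O, which is the purpose of passing a time limit to runAllTasks().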

@Override
protected void run() {
    // Assert that we are on the EventLoop thread
    assert inEventLoop();
    // Initialize the ioHandler (lifecycle hook)
    ioHandler.initialize();
    do {
        // Process ready I/O events
        runIo();
        if (isShuttingDown()) {
            // Called before the ioHandler is destroyed (lifecycle hook)
            ioHandler.prepareToDestroy();
        }
        // Now run all tasks for the maximum configured amount of time before trying to run IO again.
        runAllTasks(maxTaskProcessingQuantumNs);

        // We should continue with our loop until we either confirmed a shutdown or we can suspend it.
    } while (!confirmShutdown() && !canSuspend());
}

protected int runIo() {
    assert inEventLoop();
    // Delegate to the ioHandler
    return ioHandler.run(context);
}

@Override
public int run(IoHandlerContext context) {
  int handled = 0;
  try {
    try {
      // Tasks pending: call selectNow() once so ready I/O events are handled promptly
      // No tasks: fall into SELECT and block
      switch (selectStrategy.calculateStrategy(selectNowSupplier, !context.canBlock())) {
        case SelectStrategy.CONTINUE:
          return 0;

        case SelectStrategy.BUSY_WAIT:
          // fall-through to SELECT since the busy-wait is not supported with NIO

        case SelectStrategy.SELECT:
          // Block in selector.select() waiting for I/O events
          select(context, wakenUp.getAndSet(false));

          // wakenUp records whether the selector should be woken up.
          // It is reset to false at the start of every round and
          // set to true when a task is submitted, to wake the thread.
          if (wakenUp.get()) {
            selector.wakeup();
          }
          // fall through
        default:
      }
    } catch (IOException e) {
      // If we receive an IOException here its because the Selector is messed up. Let's rebuild
      // the selector and retry. https://github.com/netty/netty/issues/8566
      rebuildSelector0();
      handleLoopException(e);
      return 0;
    }

    cancelledKeys = 0;
    needsToSelectAgain = false;
    // Process the selected keys
    handled = processSelectedKeys();
  } catch (Error e) {
    throw e;
  } catch (Throwable t) {
    handleLoopException(t);
  }
  return handled;
}

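ioHandler.run() uses the strategy pattern to decide what to do, based on whether there are tasks to run:

Tasks pending: call selectNow() once
No tasks: fall into SELECT and block

selectNow() returns the ready events immediately without blocking the thread, so pending I/O events are handled promptly before runAllTasks() runs.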
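The decision described above fits in one expression. The sketch below is an illustration under assumptions: SelectStrategySketch is a hypothetical class, and the SELECT marker value is chosen for this example only.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of "selectNow() when tasks are pending, otherwise block".
final class SelectStrategySketch {
    // Marker meaning "go and block in select()". Any negative value works for this
    // sketch, because a real selectNow() result is never negative.
    static final int SELECT = -1;

    // hasTasks == true : poll without blocking and return the number of ready keys
    // hasTasks == false: tell the caller to perform a blocking select
    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }

    public static void main(String[] args) {
        System.out.println(calculateStrategy(() -> 2, true));
        System.out.println(calculateStrategy(() -> 2, false));
    }
}
```

Passing the non-blocking poll as a supplier means it is only invoked when tasks are actually pending.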

Next we follow select(context, wakenUp.getAndSet(false)) to see how Netty waits for I/O events.

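Avoiding the JDK Empty-Polling Bug#

This method calls selector.select(timeoutMillis) in a for loop to block the thread. It leaves the loop when I/O events are ready or when there is a task to run, after which the ready keys are processed and runAllTasks() is executed.

timeoutMillis is computed dynamically from the execution time of the next scheduled task.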
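The timeout arithmetic is easy to check in isolation. The helper below reproduces only the rounding expression used in select(). The class and method names are hypothetical.

```java
// Hypothetical helper isolating the nanosecond-to-millisecond rounding used for the select timeout.
final class SelectTimeout {
    // Adds 0.5 ms before dividing so that the result is rounded to the nearest
    // millisecond instead of being truncated.
    static long timeoutMillis(long deadlineNanos, long nowNanos) {
        return (deadlineNanos - nowNanos + 500_000L) / 1_000_000L;
    }

    public static void main(String[] args) {
        long now = System.nanoTime();
        // A task due in 1.6 ms rounds up to a 2 ms select timeout.
        System.out.println(timeoutMillis(now + 1_600_000L, now));
        // A task due in 0.4 ms rounds down to 0, so the loop would not block at all.
        System.out.println(timeoutMillis(now + 400_000L, now));
    }
}
```

Any remaining delay below 0.5 ms yields a timeout of 0, which is the "task is due" branch that leaves the loop.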

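There are two cases in which the loop is left because a task needs to run:

A submitted task sets wakenUp to true
timeoutMillis <= 0, meaning the task's execution time has arrived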
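The first case relies on Selector.wakeup() from the JDK. The sketch below shows the idea of guarding wakeup() with a compare-and-set flag so that only the first submitter pays for it. WakeupSketch is a hypothetical class, and its reset timing is simplified compared with the listings in this article.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: task submission wakes a thread that is blocked in select().
final class WakeupSketch {
    final Selector selector;
    final AtomicBoolean wakenUp = new AtomicBoolean();
    final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();

    WakeupSketch() {
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // May be called from any thread.
    void submit(Runnable task) {
        tasks.add(task);
        // Only the caller that flips the flag from false to true calls wakeup().
        if (wakenUp.compareAndSet(false, true)) {
            selector.wakeup();
        }
    }

    // Called on the loop thread. Blocks for at most timeoutMillis and returns the
    // elapsed time in milliseconds.
    long blockingSelect(long timeoutMillis) {
        long start = System.nanoTime();
        try {
            selector.select(timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        wakenUp.set(false); // simplified reset for this sketch
        return (System.nanoTime() - start) / 1_000_000L;
    }

    void close() {
        try {
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WakeupSketch loop = new WakeupSketch();
        loop.submit(() -> { });
        System.out.println("select returned after ~" + loop.blockingSelect(10_000L) + " ms");
        loop.close();
    }
}
```

Because a wakeup() issued while no select is in progress makes the next select() return immediately, a task submitted just before the loop blocks is not left waiting for the full timeout.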

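This method is also where the JDK empty-polling bug is handled.

JDK empty-polling bug

Under certain abnormal conditions, Selector.select() in Java NIO keeps returning 0 even though there are no I/O events, which drives CPU usage up to 100%.

Netty does not actually fix the JDK bug. It works around it by counting empty polls and rebuilding the Selector:

Call selector.select(timeoutMillis) with a blocking timeout
Increment the counter, then check whether the call really blocked for the whole timeout; on a normal timeout reset the counter to 1
If the counter reaches the threshold (SELECTOR_AUTO_REBUILD_THRESHOLD, 512 by default), assume the empty-polling bug has been triggered
Create a new Selector
Re-register the old channels on the new Selector through IoRegistration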
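The counting part of the steps above can be sketched as a small state machine. EmptyPollDetector and its method names are hypothetical. Only the decision logic is modeled, and the caller is assumed to perform the actual rebuild.

```java
// Hypothetical sketch of the "count premature returns, rebuild at a threshold" idea.
final class EmptyPollDetector {
    private final int threshold;
    private int selectCnt;

    EmptyPollDetector(int threshold) {
        this.threshold = threshold;
    }

    // Call after a select() that returned with nothing selected.
    // elapsedNanos: how long select() actually blocked
    // timeoutNanos: how long it was asked to block
    // Returns true when the caller should rebuild the selector.
    boolean onEmptySelect(long elapsedNanos, long timeoutNanos) {
        selectCnt++;
        if (elapsedNanos >= timeoutNanos) {
            // select() blocked for the full timeout: normal behavior, start counting afresh.
            selectCnt = 1;
            return false;
        }
        if (threshold > 0 && selectCnt >= threshold) {
            // Too many premature returns in a row: treat it as the empty-polling bug.
            selectCnt = 1;
            return true;
        }
        return false;
    }

    int count() {
        return selectCnt;
    }

    public static void main(String[] args) {
        EmptyPollDetector detector = new EmptyPollDetector(512);
        int polls = 0;
        boolean rebuild = false;
        while (!rebuild) {
            rebuild = detector.onEmptySelect(0L, 1_000_000L); // always premature
            polls++;
        }
        System.out.println("rebuild requested after " + polls + " premature returns");
    }
}
```

A single normal timeout resets the count, so only an uninterrupted run of premature returns triggers a rebuild.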

private void select(IoHandlerContext runner, boolean oldWakenUp) throws IOException {
  Selector selector = this.selector;
  try {
    int selectCnt = 0;
    long currentTimeNanos = System.nanoTime();
    // How long selector.select() may block depends on when the next task is due:
    // deadline = current time + remaining delay of the first task in the queue
    long selectDeadLineNanos = currentTimeNanos + runner.delayNanos(currentTimeNanos);

    for (;;) {
      // Compute the timeout so select() cannot block forever; currentTimeNanos is refreshed
      // at the end of every iteration.
      // timeout (ms) = (deadline - current time + 0.5 ms for rounding) / 1000000L
      long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
      // <= 0 means the task is due, so leave the loop
      if (timeoutMillis <= 0) {
        if (selectCnt == 0) {
          selector.selectNow();
          selectCnt = 1;
        }
        break;
      }

      // If a task was submitted when wakenUp value was true, the task didn't get a chance to call
      // Selector#wakeup. So we need to check task queue again before executing select operation.
      // If we don't, the task might be pended until select operation was timed out.
      // It might be pended until idle timeout if IdleStateHandler existed in pipeline.
      if (!runner.canBlock() && wakenUp.compareAndSet(false, true)) {
        selector.selectNow();
        selectCnt = 1;
        break;
      }

      // Blocking select, bounded by timeoutMillis
      int selectedKeys = selector.select(timeoutMillis);
      selectCnt ++;

      // Leave the loop if something was selected, we were woken up, or a task is pending
      if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || !runner.canBlock()) {
        // - Selected something,
        // - waken up by user, or
        // - the task queue has a pending task.
        // - a scheduled task is ready for processing
        break;
      }
      if (Thread.interrupted()) {
        // Thread was interrupted so reset selected keys and break so we not run into a busy loop.
        // As this is most likely a bug in the handler of the user or it's client library we will
        // also log it.
        //
        // See https://github.com/netty/netty/issues/2426
        if (logger.isDebugEnabled()) {
          logger.debug("Selector.select() returned prematurely because " +
              "Thread.currentThread().interrupt() was called. Use " +
              "NioHandler.shutdownGracefully() to shutdown the NioHandler.");
        }
        selectCnt = 1;
        break;
      }

      long time = System.nanoTime();
      // current time - timeout >= loop start time
      // i.e. time spent in select (current time - loop start time) >= timeout
      if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
        // timeoutMillis elapsed without anything selected.
        // Normal timeout: reset selectCnt to 1
        selectCnt = 1;
        // Threshold reached: rebuild the selector
      } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
          selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
        // The code exists in an extra method to ensure the method is not too big to inline as this
        // branch is not very likely to get hit very frequently.
        selector = selectRebuildSelector(selectCnt);
        selectCnt = 1;
        break;
      }
      // Refresh the current time
      currentTimeNanos = time;
    }
    //...
  } catch (CancelledKeyException e) {
    if (logger.isDebugEnabled()) {
      logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
          selector, e);
    }
    // Harmless exception - log anyway
  }
}

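Let us briefly follow the rebuild logic. It first creates a JDK Selector, then uses reflection to replace the private selectedKeys and publicSelectedKeys fields of the JDK Selector object with an array-backed set, and finally re-registers the old channels on the new selector through IoRegistration.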

void rebuildSelector0() {
    final Selector oldSelector = selector;
    final SelectorTuple newSelectorTuple;

    if (oldSelector == null) {
        return;
    }

    try {
        // Create the new selector
        newSelectorTuple = openSelector();
    } catch (Exception e) {
        logger.warn("Failed to create a new Selector.", e);
        return;
    }

    // Register all channels to the new Selector.
    // (done through the IoRegistration attached to each key)
    int nChannels = 0;
    for (SelectionKey key : oldSelector.keys()) {
        DefaultNioRegistration handle = (DefaultNioRegistration) key.attachment();
        try {
            if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
                continue;
            }

            handle.register(newSelectorTuple.unwrappedSelector);
            nChannels++;
        } catch (Exception e) {
            logger.warn("Failed to re-register a NioHandle to the new Selector.", e);
            handle.cancel();
        }
    }

    selector = newSelectorTuple.selector;
    unwrappedSelector = newSelectorTuple.unwrappedSelector;

    try {
        // time to close the old selector as everything else is registered to the new one
        oldSelector.close();
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("Failed to close the old Selector.", t);
        }
    }
    //...
}

private SelectorTuple openSelector() {
    final Selector unwrappedSelector;
    try {
        // Create the JDK NIO selector
        unwrappedSelector = provider.openSelector();
    } catch (IOException e) {
        throw new ChannelException("failed to open a new selector", e);
    }

    if (DISABLE_KEY_SET_OPTIMIZATION) {
        return new SelectorTuple(unwrappedSelector);
    }

    Object maybeSelectorImplClass = AccessController.doPrivileged(new PrivilegedAction<Object>() {
        @Override
        public Object run() {
            try {
                return Class.forName(
                        "sun.nio.ch.SelectorImpl",
                        false,
                        PlatformDependent.getSystemClassLoader());
            } catch (Throwable cause) {
                return cause;
            }
        }
    });

    //...

    final Class<?> selectorImplClass = (Class<?>) maybeSelectorImplClass;
    final SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();
    // Replace the selected-key sets by reflection: the HashSet gives way to an array-backed set
    Object maybeException = AccessController.doPrivileged(new PrivilegedAction<Object>() {
        @Override
        public Object run() {
            try {
                Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
                Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");

                if (PlatformDependent.javaVersion() >= 9 && PlatformDependent.hasUnsafe()) {
                    // Let us try to use sun.misc.Unsafe to replace the SelectionKeySet.
                    // This allows us to also do this in Java9+ without any extra flags.
                    long selectedKeysFieldOffset = PlatformDependent.objectFieldOffset(selectedKeysField);
                    long publicSelectedKeysFieldOffset =
                            PlatformDependent.objectFieldOffset(publicSelectedKeysField);

                    if (selectedKeysFieldOffset != -1 && publicSelectedKeysFieldOffset != -1) {
                        PlatformDependent.putObject(
                                unwrappedSelector, selectedKeysFieldOffset, selectedKeySet);
                        PlatformDependent.putObject(
                                unwrappedSelector, publicSelectedKeysFieldOffset, selectedKeySet);
                        return null;
                    }
                    // We could not retrieve the offset, lets try reflection as last-resort.
                }
                //...
                selectedKeysField.set(unwrappedSelector, selectedKeySet);
                publicSelectedKeysField.set(unwrappedSelector, selectedKeySet);
                return null;
            } catch (NoSuchFieldException | IllegalAccessException e) {
                return e;
            }
        }
    });
    //...
    selectedKeys = selectedKeySet;
}

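processSelectedKeys()#

Once a new connection is detected (OP_ACCEPT), the code leaves the blocking select loop and calls processSelectedKeys() to handle the selectedKeys.

There are two ways to iterate over selectedKeys:

Optimized: iterate over the array-backed set that was installed by reflection when the Selector was created
Unoptimized: iterate over the original HashSet with its iterator

Every iteration with an iterator allocates a new object, which can cause frequent GC, so by default Netty generally uses the optimized path.

This optimization is unconventional: if a future JDK changes these internals, the reflection may stop working. For now the benefit far outweighs the risk. It is a trade-off that breaks encapsulation in exchange for maximum performance.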
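The trick can be tried on a toy class without touching JDK internals. In the sketch below, KeyHolder stands in for a library class with a private HashSet field, and ArrayKeySet plays the role of the array-backed replacement. All three classes are hypothetical and only illustrate the technique.

```java
import java.lang.reflect.Field;
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

// Stand-in for a library class that keeps ready keys in a private HashSet (hypothetical).
final class KeyHolder {
    private Set<String> selectedKeys = new HashSet<>();

    void markReady(String key) {
        selectedKeys.add(key);
    }
}

// Append-only set backed by a plain array. It skips duplicate checks, which this sketch
// assumes is acceptable because each key is added at most once per round.
final class ArrayKeySet extends AbstractSet<String> {
    String[] keys = new String[4];
    int size;

    @Override
    public boolean add(String key) {
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, size * 2); // grow by doubling
        }
        keys[size++] = key;
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int i;

            @Override
            public boolean hasNext() {
                return i < size;
            }

            @Override
            public String next() {
                if (i >= size) {
                    throw new NoSuchElementException();
                }
                return keys[i++];
            }
        };
    }

    // Clears the entries so the referenced objects can be garbage collected.
    void reset() {
        Arrays.fill(keys, 0, size, null);
        size = 0;
    }
}

final class ReflectionSwap {
    // Replaces the private field by reflection and returns the installed set.
    static ArrayKeySet install(KeyHolder holder) {
        try {
            ArrayKeySet replacement = new ArrayKeySet();
            Field field = KeyHolder.class.getDeclaredField("selectedKeys");
            field.setAccessible(true);
            field.set(holder, replacement);
            return replacement;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyHolder holder = new KeyHolder();
        ArrayKeySet keys = install(holder);
        holder.markReady("accept");
        // Index-based iteration: no iterator object is allocated.
        for (int i = 0; i < keys.size; i++) {
            System.out.println(keys.keys[i]);
        }
    }
}
```

After the swap the holder writes into the array without knowing it, and the caller reads the same array by index.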

public int run(IoHandlerContext context) {
  //...
  select(context, wakenUp.getAndSet(false));
  // Process the selected keys
  handled = processSelectedKeys();
  //...
  return handled;
}

private int processSelectedKeys() {
    // selectedKeys is non-null when the key set was replaced by reflection at selector creation
    if (selectedKeys != null) {
        // Optimized path: array
        return processSelectedKeysOptimized();
    } else {
        // Unoptimized path: iterator
        return processSelectedKeysPlain(selector.selectedKeys());
    }
}

private int processSelectedKeysOptimized() {
  int handled = 0;
  for (int i = 0; i < selectedKeys.size; ++i) {
    final SelectionKey k = selectedKeys.keys[i];
    // null out entry in the array to allow to have it GC'ed once the Channel close
    // See https://github.com/netty/netty/issues/2363
    // (a key that stays in the array after its Channel is closed would block GC)
    selectedKeys.keys[i] = null;
    // Process the key
    processSelectedKey(k);
    ++handled;

    if (needsToSelectAgain) {
      // null out entries in the array to allow to have it GC'ed once the Channel close
      // See https://github.com/netty/netty/issues/2363
      // (clear the remaining entries, which will not be visited in this pass)
      selectedKeys.reset(i + 1);

      selectAgain();
      i = -1;
    }
  }
  return handled;
}

private void processSelectedKey(SelectionKey k) {
  // Get the IoRegistration from the SelectionKey
  final DefaultNioRegistration registration = (DefaultNioRegistration) k.attachment();
  if (!registration.isValid()) {
    try {
      registration.handle.close();
    } catch (Exception e) {
      logger.debug("Exception during closing " + registration.handle, e);
    }
    return;
  }
  // Pass the key's ready ops to the handle of the IoRegistration, which checks the event mask
  // and dispatches to the corresponding Channel
  registration.handle(k.readyOps());
}

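Finally the key's ready ops are handed to the IoHandle of the IoRegistration, which picks the corresponding Channel to handle the event. A bit mask is used to check whether OP_ACCEPT is among the ready ops, and if so read() is called.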
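The mask test can be reproduced with the plain JDK constants from SelectionKey. ReadyOps is a hypothetical helper. It uses raw int masks rather than Netty's own ops type, and it only shows the "any bit in common" check.

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper: decide whether a set of ready ops should trigger read().
final class ReadyOps {
    // Each operation is one bit, so a group of operations is the bitwise OR of its members.
    static final int READ_OR_ACCEPT = SelectionKey.OP_READ | SelectionKey.OP_ACCEPT;

    // True when OP_READ or OP_ACCEPT is ready. A ready set of 0 is also treated as
    // readable, mirroring the workaround mentioned in the handle() listing.
    static boolean shouldRead(int readyOps) {
        return (readyOps & READ_OR_ACCEPT) != 0 || readyOps == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldRead(SelectionKey.OP_ACCEPT));
        System.out.println(shouldRead(SelectionKey.OP_WRITE));
    }
}
```

A single AND against the combined mask answers "is any of these events ready" without a chain of comparisons.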

NioUnsafe.read()#

read() has two implementations here:

For accepting connections (OP_ACCEPT) it is NioMessageUnsafe in AbstractNioMessageChannel
For subsequent OP_READ events it is NioByteUnsafe in AbstractNioByteChannel

In the io.netty.example.echo.EchoServer example from the previous article, the Channel configured on ServerBootstrap is NioServerSocketChannel.class

ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
 .channel(NioServerSocketChannel.class)
//...

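Let us look at its inheritance hierarchy.

NioServerSocketChannel extends AbstractNioMessageChannel. As a reminder, during registration DefaultNioRegistration associates the unsafe instance of AbstractNioMessageChannel with the handle, and OP_ACCEPT is finally registered with the Selector.

So when an OP_ACCEPT event arrives here, the read() that runs is the NioMessageUnsafe implementation in AbstractNioMessageChannel.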

@Override
public void handle(IoRegistration registration, IoEvent event) {
  try {
    NioIoEvent nioEvent = (NioIoEvent) event;
    // Get the ops mask of the current event
    NioIoOps nioReadyOps = nioEvent.ops();
    // We first need to call finishConnect() before try to trigger a read(...) or write(...) as otherwise
    // the NIO JDK channel implementation may throw a NotYetConnectedException.
    if (nioReadyOps.contains(NioIoOps.CONNECT)) {
      // remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
      // See https://github.com/netty/netty/issues/924
      removeAndSubmit(NioIoOps.CONNECT);

      unsafe().finishConnect();
    }

    // Process OP_WRITE first as we may be able to write some queued buffers and so free memory.
    if (nioReadyOps.contains(NioIoOps.WRITE)) {
      // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to
      // write
      forceFlush();
    }

    // Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
    // to a spin loop
    // Bitwise check: is the current event mask part of the read/accept set?
    if (nioReadyOps.contains(NioIoOps.READ_AND_ACCEPT) || nioReadyOps.equals(NioIoOps.NONE)) {
      // Two implementations: AbstractNioMessageChannel for OP_ACCEPT (new connections),
      // AbstractNioByteChannel for subsequent OP_READ events
      read();
    }
  } catch (CancelledKeyException ignored) {
    close(voidPromise());
  }
}

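In the read() method of AbstractNioMessageChannel, the RecvByteBufAllocator handle is obtained and reset first, and then doReadMessages() is called.

The doReadMessages() implementation treats a new connection as a message: it accepts the connection as a JDK SocketChannel, wraps it in a NioSocketChannel whose read interest is OP_READ, and adds it to readBuf.

Finally readBuf is traversed and fireChannelRead() is called for each entry to propagate the read event.

RecvByteBufAllocator

A buffer size predictor: it estimates the capacity of the next ByteBuf to allocate, which improves performance and saves memory.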
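To give a feel for what "predicting the next buffer size" means, here is a deliberately simplified predictor. AdaptiveSizeGuess is a hypothetical class, and its rule (double after a full read, halve after two small reads in a row) is an assumption made for illustration. It should not be read as the algorithm Netty ships.

```java
// Hypothetical, simplified receive-buffer size predictor.
final class AdaptiveSizeGuess {
    private final int min;
    private final int max;
    private int next;
    private boolean shrinkOnNextSmallRead;

    AdaptiveSizeGuess(int min, int initial, int max) {
        if (min <= 0 || initial < min || max < initial) {
            throw new IllegalArgumentException("require 0 < min <= initial <= max");
        }
        this.min = min;
        this.next = initial;
        this.max = max;
    }

    // Capacity to allocate for the next read.
    int guess() {
        return next;
    }

    // Feed back how many bytes the last read actually produced.
    void record(int actualBytes) {
        if (actualBytes >= next) {
            // The buffer was filled completely: more data is probably waiting, so grow at once.
            next = (int) Math.min((long) max, (long) next * 2);
            shrinkOnNextSmallRead = false;
        } else if (actualBytes <= next / 2) {
            // Shrink only after two small reads in a row, to avoid oscillating.
            if (shrinkOnNextSmallRead) {
                next = Math.max(min, next / 2);
                shrinkOnNextSmallRead = false;
            } else {
                shrinkOnNextSmallRead = true;
            }
        }
    }

    public static void main(String[] args) {
        AdaptiveSizeGuess guess = new AdaptiveSizeGuess(64, 1024, 65536);
        guess.record(1024); // buffer filled
        System.out.println("after a full read: " + guess.guess());
        guess.record(100);
        guess.record(100);  // second small read in a row
        System.out.println("after two small reads: " + guess.guess());
    }
}
```

Growing eagerly and shrinking cautiously favors throughput: an undersized buffer costs extra read calls, while an oversized one only costs some memory for a short time.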

@Override
public void read() {
  assert eventLoop().inEventLoop();
  final ChannelConfig config = config();
  final ChannelPipeline pipeline = pipeline();
  // Obtain the handle of the receive-buffer size predictor
  final RecvByteBufAllocator.Handle allocHandle = unsafe().recvBufAllocHandle();
  // Reset the predictor's state for this read loop
  allocHandle.reset(config);

  boolean closed = false;
  Throwable exception = null;
  try {
    try {
      do {
        // On the server side this creates a channel and adds it to readBuf
        int localRead = doReadMessages(readBuf);
        if (localRead == 0) {
          break;
        }
        if (localRead < 0) {
          closed = true;
          break;
        }
        // Count the messages read
        allocHandle.incMessagesRead(localRead);
      } while (continueReading(allocHandle));
    } catch (Throwable t) {
      exception = t;
    }

    int size = readBuf.size();
    for (int i = 0; i < size; i ++) {
      // Clear the flag that marks a read as pending
      readPending = false;
      // Propagate the read event
      pipeline.fireChannelRead(readBuf.get(i));
    }
    readBuf.clear();
    // Read loop finished: let the predictor adjust its next guess based on what was read
    allocHandle.readComplete();
    // Propagate the readComplete event
    pipeline.fireChannelReadComplete();

    if (exception != null) {
      closed = closeOnReadError(exception);

      pipeline.fireExceptionCaught(exception);
    }

    if (closed) {
      inputShutdown = true;
      if (isOpen()) {
        close(voidPromise());
      }
    }
  } finally {
    if (!readPending && !config.isAutoRead()) {
      removeReadOp();
    }
  }
}

@Override
protected int doReadMessages(List<Object> buf) throws Exception {
    // Accept the connection through the JDK channel
    SocketChannel ch = SocketUtils.accept(javaChannel());

    try {
        if (ch != null) {
            // Wrap the accepted socket in a Netty channel
            buf.add(new NioSocketChannel(this, ch));
            return 1;
        }
    } catch (Throwable t) {
        logger.warn("Failed to create a new channel from an accepted socket.", t);

        try {
            ch.close();
        } catch (Throwable t2) {
            logger.warn("Failed to close a socket.", t2);
        }
    }

    return 0;
}

protected AbstractNioByteChannel(Channel parent, SelectableChannel ch) {
    // Use OP_READ as the read interest op
    super(parent, ch, SelectionKey.OP_READ);
}

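Eventually, in the channelRead() method, a worker thread of the childGroup is chosen with a round-robin load-balancing algorithm to register the Channel, and a listener is added that closes the Channel if registration fails.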

@SuppressWarnings("unchecked")
public void channelRead(ChannelHandlerContext ctx, Object msg) {
  final Channel child = (Channel) msg;
  // Add the childHandler
  child.pipeline().addLast(childHandler);

  setChannelOptions(child, childOptions, logger);
  setAttributes(child, childAttrs);

  //...

  try {
    // Pick a worker thread to register the channel, and add a listener that
    // force-closes the child if registration fails
    childGroup.register(child).addListener(new ChannelFutureListener() {
      @Override
      public void operationComplete(ChannelFuture future) throws Exception {
        if (!future.isSuccess()) {
          forceClose(child, future.cause());
        }
      }
    });
  } catch (Throwable t) {
    forceClose(child, t);
  }
}

@Override
public ChannelFuture register(Channel channel) {
    return next().register(channel);
}

@Override
public EventExecutor next() {
    // Round-robin selection of a worker thread. "-" binds tighter than "&", so this is
    // idx & (length - 1), which equals idx % length when length is a power of two.
    return executors[idx.getAndIncrement() & executors.length - 1];
}
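The index expression in next() only works as a modulo when the number of executors is a power of two. The sketch below shows that case next to a general fallback. RoundRobin is a hypothetical class, and using Math.floorMod for the fallback is this sketch's own choice.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin index chooser.
final class RoundRobin {
    private final AtomicInteger idx = new AtomicInteger();
    private final int length;

    RoundRobin(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        this.length = length;
    }

    // A positive int is a power of two when it has exactly one bit set.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & -n) == n;
    }

    // Returns indices 0, 1, ..., length - 1, 0, 1, ... across successive calls.
    int next() {
        int i = idx.getAndIncrement();
        if (isPowerOfTwo(length)) {
            // Masking with (length - 1) keeps the low bits, which equals i % length here
            // and stays non-negative even after the counter overflows.
            return i & (length - 1);
        }
        // General case: floorMod keeps the result non-negative for negative counters.
        return Math.floorMod(i, length);
    }

    public static void main(String[] args) {
        RoundRobin chooser = new RoundRobin(4);
        for (int i = 0; i < 6; i++) {
            System.out.print(chooser.next() + " ");
        }
        System.out.println();
    }
}
```

The mask avoids a division on every call, which matters little for a single call but adds up on a hot path that runs for every accepted connection.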

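The register logic here is the same as in the previous article.

PS: one reminder. The registration flow creates a new IoRegistration, associates it with the unsafe instance of NioSocketChannel, and attaches itself to the key as its attachment.

Both paths end up in register0(). The main differences are:

The ChannelInitializer of the childHandler is initialized
isActive() is true

So fireChannelActive() is called directly to propagate the active event.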

private void register0(ChannelPromise promise) {
  //...
  // Initialize the ChannelInitializer of the childHandler
  pipeline.invokeHandlerAddedIfNeeded();
  // Mark the promise as successful and notify its listeners
  safeSetSuccess(promise);
  // Propagate the event: calls channelRegistered on each ChannelHandlerContext in turn
  pipeline.fireChannelRegistered();

  if (isActive()) {
    if (firstRegistration) {
      pipeline.fireChannelActive();
    } else if (config().isAutoRead()) {
      beginRead();
    }
  }
  //...

  // Register the IoHandle with the IoHandler (and thus the selector) through IoRegistration
  doRegister(registerPromise);
}

As mentioned in the previous article, fireChannelActive() calls readIfIsAutoRead(), which starts an outbound read event.

That eventually reaches doBeginRead(), which registers the OP_READ interest with the Selector.

protected void doBeginRead() throws Exception {
  // Channel.read() or ChannelHandlerContext.read() was called
  IoRegistration registration = this.registration;
  if (registration == null || !registration.isValid()) {
    return;
  }

  readPending = true;

  addAndSubmit(readOps);
}

protected void addAndSubmit(NioIoOps addOps) {
    int interestOps = selectionKey().interestOps();
    if (!addOps.isIncludedIn(interestOps)) {
        try {
            // Register the OP_READ interest with the selector
            registration().submit(NioIoOps.valueOf(interestOps).with(addOps));
        } catch (Exception e) {
            throw new ChannelException(e);
        }
    }
}
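The "add an op only if it is missing" step is plain bit arithmetic on the interest set. The sketch below uses raw int masks with the JDK's SelectionKey constants. InterestOps and its methods are hypothetical and only mirror the idea of addAndSubmit().

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper for updating an interest-ops bit mask.
final class InterestOps {
    // True when every bit of ops is already present in interestOps.
    static boolean isIncludedIn(int ops, int interestOps) {
        return (interestOps & ops) == ops;
    }

    // Returns the interest set with ops added. Other bits are left untouched.
    static int with(int interestOps, int ops) {
        return interestOps | ops;
    }

    // Returns the new interest set, or the same value when nothing has to change,
    // in which case a real implementation could skip the update entirely.
    static int add(int interestOps, int ops) {
        return isIncludedIn(ops, interestOps) ? interestOps : with(interestOps, ops);
    }

    public static void main(String[] args) {
        int interest = SelectionKey.OP_WRITE;
        interest = add(interest, SelectionKey.OP_READ);
        System.out.println("read registered: " + isIncludedIn(SelectionKey.OP_READ, interest));
        System.out.println("write kept: " + isIncludedIn(SelectionKey.OP_WRITE, interest));
    }
}
```

Checking first avoids submitting an update that would leave the interest set unchanged.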

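Finally fireChannelReadComplete() is called to propagate the readComplete event. At this point the server has finished establishing the connection, and the next loop iteration goes back to listening for new connections.

From then on, OP_READ events are monitored on the worker threads of the childGroup.

This shows Netty's main/sub Reactor threading model:

Main Reactor (boss thread): listens for OP_ACCEPT events
Sub Reactor (worker threads): registers and handles OP_READ / OP_WRITE events

Reactor threading model

A thread that repeatedly calls the Selector's select() method (I/O multiplexing) to wait for events acts as a Reactor.

In essence it is I/O multiplexing plus a thread pool.

When an OP_READ event arrives the flow is much the same. The only differences are the IoRegistration obtained from the key and the Channel that the IoHandle dispatches to: the message is read by the read() method of AbstractNioByteChannel.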

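Summary#

Optimizations:

I/O multiplexing: the loop calls selector.select(timeoutMillis) to block the thread and wait for I/O events, and the thread is woken up when an EventLoop task needs to run
The JDK empty-polling bug is avoided by counting empty polls and rebuilding the Selector
When I/O events arrive, the selectedKeys data structure that was replaced by reflection at Selector creation makes iterating over the ready keys more efficient
Event types are filtered with bit masks
RecvByteBufAllocator predicts the capacity of the next ByteBuf to allocate, which improves efficiency

Flow:

The EventLoop thread of the bossGroup keeps calling select(timeoutMillis) in a loop to block and wait for I/O events, and is woken up when an EventLoop task needs to run
When a new connection is established, the server sees a key with the OP_ACCEPT event, obtains the corresponding IoRegistration from the key, and has its IoHandle run the read() method of AbstractNioMessageChannel to handle the message
The new connection is treated as a message: the accepted JDK SocketChannel is wrapped in a NioSocketChannel with OP_READ as its read interest and passed down the ChannelPipeline as an inbound read event
A worker thread of the childGroup is chosen by round-robin load balancing to register the NioSocketChannel
On the worker thread an outbound read event is propagated, which completes the registration of OP_READ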