#1, posted 2011-04-13 21:30

Last edited by lofeng410 on 2011-04-19 08:09

struct napi_struct {
    struct list_head    poll_list;

    unsigned long       state;
    int                 weight;
    int                 (*poll)(struct napi_struct *, int);
#ifdef CONFIG_NETPOLL
    spinlock_t          poll_lock;
    int                 poll_owner;
#endif

    unsigned int        gro_count;

    struct net_device   *dev;
    struct list_head    dev_list;
    struct sk_buff      *gro_list;
    struct sk_buff      *skb;
};

struct net_device {
    /* ... */
    struct list_head    dev_list;
    struct list_head    napi_list;
    /* ... (remaining members not quoted in the post) */
};

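In these two structures, how are the four struct list_head members (highlighted in red in the original post) related to one another?

1.
netif_napi_add() contains this operation:
list_add(&napi->dev_list, &dev->napi_list);
Does this mean that each network port may have more than one struct napi_struct instance?

2.
Of the four struct list_head members, only poll_list seems to be of any use (net_rx_action walks that list); the others look unused. Is that really the case?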

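3.
void __napi_schedule(struct napi_struct *n)
{
    unsigned long flags;

    local_irq_save(flags);
    list_add_tail(&n->poll_list, &__get_cpu_var(softnet_data).poll_list);
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    local_irq_restore(flags);
}

static void net_rx_action(struct softirq_action *h)
{
    struct list_head *list = &__get_cpu_var(softnet_data).poll_list;
    /* ... (rest of the function not quoted in the post) */
}

Judging from these two places, processing always goes through &__get_cpu_var(softnet_data).poll_list, as if each CPU's poll queue belonged to a single network port, which would mean that a CPU is dedicated to one port. But nothing forces each port's interrupts to be delivered only to one fixed CPU, so how should this be understood?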

lofeng410

#2, posted 2011-04-19 08:10

The kernel version I am using is 2.6.32.12.

独孤九贱

#3, posted 2011-04-19 09:24

The kernel reworked the NAPI interface and introduced the napi_struct structure precisely to support multiple queues: each queue, rather than each NIC as before, gets its own napi structure. For example, igb has:

for (i = 0; i < adapter->num_rx_queues; i++) {
    struct igb_ring *ring = &(adapter->rx_ring[i]);
    ring->count = adapter->rx_ring_count;
    ring->adapter = adapter;
    ring->queue_index = i;
    ring->itr_register = E1000_ITR;

    /* set a default napi handler for each rx_ring */
    netif_napi_add(adapter->netdev, &ring->napi, igb_poll, 64);
}

That is, a napi instance is registered for every rx queue...
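The per-queue arrangement can be pictured with a small userspace sketch. It is not igb code: adapter_model, rx_ring_model, napi_model, ring_poll and adapter_init are made-up names, and the weight of 64 is simply borrowed from the snippet above. It only illustrates each rx ring embedding its own NAPI object, with the poll callback recovering its ring through container_of.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copy of the usual container_of idiom. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define MAX_RX_QUEUES 4

/* Hypothetical, trimmed-down stand-in for struct napi_struct. */
struct napi_model {
    int weight;
    int (*poll)(struct napi_model *, int);
};

/* Hypothetical rx ring: the NAPI object is embedded, one per queue. */
struct rx_ring_model {
    int queue_index;
    int pending;                /* packets waiting in this ring */
    struct napi_model napi;
};

struct adapter_model {
    int num_rx_queues;
    struct rx_ring_model rx_ring[MAX_RX_QUEUES];
};

/* Poll callback: find the ring that owns this NAPI object, then
 * consume at most 'budget' packets from that ring only. */
static int ring_poll(struct napi_model *napi, int budget)
{
    struct rx_ring_model *ring =
        container_of(napi, struct rx_ring_model, napi);
    int done = ring->pending < budget ? ring->pending : budget;

    ring->pending -= done;
    return done;
}

/* One NAPI instance per rx queue, mirroring the loop in the post. */
static void adapter_init(struct adapter_model *adapter, int num_rx_queues)
{
    assert(num_rx_queues > 0 && num_rx_queues <= MAX_RX_QUEUES);
    adapter->num_rx_queues = num_rx_queues;
    for (int i = 0; i < num_rx_queues; i++) {
        struct rx_ring_model *ring = &adapter->rx_ring[i];

        ring->queue_index = i;
        ring->pending = 0;
        ring->napi.weight = 64;
        ring->napi.poll = ring_poll;
    }
}
```

Polling the NAPI object of one ring leaves the other rings untouched, which is why a multi-queue NIC ends up with several napi instances hanging off one net_device.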

lofeng410

#4, posted 2011-04-19 23:44

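Reply to #3 独孤九贱

Thank you very much!

One more question, if I may:
How do you keep track of changes to the Linux network stack? It seems to change a great deal from one kernel version to the next. I am reading Understanding Linux Network Internals (《深入理解LINUX网络技术内幕》) against the 2.6.32 source, and I am rather lost.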

wangjl_sdu

#5, posted 2011-04-20 04:45

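1. Does this mean that each network port may have more than one struct napi_struct instance?

Yes. Under NAPI the NIC driver configures and manages its own napi_struct instances. If the driver wants to, it can create several napi_struct instances and add them to the net_device's napi_list, which is exactly what list_add(&napi->dev_list, &dev->napi_list); does.

2.
Of the four struct list_head members, only poll_list seems to be of any use (net_rx_action walks that list); the others look unused. Is that really the case?

No, all four are used.
net_device.napi_list should be the head of the list of napi_struct instances attached to this NIC.
net_device.dev_list should be the node that links this NIC instance into another list (in 2.6.32, the per-namespace list of all net devices, net->dev_base_head).
napi_struct.dev_list is the node that links a napi_struct instance into the net_device.napi_list list.
napi_struct.poll_list is the node that links the instance into softnet_data.poll_list.
softnet_data.poll_list is the head of the per-CPU list of napi_struct instances that the CPU has to process.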
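The relationships listed above can be wired together in a small userspace sketch. Everything here is a stand-in: list_head is re-implemented minimally, and net_device_model, napi_model, softnet_model, model_napi_add and demo_lists are hypothetical names, not kernel definitions. It shows two NAPI instances hanging off one device's napi_list through their dev_list nodes, while only the instance that has work is queued on a CPU's poll_list through its poll_list node.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

/* Insert 'entry' right after 'head', as the kernel's list_add() does. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Insert 'entry' right before 'head', i.e. at the tail. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static int list_len(const struct list_head *head)
{
    int n = 0;

    for (const struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Trimmed-down models: only the list-related members are kept. */
struct net_device_model {
    struct list_head dev_list;   /* node: device -> list of all devices */
    struct list_head napi_list;  /* head: NAPI instances of this device */
};

struct napi_model {
    struct list_head poll_list;  /* node: instance -> a CPU's poll list */
    struct list_head dev_list;   /* node: instance -> dev->napi_list */
    struct net_device_model *dev;
};

struct softnet_model {
    struct list_head poll_list;  /* head: per-CPU list of scheduled NAPIs */
};

/* Mirrors the list_add() call quoted from netif_napi_add(). */
static void model_napi_add(struct net_device_model *dev, struct napi_model *napi)
{
    init_list_head(&napi->poll_list);
    napi->dev = dev;
    list_add(&napi->dev_list, &dev->napi_list);
}

/* One device, two NAPI instances, one of them scheduled on a CPU. */
static void demo_lists(int *napi_per_dev, int *queued_on_cpu)
{
    struct net_device_model dev;
    struct napi_model rxq0, rxq1;
    struct softnet_model cpu0;

    init_list_head(&dev.dev_list);
    init_list_head(&dev.napi_list);
    init_list_head(&cpu0.poll_list);

    model_napi_add(&dev, &rxq0);
    model_napi_add(&dev, &rxq1);

    /* Only rxq1 has work pending: queue it on the CPU's poll list. */
    list_add_tail(&rxq1.poll_list, &cpu0.poll_list);

    assert(rxq0.dev == &dev && rxq1.dev == &dev);
    *napi_per_dev = list_len(&dev.napi_list);
    *queued_on_cpu = list_len(&cpu0.poll_list);
}
```

In this model demo_lists() reports two instances on the device's napi_list and one on the CPU's poll list: membership in the first list is permanent for the life of the instance, membership in the second lasts only while the instance is scheduled.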

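3. Judging from these two places, processing always goes through &__get_cpu_var(softnet_data).poll_list, as if each CPU's poll queue belonged to a single network port, which would mean that a CPU is dedicated to one port. But nothing forces each port's interrupts to be delivered only to one fixed CPU, so how should this be understood?

There is a slight misunderstanding here. The basic NAPI flow is as follows. When the NIC receives data it raises an interrupt, and some CPU services it; call that CPU CPUA. CPUA runs the interrupt handler, which takes the device's napi_struct instance (registered earlier by the driver), disables the NIC's interrupts, calls napi_schedule() to link the instance onto CPUA's softnet_data.poll_list, raises the softirq, and returns. The network softirq then runs and receives the NIC's packets in bulk by polling (net_rx_action).

The softirq is still handled by CPUA, because a softirq can be run in two ways. One is on interrupt exit, which is necessarily on the same CPU as the interrupt handler. The other is the ksoftirqd kernel thread, and every CPU has its own ksoftirqd and its own pending softirqs. Either way, the napi_struct that CPUA queued in its interrupt handler is processed by CPUA itself, so using __get_cpu_var(softnet_data).poll_list in the softirq is fine.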
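The point that a NAPI instance is queued on whichever CPU took the interrupt, rather than on a fixed CPU, can be sketched in userspace as follows. This is a simplified model with hypothetical names (napi_model, softnet_model, model_napi_schedule, model_net_rx_action); the boolean sched stands in for the NAPI_STATE_SCHED bit, and every poll is assumed to finish in one pass.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUED 8

struct napi_model {
    bool sched;                 /* stands in for NAPI_STATE_SCHED */
};

/* Hypothetical per-CPU state: just an array-backed poll list. */
struct softnet_model {
    struct napi_model *poll_list[MAX_QUEUED];
    int len;
};

/* Models napi_schedule() called from the interrupt handler running on
 * 'this_cpu': the instance is queued on that CPU only, and only if it
 * is not already scheduled somewhere. */
static bool model_napi_schedule(struct softnet_model *this_cpu,
                                struct napi_model *n)
{
    if (n->sched)
        return false;           /* already on some CPU's list */
    n->sched = true;
    assert(this_cpu->len < MAX_QUEUED);
    this_cpu->poll_list[this_cpu->len++] = n;
    return true;
}

/* Models the softirq on 'this_cpu' draining its own list. Each poll is
 * assumed to complete, which clears the scheduled state. Returns the
 * number of instances polled. */
static int model_net_rx_action(struct softnet_model *this_cpu)
{
    int polled = 0;

    for (int i = 0; i < this_cpu->len; i++) {
        this_cpu->poll_list[i]->sched = false;
        polled++;
    }
    this_cpu->len = 0;
    return polled;
}
```

In this model the same instance can be polled by CPU0 for one burst and by CPU1 for the next, but it is never on two lists at once, so each CPU only ever needs to look at its own poll list.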

迟到千年

#6, posted 2024-08-30 16:32

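Linux packet receive path and the NAPI mechanism

I have only just started with Linux, mainly on the driver side, and I am currently reading how a NIC receives packets, including the NAPI receive path. What confuses me is how napi structures map to devices: is it one napi_struct per NIC, or one napi_struct per queue?
Also, in do_IRQ:
is the purpose of irq_enter() and irq_exit() simply to enter and leave hard-interrupt context? (Once in hard-interrupt context, other interrupts cannot preempt? And do newer Linux versions disable kernel preemption there?)

unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);

    /* high bit used in ret_from_ code  */
    unsigned vector = ~regs->orig_ax;
    unsigned irq;

    exit_idle();
    irq_enter();
    irq = __this_cpu_read(vector_irq[vector]);
    /* intermediate code omitted */
    irq_exit();
    set_irq_regs(old_regs);
    return 1;
}

In the softirq handler net_rx_action:
what are the several local_irq_disable and local_irq_enable calls for? The book I am reading says these two functions disable/enable the delivery of interrupts on the local CPU. Do they affect the NIC's interrupt?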
static void net_rx_action(struct softirq_action *h)
{
    struct softnet_data *sd = &__get_cpu_var(softnet_data);
    unsigned long time_limit = jiffies + 2;
    int budget = netdev_budget;
    void *have;

    local_irq_disable();

    while (!list_empty(&sd->poll_list)) {
        struct napi_struct *n;
        int work, weight;

        if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
            goto softnet_break;

        local_irq_enable();

        n = list_first_entry(&sd->poll_list, struct napi_struct, poll_list);

        have = netpoll_poll_lock(n);

        weight = n->weight;
        work = 0;
        if (test_bit(NAPI_STATE_SCHED, &n->state)) {
            work = n->poll(n, weight);
            trace_napi_poll(n);
        }

        WARN_ON_ONCE(work > weight);

        budget -= work;

        local_irq_disable();
        if (unlikely(work == weight)) {
            if (unlikely(napi_disable_pending(n))) {
                local_irq_enable();
                napi_complete(n);
                local_irq_disable();
            } else
                list_move_tail(&n->poll_list, &sd->poll_list);
        }

        netpoll_poll_unlock(have);
    }
out:
    net_rps_action_and_irq_enable(sd);
#ifdef CONFIG_NET_DMA
    dma_issue_pending_all();
#endif
    return;

softnet_break:
    sd->time_squeeze++;
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    goto out;
}

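Also, in the poll function that e1000 registers for the softirq, why is it that when a napi's work < weight, the napi is removed and the NIC interrupt can then be re-enabled?
Wouldn't the remaining napi structures then be processed with interrupts enabled? Or does one napi structure correspond to exactly one NIC?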
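One way to reason about this is a userspace model of the loop, given here as a rough sketch rather than kernel code. The names napi_model, softnet_model, model_irq, model_poll and model_net_rx_action are hypothetical, and the model assumes that each NAPI instance has its own device-level interrupt enable. Under that assumption, finishing one instance (work < weight) re-enables only that instance's interrupt, while the remaining instances stay on the list and keep being polled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NAPI 8

struct napi_model {
    int  weight;        /* per-poll quota */
    int  pending;       /* packets waiting in this device's rx ring */
    bool irq_enabled;   /* this device's own interrupt enable */
    bool scheduled;     /* still in polling mode */
};

struct softnet_model {
    struct napi_model *poll_list[MAX_NAPI];
    int len;
};

/* Models the interrupt handler: mask this device and queue its NAPI. */
static void model_irq(struct softnet_model *sd, struct napi_model *n)
{
    assert(sd->len < MAX_NAPI);
    n->irq_enabled = false;
    n->scheduled = true;
    sd->poll_list[sd->len++] = n;
}

/* Models a driver poll function: if the budget is not used up, leave
 * polling mode and re-enable the interrupt of THIS device only. */
static int model_poll(struct napi_model *n, int budget)
{
    int work = n->pending < budget ? n->pending : budget;

    n->pending -= work;
    if (work < budget) {
        n->scheduled = false;   /* role of napi_complete() */
        n->irq_enabled = true;  /* role of the driver's irq enable */
    }
    return work;
}

/* Models the loop in net_rx_action(); returns packets processed. */
static int model_net_rx_action(struct softnet_model *sd, int budget)
{
    int total = 0;

    while (sd->len > 0 && budget > 0) {
        struct napi_model *n = sd->poll_list[0];
        int work = model_poll(n, n->weight);

        budget -= work;
        total += work;

        /* Take the head off the list ... */
        for (int i = 1; i < sd->len; i++)
            sd->poll_list[i - 1] = sd->poll_list[i];
        sd->len--;

        /* ... and put it back at the tail if it used its whole weight. */
        if (n->scheduled)
            sd->poll_list[sd->len++] = n;
    }
    return total;
}
```

With two devices queued, a lightly loaded one and a busy one, the model shows the first leaving the list with its interrupt back on while the second stays masked and keeps getting polled until it too runs dry or the budget is exhausted.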

迟到千年

#7, posted 2024-08-30 16:40

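Reply to #5 wangjl_sdu

Hello! I am not sure whether you will still see this question. Do you mean that one NIC can have several napi_struct instances, and that each napi_struct gets linked onto the CPU's softnet_data? Then, for the Intel e1000 NIC for example, once a napi_struct has been fully processed (that is, work < weight), that napi_struct is removed and the NIC interrupt is re-enabled? What happens to the napi_structs further down the list that have not been processed yet (assuming the softirq still has budget and time left)? If interrupts are re-enabled after a single napi structure has been handled, where does the NAPI polling come in? Here is the poll function that e1000 registers: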
static int e1000_clean(struct napi_struct *napi, int budget)
{
    struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi);
    int tx_clean_complete = 0, work_done = 0;

    tx_clean_complete = e1000_clean_tx_irq(adapter, &adapter->tx_ring[0]);

    adapter->clean_rx(adapter, &adapter->rx_ring[0], &work_done, budget);

    if (!tx_clean_complete)
        work_done = budget;

    /* If budget not fully consumed, exit the polling mode */
    if (work_done < budget) {
        if (likely(adapter->itr_setting & 3))
            e1000_set_itr(adapter);
        napi_complete(napi);
        if (!test_bit(__E1000_DOWN, &adapter->flags))
            e1000_irq_enable(adapter);
    }

    return work_done;
}
Finally, in net_rx_action the local_irq_disable and local_irq_enable calls switch the local CPU's interrupts off and on. Do those local interrupts include the NIC's receive interrupt?
static void net_rx_action(struct softirq_action *h)
{
    struct softnet_data *sd = &__get_cpu_var(softnet_data);
    unsigned long time_limit = jiffies + 2;
    int budget = netdev_budget;
    void *have;

    local_irq_disable();

    while (!list_empty(&sd->poll_list)) {
        struct napi_struct *n;
        int work, weight;

        /* If softirq window is exhausted then punt.
         * Allow this to run for 2 jiffies since which will allow
         * an average latency of 1.5/HZ.
         */
        if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
            goto softnet_break;

        local_irq_enable();

        /* Even though interrupts have been re-enabled, this
         * access is safe because interrupts can only add new
         * entries to the tail of this list, and only ->poll()
         * calls can remove this head entry from the list.
         */
        n = list_first_entry(&sd->poll_list, struct napi_struct, poll_list);

        have = netpoll_poll_lock(n);

        weight = n->weight;

        /* This NAPI_STATE_SCHED test is for avoiding a race
         * with netpoll's poll_napi().  Only the entity which
         * obtains the lock and sees NAPI_STATE_SCHED set will
         * actually make the ->poll() call.  Therefore we avoid
         * accidentally calling ->poll() when NAPI is not scheduled.
         */
        work = 0;
        if (test_bit(NAPI_STATE_SCHED, &n->state)) {
            work = n->poll(n, weight);
            trace_napi_poll(n);
        }

        WARN_ON_ONCE(work > weight);

        budget -= work;

        local_irq_disable();

        /* Drivers must not modify the NAPI state if they
         * consume the entire weight.  In such cases this code
         * still "owns" the NAPI instance and therefore can
         * move the instance around on the list at-will.
         */
        if (unlikely(work == weight)) {
            if (unlikely(napi_disable_pending(n))) {
                local_irq_enable();
                napi_complete(n);
                local_irq_disable();
            } else
                list_move_tail(&n->poll_list, &sd->poll_list);
        }

        netpoll_poll_unlock(have);
    }
out:
    net_rps_action_and_irq_enable(sd);

#ifdef CONFIG_NET_DMA
    /*
     * There may not be any more sk_buffs coming right now, so push
     * any pending DMA copies to hardware
     */
    dma_issue_pending_all();
#endif

    return;

softnet_break:
    sd->time_squeeze++;
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    goto out;
}
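The question about local_irq_disable() involves two separate switches, which a toy userspace model can keep apart. This is only a sketch with hypothetical names (cpu_model, nic_model, try_deliver, packet_arrives, poll_done, model_local_irq_disable, model_local_irq_enable). It assumes the NIC's interrupt is routed to this CPU, that a CPU-level disable defers delivery rather than dropping the interrupt, and that the device-level mask is a separate switch owned by the driver.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_model {
    bool irqs_enabled;   /* what local_irq_disable()/enable() toggles */
};

struct nic_model {
    bool irq_unmasked;   /* the device's own interrupt enable */
    bool irq_pending;    /* raised by the device, not yet serviced */
    int  ring_packets;   /* packets sitting in the rx ring */
    int  irqs_handled;   /* how many times the handler ran */
};

/* The handler runs only when the CPU accepts interrupts and the device
 * has one pending. As in NAPI, it masks the device afterwards. */
static void try_deliver(struct cpu_model *cpu, struct nic_model *nic)
{
    if (cpu->irqs_enabled && nic->irq_pending) {
        nic->irq_pending = false;
        nic->irq_unmasked = false;   /* device enters polling mode */
        nic->irqs_handled++;
    }
}

/* A packet lands in the ring; the device raises an interrupt only if
 * its own mask allows it. */
static void packet_arrives(struct cpu_model *cpu, struct nic_model *nic)
{
    nic->ring_packets++;
    if (nic->irq_unmasked)
        nic->irq_pending = true;
    try_deliver(cpu, nic);
}

static void model_local_irq_disable(struct cpu_model *cpu)
{
    cpu->irqs_enabled = false;
}

/* Re-enabling the CPU's interrupts lets a deferred one through. */
static void model_local_irq_enable(struct cpu_model *cpu, struct nic_model *nic)
{
    cpu->irqs_enabled = true;
    try_deliver(cpu, nic);
}

/* End of polling: ring drained, device interrupt unmasked again. */
static void poll_done(struct nic_model *nic)
{
    assert(!nic->irq_unmasked);      /* only valid in polling mode */
    nic->ring_packets = 0;
    nic->irq_unmasked = true;
}
```

In this model a packet that arrives while the CPU has interrupts off is not lost: its interrupt is taken as soon as interrupts are enabled again. A packet that arrives while the device is masked raises no interrupt at all and simply waits in the ring for the next poll.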


Some questions about the list-related members of the napi_struct and net_device data structures
