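Question about the size of the allocation-policy array
Last edited by _nosay on 2016-02-18 11:33
《Linux内核源代码情景分析》 (Linux Kernel Source Code Scenario Analysis), page 50:
When I first read this sentence, a question came up: a pglist_data structure, which represents one CPU node, has at most 3 zones, so there is no way to get 256 combinations out of them. Isn't defining the array that large a waste?
Later I "figured it out": the 256 entries do not have to point only at the node's own zones; they could also point at zones of other pglist_data structures.
But after reading build_zonelists() I doubted myself again. All 256 entries point at the 3 zones of pgdat, with a lot of repetition (in fact there are only 3 combinations: "highmem→normal→dma→null", "normal→dma→null", and "dma→null"). So once more I do not understand why node_zonelists[] is defined with 256 entries.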
linux-2.4.0, mm/page_alloc.c, lines 705~752
/*
 * Builds allocation fallback zone lists.
 */
static inline void build_zonelists(pg_data_t *pgdat)
{
    int i, j, k;

    for (i = 0; i < NR_GFPINDEX; i++) {
        zonelist_t *zonelist;
        zone_t *zone;

        zonelist = pgdat->node_zonelists + i;
        memset(zonelist, 0, sizeof(*zonelist));

        zonelist->gfp_mask = i;
        j = 0;
        k = ZONE_NORMAL;
        if (i & __GFP_HIGHMEM)
            k = ZONE_HIGHMEM;
        if (i & __GFP_DMA)
            k = ZONE_DMA;

        switch (k) {
            default:
                BUG();
            /*
             * fallthrough:
             */
            case ZONE_HIGHMEM:
                zone = pgdat->node_zones + ZONE_HIGHMEM;
                if (zone->size) {
#ifndef CONFIG_HIGHMEM
                    BUG();
#endif
                    zonelist->zones[j++] = zone;
                }
            case ZONE_NORMAL:
                zone = pgdat->node_zones + ZONE_NORMAL;
                if (zone->size)
                    zonelist->zones[j++] = zone;
            case ZONE_DMA:
                zone = pgdat->node_zones + ZONE_DMA;
                if (zone->size)
                    zonelist->zones[j++] = zone;
        }
        zonelist->zones[j++] = NULL;
    }
}
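Simulation of build_zonelists()
#include <stdio.h>

int main(void)
{
    int i, k;

    for (i = 0; i < 256; i++)       /* 256 == NR_GFPINDEX */
    {
        k = 1;                      /* ZONE_NORMAL is the default */
        if (i & 0x10)               /* __GFP_HIGHMEM */
            k = 2;
        if (i & 0x08)               /* __GFP_DMA, overrides highmem */
            k = 0;
        printf("zonelist[%d]: ", i);
        switch (k)                  /* the cases fall through on purpose */
        {
        case 2:
            printf("highmem ");
            /* fall through */
        case 1:
            printf("normal ");
            /* fall through */
        case 0:
            printf("dma ");
            /* fall through */
        default:
            break;
        }
        printf("\n");
    }
    return 0;
}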
Output (the 32-entry pattern below repeats eight times, up to zonelist[255]):
zonelist[0] to zonelist[7]: normal dma
zonelist[8] to zonelist[15]: dma
zonelist[16] to zonelist[23]: highmem normal dma
zonelist[24] to zonelist[31]: dma
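Last edited by _nosay on 2016-02-18 15:20
When pages are allocated, gfp_mask says which allocation policy to use. But gfp_mask is more than a policy number (that is, an index into node_zonelists[]): some of its individual bits carry meaning of their own. For one and the same policy, the other bits can differ, which changes the value as a whole, so the same policy has to be stored at several different positions.
static inline struct page * alloc_pages(int gfp_mask, unsigned long order)
{
    /*
     * Gets optimized away by the compiler.
     */
    if (order >= MAX_ORDER)
        return NULL;
    return __alloc_pages(contig_page_data.node_zonelists+(gfp_mask), order);
}
HI: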
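OP, I think that if you want to understand this problem, you should first move to a newer kernel version. In newer versions the definition is:
struct pglist_data {
    struct zone node_zones[MAX_NR_ZONES];
    struct zonelist node_zonelists[MAX_ZONELISTS];
    int nr_zones;
};
#define MAX_NR_ZONES: depends on the number of zones the system actually uses; typically 2, covering Normal and Highmem.
#define MAX_ZONELISTS: the number of nodes used by the system; on a UMA system, typically 1.
When the kernel runs build_zonelists(), it sets up the allocation policy according to the memory zones it actually has, and it allocates memory from Normal or Highmem depending on the GFP_* flags.
Buddy_Zhang1 wrote on 2016-02-19 09:08:
HI:
OP, I think that if you want to understand this problem, you should first move to a newer kernel version. In newer versions the definition is:
struct pglist_data {
...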
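It really is as the second post said. The allocation policy gfp_mask specifies more than the order in which zones are used:
① The __GFP_DMA and __GFP_HIGHMEM bits select which order to use: "highmem→normal→dma→null", "normal→dma→null", or "dma→null".
(Why is there no __GFP_NORMAL? Because normal is the default.)
② The other bits, such as __GFP_WAIT and __GFP_IO, describe further aspects of the policy.
Now suppose the required zone order is __GFP_DMA (0x08). The low 8 bits of gfp_mask must then look like xxxx1xxx, so 00001000 has to select "dma→null", and 00001001, 00001010, and so on have to select "dma→null" as well. That is why the simulation program prints what it does. This does not feel like a good design: first, it wastes memory; second, if gfp_mask ever has more than 8 meaningful bits, node_zonelists[] has to grow with it.
This is just my own understanding; please correct me if it is wrong.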