Robert Love
Chapter 11. Memory Management
This chapter discusses the methods used to obtain memory inside the kernel.
Pages
The kernel represents every physical page on the system with a struct page structure. This structure is defined in <linux/mm.h>:
struct page {
page_flags_t flags;
atomic_t _count;
atomic_t _mapcount;
unsigned long private;
struct address_space *mapping;
pgoff_t index;
struct list_head lru;
void *virtual;
};
Zones
Each zone is represented by struct zone, which is defined in <linux/mmzone.h>:
struct zone {
spinlock_t lock;
unsigned long free_pages;
unsigned long pages_min;
struct list_head active_list;
struct list_head inactive_list;
struct pglist_data *zone_pgdat;
struct page *zone_mem_map;
char *name;
...
...
};
The structure is big, but there are only three zones in the system and, thus, only three of these structures.
Getting Pages
The kernel provides one low-level mechanism for requesting memory, along with several interfaces to access it. All these interfaces allocate memory with page-sized granularity and are declared in <linux/gfp.h>.
struct page * alloc_pages(unsigned int gfp_mask, unsigned int order)
//convert a given page to its logical address
void * page_address(struct page *page)
If you have no need for the actual struct page, you can call:
unsigned long __get_free_pages(unsigned int gfp_mask, unsigned int order)
For the common single-page case, convenience wrappers exist:
struct page * alloc_page(unsigned int gfp_mask)
unsigned long __get_free_page(unsigned int gfp_mask)
unsigned long get_zeroed_page(unsigned int gfp_mask)
Table 11.2. Low-Level Page Allocation Methods
alloc_page(gfp_mask)
Allocates a single page and returns a pointer to its page structure
alloc_pages(gfp_mask, order)
Allocates 2^order pages and returns a pointer to the first page's page structure
__get_free_page(gfp_mask)
Allocates a single page and returns a pointer to its logical address
__get_free_pages(gfp_mask, order)
Allocates 2^order pages and returns a pointer to the first page's logical address
get_zeroed_page(gfp_mask)
Allocates a single page, zeroes its contents, and returns a pointer to its logical address
When you no longer need the pages, free them with one of:
void __free_pages(struct page *page, unsigned int order)
void free_pages(unsigned long addr, unsigned int order)
void free_page(unsigned long addr)
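As a rough sketch of how these calls pair up (the function name and the order of 3 are purely illustrative, not from the book):
#include <linux/gfp.h>
#include <linux/errno.h>

/* Illustrative only: allocate 2^3 = 8 physically contiguous pages,
 * use them, and release them.  Always check for failure. */
static int page_buffer_example(void)
{
        unsigned long addr;

        addr = __get_free_pages(GFP_KERNEL, 3);
        if (!addr)
                return -ENOMEM;

        /* ... use the buffer starting at 'addr' ... */

        /* pass the same order used for the allocation; freeing the
         * wrong pages corrupts the system */
        free_pages(addr, 3);
        return 0;
}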
kmalloc()
The kmalloc() function is declared in <linux/slab.h>:
void * kmalloc(size_t size, int flags)
void kfree(const void *ptr)
Note that calling kfree(NULL) is explicitly checked for and is safe.
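A minimal sketch of typical usage, assuming a made-up struct dog (the structure and its fields are hypothetical):
#include <linux/slab.h>

struct dog {                       /* hypothetical structure */
        unsigned long tail_length;
        unsigned long weight;
};

static struct dog *create_dog(void)
{
        struct dog *p;

        /* GFP_KERNEL: this allocation may sleep, so it must be made
         * from process context while holding no spin locks */
        p = kmalloc(sizeof(struct dog), GFP_KERNEL);
        if (!p)
                return NULL;       /* the allocation can fail */

        p->tail_length = 3;
        p->weight = 20;
        return p;                  /* caller later releases it with kfree(p) */
}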
vmalloc()
void * vmalloc(unsigned long size)
void vfree(void *addr)
The vmalloc() function works in a similar fashion to kmalloc(), except it allocates memory that is only virtually contiguous and not necessarily physically contiguous.
The kmalloc() function guarantees that the pages are physically contiguous (and virtually contiguous). The vmalloc() function only ensures that the pages are contiguous within the virtual address space. It does this by allocating potentially noncontiguous chunks of physical memory and "fixing up" the page tables to map the memory into a contiguous chunk of the logical address space.
The vmalloc() function is declared in <linux/vmalloc.h> and defined in mm/vmalloc.c. Because the memory it returns is only virtually contiguous, each page must be mapped by its own page table entry, which results in much greater TLB thrashing than with directly mapped memory.
Because of these concerns, vmalloc() is used only when absolutely necessary, typically, to obtain very large regions of memory. For example, when modules are dynamically inserted into the kernel, they are loaded into memory created via vmalloc().
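A brief, hypothetical example of the allocate/free pairing (the buffer size and variable names are illustrative):
#include <linux/vmalloc.h>

/* Illustrative only: a large, virtually contiguous scratch buffer.
 * The backing pages need not be physically contiguous, so this can
 * succeed where a large kmalloc() would fail.  vmalloc() may sleep. */
static void *alloc_big_buffer(void)
{
        void *buf;

        buf = vmalloc(1024 * 1024);     /* one megabyte */
        if (!buf)
                return NULL;

        /* ... use buf ... */
        return buf;                     /* caller releases it with vfree(buf) */
}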
Slab Layer
The slab layer divides different objects into groups called caches, each of which stores a different type of object. There is one cache per object type. For example, one cache is for process descriptors (a free list of task_struct structures), whereas another cache is for inode objects (struct inode).
The caches are then divided into slabs (hence the name of this subsystem). The slabs are composed of one or more physically contiguous pages. Typically, slabs are composed of only a single page. Each cache may consist of multiple slabs.
Each slab contains some number of objects, which are the data structures being cached.
Each slab is in one of three states: full, partial, or empty.
When the kernel needs a new object, the request is satisfied from a partial slab if one exists, otherwise from an empty slab; a new slab is allocated only when no empty slab is available. This strategy reduces fragmentation.
Each cache is represented by a kmem_cache_s structure. This structure contains three lists (slabs_full, slabs_partial, and slabs_empty) stored inside a kmem_list3 structure. These lists contain all the slabs associated with the cache. A slab descriptor, struct slab, represents each slab:
struct slab {
struct list_head list; /* full, partial, or empty list */
unsigned long colouroff; /* offset for the slab coloring */
void *s_mem; /* first object in the slab */
unsigned int inuse; /* allocated objects in the slab */
kmem_bufctl_t free; /* first free object, if any */
};
Slab descriptors are allocated either outside the slab in a general cache or inside the slab itself, at the beginning. The descriptor is stored inside the slab if the total size of the slab is sufficiently small, or if internal slack space is sufficient to hold the descriptor.
static void *kmem_getpages(kmem_cache_t *cachep, int flags, int nodeid)
//The slab allocator creates new slabs by interfacing with the low-level kernel page allocator via __get_free_pages()
void kmem_freepages(kmem_cache_t *cachep, void *addr)
//calls free_pages() on the given cache's pages
Slab Allocator Interface
kmem_cache_t * kmem_cache_create(const char *name, size_t size, size_t align,
                                 unsigned long flags,
                                 void (*ctor)(void *, kmem_cache_t *, unsigned long),
                                 void (*dtor)(void *, kmem_cache_t *, unsigned long))
int kmem_cache_destroy(kmem_cache_t *cachep)
void * kmem_cache_alloc(kmem_cache_t *cachep, int flags)
void kmem_cache_free(kmem_cache_t *cachep, void *objp)
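Tying the interface together, here is a minimal sketch that caches the hypothetical struct dog from the kmalloc() example; the cache name, flags, and helper functions are assumptions for illustration only:
#include <linux/slab.h>
#include <linux/errno.h>

static kmem_cache_t *dog_cachep;    /* hypothetical cache of struct dog */

/* create the cache once, for example at module load time */
static int dog_cache_init(void)
{
        dog_cachep = kmem_cache_create("dog_cache", sizeof(struct dog),
                                       0, SLAB_HWCACHE_ALIGN, NULL, NULL);
        if (!dog_cachep)
                return -ENOMEM;
        return 0;
}

/* grab one object from the cache; GFP_KERNEL means the call may sleep */
static struct dog *dog_alloc(void)
{
        return kmem_cache_alloc(dog_cachep, GFP_KERNEL);
}

/* return the object to the cache instead of freeing its pages */
static void dog_free(struct dog *p)
{
        kmem_cache_free(dog_cachep, p);
}

/* destroy the cache at module unload, after every object has been freed */
static void dog_cache_exit(void)
{
        kmem_cache_destroy(dog_cachep);
}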
Statically Allocating on the Stack
High Memory Mappings
Permanent Mappings
void *kmap(struct page *page)
void kunmap(struct page *page)
This function works on either high or low memory. If the page structure belongs to a page in low memory, the page's virtual address is simply returned. If the page resides in high memory, a permanent mapping is created and the address is returned. The function may sleep, so kmap() works only in process context.
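A short sketch of the usual pattern, assuming a page that may live in high memory; the allocation flags and the zero-fill are illustrative, not from the book:
#include <linux/highmem.h>
#include <linux/string.h>

static struct page *zeroed_page_example(void)
{
        struct page *page;
        void *vaddr;

        /* __GFP_HIGHMEM allows the page to come from the high memory zone */
        page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
        if (!page)
                return NULL;

        vaddr = kmap(page);          /* may sleep: process context only */
        memset(vaddr, 0, PAGE_SIZE);
        kunmap(page);                /* release the permanent mapping */

        return page;
}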
Temporary Mappings
void *kmap_atomic(struct page *page, enum km_type type)
void kunmap_atomic(void *kvaddr, enum km_type type)
The type parameter is one of the enumerations defined in <asm/kmap_types.h>.
For times when a mapping must be created but the current context cannot sleep, the kernel provides temporary mappings (also called atomic mappings). Because obtaining the mapping never blocks, kmap_atomic() can be used in interrupt handlers and other places that cannot reschedule. It also disables kernel preemption, because the mappings are unique to each processor.
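A minimal sketch, assuming a caller that must copy data out of a (possibly high-memory) page without sleeping; the function and the KM_IRQ0 slot choice are illustrative:
#include <linux/highmem.h>
#include <linux/string.h>
#include <linux/types.h>

/* Hypothetical: copy data out of a page from a context that cannot
 * sleep, such as an interrupt handler.  KM_IRQ0 is one of the per-CPU
 * mapping slots enumerated in <asm/kmap_types.h>. */
static void copy_from_page_atomic(struct page *page, void *dst, size_t len)
{
        void *vaddr;

        vaddr = kmap_atomic(page, KM_IRQ0);   /* never blocks; disables preemption */
        memcpy(dst, vaddr, len);
        kunmap_atomic(vaddr, KM_IRQ0);        /* undo the mapping promptly */
}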
Per-CPU Allocations
The New percpu Interface
Reasons for Using Per-CPU Data
Which Allocation Method Should I Use?
If you need contiguous physical pages, use one of the low-level page allocators or kmalloc().
Recall that the two most common flags given to these functions are GFP_ATOMIC and GFP_KERNEL.
Specify the GFP_ATOMIC flag to perform a high priority allocation that will not sleep. This is a requirement of interrupt handlers and other pieces of code that cannot sleep.
Code that can sleep, such as process context code that does not hold a spin lock, should use GFP_KERNEL. This flag specifies an allocation that can sleep, if needed, to obtain the requested memory.
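A small, purely illustrative contrast of the two flags (the buffer size and function names are made up):
#include <linux/slab.h>

/* From an interrupt handler or while holding a spin lock: the
 * allocation must not sleep, so use GFP_ATOMIC and expect that it
 * may fail under memory pressure. */
static void *grab_buffer_atomic(void)
{
        return kmalloc(256, GFP_ATOMIC);
}

/* From process context with no locks held: GFP_KERNEL may sleep while
 * the kernel frees memory, so it is far more likely to succeed. */
static void *grab_buffer_sleeping(void)
{
        return kmalloc(256, GFP_KERNEL);
}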
If you want to allocate from high memory, use alloc_pages(). The alloc_pages() function returns a pointer to a struct page, not a logical address. Because high memory might not be mapped, the only way to access it might be via the corresponding struct page structure. To obtain an actual pointer, use kmap() to map the high memory into the kernel's logical address space.
If you do not need physically contiguous pages, but only virtually contiguous ones, use vmalloc(), although bear in mind the slight performance hit taken with vmalloc() over kmalloc().
If you are creating and destroying many large data structures, consider setting up a slab cache.
This article is from the ChinaUnix blog; the original post is at http://blog.chinaunix.net/u2/74638/showart_1110926.html