#1, posted 2006-01-11 16:45

This article is quoted from http://forums.gentoo.org/viewtopic.php?p=1155852

Linux Memory Management or 'Why is there no free RAM?'
Revision 2.3
Copyright 2004 sapphirecat. The text of this post is licensed under a Creative Commons License.

Sections

Overview of memory management

The mysterious 880 MB limit on x86

The difference among VIRT, RES, and SHR in top output

The difference between buffers and cache

Swappiness (2.6 kernels)

1. Overview of memory management
Traditional Unix tools like 'top' often report a surprisingly small amount of free memory after a system has been running for a while. For instance, after about 3 hours of uptime, the machine I'm writing this on reports under 60 MB of free memory, even though I have 512 MB of RAM on the system. Where does it all go?

The biggest place it's being used is in the disk cache, which is currently over 290 MB. This is reported by top as "cached". Cached memory is essentially free, in that it can be replaced quickly if a running (or newly starting) program needs the memory.

Linux uses so much memory for disk cache because RAM is wasted if it isn't used. Keeping the cache means that if something needs the same data again, there's a good chance it will still be in memory. Fetching the information from there is on the order of 1,000 times quicker than getting it from the hard disk. If it's not found in the cache, the hard disk needs to be read anyway, so no time has been lost.

To see a better estimation of how much memory is really free for applications to use, run the command:
Code:
free -m

The -m option stands for megabytes, and the output will look something like this:
Code:
                     total    used    free  shared  buffers  cached
Mem:                   503     451      52       0       14     293
-/+ buffers/cache:             143     360
Swap:                 1027       0    1027

The -/+ buffers/cache line shows how much memory is used and free from the perspective of the applications. Generally speaking, if little swap is being used, memory usage isn't impacting performance at all.
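
The arithmetic behind the -/+ buffers/cache line can be sketched in a few lines of Python. This is only an illustration, not how free itself is implemented, and the function name app_view is made up for the example. Fed the megabyte figures from the sample output above, it gives 144 and 359 rather than the printed 143 and 360; the one-megabyte differences presumably come from free rounding each value to whole megabytes separately.

```python
def app_view(used, free, buffers, cached):
    """Return (used, free) as seen by applications, in the unit of the inputs.

    Hypothetical helper: buffers and cache are treated as reclaimable,
    so they are subtracted from "used" and added to "free".
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Figures (in MB) taken from the "Mem:" line of the sample output.
app_used, app_free = app_view(used=451, free=52, buffers=14, cached=293)
print(app_used, app_free)
```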

Notice that I have 512 MB of memory in my machine, but only 503 is listed as available by free. This is mainly because the kernel can't be swapped out, so the memory it occupies could never be freed. There may also be regions of memory reserved for/by the hardware for other purposes as well, depending on the system architecture.

2. The mysterious 880 MB limit on x86
By default, the Linux kernel runs in and manages only low memory. This makes managing the page tables slightly easier, which in turn makes memory accesses slightly faster. The downside is that it can't use all of the memory once the amount of total RAM reaches the neighborhood of 880 MB. This has historically not been a problem, especially for desktop machines.

To use all the RAM on a machine with 1 GB or more, the kernel needs to be recompiled. Go into 'make menuconfig' (or whichever config interface is preferred) and set the following option:
Code:
Processor Type and Features ---->
High Memory Support ---->
(X) 4GB

This applies both to 2.4 and 2.6 kernels. Turning on high memory support theoretically slows down accesses slightly, but according to Joseph_sys and log, there is no practical difference.

3. The difference among VIRT, RES, and SHR in top output
VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.

RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column.) This will virtually always be less than the VIRT size, since most programs depend on the C library.

SHR indicates how much of the VIRT size is actually sharable (memory or libraries). In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.
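
To make the three columns concrete, here is a rough Python sketch that derives them from a /proc/<pid>/statm line, whose first three fields are the total program size, the resident set size, and the resident shared pages, all counted in pages. It assumes those fields correspond to VIRT, RES, and SHR and that pages are 4 KiB; the function name and the sample line are invented for illustration.

```python
def statm_to_top_columns(statm_line, page_size=4096):
    """Convert the first three fields of a /proc/<pid>/statm line to KiB.

    Assumption: size, resident and shared (in pages) correspond to
    top's VIRT, RES and SHR columns.
    """
    size, resident, shared = (int(field) for field in statm_line.split()[:3])
    kib_per_page = page_size // 1024
    return {
        "VIRT": size * kib_per_page,
        "RES": resident * kib_per_page,
        "SHR": shared * kib_per_page,
    }


# Made-up sample line, not taken from a real process.
print(statm_to_top_columns("1000 200 150 10 0 300 0"))
```

On a Linux system the line could come from open("/proc/self/statm").read(), and the page size from os.sysconf("SC_PAGE_SIZE").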

4. The difference between buffers and cache
Buffers are associated with a specific block device, and cover caching of filesystem metadata as well as tracking in-flight pages. The cache only contains parked file data. That is, the buffers remember what's in directories, what file permissions are, and keep track of what memory is being written from or read to for a particular block device. The cache only contains the contents of the files themselves.

Corrections and additions to this section welcome; I've done a bit of guesswork based on tracing how /proc/meminfo is produced to arrive at these conclusions.
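
For anyone who wants to look at the two counters directly, the following Python sketch pulls them out of /proc/meminfo-style text. The parser name and the sample values are invented for the example; only the "Name: value kB" line layout is assumed from the real file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Values are returned in the unit printed in the file (kB for most fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        info[name.strip()] = int(rest.split()[0])
    return info


# Invented sample; on Linux, pass open("/proc/meminfo").read() instead.
sample = """MemTotal:       515072 kB
MemFree:         53248 kB
Buffers:         14336 kB
Cached:         300032 kB
"""
info = parse_meminfo(sample)
print(info["Buffers"], info["Cached"])
```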

5. Swappiness (2.6 kernels)
Since 2.6, there has been a way to tune how much Linux favors swapping out to disk compared to shrinking the caches when memory gets full.

ghoti adds:
When an application needs memory and all the RAM is fully occupied, the kernel has two ways to free some memory at its disposal: it can either reduce the disk cache in the RAM by eliminating the oldest data or it may swap some less used portions (pages) of programs out to the swap partition on disk.
It is not easy to predict which method would be more efficient.
The kernel makes a choice by roughly guessing the effectiveness of the two methods at a given instant, based on the recent history of activity.

Before the 2.6 kernels, the user had no means to influence the calculations, and situations could arise where the kernel often made the wrong choice, leading to thrashing and slow performance. The addition of swappiness in 2.6 changes this.
Thanks, ghoti!

Swappiness takes a value between 0 and 100 to change the balance between swapping applications and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out; in other cases, whether a swapout occurs depends on how much application memory is in use and how poorly the cache is doing at finding and releasing inactive items.

The default swappiness is 60. A value of 0 gives something close to the old behavior where applications that wanted memory could shrink the cache to a tiny fraction of RAM. For laptops which would prefer to let their disk spin down, a value of 20 or less is recommended.

As a sysctl, the swappiness can be set at runtime with either of the following commands:
Code:
# sysctl -w vm.swappiness=30
# echo 30 >/proc/sys/vm/swappiness

The default when Gentoo boots can also be set in /etc/sysctl.conf:
Code:
# Control how much the kernel should favor swapping out applications (0-100)
vm.swappiness = 30
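
Before rebooting, it can be handy to check what a sysctl.conf fragment would actually set. The Python sketch below does that for vm.swappiness. The function name is hypothetical, the 0-100 range check simply follows the range described in this post, and the parsing is deliberately simplistic ("key = value" lines, with '#' or ';' starting a comment).

```python
def read_swappiness(conf_text):
    """Return the last vm.swappiness value in sysctl.conf-style text, or None.

    Lines starting with '#' or ';' are treated as comments. The 0-100
    range check follows the range described in this post.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "vm.swappiness":
            value = int(val.strip())
            if not 0 <= value <= 100:
                raise ValueError(f"swappiness out of range: {value}")
    return value


conf = """# Control how much the kernel should favor swapping out applications (0-100)
vm.swappiness = 30
"""
print(read_swappiness(conf))
```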

Some patchsets allow the kernel to auto-tune the swappiness level as it sees fit; they may not keep a user-set value.

[ Last edited by platinum on 2007-6-10 08:06 ]

albcamus

#2, posted 2006-01-12 09:21

I'll repost one too (from the kernelnewbies mailing list)

On 10/17/05, Roy Smith <misterdabolina@xxxxxxxxx> wrote:
> Hi all,
>
> I want to refine a previous question of mine
> (and thank everybody who helped me!)
> about the virtual memory of the kernel itself.
>
> Where does the kernel keep its own page tables?
> Are they all swapped in all the time?

The memory used by the kernel and kernel modules is never swapped out!

The kernel keeps the virtual address range from 3G (PAGE_OFFSET) to 4G
for itself. Within that range it creates a one-to-one mapping of
physical memory: virtual address 3G points to physical address 0, and
so on, up to 896 MB of physical RAM, or up to the amount of RAM
available if the system has less than 896 MB. The range from
(3G + 896MB) to 4G is called VMALLOC_RESERVE and is used for temporary
mappings of physical memory above 896 MB (high memory). If, say, the
system has 256 MB of RAM, then virtual addresses 3G to (3G + 256MB)
are direct mappings and the remaining range, 4G - (3G + 256MB), is
VMALLOC_RESERVE.

So page tables exist for physical RAM from 0 to 896 MB, and to access
memory above 896 MB the kernel creates temporary mappings within the
part of its virtual address range already reserved for high-memory
mappings.
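
The 896 MB figure falls out of simple arithmetic, sketched below in Python. The sketch assumes the usual 3G/1G split and a 128 MiB reserved region at the top of the kernel's range (1024 - 128 = 896); that reserve size is an assumption not stated in this reply, and the function name is made up for illustration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

ADDRESS_SPACE = 4 * GIB       # 32-bit virtual address space
PAGE_OFFSET = 3 * GIB         # start of the kernel's share (3G/1G split)
VMALLOC_RESERVE = 128 * MIB   # assumed minimum size of the reserved top region


def direct_mapped(ram_bytes):
    """Return how much physical RAM the kernel can map one-to-one."""
    lowmem_limit = ADDRESS_SPACE - PAGE_OFFSET - VMALLOC_RESERVE
    return min(ram_bytes, lowmem_limit)


print(direct_mapped(256 * MIB) // MIB)   # a 256 MB machine
print(direct_mapped(2 * GIB) // MIB)     # a 2 GB machine
```

With these assumptions the 256 MB machine is mapped entirely, while the 2 GB machine gets only 896 MiB mapped directly and the rest is high memory.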

> isn't it a lot of mostly-unused memory ?
>

No, it's not unused memory. Modern operating systems, Linux
included, don't try to keep memory free; they use all of it for
caching to speed things up, and when something needs memory the
request is simply satisfied from memory currently used for caching.
Page structures and page tables do need memory, but that can't be
avoided, since they have to be kept in memory for fast access. Page
tables are created only for the 0 to 896 MB physical memory mapping
to keep their memory use as small as possible: if the system has
32 GB of RAM and the kernel created page tables for all of it, they
would need more than 32 times the memory required to keep them for
less than 1 GB of RAM.
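
The "more than 32 times" estimate can be checked with a back-of-the-envelope calculation. The Python sketch below assumes 4 KiB pages and 4-byte last-level entries and ignores large pages, PAE, and the upper levels of the page table, so the absolute numbers are illustrative only; the ratio does not depend on the entry size.

```python
PAGE_SIZE = 4096   # bytes per page (assumed)
PTE_SIZE = 4       # bytes per page-table entry (assumed)
MIB = 1024 * 1024


def page_table_bytes(mapped_bytes):
    """Estimate the bytes of last-level entries needed to map a range."""
    pages = -(-mapped_bytes // PAGE_SIZE)   # ceiling division
    return pages * PTE_SIZE


low = page_table_bytes(896 * MIB)
full = page_table_bytes(32 * 1024 * MIB)
print(low // 1024, "KiB of entries for 896 MiB")
print(full // MIB, "MiB of entries for 32 GiB")
print(round(full / low, 1), "times as much")
```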

(I may not have explained this clearly and may be missing something,
so others, please correct me.)

snow_insky

#3, posted 2006-01-12 12:40

dfsf

snow_insky

#4, posted 2006-01-12 12:42

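For the past two days I haven't been able to post, and I don't know why. Sigh...

Many people are confused by Linux's memory usage and keep feeling that their machine's memory has been used up for no reason. After reading the above, I think you should understand what is going on.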

Solaris12

#5, posted 2006-01-13 15:28

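Originally posted by snow_insky on 2006-1-12 12:42:
> For the past two days I haven't been able to post, and I don't know why. Sigh...
>
> Many people are confused by Linux's memory usage and keep feeling that their machine's memory has been used up for no reason. After reading the above, I think you should understand what is going on.

Solaris has a similar mechanism; almost all modern systems do.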

hjxisking

#6, posted 2006-01-13 15:59

So the more memory is in use, the faster your machine can run?

albcamus

#7, posted 2006-01-13 16:03

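Originally posted by hjxisking on 2006-1-13 15:59:
> So the more memory is in use, the faster your machine can run?

-- I spent a long time editing a reply and then, to be fair, deleted it myself.

Learn a little about BIO first and get the general idea before criticizing, OK?

[ Last edited by albcamus on 2006-1-13 16:06 ]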

snow_insky

#8, posted 2006-01-14 14:33

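Originally posted by hjxisking on 2006-1-13 15:59:
> So the more memory is in use, the faster your machine can run?

I think we should hear dissenting voices like this too; that's how a discussion gets going. Thanks!

Here's how it works. There are two ways to look at free memory on a Linux system: from the kernel's point of view and from the application layer's point of view.

1. From the kernel's point of view, free memory is what the kernel can allocate right now without any extra work. This value does not include memory in the buffer and cache states. However, when the kernel needs it, or as the system keeps running, memory in the buffer and cache states can become free memory.

2. From the application layer's point of view, free memory includes the buffers and cache, so when an application allocates memory, the request can be satisfied from the buffers and cache.

Linux provides this mechanism because, rather than leaving all memory in the free state, it is better to cache recently used data (such as data read from disk). When that data is needed again it can be fetched straight from memory instead of through a slow disk operation, which improves overall system performance. The contents of free memory are of no use to anyone, so it is better to put that memory to work than to leave it idle. And when the system needs memory, it can quickly be allocated from this reclaimable memory. I think this is a very good mechanism; what do you think? Even an ordinary person knows that... heh.

Now let's look at the output of the free command: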

                     total    used    free  shared  buffers  cached
Mem:                   500     355     145       0       67     249
-/+ buffers/cache:              38     462
Swap:                  996       1     994

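The first row of this output (Mem:) shows memory usage from the kernel's point of view: only 145 MB is free.
The second row (-/+ buffers/cache:) shows memory usage from the application layer's point of view: 462 MB is free.

Do you see the beauty of the value 462? 462 ≈ 145 + 249 + 67 (the sum is 461; the one-megabyte difference presumably comes from rounding each value to whole megabytes).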