Solaris Memory Troubleshooting and Tuning

#1 Posted 2007-07-04 17:51
Solaris Memory Troubleshooting and Tuning
Author: John Richardson
Source: http://www.sunhelpdesk.com


Before We Begin

The first thing to be aware of is that Solaris uses almost all of the physical memory on a system.  The output of vmstat shows very little free memory no matter what is happening on the server.  This is because Solaris caches portions of the file system in memory for quicker access.  This makes sense: you would rather use your memory than let it collect dust when your applications/processes are not using it.  For instance, if you repeatedly open a file from an application, chances are it will still be in memory the next time you access it.  This greatly improves I/O performance because RAM access is many times faster than reading the data from disk.

As is most often the case, you need the right tools to solve problems effectively.  I prefer to use “memtool” as my primary tool for solving memory problems.  Memtool functions are being integrated into Solaris 8 and 9.  For now, however, I suggest using memtool, as it provides a wide range of useful tools compatible with most Solaris versions.  You can download the latest version from:

•        http://www.solarisinternals.com/si/downloads/_memtool/


Identifying a Memory Problem

The following methods can be used to identify a memory problem:

Check the scan rate (sr) using vmstat:

•        Look at the “sr” column of vmstat over a 30 second period to get an average scan rate.  If values are consistently non-zero, then there is a possibility of a memory shortage problem.  
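As a rough illustration of the averaging step, here is a minimal Python sketch that averages the “sr” column of captured vmstat output.  The sample rows are made-up values in the usual Solaris vmstat layout, not measurements from a real server, and the function name is only illustrative.  The first data row is skipped because vmstat's first report covers the time since boot.

```python
# Hypothetical capture, e.g. from "vmstat 5 7"; the numbers are invented.
SAMPLE_VMSTAT = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 4123456 98120  3  12  5  1  1  0  0  0  0  0  0  412  980  310  2  1 97
 0 0 0 4120112 12480  0   8 40 22 60  0 150 1  0  0  0  520 1500  400  5  3 92
 0 0 0 4119800 11960  0   4 35 18 48  0 250 2  0  0  0  498 1422  380  4  3 93
 0 0 0 4119640 12210  0   6 30 20 52  0 200 1  0  0  0  505 1460  390  4  2 94
"""


def average_scan_rate(vmstat_output: str, skip_first: bool = True) -> float:
    """Average the 'sr' column over the data rows of vmstat output."""
    sr_index = None
    rates = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "sr" in fields:
            # Column-name header row: remember where 'sr' sits.
            sr_index = fields.index("sr")
            continue
        if sr_index is None or len(fields) <= sr_index:
            # Group header ("kthr memory page ...") or a blank line.
            continue
        if fields[sr_index].isdigit():
            rates.append(int(fields[sr_index]))
    if skip_first:
        # The first report is an average since boot, not a live sample.
        rates = rates[1:]
    if not rates:
        raise ValueError("no vmstat data rows found")
    return sum(rates) / len(rates)


print(average_scan_rate(SAMPLE_VMSTAT))
```

A consistently non-zero result only says the scanner is running; it does not say why.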

Important:  A common misconception is that an average non-zero scan rate (especially above 200) always indicates a memory shortage.  A non-zero scan rate is just as likely to indicate file system paging as application paging.  Unfortunately, vmstat counts paging activity generated by both file system paging and application paging.  So, if your system is paging file I/O in and out, the (pi) and (po) columns of vmstat will show this activity along with an increasing scan rate (if the number of free pages falls below lotsfree).

An important point here is that the system does not use swap space for unmodified file pages as it does for application pages.  File pages are not required to stay in memory at all times, whereas application memory is.  Therefore, if you run out of application memory, the system is forced to page in and out of the swap device, causing horrible performance.

For instance, consider a system with plenty of physical memory, a large portion of which is used for the file system buffer cache (think of the buffer cache as memory used to hold data read from disk so that it can be processed).  The data is left in RAM in the hope that the process will need it again and will not have to go back to the slow disk.  If there is a lot of random I/O (not sequential I/O, because of the “free behind and read ahead” memory policy), the scanner daemon may be invoked to find available pages in the file system buffer cache, thus increasing the scan rate (I have witnessed scan rates over 10000 on Solaris 7 due to heavy random file I/O).  Note that in this case application memory is not low; rather, the file buffer cache has used all of the free pages.  The scan rate may therefore be high only because the scanner is looking for free pages for the file system buffer cache, not because pages are being swapped in and out of virtual memory.

You can verify this by looking at the output of “prtmem” (discussed later) to see whether the file system buffer cache is still high (above 10% of total memory), and by checking that there is no disk I/O activity on the swap device.

Therefore, you cannot use vmstat in isolation to determine a memory shortage.  Although vmstat can show that the system is not low on memory (no paging or scanning), paging and scanning activity alone does not definitively indicate a memory shortage.  You should combine vmstat with one of the following two methods.

Monitor the disk activity of the swap partition (device):

•         Take a look at the output of “iostat -xPnc 5” to determine if there is any significant I/O on the swap device.  This is a true indication of swapping (memory shortage).  The following example output of the swap partition indicates there is no memory problem (notice all zeros):

    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    …

A quick check, if you know the swap device (c0t0d0s2 in this example; “swap -l” lists the configured swap devices), is “iostat -xPnc 5 | grep c0t0d0s2”.  Watch the output to see if there is any activity.
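The same check can be scripted.  The sketch below is only an illustration: it scans captured “iostat -xPnc” rows for one device and reports whether any row shows reads or writes.  The idle sample is the output shown above; the busy row is an invented example, and the function name is hypothetical.  Since iostat's first report covers the time since boot, feed it later samples only.

```python
IDLE_ROWS = """\
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0s2
"""

# Invented row showing what swap activity might look like.
BUSY_ROWS = IDLE_ROWS + (
    "   12.4   30.1   99.2  240.8  0.0  0.4    0.0    9.6   0  21 c0t0d0s2\n"
)


def swap_device_active(iostat_rows: str, device: str) -> bool:
    """Return True if any sample row for `device` shows reads or writes."""
    for line in iostat_rows.splitlines():
        fields = line.split()
        # Extended device rows have 10 numeric columns plus the device name;
        # header rows and CPU rows fail this check and are ignored.
        if len(fields) != 11 or fields[10] != device:
            continue
        reads_per_sec = float(fields[0])
        writes_per_sec = float(fields[1])
        if reads_per_sec > 0 or writes_per_sec > 0:
            return True
    return False


print(swap_device_active(IDLE_ROWS, "c0t0d0s2"))
print(swap_device_active(BUSY_ROWS, "c0t0d0s2"))
```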

Use memtool (this is my favorite option as it provides a lot of detail):

•        Memtool is a great resource to show memory distribution in a system.  In particular, if the file system buffer cache is less than 10% of total memory, then there is a memory shortage.


Using Memtool to Investigate Memory

Memtool is an excellent resource for determining the memory distribution of a system.  Here is an example of the prtmem command provided with memtool:

# prtmem

Total memory:            5875 Megabytes
Kernel Memory:            485 Megabytes
Application:             2007 Megabytes
Executable & libs:        154 Megabytes
File Cache:              3043 Megabytes
Free, file cache:         172 Megabytes
Free, free:                12 Megabytes

Pay close attention to the File Cache, Free file cache, and Free output lines.  They indicate how much memory is being used by the file system cache and how much is not being used by the system at all.  The total of these three lines should be more than 10% of the total memory of the system.  If it drops below 10% of total memory, the scanner daemon will likely be invoked to find free pages in memory for applications.  This will cause the system to swap and perform poorly, which shows up as high (pi), (po), and (sr) columns in vmstat output as well as queued disk activity on the swap device.  Note that prtmem includes shared memory in the Application output line above.  As you can see, a quick look at the memory distribution with prtmem can quickly determine whether there is a memory problem on a system.
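To make the 10% rule concrete, here is a small Python sketch that applies it to the prtmem output shown above.  The parsing assumes the “label: N Megabytes” layout of that listing; the function names are illustrative, not part of memtool.

```python
import re

PRTMEM_OUTPUT = """\
Total memory:            5875 Megabytes
Kernel Memory:            485 Megabytes
Application:             2007 Megabytes
Executable & libs:        154 Megabytes
File Cache:              3043 Megabytes
Free, file cache:         172 Megabytes
Free, free:                12 Megabytes
"""


def parse_prtmem(output: str) -> dict:
    """Map each prtmem label to its size in megabytes."""
    sizes = {}
    for line in output.splitlines():
        match = re.match(r"\s*(.+?):\s+(\d+)\s+Megabytes\s*$", line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def cache_and_free_percent(sizes: dict) -> float:
    """Percentage of total memory held by file cache plus both free lines."""
    available = (
        sizes["File Cache"] + sizes["Free, file cache"] + sizes["Free, free"]
    )
    return 100.0 * available / sizes["Total memory"]


sizes = parse_prtmem(PRTMEM_OUTPUT)
percent = cache_and_free_percent(sizes)
print(f"{percent:.1f}% of memory is file cache or free")
print("possible memory shortage" if percent < 10 else "no shortage indicated")
```

For this listing the three lines add up to 3227 of 5875 Megabytes, well above the 10% threshold.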

If application memory is causing a memory shortage, you may be interested in finding the application that is using the most memory.  The memps command shows how much memory processes (and, with the -m argument, the file system) are using.

# memps

    PID     Size Resident   Shared  Anon     Process              
    873   14144k    7224k    1728k    5496k  /opt/VRTSvmsa/jre/bin/../bin/sparc/native_threads/jre
    933   16160k    4872k    2096k    2776k  /opt/perf/bin/scopeux
    865   10152k    3480k    1552k    1928k  /opt/VRTSvmsa/jre/bin/../bin/sparc/native_threads/jre
   1605   13344k    5816k    4808k    1008k  opcmsga
   1608   11512k    5600k    4720k     880k  opcmona
   1607   10832k    5528k    4656k     872k  opcle
  26545    3952k    2232k    1456k     776k  /usr/sbin/nscd

Notice that processes have anonymous and shared memory.  The sum of the two equals the resident size, that is, the process's total physical memory including both shared and private memory.  The Size column displays the total virtual memory (physical and swap) assigned to the process.  Estimating the total memory used by a group of identical processes requires some calculation: sum the anonymous memory of each process, then add the largest shared memory size found among them.  The shared memory is added only once because every process in the group uses the same shared memory, so it should not be counted multiple times.  If instead you want to know how much memory a set of different processes is using, sum the shared memory of each process as well, because each one's shared memory is a different portion of physical RAM and must be included.

Here is an example of calculating the memory utilized by the same ksh processes:

# memps | grep ksh
  14466    1824k    1552k    1376k     176k  -ksh
    784    1816k    1504k    1408k      96k  /bin/ksh /opt/VRTSvmsa/bin/vmsa_server
  14320    1824k    1512k    1416k      96k  -ksh
  14102    1824k    1512k    1416k      96k  -ksh
  27477    1832k    1512k    1416k      96k  -ksh
  11150    1840k    1512k    1416k      96k  -ksh
   6969    1832k    1512k    1416k      96k  -ksh
   6220    1824k    1512k    1416k      96k  -ksh
--------------------------------------------------------------
Total Anonymous = 848k
Max Shared Memory = 1416k
Memory Used Estimate = 2264k
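
The same arithmetic can be reproduced with a short Python sketch.  It assumes the column order shown in the memps output above (PID, Size, Resident, Shared, Anon, Process) and sizes suffixed with “k”; the function names are illustrative.

```python
MEMPS_KSH = """\
  14466    1824k    1552k    1376k     176k  -ksh
    784    1816k    1504k    1408k      96k  /bin/ksh /opt/VRTSvmsa/bin/vmsa_server
  14320    1824k    1512k    1416k      96k  -ksh
  14102    1824k    1512k    1416k      96k  -ksh
  27477    1832k    1512k    1416k      96k  -ksh
  11150    1840k    1512k    1416k      96k  -ksh
   6969    1832k    1512k    1416k      96k  -ksh
   6220    1824k    1512k    1416k      96k  -ksh
"""


def to_kb(field: str) -> int:
    """Convert a memps size such as '1416k' to an integer number of KB."""
    return int(field.rstrip("k"))


def estimate_group_memory(memps_rows: str):
    """Return (total anonymous, max shared, estimate) in KB for one program."""
    anon_total = 0
    shared_max = 0
    for line in memps_rows.splitlines():
        # The process name may contain spaces, so split at most 5 times.
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # header, separator, or blank line
        shared_max = max(shared_max, to_kb(fields[3]))
        anon_total += to_kb(fields[4])
    # Shared memory is counted once; anonymous memory once per process.
    return anon_total, shared_max, anon_total + shared_max


anon_kb, shared_kb, estimate_kb = estimate_group_memory(MEMPS_KSH)
print(f"Total Anonymous = {anon_kb}k")
print(f"Max Shared Memory = {shared_kb}k")
print(f"Memory Used Estimate = {estimate_kb}k")
```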

If you would like to dig deeper into an individual process, memtool provides pmem, which is equivalent to the Solaris “pmap” command.  Pmem shows the memory address space of an individual process: essentially a breakdown of the files mapped into memory for that process.


Finding Memory Leaks

An application memory leak occurs when process memory continues to grow without bounds.  This can be seen by looking at the private portion of resident memory with the pmem command.  If it continually grows, there is likely a memory leak.
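
One simple way to act on this is to record the private resident size at regular intervals and test whether it ever shrinks.  The Python sketch below is only a heuristic under that assumption; the sample values are invented, and the function name and the five-sample minimum are arbitrary choices, not anything memtool provides.

```python
def looks_like_leak(private_kb_samples, min_samples=5):
    """Heuristic: private memory never shrinks and ends higher than it began."""
    if len(private_kb_samples) < min_samples:
        return False  # too few observations to say anything
    pairs = zip(private_kb_samples, private_kb_samples[1:])
    never_shrinks = all(later >= earlier for earlier, later in pairs)
    return never_shrinks and private_kb_samples[-1] > private_kb_samples[0]


# Invented samples of one process's private memory in KB, taken over time.
growing = [5496, 5720, 6104, 6512, 7040]
steady = [5496, 5720, 5400, 5600, 5496]

print(looks_like_leak(growing))
print(looks_like_leak(steady))
```

A process that is still warming up will also grow for a while, so sample over a period long enough to cover its normal working cycle.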


Memory Performance Tuning

The default configuration of many systems is usually not optimal, given the range of memory requirements of diverse applications.  There are system parameters that can be tweaked to provide better memory utilization, especially on the larger and faster systems available today.  The following are the general system parameters I set in /etc/system on new server systems to optimize memory performance:

* Only if using Solaris 7 or 2.6:
set priority_paging=1
set fastscan=131072
set handspreadpages=131072
set maxpgio=65536

Enable priority paging if using Solaris 7, or Solaris 2.6 with patch 105181 applied.  Priority paging ensures that file pages are paged out before application pages.  Solaris 8 and above have a new third-generation memory algorithm that handles this itself.

Fastscan is the maximum number of pages per second that the system looks at when memory pressure is highest.  That is, it is the upper limit on how fast the scanner daemon scans once it has been invoked by free memory falling below lotsfree.  The default is 8192.  Handspreadpages should be set to the same value as fastscan.  Keep in mind that fastscan and handspreadpages should be less than or equal to total physical memory / 4.

Maxpgio is the maximum number of page I/O requests that can be queued by the paging system. This number is divided by 4 to get the actual maximum used by the paging system. It is used to throttle the number of requests as well as to control process swapping.  There is really no harm in setting this value too large.
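
As a sanity check on these numbers, the Python sketch below works through the two rules of thumb.  It assumes that “total physical memory / 4” is meant in pages and that the page size is 8 KB (typical for UltraSPARC; check with the “pagesize” command on your system).  The 5875 Megabyte figure is the example machine from the prtmem output earlier; the function names are illustrative.

```python
PAGE_SIZE_BYTES = 8192  # assumption: 8 KB base page size


def scanner_limit_pages(total_memory_mb: int,
                        page_size: int = PAGE_SIZE_BYTES) -> int:
    """Upper bound for fastscan/handspreadpages: a quarter of memory, in pages."""
    total_pages = total_memory_mb * 1024 * 1024 // page_size
    return total_pages // 4


def effective_maxpgio(maxpgio: int) -> int:
    """The paging system uses maxpgio divided by 4 as its actual maximum."""
    return maxpgio // 4


limit = scanner_limit_pages(5875)
print(f"fastscan/handspreadpages limit: {limit} pages")
print(f"fastscan=131072 within limit: {131072 <= limit}")
print(f"effective maxpgio for 65536: {effective_maxpgio(65536)}")
```

On a machine with much less memory the suggested value of 131072 would exceed the limit and should be lowered accordingly.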

#2 Posted 2007-07-04 20:16
Lately 风 seems to have been posting quite a lot of English material.

#3 Posted 2007-07-05 08:44
The moderator should post all of Sun's tuning material.

   Does the moderator have the SA400 material?

#4 Posted 2007-07-05 12:52
Too bad memtool doesn't support Solaris 10.

#5 Posted 2007-07-05 14:33
prstat is a pretty good command too. Besides, doesn't S10 have dtrace?

#6 Posted 2007-07-05 15:48
Sun really should give China more consideration
and publish more material in Chinese.

#7 Posted 2007-07-05 18:22
Originally posted by bencyber on 2007-7-5 14:33:
prstat is a pretty good command too. Besides, doesn't S10 have dtrace?


I don't know how to use dtrace... heh.