Thread starter: 兰花仙子

[Created by 仙子] Writing a service monitoring system in Perl

Posted 2007-01-14 09:01
Good stuff. Study hard and make progress every day.

Posted 2007-01-14 09:34
Showing my support.

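Posted 2007-01-14 12:13
[quote]Originally posted on 2007-1-14 03:57
I had a look and it's pretty good.
Two suggestions.
You could write the command like this:
[/quote]

[code]@warn=@{[$cmd{lc $key}()]} and push @warnings,"error @{[@warn]}";[/code]

I don't quite follow this part yet. Doesn't $cmd{lc $key}() already return a list? Then [..] turns it into a scalar (an array reference), and @{[..]} turns it back into a list. What is the point of that?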
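To make the question concrete, here is a minimal sketch of what the quoted one-liner seems to do, written without the `@{[ ... ]}` wrappers. The contents of `%cmd` and the check names `disk` and `load` are made up for illustration; the original system's checks are not shown in the thread. As far as I can tell, the wrapper is redundant in both places in the quoted line, because list assignment already calls the sub in list context and `"@warn"` already interpolates an array. The idiom does earn its keep when an arbitrary expression has to be interpolated into a string.

```perl
use strict;
use warnings;

# Hypothetical dispatch table standing in for %cmd from the thread:
# each check returns a list of problems (empty list = all fine).
my %cmd = (
    disk => sub { return ('/var 95%', '/tmp 91%') },
    load => sub { return () },
);

my (@warnings, @warn);
for my $key (qw(DISK LOAD)) {
    # List assignment already calls the sub in list context; used as a
    # condition, it yields the number of elements that were returned.
    @warn = $cmd{lc $key}->() and push @warnings, "error @warn";
}

# Where @{[ ... ]} does help: interpolating an expression into a string.
my $summary = "checks with problems: @{[ scalar @warnings ]}";

print "$_\n" for @warnings, $summary;
```

Only the `disk` check reports anything here, so one warning line is collected and the `load` check is skipped silently.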

Posted 2007-01-14 12:27
Originally posted by langue on 2007-1-14 09:34:
Showing my support.

You really do turn up everywhere! I'll add a bit of support for the thread starter too.

Posted 2007-01-14 14:41
Learning~~~

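Posted 2007-01-14 18:23
So-so; it is only good enough to impress newbies.
Let me recommend a monitoring tool to everyone: mon.
It not only detects errors, it also lets you write your own repair modules.
Sina uses mon for its monitoring.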

Posted 2007-01-14 18:32
A moderator's petty trick, and he dares to show it off to play to the crowd. Laughable; Perl in China truly has no successors!

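Posted 2007-01-14 18:46
Besides, using uptime to judge load shows how crude the moderator's skills are (or at least that the work is not rigorous).
Here is some material for you to read, so you can feel properly ashamed. (Note: ignorance is not your fault, but being wrong and still showing it off is a character problem.)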
Linux and other Unixes have an excellent metric of system load called “loadavg”. In fact, load average is three numbers, which correspond to the load average calculated over one, five and 15 minutes. It is computed as an exponential moving average, so the most recent load has more weight in the value than older load.
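A rough sketch of the exponential moving average described above, in floating point. The function name `ema_step` and the sample trace are made up for illustration; the Linux kernel does the equivalent in fixed-point arithmetic and samples roughly every 5 seconds, so treat this as an approximation of the idea, not the kernel's code.

```perl
use strict;
use warnings;

# One update step of an exponentially damped moving average.
# $period is the averaging window in seconds, $interval the sampling
# interval in seconds (assumed to be 5 here).
sub ema_step {
    my ($avg, $sample, $period, $interval) = @_;
    my $decay = exp(-$interval / $period);
    return $avg * $decay + $sample * (1 - $decay);
}

# Hypothetical trace: an idle box, then 2 runnable processes for 60 s.
my ($one, $fifteen) = (0, 0);
for (1 .. 12) {
    $one     = ema_step($one,     2, 60,  5);
    $fifteen = ema_step($fifteen, 2, 900, 5);
}
printf "1-min: %.2f  15-min: %.2f\n", $one, $fifteen;
# prints "1-min: 1.26  15-min: 0.13"
```

After one minute of steady load the 1-minute figure has covered about 63% of the distance to 2, while the 15-minute figure has barely moved, which is why the three numbers read together show a trend.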

What does load average correspond to? At least on Linux, it is the number of processes that are in the “running” state or in the “uninterruptible sleep” state, which typically corresponds to disk IO. You can also map LoadAvg to vmstat output: it is something like a moving average of the sum of the “r” and “b” columns from vmstat.

Obviously the minimum value for LoadAvg is zero, which corresponds to a completely idle system, and there is no maximum.

The first thing to understand about LoadAvg is that it does not really tell you whether the load is CPU bound or IO bound. For example, a LoadAvg of 10 may mean there are 10 processes/threads actively consuming CPU, or it could be the same 10 processes waiting on disk IO, with CPU utilization close to zero.

The second thing to understand is that LoadAvg values are relative to your system size. If you have a single CPU and 1 disk, a loadavg of 2 can be considered significant, while if you have 16 CPUs and 2 disks, a load of 4 can be light if it is CPU bound, because the system can execute many more CPU-bound tasks in parallel, or high if it is disk bound.
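One way to act on this is to look at load per CPU, not at the raw number. The sketch below parses text in the format of Linux's /proc/loadavg and divides by a CPU count. The helper names `parse_loadavg` and `load_per_cpu` and the sample line are made up for illustration, and the per-CPU figure only says something for CPU-bound load; it cannot tell disk waits apart.

```perl
use strict;
use warnings;

# Parse text in /proc/loadavg format, e.g. "0.42 1.10 0.95 2/345 6789".
# Returns the (1-minute, 5-minute, 15-minute) averages.
sub parse_loadavg {
    my ($text) = @_;
    my @avg = $text =~ /^\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)/
        or die "unrecognized loadavg text: $text";
    return @avg;
}

# Load per CPU: the same absolute load means less on a bigger box.
sub load_per_cpu {
    my ($load, $ncpu) = @_;
    die "ncpu must be positive" unless $ncpu > 0;
    return $load / $ncpu;
}

my $sample = "2.00 1.50 0.75 3/210 4242";    # made-up sample line
my ($one_minute) = parse_loadavg($sample);
for my $ncpu (1, 16) {
    printf "%2d CPU(s): %.3f per CPU\n", $ncpu, load_per_cpu($one_minute, $ncpu);
}
```

On a real Linux box the sample string would be replaced by the contents of /proc/loadavg, and the CPU count would have to come from somewhere such as /proc/cpuinfo.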

A low load average does not mean there are no performance problems. For example, if you run a single batch job on a server with MySQL, the load average is likely to be close to 1 even if there are a lot of CPUs and disks: the system may be quite idle and performance still poor because the application is not parallel enough. Similar situations can happen if there is a lot of network IO involved, or if there are a lot of locks (table/row level locks) or other limiting factors such as innodb_thread_concurrency.

The most interesting question, I think, is how LoadAvg represents box load in terms of how much load the box can handle before it starts to slow down or becomes completely unable to handle the load, and it is a tricky question. For both CPUs and disks, a request can be in one of two stages: it can be either currently executing or queued for later execution. The time needed to complete a request is the sum of the time it was really executing and the time it spent in the queue. As the system is loaded, response time starts to increase, mainly because of the time requests spend waiting in various queues and waiting on locks; the time of true execution may well remain constant. This is a bit of a simplification, as there are a number of other effects coming into play, but it is good enough for the sake of explanation.

What does this mean from the LoadAvg standpoint? You need to understand where parallel execution continues and where waiting in the queue starts. If you have a fully CPU-bound workload which is rather parallel (i.e. many queries run at once) and you have 4 CPUs, then as long as your LoadAvg is below 4, little time is spent waiting for CPUs to be free to do the work. There is some wait, but not much. So if you have a LoadAvg of 1 and your workload scales linearly with the number of connections and CPUs (i.e. there are no row waits involved), you can assume the box can handle up to 3-4 times more load before response time starts to suffer.

If, however, the LoadAvg is 4 already, it may take a rather insignificant increase to take it up to 8, and you will see some delays due to queuing. If there are 4 CPUs (cores) and loadavg is 16 for a CPU-bound workload, it often means requests take 4 times longer to complete than they would on an idle box, due to waiting in the queue.
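The rule of thumb in the last two paragraphs can be written down as a tiny function. This is only a sketch under the article's own simplification (fully CPU-bound, parallel workload); the name `estimated_slowdown` is made up, and the value for a load of 8 is an extrapolation of the same ratio, not something the text states.

```perl
use strict;
use warnings;

# Rough slowdown estimate for a fully CPU-bound, parallel workload:
# up to $ncpu there is (almost) no queueing; above it, requests wait
# in line and take roughly loadavg / ncpu times longer.
sub estimated_slowdown {
    my ($loadavg, $ncpu) = @_;
    die "ncpu must be positive" unless $ncpu > 0;
    my $ratio = $loadavg / $ncpu;
    return $ratio > 1 ? $ratio : 1;
}

printf "load %2d on 4 CPUs: about %.1fx\n", $_, estimated_slowdown($_, 4)
    for 1, 4, 8, 16;
```

For loads of 1 and 4 the estimate stays at 1.0x, and for 16 it gives the 4x figure from the text.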

The same is true for a pure disk IO bound workload, with the small difference that disks are not interchangeable (if you’re waiting on one drive you can’t use another drive instead), and the fact that disks can optimize multiple outstanding requests a bit better than requests coming one after another.

For a mixed workload, which is what we usually see in practice, you have to make some assumptions or guesses, or do further analysis, if you want good estimates. I.e. you may want to check mpstat, vmstat and iostat to see where the load comes from. But the general rule remains the same: as long as you are able to exploit the parallel abilities of the box, it will perform well; as soon as you need to do a lot of queuing, performance starts to suffer.

Let us clarify the last point: how much more load the box can handle before it overloads, loadavg skyrockets, and it becomes as good as down. First, for many applications request inflow is not constant, i.e. a web site gets poor response time and users do not spend as much time on it any more, so load drops. This is however only temporary relief, as there are stubborn users who will not go away even from a slow-responding site until their browsers time out, which is as good as the site being down. There are too many variables to come up with exact numbers, but generally, once long queuing has started it may take just 10-20% extra load to overload the system, so it is better to keep loadavg low: below the number of CPUs and/or disks you have.

I must note that LoadAvg is not a perfect tool for the task. It is just almost always available, unlike other metrics. It is best to have profiling information so you can see when response time for your requests starts to grow. As soon as it starts to grow for no good reason, I would start to worry, whatever LoadAvg shows.

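Posted 2007-01-14 20:45
Originally posted by helbreathszw on 2007-1-14 18:32:
A moderator's petty trick, and he dares to show it off to play to the crowd. Laughable; Perl in China truly has no successors!


Hmmm, the reason I posted this thread was to reply to this one:
http://bbs.chinaunix.net/viewthr ... &extra=page%3D1

You're the one playing to the crowd, aren't you?
BTW: I'm at GZ SINA myself, so don't assume you know everything. Hmph!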

Posted 2007-01-14 20:58
Wow, so it turns out we're all one family.
You must be with 网讯 in Guangzhou, right? Sina's wireless technology platform.

[ Last edited by helbreathszw on 2007-1-14 21:02 ]