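Automated Server Operations Management Plan
Last edited by king_819 on 2012-05-10 17:34
A complete automated operations system should include at least three functional modules: system provisioning, configuration management, and monitoring and alerting: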
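1. System provisioning
i. Automated installation of the operating system and commonly used software packages
2. Configuration management
i. Automated deployment and configuration of business application packages
ii. Remote server management (starting and stopping services, etc.)
iii. Change rollback
3. Monitoring and alerting
i. Monitoring of server availability, performance, traffic, security, etc.
ii. Sending alert messages to administrators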
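To make items 2.iii (change rollback) and 3 (monitoring and alerting) a bit more concrete, here is a minimal Python sketch. It is not from the original post and is not a full tool: the function names, the `.bak` backup suffix, the `port=` config format, and the metric names are all assumptions made up for illustration. The idea is simply to back up a config file before changing it, restore the backup if validation fails, and turn threshold violations into alert messages.

```python
import os
import shutil
import tempfile


def port_is_numeric(path):
    """Hypothetical validator: every 'port=' line must carry a number."""
    with open(path) as f:
        for line in f:
            key, _, value = line.strip().partition("=")
            if key == "port" and not value.isdigit():
                return False
    return True


def apply_with_rollback(path, new_text, validate):
    """Write new_text to path; restore the old content if validate(path) fails."""
    backup = path + ".bak"        # assumed backup naming, purely illustrative
    shutil.copy2(path, backup)    # keep a copy of the current config
    with open(path, "w") as f:
        f.write(new_text)
    if validate(path):
        return True
    shutil.copy2(backup, path)    # change rollback: put the old file back
    return False


def check_thresholds(metrics, limits):
    """Return one alert message per metric that exceeds its limit."""
    return [
        f"ALERT {name}={value} exceeds {limits[name]}"
        for name, value in sorted(metrics.items())
        if name in limits and value > limits[name]
    ]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    conf = os.path.join(workdir, "app.conf")   # hypothetical config file
    with open(conf, "w") as f:
        f.write("port=80\n")

    # A bad change is rolled back, a good one is kept.
    print(apply_with_rollback(conf, "port=abc\n", port_is_numeric))
    print(apply_with_rollback(conf, "port=8080\n", port_is_numeric))

    # Made-up metric values; in practice these would come from a collector.
    for message in check_thresholds({"cpu": 95, "disk": 40}, {"cpu": 90, "disk": 80}):
        print(message)
```

In a real deployment the rollback would more likely be handled by a configuration management tool and the alerts by a monitoring system. The sketch only shows the shape of the logic.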
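Great stuff, all distilled from experience.
Thanks for sharing, king. Bumping this first!
Studying this.
Learned something.
Thanks for sharing!
Company name: Cisco Shanghai R&D Center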
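Industry: Telecom / Internet systems and equipment
Company type: Wholly foreign-owned enterprise / foreign company representative office
Company size: More than 1,000 employees
Company profile:
The Cisco China R&D Center was formally established on October 12, 2005 in the Caohejing Economic and Technological Development Zone, Shanghai. It has grown rapidly since then and has become Cisco's second-largest strategic R&D center outside the United States, with total investment reaching USD 100 million. During this period, Cisco acquired dozens of growth companies worldwide that lead their respective markets and technology areas, including Scientific Atlanta and WebEx, and made its first acquisition in China (the DVN set-top box business). In November 2011, the Cisco China R&D Center set up branches in Hangzhou, Suzhou and Hefei, marking the successful completion of the integration of WebEx, one of the company's key strategic investments. With this, the Cisco China R&D Center expanded to six cities (Shanghai, Hangzhou, Suzhou, Hefei, Beijing and Shenzhen). The center has brought together talent from well-known universities and leading companies in China and around the world.
Interested candidates should prepare résumés in both Chinese and English and send them to crdc-mars-hiring@external.cisco.com
Work location: Caohejing Development Zone, Xuhui District, Shanghai
Email subject: name + school + years of work experience + job title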
-------------------------------------------------------------------------------------------------------------------
Job Title: Senior Cloud Service Reliability Engineer
Job Description:
Cisco is seeking a capable Service Reliability Engineer with experience deploying and managing web services and applications in cloud-based hosting environments to join our DevOps/Service Excellence team.
Join the Enterprise Networking Group's OnPlus Services team, a dynamic, growth organization with responsibility for building, deploying and running cloud-based network management for a growing $20B USD networking business. OnPlus is positioned to gain preference and loyalty among Cisco's enterprise, mid-market and small business partners and customers worldwide.
Responsibilities:
- Work closely with marketing, development and test organizations to gain an understanding of our market, customer and product requirements and use cases and determine critical network, system and application thresholds
- Design, implement and maintain web service deployment and management infrastructure for a geographically dispersed, global Cloud Computing Environment
- Develop and deploy tools for monitoring service health status and analysis of performance metrics
- Work closely with management to develop, implement and document operational procedures for OnPlus Cloud Services
- Develop guidance that ensures scalability and high availability while minimizing performance issues
- Work closely with development engineering to evaluate and determine production readiness of software releases using Agile methods
- Analyze complex system behavior, performance and application issues
- Provide technical mentoring to Cloud Service Operations team members
- Become an active member of a level 3 Subject Matter Expert team and part of a 24x7 on-call rotation
- Educate, lead and assist Cisco Small Business Support Center level 1 and level 2 engineers in handling customer cases
Education and Experience Requirements:
- BSEE/CS plus 6 years related experience, or MSEE/CS plus 4 years related experience
- 4 years Unix/Linux experience, including kernel configuration, high availability and scalability design, performance analysis and tuning and automation scripting.
Preferred Qualifications:
- Flexible and adaptable with a strong work ethic and the ability to work independently as well as within a team environment
- Ability to take ownership and to lead, influence and mentor other DevOps engineers
- Strong problem solving and debugging skills in complex situations
- Proven ability to manage high priority tasks with competing priorities and with meticulous attention to detail
- Work effectively in a fast-paced and constantly changing growth environment
- Excellent skills in written and verbal communication
- Ability to collaborate with cross functional teams globally
- Proven ability to work effectively in a UNIX environment.
- Demonstrable programming skills with shell scripting, Perl, PHP, and Java
- Experience with server config/deployment automation tools (Puppet, Chef, etc.) a plus
- Experience with Business Continuity and Disaster Recovery systems. Symantec NetBackup experience a plus
- Experience with IP networking, including familiarity with the functionality, operation, and failure modes of networks.
- Experience designing first and mid mile monitoring and proactively isolating worldwide internet issues
- Experience with server management and monitoring tools, Nagios experience a plus
- Experience with OS virtualization; VMware and Amazon AWS experience a plus
- ITIL v3 Foundation level certification a plus
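Reply to 6# brianself
I won't delete the post this time. Next time, please post recruiting-related threads directly in the recruiting board.
Thanks for sharing :lol :lol
Learned a lot.
Not bad.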