r1 - 11 Jul 2007 - 19:56:12 - JaymeCox
Also See: http://en.wikipedia.org/wiki/ITIL

Change Control Basics (Draft)

The following process documentation describes a proposal for basic change control measures within the Production Environment. This fundamental approach to change management is meant both to minimize issues and to make troubleshooting easier and, therefore, faster and less disruptive. Ultimately, the goal of any Production Hosting Operations team is to create and maintain stability.


This document covers 7 different types of changes:


  1. Nagios/Groundwork Changes
  2. OS Package Adds
  3. Kernel changes and OS Upgrades
  4. Operational Code Rolls
  5. Host Configuration Changes
  6. Automated Updates (e.g., IP Tables)
  7. Device and Appliance Configuration Changes

In ALL cases changes require the following:


  1. A test plan if testing in staging is required prior to roll-out to production.
  2. Exec Approval: A ticket opened and Assigned to the Director of Internet Ops (Closed Ticket=Approval)
  3. Documented Roll-out Plan (in ticket)
  4. Documented Roll-back Plan (in ticket)
  5. Communications Plan (in ticket - must include email notification to appropriate parties 24 hours prior to change)
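
As a rough illustration, the five requirements above could be checked mechanically before a change is scheduled. The sketch below is not tied to any particular ticket system: the ChangeTicket class and all of its field names are hypothetical, and it assumes that "24 hours prior" means the notification email went out at least 24 hours before the planned change time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumption: notification must be sent at least this long before the change.
NOTICE_PERIOD = timedelta(hours=24)


@dataclass
class ChangeTicket:
    """Hypothetical ticket record; field names are illustrative only."""
    approval_ticket_closed: bool          # Closed Ticket = Approval
    rollout_plan: str
    rollback_plan: str
    change_time: datetime
    notification_sent_at: Optional[datetime] = None
    staging_test_required: bool = False
    test_plan: str = ""


def missing_requirements(ticket: ChangeTicket) -> List[str]:
    """Return the names of the requirements the ticket does not yet meet."""
    missing = []
    # A test plan is only needed when staging tests are required.
    if ticket.staging_test_required and not ticket.test_plan.strip():
        missing.append("test plan")
    if not ticket.approval_ticket_closed:
        missing.append("exec approval")
    if not ticket.rollout_plan.strip():
        missing.append("roll-out plan")
    if not ticket.rollback_plan.strip():
        missing.append("roll-back plan")
    # The communications plan needs a notification sent early enough.
    sent = ticket.notification_sent_at
    if sent is None or ticket.change_time - sent < NOTICE_PERIOD:
        missing.append("24-hour notification")
    return missing
```

An empty list means the ticket satisfies every item; otherwise the list names what is still outstanding.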

Any person making changes must be able to answer "yes" to all of the following questions:


  1. Do I fully understand the nature of the changes I am about to make?
  2. Have I opened a ticket and do I have my management's approval to execute these changes?
  3. Is this the proper time to make these changes?
  4. Can I completely back out my changes, should I need to stop this process?
  5. Do I have the support (engineering, operations, customer support) needed to successfully complete these changes?
  6. Does everyone affected by these changes know when the changes are going to be made?

Maintenance Windows for Changes:


  1. The established weekly maintenance windows, when non-emergency code and system changes may be implemented, are every Tuesday and Thursday from 2pm-6pm PST/PDT.
  2. Engineers may request to roll out code changes at other times (aka Emergency Changes).
  3. Code roll-outs and systems modifications require the approval of the Director of Internet Ops (via Ticket System).
  4. Emergency Changes require the approval of the Director of Internet Ops (via Ticket System).
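
The window rule above can be expressed as a small check. This is a minimal sketch, assuming Python 3.9+ with time zone data available, that "PST/PDT" corresponds to the America/Los_Angeles zone, and that the 6pm end of the window is exclusive. The function name in_maintenance_window is hypothetical.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

# Assumption: PST/PDT is modelled by the America/Los_Angeles zone.
PACIFIC = ZoneInfo("America/Los_Angeles")
WINDOW_DAYS = {1, 3}          # datetime.weekday(): Monday is 0, so Tue=1, Thu=3
WINDOW_START = time(14, 0)    # 2pm
WINDOW_END = time(18, 0)      # 6pm, treated here as exclusive


def in_maintenance_window(moment: datetime) -> bool:
    """True if `moment` falls inside a weekly maintenance window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Convert to Pacific time so callers may pass any zone (e.g. UTC).
    local = moment.astimezone(PACIFIC)
    return (local.weekday() in WINDOW_DAYS
            and WINDOW_START <= local.time() < WINDOW_END)
```

A change planned for a time where this returns False would be handled as an Emergency Change and still needs the Director's approval via the ticket system.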

Nagios/Groundwork Changes

This includes code changes, configuration changes, hardware changes, and any software or network changes. It is not necessary to test these changes in staging prior to implementation, but roll-out of changes requires the approval of the Director of Internet Ops.

OS Package Adds

These include any updates or additions to the existing base OS. It is absolutely necessary to test these changes in staging prior to roll-out to production. No OS changes should be made simply on the grounds that a patch is "reported to fix" a specific issue. All patches must be tested and verifiably shown both to address the specific issue and to interact cleanly with existing software and systems.

OS Upgrades

These changes entail a complete replacement of the current version of an operating system, precipitated by lack of support for the current OS, security concerns, licensing issues, or the opportunity to take advantage of specific feature support. It is absolutely necessary to test these changes in staging prior to roll-out to production. No OS upgrades will be made without complete and thorough system testing and prior consultation with service owners in engineering teams.

OS upgrades require the following items and approvals:

  1. OS Upgrade Plan and Schedule - What are we trying to achieve, why, and how?
  2. OS Upgrade Budget - What is the total cost of the upgrade?
  3. OS Upgrade Test Plan - How will we ensure that we understand the impact of the change on applications and infrastructure?
  4. OS Upgrade Test Results and Evaluation - How is it going? What problems have we seen, and how will we overcome those issues?

Operational Code Rolls

Operations opens a ticket at least 2 business days prior to the planned code release date. The ticket should include the following:

  1. Code Release Date and time
  2. Names of Hosts to which Changes will be applied
  3. Roll-out and Roll-back Plan/Steps
  4. Potential Service Impacts, Dependencies
  5. Associated Monitoring requirements, including services, alerts, individuals to be alerted
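
The "at least 2 business days" lead time can be checked with a short helper. This is only a sketch under stated assumptions: business days are taken to be Monday through Friday, holidays are ignored, the day the ticket is opened does not count, and the release day itself does. Both function names are hypothetical.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count weekdays (Mon-Fri) after `start`, up to and including `end`."""
    count = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:   # 0-4 are Monday-Friday
            count += 1
    return count


def has_enough_lead_time(opened: date, release: date, minimum: int = 2) -> bool:
    """True if the ticket was opened at least `minimum` business days ahead."""
    return business_days_between(opened, release) >= minimum
```

Under these assumptions, a ticket opened on a Friday satisfies the rule for a release on the following Tuesday, while a ticket opened on Monday for a Tuesday release does not.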

Host Configuration Changes

Host configuration changes include any changes short of full OS upgrades. This includes changes to any OS components, the Linux kernel, file system, mounts, environment variables, hardware swaps/upgrades, etc.

Device and Appliance Configuration Changes

These changes encompass edits to switches, routers and hubs, Firewall and VPN devices, load balancers and other "appliance" servers. In many cases, these units fall into the category of production infrastructure.