Chapter 12
Incident Management
by Pete Sharman
The goal of incident management is to enable administrators to monitor and resolve service disruptions that may
be occurring in their data centers as quickly and efficiently as possible. Instead of managing the numerous discrete
events that may be raised as the result of any of these service disruptions, we want to manage a smaller number of
more-meaningful incidents, and to manage them based on business priority across the lifecycle of those incidents.
To do this, EM12c provides a centralized incident console called Incident Manager that enables an administrator
to track, diagnose, and resolve these incidents, and that provides features to help eliminate the root causes of
recurrent incidents. Incident Manager also ties in to Oracle expertise through relevant My Oracle
Support knowledge base articles and documentation, which helps administrators diagnose and resolve
incidents and problems faster. Finally, Incident Manager supports lifecycle operations on
incidents: you can assign ownership of an incident to a specific user, acknowledge an incident, set its priority,
track its status, escalate it, or suppress it so you can defer it to a later time. You can also
raise notifications on an incident or open a help-desk ticket via the help-desk connectors.
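To make the relationship between events, incidents, and lifecycle operations concrete, here is a minimal sketch in Python. It is not EM12c code and does not use any Enterprise Manager API; the class names, field names, status values, and sample target names are all hypothetical and chosen only to illustrate the idea of grouping several discrete events into one incident that is then assigned, acknowledged, prioritized, and tracked.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Hypothetical status values, for illustration only.
    NEW = "new"
    WORK_IN_PROGRESS = "work in progress"
    RESOLVED = "resolved"


@dataclass
class Event:
    """A single discrete occurrence detected on a monitored target."""
    target: str   # e.g. a host or database name (hypothetical)
    message: str


@dataclass
class Incident:
    """A group of related events managed as one unit of work."""
    summary: str
    events: List[Event] = field(default_factory=list)
    owner: Optional[str] = None
    acknowledged: bool = False
    priority: Optional[str] = None
    status: Status = Status.NEW
    escalated: bool = False
    suppressed: bool = False

    def assign(self, user: str) -> None:
        """Give ownership of the incident to a specific user."""
        self.owner = user

    def acknowledge(self) -> None:
        """Record that someone has seen the incident."""
        self.acknowledged = True

    def escalate(self) -> None:
        """Flag the incident as escalated."""
        self.escalated = True

    def suppress(self) -> None:
        """Defer the incident to a later time."""
        self.suppressed = True


# Two related events are handled as a single incident.
incident = Incident(summary="Database prod01 unavailable")
incident.events.append(Event("prod01", "Listener down"))
incident.events.append(Event("prod01", "Database instance down"))

incident.assign("dba_oncall")
incident.acknowledge()
incident.priority = "High"
incident.status = Status.WORK_IN_PROGRESS
```

The point of the sketch is the shape of the data: the administrator works with one incident record that carries ownership, priority, and status, rather than with each underlying event separately.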
Because Incident Manager is brand new functionality in EM12c, this chapter provides the following:
An explanation of the new terminology, including events, incidents, and problems
An introduction to the user interface so you will be able to set up Incident Manager
Some suggested guidelines on how to get the most out of this new functionality
Incident Manager Terminology
Before drilling into how you use the Incident Manager functionality, you need to understand some of the new
terminology. Let's start by looking at events and incidents, and how they differ.
Events and Incidents
Enterprise Manager continues to be the primary tool for managing and monitoring the Oracle data center, so it
manages and monitors Oracle applications as well as the application stack, from application servers to databases to
hosts and their operating systems. When Enterprise Manager detects issues in any of this infrastructure, it raises events.
The events might be any of the following:
Metric alerts: These alerts (for example, CPU utilization or tablespace usage alerts) indicate
that a critical threshold you set has been crossed.
Job events: These could be caused by a failure in a job as you are using the job system.
An event is raised to signal the failure of a particular job.
 