Taking Care – Error Handling - Applied SOA Patterns on the Oracle Platform


6. Our log monitoring tools (Oracle BAM, Nagios, and so on) can monitor individual service runtime metrics and task service footprints in general. BAM connectors are explained in the next chapter.

7. Obviously, we have our original message logged (orange step 1; refer to the figure in the Maintaining Exception Discoverability section) and, optionally, the message with its header and tracing records at the moment our composite application crashes.

Tip

Why is the crash record with the message payload optional? Because we simply cannot count on it. If we could, the first and second lines of defense discussed previously would be more than enough. However, how many times in complex compositions do messages just disappear without a trace? At best, you have only a record indicating that a composition member sent a response that never reached its destination. This is exactly the situation we are discussing here, and our primary goal is to keep the business running and buy some time for Ops to find the root cause.

Regarding the last position, proactive monitoring and fault prevention generally mean that you collect technical and business data over an extended period of time and analyze it against different thresholds, which are specific to individual processes at the time of execution. You already have a comprehensive list of WLS/SOA MBean attributes to monitor. After completing your homework (sorry, we do not have enough space here to explain the meaning of every attribute), you will learn that, for instance, Hogging Thread Count indicates that some threads are taking too much time; we can assume that they will never be returned to the pool (send an alert when you have more than 10 of these). What should you do? Increase the thread pool, maybe? Yes, it might help, but only a little and for a short period of time. If you start getting this alert after deploying a new composite, it most probably indicates that the composite is poorly designed, not misconfigured.
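
The Hogging Thread Count rule above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class, the enum, and the "recently deployed" flag are hypothetical names of ours, not part of any Oracle API, and the metric value is passed in as a plain number, where a real monitor would read it over JMX. The only figure taken from the text is the threshold of 10.

```java
import java.util.function.IntSupplier;

/**
 * Sketch: combine a technical metric (hogging thread count) with a
 * functional fact (was a composite deployed recently?) to pick an action.
 * All names here are illustrative, not part of any Oracle API.
 */
final class HoggingThreadCheck {

    enum Action { NONE, ALERT_OPS, ALERT_AND_REVIEW_COMPOSITE }

    // Threshold from the text: alert when there are more than 10 hogging threads.
    static final int HOGGING_THREAD_THRESHOLD = 10;

    static Action evaluate(int hoggingThreadCount, boolean compositeRecentlyDeployed) {
        if (hoggingThreadCount <= HOGGING_THREAD_THRESHOLD) {
            return Action.NONE;
        }
        // Above the threshold: a recent deployment points at the composite's
        // design rather than at the server configuration.
        return compositeRecentlyDeployed
                ? Action.ALERT_AND_REVIEW_COMPOSITE
                : Action.ALERT_OPS;
    }

    public static void main(String[] args) {
        IntSupplier metric = () -> 14; // stand-in for a real JMX read
        System.out.println(evaluate(metric.getAsInt(), true));
    }
}
```

The point of the second parameter is the one made above: the same technical reading leads to different actions depending on what is known about recent functional changes, so growing the thread pool is never the only option considered.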

Therefore, technical monitoring must be combined with functional monitoring to select a proper action. The bare-minimum policies should be as follows:

• Every abstract process must be monitored against the SLA for its total execution time. That's it! From start to stop (from the composition's initiation until its delivery to the ultimate receiver), we must have two minutes (set your number here), not more. Simple, isn't it? Yes, if you ensure that start/stop indications are
