Risk Analysis Methodologies
Any comments, please e-mail me at thk@pacific.net.sg
1. Qualitative risk analysis methodologies
This section deals with the qualitative methods used in risk analysis,
namely preliminary risk analysis (also called preliminary hazard analysis,
PHA), hazard and operability study (HAZOP), and failure mode and effects
analysis (FMEA/FMECA).
1.1 Preliminary Risk Analysis
Preliminary risk analysis, or hazard analysis[1, 2, 3, 4, 5], is a
qualitative technique which involves a disciplined analysis of the event
sequences which could transform a potential hazard into an accident[1].
In this technique, the possible undesirable events are identified first
and then analysed separately. For each undesirable event or hazard,
possible improvements or preventive measures are then formulated.
The result of this methodology provides a basis for determining which
categories of hazard should be looked into more closely and which analysis
methods are most suitable. Such an analysis has also proved valuable in the
working environment, where activities lacking safety measures can be
readily identified. With the aid of a frequency/consequence diagram, the
identified hazards can then be ranked according to risk, allowing
measures to be prioritized to prevent accidents.
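The ranking step can be sketched in a few lines of Python. The hazard names and the 1-to-5 frequency and consequence classes below are hypothetical, and scoring risk as the product of the two classes is an assumed convention, not one prescribed by the references.

```python
# Hypothetical hazards with assumed frequency and consequence classes
# (1 = lowest, 5 = highest).
hazards = [
    {"hazard": "Leak at pump seal", "frequency": 4, "consequence": 2},
    {"hazard": "Tank overfill", "frequency": 2, "consequence": 4},
    {"hazard": "Vessel rupture", "frequency": 1, "consequence": 5},
    {"hazard": "Operator slips on walkway", "frequency": 3, "consequence": 1},
]


def rank_by_risk(items):
    """Return hazards sorted from highest to lowest risk score.

    The score is frequency class times consequence class (an assumed
    convention); ties go to the hazard with the more severe consequence.
    """
    scored = [dict(h, risk=h["frequency"] * h["consequence"]) for h in items]
    return sorted(scored, key=lambda h: (h["risk"], h["consequence"]), reverse=True)


ranked = rank_by_risk(hazards)
for h in ranked:
    print(f'{h["risk"]:>2}  {h["hazard"]}')
```

In practice the classes would come from the frequency/consequence diagram agreed by the analysis team, and the scoring rule may differ from the simple product used here.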
1.2 Hazard and Operability studies (HAZOP)
The HAZOP technique was developed in the early 1970s[7] by Imperial
Chemical Industries Ltd[1]. HAZOP[1, 2, 7, 8, 17] can be defined as the
application of a formal, systematic, critical examination of the process
and engineering intentions of new or existing facilities to assess the
hazard potential that arises from deviations from design specifications
and the consequential effects on the facilities as a whole.
This technique is usually performed using a set of guidewords: NO/NOT,
MORE OF, LESS OF, AS WELL AS, PART OF, REVERSE, and OTHER THAN. From these
guidewords, scenarios that may result in a hazard or an operational problem
are identified. Consider the possible flow problems in a process line: the
guideword MORE OF corresponds to a high flow rate, while LESS OF
corresponds to a low flow rate. The consequences of the hazard and the
measures to reduce the frequency with which the hazard will occur are then
discussed. This technique has gained wide acceptance in the process
industries as an effective tool for plant safety and operability
improvements. Detailed procedures on how to perform the technique are
available in the literature [7, 17].
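The way guidewords generate candidate deviations can be illustrated with a minimal Python sketch. The process parameters and the plain-language readings are assumptions chosen for illustration; a real study would take them from the design intent of each line or vessel.

```python
from itertools import product

# HAZOP guidewords.
GUIDEWORDS = ["NO/NOT", "MORE OF", "LESS OF", "AS WELL AS",
              "PART OF", "REVERSE", "OTHER THAN"]

# Hypothetical process parameters for a single process line.
PARAMETERS = ["flow", "pressure", "temperature"]

# Assumed plain-language readings for a few combinations; the rest are
# left for the study team to interpret.
READINGS = {
    ("NO/NOT", "flow"): "no flow",
    ("MORE OF", "flow"): "high flow rate",
    ("LESS OF", "flow"): "low flow rate",
    ("REVERSE", "flow"): "reverse flow",
}


def deviations(guidewords, parameters):
    """Pair every guideword with every parameter to list candidate deviations."""
    rows = []
    for word, parameter in product(guidewords, parameters):
        reading = READINGS.get((word, parameter), f"{word} {parameter}")
        rows.append((word, parameter, reading))
    return rows


worksheet = deviations(GUIDEWORDS, PARAMETERS)
for word, parameter, reading in worksheet[:3]:
    print(f"{word:<10} {parameter:<12} {reading}")
```

Each row is only a prompt for discussion: the team still has to decide whether the deviation is credible, what its consequences are, and which safeguards apply.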
1.3 Failure Mode and Effects Analysis (FMEA/FMECA)
This method was developed in the 1950s by reliability engineers to
determine problems that could arise from malfunctions of military systems.
Failure mode and effects analysis[1, 2, 3, 5, 6, 7, 8, 9, 17, 25, 26, 27,
28, 29, 30, 31, 58] is a procedure by which each potential failure mode in
a system is analysed to determine its effect on the system and to classify
it according to its severity[1].
When the FMEA is extended by a criticality analysis, the technique is
called failure mode and effects criticality analysis (FMECA). Failure mode
and effects analysis has gained wide acceptance in the aerospace and
military industries[7]. In fact, the technique has been adapted into other
forms, such as misuse mode and effects analysis[15].
Detailed procedures on how to carry out an FMEA and its various
applications in different industries have been documented in [16]. The way
to evaluate the criticality index is described in [1] and [3]. The use of
knowledge-based systems for automating the FMEA process has been
discussed in [25, 26, 27], whereas the use of a causal reasoning model for
FMEA is documented in [28]. An improved FMEA methodology is presented in
[29, 30]; it uses a single matrix to model the entire system, together with
a set of indices derived from probabilistic combination to reflect the
importance of an event relative to the indenture level under consideration
and to the entire system. A similar approach is taken in [31], which models
the entire system using a fuzzy cognitive map.
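As a rough illustration of how a criticality ranking is attached to an FMEA worksheet, the sketch below uses the risk priority number (severity x occurrence x detection). This is a widely used convention assumed here as a stand-in; it is not necessarily the criticality index described in [1] and [3]. The components, failure modes and 1-to-10 scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    mode: str
    effect: str
    severity: int    # 1 (negligible) to 10 (catastrophic), assumed scale
    occurrence: int  # 1 (remote) to 10 (frequent), assumed scale
    detection: int   # 1 (almost certainly detected) to 10 (undetectable), assumed scale

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical worksheet rows for a small cooling loop.
worksheet = [
    FailureMode("Pump", "fails to start", "no coolant flow", 8, 3, 2),
    FailureMode("Valve", "stuck closed", "no coolant flow", 8, 2, 4),
    FailureMode("Sensor", "reads low", "late alarm", 5, 4, 6),
]

ranked = sorted(worksheet, key=lambda fm: fm.rpn, reverse=True)
for fm in ranked:
    print(f"{fm.rpn:>4}  {fm.component}: {fm.mode} -> {fm.effect}")
```

Because every failure mode of every component needs its own row, the worksheet grows quickly with system size, which is one reason FMEA is labour intensive.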
1.4 Discussion and Conclusion
The three techniques outlined above require only personnel who are
familiar with the hardware. However, FMEA tends to be more labour
intensive, as the failure of each individual component in the system has
to be considered. A point to note is that these qualitative techniques can
be used in the design stage as well as the operational stage of a system.
All the techniques mentioned above have seen wide usage in nuclear power
plants and chemical processing plants. In fact, FMEA, one of the most
documented, has been used by Intel[52] and National Semiconductor[53] to
improve the reliability of their products. Preliminary risk analysis has
been applied in safety analysis[2] as well as on offshore platforms[1].
HAZOP, on the other hand, has been widely used in the chemical
industries[3] for detailed failure and effect studies of the piping and
instrumentation layout.
2. Tree based techniques
In this section, fault tree analysis (FTA), event tree analysis (ETA),
cause-consequence analysis (CCA), management oversight risk tree (MORT)
and safety management organisation review technique (SMORT) are discussed.
2.1 Fault tree analysis
The concept of fault tree analysis (FTA)[1, 2, 3, 4, 5, 6, 7, 8, 10, 11,
17, 23] originated at Bell Telephone Laboratories in 1962 as a technique
for performing a safety evaluation of the Minuteman Intercontinental
Ballistic Missile Launch Control System[23]. A fault tree is a logical
diagram which shows the relation between system failure, i.e. a specific
undesirable event in the system, and failures of the components of the
system[2]. It is a technique based on deductive logic. An undesirable
event is first defined, and the causal relationships of the failures
leading to that event are then identified.
Figure 1 : A fault tree depicting the event "Fire breaks out".
Fault trees can be used in qualitative or quantitative risk analysis. The
difference between the two is that the qualitative fault tree is looser in
structure and does not require the same rigorous logic as the formal fault
tree[7]. Figure 1 shows a fault tree with the top event "Fire breaks out".
This method is used in a wide range of industries, and there is extensive
support in the form of published literature and software packages such as
CARA[2]. An application of fault tree analysis to causal relations in
large vehicle accidents is documented in [11]. Detailed descriptions of
how to carry out fault tree analysis are given in the literature [1, 3, 7].
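For the quantitative case, a minimal Python sketch of top event evaluation is given below. The tree structure and the probabilities are hypothetical (they are not taken from Figure 1), and the calculation assumes that the basic events are independent and that no basic event appears more than once in the tree.

```python
from math import prod


def evaluate(node, p):
    """Return the probability of a fault tree node.

    A node is either a basic event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR". Basic events are assumed independent.
    """
    if isinstance(node, str):
        return p[node]
    gate, children = node
    probs = [evaluate(child, p) for child in children]
    if gate == "AND":
        return prod(probs)
    if gate == "OR":
        return 1.0 - prod(1.0 - q for q in probs)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical tree for the top event "Fire breaks out".
fire = ("AND", [
    "combustible material present",
    ("OR", ["electrical spark", "open flame"]),
])

# Assumed basic event probabilities.
p = {
    "combustible material present": 0.1,
    "electrical spark": 0.02,
    "open flame": 0.05,
}

top = evaluate(fire, p)
print(f"P(Fire breaks out) = {top:.5f}")
```

When a basic event is shared between branches, this simple bottom-up product is no longer exact, and the tree is normally reduced to its minimal cut sets first.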
2.2 Event tree analysis
Event tree analysis[3, 5, 6, 7, 8, 10, 17] is a method for illustrating
the sequence of outcomes which may arise after the occurrence of a
selected initial event. This technique, unlike fault tree analysis, uses
inductive logic. It is mainly used in consequence analysis for pre-incident
and post-incident applications. In an event tree diagram, the left side
connects with the initiator and the right side with the plant damage
states; the top defines the systems; and the nodes (dots) call for
branching probabilities obtained from the system analysis. If the path goes
up at a node, the system succeeded; if it goes down, the system failed.
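The quantification of an event tree can be sketched as follows. The initiating event frequency, the two safety systems and their failure probabilities are hypothetical, and the branch probabilities are assumed to be independent of earlier branches.

```python
from itertools import product


def event_tree(initiator_frequency, systems):
    """Enumerate all event tree sequences and their frequencies.

    systems is an ordered list of (name, failure_probability) pairs.
    Each sequence is a tuple of booleans (True = system succeeded).
    Branch probabilities are assumed independent.
    """
    sequences = {}
    for outcome in product([True, False], repeat=len(systems)):
        freq = initiator_frequency
        for succeeded, (_, p_fail) in zip(outcome, systems):
            freq *= (1.0 - p_fail) if succeeded else p_fail
        sequences[outcome] = freq
    return sequences


# Hypothetical initiating event (0.1 per year) and safety systems.
systems = [("alarm", 0.01), ("sprinkler", 0.05)]
sequences = event_tree(0.1, systems)

for outcome, freq in sequences.items():
    path = ", ".join(
        f"{name} {'succeeds' if ok else 'fails'}"
        for ok, (name, _) in zip(outcome, systems)
    )
    print(f"{freq:.5f} per year: {path}")
```

The sequence frequencies always add up to the initiating event frequency, which is a useful check on a hand-built tree.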
ETA has been applied in the nuclear industry for operability analysis of
nuclear power plants, as well as for analysis of the accident sequence in
the Three Mile Island-2 reactor accident[6].
2.3 Cause-Consequence Analysis
Cause-consequence analysis (CCA)[2, 3, 5, 8, 17] is a blend of fault tree
and event tree analysis[17]. This technique combines cause analysis
(described by fault trees) and consequence analysis (described by event
trees), and hence both deductive and inductive analysis are used. The
purpose of CCA is to identify chains of events that can result in
undesirable consequences. With the probabilities of the various events in
the CCA diagram, the probabilities of the various consequences can be
calculated, thus establishing the risk level of the system. Figure 2 below
shows a typical CCA.
Figure 2 : A typical Cause-Consequence Analysis
This technique was invented by RISO Laboratories in Denmark for use in the
risk analysis of nuclear power stations[2]. However, it can also be adapted
by other industries to estimate the safety of a protective system or other
systems[2]. Details on how to carry out cause-consequence analysis, as well
as its benefits and restrictions, are documented in the literature [2, 17].
2.4 Management Oversight Risk Tree
Management oversight risk tree (MORT) was developed in the early 1970s[2]
for the U.S. Energy Research and Development Administration as a safety
analysis method that would be compatible with complex, goal-oriented
management systems[17]. MORT[2, 8, 12, 17, 21, 22, 23] is a diagram which
arranges safety program elements in an orderly and logical manner. Its
analysis is carried out by means of a fault tree, where the top event is
“Damage, destruction, other costs, lost production or reduced credibility
of the enterprise in the eyes of society” [2]. The tree gives an overview
of the causes of the top event from management oversights and omissions,
from assumed risks, or from both.
The MORT tree has more than 1500 possible basic events, which feed into
100 generic events that have been increasingly identified in the fields of
accident prevention, administration and management. A generic MORT diagram
is included at the end of this report. MORT is used in the analysis or
investigation of accidents and events, and in the evaluation of safety
programs. Its usefulness is illustrated in the literature [17]: “normal
investigations revealed an average of 18 problems (and recommendations).
Complementary investigations with MORT analysis revealed additional 20
contributions per case”.
2.5 Safety Management Organization Review Technique
Safety management organization review technique (SMORT)[2, 17] is a
simplified modification of MORT developed in Scandinavia[17]. This
technique is structured by means of analysis levels with associated
checklists, whereas MORT is based on a comprehensive tree structure. Owing
to its structured analytical process, SMORT is classified as one of the
tree based methodologies.
The SMORT analysis includes data collection based on the checklists and
their associated questions, in addition to evaluation of the results. The
information can be collected from interviews, studies of documents and
investigations. This technique can be used to perform detailed
investigations of accidents and near misses. It also serves well as a
method for safety audits and for planning safety measures[2].
2.6 Discussion and Conclusion
The tree-based methods are mainly used to find cut-sets leading to the
undesired events. In fact, event trees and fault trees have been widely
used in probabilistic risk assessment to quantify the probabilities of
occurrence of accidents and other undesired events leading to loss of life
or economic losses. However, the use of fault trees and event trees is
confined to static, logic modeling of accident scenarios[13]. Because
hardware failures and human errors are given the same treatment in fault
tree and event tree analysis, the conditions affecting human behaviour
cannot be modeled explicitly. This affects the assessed level of
dependency between events. Although techniques such as human cognitive
reliability[5, 12] exist to reconcile such deficiencies in fault tree
analysis, new methodologies that model such responses have emerged.
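To make the notion of a cut-set concrete, the sketch below expands a small fault tree into its minimal cut sets, i.e. the smallest combinations of basic events that cause the top event. The tree and the event names A, B and C are hypothetical, and the plain top-down expansion shown here is only one simple way to do this; it does not scale to large trees.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets of a fault tree node as a set of frozensets.

    A node is either a basic event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child causes the gate event.
        return set().union(*child_sets)
    if gate == "AND":
        # One cut set from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal(sets):
    """Drop every cut set that strictly contains another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}


# Hypothetical tree in which basic event A appears under both OR gates.
tree = ("AND", [("OR", ["A", "B"]), ("OR", ["A", "C"])])

mcs = minimal(cut_sets(tree))
for s in sorted(mcs, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

For this tree the minimal cut sets are {A} and {B, C}: the shared event A alone is enough to cause the top event, which a gate-by-gate probability product would not reveal.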
3. Methodologies for analysis of dynamic systems
In this section, the GO method, digraph/fault graph, event sequence
diagrams, Markov modeling, the dynamic event logic analytical methodology
and the dynamic event tree analysis method are discussed.
3.1 GO method
The GO method[5, 13] is a success-oriented system analysis that uses
seventeen operators to aid in model construction[5]. It was developed by
Kaman Sciences Corporation during the 1960s for reliability analysis of
electronics for the U.S. Department of Defense.
The GO model can be constructed from engineering drawings by replacing
system elements with one or more GO operators. Such operators are of three
basic types: (1) independent, (2) dependent, and (3) logic. Independent
operators are used to model components requiring no input, whereas
dependent operators require at least one input in order to have an output.
Logic operators, on the other hand, combine the operators into the success
logic of the system being modeled. With the probability data for each
independent and dependent operator, the probability of successful
operation can then be calculated.
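The idea of chaining the three operator types into a success probability can be sketched as below. This is a loose illustration only: the three helper functions are hypothetical and are not the seventeen numbered operators of the actual GO method, the system layout and probabilities are invented, and all signals entering a logic operator are assumed independent.

```python
def independent(p_success):
    """Independent operator: a source that needs no input (e.g. a supply)."""
    return p_success


def dependent(p_input, p_success):
    """Dependent operator: gives output only if its input is present
    and the component itself works."""
    return p_input * p_success


def logic_or(*p_inputs):
    """Logic operator: output present if at least one input is present.
    Inputs are assumed independent."""
    q = 1.0
    for p in p_inputs:
        q *= 1.0 - p
    return 1.0 - q


# Hypothetical system: two separate supplies, each feeding its own pump;
# either pump train can feed a common discharge valve.
train_a = dependent(independent(0.99), 0.95)  # supply A -> pump A
train_b = dependent(independent(0.99), 0.95)  # supply B -> pump B
header = logic_or(train_a, train_b)           # either train is enough
system = dependent(header, 0.98)              # common discharge valve

print(f"P(system delivers flow) = {system:.6f}")
```

The model follows the drawing element by element and works in terms of success, so individual failure modes and minimal cut sets do not appear anywhere in it.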
The GO method is used in practical applications where the boundary
conditions for the system to be modeled are well defined by a system
schematic or other design documents. However, the failure modes are
implicitly modeled, making it unsuitable for detailed analysis of failure
modes beyond the level of component events shown in the system drawing.
Furthermore, it neither treats common cause failures nor provides
structural information (i.e. the minimum cut sets) regarding the system. A
brief description of GO-FLOW, which is based on the GO method, is
documented in the literature [13].
3.2 Digraph/Fault Graph
The fault graph method/digraph matrix analysis[5, 13] uses the mathematics
and language of graph theory, such as “path set” (a set of nodes traveled
on a path) and “reachability” (the complete set of all possible paths
between any two nodes)[5].
This method is similar to a GO chart but uses AND and OR gates instead.
The connectivity matrix, derived from the adjacency matrix for the system,
shows whether a fault node will lead to the top event. These matrices are
then analysed by computer to give singletons (single components that can
cause system failure) or doubletons (pairs of components that can cause
system failure). The digraph method allows cycles and feedback loops, which
makes it attractive for dynamic systems. Figure 3 shows a success-oriented
system digraph of a simplified emergency core cooling system.
Figure 3 :
Success oriented system digraph of simplified emergency core cooling system
in a nuclear power plant (Ralph R. Fullwood & Robert E. Hall)
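The search for singletons and doubletons can be illustrated with a small Python sketch. The fault digraph below is hypothetical (it is not the system of Figure 3), and the brute-force propagation used here is an assumed stand-in for the matrix algorithms of the actual method.

```python
from itertools import combinations

# Hypothetical fault digraph: each derived node lists its gate type and
# the nodes that feed it.
GATES = {
    "TOP": ("OR", ["no power", "no cooling"]),
    "no cooling": ("AND", ["pump A fails", "pump B fails"]),
    "no power": ("OR", ["bus fails"]),
}
COMPONENTS = ["bus fails", "pump A fails", "pump B fails"]


def reaches_top(failed):
    """Propagate failures through the digraph until nothing changes.

    Iterating to a fixed point, rather than recursing, means that cycles
    and feedback loops in the graph do not cause infinite recursion.
    """
    active = set(failed)
    changed = True
    while changed:
        changed = False
        for node, (gate, inputs) in GATES.items():
            if node in active:
                continue
            hits = [i in active for i in inputs]
            if (gate == "OR" and any(hits)) or (gate == "AND" and all(hits)):
                active.add(node)
                changed = True
    return "TOP" in active


singletons = [c for c in COMPONENTS if reaches_top({c})]
doubletons = [
    pair for pair in combinations(COMPONENTS, 2)
    if reaches_top(set(pair)) and not any(c in singletons for c in pair)
]

print("singletons:", singletons)
print("doubletons:", doubletons)
```

Here the bus failure is a singleton and the two pump failures together form a doubleton; pairs that already contain a singleton are left out because they add no information.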
3.3 Markov Modeling
Markov modeling[1, 3, 5, 7, 8, 14, 18] is a classical modeling technique
used for assessing the time-dependent behaviour of many dynamic
systems[13]. In a ‘Markov chain’ process, transitions between states are
assumed to occur only at discrete points in time. In a ‘discrete-state
Markov process’ in continuous time, on the other hand, transitions between
states are allowed to occur at any point in time. For process systems, the
discrete system states can be defined in terms of ranges of process
variables as well as component status.
This methodology also incorporates time explicitly, and can be extended to
cover situations where problem parameters are time dependent. The state
probabilities of the system P(t) in a continuous Markov system analysis are
obtained by the solution of a coupled set of first order, constant
coefficient differential equations:
dP(t)/dt = M P(t),
where M is the matrix of coefficients whose off-diagonal elements are the
transition rates and whose diagonal elements are such that the matrix
columns sum to zero. An application of Markov modeling to a hold-up tank
problem is discussed in the literature [13], while Pate-Cornell (1993) used
the technique to study fire propagation for a subsystem on board an
offshore platform [14].
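As a small worked case of the state equation, the sketch below integrates it for a two-state repairable component (state 0 = working, state 1 = failed). The failure rate, repair rate and mission time are assumed values, and the fourth-order Runge-Kutta integrator is simply one convenient way to solve the equations; the result is compared with the known closed-form solution for this two-state case.

```python
import math


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def solve_markov(M, p0, t_end, steps=1000):
    """Integrate dP/dt = M P(t) from 0 to t_end with classical RK4."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = mat_vec(M, p)
        k2 = mat_vec(M, [x + 0.5 * h * k for x, k in zip(p, k1)])
        k3 = mat_vec(M, [x + 0.5 * h * k for x, k in zip(p, k2)])
        k4 = mat_vec(M, [x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


lam, mu = 0.01, 0.5  # assumed failure and repair rates (per hour)

# Off-diagonal elements are transition rates; each column sums to zero.
M = [[-lam, mu],
     [lam, -mu]]

t_end = 10.0
p = solve_markov(M, [1.0, 0.0], t_end)

# Closed-form availability of a two-state repairable component.
exact = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t_end)

print(f"P(working at t={t_end}) numeric = {p[0]:.8f}, exact = {exact:.8f}")
```

Because the columns of M sum to zero, the state probabilities keep summing to one throughout the integration, which is an easy sanity check on any transition matrix.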
3.4 Dynamic Event Logic Analytical Methodology
The dynamic event logic analytical methodology (DYLAM)[13, 18, 19]
provides an integrated framework to explicitly treat time, process
variables and system behaviour[13]. A DYLAM analysis usually comprises the
following procedures: (a) component modeling, (b) system equation
resolution algorithms, (c) setting of TOP conditions and (d) event sequence
generation and analysis.
DYLAM is useful for the description of dynamic
incident scenarios and for reliability assessment of systems whose mission
is defined in terms of values of process variables to be kept within certain
limits in time[19]. This technique can also be used for identification
of system behaviour and thus, as a design tool for implementing protections
and operator procedures[19].
It is important to note that a system-specific DYLAM simulator must be
created to analyse each particular problem. Furthermore, input data such as
the probabilities of a component being in a certain state at transient
initiation, the independence of such probabilities, transition rates
between different states, and conditional probability matrices for
dependencies among states and process variables need to be provided to run
the DYLAM package. An application of DYLAM to a reservoir problem is given
in the literature [18].
3.5 Dynamic Event Tree Analysis Method
The dynamic event tree analysis method (DETAM)[13, 20] is an approach that
treats the time-dependent evolution of plant hardware states, process
variable values, and operator states over the course of a scenario[20]. In
general, a dynamic event tree is an event tree in which branchings are
allowed at different points in time. This approach is defined by a set of
five characteristics: (a) the branching set, (b) the set of variables
defining the system state, (c) the branching rules, (d) the sequence
expansion rule and (e) the quantification tools. The branching set refers
to the set of variables that determine the space of possible branches at
any node in the tree. Branching rules, on the other hand, are the rules
used to determine when a branching should take place (e.g. at a constant
time step). The sequence expansion rules are used to limit the number of
sequences.
This approach can be used to represent a wide variety of operator
behaviours and to model the consequences of operator actions, and it also
serves as a framework for the analyst to employ a causal model for errors
of commission. It thus allows emergency procedures to be tested and shows
where and how changes can be made to improve their effectiveness. An
analysis of the accident sequence for a steam generator tube rupture is
presented in the literature [20].
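The branching mechanics of a dynamic event tree can be shown with a toy Python sketch. The system (one pump holding a tank level), the failure probability per step, the level dynamics and the cut-off are all assumptions made for illustration, and operator states are left out; this is not the model of [20].

```python
def dynamic_event_tree(n_steps, dt, p_fail_step, cutoff):
    """Expand a toy dynamic event tree.

    System state per sequence: (pump_ok, level, probability).
    Branching set: the pump state. Branching rule: branch every dt.
    Sequence expansion rule: drop sequences with probability below cutoff.
    Returns the surviving sequences and the total truncated probability.
    """
    sequences = [(True, 5.0, 1.0)]  # pump running, level 5.0 m, probability 1
    truncated = 0.0
    for _ in range(n_steps):
        next_sequences = []
        for pump_ok, level, prob in sequences:
            if pump_ok:
                branches = [(True, 1.0 - p_fail_step), (False, p_fail_step)]
            else:
                branches = [(False, 1.0)]  # no repair in this toy model
            for ok, p_branch in branches:
                new_prob = prob * p_branch
                if new_prob < cutoff:
                    truncated += new_prob
                    continue
                # Process variable: level is held while the pump runs and
                # falls by 1.0 m per unit time once it has failed.
                new_level = level if ok else max(0.0, level - 1.0 * dt)
                next_sequences.append((ok, new_level, new_prob))
        sequences = next_sequences
    return sequences, truncated


sequences, truncated = dynamic_event_tree(
    n_steps=3, dt=1.0, p_fail_step=0.1, cutoff=0.0
)
for pump_ok, level, prob in sequences:
    print(f"pump_ok={pump_ok!s:<5} level={level:.1f} m  p={prob:.3f}")
```

Unlike a conventional event tree, the time at which the pump fails changes the final value of the process variable, so each failure time produces its own sequence. Raising the cut-off prunes unlikely sequences, and the returned truncated probability shows how much has been discarded.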
3.6 Discussion and Conclusion
The techniques discussed above address the deficiencies found in
fault/event tree methodologies when analysing dynamic scenarios. However,
there are also limitations to their usage. The digraph and GO techniques
model the system behaviour but deal only to a limited extent with changes
in model structure over time. Markov modeling, on the other hand, requires
the explicit identification of possible system states and the transitions
between these states. This is a problem, as it is difficult to envision
the entire set of possible states prior to scenario development. DYLAM and
DETAM can solve this problem through the use of implicit state-transition
definitions. The drawbacks of these implicit techniques are
implementation-oriented[13]. First, the large tree structures generated by
the DYLAM and DETAM approaches require large computer resources. Second,
the implicit methodologies may require a considerable amount of analyst
effort in data gathering and model construction.
Conclusions
A total of 13 risk analysis techniques were reviewed in the discussion
above. Qualitative methodologies, though lacking the ability to account for
the dependencies between events, are effective in identifying potential
hazards and failures within the system. The tree-based techniques address
this deficiency by taking into consideration the dependencies between
events. The probabilities of occurrence of the undesired event can also be
quantified when operational data are available. However, no one has yet
attempted to quantify the undesired top event in a MORT tree[12].
Research is currently being carried out on DYLAM[13, 18, 19] and DETAM[13,
20] to study accident scenarios by treating time, process variables, system
behaviour and operator actions through an integrated framework. These
techniques address the problem of less than adequate modeling of the
conditions affecting control system actions and operator behaviour (e.g.
the behaviour of plant process variables and previous decisions by the
operating crew) when using fault/event trees[13]. However, the drawbacks
of these techniques are the requirement for large computer resources and
extensive data collection. With the development of more efficient
algorithms and more powerful computers, such methodologies would be widely
applied.
Any comments, please e-mail me, Tan Hiap Keong at thk@pacific.net.sg