How to Evaluate Inspection Findings Before Making Repair Decisions

An inspection finding is a description of what was observed. A repair decision is a commitment about what to do about it. The gap between these two things is wider than it might appear, and how that gap is navigated — with rigour or with guesswork — has a direct effect on outage cost, machine reliability and the confidence with which you reassemble the machine. This article describes a structured approach to that evaluation process.

Why Evaluation Is a Distinct Step

In practice, repair decisions during outages are often made implicitly rather than explicitly. A bearing looks bad, so it gets replaced. A seal shows some wear, so it gets replaced. Over time, this pattern drives outage costs up, and nobody can say whether all those replacements were necessary or whether components that were left in place should have been replaced.

The alternative is to treat the evaluation as a distinct step, separate from both the inspection and the repair work, with its own inputs and outputs. The inputs are the inspection findings plus the relevant operating history. The output is a prioritised list of recommended actions, with clear reasoning for each. This makes the decision process visible and defensible — both at the time and when looking back.

What Determines Severity

Severity is not the same as the size or visibility of a finding. A small crack in a high-stress location can be more severe than extensive surface corrosion on a non-structural surface. Severity is determined by the answer to one question: what is the risk to the machine if this finding is left unaddressed until the next scheduled maintenance event?

Several factors affect severity:

Location: is the finding in a stress-critical location? On a surface that carries load? Near a seal or clearance that affects performance?
Failure mode: if this finding progresses, how does it fail? Does it give warning (rising vibration, rising temperature) or does it fail suddenly? Failure modes that give warning are more manageable than those that don't.
Rate of progression: is there evidence that the finding is developing quickly or slowly? A finding that has been present but stable since the last outage points to slow progression. A finding that was absent at the last outage and has developed since points to a faster one.
Consequence of failure: what damage occurs downstream if this component fails in service? A bearing failure that trips the machine is serious. A bearing failure that also damages the journal and requires extended repair is substantially more serious.
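
One way to make these four factors explicit is to record a rating for each and rank findings on the combined result. The sketch below is illustrative only: the three-step rating scale, the class and field names, and the rule of summing the ratings are all assumptions, not an established scoring method. Note that the extent of the finding is recorded but deliberately left out of the score, which reflects the point that severity is not the same as size or visibility.

```python
from dataclasses import dataclass
from enum import IntEnum


class Rating(IntEnum):
    """Illustrative three-step rating; the scale itself is an assumption."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class SeverityAssessment:
    finding: str
    extent: str            # recorded for the report, deliberately not scored
    location: Rating       # stress-critical, load-carrying, near a seal or clearance?
    failure_mode: Rating   # HIGH = fails suddenly, LOW = gives clear warning
    progression: Rating    # HIGH = new or fast-developing since the last outage
    consequence: Rating    # HIGH = downstream damage and extended repair

    def score(self) -> int:
        """Sum of the four factor ratings (4 to 12). Hypothetical combination rule."""
        return int(self.location + self.failure_mode
                   + self.progression + self.consequence)


# The two examples from the text, with assumed ratings.
crack = SeverityAssessment(
    finding="small crack, high-stress location",
    extent="small",
    location=Rating.HIGH,
    failure_mode=Rating.HIGH,
    progression=Rating.MEDIUM,
    consequence=Rating.HIGH,
)
corrosion = SeverityAssessment(
    finding="surface corrosion, non-structural surface",
    extent="extensive",
    location=Rating.LOW,
    failure_mode=Rating.LOW,
    progression=Rating.LOW,
    consequence=Rating.LOW,
)

# Rank the findings, most severe first.
for item in sorted([corrosion, crack], key=lambda a: a.score(), reverse=True):
    print(f"{item.score():>2}  {item.finding} (extent: {item.extent})")
```

A score like this is only a way of ordering findings for discussion. The reasoning behind each rating still has to be written down alongside it.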

The Repair / Monitor / Accept Decision

Most findings fall into one of three categories:

Repair in this outage

The finding poses a risk that is not acceptable for the next operating period. The repair should be scoped and completed before reassembly. This category requires the clearest justification, because adding repair scope during an outage has cost and schedule implications.

Monitor through next operating period

The finding does not require immediate repair but should be tracked through the next operating period with specific monitoring parameters. This is a valid outcome — not every finding requires repair — but it requires defining what monitoring will occur, what thresholds would trigger re-evaluation, and who is responsible. "Monitor" without these details is not a useful decision.

Accept without monitoring

The finding is within normal variation for this machine and type of service, represents no meaningful risk, and does not require any specific follow-up. This should be documented explicitly, not assumed by silence.
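
The requirement that a "monitor" decision names its parameters, thresholds and responsible person can be enforced at the point where the decision is recorded. The following is a minimal sketch, assuming decisions are captured in a small Python record. The class name, field names and example values are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Action(Enum):
    REPAIR = "repair"
    MONITOR = "monitor"
    ACCEPT = "accept"


@dataclass
class Decision:
    finding_id: str
    action: Action
    reasoning: str
    # The three fields below are only required for MONITOR decisions.
    parameters: List[str] = field(default_factory=list)
    thresholds: Dict[str, float] = field(default_factory=dict)
    responsible: Optional[str] = None

    def __post_init__(self) -> None:
        # Every decision, including "accept", needs explicit reasoning.
        if not self.reasoning.strip():
            raise ValueError(f"{self.finding_id}: reasoning is required")
        if self.action is Action.MONITOR:
            required = {
                "parameters": self.parameters,
                "thresholds": self.thresholds,
                "responsible": self.responsible,
            }
            missing = [name for name, value in required.items() if not value]
            if missing:
                raise ValueError(
                    f"{self.finding_id}: monitor decision is missing "
                    + ", ".join(missing)
                )


# Hypothetical example values, for illustration only.
complete = Decision(
    finding_id="F-07",
    action=Action.MONITOR,
    reasoning="Wear is within limits and unchanged since the previous outage.",
    parameters=["bearing metal temperature"],
    thresholds={"bearing metal temperature": 95.0},
    responsible="condition monitoring engineer",
)

# A bare "monitor" with no details is rejected.
try:
    Decision(finding_id="F-08", action=Action.MONITOR, reasoning="Looks stable.")
except ValueError as error:
    print(error)
```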

Documentation principle

Every significant finding should have an explicit decision recorded against it — repair, monitor or accept — with the reasoning. A finding that is simply not mentioned in the outage conclusion report is indistinguishable from a finding that was overlooked. The absence of a documented decision is itself a risk: if something goes wrong in the next operating period, nobody will know whether the relevant finding was assessed and accepted, or never considered.
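
A simple completeness check before the outage conclusion report is issued can catch findings that have no recorded decision. The sketch below assumes findings carry an identifier and that decisions are held as a mapping from that identifier to an action and its reasoning. The function name, the data layout and the identifiers are hypothetical.

```python
def undecided(finding_ids, decisions):
    """Return IDs of findings with no valid decision or no recorded reasoning.

    `decisions` maps finding ID -> (action, reasoning), where action is
    expected to be "repair", "monitor" or "accept".
    """
    valid_actions = {"repair", "monitor", "accept"}
    gaps = []
    for finding_id in finding_ids:
        action, reasoning = decisions.get(finding_id, (None, ""))
        if action not in valid_actions or not reasoning.strip():
            gaps.append(finding_id)
    return gaps


# Hypothetical inspection log and decision register.
findings = ["F-01", "F-02", "F-03"]
register = {
    "F-01": ("repair", "Scoring on bearing surface; cause under investigation."),
    "F-03": ("accept", ""),  # decision recorded, reasoning left blank
}

print(undecided(findings, register))
```

Here F-02 is flagged because it was never mentioned, and F-03 because the decision has no reasoning attached.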

Using Operating History in the Evaluation

Inspection findings cannot be evaluated in isolation from the machine's operating history. Two questions from operating history are particularly important:

Was there a trigger event? Findings that developed after a specific operational event (an emergency shutdown, an overspeed, an abnormal load transient) have a different significance from findings that developed gradually. A finding linked to a trigger event is more likely to require investigation of the cause before repair scope is decided.
What did the previous inspection show? Comparing current findings to previous outage findings reveals whether a condition is stable, progressing or has resolved. A finding documented as "minor, accept" in the previous outage that has now worsened no longer fits the reasoning behind that decision and needs to be re-evaluated.
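
The comparison against the previous inspection can be sketched as a small classification step. This assumes each finding has a single measured extent (for example a wear depth) recorded in the same unit at both outages, which is a simplification. The function name, the category labels and the tolerance value are assumptions for illustration.

```python
def classify_trend(previous, current, tolerance=0.0):
    """Classify a finding by comparing two outage measurements.

    `previous` and `current` are the measured extent in the same unit,
    or None if the finding was not observed at that outage.
    `tolerance` is the assumed measurement uncertainty.
    """
    if previous is None and current is None:
        raise ValueError("finding was not observed at either outage")
    if previous is None:
        return "new"          # not present last outage, has now developed
    if current is None:
        return "resolved"     # present last outage, no longer observed
    if current - previous > tolerance:
        return "progressing"
    return "stable"           # not measurably worse than last outage


# Hypothetical measurements in millimetres.
print(classify_trend(previous=0.20, current=0.50, tolerance=0.05))
print(classify_trend(previous=0.20, current=0.22, tolerance=0.05))
print(classify_trend(previous=None, current=0.30))
```

A "new" or "progressing" result is a prompt to look for a trigger event and a cause. It is an input to the decision, not the decision itself.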

Common Evaluation Errors

Replacing without assessing cause

Replacing a worn bearing without determining why it wore prematurely addresses the symptom but not the cause. If the cause is a contaminated oil supply, insufficient oil film pressure, or electrical current damage to the bearing surface, the replacement bearing will follow the same degradation path. Finding the cause is part of the evaluation, not an optional extra.

Accepting progressive findings as normal

A finding that has been present for several outage cycles and is "always there" can gradually progress to a point where it requires more significant intervention. Familiarity with a finding does not mean it is benign. Each outage should re-evaluate whether the current state of a known finding still meets the criteria that led to its last "accept" or "monitor" decision.

Scope creep without evaluation

During an outage, it is tempting to add repair scope for things that "look like they might become a problem." Sometimes this judgement is correct. But scope added without a clear severity assessment creates work that may not have been necessary, and it makes the relationship between outage cost and reliability harder to understand over time.

Pressure to close out

Outages operate under time pressure. As the outage window closes, there is strong pressure to make quick decisions and get the machine back online. This is precisely when the evaluation step is most likely to be skipped or compressed. Building time for the evaluation explicitly into the outage schedule — rather than treating it as something that happens informally — is the most practical way to protect it.

Decision Table: Common Finding Types

Finding type	Key evaluation questions	Typical outcome
Bearing surface damage (scoring, spalling)	Extent of damage; oil analysis for particles; journal surface condition	Replace + investigate cause
Blade tip erosion	Depth of erosion; effect on clearance; rate vs. previous outage	Repair if clearance-critical; otherwise monitor rate
Seal segment wear	Current clearance vs. design; expected clearance growth next period	Replace if clearance exceeds limit or trend indicates failure before next outage
Coupling bolt fretting	Extent; whether alignment was within spec; bolt torque records	Replace bolts; check alignment; investigate if severe
Surface corrosion (external, non-structural)	Is it progressing? Is it near a seal or clearance?	Clean and treat; accept or monitor based on location
Casing joint face leakage evidence	Source of leakage; joint surface flatness; bolt load records	Re-surface if warranted; re-seal; check bolt torque sequence
Deposit accumulation on blades	Type of deposit; effect on blade profile; mass unbalance risk	Clean; investigate steam/water quality if recurring
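
As a worked illustration of the seal segment row, the check "clearance exceeds limit, or trend indicates failure before next outage" can be sketched as a linear projection from two outage measurements. Linear clearance growth is an assumption and real wear may not follow it. The function names, units and numbers below are hypothetical.

```python
def projected_clearance(previous, current, hours_since_previous, hours_to_next_outage):
    """Project clearance at the next outage, assuming linear growth."""
    if hours_since_previous <= 0:
        raise ValueError("hours_since_previous must be positive")
    rate = (current - previous) / hours_since_previous
    # A negative rate (apparent shrinkage) is treated as no growth.
    return current + max(rate, 0.0) * hours_to_next_outage


def seal_recommendation(previous, current, limit,
                        hours_since_previous, hours_to_next_outage):
    """Apply the two replacement criteria from the decision table."""
    if current > limit:
        return "replace: clearance exceeds limit"
    projected = projected_clearance(
        previous, current, hours_since_previous, hours_to_next_outage
    )
    if projected > limit:
        return "replace: trend reaches limit before next outage"
    return "no replacement indicated by clearance trend"


# Hypothetical values: clearances in mm, intervals in operating hours.
print(seal_recommendation(previous=0.40, current=0.50, limit=0.65,
                          hours_since_previous=20000, hours_to_next_outage=24000))
print(seal_recommendation(previous=0.40, current=0.50, limit=0.65,
                          hours_since_previous=20000, hours_to_next_outage=32000))
```

With the same measurements, the longer operating period changes the outcome. That is why the planned interval to the next outage belongs in the evaluation inputs.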

The Output: What a Useful Evaluation Looks Like

A useful evaluation produces a document that can be handed to the person authorising repair scope with the following information for each significant finding:

Description of the finding (location, nature, extent)
Assessment of severity and the reasoning behind it
Recommended action: repair / monitor / accept
If repair: what repair, to what specification, and what additional investigation is required before or during repair
If monitor: what parameters, what thresholds, who is responsible, and the review timeline
If accept: why the finding is within acceptable limits for the next operating period

This document also becomes part of the outage record and the basis for evaluating the same machine at the next outage. The value of structured evaluation compounds over outage cycles: after three or four outages, you have a clear picture of how this machine deteriorates, which components are most at risk, and where maintenance effort is best invested.