An inspection finding is a description of what was observed. A repair decision is a commitment about what to do about it. The gap between the two is wider than it might appear, and how that gap is navigated (with rigour or with guesswork) has a direct effect on outage cost, machine reliability and the confidence with which you reassemble the machine. This article describes a structured approach to evaluating findings and turning them into repair decisions.
Why Evaluation Is a Distinct Step
In practice, repair decisions during outages are often made implicitly rather than explicitly. A bearing looks bad, so it gets replaced. A seal shows some wear, so it gets replaced. Over time, this pattern drives outage costs up without a clear understanding of whether all those replacements were necessary — and whether other things that were not replaced should have been.
The alternative is to treat the evaluation as a distinct step, separate from both the inspection and the repair work, with its own inputs and outputs. The inputs are the inspection findings plus the relevant operating history. The output is a prioritised list of recommended actions, with clear reasoning for each. This makes the decision process visible and defensible — both at the time and when looking back.
What Determines Severity
Severity is not the same as the size or visibility of a finding. A small crack in a high-stress location can be more severe than extensive surface corrosion on a non-structural surface. Severity is determined by the answer to one question: what is the risk to the machine if this finding is left unaddressed until the next scheduled maintenance event?
Several factors affect severity:
- Location: is the finding in a stress-critical location? On a surface that carries load? Near a seal or clearance that affects performance?
- Failure mode: if this finding progresses, how does it fail? Does it give warning (rising vibration, rising temperature) or does it fail suddenly? Failure modes that give warning are more manageable than those that don't.
- Rate of progression: is there evidence that the finding is developing quickly or slowly? A finding that was present at the last outage and has remained stable is different from one that was not present at the last outage and has developed since.
- Consequence of failure: what damage occurs downstream if this component fails in service? A bearing failure that trips the machine is serious. A bearing failure that also damages the journal and requires extended repair is substantially more serious.
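One way to keep these four factors visible during an evaluation is to record them as explicit inputs rather than as a single impression. The sketch below is illustrative only: the `Level` scale, the field names and the scoring weights are assumptions made for this example, not an established method, and any real scheme would need calibrating to the machine and its service.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for each factor."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class SeverityInputs:
    location: Level        # stress-critical, load-carrying, or near a seal/clearance
    sudden_failure: bool   # True if the failure mode gives no warning
    progression: Level     # rate of development since the last outage
    consequence: Level     # downstream damage if the component fails in service


def severity_score(s: SeverityInputs) -> int:
    """Combine the factors into one number for ranking findings.

    The weighting is an assumption: location and consequence multiply,
    progression adds, and a no-warning failure mode adds a fixed penalty.
    """
    score = int(s.location) * int(s.consequence) + int(s.progression)
    if s.sudden_failure:
        score += 3
    return score


# A small crack in a high-stress location versus extensive surface
# corrosion on a non-structural surface (example values only).
small_crack = SeverityInputs(Level.HIGH, True, Level.MEDIUM, Level.HIGH)
surface_corrosion = SeverityInputs(Level.LOW, False, Level.LOW, Level.LOW)

ranked = sorted(
    [("small crack", small_crack), ("surface corrosion", surface_corrosion)],
    key=lambda item: severity_score(item[1]),
    reverse=True,
)
```

The score is only a way of ordering findings for discussion. The question that defines severity remains the risk of leaving the finding unaddressed until the next scheduled maintenance event.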
The Repair / Monitor / Accept Decision
Most findings fall into one of three categories:
Repair in this outage
The finding poses a risk that is not acceptable for the next operating period. The repair should be scoped and completed before reassembly. This category requires the clearest justification, because adding repair scope during an outage has cost and schedule implications.
Monitor through next operating period
The finding does not require immediate repair but should be tracked through the next operating period with specific monitoring parameters. This is a valid outcome — not every finding requires repair — but it requires defining what monitoring will occur, what thresholds would trigger re-evaluation, and who is responsible. "Monitor" without these details is not a useful decision.
Accept without monitoring
The finding is within normal variation for this machine and type of service, represents no meaningful risk, and does not require any specific follow-up. This should be documented explicitly, not assumed by silence.
Every significant finding should have an explicit decision recorded against it — repair, monitor or accept — with the reasoning. A finding that is simply not mentioned in the outage conclusion report is indistinguishable from a finding that was overlooked. The absence of a documented decision is itself a risk: if something goes wrong in the next operating period, nobody will know whether the relevant finding was assessed and accepted, or never considered.
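The requirement that every finding carries an explicit decision, and that "monitor" comes with parameters, thresholds and an owner, can be checked mechanically before the outage conclusion report is issued. The following is a minimal sketch; the `Decision` record and its field names are hypothetical and would need adapting to whatever outage documentation is actually in use.

```python
from dataclasses import dataclass, field
from typing import Optional

VALID_ACTIONS = ("repair", "monitor", "accept")


@dataclass
class Decision:
    finding: str
    action: Optional[str] = None      # "repair", "monitor" or "accept"
    reasoning: str = ""
    monitor_parameters: list = field(default_factory=list)
    monitor_thresholds: dict = field(default_factory=dict)
    monitor_owner: str = ""


def decision_gaps(d: Decision) -> list:
    """Return a list of what is missing from a recorded decision."""
    if d.action not in VALID_ACTIONS:
        # Silence is indistinguishable from an overlooked finding.
        return ["no explicit decision recorded"]
    gaps = []
    if not d.reasoning.strip():
        gaps.append("no reasoning recorded")
    if d.action == "monitor":
        if not d.monitor_parameters:
            gaps.append("monitoring parameters not defined")
        if not d.monitor_thresholds:
            gaps.append("re-evaluation thresholds not defined")
        if not d.monitor_owner.strip():
            gaps.append("no responsible person named")
    return gaps


# Example: a "monitor" decision recorded without the supporting details.
incomplete = Decision(
    finding="Seal segment wear",
    action="monitor",
    reasoning="Clearance within limit at this outage",
)
unassessed = Decision(finding="Coupling bolt fretting")

incomplete_gaps = decision_gaps(incomplete)
unassessed_gaps = decision_gaps(unassessed)
```

Running a check like this over the full findings list makes undocumented decisions visible while there is still time to resolve them.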
Using Operating History in the Evaluation
Inspection findings cannot be evaluated in isolation from the machine's operating history. Two questions from operating history are particularly important:
- Was there a trigger event? Findings that developed after a specific operational event (an emergency shutdown, an overspeed, an abnormal load transient) have a different significance from findings that developed gradually. A finding linked to a trigger event is more likely to require investigation of the cause before repair scope is decided.
- What did the previous inspection show? Comparing current findings to previous outage findings reveals whether a condition is stable, progressing or has resolved. A finding documented as "minor, accept" in the previous outage that has since worsened requires a different decision from the one it received before.
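The comparison with the previous outage can be made explicit where a finding has a measurable extent (a wear depth or a clearance, for example). The sketch below assumes one measurement per outage and a measurement tolerance; the function name, labels and tolerance value are illustrative assumptions.

```python
from typing import Optional


def classify_trend(
    previous: Optional[float],
    current: Optional[float],
    tolerance: float,
) -> str:
    """Classify a finding by comparing two outage measurements.

    None means the finding was not observed at that outage.
    Changes smaller than the tolerance are treated as no change.
    Decreases are not distinguished from "stable" in this sketch.
    """
    if previous is None and current is None:
        return "not observed"
    if previous is None:
        return "new"
    if current is None:
        return "resolved"
    if current - previous > tolerance:
        return "progressing"
    return "stable"


# Example values in millimetres, chosen only for illustration.
wear_trend = classify_trend(previous=0.30, current=0.45, tolerance=0.05)
```

A result of "new" or "progressing" is a prompt to revisit the earlier decision, not a decision in itself. Whether a trigger event occurred in the intervening period still has to be checked against the operating history.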
Common Evaluation Errors
Replacing without assessing cause
Replacing a worn bearing without determining why it wore prematurely addresses the symptom but not the cause. If the cause is a contaminated oil supply, insufficient oil film pressure, or electrical current damage to the bearing surface, the replacement bearing will follow the same degradation path. Finding the cause is part of the evaluation, not an optional extra.
Accepting progressive findings as normal
A finding that has been present for several outage cycles and is "always there" can gradually progress to a point where it requires more significant intervention. Familiarity with a finding does not mean it is benign. Each outage should re-evaluate whether the current state of a known finding still meets the criteria that led to its last "accept" or "monitor" decision.
Scope creep without evaluation
During an outage, it is tempting to add repair scope for things that "look like they might become a problem." Sometimes this judgement is correct. But scope added without a clear severity assessment creates work that may not have been necessary and makes it harder to understand, over time, how outage cost relates to reliability.
Compressing the evaluation under time pressure
Outages operate under time pressure. As the outage window closes, there is strong pressure to make quick decisions and get the machine back online. This is precisely when the evaluation step is most likely to be skipped or compressed. Building time for the evaluation explicitly into the outage schedule — rather than treating it as something that happens informally — is the most practical way to protect it.
Decision Table: Common Finding Types
| Finding type | Key evaluation questions | Typical outcome |
|---|---|---|
| Bearing surface damage (scoring, spalling) | Extent of damage; oil analysis for particles; journal surface condition | Replace + investigate cause |
| Blade tip erosion | Depth of erosion; effect on clearance; rate vs. previous outage | Repair if clearance-critical; otherwise monitor rate |
| Seal segment wear | Current clearance vs. design; expected clearance growth next period | Replace if clearance exceeds limit or trend indicates failure before next outage |
| Coupling bolt fretting | Extent; whether alignment was within spec; bolt torque records | Replace bolts; check alignment; investigate if severe |
| Surface corrosion (external, non-structural) | Is it progressing? Is it near a seal or clearance? | Clean and treat; accept or monitor based on location |
| Casing joint face leakage evidence | Source of leakage; joint surface flatness; bolt load records | Re-surface if warranted; re-seal; check bolt torque sequence |
| Deposit accumulation on blades | Type of deposit; effect on blade profile; mass unbalance risk | Clean; investigate steam/water quality if recurring |
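The seal segment row depends on a projection: whether the clearance will exceed its limit before the next outage. A minimal sketch of that arithmetic follows, assuming clearance grows linearly with operating hours between two measurements. Linear growth is a simplifying assumption, and the function names and numbers are hypothetical examples, not data from any machine.

```python
def projected_clearance(
    previous_mm: float,
    current_mm: float,
    hours_since_previous: float,
    hours_to_next_outage: float,
) -> float:
    """Extrapolate clearance to the next outage at the observed growth rate."""
    growth_per_hour = (current_mm - previous_mm) / hours_since_previous
    return current_mm + growth_per_hour * hours_to_next_outage


def seal_recommendation(
    previous_mm: float,
    current_mm: float,
    limit_mm: float,
    hours_since_previous: float,
    hours_to_next_outage: float,
) -> str:
    """Apply the two tests from the table row to one seal location."""
    if current_mm > limit_mm:
        return "replace: clearance exceeds limit"
    projected = projected_clearance(
        previous_mm, current_mm, hours_since_previous, hours_to_next_outage
    )
    if projected > limit_mm:
        return "replace: trend reaches limit before next outage"
    return "monitor or accept: projected clearance stays within limit"


# Example: clearance grew from 0.50 mm to 0.65 mm over 24,000 hours,
# the limit is 0.90 mm and the next outage is 48,000 hours away.
# Growth of 0.15 mm per 24,000 h gives a further 0.30 mm, so about 0.95 mm.
projection = projected_clearance(0.50, 0.65, 24_000, 48_000)
recommendation = seal_recommendation(0.50, 0.65, 0.90, 24_000, 48_000)
```

In this example the seal is within its limit today but would not be by the next outage, which is the case a measurement taken in isolation would miss. With only two data points the projection is rough; a longer measurement history would support a better estimate of the growth rate.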
The Output: What a Useful Evaluation Looks Like
A useful evaluation produces a document that can be handed to the person authorising repair scope with the following information for each significant finding:
- Description of the finding (location, nature, extent)
- Assessment of severity and the reasoning behind it
- Recommended action: repair / monitor / accept
- If repair: what repair, to what specification, and what additional investigation is required before or during repair
- If monitor: what parameters, what thresholds, who is responsible, and the review timeline
- If accept: why the finding is within acceptable limits for the next operating period
This document also becomes part of the outage record and the basis for evaluating the same machine at the next outage. The value of structured evaluation compounds over outage cycles: after three or four outages of structured evaluation, you have a clear picture of how this machine deteriorates, which components are most at risk, and where maintenance effort is best invested.