KLARF Format Ingestion in Production Defect Pipelines: Parsing Quirks, Version Differences, and Edge Cases
KLARF (KLA Results File) is the de facto standard for inspection tool output from KLA equipment. If you're ingesting defect data from a Surfscan SP7, a 2930-series patterned wafer inspector, or an eDR-7000 e-beam review tool, you're reading KLARF files. The format has been around since the early 1990s, and it shows: the specification is inconsistently implemented across tool generations, version differences are subtle enough to cause silent parse failures, and edge cases in multi-pass inspection flows are documented poorly or not at all.
In building our ingestion pipeline, we've cataloged every parse failure mode we've encountered in production. This article documents the most important ones.
Version 1.x vs. 2.0: What Actually Changed and What Breaks
KLARF 1.x (most commonly version 1.8) and KLARF 2.0 share the same basic structure — a text file with keyword-delimited records — but they differ in ways that break simple parsers.
The most significant structural difference is in the DEFECT_LIST record. In version 1.8, the defect list is a flat sequence of whitespace-delimited numeric fields in a fixed order. The field order is defined in a preceding ATTR_ORDER record that lists column names. In version 2.0, the defect list uses a more structured column definition block with explicit type annotations, and the ATTR_ORDER record is replaced by a DEFECT_ATTRIBUTE_LIST section with named entries.
A parser written for version 1.8 that tries to read a version 2.0 file by looking for ATTR_ORDER will find nothing and either fail silently or produce an empty defect list. This is the most common KLARF parse failure we've debugged — the file appears to ingest (no exception thrown), but the defect count comes out zero.
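One way to turn the silent zero-count result into a loud failure is to refuse to build a defect list when no column definition was found. The sketch below assumes the column names have already been extracted from ATTR_ORDER and that every defect row is purely numeric and whitespace-delimited, as described above; the function name and the sample column names are hypothetical.

```python
from typing import Dict, List, Sequence


def rows_to_records(column_names: Sequence[str], rows: Sequence[str]) -> List[Dict[str, float]]:
    """Map whitespace-delimited defect rows onto the declared column order.

    Raises instead of returning an empty list when no column definition
    was found, so a version mismatch cannot pass as "zero defects".
    """
    if not column_names:
        raise ValueError("no column definition found; refusing to return an empty defect list")

    records = []
    for row_no, row in enumerate(rows, start=1):
        fields = row.split()
        if len(fields) != len(column_names):
            raise ValueError(
                f"defect row {row_no}: expected {len(column_names)} fields, got {len(fields)}"
            )
        records.append(dict(zip(column_names, (float(f) for f in fields))))
    return records


# Hypothetical column names, for illustration only.
print(rows_to_records(["DEFECTID", "X", "Y"], ["1 10.5 -3.0"]))
```

An empty `rows` sequence with a valid column list still returns an empty list, so a genuine zero-defect wafer stays distinguishable from a missing column definition.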
Always check the FileVersion field in the header before choosing your parse path:
FileVersion 1 8;
versus:
FileVersion 2 0;
If your pipeline doesn't branch on this field, it will fail on one version or the other in production.
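A minimal sketch of that branch, assuming the header line looks exactly like the two examples above (keyword, major, minor, semicolon). The function names, the exception class, and the `"v1"`/`"v2"` labels are illustrative, not part of any KLA library.

```python
import re

# Matches a header line such as "FileVersion 1 8;" (shape taken from the examples above).
_FILE_VERSION_RE = re.compile(r"^\s*FileVersion\s+(\d+)\s+(\d+)\s*;", re.MULTILINE)


class UnsupportedKlarfVersion(ValueError):
    """Raised when FileVersion is missing or names a version we do not handle."""


def detect_file_version(text: str) -> tuple[int, int]:
    """Return (major, minor) from the FileVersion header field."""
    match = _FILE_VERSION_RE.search(text)
    if match is None:
        raise UnsupportedKlarfVersion("no FileVersion field found in header")
    return int(match.group(1)), int(match.group(2))


def choose_parse_path(text: str) -> str:
    """Pick a parse path by major version; fail loudly on anything else."""
    major, minor = detect_file_version(text)
    if major == 1:
        return "v1"  # column order comes from ATTR_ORDER
    if major == 2:
        return "v2"  # column definitions come from DEFECT_ATTRIBUTE_LIST
    raise UnsupportedKlarfVersion(f"unsupported FileVersion {major}.{minor}")
```

Raising on an unknown or missing version is a deliberate choice here: an exception at ingestion is easier to notice than a wafer that quietly reports zero defects.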
Header Structure: Required Fields and Common Omissions
The KLARF header contains file-level metadata: lot ID, wafer ID, slot number, inspection date/time, inspection tool ID, setup name, and coordinate reference point. In a spec-compliant file, all of these are present and parseable. In production files from real tools, several patterns of non-compliance appear:
- Missing InspectionDate / InspectionTime: Some KLA tool configurations write these fields as empty strings or omit them entirely. If your pipeline uses the inspection timestamp for real-time alerting, you need a fallback. The usual choice is the file system modification time of the KLARF file itself, which is typically within a few seconds of the actual inspection completion time, as long as the file has not been copied or re-transferred in a way that resets its timestamp.
- Non-standard LotID formatting: The spec allows alphanumeric lot IDs. In practice, fabs use MES-generated lot ID formats that include hyphens, underscores, and in some cases Unicode characters in lot labels. If your parser expects only alphanumeric characters in the LotID field, it will truncate or misread lot IDs with special characters.
- WaferID vs. SlotNumber ambiguity: Version 1.8 files sometimes encode wafer identity as a slot number (1–25 in a standard cassette) rather than a wafer sequence ID. Version 2.0 files typically use an explicit WaferID string. When joining KLARF records to MES lot lineage, using slot number directly can cause join failures when the cassette load order doesn't match the MES wafer sequence.
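The first two fallbacks can be sketched as follows. The date/time layout (`MM-DD-YYYY` and `HH:MM:SS`) and the `LotID "...";` record shape are assumptions made for illustration; check them against files from your own tools. All function names are hypothetical.

```python
import os
import re
from datetime import datetime
from typing import Optional, Tuple

# Accepts quoted or unquoted lot IDs and any characters except '"' and ';',
# so hyphens, underscores, and non-ASCII characters survive intact.
_LOT_ID_RE = re.compile(r'^\s*LotID\s+"?(?P<lot>[^";]+?)"?\s*;', re.MULTILINE)


def parse_inspection_timestamp(date_str: Optional[str], time_str: Optional[str]) -> Optional[datetime]:
    """Parse the header date/time; return None when missing or malformed."""
    if not date_str or not time_str:
        return None
    try:
        # Assumed layout; adjust the format string to match your tools.
        return datetime.strptime(f"{date_str.strip()} {time_str.strip()}", "%m-%d-%Y %H:%M:%S")
    except ValueError:
        return None


def resolve_timestamp(date_str: Optional[str], time_str: Optional[str], klarf_path: str) -> Tuple[datetime, str]:
    """Return (timestamp, source), falling back to the file modification time."""
    parsed = parse_inspection_timestamp(date_str, time_str)
    if parsed is not None:
        return parsed, "header"
    return datetime.fromtimestamp(os.path.getmtime(klarf_path)), "file_mtime"


def parse_lot_id(header_text: str) -> Optional[str]:
    """Extract the lot ID without restricting it to alphanumeric characters."""
    match = _LOT_ID_RE.search(header_text)
    return match.group("lot").strip() if match else None
```

Returning the timestamp source alongside the value lets downstream alerting treat `file_mtime` timestamps as lower-confidence than header timestamps.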
Multi-Pass Inspection: RESULT_TYPE and the Ambiguous Defect Record
Multi-pass inspection is standard practice on advanced node patterned wafer inspectors. A single wafer goes through two or more sequential inspection algorithms — typically a high-sensitivity pass optimized for particles and a second pass optimized for pattern defects — and each pass produces its own defect list. These are written into the same KLARF file as separate SummarySpec and DEFECT_LIST blocks, distinguished by a RESULT_TYPE field.
The RESULT_TYPE field values are not standardized across tool generations. Common values include 0, 1, NUISANCE, REAL, and tool-specific strings defined in the setup file. A parser that doesn't preserve and route by RESULT_TYPE will either combine defect records from all passes (inflating the defect count and mixing nuisance with real defects) or silently take only the first block (discarding the second-pass results entirely).
For production defect classification pipelines, the correct behavior is to parse all RESULT_TYPE blocks, label each defect record with its source pass, and route the combined set to classification. The classifier then operates on the full defect population with pass-of-origin as a feature — because a particle defect detected only in the nuisance pass has a different kill probability than one flagged in both the high-sensitivity and real-defect passes.
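The labeling step might look like the sketch below. It assumes an earlier stage has already split the file into `(result_type, rows)` pairs, one per DEFECT_LIST block, with each row reduced to `(defect_id, x, y)`; that intermediate shape and all names here are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Defect:
    defect_id: int
    x_um: float
    y_um: float
    result_type: str  # pass of origin, kept verbatim from the file


def label_defects_by_pass(
    blocks: Iterable[Tuple[str, Sequence[Tuple[int, float, float]]]]
) -> List[Defect]:
    """Flatten all blocks into one list, tagging each defect with its RESULT_TYPE."""
    labeled = []
    for result_type, rows in blocks:
        # RESULT_TYPE values are not standardized, so keep them as strings
        # rather than mapping them onto a fixed enum.
        tag = str(result_type).strip()
        for defect_id, x, y in rows:
            labeled.append(Defect(int(defect_id), float(x), float(y), tag))
    return labeled


def count_by_pass(defects: Iterable[Defect]) -> Dict[str, int]:
    """Per-pass defect counts, useful as an ingestion sanity check."""
    return dict(Counter(d.result_type for d in defects))
```

Keeping `result_type` as an opaque string avoids dropping blocks whose values are tool-specific; any mapping to "nuisance" or "real" can then live in per-tool configuration.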
Sub-Die Coordinate System Conventions
KLARF defect coordinates are expressed in microns relative to the wafer center. The coordinate system is defined by the OrientationMark field, which specifies the orientation of the wafer flat or notch. Getting this wrong doesn't cause a parse failure — it causes a silent correctness failure where all defect positions are mirrored or rotated relative to the actual die layout.
The KLA convention for wafer orientation in 1.8 files places the origin at wafer center, with X increasing to the right and Y increasing downward — the same as screen coordinates. This is opposite to the standard cartesian convention used in most fab CAD tools, where Y increases upward. Converting KLARF coordinates to die-level positions for defect-to-layout overlay requires an explicit Y-axis flip. Missing this causes defect annotations to appear in the wrong die quadrant.
Version 2.0 files include an explicit CoordinateOrientation field that specifies the axis convention. Version 1.8 files typically don't, and you need to know the tool configuration to apply the correct transform.
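The Y-axis flip itself is a one-line transform; the harder part is knowing when to apply it. The sketch below assumes the caller already knows whether the source file uses the Y-down convention, either from the tool configuration (1.8) or from the CoordinateOrientation field (2.0). The function and parameter names are illustrative.

```python
from typing import Tuple


def klarf_to_cartesian(x_um: float, y_um: float, y_down: bool = True) -> Tuple[float, float]:
    """Convert a wafer-centered KLARF coordinate to a Y-up cartesian coordinate.

    y_down=True means the source uses the screen-style convention described
    above (Y increases downward), so Y is negated. X is unchanged either way.
    """
    return (x_um, -y_um) if y_down else (x_um, y_um)


# A defect 250 um "below" center in a Y-down file is at y = -250 um in a Y-up layout.
print(klarf_to_cartesian(100.0, 250.0))
```

Making `y_down` an explicit argument, rather than inferring it inside the function, keeps the per-tool assumption visible at the call site where it can be audited.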
Edge Cases That Cause Silent Pipeline Failures
Beyond version differences and coordinate conventions, several edge cases cause silent failures in production ingestion pipelines:
- File truncation during write: KLARF files are written incrementally by the inspection tool as each wafer is processed. If the write is interrupted before the defect list is complete (network interruption, tool pre-emption), the file may be syntactically valid up to a certain record and then stop. A parser that reads until EOF without checking for the required closing record will treat a truncated file as a complete one. Always verify that the expected closing record is present before committing the parse result.
- Zero-defect wafers: A wafer with no defects produces a KLARF file with an empty DEFECT_LIST block. Some parsers interpret an empty defect list as a parse failure rather than a valid zero-count result, discarding the record. Zero-defect wafers are important data points for spatial pattern baselines and should be preserved.
- Repeated wafer records in a single file: Some tool configurations write multiple wafer records into a single KLARF file, one per wafer in a batch inspection run. Parsers that expect one wafer record per file will process only the first wafer's data and discard the rest. If your lot covers 25 wafers and your pipeline is only recording data for wafer 1, this is the likely cause.
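The three cases above can be handled with two small checks, sketched here under stated assumptions: the closing record is taken to be a final `EndOfFile;` line, and each wafer record is taken to start at a line beginning with `WaferID`. Both are assumptions to verify against your own files, and all names are hypothetical.

```python
from enum import Enum
from typing import List, Tuple


class ParseStatus(Enum):
    OK = "ok"
    ZERO_DEFECTS = "zero_defects"  # valid result, must be kept
    TRUNCATED = "truncated"        # must not be committed


def classify_parse(text: str, defect_count: int, closing_record: str = "EndOfFile;") -> ParseStatus:
    """Separate truncated files from valid zero-defect files."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Check the closing record first: a truncated file may also show zero defects.
    if not lines or lines[-1] != closing_record:
        return ParseStatus.TRUNCATED
    if defect_count == 0:
        return ParseStatus.ZERO_DEFECTS
    return ParseStatus.OK


def split_wafer_records(text: str) -> Tuple[List[str], List[List[str]]]:
    """Split a file into (preamble lines, one list of lines per wafer record)."""
    preamble: List[str] = []
    records: List[List[str]] = []
    for line in text.splitlines():
        if line.strip().startswith("WaferID"):
            records.append([line])       # a new wafer record starts here
        elif records:
            records[-1].append(line)     # belongs to the current wafer
        else:
            preamble.append(line)        # file-level header before any wafer
    return preamble, records
```

Checking for truncation before checking the defect count matters: a file cut off just after an empty defect list would otherwise be recorded as a clean wafer.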
The pattern we see most consistently in pipeline audits is that KLARF parsers are tested against a small set of well-formed sample files and then deployed to production, where the actual file population includes every edge case listed above. Investing in an exhaustive test corpus that includes truncated files, zero-defect wafers, multi-wafer batches, and version 1.8/2.0 examples before deployment saves significant debugging time.
The KLARF format isn't going away — it's too embedded in the KLA tool ecosystem and too widely referenced in MES integrations to be replaced in the near term. Building a reliable parser means explicitly handling each of these cases rather than assuming the file population will be spec-compliant. In our experience, at most 70% of production KLARF files from older tool configurations are fully spec-compliant. The rest require one or more of the fallback behaviors described here.