Building a Noise Characterization Pipeline for QEC

How QECSync approaches per-device noise characterization: from the randomized benchmarking that produces per-qubit gate error rates to the way that data flows into the minimum-weight perfect matching (MWPM) edge weight matrix and improves logical error suppression.

Figure: Noise characterization pipeline workflow, from calibration data to decoder weights.

The central claim in QECSync's design is that decoder performance improves when the decoder knows your hardware's specific noise structure — not a generic depolarizing noise model, but the actual per-qubit relaxation times, per-gate fidelities, and readout errors measured on your device before your experiment runs.

This article describes how we build that per-device noise characterization in practice: what measurements you need to perform, what parameters matter most for decoder performance, and how QECSync converts those parameters into an edge weight matrix for the MWPM syndrome graph.

What the decoder actually needs

The MWPM decoder works on a syndrome graph in which each edge carries a weight derived from the log-likelihood of the corresponding error event: the less probable the error, the larger the weight. The decoder finds the minimum-weight perfect matching, that is, the set of edges with minimum total weight that pairs all violated stabilizers. If the edge weights accurately reflect the actual error probabilities on your device, this matching corresponds to the most probable set of independent error events consistent with the observed syndrome.
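To make the link between probabilities and weights concrete, one common convention is written out below. It is an assumption for illustration, since this article does not state the exact formula QECSync uses. Here p_e is the probability of the error event on edge e and M is a candidate matching:

```latex
w_e = -\ln\frac{p_e}{1 - p_e} \;\approx\; -\ln p_e \quad (p_e \ll 1),
\qquad
\sum_{e \in M} w_e = -\ln \prod_{e \in M} \frac{p_e}{1 - p_e}
```

Under this convention, minimizing the total weight of the matching is the same as maximizing the product of the odds of the selected error events, which is why accurate per-edge probabilities matter.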

There are three classes of error that contribute to surface-code syndrome events:

  1. Gate errors. Two-qubit CNOT errors during stabilizer extraction create data errors on the qubits participating in the gate. Single-qubit gate errors contribute at a lower rate.
  2. Idle errors. Qubits that sit idle during syndrome cycles accumulate decoherence errors with a probability that grows roughly in proportion to the ratio of idle time to T2. In a staggered syndrome schedule, some qubits idle for 1–3 gate steps between their own CNOT operations.
  3. Measurement errors. Ancilla qubit readout has an assignment error rate — the probability that a |0⟩ ancilla is read as 1, or vice versa. Measurement errors produce transient syndrome anomalies that must be handled via temporal redundancy.

Each of these contributes a different type of edge in the syndrome graph. Gate errors produce spatial edges (connecting adjacent syndrome positions). Measurement errors produce temporal edges (connecting the same syndrome position across time steps). Idle errors produce edges that mix spatial and temporal structure depending on when in the cycle the idle time occurs.

Step 1: Randomized benchmarking for gate error rates

Randomized benchmarking (RB) is the standard method for characterizing gate error rates in a way that is insensitive to state preparation and measurement (SPAM) errors. The procedure: apply increasingly long random sequences of Clifford gates, append the recovery gate that ideally inverts the whole sequence, and measure the probability of returning to the initial state. That probability decays exponentially with sequence length, and the decay constant gives the average fidelity per Clifford gate.
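The extraction of an error rate from the RB decay can be sketched as follows. This is a minimal illustration, not QECSync code: the function names are hypothetical, the data is synthetic and noise-free, and the asymptote is assumed known (0.5 for a single qubit), whereas a real analysis would usually fit all three parameters of A * p**m + B.

```python
import math

def fit_decay_constant(lengths, survival, asymptote=0.5):
    """Fit survival(m) = A * p**m + asymptote and return p.

    Assumes the asymptote is known, so ln(survival - asymptote) is linear
    in the sequence length m with slope ln(p) (ordinary least squares).
    """
    ys = [math.log(s - asymptote) for s in survival]
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, ys))
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    return math.exp(sxy / sxx)

def error_per_clifford(p, n_qubits=1):
    """Average error per Clifford: r = (d - 1) / d * (1 - p), with d = 2**n."""
    d = 2 ** n_qubits
    return (d - 1) / d * (1.0 - p)

# Synthetic data generated with p = 0.998 (illustrative values only).
lengths = [1, 10, 50, 100, 200, 400]
survival = [0.5 * 0.998 ** m + 0.5 for m in lengths]

p_fit = fit_decay_constant(lengths, survival)
r_clifford = error_per_clifford(p_fit)
```

The SPAM insensitivity comes from the fit itself: preparation and measurement errors change A and the asymptote, but not the decay constant p.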

For single-qubit gates, interleaved RB measures the error rate of a specific gate (H, X/2, Z/2) by interleaving it between random Clifford sequences and comparing to the reference decay rate. Single-qubit gate errors on superconducting devices typically fall in the 0.01–0.1% range.
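For reference, the standard interleaved RB estimate of a single gate's error rate compares the decay constant of the interleaved sequences, p_int, with that of the reference sequences, p_ref, for an n-qubit gate:

```latex
r_{\text{gate}} = \frac{d - 1}{d}\left(1 - \frac{p_{\text{int}}}{p_{\text{ref}}}\right),
\qquad d = 2^{n}
```

As an illustration with made-up numbers, a single-qubit gate (d = 2) with p_ref = 0.9980 and p_int = 0.9975 gives r_gate ≈ 2.5×10⁻⁴, or about 0.025%, inside the range quoted above.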

For two-qubit gates, simultaneous RB on all gate pairs in the syndrome circuit measures the gate error rates under realistic crosstalk conditions. This is important: measuring two-qubit gate fidelity in isolation often gives optimistic numbers compared to the error rate when multiple gates run in parallel, as they do during a syndrome cycle.

QECSync's calibration pipeline accepts RB output in standard JSON format: a per-gate error probability table keyed by qubit pair index. This directly populates the gate-error terms in the edge weight matrix.

Step 2: T1 and T2 for idle error characterization

T1 (energy relaxation time) and T2 (dephasing time) characterize how quickly qubits decohere while idle. A Ramsey sequence measures the free-induction decay time T2*, while a Hahn echo sequence measures the echo-corrected T2. For surface-code decoder weight computation, T2* is often the more relevant of the two, because the syndrome circuit does not include refocusing pulses during qubit idle periods.

The idle error probability for a qubit idling for time τ is approximately 1 − exp(−τ/T2*). At T2* = 100 µs and a syndrome step time of 200 ns, the idle error probability per step is approximately 2×10⁻³ — comparable to a typical two-qubit gate error rate. At T2* = 50 µs, it doubles. At T2* = 300 µs, it drops to 6.7×10⁻⁴.

The syndrome circuit schedule determines how long each qubit idles and when. QECSync's calibration module takes the per-qubit T1/T2 values and the syndrome circuit timing to compute per-idle-step error probabilities for each qubit in the code.
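A minimal sketch of this computation, using the 1 − exp(−τ/T2*) approximation from above. The function names, qubit labels, and the dictionary-based schedule are hypothetical and chosen for illustration; they are not the interface of QECSync's calibration module.

```python
import math

def idle_error_probability(idle_time_s, t2_star_s):
    """Approximate idle error probability for one idle period of length tau."""
    return 1.0 - math.exp(-idle_time_s / t2_star_s)

def per_qubit_idle_errors(t2_star_s, idle_steps, step_time_s):
    """Map each qubit to its idle error probability per syndrome cycle.

    t2_star_s  : dict of qubit label -> T2* in seconds
    idle_steps : dict of qubit label -> number of idle gate steps per cycle
    """
    return {
        q: idle_error_probability(idle_steps[q] * step_time_s, t2_star_s[q])
        for q in idle_steps
    }

STEP_TIME_S = 200e-9  # 200 ns syndrome step, as in the text

# Hypothetical device data: three qubits with different T2* and idle counts.
t2_star_s = {"q0": 100e-6, "q1": 50e-6, "q2": 300e-6}
idle_steps = {"q0": 1, "q1": 1, "q2": 3}

idle_errors = per_qubit_idle_errors(t2_star_s, idle_steps, STEP_TIME_S)
```

In this example q2 has the longest T2* but idles for three steps, so its idle error per cycle ends up about the same as that of q0. This is why the schedule matters as much as the coherence times.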

Step 3: Readout assignment error matrix

Ancilla qubit readout is the single largest non-gate error source in most current superconducting devices. Assignment errors — reading |0⟩ as 1, or |1⟩ as 0 — are not symmetric; typically, P(1|0) ≠ P(0|1), and the rates vary significantly across ancilla qubits.

The readout assignment error matrix is measured by preparing each ancilla qubit in |0⟩ and |1⟩ and measuring 1000+ times to get the error probabilities empirically. This 2×2 matrix per ancilla qubit feeds into the temporal edge weights of the syndrome graph — a high-readout-error ancilla generates syndrome events that are more likely to be measurement errors than data errors, and the decoder should weight its temporal edges accordingly.
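The estimation step can be sketched as below. The function names and the count layout are hypothetical. Averaging the two error rates into one measurement-error probability is a simplifying assumption (both ancilla outcomes equally likely), not a statement of how QECSync treats the asymmetry.

```python
import math

def assignment_matrix(counts_prep0, counts_prep1):
    """Estimate the 2x2 assignment matrix M[read][prepared] from shot counts.

    Each argument is a tuple (n_read_0, n_read_1) for one prepared state.
    """
    p10 = counts_prep0[1] / sum(counts_prep0)  # P(read 1 | prepared 0)
    p01 = counts_prep1[0] / sum(counts_prep1)  # P(read 0 | prepared 1)
    return [[1.0 - p10, p01],
            [p10, 1.0 - p01]]

def temporal_edge_weight(matrix):
    """Negative log-probability of a measurement error on this ancilla.

    Assumption: the two asymmetric rates are averaged into one probability.
    """
    q = 0.5 * (matrix[1][0] + matrix[0][1])
    return -math.log(q)

# Illustrative counts from 1000 shots per prepared state.
matrix = assignment_matrix(counts_prep0=(980, 20), counts_prep1=(50, 950))
weight = temporal_edge_weight(matrix)
```

A noisier ancilla gives a larger error probability and therefore a smaller temporal weight, so the decoder is more willing to explain its syndrome flips as measurement errors.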

Step 4: Crosstalk characterization (where available)

ZZ-coupling (always-on qubit-qubit interaction) is a form of crosstalk specific to superconducting transmon devices. When qubit A and qubit B are coupled, the presence of qubit A in |1⟩ shifts the transition frequency of qubit B, leading to correlated phase errors during idle periods. This produces spatially correlated error events — adjacent qubits are more likely to both have errors than independent error probabilities would predict.

ZZ coupling maps — measured by preparing one qubit in |1⟩ and performing Ramsey spectroscopy on its neighbor — give per-pair coupling strengths in kHz. QECSync converts these to correlated edge weights in the syndrome graph. Edges between syndrome positions near high-ZZ pairs get modified weights that account for the correlation structure.
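As a rough sense of scale, a measured coupling strength can be turned into a phase-error probability for one idle period. The sketch below is an assumption-laden illustration, not QECSync's conversion: it takes the coupling strength to be the full frequency shift of the target qubit when its neighbor is in |1⟩, assumes no echo or other refocusing, and uses the standard result that a coherent Z rotation by angle φ acts, after Pauli twirling, as a phase flip with probability sin²(φ/2).

```python
import math

def zz_phase_error_probability(zz_hz, idle_time_s):
    """Phase-flip probability from an unrefocused ZZ frequency shift.

    zz_hz       : frequency shift of the target qubit in Hz (assumed convention)
    idle_time_s : idle duration in seconds
    """
    phase = 2.0 * math.pi * zz_hz * idle_time_s  # accumulated phase in radians
    return math.sin(phase / 2.0) ** 2

# Illustrative values: 100 kHz coupling over one 200 ns syndrome step.
p_zz = zz_phase_error_probability(100e3, 200e-9)
```

With these illustrative numbers the result is about 4×10⁻³, so on a device with couplings of this size the correlated contribution would not be negligible next to the idle and gate errors discussed earlier.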

Crosstalk characterization is not available on all platforms and is not required for basic QECSync operation. It is an optional input that improves decoder performance on devices where ZZ coupling is significant.

From calibration data to edge weight matrix

QECSync's NoiseModel object aggregates all the above inputs and exposes a single method: compute_edge_weights(syndrome_graph). This returns the weight matrix used by the MWPM decoder.

The edge weight for a spatial edge connecting two syndrome positions is computed as the negative log-probability of the most likely error chain between those positions, given the per-gate and per-idle error rates of the qubits in that chain. The weight for a temporal edge at an ancilla position is the negative log-probability of a measurement error at that ancilla, given its measured assignment error rate.
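The arithmetic behind a single edge can be sketched as follows. The helper names are hypothetical and this is not the body of compute_edge_weights. It assumes that the gate and idle error mechanisms acting on one edge are independent, so they combine as the probability that an odd number of them occur.

```python
import math

def combine_independent(probabilities):
    """Probability that an odd number of independent error mechanisms fire."""
    p = 0.0
    for q in probabilities:
        p = p * (1.0 - q) + q * (1.0 - p)
    return p

def edge_weight(probability):
    """Negative log-probability weight, as described in the text."""
    return -math.log(probability)

# Illustrative spatial edge: one CNOT error mechanism plus one idle step,
# each with probability 2e-3, acting on the same data qubit.
p_spatial = combine_independent([2e-3, 2e-3])
w_spatial = edge_weight(p_spatial)

# The same edge on a better qubit: lower error probability, higher weight.
w_better = edge_weight(combine_independent([5e-4, 7e-4]))
```

The higher weight on the better qubit means the decoder needs stronger evidence before it places an error there. This is the opposite of what uniform weights would do, since they treat both qubits alike.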

This is the same probabilistic framework described in Fowler et al. (2012), instantiated with device-specific rather than uniform error parameters. The result is a decoder that, in expectation, makes better matching decisions than a decoder using heuristic or uniform weights — specifically by preferring to route corrections through low-error qubits and treating high-readout-error ancilla syndrome events with appropriate skepticism.

Calibration drift and re-calibration frequency

Gate fidelities and T2 times on superconducting devices drift on timescales of hours to days, driven by two-level system (TLS) fluctuators, flux noise, and thermal instability. A calibration performed 24 hours before an experiment may not accurately reflect the device state during the experiment.

For best results, calibration should be performed within the same session as the QEC experiment — ideally the same cooldown cycle or within 2–4 hours of the QEC data collection. QECSync's calibration pipeline is designed to run in under 30 minutes for a standard d=5 configuration, making same-session calibration practical.
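A simple guard against stale calibration data could look like the sketch below. The function name and the 4-hour threshold are illustrative choices based on the window mentioned above, not a QECSync feature.

```python
from datetime import datetime, timedelta, timezone

# Upper end of the 2-4 hour window suggested in the text (illustrative choice).
MAX_CALIBRATION_AGE = timedelta(hours=4)

def calibration_is_fresh(calibrated_at, experiment_start,
                         max_age=MAX_CALIBRATION_AGE):
    """Return True if the calibration precedes the experiment by at most max_age."""
    age = experiment_start - calibrated_at
    return timedelta(0) <= age <= max_age

# Example timestamps (made up for illustration).
start = datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc)
same_session = calibration_is_fresh(start - timedelta(hours=3), start)
day_old = calibration_is_fresh(start - timedelta(hours=24), start)
```

Here same_session is True and day_old is False, matching the recommendation that a 24-hour-old calibration should not be trusted for the current run.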

For the detailed calibration data format and the JSON schema QECSync accepts, see the hardware integration guide. The API reference describes the NoiseModel class and its loading methods.