8 Correlation Trigger
nyanpasu64 edited this page 2018-12-05 03:53:01 -08:00

Setup

self._data_taper (const) = self._calc_data_taper()

  • Multiplied by each frame of input audio.
  • Zeroes out all data more than 1 frame old. Leaves all data at t=present and future untouched.
    • Take the left half of a Hann cosine taper.
    • Right-pad the taper with 1s until it is 1 frame long (t-1f to t).
    • Left-pad the taper with 0s to halfN samples (t-halfN to t), then right-pad with 1s to N samples (t-halfN to t-halfN+N).
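
The padding steps above can be sketched in plain Python. This is a minimal illustration, not the actual self._calc_data_taper(): the function name and the parameters n, frame_len and taper_len are assumptions made for the example, and lists are used instead of arrays to keep it dependency-free.

```python
import math


def calc_data_taper(n, frame_len, taper_len):
    """Build a taper over N samples covering t-halfN to t-halfN+N.

    All three parameters are hypothetical names for this sketch:
    n is the buffer length, frame_len the number of samples per frame,
    taper_len the length of the rising Hann half.
    Assumes taper_len <= frame_len <= n // 2.
    """
    half_n = n // 2
    # Left half of a Hann window: rises smoothly from 0 towards 1.
    rising = [0.5 - 0.5 * math.cos(math.pi * i / taper_len) for i in range(taper_len)]
    # Right-pad with 1s so the taper is one frame long (t-1f to t).
    one_frame = rising + [1.0] * (frame_len - taper_len)
    # Left-pad with 0s to half_n samples (t-halfN to t).
    past = [0.0] * (half_n - frame_len) + one_frame
    # Right-pad with 1s to n samples: present and future data stay untouched.
    return past + [1.0] * (n - half_n)
```

Multiplying a frame of input by this list silences everything more than one frame in the past and leaves the right half of the buffer unchanged.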

self._buffer (mutable) = [0...]

  • Correlated with data (for triggering).
  • Updated with tightly windowed old data at various pitches

self._windowed_step (const) = self._calc_step()

  • Added to self._buffer before correlation. Nonzero only if the edge-triggering strength is nonzero. Not a data window.
    • Left half is -0.? to -edge_strength, right half is +edge_strength to +0.?.
    • ASCII art: --._|‾'--
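
One way to get the shape in the ASCII art is a sign step scaled by edge_strength and faded towards both ends. The sketch below assumes a Gaussian fade; the real shape of self._calc_step() may differ, and calc_step and sigma_frac are hypothetical names used only for illustration.

```python
import math


def calc_step(n, edge_strength, sigma_frac=0.25):
    """Step from negative to positive at the buffer center, fading at both ends.

    Assumption: the fade is a Gaussian whose sigma is sigma_frac * n samples.
    """
    half_n = n // 2
    step = []
    for i in range(n):
        sign = -1.0 if i < half_n else 1.0
        # Distance from the edge, which sits between samples half_n-1 and half_n.
        x = (i - (half_n - 0.5)) / (sigma_frac * n)
        step.append(sign * edge_strength * math.exp(-0.5 * x * x))
    return step
```

With edge_strength set to 0 the result is all zeros, so adding it to the buffer changes nothing.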

Per Frame

  • Get data.
  • Calculate, save, and subtract mean.
  • get_period(): Estimate and save the fundamental period (performs autocorrelation).
  • Compute falloff_window (width proportional to period).
    • This ensures we don't travel more than ±2 wavelengths or so, searching for a match.
    • Computationally expensive, so the result is cached and reused while the period changes by <=1 semitone (the details are unimportant).
  • Multiply data by min(falloff_window, self._data_taper).
    • This ensures we don't travel more than ±2 wavelengths or 1 frame back in time (whichever is narrower).
  • Cross-correlate with self._buffer + self._windowed_step.
  • Find peak (slightly simplified).
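
The per-frame steps above can be sketched roughly as follows. This is a simplified, dependency-free illustration rather than the real implementation: estimate_period, cross_correlate and find_trigger are hypothetical names, the period estimate uses a naive "first peak after the autocorrelation goes negative" rule, the peak search is a plain argmax, and normalization is left out.

```python
def subtract_mean(data):
    """Return (mean, data with the mean removed)."""
    mean = sum(data) / len(data)
    return mean, [x - mean for x in data]


def estimate_period(data):
    """Autocorrelation-based period estimate in samples; 0 if none is found."""
    n = len(data)
    # autocorr[lag] = sum_i data[i] * data[i + lag], for lag = 0 .. n // 2
    autocorr = [
        sum(data[i] * data[i + lag] for i in range(n - lag))
        for lag in range(n // 2 + 1)
    ]
    # Skip the lag-0 peak: start searching once the autocorrelation goes negative.
    start = next((lag for lag, r in enumerate(autocorr) if r < 0), None)
    if start is None:
        return 0
    return max(range(start, len(autocorr)), key=lambda lag: autocorr[lag])


def cross_correlate(data, kernel):
    """Full cross-correlation: values[k] = sum_i data[i + lags[k]] * kernel[i].

    Samples of data outside its range are treated as 0.
    """
    n = len(data)
    lags = list(range(-(n - 1), n))
    values = []
    for lag in lags:
        total = 0.0
        for i, k in enumerate(kernel):
            j = i + lag
            if 0 <= j < n:
                total += data[j] * k
        values.append(total)
    return lags, values


def find_trigger(data, falloff_window, data_taper, buffer, step):
    """Return (lag of the best match, mean of data). All inputs have equal length."""
    mean, centered = subtract_mean(data)
    # Use the narrower of the two windows at every sample.
    window = [min(f, t) for f, t in zip(falloff_window, data_taper)]
    windowed = [x * w for x, w in zip(centered, window)]
    kernel = [b + s for b, s in zip(buffer, step)]
    lags, values = cross_correlate(windowed, kernel)
    peak = max(range(len(values)), key=lambda k: values[k])
    return lags[peak], mean
```

The returned lag says how far the new data must be shifted to line up with the history buffer; the saved mean is reused when the buffer is updated after triggering.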

Per Frame (after trigger)

  • Get new data around peak. Subtract mean (saved above).
  • Multiply data by a Gaussian window (width proportional to the period saved above).
  • self._buffer = lerp(self._buffer, data)
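
A minimal sketch of this update, assuming a Gaussian whose sigma is width_frac * period and a lerp factor called responsiveness (both names and defaults are assumptions for this example). The peak-amplitude normalization is likewise only one plausible reading of "normalized by amplitude"; the real code may normalize differently.

```python
import math


def gaussian_window(n, sigma):
    """Gaussian centered on the middle of an n-sample buffer."""
    center = (n - 1) / 2
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]


def update_buffer(buffer, data, mean, period, responsiveness=0.5, width_frac=0.5):
    """Blend tightly windowed new data into the history buffer.

    Assumes period > 0 and len(buffer) == len(data).
    """
    window = gaussian_window(len(data), sigma=width_frac * period)
    # Subtract the mean saved during triggering, then window tightly.
    windowed = [(x - mean) * w for x, w in zip(data, window)]
    # Normalize by peak amplitude (assumption), skipping silent input.
    peak = max(abs(x) for x in windowed)
    if peak > 0:
        windowed = [x / peak for x in windowed]
    # lerp: move each buffer sample a fraction of the way towards the new data.
    return [b + responsiveness * (d - b) for b, d in zip(buffer, windowed)]
```

A responsiveness of 0 leaves the buffer unchanged, and 1 replaces it with the newest windowed data.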

Some normalization details are omitted here; see the code.

Design

Contributions to history buffer are tightly windowed (to remain nearly in-phase despite pitch changes). Consequently, when performing correlation, new data is gently windowed so it can slide past the history buffer.

Each contribution to buffer is windowed by pitch and normalized by amplitude, not accumulated power. So long-period bass contributes disproportionately to cross-correlation. (Though AFAIK this does not cause issues in practice.)
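
To illustrate the last point, the sketch below compares the energy (sum of squares) of two unit-amplitude contributions whose Gaussian window width scales with the period. The function name, the buffer length and width_frac are assumptions chosen for the example, not values from the code.

```python
import math


def contribution_energy(period, n=1025, width_frac=0.5):
    """Sum of squares of one unit-amplitude, Gaussian-windowed contribution."""
    center = n // 2
    sigma = width_frac * period  # window width proportional to period
    total = 0.0
    for i in range(n):
        x = i - center
        sample = math.cos(2 * math.pi * x / period) * math.exp(-0.5 * (x / sigma) ** 2)
        total += sample * sample
    return total


# Both contributions peak at amplitude 1, but the longer period carries more energy.
ratio = contribution_energy(64) / contribution_energy(16)
```

Under these assumptions the energy grows roughly in proportion to the period, so a contribution with 4 times the period carries about 4 times the energy even though both have the same peak amplitude.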