Privacy·2026-04-02·7 min read·Punkto Team

Zero audio retention — what it actually means, and how we built it

Most "private" meeting tools still store your call audio. Punkto does not — by architecture, not policy. Here is the engineering case for zero audio retention, and exactly how the pipeline works.

Almost every “private” meeting tool stores your call audio. It might be encrypted, region-locked, retention-limited — but it sits on a disk somewhere, and a court order, a misconfiguration, or a compromised employee could surface it.

Punkto does not store your audio. Not because we promise we won't — because the system architecturally cannot. Here is what that means, why it matters, and how we built it.

What “zero retention” means in software terms

The word “retention” in privacy contexts usually means “how long we keep it.” Zero retention means: the time interval between “data exists” and “data is destroyed” is the duration of a single request handler — typically 5 to 60 seconds. After that interval, the data has no representation, anywhere in our system.

Concretely, for an audio file uploaded to our transcription API:

  1. The audio arrives as a multipart upload, held in a memory buffer.
  2. The buffer is streamed to our speech-to-text provider over TLS.
  3. Text comes back. The audio buffer is dereferenced.
  4. The Node.js garbage collector reclaims the memory.
  5. The transcript and AI summary are written to the database. The audio is not.

At no point does the audio touch a disk, a backup, a cache, or a log. There is no fs.writeFile. There is no s3.putObject. There is, by code, no audio storage path.
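The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our production handler: `transcribeInMemory` and the `SpeechToText` signature are hypothetical names, and the multipart upload is modelled as a plain async iterable of byte chunks.

```typescript
// Hypothetical signature for the speech-to-text call; the real provider API
// is not shown in this article.
type SpeechToText = (audio: Buffer) => Promise<string>;

// Steps 1-4: collect the upload in memory, transcribe it, return text only.
async function transcribeInMemory(
  upload: AsyncIterable<Uint8Array>,
  speechToText: SpeechToText,
): Promise<string> {
  const chunks: Uint8Array[] = [];
  for await (const chunk of upload) {
    chunks.push(chunk); // held in process memory only
  }
  const audio = Buffer.concat(chunks);

  // No fs.writeFile, no s3.putObject: the buffer is only passed to the provider.
  const transcript = await speechToText(audio);

  // `chunks` and `audio` go out of scope on return. Once nothing references
  // them, the garbage collector is free to reclaim the memory.
  return transcript;
}
```

The function returns only text, so a caller has nothing audio-shaped to persist even by accident. Note that this holds only if the injected `speechToText` function does not keep its own reference to the buffer.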

Why this matters

Voice is biometric data

Under GDPR Recital 51 and Article 9, voice qualifies as biometric data when processed for identification. That triggers the “special category” regime: explicit consent requirements, stricter lawful basis tests, automatic DPIA in many cases. Storing voice raises the compliance bar significantly. Not storing it side-steps the regime entirely.

The least valuable thing to keep

Once you have a clean transcript, the audio adds almost nothing. You cannot search it. You cannot skim it. You cannot share it efficiently. The 90% of users who would want to revisit a meeting revisit the transcript or summary, not the audio.

Meanwhile, the audio is the most sensitive artifact of the entire call. It is biometric. It captures tone. It captures the accidental side-comments. It is what an adversary would most want.

Storing the lowest-value highest-risk artifact is a bad trade. We just do not make it.

Subpoena resistance

A court order can compel us to produce data we have. It cannot compel us to produce data we do not have. By not storing audio, we close an entire surface of legal exposure for our customers — and for ourselves.

Breach impact

Imagine a worst-case breach where every database in our system is dumped. What gets out? Email addresses, hashed passwords, transcripts, summaries, action items. Painful. What does not get out: any voice recording, ever, of any meeting we have ever processed.

How we built it

The architecture is two pages of code and three structural decisions.

Decision 1: nothing ever writes audio_path

Our board_recordings table has columns for transcript, summary, duration, created_at. The zero-retention pipeline has no use for an audio file path, and no code path writes one.

(The schema still contains a nullable audio_path column, kept for backward compatibility and for an opt-in retention mode we may offer to Enterprise customers. Its default value is, and stays, NULL.)
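One way to make that default hard to violate in application code is to give inserts a type with no audio field at all. The sketch below assumes a TypeScript data layer; `RecordingInsert` and `buildRecordingInsert` are hypothetical names, and the field types are guesses based on the column names above.

```typescript
// Hypothetical insert shape for board_recordings. Column names follow the
// article; the TypeScript types and units are assumptions.
interface RecordingInsert {
  transcript: string;
  summary: string;
  duration: number; // seconds (assumed unit)
  created_at: string; // ISO 8601 timestamp
}

function buildRecordingInsert(
  transcript: string,
  summary: string,
  duration: number,
  now: Date = new Date(),
): RecordingInsert {
  // There is no audio parameter, so a caller cannot pass audio bytes or a
  // file path through this function.
  return { transcript, summary, duration, created_at: now.toISOString() };
}
```

With a shape like this, adding an audio field would need a visible type change that shows up in code review, not a quiet extra property on an untyped object.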

Decision 2: the API handler does not write to storage

Our POST /api/transcript endpoint:

  • Reads the multipart audio into a Node.js Buffer.
  • Sends the buffer to the speech-to-text API.
  • Receives the transcript text.
  • Sends the transcript to the LLM API for summary generation.
  • Inserts a row in board_recordings with the transcript and summary — no audio_path.
  • Returns the response. The audio buffer goes out of scope. Garbage collection reclaims it.

The handler is intentionally short and intentionally simple. There is no “save audio for later” branch. There is no fallback that uploads on transcription failure. The audio is processed, or it is not — but it is never persisted.
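To make the shape of that handler concrete, here is a hedged sketch with the external services injected as plain functions. `handleTranscript`, `Deps`, and the three dependency signatures are hypothetical stand-ins, not the real provider or database clients.

```typescript
// All three dependencies are hypothetical stand-ins for the real
// speech-to-text, LLM, and database clients.
interface Deps {
  speechToText: (audio: Buffer) => Promise<string>;
  summarize: (transcript: string) => Promise<string>;
  insertRecording: (row: { transcript: string; summary: string }) => Promise<void>;
}

async function handleTranscript(
  audio: Buffer,
  deps: Deps,
): Promise<{ transcript: string; summary: string }> {
  // The audio is passed to exactly one place: the speech-to-text call.
  const transcript = await deps.speechToText(audio);
  const summary = await deps.summarize(transcript);

  // Only text is persisted. The row type has no audio field.
  await deps.insertRecording({ transcript, summary });

  // There is deliberately no try/catch that saves the audio for a retry:
  // if any step throws, the error propagates and the buffer is dropped.
  return { transcript, summary };
}
```

Injecting the dependencies keeps the sketch testable with fakes: a test can assert that the inserted row contains only text, and that a transcription failure results in no insert at all.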

Decision 3: pre-existing audio purged

Earlier versions of Punkto (before 2026-04) did briefly store audio. When we shipped zero retention, we did not just stop writing — we purged every existing audio file from storage. The bucket is empty. We have an internal audit log confirming the purge.

What we cannot do because of this

Honesty section. Zero retention has trade-offs:

  • You cannot replay a meeting. If you wanted to listen back to confirm something, you cannot. Read the transcript instead.
  • We cannot regenerate the transcript. If our transcription was buggy, we cannot re-run it on better software. The transcript you got at the time is the transcript you have.
  • You cannot “export the audio”. There is nothing to export. Some competitors offer .mp3 export of past meetings. We do not, by design.
  • Forensics and dispute resolution are harder. If a participant disputes what they said, we cannot prove it with audio. We can only show the transcript and timestamps.

For most meetings, these are acceptable losses. For some — financial advice recording, regulated compliance audits, legally-binding negotiations — they are deal-breakers. Those use cases need a different tool, or they need our planned opt-in audio retention mode (Enterprise, customer-held keys, explicit retention window).

How to verify our claims

“Trust us” is not a privacy policy. Things you can do to verify:

  1. Read the transcript route handler. The code is in our repository. The audio path for that handler does not include any persistence call.
  2. Read our DPA. The data-flow diagram explicitly excludes audio from the subprocessor data list.
  3. Submit a data subject access request. Ask for everything we have on you. The response will not include audio, because there is none.
  4. Check the storage bucket policy. Our recordings bucket allows only “list and delete on session end.” No insert. No upload. The bucket exists only for legacy compatibility and is empty.
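For the first check, a reader could start with a crude textual scan of the handler source. The sketch below is an assumption-laden illustration: `findPersistenceCalls` is a hypothetical helper, and only `fs.writeFile` and `s3.putObject` in the list come from this article; the other entries are added guesses.

```typescript
// Call names to flag. The first two are named in the article; the others
// are assumed additions for illustration.
const PERSISTENCE_CALLS = ["fs.writeFile", "s3.putObject", "fs.appendFile", "createWriteStream"];

interface Finding {
  line: number; // 1-based line number in the scanned source
  call: string; // which entry of PERSISTENCE_CALLS matched
  text: string; // the trimmed source line
}

// Return every line of `source` that mentions a known persistence call.
function findPersistenceCalls(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const call of PERSISTENCE_CALLS) {
      if (text.includes(call)) {
        findings.push({ line: index + 1, call, text: text.trim() });
      }
    }
  });
  return findings;
}
```

A substring scan like this is a first pass, not proof: it misses aliased imports and writes made through helper modules, so an empty result still needs a human read of the handler and everything it calls.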

The principle behind it

Data minimization (GDPR Article 5(1)(c)) says you should collect and keep only what you need. Storage is not a feature — it is a debt. Every byte you store is a byte you have to defend, encrypt, back up, comply with, and eventually delete.

For audio, the calculation is simple: the value of having it is small (the transcript covers 90% of use cases), and the risk of having it is large (biometric data, breach exposure, subpoena surface). The right retention period is zero.

That is the principle. The pipeline above is the implementation. Both are public.


Try it. Punkto runs zero-retention transcription on every recorded session. EU-hosted, EU jurisdiction, free for 3 transcripts per month — no credit card.

Frequently asked questions

What does "zero audio retention" mean exactly?

It means the call audio is never written to persistent storage. Audio is held in memory while it is being transcribed, then the buffer is released at the end of the request. Once the request completes, typically within 5 to 60 seconds, no copy of your call audio exists anywhere in our system.

How is this different from "encrypted at rest"?

Encryption at rest means the audio is stored, encrypted with a vendor-held key. The vendor can decrypt it when subpoenaed or hacked. Zero retention means there is nothing to decrypt — the audio simply does not exist after transcription.

What about backups? Are they really cleared?

Audio buffers exist only in process memory of the transcription handler. They are never written to a database, file system, S3 bucket, or backup volume. There is no scheduled backup of audio because there is no audio storage.

Can you replay a meeting if I ask you to?

No. Once the transcription request completes, the audio is gone. We can show you the transcript and AI summary that were generated. We cannot regenerate them or replay the call.

Why does GDPR specifically care about voice?

Voice can be used to identify a person — it is biometric data when processed for identification (GDPR Recital 51, Article 9). Storing voice triggers the higher-protection regime for special categories of data. Not storing it removes the regime entirely.

What if I want to keep the audio for compliance reasons?

Some regulated sectors (finance, healthcare audits) require audio retention. We are designing an opt-in audio retention mode for Enterprise customers, with explicit retention windows, encryption with customer-held keys, and audit logs. Default is, and will remain, zero retention.
