iOS Voice Memos to Obsidian with Claude Code
I record a lot of voice memos. Walking the dog, driving, sitting in the backyard — basically anytime I have a thought worth capturing but can't type it out.
The problem? Those memos just... sit there. Buried in the Voice Memos app, never to be seen again.
I've been using Obsidian as my PKM (Personal Knowledge Management) system, and I wanted my voice memos to automatically show up there — transcribed, summarized, and linked to my daily notes.
Here's how I built it.
The Hidden Recordings Folder
This took me way too long to figure out. iOS Voice Memos sync to your Mac via iCloud, but they don't land in an obvious place like ~/Documents or your iCloud Drive folder.
They're hidden in a Group Container:
~/Library/Group Containers/
group.com.apple.VoiceMemos.shared/
Recordings/
Recordings show up here as .m4a files with timestamps like 20251226 102752-B6C42C301.m4a. Not exactly human-readable, but we can work with that.
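Those timestamp filenames are parseable, which matters later when naming notes. Here's a small sketch of the rename logic, assuming bash; the function name and output format are my own:

```shell
#!/bin/bash
# Turn a Voice Memos filename like "20251226 102752-B6C42C301.m4a"
# into a readable "2025-12-26-1027.m4a"
readable_name() {
  local base="${1%.m4a}"     # strip the extension
  local stamp="${base%%-*}"  # keep "20251226 102752", drop the random suffix
  local d="${stamp%% *}"     # date part: 20251226
  local t="${stamp##* }"     # time part: 102752
  printf '%s-%s-%s-%s%s.m4a\n' "${d:0:4}" "${d:4:2}" "${d:6:2}" "${t:0:2}" "${t:2:2}"
}

readable_name "20251226 102752-B6C42C301.m4a"  # → 2025-12-26-1027.m4a
```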
Automator Folder Actions
macOS has this underrated feature called Folder Actions. You can set up an Automator workflow that triggers whenever a new file lands in a folder. Perfect for our use case.
Here's how to set it up:
- Open Automator and create a new Folder Action
- Set it to watch the Voice Memos folder (path above)
- Add a Run Shell Script action
- Pass input as arguments and point it to your processing script
The tricky part is that macOS sandbox restrictions can make this folder action finicky. You might need to grant Automator Full Disk Access in System Settings → Privacy & Security → Full Disk Access.
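The Run Shell Script action passes the new file paths as arguments, so the receiving script just loops over them. A minimal sketch (the function name is mine, and the real script hands each file to the pipeline instead of echoing):

```shell
#!/bin/bash
# Entry point that Automator calls with the new files as "$@"
process_new_files() {
  local f
  for f in "$@"; do
    case "$f" in
      *.m4a) echo "processing: $f" ;;  # hand off to the transcription pipeline
      *)     echo "skipping: $f" ;;    # ignore sidecar files the app writes
    esac
  done
}

process_new_files "$@"
```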
The Processing Pipeline
The original version of this script used Whisper.cpp for local transcription. It worked great, but I wanted better transcription quality without managing model files.
Enter Gemini Flash 3. Google's API handles audio transcription really well, and the latency is reasonable for async processing.
The flow looks like this:
- New file lands → Automator triggers the script
- Convert to MP3 → FFmpeg handles the m4a → mp3 conversion
- Upload to Gemini → Resumable upload API for the audio file
- Transcribe → Gemini returns the full transcript
- Analyze → Second Gemini call extracts title, summary, and tags
- Create Obsidian note → Markdown file with frontmatter
- Update daily note → Append a link to today's daily note
- Git commit → Push changes to the vault repo
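The conversion step is a thin wrapper around FFmpeg. A sketch, with the output path derived from the input (function names are mine):

```shell
#!/bin/bash
# Derive the output path: same name, .mp3 extension
mp3_path() { printf '%s\n' "${1%.m4a}.mp3"; }

# Convert the .m4a recording to .mp3 for upload
m4a_to_mp3() {
  local src="$1" dst
  dst="$(mp3_path "$src")"
  ffmpeg -hide_banner -loglevel error -y -i "$src" \
    -codec:a libmp3lame -qscale:a 4 "$dst"   # VBR quality 4; plenty for speech
}
```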
Here's the key bit — uploading audio to Gemini's API:
# Upload file to Gemini
gemini_upload_file() {
    local audio_path="$1"
    local mime_type num_bytes upload_url header_file file_info
    header_file=$(mktemp)
    file_info=$(mktemp)

    mime_type=$(file -b --mime-type "$audio_path")
    num_bytes=$(wc -c < "$audio_path" | tr -d ' ')

    # Start resumable upload; the upload URL comes back in a response header
    curl -s "https://generativelanguage.googleapis.com/upload/v1beta/files" \
        -H "x-goog-api-key: $GEMINI_API_KEY" \
        -D "$header_file" \
        -H "X-Goog-Upload-Protocol: resumable" \
        -H "X-Goog-Upload-Command: start" \
        -H "X-Goog-Upload-Header-Content-Length: ${num_bytes}" \
        -H "X-Goog-Upload-Header-Content-Type: ${mime_type}" \
        -H "Content-Type: application/json" \
        -d "{\"file\": {\"display_name\": \"voice-memo\"}}" >/dev/null

    # curl writes response headers with trailing \r; strip it or the URL breaks
    upload_url=$(grep -i "x-goog-upload-url: " "$header_file" | cut -d" " -f2 | tr -d '\r')

    # Upload the bytes and finalize in one request
    curl -s "$upload_url" \
        -H "X-Goog-Upload-Offset: 0" \
        -H "X-Goog-Upload-Command: upload, finalize" \
        --data-binary "@${audio_path}" > "$file_info"

    # Print the file URI for later generateContent calls
    jq -r '.file.uri' "$file_info"
    rm -f "$header_file" "$file_info"
}
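With the file URI in hand, transcription is a generateContent call that mixes a text instruction with a file_data part. A sketch under my own assumptions — the prompt wording and the model name in the URL are stand-ins, so swap in whichever Flash model you're actually using:

```shell
#!/bin/bash
# Build the generateContent request body for a transcription call
build_transcribe_payload() {
  printf '{"contents":[{"parts":[{"text":"Transcribe this audio verbatim."},{"file_data":{"mime_type":"audio/mpeg","file_uri":"%s"}}]}]}' "$1"
}

gemini_transcribe() {
  local file_uri="$1"
  curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_transcribe_payload "$file_uri")" \
  | jq -r '.candidates[0].content.parts[0].text'
}
```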
Auto-Tagging with Context
The analysis prompt extracts tags that match my Obsidian tag system:
- Context tags: `work`, `personal`, `health`, `learning`, `family`
- Priority tags: `priority/high`, `priority/medium`, `priority/low`
- Topic tags: specific to the content (e.g., `home-renovation`, `book-ideas`)
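The analysis call pins the tag vocabulary down in the prompt and asks for JSON back, so the script can pull fields out with jq. A sketch of the shape — the exact wording here is illustrative, not the real prompt:

```shell
#!/bin/bash
# Prompt for the analysis pass; the model is asked to return strict JSON
analysis_prompt() {
  cat <<'EOF'
Return JSON with keys "title", "summary", and "tags" for the transcript
below. Tags must include one context tag (work, personal, health,
learning, family), one priority tag (priority/high, priority/medium,
priority/low), and any topic tags specific to the content.
EOF
}
```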
Here's what a generated note looks like:
---
title: "Kitchen Renovation Ideas and Timeline"
date: 2025-12-26T14:33:00-0600
tags: [voice-memo, personal, family, priority/medium, home-renovation]
audio: "kitchen-renovation-ideas.mp3"
---
# Kitchen Renovation Ideas and Timeline
**Recorded:** 2025-12-26 at 14:33
**Audio:** `kitchen-renovation-ideas.mp3`
## Summary
Planning to start the kitchen renovation in March after getting
quotes from three contractors. Need to finalize cabinet style
and decide between quartz and butcher block countertops.
## Transcript
[Full transcription here...]
The summary is written in first person since these are my notes. I added this to the prompt:
"You are summarizing personal voice memos for Drew. The primary speaker is Drew himself — these are his notes and thoughts. Write summaries in first person or as personal notes."
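Once the analysis fields are extracted, writing the note itself is just a heredoc. A minimal sketch (the function name and argument order are mine):

```shell
#!/bin/bash
# Write an Obsidian note with YAML frontmatter
write_note() {
  local path="$1" title="$2" date="$3" tags="$4" audio="$5" summary="$6" transcript="$7"
  cat > "$path" <<EOF
---
title: "$title"
date: $date
tags: [$tags]
audio: "$audio"
---

# $title

## Summary
$summary

## Transcript
$transcript
EOF
}
```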
Daily Note Integration
Every voice memo gets appended to that day's daily note:
---
### Voice Memo: 14:33
[[Voice Memos/kitchen-renovation-ideas|Kitchen Renovation Ideas and Timeline]]
> Planning to start the kitchen renovation in March after getting quotes...
If the daily note doesn't exist yet, the script creates it via `claude /daily`, a slash command in my Obsidian PKM setup that generates a daily note from a template.
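The append step can be sketched like this (the helper name and arguments are mine; the real script resolves the daily note path from my vault layout):

```shell
#!/bin/bash
# Append a memo link and excerpt to the daily note, creating it if needed
append_to_daily() {
  local daily="$1" time="$2" slug="$3" title="$4" excerpt="$5"
  [ -f "$daily" ] || claude /daily   # my PKM slash command; generates today's note
  cat >> "$daily" <<EOF

---
### Voice Memo: $time
[[Voice Memos/$slug|$title]]
> $excerpt
EOF
}
```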
Building It with Claude Code
I didn't write any of this code. I spent a decent amount of time getting my initial prompt right, copying in relevant code snippets from docs — but that was it.
A few voice memo recordings, pasting the failed logs into Claude, and we were off to the races.
What's Next
A few things I want to add:
- Action item extraction — Detect when I mention something I need to do and create tasks
- Semantic search — Use embeddings to find related notes across my vault
Try It Yourself
The Obsidian PKM setup I'm using is based on ballred/obsidian-claude-pkm. It's a great starting point if you want Claude Code integrated into your note-taking workflow.
For the voice memo pipeline specifically, you'll need:
- macOS with Automator
- FFmpeg (`brew install ffmpeg`)
- jq (`brew install jq`)
- A Google AI Studio API key for Gemini
- An Obsidian vault (or any markdown-based notes system)
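A quick preflight check saves debugging a half-run pipeline later. A small helper of my own, not part of the original script:

```shell
#!/bin/bash
# Print (on stdout) any commands from the list that aren't installed
check_deps() {
  local missing="" cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  printf '%s\n' "${missing# }"
}

# Example: check_deps ffmpeg jq curl  — prints nothing if all are present
```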
p.s. — I wrote the outline for this post as a voice memo. Meta.