iOS Voice Memos to Obsidian with Claude Code
I record a lot of voice memos. Walking the dog, driving, sitting in the backyard — basically anytime I have a thought worth capturing but can't type it out.
The problem? Those memos just... sit there. Buried in the Voice Memos app, never to be seen again.
I've been using Obsidian as my PKM (Personal Knowledge Management) system, and I wanted my voice memos to automatically show up there — transcribed, summarized, and linked to my daily notes.
Here's how I built it.
The Hidden Recordings Folder
This took me way too long to figure out. iOS Voice Memos sync to your Mac via iCloud, but they don't land in an obvious place like ~/Documents or your iCloud Drive folder.
They're hidden in a Group Container:
~/Library/Group Containers/
group.com.apple.VoiceMemos.shared/
Recordings/
Recordings show up here as .m4a files with timestamps like 20251226 102752-B6C42C301.m4a. Not exactly human-readable, but we can work with that.
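Those timestamp filenames are parseable, which matters later when naming notes. Here's a small sketch of the rename logic, assuming bash; the function name and output format are my own:

```shell
#!/bin/bash
# Turn a Voice Memos filename like "20251226 102752-B6C42C301.m4a"
# into a readable "2025-12-26-1027.m4a"
readable_name() {
  local base="${1%.m4a}"     # strip the extension
  local stamp="${base%%-*}"  # keep "20251226 102752", drop the random suffix
  local d="${stamp%% *}"     # date part: 20251226
  local t="${stamp##* }"     # time part: 102752
  printf '%s-%s-%s-%s%s.m4a\n' "${d:0:4}" "${d:4:2}" "${d:6:2}" "${t:0:2}" "${t:2:2}"
}

readable_name "20251226 102752-B6C42C301.m4a"  # → 2025-12-26-1027.m4a
```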
Automator Folder Actions
macOS has this underrated feature called Folder Actions. You can set up an Automator workflow that triggers whenever a new file lands in a folder. Perfect for our use case.
Here's how to set it up:
- Open Automator and create a new Folder Action
- Set it to watch the Voice Memos folder (path above)
- Add a Run Shell Script action
- Pass input as arguments and point it to your processing script
The tricky part is that macOS sandbox restrictions can make this folder action finicky. You might need to grant Automator Full Disk Access in System Settings → Privacy & Security → Full Disk Access.
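The Run Shell Script action passes the new file paths as arguments, so the receiving script just loops over them. A minimal sketch (the function name is mine, and the real script hands each file to the pipeline instead of echoing):

```shell
#!/bin/bash
# Entry point that Automator calls with the new files as "$@"
process_new_files() {
  local f
  for f in "$@"; do
    case "$f" in
      *.m4a) echo "processing: $f" ;;  # hand off to the transcription pipeline
      *)     echo "skipping: $f" ;;    # ignore sidecar files the app writes
    esac
  done
}

process_new_files "$@"
```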
The Processing Pipeline
The original version of this script used Whisper.cpp for local transcription. It worked great, but I wanted better transcription quality without managing model files.
Enter Gemini Flash 3. Google's API handles audio transcription really well, and the latency is reasonable for async processing.
The flow looks like this:
- New file lands → Automator triggers the script
- Convert to MP3 → FFmpeg handles the m4a → mp3 conversion
- Upload to Gemini → Resumable upload API for the audio file
- Transcribe → Gemini returns the full transcript
- Analyze → Second Gemini call extracts title, summary, and tags
- Create Obsidian note → Markdown file with frontmatter
- Update daily note → Append a link to today's daily note
- Git commit → Push changes to the vault repo
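The conversion step is a thin wrapper around FFmpeg. A sketch, with the output path derived from the input (function names are mine):

```shell
#!/bin/bash
# Derive the output path: same name, .mp3 extension
mp3_path() { printf '%s\n' "${1%.m4a}.mp3"; }

# Convert the .m4a recording to .mp3 for upload
m4a_to_mp3() {
  local src="$1" dst
  dst="$(mp3_path "$src")"
  ffmpeg -hide_banner -loglevel error -y -i "$src" \
    -codec:a libmp3lame -qscale:a 4 "$dst"   # VBR quality 4; plenty for speech
}
```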
Here's the key bit — uploading audio to Gemini's API:
# Upload file to Gemini
gemini_upload_file() {
    local audio_path="$1"
    local mime_type num_bytes upload_url header_file file_info
    header_file=$(mktemp)
    file_info=$(mktemp)

    mime_type=$(file -b --mime-type "$audio_path")
    num_bytes=$(wc -c < "$audio_path" | tr -d ' ')

    # Start resumable upload; the upload URL comes back in a response header
    curl -s "https://generativelanguage.googleapis.com/upload/v1beta/files" \
        -H "x-goog-api-key: $GEMINI_API_KEY" \
        -D "$header_file" \
        -H "X-Goog-Upload-Protocol: resumable" \
        -H "X-Goog-Upload-Command: start" \
        -H "X-Goog-Upload-Header-Content-Length: ${num_bytes}" \
        -H "X-Goog-Upload-Header-Content-Type: ${mime_type}" \
        -H "Content-Type: application/json" \
        -d "{\"file\": {\"display_name\": \"voice-memo\"}}" >/dev/null

    # curl writes response headers with trailing \r; strip it or the URL breaks
    upload_url=$(grep -i "x-goog-upload-url: " "$header_file" | cut -d" " -f2 | tr -d '\r')

    # Upload the bytes and finalize in one request
    curl -s "$upload_url" \
        -H "X-Goog-Upload-Offset: 0" \
        -H "X-Goog-Upload-Command: upload, finalize" \
        --data-binary "@${audio_path}" > "$file_info"

    # Print the file URI for later generateContent calls
    jq -r '.file.uri' "$file_info"
    rm -f "$header_file" "$file_info"
}
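With the file URI in hand, transcription is a generateContent call that mixes a text instruction with a file_data part. A sketch under my own assumptions — the prompt wording and the model name in the URL are stand-ins, so swap in whichever Flash model you're actually using:

```shell
#!/bin/bash
# Build the generateContent request body for a transcription call
build_transcribe_payload() {
  printf '{"contents":[{"parts":[{"text":"Transcribe this audio verbatim."},{"file_data":{"mime_type":"audio/mpeg","file_uri":"%s"}}]}]}' "$1"
}

gemini_transcribe() {
  local file_uri="$1"
  curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_transcribe_payload "$file_uri")" \
  | jq -r '.candidates[0].content.parts[0].text'
}
```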
Auto-Tagging with Context
The analysis prompt extracts tags that match my Obsidian tag system:
- Context tags: `work`, `personal`, `health`, `learning`, `family`
- Priority tags: `priority/high`, `priority/medium`, `priority/low`
- Topic tags: specific to the content (e.g., `home-renovation`, `book-ideas`)
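The analysis call pins the tag vocabulary down in the prompt and asks for JSON back, so the script can pull fields out with jq. A sketch of the shape — the exact wording here is illustrative, not the real prompt:

```shell
#!/bin/bash
# Prompt for the analysis pass; the model is asked to return strict JSON
analysis_prompt() {
  cat <<'EOF'
Return JSON with keys "title", "summary", and "tags" for the transcript
below. Tags must include one context tag (work, personal, health,
learning, family), one priority tag (priority/high, priority/medium,
priority/low), and any topic tags specific to the content.
EOF
}
```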
Here's what a generated note looks like:
---
title: "Kitchen Renovation Ideas and Timeline"
date: 2025-12-26T14:33:00-0600
tags: [voice-memo, personal, family, priority/medium, home-renovation]
audio: "kitchen-renovation-ideas.mp3"
---
# Kitchen Renovation Ideas and Timeline
**Recorded:** 2025-12-26 at 14:33
**Audio:** `kitchen-renovation-ideas.mp3`
## Summary
Planning to start the kitchen renovation in March after getting
quotes from three contractors. Need to finalize cabinet style
and decide between quartz and butcher block countertops.
## Transcript
[Full transcription here...]
The summary is written in first person since these are my notes. I added this to the prompt:
"You are summarizing personal voice memos for Drew. The primary speaker is Drew himself — these are his notes and thoughts. Write summaries in first person or as personal notes."
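Once the analysis fields are extracted, writing the note itself is just a heredoc. A minimal sketch (the function name and argument order are mine):

```shell
#!/bin/bash
# Write an Obsidian note with YAML frontmatter
write_note() {
  local path="$1" title="$2" date="$3" tags="$4" audio="$5" summary="$6" transcript="$7"
  cat > "$path" <<EOF
---
title: "$title"
date: $date
tags: [$tags]
audio: "$audio"
---

# $title

## Summary
$summary

## Transcript
$transcript
EOF
}
```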
Daily Note Integration
Every voice memo gets appended to that day's daily note:
---
### Voice Memo: 14:33
[[Voice Memos/kitchen-renovation-ideas|Kitchen Renovation Ideas and Timeline]]
> Planning to start the kitchen renovation in March after getting quotes...
If the daily note doesn't exist yet, the script creates it via `claude /daily`, a slash command in my Obsidian PKM setup that generates a daily note from a template.
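The append step can be sketched like this (the helper name and arguments are mine; the real script resolves the daily note path from my vault layout):

```shell
#!/bin/bash
# Append a memo link and excerpt to the daily note, creating it if needed
append_to_daily() {
  local daily="$1" time="$2" slug="$3" title="$4" excerpt="$5"
  [ -f "$daily" ] || claude /daily   # my PKM slash command; generates today's note
  cat >> "$daily" <<EOF

---
### Voice Memo: $time
[[Voice Memos/$slug|$title]]
> $excerpt
EOF
}
```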
Building It with Claude Code
I didn't write any of this code. I spent a decent amount of time getting my initial prompt right, copying in relevant code snippets from docs — but that was it.
A few voice memo recordings, pasting the failed logs into Claude, and we were off to the races.
What's Next
A few things I want to add:
- Action item extraction — Detect when I mention something I need to do and create tasks
- Semantic search — Use embeddings to find related notes across my vault
Try It Yourself
The Obsidian PKM setup I'm using is based on ballred/obsidian-claude-pkm. It's a great starting point if you want Claude Code integrated into your note-taking workflow.
For the voice memo pipeline specifically, you'll need:
- macOS with Automator
- FFmpeg (`brew install ffmpeg`)
- jq (`brew install jq`)
- A Google AI Studio API key for Gemini
- An Obsidian vault (or any markdown-based notes system)
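A quick preflight check saves debugging a half-run pipeline later. A small helper of my own, not part of the original script:

```shell
#!/bin/bash
# Print (on stdout) any commands from the list that aren't installed
check_deps() {
  local missing="" cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  printf '%s\n' "${missing# }"
}

# Example: check_deps ffmpeg jq curl  — prints nothing if all are present
```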
p.s. — I wrote the outline for this post as a voice memo. Meta.