How to build a Gmail AI Agent

I’ve been trying to pin down what “counts” as an agent. If it runs in a loop, makes choices without me, takes real action, and gives me back minutes every day, I’ll call it an agent.

[Figure: Gmail agent diagram showing the loop of sense, decide, act, and record.]

Here’s a simple example I built: a Gmail triage bot. It reads new mail, asks gpt-5-mini for a structured output, applies labels, archives or trashes, drafts replies (never auto-sends), and logs a short note to Reflect. It’s quiet, boring, and useful.

My definition of “agent”

For me, an agent is just:

Sense → pull state from a real system.
Decide → use AI to map inputs to a typed plan.
Act → carry out those actions through APIs.
Record → log what happened so I can trust it.
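
To make those four steps concrete, here is a minimal sketch of the loop as a typed interface, with a toy in-memory agent standing in for Gmail and the model. Every name in it (Agent, runOnce, toyAgent) is illustrative and not taken from the real code, and it is synchronous only to keep it short; the real loop awaits network calls.

```typescript
// Illustrative only: a generic sense → decide → act → record loop.
type Action = { kind: string; target: string };
type Plan = { actions: Action[] };

interface Agent<State> {
  sense(): State[];                       // pull state from a real system
  decide(state: State): Plan;             // map one input to a typed plan
  act(action: Action): void;              // carry out one action via an API
  record(state: State, plan: Plan): void; // log what happened
}

// Run one pass of the loop and report how many items were handled.
function runOnce<State>(agent: Agent<State>): number {
  let handled = 0;
  for (const state of agent.sense()) {
    const plan = agent.decide(state);
    for (const action of plan.actions) agent.act(action);
    agent.record(state, plan);
    handled++;
  }
  return handled;
}

// Toy agent: subjects are the "state", and a prefix check replaces the LLM.
const log: string[] = [];
const toyAgent: Agent<string> = {
  sense: () => ["newsletter: weekly digest", "work: contract review"],
  decide: (subject) => ({
    actions: subject.startsWith("newsletter")
      ? [{ kind: "archive", target: subject }]
      : [{ kind: "label", target: subject }],
  }),
  act: (action) => {
    log.push(`${action.kind} -> ${action.target}`);
  },
  record: (subject, plan) => {
    log.push(`recorded ${plan.actions.length} action(s) for "${subject}"`);
  },
};

const handledCount = runOnce(toyAgent);
```

The useful property is that decide returns data rather than performing side effects, so the plan can be logged or inspected before anything touches the inbox.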

The solution

AI SDK + AI Gateway
Next.js route handlers
Upstash Redis
Vercel cron

Pretty simple.
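
Here is a rough sketch of how the cron job and a route handler could fit together. The route path, the five-minute schedule, and the helper names (isAuthorizedCron, makeCronHandler) are my assumptions for illustration. The bearer check relies on Vercel sending the CRON_SECRET environment variable as an Authorization header on cron invocations when that variable is set.

```typescript
// Hypothetical file: app/api/cron/route.ts
// Assumed vercel.json entry that would call it every five minutes:
//   { "crons": [{ "path": "/api/cron", "schedule": "*/5 * * * *" }] }

// True only when the request carries the expected bearer secret.
function isAuthorizedCron(request: Request, secret: string | undefined): boolean {
  if (!secret) return false; // refuse to run when no secret is configured
  return request.headers.get("authorization") === `Bearer ${secret}`;
}

// Wrap the agent loop in a GET handler that rejects unauthenticated callers.
function makeCronHandler(secret: string | undefined, run: () => Promise<void>) {
  return async function GET(request: Request): Promise<Response> {
    if (!isAuthorizedCron(request, secret)) {
      return new Response("Unauthorized", { status: 401 });
    }
    await run();
    return new Response(JSON.stringify({ ok: true }), {
      status: 200,
      headers: { "content-type": "application/json" },
    });
  };
}

// In the real route file this would be exported, for example:
//   export const GET = makeCronHandler(process.env.CRON_SECRET, () => processInboxes());
```

Without the secret check, anyone who finds the URL can trigger a run, which matters for an endpoint that archives and trashes mail.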

Setting up Gmail access tokens

1. Create or reuse a Google Cloud project
   - Visit console.cloud.google.com (https://console.cloud.google.com/).
   - Enable the Gmail API for your project (APIs & Services → Library → Gmail API → Enable).
   - Configure the OAuth consent screen (choose External, add yourself and any work accounts to the “Test users” section, and save).
2. Create an OAuth client
   - APIs & Services → Credentials → Create credentials → OAuth client ID.
   - Choose Desktop App (makes refresh-token minting easiest).
   - Note the Client ID and Client Secret; we’ll need them for both inboxes.
3. Mint a refresh token for each inbox
4. Wire tokens into the agent

If Google ever revokes a refresh token (after a password reset, an admin action, or minting another one), just repeat Step 3 and update the refresh token for the affected account.
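
For the token steps, here is a minimal sketch of the two requests involved, assuming Google's standard OAuth 2.0 endpoints. The gmail.modify scope and the function names (buildConsentUrl, buildRefreshBody, fetchAccessToken) are my assumptions, not necessarily what the agent uses. The first function builds the consent URL you open once per inbox to mint a refresh token; the other two trade that refresh token for a short-lived access token at run time.

```typescript
const AUTH_URL = "https://accounts.google.com/o/oauth2/v2/auth";
const TOKEN_URL = "https://oauth2.googleapis.com/token";
// Assumed scope: read, label, archive, trash, and draft, but no permanent delete.
const GMAIL_SCOPE = "https://www.googleapis.com/auth/gmail.modify";

// URL to open in a browser while signed in to the inbox being authorized.
function buildConsentUrl(clientId: string, redirectUri: string): string {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri,
    response_type: "code",
    scope: GMAIL_SCOPE,
    access_type: "offline", // ask for a refresh token, not just an access token
    prompt: "consent",      // force a new refresh token on repeat authorizations
  });
  return `${AUTH_URL}?${params.toString()}`;
}

// Form body for the refresh-token grant.
function buildRefreshBody(clientId: string, clientSecret: string, refreshToken: string): URLSearchParams {
  return new URLSearchParams({
    client_id: clientId,
    client_secret: clientSecret,
    refresh_token: refreshToken,
    grant_type: "refresh_token",
  });
}

// Exchange a stored refresh token for a short-lived access token.
async function fetchAccessToken(clientId: string, clientSecret: string, refreshToken: string): Promise<string> {
  const res = await fetch(TOKEN_URL, {
    method: "POST",
    body: buildRefreshBody(clientId, clientSecret, refreshToken),
  });
  if (!res.ok) throw new Error(`Token refresh failed with status ${res.status}`);
  const json = (await res.json()) as { access_token: string };
  return json.access_token;
}
```

A failed refresh is the signal that the token was revoked and needs to be minted again. Google's own client libraries handle the refresh step for you, so a hand-rolled version like this mainly helps when debugging.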

The decision contract (ai.ts)

The loop expects the model to return this shape:

// lib/ai.ts
export type EmailProcessingDecision = {
  category: "personal" | "work" | "newsletter" | "finance" | "travel" | "alerts" | "promotion" | "other";
  importance: "low" | "medium" | "high";
  summary: string;
  actions: { markRead: boolean; archive: boolean; delete: boolean };
  reply:   { shouldDraft: boolean; subject?: string; body?: string };
  persistToNotes: { shouldPersist: boolean; notes?: string };
  rationale: string;
  unsubscribeUrl?: string;
};

export async function classifyEmail(input: {
  accountLabel: string; subject: string; from: string; body: string; snippet?: string;
}): Promise<EmailProcessingDecision> {
  // llmStructuredDecision wraps the LLM call that returns a structured object (omitted here)
  return await llmStructuredDecision(input);
}
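
Since the model's output starts life as untyped JSON, it helps to check it against the contract before acting on it. Here is a dependency-free sketch of such a check. validateDecision and its helpers are hypothetical names of mine, and a schema library could do the same job with less code.

```typescript
const CATEGORIES = ["personal", "work", "newsletter", "finance", "travel", "alerts", "promotion", "other"];
const IMPORTANCE = ["low", "medium", "high"];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function oneOf(allowed: readonly string[], v: unknown): boolean {
  return typeof v === "string" && allowed.includes(v);
}

// Returns the names of fields that are missing or malformed; empty means valid.
function validateDecision(value: unknown): string[] {
  if (!isRecord(value)) return ["decision is not an object"];
  const problems: string[] = [];
  if (!oneOf(CATEGORIES, value.category)) problems.push("category");
  if (!oneOf(IMPORTANCE, value.importance)) problems.push("importance");
  if (typeof value.summary !== "string") problems.push("summary");
  if (typeof value.rationale !== "string") problems.push("rationale");

  const actions = value.actions;
  const actionKeys = ["markRead", "archive", "delete"];
  if (!isRecord(actions) || !actionKeys.every((k) => typeof actions[k] === "boolean")) {
    problems.push("actions");
  }
  const reply = value.reply;
  if (!isRecord(reply) || typeof reply.shouldDraft !== "boolean") problems.push("reply");
  const notes = value.persistToNotes;
  if (!isRecord(notes) || typeof notes.shouldPersist !== "boolean") problems.push("persistToNotes");
  return problems;
}

// A well-formed decision and a malformed one.
const goodProblems = validateDecision({
  category: "newsletter",
  importance: "low",
  summary: "Weekly digest",
  actions: { markRead: true, archive: true, delete: false },
  reply: { shouldDraft: false },
  persistToNotes: { shouldPersist: false },
  rationale: "Bulk mail with an unsubscribe link",
});
const badProblems = validateDecision({ category: "spam", importance: "high" });
```

If the list of problems is non-empty, the safe choice is to skip the message and leave it unread rather than guess at what the model meant.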

The loop (email-processor.ts)

This is the bulk of the agent's code. It fetches, decides, acts, and records. Most of it is just integration with third-party APIs.

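// lib/email-processor.ts
import { Redis } from "@upstash/redis";
import { classifyEmail } from "./ai";
// The Gmail, Reflect, and Redis-key helpers used below (getGmailClient,
// fetchUnreadMessages, appendReflectUpdate, processedKeyFor, ...) are omitted.

type Dec = Awaited<ReturnType<typeof classifyEmail>>;

const START = "2025/09/01"; // ignore mail received before this date
const DEFAULT_Q = `in:inbox is:unread after:${START}`;

type ProcessOptions = { query?: string; maxPerAccount?: number };

export async function processInboxes({ query, maxPerAccount = 25 }: ProcessOptions = {}) {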
  const redis = Redis.fromEnv();
  const today = getReflectTodayString();

  for (const acct of gmailAccounts) {
    if (missingCreds(acct)) continue;

    const gmail = getGmailClient(acct);
    const processedSet = processedKeyFor(acct.id);
    const refs = await fetchUnreadMessages(gmail, acct, {
      query: ensureCutoff(query),
      maxResults: maxPerAccount,
    });

    await parallel(refs, 3, async (ref) => {
      if (!ref?.id) return;
      const id = ref.id;
      if (await redis.sismember(processedSet, id)) return; // skip if seen

      const msg = await loadMessage(gmail, acct, id);
      if (!msg || isBeforeStart(msg)) return markProcessed(redis, processedSet, id);

      // DECIDE
      const d = await classifyEmail({
        accountLabel: acct.label,
        subject: msg.subject,
        from: msg.from,
        body: msg.plainTextBody || msg.htmlBody || msg.snippet || "",
        snippet: msg.snippet,
      });

      // ACT
      await ensureAndApplyLabels(gmail, acct, id, d);
      if (d.actions.delete) await moveMessageToTrash(gmail, acct, id);
      if (d.actions.markRead) await markMessageRead(gmail, acct, id);
      if (d.actions.archive)  await archiveMessage(gmail, acct, id);

      if (d.reply.shouldDraft && d.reply.body) {
        await createDraft(gmail, acct, id, {
          subject: d.reply.subject ?? replySubject(msg.subject),
          body: d.reply.body,
        });
      }

      // RECORD
      if (shouldReflect(d)) {
        await appendReflectUpdate({
          date: today,
          text: reflectLine(acct, msg, d, normalizeUnsub(d.unsubscribeUrl)),
        });
      }

      await Promise.all([
        markProcessed(redis, processedSet, id),
        redis.set(cacheKey(acct.id, id), JSON.stringify({
          decision: d,
          subject: msg.subject,
          from: msg.from,
          receivedAt: msg.receivedAt?.toISOString(),
        }), { ex: 60 * 60 * 24 * 30 }),
      ]);
    });
  }
}
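
The loop leans on a few small helpers that aren't shown. Here is one way three of them might look. These are hypothetical stand-ins that I wrote to match how the loop calls them, not the originals, and their exact behavior (for example, how ensureCutoff treats a query that already has a date filter) is a guess.

```typescript
// Make sure every Gmail search carries a start-date cutoff.
function ensureCutoff(query: string | undefined, start = "2025/09/01"): string {
  const base = query?.trim() || "in:inbox is:unread";
  return /\bafter:/.test(base) ? base : `${base} after:${start}`;
}

// Prefix "Re: " unless the subject already starts with it (case-insensitive).
function replySubject(subject: string | undefined): string {
  const s = (subject ?? "").trim();
  return /^re:/i.test(s) ? s : `Re: ${s}`;
}

// Run `fn` over `items` with at most `limit` calls in flight at once.
// Workers pull the next index from a shared counter until the list is empty.
async function parallel<T>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const item = items[next++];
      await fn(item);
    }
  };
  const workerCount = Math.max(0, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
}
```

In this version a thrown error rejects the whole batch while the other workers keep running until they finish their current item. That is acceptable when each message is only marked processed at the very end, because a failed message gets retried on the next run.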

Why I call this an agent

Yes, this could all be better: Gmail push notifications instead of polling, a more powerful model with multiple turns, and so on. But this still counts as an agent to me.

It senses (fetches Gmail state), decides (structured LLM output), acts (labels, archive, draft), and records (Redis + Reflect). It runs every few minutes via a Vercel cron job.

And it quietly takes work off my plate every day. That’s enough of an "agent" for me.

Closing thought

If building systems like this but with much larger scope and impact sounds interesting, I'm hiring a senior engineer on the GTM Engineering team at Vercel and would love to connect.