top of page

How We Used Google Apps Script and Gemini AI to Parse PDFs and Automate Invoice Processing

  • Writer: Michael Lam
    Michael Lam
  • Aug 31
  • 3 min read

Key Takeaways



  • We built an AI-powered automation tool using Google Apps Script and Gemini 2.5 Flash-Lite that parses PDFs, validates data, and submits invoices automatically.

  • The solution saved our client ~1 FTE per week and eliminated manual data entry errors.

  • Choosing the right LLM matters:


    • U.S. models like GPT-4o and Gemini deliver excellent accuracy but come at a higher cost.

    • Chinese LLMs like DeepSeek, Qwen, and Yi-Lightning provide similar structured parsing at 10×–50× lower cost.


  • Leveraging cost-efficient AI automation can give companies a significant competitive edge.


Introduction


Invoice processing used to be one of the biggest hidden productivity drains. Finance teams spent hours every week opening PDFs, copying invoice data into spreadsheets, validating line items, and manually submitting records into their ERP systems.


We recently helped a client replace this tedious process with an AI-powered PDF automation tool built using Google Apps Script and Gemini AI.


Results:


  • 1 full-time equivalent (FTE) saved every week

  • Zero manual entry errors

  • PDF-to-submission in seconds



The Problem


Our client received hundreds of PDF invoices weekly. The finance team spent hours:


  • Opening each PDF manually

  • Copying vendor names, invoice totals, and line items into spreadsheets

  • Validating against POs and vendor IDs

  • Submitting invoices into their ERP system


This workflow was repetitive, error-prone, and expensive.


The Solution: AI-Powered Automation


We built a serverless workflow using Google Apps Script and Gemini 2.5 Flash-Lite that handles everything from PDF parsing to ERP submission.


1) Smart PDF Ingestion

Invoices are automatically collected from:


  • Email inboxes

  • Google Drive watch folders

  • Cloud storage integrations


2) AI-Powered PDF Parsing

Using Gemini’s multimodal capability, the tool reads PDFs directly and returns structured JSON with:


  • Invoice numbers and dates

  • Vendor details

  • Line items, tax, and totals

  • Payment terms and PO numbers


3) Real-Time Validation

Before submission, the tool applies business rules to ensure data integrity:


  • Vendor IDs are cross-checked

  • PO totals match invoice totals

  • Duplicates are detected automatically


4) Automatic ERP Submission

Once validated, invoices are pushed into the ERP system via API, reducing human touchpoints to exception handling only.



Choosing the Right AI Model: Cost vs Accuracy


When benchmarking LLMs for PDF parsing, we discovered significant cost differences:


Model

Country

Approx. Monthly Cost*

PDF Parsing

Notes

Gemini 2.5 Flash-Lite

USA

~$0.24

✅ Yes

Best balance of accuracy + cost

GPT-4o

USA

~$120.00

✅ Yes

Most accurate but expensive

GPT-4o-mini

USA

~$7.20

✅ Yes

Budget GPT option

Claude 3.5 Sonnet

USA

~$27.00

✅ Yes

Strong structured parsing

DeepSeek R1/V3

China

~$0.02

❌ Needs OCR

~50× cheaper

Qwen-VL-Max

China

~$0.62

✅ Yes

Strong multimodal alternative

Yi-Lightning

China

~$0.21

❌ Needs OCR

Lightweight and low cost


*Assumes 1,500 PDFs/month (~1,000 tokens each).


Insight: Chinese LLMs like DeepSeek and Qwen deliver comparable structured parsing for a fraction of the cost.


Sample Google Apps Script with Gemini AI


Here’s an example script showing how to parse PDFs and extract structured invoice data using Gemini 2.5 Flash-Lite:

/**
 * Parse invoices from PDF using Gemini 2.5 Flash-Lite.
 * Setup: Add a Script property GEMINI_API_KEY with your API key.
 */
function parseInvoicePdf() {
  const FILE_ID = 'YOUR_PDF_FILE_ID'; // Replace with your Drive file ID
  const API_KEY = PropertiesService.getScriptProperties().getProperty('GEMINI_API_KEY');
  if (!API_KEY) throw new Error('Set GEMINI_API_KEY in Script Properties.');

  // Read PDF and encode as Base64
  const blob = DriveApp.getFileById(FILE_ID).getBlob();
  const base64 = Utilities.base64Encode(blob.getBytes());

  // Ask Gemini to extract structured invoice data
  const prompt = `
    Extract invoice data from this PDF in JSON:
    {
      "invoice_number": string,
      "vendor": string,
      "date": string,
      "due_date": string|null,
      "po_number": string|null,
      "total": number,
      "currency": string|null,
      "line_items": [
        { "description": string, "qty": number|null, "unit_price": number|null, "amount": number|null }
      ],
      "notes": string|null
    }
    Return JSON only.
  `;

  const payload = {
    contents: [{
      role: 'user',
      parts: [
        { text: prompt },
        { inline_data: { mime_type: "application/pdf", data: base64 } }
      ]
    }]
  };

  const url = `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-lite:generateContent?key=${API_KEY}`;
  const res = UrlFetchApp.fetch(url, {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });

  if (res.getResponseCode() >= 300) {
    throw new Error(`Gemini API error ${res.getResponseCode()}: ${res.getContentText()}`);
  }

  const body = JSON.parse(res.getContentText());
  const text = body.candidates?.[0]?.content?.parts?.[0]?.text || '{}';

  // Parse JSON output
  let parsed = {};
  try {
    parsed = JSON.parse(text);
  } catch (e) {
    Logger.log('Invalid JSON returned. Raw output:\n' + text);
    throw e;
  }

  Logger.log(JSON.stringify(parsed, null, 2));
}


Final Thoughts


Using Google Apps Script with Gemini AI allowed us to automate a traditionally manual process from start to finish — parsing PDFs, validating invoice data, and submitting to ERP — saving a full week of work per month.


And with the rise of cost-efficient Chinese LLMs, it’s clear the AI landscape is shifting fast. Companies that embrace automation early and choose models strategically will gain a lasting competitive advantage.



Comments


Commenting on this post isn't available anymore. Contact the site owner for more info.

+1 708 773 8338

Join our Newsletter

  • Facebook
  • Instagram
  • X
  • TikTok

© 2025 AutomateSOS.com | All Rights Reserved.

bottom of page