For most accounting professionals, data entry is the part of the job that nobody talks about — but everyone feels. Entering invoice data into QuickBooks, reconciling receipts, keying in AP transactions — it's repetitive, error-prone, and can consume an enormous chunk of billable time.

This guide explains how AI-powered document extraction works for bookkeeping, where it fits into a real accounting workflow, and how to set it up so that PDF invoices flow into your accounting system automatically.

The Cost of Manual Invoice Data Entry

Before getting into the how, it's worth understanding the scale of the problem.

A typical small business might receive 50–200 invoices per month. A bookkeeper handling 10 clients could be processing 500–2,000 invoices monthly. At even 3 minutes per invoice, that's 25–100 hours of data entry per month — work that generates zero analytical value and is highly susceptible to human error (transposed numbers, missed line items, wrong account codes).

Automating this doesn't just save time. It also reduces errors, creates consistent data structures across clients, and gives you an audit trail for every document.
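
The 25–100 hour estimate above is easy to rerun with your own numbers. Here is a minimal sketch; the function name and the 3-minute default are illustrative, taken from the example in this section rather than from any benchmark.

```python
def monthly_entry_hours(invoices_per_month: int, minutes_per_invoice: float = 3.0) -> float:
    """Hours of manual data entry implied by a monthly invoice count."""
    return invoices_per_month * minutes_per_invoice / 60


# Ten clients at 50 and at 200 invoices each, 3 minutes per invoice
low = monthly_entry_hours(10 * 50)
high = monthly_entry_hours(10 * 200)
print(f"{low:.0f} to {high:.0f} hours per month")  # prints: 25 to 100 hours per month
```

Swap in your own client count and a timed average per invoice to see what the figure looks like for your practice.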

What AI Invoice Extraction Does

Traditional template-based invoice processing breaks as soon as a vendor changes their invoice format. AI extraction is different — it understands the meaning of data on an invoice, not just its position.

For a standard invoice, the AI identifies and extracts:

  • Vendor name, address, and contact information
  • Invoice number and reference codes
  • Invoice date and payment due date
  • Payment terms (Net 30, etc.)
  • Line item descriptions, quantities, unit prices, and line totals
  • Subtotal, tax rate, tax amount, and invoice total
  • Purchase order number (if present)
  • Bank details or payment instructions

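The output is structured data in a consistent format, whether the source invoice is a polished PDF from a large vendor or a handwritten scan from a small supplier. Accuracy can still vary with scan quality, which is why Step 4 below routes low-confidence extractions to a review queue.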
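
As an illustration, the structured output for one invoice might look like the dictionary below. The keys vendor_name, invoice_number, invoice_date, due_date, line_items, description, line_total, and total_amount match the ones used in the scripts later in this guide. The remaining keys (quantity, unit_price, subtotal, tax_amount) and all of the values are assumptions made for the example, so check a real API response before relying on them. The helper shows one sanity check worth running on any extraction: do the line totals plus tax add up to the invoice total?

```python
from decimal import Decimal

# Hypothetical example of one extracted invoice (all values are made up)
sample_invoice = {
    "vendor_name": "Acme Supplies",
    "invoice_number": "INV-1042",
    "invoice_date": "2024-03-01",
    "due_date": "2024-03-31",
    "line_items": [
        {"description": "Printer paper", "quantity": 10,
         "unit_price": "4.50", "line_total": "45.00"},
        {"description": "Toner cartridge", "quantity": 2,
         "unit_price": "60.00", "line_total": "120.00"},
    ],
    "subtotal": "165.00",
    "tax_amount": "13.20",
    "total_amount": "178.20",
}


def totals_are_consistent(invoice: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Return True if line totals plus tax match the stated invoice total."""
    line_sum = sum(
        (Decimal(str(item["line_total"])) for item in invoice.get("line_items", [])),
        Decimal("0"),
    )
    tax = Decimal(str(invoice.get("tax_amount", "0")))
    total = Decimal(str(invoice["total_amount"]))
    return abs(line_sum + tax - total) <= tolerance


print(totals_are_consistent(sample_invoice))  # prints: True
```

An invoice that fails this check is a good candidate for manual review, whatever the reason for the mismatch.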

Setting Up an Automated Bookkeeping Workflow

Here's a practical workflow used by bookkeeping practices to process client invoices automatically.

Step 1

Collect invoices in one place

Set up a dedicated email address or cloud folder (Google Drive, Dropbox) where clients send or drop their invoices. This is your ingestion point.
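
If your ingestion point is a cloud folder synced to a local directory, a scheduled script can pick up new files from it. The sketch below is one way to do that under stated assumptions: find_new_invoices is a hypothetical helper, the folder path is a placeholder, and duplicates are detected by content hash so that an invoice a client sends twice is only processed once.

```python
import hashlib
from pathlib import Path


def find_new_invoices(inbox: Path, seen_hashes: set) -> list:
    """Return PDFs in `inbox` whose content has not been processed before.

    `seen_hashes` is updated in place; persist it between runs (for example
    in a small file or database) to remember what was already handled.
    """
    new_files = []
    for path in sorted(inbox.iterdir()):
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_hashes:
            continue  # same bytes already processed, e.g. a re-sent invoice
        seen_hashes.add(digest)
        new_files.append(path)
    return new_files


# Example usage (placeholder path):
# seen = set()
# for pdf in find_new_invoices(Path("/path/to/synced/invoices"), seen):
#     print("New invoice:", pdf.name)
```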

Step 2

Extract structured data using the API

Use the Ordalis API to process invoices as they arrive. You can trigger processing via a webhook, a scheduled script, or a no-code automation tool like Zapier or Make.

Step 3

Map fields to your accounting system

The JSON output from the API maps to standard accounting fields. Write a simple mapping script or use your automation tool to route data to the right QuickBooks or Xero fields.
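
A mapping script usually has to normalize values as well as rename fields, because accounting systems are strict about date and amount formats. The sketch below assumes the extracted dates and amounts may arrive as free-form strings; the target field names (vendor, bill_number, bill_date, due_date, total) are hypothetical and stand in for whatever your accounting system expects. Note that a date like 03/05/2024 is ambiguous, and this sketch reads it as month/day/year.

```python
from datetime import datetime
from decimal import Decimal

# Date layouts to try, in order; extend this for the formats your vendors use
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y")


def to_iso_date(raw: str) -> str:
    """Normalize a date string to YYYY-MM-DD, trying each known layout."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


def to_amount(raw) -> Decimal:
    """Convert values like '$1,234.5' or 1234.5 to a two-decimal Decimal."""
    cleaned = str(raw).replace("$", "").replace(",", "").strip()
    return Decimal(cleaned).quantize(Decimal("0.01"))


def map_invoice(extracted: dict) -> dict:
    """Rename and normalize extracted fields (target names are hypothetical)."""
    return {
        "vendor": extracted["vendor_name"].strip(),
        "bill_number": extracted["invoice_number"],
        "bill_date": to_iso_date(extracted["invoice_date"]),
        "due_date": to_iso_date(extracted["due_date"]),
        "total": to_amount(extracted["total_amount"]),
    }
```

Raising on an unrecognized date, rather than guessing, keeps bad values out of the ledger and gives you a natural hook for the review queue in Step 4.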

Step 4

Review exceptions, not routine data

Set up a review queue for invoices where the AI confidence is lower (unusual formats, damaged scans, handwritten documents). Everything else flows automatically. You go from reviewing every invoice to only reviewing edge cases.
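
The routing rule can be a few lines of code. The sketch below assumes each extraction carries a numeric confidence between 0 and 1 under a key named confidence; that key name and the 0.85 cutoff are assumptions for illustration, so adapt both to the actual API response and to your own tolerance for errors.

```python
REVIEW_THRESHOLD = 0.85  # illustrative cutoff; tune it against your own documents
REQUIRED_FIELDS = ("vendor_name", "invoice_number", "total_amount")


def needs_review(extraction: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Decide whether an extraction should go to a human before posting."""
    confidence = extraction.get("confidence")  # hypothetical field name
    if confidence is None:
        return True  # no confidence reported: be conservative
    # A missing or empty required field is reviewed regardless of confidence
    # (note this also flags a total of exactly 0).
    if any(not extraction.get(field) for field in REQUIRED_FIELDS):
        return True
    return confidence < threshold


def split_queue(extractions):
    """Partition extractions into (auto_post, manual_review) lists."""
    auto_post, manual_review = [], []
    for extraction in extractions:
        (manual_review if needs_review(extraction) else auto_post).append(extraction)
    return auto_post, manual_review
```

Starting with a strict threshold and loosening it as you gain trust in the results is usually safer than the reverse.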

Practical Code: Process Invoices from a Gmail Inbox

This Python script checks a Gmail label for new invoice attachments and processes them automatically:

import requests
import base64
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

ORDALIS_API_KEY = "your_api_key"
GMAIL_LABEL = "client-invoices"

def get_invoice_attachments(service):
    """Fetch PDF attachments from labeled emails received in the last day."""
    # labelIds expects Gmail label IDs (e.g. "Label_123"), not display names,
    # so filter by label name with the `label:` search operator instead.
    results = service.users().messages().list(
        userId='me',
        q=f'label:{GMAIL_LABEL} has:attachment newer_than:1d'
    ).execute()

    attachments = []
    for msg in results.get('messages', []):
        message = service.users().messages().get(
            userId='me', id=msg['id']
        ).execute()

        for part in message['payload'].get('parts', []):
            filename = part.get('filename', '')
            att_id = part.get('body', {}).get('attachmentId')
            # Skip non-PDF parts and parts without a separate attachment ID
            if not filename.lower().endswith('.pdf') or not att_id:
                continue
            att = service.users().messages().attachments().get(
                userId='me', messageId=msg['id'], id=att_id
            ).execute()
            attachments.append({
                'filename': filename,
                # Gmail returns attachment bytes as URL-safe base64
                'data': base64.urlsafe_b64decode(att['data'])
            })

    return attachments

def extract_invoice_data(pdf_bytes, filename):
    """Send PDF to Ordalis API and return structured data."""
    response = requests.post(
        'https://ordalis-api.tyler-gee13.workers.dev/api/v1/convert',
        headers={'X-API-Key': ORDALIS_API_KEY},
        files={'file': (filename, pdf_bytes, 'application/pdf')},
        data={'output_format': 'json'},
        timeout=60  # avoid hanging indefinitely on a stalled request
    )
    # Raise on HTTP errors instead of parsing an error body as invoice data
    response.raise_for_status()
    return response.json()

# Main processing loop
creds = Credentials.from_authorized_user_file('token.json')
service = build('gmail', 'v1', credentials=creds)

attachments = get_invoice_attachments(service)
print(f"Found {len(attachments)} new invoices to process")

for att in attachments:
    data = extract_invoice_data(att['data'], att['filename'])
    print(f"✓ {att['filename']}: {data.get('vendor_name')} — ${data.get('total_amount')}")
    # Here: push data to QuickBooks/Xero API or write to database
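
The script above inspects only the top-level parts of each message. Forwarded emails and some mail clients nest attachments one or more levels deeper, so a recursive walk catches PDFs that a flat loop would miss. Here is a minimal sketch; iter_pdf_parts is a hypothetical helper name, and it assumes the payload follows the Gmail API message structure, where any part may carry its own parts list.

```python
def iter_pdf_parts(payload: dict):
    """Yield every part under `payload` whose filename ends in .pdf, at any depth."""
    stack = [payload]
    while stack:
        part = stack.pop()
        # Children are pushed in reverse so parts come out in message order
        stack.extend(reversed(part.get("parts", [])))
        if part.get("filename", "").lower().endswith(".pdf"):
            yield part


# Example usage inside the attachment loop:
# for part in iter_pdf_parts(message['payload']):
#     att_id = part.get('body', {}).get('attachmentId')
```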

Integrating with QuickBooks

QuickBooks Online has an API that lets you create bills programmatically. Once you have structured invoice data from Ordalis, you can create a bill in QuickBooks like this:

import requests

QB_ACCESS_TOKEN = "your_quickbooks_token"
QB_COMPANY_ID = "your_company_id"
QB_BASE = f"https://quickbooks.api.intuit.com/v3/company/{QB_COMPANY_ID}"

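def create_quickbooks_bill(invoice_data, vendor_id, expense_account_id):
    """Create a bill in QuickBooks from extracted invoice data.

    vendor_id and expense_account_id are the QuickBooks Ids of an existing
    vendor and expense account. QuickBooks resolves references by Id
    (the "value" key), not by display name.
    """

    bill_payload = {
        "VendorRef": {
            "value": vendor_id,
            "name": invoice_data["vendor_name"]
        },
        "TxnDate": invoice_data["invoice_date"],  # expected as YYYY-MM-DD
        "DueDate": invoice_data["due_date"],
        "DocNumber": invoice_data["invoice_number"],
        "Line": [
            {
                "DetailType": "AccountBasedExpenseLineDetail",
                "Amount": item["line_total"],
                "Description": item["description"],
                "AccountBasedExpenseLineDetail": {
                    # Post each line to an expense account, not to
                    # Accounts Payable, which holds the bill's liability side.
                    "AccountRef": {"value": expense_account_id}
                }
            }
            for item in invoice_data.get("line_items", [])
        ],
        "TotalAmt": invoice_data["total_amount"]
    }

    response = requests.post(
        f"{QB_BASE}/bill",
        json=bill_payload,
        headers={
            "Authorization": f"Bearer {QB_ACCESS_TOKEN}",
            "Accept": "application/json",  # request a JSON response explicitly
            "Content-Type": "application/json"
        },
        timeout=30
    )
    response.raise_for_status()
    return response.json()

# Example usage after extracting invoice data:
# bill = create_quickbooks_bill(invoice_data, "your_vendor_id", "your_expense_account_id")
# print(f"Created QB Bill ID: {bill['Bill']['Id']}")

Note: QuickBooks references vendors and accounts by Id, so look up the vendor whose name in your QB vendor list exactly matches the extracted vendor name before creating the bill. For new vendors, you'll need to create them first via the QB API or handle the mismatch in your script.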
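
One way to handle the vendor match is to query QuickBooks for the vendor before creating the bill. The sketch below reflects our understanding of the QuickBooks Online query endpoint: the /query path, the select statement, the backslash escaping of apostrophes, and the QueryResponse shape should all be verified against Intuit's documentation. The function names are hypothetical, and the standard library's urllib is used here only to keep the example dependency-free.

```python
import json
import urllib.parse
import urllib.request


def build_vendor_query(display_name: str) -> str:
    """Build a query statement that finds a vendor by exact display name."""
    # Assumed escaping rule: backslash before backslashes and apostrophes
    escaped = display_name.replace("\\", "\\\\").replace("'", "\\'")
    return f"select * from Vendor where DisplayName = '{escaped}'"


def parse_vendor_id(response_body: dict):
    """Return the first matching vendor's Id, or None if there is no match."""
    vendors = response_body.get("QueryResponse", {}).get("Vendor", [])
    return vendors[0]["Id"] if vendors else None


def find_vendor_id(base_url: str, access_token: str, display_name: str):
    """Look up a vendor Id by name; returns None when the vendor is unknown."""
    query = urllib.parse.urlencode({"query": build_vendor_query(display_name)})
    request = urllib.request.Request(
        f"{base_url}/query?{query}",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return parse_vendor_id(json.load(resp))


# Example usage:
# vendor_id = find_vendor_id(QB_BASE, QB_ACCESS_TOKEN, invoice_data["vendor_name"])
# if vendor_id is None:
#     ...  # create the vendor first, or send the invoice to manual review
```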

Using No-Code Tools Instead

If you're not writing code, Zapier and Make (formerly Integromat) can wire together the same workflow:

  1. Trigger: New email attachment in Gmail (or new file in Dropbox/Google Drive)
  2. Action: POST the file to the Ordalis API using a webhook step
  3. Action: Parse the JSON response
  4. Action: Create a bill or transaction in QuickBooks/Xero using their native Zapier integrations

This approach requires no code and can be set up in under an hour. It's a good starting point before investing in a custom integration.

Common Questions from Bookkeepers

What about invoices that arrive as images inside emails?

The API supports image formats (PNG, JPG, TIFF) in addition to PDFs. If clients send photo invoices or screenshots, you can process those the same way.
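
If you upload images with the same multipart request used for PDFs, you will probably want to send the matching content type instead of a hard-coded application/pdf. Whether the API requires this is an assumption on our part; the sketch below simply picks a standard MIME type from the file extension and rejects anything outside the formats listed above. The helper name content_type_for is hypothetical.

```python
from pathlib import Path

# Standard MIME types for the formats mentioned above
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def content_type_for(filename: str) -> str:
    """Return the MIME type for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {filename!r}") from None


# Example usage in the upload call:
# files={'file': (filename, file_bytes, content_type_for(filename))}
```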

Can I handle multiple clients from one account?

Yes. The API is stateless — you can pass a schema parameter to specify the document type, and organize your own data by client on your end. Business plans support 100,000 conversions/month with higher concurrency for multi-client workflows.

What happens with invoices that have errors or are unreadable?

The API returns a structured response with a confidence indicator. Low-confidence extractions are flagged so you can route them to a manual review queue rather than letting errors flow into your accounting system.

Automate Your Invoice Processing

Start with 50 free conversions per month. No credit card, no setup fee. See how AI extraction fits into your bookkeeping workflow.
