Receipt OCR

Extract structured data from receipt images using Tesseract OCR. Returns merchant info, line items, totals, and payment method.

POST /api/receipt-ocr
Media
~1.5s avg latency
API Key auth
10MB max image

What It Does

The Receipt OCR API runs Tesseract OCR on receipt and invoice images, then parses the recognized text into structured financial data: merchant details, date and time, line items, subtotal, tax, total, payment method, and currency. It accepts a base64-encoded image in a JSON request body and returns a single JSON object containing the extracted fields.

Key Features

  • Multi-format Date Parsing: Automatically detects and converts 6 common date formats to ISO 8601
  • Line Item Extraction: Parses individual items with descriptions, quantities, and prices
  • Financial Summary: Extracts subtotal, tax, and total amounts
  • Payment Detection: Identifies payment methods (cash, credit, debit, etc.)
  • Currency Recognition: Detects currency symbols and codes and reports a currency code (USD, EUR, GBP, or JPY)
  • Confidence Scoring: Returns Tesseract's average OCR confidence score (0-100) for the image
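
To make the date normalization feature concrete, here is a minimal client-side sketch of multi-format parsing to ISO 8601. The six formats the API recognizes are not listed in this document, so the entries in `ASSUMED_FORMATS` are illustrative guesses, and `to_iso_date` is a hypothetical helper rather than part of the API.

```python
from datetime import datetime
from typing import Optional

# Assumed candidate formats; the API's actual list is not documented here.
ASSUMED_FORMATS = (
    "%Y-%m-%d",   # 2024-01-15
    "%m/%d/%Y",   # 01/15/2024
    "%m/%d/%y",   # 01/15/24
    "%d.%m.%Y",   # 15.01.2024
    "%b %d, %Y",  # Jan 15, 2024
    "%d %b %Y",   # 15 Jan 2024
)


def to_iso_date(text: str) -> Optional[str]:
    """Return YYYY-MM-DD for the first format that matches, else None."""
    cleaned = text.strip()
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None
```

Trying formats in a fixed order means slash-separated dates are read month-first in this sketch; a day-first locale would need a different ordering.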

Code Examples

curl -X POST https://api.atomicapis.dev/api/receipt-ocr \
  -H "X-RapidAPI-Proxy-Secret: YOUR_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "iVBORw0KGgoAAAANSUhEUgAA...",
    "language": "eng",
    "includeRawText": false
  }'
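
The same request can be sketched in Python using only the standard library. The endpoint, header names, and JSON fields come from the curl example; the function names `build_request` and `extract_receipt` and the 30-second timeout are assumptions made for illustration.

```python
import base64
import json
import urllib.request

API_URL = "https://api.atomicapis.dev/api/receipt-ocr"


def build_request(image_bytes: bytes, secret: str, language: str = "eng",
                  include_raw_text: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request for one receipt image."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "language": language,
        "includeRawText": include_raw_text,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-RapidAPI-Proxy-Secret": secret,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_receipt(image_bytes: bytes, secret: str) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_request(image_bytes, secret)
    # Timeout is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A typical call would read the image from disk, for example `extract_receipt(open("receipt.png", "rb").read(), "YOUR_SECRET")`, and then inspect the `receipt` object in the result.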

Request Parameters

Name            Type     Required  Description
image           string   Required  Base64-encoded image data. Supports JPEG, PNG, WebP, TIFF, and BMP formats. Maximum size: 10MB.
language        string   Optional  OCR language code (e.g., "eng", "deu", "fra"). Default: "eng".
includeRawText  boolean  Optional  When set to true, includes raw OCR text output in the rawText response field. Omit or set to false to exclude.

Response Format

200 OK - Success Response
{
  "receipt": {
    "merchantName": "Starbucks Coffee",
    "merchantAddress": "123 Main St, Seattle, WA 98101",
    "date": "2024-01-15",
    "time": "14:32",
    "items": [
      {
        "description": "Grande Caffe Latte",
        "quantity": 1,
        "unitPrice": 4.95,
        "totalPrice": 4.95
      },
      {
        "description": "Blueberry Muffin",
        "quantity": 2,
        "unitPrice": 2.75,
        "totalPrice": 5.50
      }
    ],
    "subtotal": 10.45,
    "tax": 0.87,
    "total": 11.32,
    "paymentMethod": "VISA",
    "cardLastFour": "4242",
    "currency": "USD"
  },
  "confidence": 87.45,
  "rawText": null,
  "processingTimeMs": 1240.35
}

Response Fields

Field                    Type           Description
receipt.merchantName     string | null  Name of the merchant or store. Null if not detected.
receipt.merchantAddress  string | null  Address of the merchant. Null if not detected.
receipt.date             string | null  Transaction date in ISO 8601 format (YYYY-MM-DD). Null if not detected.
receipt.time             string | null  Transaction time (e.g., "14:32" or "2:32 PM"). Null if not detected.
receipt.items            array          Array of purchased items, each with description (string), quantity (integer), unitPrice (number | null), and totalPrice (number).
receipt.subtotal         number | null  Subtotal before tax. Null if not detected.
receipt.tax              number | null  Tax amount. Null if not detected.
receipt.total            number | null  Final total amount. Null if not detected.
receipt.paymentMethod    string | null  Payment method detected (VISA, MASTERCARD, AMEX, DISCOVER, DINERS, JCB, DEBIT, CREDIT, or CASH). Null if not detected.
receipt.cardLastFour     string | null  Last four digits of the payment card. Null if not detected.
receipt.currency         string | null  Detected currency code (USD, EUR, GBP, or JPY). Null if not detected.
confidence               number         Average OCR confidence score (0-100) from Tesseract, rounded to 2 decimal places.
rawText                  string | null  Raw OCR text output (only if includeRawText is true).
processingTimeMs         number         Processing time in milliseconds, rounded to 2 decimal places.
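
Because most fields can be null and OCR can misread digits, a client may want to sanity-check the amounts before trusting them. The following is a minimal sketch of such a check; `check_totals`, `money`, and the one-cent tolerance are assumptions for illustration and not part of the API.

```python
from decimal import Decimal
from typing import Any, List


def money(value: Any) -> Decimal:
    """Convert a JSON number to Decimal via str to avoid binary float drift."""
    return Decimal(str(value))


def check_totals(receipt: dict, tolerance: str = "0.01") -> List[str]:
    """Return human-readable inconsistencies; an empty list means none found."""
    problems = []
    tol = Decimal(tolerance)
    items = receipt.get("items") or []
    subtotal = receipt.get("subtotal")
    tax = receipt.get("tax")
    total = receipt.get("total")

    # Line items should add up to the subtotal when both are present.
    if items and subtotal is not None:
        items_sum = sum((money(i["totalPrice"]) for i in items), Decimal("0"))
        if abs(items_sum - money(subtotal)) > tol:
            problems.append(f"items sum {items_sum} != subtotal {subtotal}")

    # Subtotal plus tax should equal the total when all three are present.
    if subtotal is not None and tax is not None and total is not None:
        expected = money(subtotal) + money(tax)
        if abs(expected - money(total)) > tol:
            problems.append(f"subtotal + tax {expected} != total {total}")

    return problems
```

Applied to the sample response above, both checks pass (4.95 + 5.50 = 10.45 and 10.45 + 0.87 = 11.32). A non-empty result, or a low `confidence` value, could be used to route a receipt to manual review.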

Use Cases

Expense Tracking

Build expense tracking apps that extract data from receipt photos. Users photograph a receipt, and the merchant, date, line items, and totals are returned from a single request.

Accounting Integration

Integrate with accounting software like QuickBooks or Xero. Automatically populate expense entries from receipt images without manual data entry.

Receipt Management

Create digital receipt archives with searchable structured data. Enable users to find receipts by merchant, date, amount, or item description.

Build Constraints

Tesseract CLI Shell-out

The API invokes the Tesseract command-line binary rather than an in-process OCR library. Each decoded image is written to a temporary file for the duration of the OCR run and removed afterwards.

Supported Formats & Size

Accepts base64-encoded images in JPEG, PNG, WebP, TIFF, and BMP formats. Maximum file size is 10MB. Larger images should be optimized before encoding.
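
A client can reject unsupported or oversized images before spending a request on them. The sketch below identifies the five supported formats by their leading magic bytes and enforces the size limit. It assumes "10MB" means 10 × 1024 × 1024 bytes of decoded image data, which this document does not specify, and `sniff_format` and `preflight` are hypothetical helper names.

```python
from typing import Optional

# Assumption: the limit applies to the decoded image and 1MB = 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def sniff_format(data: bytes) -> Optional[str]:
    """Identify a supported image format from its leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:4] in (b"II*\x00", b"MM\x00*"):  # little- and big-endian TIFF
        return "tiff"
    if data.startswith(b"BM"):
        return "bmp"
    return None


def preflight(data: bytes) -> str:
    """Return the detected format, or raise ValueError if the image is unusable."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"image is {len(data)} bytes; limit is {MAX_IMAGE_BYTES}")
    fmt = sniff_format(data)
    if fmt is None:
        raise ValueError("unsupported image format")
    return fmt
```

Note that base64 encoding inflates the payload by roughly one third, so the JSON body for an image near the limit will be noticeably larger than the image itself.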

Temp File Lifecycle

Temporary files are created during processing and automatically deleted immediately after OCR extraction completes. No images are persisted on the server.
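
The shell-out and temp file lifecycle described in this section could look roughly like the sketch below. This is not the service's actual code: `ocr_image`, the `.img` suffix, the 60-second timeout, and the injectable `run` parameter (included so the function can be exercised without Tesseract installed) are all assumptions. The command line itself uses Tesseract's standard `tesseract <image> stdout -l <lang>` form.

```python
import os
import subprocess
import tempfile
from typing import Callable


def ocr_image(image_bytes: bytes, language: str = "eng",
              run: Callable = subprocess.run) -> str:
    """Write the image to a temp file, run Tesseract on it, and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=".img")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        # "stdout" as the output base makes Tesseract print the recognized text.
        result = run(
            ["tesseract", path, "stdout", "-l", language],
            capture_output=True, text=True, check=True, timeout=60,
        )
        return result.stdout
    finally:
        # Runs on success, OCR failure, and timeout alike.
        os.remove(path)
```

Putting the cleanup in `finally` is what makes "no images are persisted" hold even when OCR fails, and passing the command as a list avoids shell interpretation of the language argument.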

Stateless Operation

The API is stateless: each request is processed independently, and no image or extracted data is retained between requests.

Error Codes

Error            HTTP Status  Message
Missing image    400          The 'image' field is required and cannot be empty.
Invalid base64   400          Invalid base64-encoded image data.
Image too large  400          Image exceeds maximum allowed size of 10MB.
OCR error        400          An error occurred during receipt OCR processing.
Unauthorized     401          Invalid or missing X-RapidAPI-Proxy-Secret header.
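
Since four distinct failures share status 400, a client has to look at the message to tell them apart. Below is a minimal sketch of mapping the table above onto exceptions. The exception classes, the `kind` labels, and `raise_for_status` are hypothetical, and the sketch assumes the caller has already extracted the message text from the error response, whose body shape is not documented here.

```python
class ReceiptOcrError(Exception):
    """Base error carrying the HTTP status, message, and a short kind label."""

    def __init__(self, status: int, message: str, kind: str = "unknown"):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.kind = kind


class AuthError(ReceiptOcrError):
    """401: the X-RapidAPI-Proxy-Secret header is missing or invalid."""


class BadRequestError(ReceiptOcrError):
    """400: the request or image was rejected, or OCR failed."""


# Messages copied from the error table; the kind labels are made up for this sketch.
KNOWN_400_MESSAGES = {
    "The 'image' field is required and cannot be empty.": "missing_image",
    "Invalid base64-encoded image data.": "invalid_base64",
    "Image exceeds maximum allowed size of 10MB.": "image_too_large",
    "An error occurred during receipt OCR processing.": "ocr_error",
}


def raise_for_status(status: int, message: str = "") -> None:
    """Raise a typed exception for non-2xx responses; return None on success."""
    if 200 <= status < 300:
        return
    if status == 401:
        raise AuthError(status, message, kind="unauthorized")
    if status == 400:
        raise BadRequestError(status, message, kind=KNOWN_400_MESSAGES.get(message, "unknown"))
    raise ReceiptOcrError(status, message)
```

With this in place, calling code can catch `AuthError` to prompt for new credentials and branch on `BadRequestError.kind` to, for example, resize and retry only when the kind is `image_too_large`.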
