Tags: AI, Azure OpenAI, Document Processing, Python

AI Document Analyzer

Document processing tool that uses Azure OpenAI to extract, classify, and summarize information from PDF, Word, and Excel documents.

Author: Jean-Baptiste Martin
Published: 15 September 2024

About this app

Overview

The AI Document Analyzer is an internal tool that uses Azure OpenAI GPT-4o to automatically extract structured information from PDF, DOCX, and XLSX files.

Key Features

  • Automated extraction: Identifies key entities, dates, amounts, and clauses
  • Multi-format support: Handles PDF, Word, and Excel files
  • Structured output: Exports results as JSON or CSV
  • Batch processing: Handles up to 100 documents per session
  • Azure integration: Fully integrated with Azure Blob Storage and Azure Cognitive Services
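
As an illustration of the structured output feature, the sketch below shows one way extraction results could be serialized to JSON or CSV. The `ExtractionResult` record, its field names, and the `to_json` / `to_csv` helpers are hypothetical; the tool's actual schema is not documented here.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ExtractionResult:
    """Hypothetical record for one processed document (field names are assumptions)."""
    document: str
    entities: list[str] = field(default_factory=list)
    dates: list[str] = field(default_factory=list)
    amounts: list[str] = field(default_factory=list)
    clauses: list[str] = field(default_factory=list)


def to_json(results: list[ExtractionResult]) -> str:
    """Serialize results as a JSON array, one object per document."""
    return json.dumps([asdict(r) for r in results], indent=2, ensure_ascii=False)


def to_csv(results: list[ExtractionResult]) -> str:
    """Serialize results as CSV, one row per document.

    List-valued fields are flattened into a single "; "-separated cell.
    """
    fieldnames = ["document", "entities", "dates", "amounts", "clauses"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for result in results:
        row = asdict(result)
        flat = {
            key: "; ".join(value) if isinstance(value, list) else value
            for key, value in row.items()
        }
        writer.writerow(flat)
    return buffer.getvalue()
```

JSON keeps the list structure of each field, while CSV needs one flat cell per column. The "; " separator is an arbitrary choice for this sketch.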

Technical Stack

  • Backend: Python (FastAPI), Azure OpenAI (GPT-4o)
  • Frontend: React + TypeScript
  • Infrastructure: Azure Container Apps, Azure Blob Storage
  • CI/CD: GitHub Actions → Azure Container Registry

Getting Started

Deploy the application using the provided ARM templates in the /infra folder. Ensure you have the necessary Azure OpenAI quota before deploying.

Limitations

Currently limited to documents under 50 MB. Scanned PDFs require OCR pre-processing via Azure Document Intelligence.
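
A pre-flight check can catch the documented limits before anything is uploaded. The sketch below is a minimal example, assuming that "50 MB" means 50 × 1024 × 1024 bytes and that the 100-document cap applies per submitted batch. The function and constant names are hypothetical and not part of the tool. It does not detect scanned PDFs, since that requires inspecting file contents.

```python
from pathlib import Path

# Limits taken from this page; the exact byte interpretation is an assumption.
MAX_FILE_BYTES = 50 * 1024 * 1024
MAX_BATCH_SIZE = 100
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx"}


def validate_document(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems for one document (empty if it passes)."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file is 50 MB or larger")
    return problems


def validate_batch(documents: list[tuple[str, int]]) -> dict[str, list[str]]:
    """Check a batch of (name, size_bytes) pairs.

    Raises ValueError if the batch exceeds the session limit; otherwise
    returns a mapping of rejected document names to their problems.
    """
    if len(documents) > MAX_BATCH_SIZE:
        raise ValueError(
            f"batch has {len(documents)} documents; limit is {MAX_BATCH_SIZE}"
        )
    rejected = {}
    for name, size_bytes in documents:
        problems = validate_document(name, size_bytes)
        if problems:
            rejected[name] = problems
    return rejected
```

For example, `validate_batch([("report.pdf", 1024), ("notes.txt", 1024)])` reports only `notes.txt` as rejected, because `.txt` is not a supported format.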