
Azure AI Document Intelligence

Azure AI Document Intelligence (formerly known as Azure Form Recognizer) is a machine-learning-based service that extracts text (including handwriting), tables, and key-value pairs from scanned documents or images.

The current loader implementation built on Document Intelligence can incorporate content page by page and turn it into LangChain documents.

Document Intelligence supports PDF, JPEG, PNG, BMP, or TIFF.
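
Since only the formats listed above are accepted, it can help to check a local path before sending it to the service. The following is a minimal plain-Python sketch; the helper name `is_supported_file` and the mapping of formats to file extensions (including the `.jpg` and `.tif` aliases) are assumptions made for illustration and are not part of the loader's API.

```python
from pathlib import Path

# Assumed mapping of the formats named above to common file extensions.
SUPPORTED_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif"}


def is_supported_file(path: str) -> bool:
    """Return True if the path's extension matches one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the extension, not at the file contents, so it is a convenience filter rather than a guarantee that the service will accept the file.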

Further documentation is available at https://aka.ms/doc-intelligence.

%pip install langchain langchain-community azure-ai-documentintelligence -q

Example 1

The first example uses a local file, which is sent to Azure AI Document Intelligence for analysis.

With the endpoint and key of your Document Intelligence resource, we can create an instance of AzureAIDocumentIntelligenceLoader:

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

file_path = "<filepath>"
endpoint = "<endpoint>"
key = "<key>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, file_path=file_path, api_model="prebuilt-layout"
)

documents = loader.load()
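
Rather than pasting the endpoint and key into the notebook, they can be read from the environment. This is only a sketch: the function name `read_azure_settings` and both environment variable names are hypothetical, so substitute whatever names your deployment defines.

```python
import os

# Hypothetical variable names, chosen for this example only.
ENDPOINT_VAR = "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT"
KEY_VAR = "AZURE_DOCUMENT_INTELLIGENCE_KEY"


def read_azure_settings(env=os.environ):
    """Return (endpoint, key) from the given mapping, or raise if either is unset."""
    endpoint = env.get(ENDPOINT_VAR)
    key = env.get(KEY_VAR)
    missing = [name for name, value in ((ENDPOINT_VAR, endpoint), (KEY_VAR, key)) if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return endpoint, key
```

The returned pair can then be passed as `api_endpoint` and `api_key` in place of the `"<endpoint>"` and `"<key>"` placeholders.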

By default, the output is a single LangChain document whose content is formatted as Markdown:

documents
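
Printing the raw list can be hard to read for long files. The sketch below summarizes each document in one line; `preview_documents` is a hypothetical helper, and it relies only on the `page_content` attribute that LangChain documents expose. A stand-in object is used so the snippet runs without calling the service.

```python
from types import SimpleNamespace


def preview_documents(docs, width=60):
    """Return one summary line per document: index, length, and a text preview."""
    lines = []
    for i, doc in enumerate(docs):
        text = doc.page_content
        # Collapse whitespace so Markdown line breaks do not split the preview.
        snippet = " ".join(text.split())[:width]
        lines.append(f"[{i}] {len(text)} chars: {snippet}")
    return lines


# Stand-in with the same attributes as a LangChain document (illustrative data).
sample = [SimpleNamespace(page_content="# Invoice\n\nTotal: 42", metadata={})]
print("\n".join(preview_documents(sample)))
```

With the loader output, `preview_documents(documents)` should yield a single line for the default Markdown document. If the installed version of the loader accepts the `mode="page"` argument, the same helper would list one line per page instead.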

Example 2

The input file can also be given as a URL.

url_path = "<url>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, url_path=url_path, api_model="prebuilt-layout"
)

documents = loader.load()
documents
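
The two examples differ only in whether the source is passed as `file_path` or `url_path`. When sources arrive as plain strings, one way to pick the right keyword is sketched below; `source_kwargs` is a hypothetical helper, and treating only `http` and `https` as URLs is an assumption of this example.

```python
from urllib.parse import urlparse


def source_kwargs(source: str) -> dict:
    """Map a source string to the loader keyword argument it should be passed as."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"url_path": source}
    # Anything else, including Windows drive paths, is treated as a local file.
    return {"file_path": source}
```

The result can be unpacked into the constructor, for example `AzureAIDocumentIntelligenceLoader(api_endpoint=endpoint, api_key=key, api_model="prebuilt-layout", **source_kwargs(source))`.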