Azure OpenAI Service
Azure OpenAI Service provides OpenAI models (GPT-5.4, GPT-5.4 mini, GPT-5.4 nano, embedding models) hosted within Microsoft Azure -- now accessed through Microsoft AI Foundry. This is the recommended path for enterprises that require Azure compliance, data residency guarantees, private networking, or Microsoft Entra ID integration.
GPT-5.4 mini and GPT-5.4 nano are now available on Azure via Microsoft AI Foundry in Standard Global deployment and Data Zone US, and are rolling out to Data Zone EU. See the official announcement.
Contract Lucidity treats Azure OpenAI as a variant of the OpenAI provider. Internally, both use the openai Python SDK -- Azure OpenAI simply requires an Azure-specific endpoint URL and API key instead of the standard OpenAI endpoint. The provider key is azure_openai.
When to Use Azure OpenAI
| Requirement | Standard OpenAI | Azure OpenAI |
|---|---|---|
| Data stays in your Azure tenant | No | Yes |
| Private endpoint (VNet integration) | No | Yes |
| Azure AD / Entra ID authentication | No | Yes |
| SOC 2 Type II via your Azure subscription | Shared | Dedicated |
| HIPAA BAA through Azure | No | Yes |
| Content filtering customisation | Limited | Full control |
| Model availability | All models, immediately | Per-region, may lag |
| Setup complexity | Low | Medium-High |
Prerequisites
Before configuring CL, you need:
- An Azure subscription with billing enabled
- Access approval for Azure OpenAI Service (may require an application form at aka.ms/oai/access)
- Permission to create resources in your Azure subscription (Contributor role or higher)
Setting Up Azure OpenAI
Step 1: Create an Azure OpenAI Resource
- Sign in to the Azure Portal
- Click Create a resource
- Search for "Azure OpenAI" and select it
- Click Create and fill in:
- Subscription: Select your Azure subscription
- Resource group: Create new or select existing
- Region: Choose a region that supports your desired models (see model availability by region)
- Name: A unique name (e.g., cl-openai-prod)
- Pricing tier: Standard S0
- Configure network access:
- All networks (simplest for initial setup)
- Selected networks (recommended for production -- restrict to your CL server's IP)
- Private endpoint (most secure -- requires VNet)
- Click Review + create, then Create
Not all models are available in all Azure regions. GPT-5.4 mini and nano are available in Standard Global and Data Zone US as of March 2026. Check the Azure OpenAI model availability page for current regional support.
Step 2: Deploy Models
After the resource is created:
- Open your Azure OpenAI resource
- Click Model deployments > Manage Deployments (opens Azure AI Studio)
- Click + Deploy model > Deploy base model
- Select the model you want to deploy:
| CL Capability | Recommended Model | Deployment Name Convention | Pricing (per 1M tokens) |
|---|---|---|---|
| Extraction & Classification | GPT-5.4 nano | cl-nano-extract | $0.20 input / $1.25 output |
| Document Understanding | GPT-5.4 mini | cl-mini-understand | $0.75 input / $4.50 output |
| Reasoning | GPT-5.4 | cl-54-reason | $2.50 input / $15.00 output |
| Generation | GPT-5.4 mini | cl-mini-generate | $0.75 input / $4.50 output |
| Embeddings | text-embedding-3-small | cl-embed-small | $0.02 input |
- Set the tokens per minute (TPM) quota for each deployment
- Click Deploy
Azure OpenAI requires you to use the deployment name (not the model name) in API calls. When configuring CL, enter the deployment name as the model name in the AI Capabilities settings.
For example, if you deploy GPT-5.4 nano with the deployment name cl-nano-extract, enter cl-nano-extract as the model in CL's capability mapping.
Step 3: Get Endpoint and API Key
- In the Azure Portal, navigate to your Azure OpenAI resource
- Click Keys and Endpoint in the left sidebar
- Copy:
- KEY 1 (or KEY 2) -- this is your API key
- Endpoint -- the URL (e.g., https://cl-openai-prod.openai.azure.com/)
Step 4: Configure in Contract Lucidity
- Navigate to Settings > AI Providers
- Click Add Provider
- Select Azure OpenAI as the provider type
- Enter:
- API Key: The key from Step 3
- Endpoint URL: The endpoint from Step 3
- Click Save & Verify
Then map capabilities as described in Step 2 of the Overview, using your deployment names as the model names.
Architecture
Quota and Scaling
Azure OpenAI uses a Tokens Per Minute (TPM) quota system per deployment. Default quotas vary by model and region.
Recommended Quotas for CL
| Deployment | Min TPM | Recommended TPM | Max Available |
|---|---|---|---|
| GPT-5.4 nano (extraction) | 30K | 120K | 600K+ (varies by region) |
| GPT-5.4 mini (understanding/generation) | 30K | 120K | 600K+ |
| GPT-5.4 (reasoning) | 30K | 80K | 300K+ |
| Embedding model | 120K | 350K | 2M+ |
Increasing Quotas
- In Azure AI Studio, click Quotas in the left sidebar
- Select your deployment
- Click Request quota increase
- Specify the desired TPM
If you are processing many documents simultaneously, prioritise increasing the embedding model's TPM quota. Embedding calls are made in batches of up to 100 texts and can consume quota quickly.
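A minimal sketch of that batching pattern. The deployment name is the example from the table above, and any client object exposing an embeddings.create method (such as the openai SDK's AzureOpenAI client) works:

```python
def embed_in_batches(client, texts, deployment="cl-embed-small", batch_size=100):
    """Embed texts in batches of up to `batch_size`, the per-call limit
    described above, collecting vectors in input order."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        resp = client.embeddings.create(model=deployment, input=batch)
        vectors.extend(item.embedding for item in resp.data)
    return vectors
```

Each batch of 100 texts consumes TPM quota in one burst, which is why the embedding deployment's quota is the first to raise under heavy document intake.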
Cost Considerations
Azure OpenAI pricing is generally equivalent to standard OpenAI pricing for the same models, with minor regional variations. Key differences:
| Factor | Standard OpenAI | Azure OpenAI |
|---|---|---|
| Per-token pricing | Same | Same (or very close) |
| Provisioned throughput | Not available | Available (reserved capacity at discount) |
| Billing | Direct to OpenAI | Through your Azure subscription |
| Committed use discounts | No | Yes (Azure reservations) |
| Cost visibility | OpenAI dashboard | Azure Cost Management + tags |
Provisioned Throughput Units (PTU)
For high-volume, predictable workloads, Azure offers Provisioned Throughput -- reserved capacity billed hourly rather than per-token. This can reduce costs by 30-50% for Am Law 100 deployments processing thousands of documents monthly.
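A quick way to decide whether Provisioned Throughput is worth evaluating is to estimate your pay-as-you-go spend first. The prices below are the per-1M-token figures from the deployment table above; the document volumes are placeholders, not benchmarks:

```python
# Per-1M-token prices (USD) from the deployment table above.
PRICES = {
    "cl-nano-extract": {"input": 0.20, "output": 1.25},
    "cl-mini-understand": {"input": 0.75, "output": 4.50},
}

def monthly_cost(deployment, input_tokens, output_tokens):
    """Pay-as-you-go cost in USD for one deployment's monthly token volume."""
    p = PRICES[deployment]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 5,000 contracts/month, ~20K input and ~1K output tokens each,
# through the extraction deployment (placeholder volumes).
cost = monthly_cost("cl-nano-extract", 5_000 * 20_000, 5_000 * 1_000)
```

If the resulting figure is large and the load is steady month to month, compare it against your negotiated PTU rate; for spiky or low-volume workloads, per-token billing usually wins.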
Security Best Practices
Network Isolation
For production deployments, restrict network access to your Azure OpenAI resource:
Azure OpenAI Resource > Networking > Firewalls and virtual networks
Options:
- Allow specific IP addresses -- add your CL server's public IP
- Private endpoint -- create a private endpoint in the same VNet as your CL deployment
- Service endpoint -- if CL runs in an Azure VM or AKS within the same VNet
Key Rotation
Azure provides two API keys (KEY 1 and KEY 2) to enable zero-downtime key rotation:
- Update CL to use KEY 2
- Regenerate KEY 1
- (Next rotation) Update CL to use KEY 1
- Regenerate KEY 2
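One way to keep that rotation zero-downtime is to keep the key out of code entirely and read it from configuration at startup, so switching between KEY 1 and KEY 2 is an environment change followed by a restart. The variable names below are illustrative conventions, not settings CL defines:

```python
import os

def load_azure_openai_config():
    """Read Azure OpenAI credentials from the environment.

    Rotating keys then means updating AZURE_OPENAI_API_KEY (to KEY 2,
    then later back to KEY 1) with no code change.
    """
    return {
        "api_key": os.environ["AZURE_OPENAI_API_KEY"],
        "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    }
```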
Content Filtering
Azure OpenAI includes built-in content filtering that can be customised per deployment. For legal contract analysis, the default filter settings are generally appropriate. If you encounter false-positive content filtering on legitimate contract language (e.g., indemnification clauses discussing liability for bodily injury), you can adjust filters in Azure AI Studio.
Troubleshooting
| Symptom | Cause | Solution |
|---|---|---|
| 404 Resource Not Found | Wrong endpoint URL or deployment name | Verify the endpoint URL includes a trailing /; verify the deployment name matches exactly |
| 401 Access Denied | Invalid API key | Regenerate the key in Azure Portal > Keys and Endpoint |
| 429 Rate Limit Exceeded | TPM quota exceeded | Increase the deployment quota or reduce concurrency |
| DeploymentNotFound | Deployment name typo in CL config | Use the deployment name (not the model name) in CL capability mapping |
| Model not available in region | Region limitation | GPT-5.4 mini/nano: Standard Global, Data Zone US. Check model availability |
| Content filter triggered | Default filter blocking legal content | Customise content filter in Azure AI Studio |
| Slow responses | Low TPM allocation | Increase TPM quota; consider Provisioned Throughput for consistent latency |
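For the 429 case, client-side exponential backoff keeps batch jobs running through brief quota spikes. A generic sketch; with the openai SDK you would pass openai.RateLimitError as the exception type:

```python
import time

def with_backoff(fn, *, retries=5, base_delay=1.0, retry_on=Exception):
    """Call fn(), retrying with exponential backoff (1s, 2s, 4s, ...)
    whenever an exception of type `retry_on` is raised."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
```

Raising the deployment's TPM quota is still the real fix; backoff only smooths transient spikes.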