Azure OpenAI Service

Azure OpenAI Service provides OpenAI models (GPT-5.4, GPT-5.4 mini, GPT-5.4 nano, embedding models) hosted within Microsoft Azure — now accessed through Microsoft AI Foundry. This is the recommended path for enterprises that require Azure compliance, data residency guarantees, private networking, or Microsoft Entra ID integration.

GPT-5.4 on Azure (March 2026)

GPT-5.4 mini and GPT-5.4 nano are now available on Azure via Microsoft Foundry in Standard Global and Data Zone US deployments, with Data Zone EU rolling out. See the official announcement.

How CL Connects to Azure OpenAI

Contract Lucidity treats Azure OpenAI as a variant of the OpenAI provider. Internally, both use the openai Python SDK -- Azure OpenAI simply requires an Azure-specific endpoint URL and API key instead of the standard OpenAI endpoint. The provider key is azure_openai.
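Under the hood, the difference shows up in the request URL: standard OpenAI addresses models directly, while Azure routes requests through your resource endpoint and a deployment name, plus an api-version query parameter. A minimal sketch of that URL construction (the api-version value here is illustrative, not a recommendation):

```python
def azure_chat_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Build the Azure OpenAI chat-completions URL the SDK targets internally.

    `endpoint` is the resource URL from the Azure Portal (Keys and Endpoint);
    `deployment` is your deployment name, not the underlying model name.
    """
    return (
        f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
        f"/chat/completions?api-version={api_version}"
    )

# Example with an illustrative api-version value:
url = azure_chat_url(
    "https://cl-openai-prod.openai.azure.com/", "cl-nano-extract", "2024-10-21"
)
print(url)
```

The standard OpenAI endpoint has no deployment segment, which is why CL only needs the endpoint URL and key swapped to talk to Azure.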

When to Use Azure OpenAI

| Requirement | Standard OpenAI | Azure OpenAI |
| --- | --- | --- |
| Data stays in your Azure tenant | No | Yes |
| Private endpoint (VNet integration) | No | Yes |
| Azure AD / Entra ID authentication | No | Yes |
| SOC 2 Type II via your Azure subscription | Shared | Dedicated |
| HIPAA BAA through Azure | No | Yes |
| Content filtering customisation | Limited | Full control |
| Model availability | All models, immediately | Per-region, may lag |
| Setup complexity | Low | Medium-High |

Prerequisites

Before configuring CL, you need:

  1. An Azure subscription with billing enabled
  2. Access approval for Azure OpenAI Service (may require an application form at aka.ms/oai/access)
  3. Permission to create resources in your Azure subscription (Contributor role or higher)

Setting Up Azure OpenAI

Step 1: Create an Azure OpenAI Resource

  1. Sign in to the Azure Portal
  2. Click Create a resource
  3. Search for "Azure OpenAI" and select it
  4. Click Create and fill in:
    • Subscription: Select your Azure subscription
    • Resource group: Create new or select existing
    • Region: Choose a region that supports your desired models (see model availability by region)
    • Name: A unique name (e.g., cl-openai-prod)
    • Pricing tier: Standard S0
  5. Configure network access:
    • All networks (simplest for initial setup)
    • Selected networks (recommended for production -- restrict to your CL server's IP)
    • Private endpoint (most secure -- requires VNet)
  6. Click Review + create, then Create
Region Selection Matters

Not all models are available in all Azure regions. GPT-5.4 mini and nano are available in Standard Global and Data Zone US as of March 2026. Check the Azure OpenAI model availability page for current regional support.

Step 2: Deploy Models

After the resource is created:

  1. Open your Azure OpenAI resource
  2. Click Model deployments > Manage Deployments (opens Azure AI Studio)
  3. Click + Deploy model > Deploy base model
  4. Select the model you want to deploy:
| CL Capability | Recommended Model | Deployment Name Convention | Pricing (per 1M tokens) |
| --- | --- | --- | --- |
| Extraction & Classification | GPT-5.4 nano | cl-nano-extract | $0.20 input / $1.25 output |
| Document Understanding | GPT-5.4 mini | cl-mini-understand | $0.75 input / $4.50 output |
| Reasoning | GPT-5.4 | cl-54-reason | $2.50 input / $15.00 output |
| Generation | GPT-5.4 mini | cl-mini-generate | $0.75 input / $4.50 output |
| Embeddings | text-embedding-3-small | cl-embed-small | $0.02 input |
  5. Set the tokens per minute (TPM) quota for each deployment
  6. Click Deploy
Deployment Name vs Model Name

Azure OpenAI requires you to use the deployment name (not the model name) in API calls. When configuring CL, enter the deployment name as the model name in the AI Capabilities settings.

For example, if you deploy GPT-5.4 nano with the deployment name cl-nano-extract, enter cl-nano-extract as the model in CL's capability mapping.
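To make the distinction concrete, here is a sketch of the request payload CL would send for each capability. The `model` field carries the deployment name, never the underlying model name; the mapping shown is hypothetical, mirroring the naming convention in the table above:

```python
# Hypothetical capability -> deployment-name mapping, as configured in CL.
CAPABILITY_DEPLOYMENTS = {
    "extraction": "cl-nano-extract",
    "understanding": "cl-mini-understand",
    "reasoning": "cl-54-reason",
    "generation": "cl-mini-generate",
}

def chat_payload(capability: str, prompt: str) -> dict:
    """Request body for Azure OpenAI: `model` is the deployment name."""
    return {
        "model": CAPABILITY_DEPLOYMENTS[capability],  # NOT "gpt-5.4-nano"
        "messages": [{"role": "user", "content": prompt}],
    }

print(chat_payload("extraction", "Classify this clause.")["model"])  # cl-nano-extract
```

Sending the model name (`gpt-5.4-nano`) instead of the deployment name is the most common cause of the `DeploymentNotFound` error covered in Troubleshooting.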

Step 3: Get Endpoint and API Key

  1. In the Azure Portal, navigate to your Azure OpenAI resource
  2. Click Keys and Endpoint in the left sidebar
  3. Copy:
    • KEY 1 (or KEY 2) -- this is your API key
    • Endpoint -- the URL (e.g., https://cl-openai-prod.openai.azure.com/)

Step 4: Configure in Contract Lucidity

  1. Navigate to Settings > AI Providers
  2. Click Add Provider
  3. Select Azure OpenAI as the provider type
  4. Enter:
    • API Key: The key from Step 3
    • Endpoint URL: The endpoint from Step 3
  5. Click Save & Verify

Then map capabilities as described in Step 2 of the Overview, using your deployment names as the model names.

Quota and Scaling

Azure OpenAI uses a Tokens Per Minute (TPM) quota system per deployment. Default quotas vary by model and region.

| Deployment | Min TPM | Recommended TPM | Max Available |
| --- | --- | --- | --- |
| GPT-5.4 nano (extraction) | 30K | 120K | 600K+ (varies by region) |
| GPT-5.4 mini (understanding/generation) | 30K | 120K | 600K+ |
| GPT-5.4 (reasoning) | 30K | 80K | 300K+ |
| Embedding model | 120K | 350K | 2M+ |

Increasing Quotas

  1. In Azure AI Studio, click Quotas in the left sidebar
  2. Select your deployment
  3. Click Request quota increase
  4. Specify the desired TPM
Tip

If you are processing many documents simultaneously, prioritise increasing the embedding model's TPM quota. Embedding calls are made in batches of up to 100 texts and can consume quota quickly.
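The batching behaviour described above can be sketched as follows; the batch size of 100 comes from the text, while the function name and sample data are illustrative:

```python
from typing import Iterator

MAX_BATCH = 100  # CL sends up to 100 texts per embedding request

def batched(texts: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield successive batches of at most `size` texts."""
    for i in range(0, len(texts), size):
        yield texts[i : i + size]

# 250 chunk texts -> 3 embedding calls (100 + 100 + 50), all drawing on TPM quota
batch_sizes = [len(b) for b in batched([f"clause {i}" for i in range(250)])]
print(batch_sizes)  # [100, 100, 50]
```

Each batch consumes quota for the combined token count of all its texts, which is why a burst of document uploads can hit the embedding deployment's TPM limit before any chat deployment is affected.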

Cost Considerations

Azure OpenAI pricing is generally equivalent to standard OpenAI pricing for the same models, with minor regional variations. Key differences:

| Factor | Standard OpenAI | Azure OpenAI |
| --- | --- | --- |
| Per-token pricing | Same | Same (or very close) |
| Provisioned throughput | Not available | Available (reserved capacity at a discount) |
| Billing | Direct to OpenAI | Through your Azure subscription |
| Committed use discounts | No | Yes (Azure reservations) |
| Cost visibility | OpenAI dashboard | Azure Cost Management + tags |

Provisioned Throughput Units (PTU)

For high-volume, predictable workloads, Azure offers Provisioned Throughput -- reserved capacity billed hourly rather than per-token. This can reduce costs by 30-50% for Am Law 100 deployments processing thousands of documents monthly.
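Whether a PTU reservation pays off depends on your pay-as-you-go baseline. A quick sketch of that baseline using the GPT-5.4 mini prices from the deployment table above (the monthly token volumes here are hypothetical):

```python
# Per-1M-token prices from the deployment table above; volumes are hypothetical.
PRICES = {"gpt-5.4-mini": (0.75, 4.50)}  # (input, output) USD per 1M tokens

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Pay-as-you-go cost in USD for the given monthly token volumes."""
    price_in, price_out = PRICES[model]
    return input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out

# e.g. 10M input + 2M output tokens on GPT-5.4 mini in a month:
print(monthly_cost("gpt-5.4-mini", 10_000_000, 2_000_000))  # 16.5
```

If the hourly PTU rate at your sustained volume comes in below this figure, reserved capacity is the cheaper option; otherwise stay on per-token billing.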

Security Best Practices

Network Isolation

For production deployments, restrict network access to your Azure OpenAI resource:

Azure OpenAI Resource > Networking > Firewalls and virtual networks

Options:

  1. Allow specific IP addresses -- add your CL server's public IP
  2. Private endpoint -- create a private endpoint in the same VNet as your CL deployment
  3. Service endpoint -- if CL runs in an Azure VM or AKS within the same VNet

Key Rotation

Azure provides two API keys (KEY 1 and KEY 2) to enable zero-downtime key rotation:

  1. Update CL to use KEY 2
  2. Regenerate KEY 1
  3. (Next rotation) Update CL to use KEY 1
  4. Regenerate KEY 2

Content Filtering

Azure OpenAI includes built-in content filtering that can be customised per deployment. For legal contract analysis, the default filter settings are generally appropriate. If you encounter false-positive content filtering on legitimate contract language (e.g., indemnification clauses discussing liability for bodily injury), you can adjust filters in Azure AI Studio.

Troubleshooting

| Symptom | Cause | Solution |
| --- | --- | --- |
| 404 Resource Not Found | Wrong endpoint URL or deployment name | Verify the endpoint URL includes a trailing /; verify the deployment name matches exactly |
| 401 Access Denied | Invalid API key | Regenerate the key in Azure Portal > Keys and Endpoint |
| 429 Rate Limit Exceeded | TPM quota exceeded | Increase the deployment quota or reduce concurrency |
| DeploymentNotFound | Deployment name typo in CL config | Use the deployment name (not the model name) in CL's capability mapping |
| Model not available in region | Region limitation | GPT-5.4 mini/nano: Standard Global, Data Zone US; check model availability |
| Content filter triggered | Default filter blocking legal content | Customise the content filter in Azure AI Studio |
| Slow responses | Low TPM allocation | Increase the TPM quota; consider Provisioned Throughput for consistent latency |
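For transient 429s that persist even after a quota increase, a client-side retry with exponential backoff is a common mitigation. A minimal sketch under stated assumptions (the exception class and call shape are illustrative, not CL's actual implementation):

```python
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's 429 error type."""

def call_with_backoff(fn, max_retries: int = 4, base_delay: float = 1.0):
    """Retry `fn` on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the 429 to the caller
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Example: a call that is throttled twice before succeeding.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

print(call_with_backoff(flaky, base_delay=0.01))  # ok
```

Backoff smooths short bursts; if 429s are sustained, the fix is the quota increase or Provisioned Throughput option above, not more retries.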