
Learn how to use Azure OpenAI in C# with practical code examples. Integrate GPT-4o into your .NET app with streaming, function calling, and best practices.
Azure OpenAI C# Tutorial — Build AI-Powered .NET Applications with GPT-4o
If you want to add AI capabilities to your .NET application, Azure OpenAI with C# is the most production-ready path available today. Unlike calling the OpenAI API directly, Azure OpenAI gives you enterprise-grade security, regional data residency, private networking, and the same powerful models — including GPT-4o — running inside your own Azure subscription.
In this tutorial, you will learn how to integrate GPT-4o into a C# application from scratch. We will cover setup, basic chat completions, streaming responses, function calling, and the best practices that separate hobby projects from production systems. Every code example is runnable so you can follow along in Visual Studio or VS Code.
Prerequisites — What You Need Before Writing Code
Before you start the Azure OpenAI C# integration, make sure you have the following ready:
- .NET 8 or later installed on your machine
- An Azure subscription (a free trial works)
- An Azure OpenAI resource created in the Azure Portal
- A GPT-4o model deployment within that resource
- Your endpoint URL and API key from the Azure Portal (found under Keys and Endpoint)
To create the Azure OpenAI resource, go to the Azure Portal, search for "Azure OpenAI," and click Create. Choose your subscription, resource group, region, and a name. Once provisioned, navigate to Model Deployments and deploy the gpt-4o model. Note the deployment name — you will need it in your code.
Step 1 — Install the Azure OpenAI SDK for .NET
The official Azure OpenAI SDK for .NET is distributed as a NuGet package. Open your terminal in your project directory and run:
dotnet add package Azure.AI.OpenAI
dotnet add package OpenAI
The Azure.AI.OpenAI package provides the AzureOpenAIClient class, while the OpenAI package provides shared types like ChatClient, ChatMessage, and ChatCompletion. Both packages work together — Azure's SDK extends the base OpenAI SDK with Azure-specific authentication and configuration.
Step 2 — Your First Azure OpenAI Chat Completion in C#
Let's start with the simplest possible example — sending a single prompt and getting a response. This is the foundation of every Azure OpenAI API C# example you will build on.
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
// Replace with your actual values
string endpoint = "https://your-resource-name.openai.azure.com/";
string apiKey = "your-api-key-here";
string deploymentName = "gpt-4o"; // Your model deployment name
// Create the Azure OpenAI client
AzureOpenAIClient azureClient = new(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey));
// Get a ChatClient for your deployment
ChatClient chatClient = azureClient.GetChatClient(deploymentName);
// Send a chat completion request
ChatCompletion completion = await chatClient.CompleteChatAsync(
    [
        new SystemChatMessage("You are a helpful assistant specializing in C# and .NET development."),
        new UserChatMessage("Explain the difference between async and parallel programming in C#.")
    ]);
Console.WriteLine(completion.Content[0].Text);
This code creates an AzureOpenAIClient, gets a ChatClient for your GPT-4o deployment, and sends a conversation with a system message and a user message. The response comes back as a ChatCompletion object, and the generated text lives in completion.Content[0].Text.
Using Microsoft Entra ID Instead of API Keys
For production applications, API keys are not recommended. Use Microsoft Entra ID (formerly Azure Active Directory) with the Azure.Identity package for token-based authentication:
using Azure.Identity;
using Azure.AI.OpenAI;
using OpenAI.Chat;
string endpoint = "https://your-resource-name.openai.azure.com/";
string deploymentName = "gpt-4o";
// Uses DefaultAzureCredential, which works with Managed Identity,
// Azure CLI, Visual Studio, and more
AzureOpenAIClient azureClient = new(
    new Uri(endpoint),
    new DefaultAzureCredential());
ChatClient chatClient = azureClient.GetChatClient(deploymentName);
ChatCompletion completion = await chatClient.CompleteChatAsync(
    [
        new SystemChatMessage("You are a helpful coding assistant."),
        new UserChatMessage("Write a LINQ query to find duplicate emails in a list.")
    ]);
Console.WriteLine(completion.Content[0].Text);
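Install the identity package with dotnet add package Azure.Identity. DefaultAzureCredential automatically picks the right credential source based on your environment: Managed Identity in Azure, and your Visual Studio or Azure CLI credentials locally. The identity you sign in with also needs a data-plane role on the Azure OpenAI resource, such as Cognitive Services OpenAI User; without it, requests fail with an authorization error even though the token itself is valid.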
Step 3 — Stream Responses for Better User Experience
Waiting for a complete response before showing anything to the user creates a poor experience, especially for longer answers. Streaming chat completions in C# lets you display tokens as they arrive, just like ChatGPT does:
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
// endpoint and apiKey are defined as in Step 2
AzureOpenAIClient azureClient = new(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
var messages = new List<ChatMessage>
{
    new SystemChatMessage("You are a senior .NET architect. Give detailed, practical answers."),
    new UserChatMessage("What are the best practices for structuring a Clean Architecture solution in .NET 8?")
};
// Stream the response token by token
await foreach (StreamingChatCompletionUpdate update
    in chatClient.CompleteChatStreamingAsync(messages))
{
    foreach (ChatMessageContentPart part in update.ContentUpdate)
    {
        Console.Write(part.Text);
    }
}
Console.WriteLine();
The CompleteChatStreamingAsync method returns an IAsyncEnumerable of StreamingChatCompletionUpdate objects. Each update may contain one or more content parts with partial text. This is essential for web APIs where you want to use Server-Sent Events (SSE) to push tokens to a frontend in real time.
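To make the SSE idea concrete, here is a minimal sketch of how each streamed text part could be framed before it is written to the HTTP response. The helper name ToSseFrame and the sample chunks are assumptions for illustration, not part of the SDK; the framing itself follows the Server-Sent Events format, where every payload line gets a "data:" field and a blank line ends the event.

```csharp
using System;
using System.Text;

// Formats one streamed text chunk as a Server-Sent Events frame.
// Each line of the payload gets its own "data:" field, and a blank line ends the event.
string ToSseFrame(string text)
{
    var frame = new StringBuilder();
    foreach (string line in text.Split('\n'))
    {
        frame.Append("data: ").Append(line.TrimEnd('\r')).Append('\n');
    }
    frame.Append('\n'); // blank line = end of event
    return frame.ToString();
}

// Stand-ins for the part.Text values produced by the streaming loop (hypothetical data).
string[] sampleChunks = ["Clean ", "Architecture", " has\nfour layers."];

foreach (string sample in sampleChunks)
{
    Console.Write(ToSseFrame(sample));
}
```

In an ASP.NET Core endpoint you would typically set the response content type to text/event-stream, write each frame to the response body, and flush after every write so the browser receives tokens as they arrive.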
Step 4 — Maintain Conversation History
A single request-response is useful, but real applications need multi-turn conversations. The Azure OpenAI API is stateless — you must send the full conversation history with each request:
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
// endpoint and apiKey are defined as in Step 2
AzureOpenAIClient azureClient = new(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// Maintain conversation history
var conversationHistory = new List<ChatMessage>
{
    new SystemChatMessage("You are an expert C# tutor. Keep answers concise and use code examples.")
};
// Simulate a multi-turn conversation
string[] userQuestions =
[
    "What is dependency injection?",
    "Show me how to register services in .NET 8.",
    "How do I use keyed services?"
];
foreach (string question in userQuestions)
{
    conversationHistory.Add(new UserChatMessage(question));
    ChatCompletion completion = await chatClient.CompleteChatAsync(conversationHistory);
    string response = completion.Content[0].Text;
    // Add assistant response to history so the model remembers it
    conversationHistory.Add(new AssistantChatMessage(response));
    Console.WriteLine($"User: {question}");
    Console.WriteLine($"Assistant: {response}\n");
    Console.WriteLine(new string('-', 60));
}
Each time you call CompleteChatAsync, you send the entire list of messages. The model uses the full history to generate contextually relevant responses. Keep in mind that longer histories consume more tokens, so in production you should implement a sliding window or summarization strategy to manage token usage.
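A sliding window can be sketched roughly as follows. To stay self-contained, this example uses plain (Role, Content) tuples instead of the SDK's ChatMessage types, and it estimates tokens with a crude "about four characters per token" heuristic; both are assumptions for illustration. A real implementation would use a proper tokenizer and would keep tool-call messages together with their results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rough heuristic (assumption): about 4 characters per token for English text.
int EstimateTokens(string text) => Math.Max(1, text.Length / 4);

// Keeps the system message (index 0) plus the newest messages that fit the budget.
List<(string Role, string Content)> TrimHistory(
    List<(string Role, string Content)> messages, int maxTokens)
{
    var systemMessage = messages[0];
    int used = EstimateTokens(systemMessage.Content);
    var kept = new List<(string Role, string Content)>();

    // Walk backwards from the newest message and stop once the budget is spent.
    for (int i = messages.Count - 1; i >= 1; i--)
    {
        int cost = EstimateTokens(messages[i].Content);
        if (used + cost > maxTokens) break;
        used += cost;
        kept.Add(messages[i]);
    }

    kept.Reverse();                 // restore chronological order
    kept.Insert(0, systemMessage);  // the system prompt always survives
    return kept;
}

var history = new List<(string Role, string Content)>
{
    ("system", "You are an expert C# tutor."),
    ("user", new string('a', 400)),       // roughly 100 tokens
    ("assistant", new string('b', 400)),  // roughly 100 tokens
    ("user", "How do I use keyed services?")
};

var trimmed = TrimHistory(history, maxTokens: 50);
Console.WriteLine(string.Join(", ", trimmed.Select(m => m.Role)));
```

With a 50-token budget only the system message and the latest question survive; raising the budget lets older turns back in. Summarization is the alternative when older context still matters: replace the dropped turns with one short assistant-written summary message.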
Step 5 — Function Calling (Tool Use) with GPT-4o
One of the most powerful features of GPT-4o is function calling — the model can decide when to call functions you define and provide structured arguments. This lets you build AI agents that interact with databases, APIs, and external systems.
using System.Text.Json;
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;
// endpoint and apiKey are defined as in Step 2
AzureOpenAIClient azureClient = new(
    new Uri(endpoint),
    new AzureKeyCredential(apiKey));
ChatClient chatClient = azureClient.GetChatClient("gpt-4o");
// Define a tool the model can call
ChatTool getWeatherTool = ChatTool.CreateFunctionTool(
    functionName: "get_current_weather",
    functionDescription: "Get the current weather for a given city",
    functionParameters: BinaryData.FromString("""
        {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g. London, New York"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit"
                }
            },
            "required": ["city"]
        }
        """)
);
var messages = new List<ChatMessage>
{
    new SystemChatMessage("You are a helpful weather assistant."),
    new UserChatMessage("What's the weather like in London and New York right now?")
};
var options = new ChatCompletionOptions();
options.Tools.Add(getWeatherTool);
// First call: the model will request function calls
ChatCompletion completion = await chatClient.CompleteChatAsync(messages, options);
if (completion.FinishReason == ChatFinishReason.ToolCalls)
{
    // Add the assistant's tool call message to history
    messages.Add(new AssistantChatMessage(completion));
    // Process each tool call
    foreach (ChatToolCall toolCall in completion.ToolCalls)
    {
        Console.WriteLine($"Model wants to call: {toolCall.FunctionName}");
        Console.WriteLine($"Arguments: {toolCall.FunctionArguments}");
        // Simulate calling a real weather API
        string weatherResult = toolCall.FunctionName switch
        {
            "get_current_weather" => GetWeather(toolCall.FunctionArguments),
            _ => "Unknown function"
        };
        // Return the result to the model
        messages.Add(new ToolChatMessage(toolCall.Id, weatherResult));
    }
    // Second call: the model generates a natural language response
    ChatCompletion finalResponse = await chatClient.CompleteChatAsync(messages, options);
    Console.WriteLine(finalResponse.Content[0].Text);
}
// Simulated weather function
static string GetWeather(BinaryData arguments)
{
    using JsonDocument doc = JsonDocument.Parse(arguments);
    string city = doc.RootElement.GetProperty("city").GetString()!;
    // In a real app, call a weather API here
    return JsonSerializer.Serialize(new
    {
        city,
        temperature = city == "London" ? "15°C" : "28°C",
        condition = city == "London" ? "Cloudy" : "Sunny"
    });
}
Function calling takes two model calls. In the first, the model returns ToolCalls instead of text content; you execute the functions and return each result as a ToolChatMessage. In the second, the model generates a human-readable answer from the tool results. The second response can itself request more tool calls, so production code usually repeats this exchange in a loop until FinishReason is no longer ToolCalls. This pattern is the foundation for building AI agents in C#.
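Once you expose more than one tool, a switch expression gets unwieldy. One possible approach, sketched here with only System.Text.Json, is a dictionary that maps tool names to handlers. The names toolHandlers and Dispatch are hypothetical, and the weather values are hard-coded placeholders. Returning errors as JSON rather than throwing gives the model a chance to correct its arguments on the next turn.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical handlers keyed by tool name; each receives the parsed JSON arguments.
var toolHandlers = new Dictionary<string, Func<JsonElement, string>>
{
    ["get_current_weather"] = arguments =>
    {
        string city = arguments.GetProperty("city").GetString() ?? "unknown";
        // "unit" is optional in the schema, so fall back to celsius.
        string unit = arguments.TryGetProperty("unit", out JsonElement unitElement)
            ? unitElement.GetString() ?? "celsius"
            : "celsius";
        // Placeholder data; a real handler would call a weather API here.
        return JsonSerializer.Serialize(new { city, unit, temperature = 15 });
    }
};

// Looks up the handler and converts failures into a JSON error the model can read.
string Dispatch(string functionName, string argumentsJson)
{
    if (!toolHandlers.TryGetValue(functionName, out var handler))
    {
        return JsonSerializer.Serialize(new { error = $"Unknown function: {functionName}" });
    }

    try
    {
        using JsonDocument doc = JsonDocument.Parse(argumentsJson);
        return handler(doc.RootElement);
    }
    catch (Exception ex) when (ex is JsonException or KeyNotFoundException)
    {
        return JsonSerializer.Serialize(new { error = $"Invalid arguments: {ex.Message}" });
    }
}

Console.WriteLine(Dispatch("get_current_weather", """{"city":"London"}"""));
Console.WriteLine(Dispatch("get_stock_price", "{}"));
```

In the tool-call loop you would pass toolCall.FunctionName and toolCall.FunctionArguments.ToString() to Dispatch and wrap the returned string in a ToolChatMessage.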
Step 6 — Configure Chat Completion Options
The ChatCompletionOptions class lets you fine-tune the model's behavior. Here are the most commonly used settings:
var options = new ChatCompletionOptions
{
    Temperature = 0.7f,          // Range 0 to 2: lower = more focused, higher = more varied
    MaxOutputTokenCount = 1000,  // Limit response length
    TopP = 0.9f,                 // Nucleus sampling threshold
    FrequencyPenalty = 0.0f,     // Positive values reduce repetition
    PresencePenalty = 0.0f,      // Positive values encourage new topics
};
ChatCompletion completion = await chatClient.CompleteChatAsync(messages, options);
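For code generation, use a Temperature of 0 to 0.3 for more focused, repeatable output; a low temperature reduces randomness but does not guarantee correct code or identical results between runs. For creative tasks like writing marketing copy, increase it to 0.7 to 1.0. It is usually best to adjust either Temperature or TopP, not both at once. The MaxOutputTokenCount parameter prevents the model from generating excessively long responses, which helps control costs.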
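If it helps to see why temperature has this effect, the sketch below applies the standard softmax-with-temperature formula to three made-up token scores. This is a simplified illustration of the sampling math, not the service's actual implementation; the function name and the scores are assumptions.

```csharp
using System;
using System.Linq;

// Converts raw scores (logits) into probabilities, dividing by the temperature first.
// Temperature must be greater than 0 here; 0 would divide by zero.
double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    double[] scaled = logits.Select(l => l / temperature).ToArray();
    double max = scaled.Max(); // subtract the max for numerical stability
    double[] exps = scaled.Select(s => Math.Exp(s - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Hypothetical scores for three candidate next tokens.
double[] exampleLogits = [2.0, 1.0, 0.5];

foreach (double t in new[] { 0.2, 0.7, 1.5 })
{
    double[] probs = SoftmaxWithTemperature(exampleLogits, t);
    Console.WriteLine($"T={t}: " + string.Join(", ", probs.Select(p => p.ToString("F3"))));
}
```

At a low temperature nearly all probability lands on the top-scoring token, which is why output becomes more repeatable; at higher temperatures the distribution flattens and less likely tokens get sampled more often.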
Best Practices for Production Azure OpenAI Applications
Moving from a working prototype to a production system requires attention to reliability, cost, and security. Here are the practices that matter most when building Azure OpenAI applications in .NET:
1. Never Hardcode Credentials
Store your endpoint and keys in environment variables, Azure Key Vault, or .NET's Secret Manager. Better yet, use Managed Identity with DefaultAzureCredential so there are no secrets to manage at all.
// Read from environment variables or configuration
string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")
    ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT not set");
// Or use IConfiguration in ASP.NET Core
builder.Services.AddSingleton(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    return new AzureOpenAIClient(
        new Uri(config["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
});
2. Handle Rate Limits and Errors Gracefully
Azure OpenAI enforces rate limits measured in tokens per minute (TPM) and requests per minute (RPM). With the 2.x SDK used in this tutorial, a throttled call surfaces as a ClientResultException (from System.ClientModel) whose Status is 429, not as Azure.RequestFailedException. Implement retry logic with exponential backoff:
using System.ClientModel;
using OpenAI.Chat;
using Polly;
using Polly.Retry;
// Using the Polly resilience library
var retryPipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        DelayGenerator = args =>
        {
            // Back off exponentially (1s, 2s, 4s) when the service throttles the request
            if (args.Outcome.Exception is ClientResultException { Status: 429 })
            {
                return ValueTask.FromResult<TimeSpan?>(
                    TimeSpan.FromSeconds(Math.Pow(2, args.AttemptNumber)));
            }
            // Other transient failures: retry after a short fixed delay
            return ValueTask.FromResult<TimeSpan?>(
                TimeSpan.FromSeconds(1));
        },
        ShouldHandle = new PredicateBuilder()
            .Handle<ClientResultException>(ex => ex.Status is 429 or 500 or 503)
    })
    .Build();
ChatCompletion completion = await retryPipeline.ExecuteAsync(
    async ct => await chatClient.CompleteChatAsync(messages, cancellationToken: ct),
    CancellationToken.None);
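The delay calculation itself can be isolated and tested without any network calls. The sketch below is one possible policy, not something prescribed by the SDK or by Polly: it prefers a server-supplied Retry-After value when you have one, and otherwise uses capped exponential backoff with random jitter so that many clients do not retry at the same instant. The function name, the 30-second cap, and the 50% jitter range are all assumptions.

```csharp
using System;

// Computes the wait before a retry; attempt is 0-based.
// A server-provided Retry-After value wins; otherwise use capped exponential backoff plus jitter.
TimeSpan ComputeDelay(int attempt, TimeSpan? retryAfter, Random rng)
{
    if (retryAfter.HasValue)
    {
        return retryAfter.Value;
    }

    double baseSeconds = Math.Min(Math.Pow(2, attempt), 30); // 1s, 2s, 4s, ... capped at 30s
    double jitter = rng.NextDouble() * 0.5 * baseSeconds;     // up to +50% to spread out clients
    return TimeSpan.FromSeconds(baseSeconds + jitter);
}

var random = new Random();
for (int i = 0; i < 4; i++)
{
    double seconds = ComputeDelay(i, null, random).TotalSeconds;
    Console.WriteLine($"Retry {i + 1}: wait about {seconds:F1}s");
}
```

A function like this could be called from a Polly DelayGenerator, passing args.AttemptNumber as the attempt and whatever Retry-After value you extracted from the failed response.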
3. Monitor Token Usage and Costs
Every ChatCompletion object includes token usage information. Track it to monitor your spending:
ChatCompletion completion = await chatClient.CompleteChatAsync(messages);
Console.WriteLine($"Input tokens: {completion.Usage.InputTokenCount}");
Console.WriteLine($"Output tokens: {completion.Usage.OutputTokenCount}");
Console.WriteLine($"Total tokens: {completion.Usage.TotalTokenCount}");
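To turn those counts into a spending estimate, you could multiply them by your per-token rates. The prices below are placeholders I made up for illustration, not actual Azure OpenAI pricing; look up the real rates for your model and region. The variable names standing in for the Usage values are also hypothetical.

```csharp
using System;

// Hypothetical prices in USD per 1,000 tokens; replace with the real rates for your deployment.
const decimal InputPricePer1K = 0.005m;
const decimal OutputPricePer1K = 0.015m;

// Input and output tokens are usually priced differently, so cost them separately.
decimal EstimateCost(int inputTokens, int outputTokens) =>
    inputTokens / 1000m * InputPricePer1K + outputTokens / 1000m * OutputPricePer1K;

// Stand-ins for completion.Usage.InputTokenCount and completion.Usage.OutputTokenCount.
int inputTokenCount = 1200;
int outputTokenCount = 400;

Console.WriteLine($"Estimated cost (USD): {EstimateCost(inputTokenCount, outputTokenCount):F4}");
```

Logging this per request, tagged with the feature or user that triggered it, makes it much easier to see which part of the application drives the bill.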
4. Register ChatClient in Dependency Injection
In ASP.NET Core applications, register the client as a singleton to reuse the underlying HTTP connection:
// In Program.cs
builder.Services.AddSingleton(sp =>
{
    var client = new AzureOpenAIClient(
        new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
        new DefaultAzureCredential());
    return client.GetChatClient("gpt-4o");
});
// In your controller or service
public class ChatService(ChatClient chatClient)
{
    public async Task<string> GetResponseAsync(string userMessage)
    {
        // Declare the type explicitly: with var this would be ClientResult<ChatCompletion>,
        // which has no Content property
        ChatCompletion completion = await chatClient.CompleteChatAsync(
            [new UserChatMessage(userMessage)]);
        return completion.Content[0].Text;
    }
}
Common Pitfalls to Avoid
After building multiple production Azure OpenAI applications, these are the mistakes I see developers make most often:
- Ignoring token limits: GPT-4o has a finite context window (128K tokens). If your conversation history plus the requested response exceeds it, the API returns an error. Track token counts and truncate history proactively.
- Not setting MaxOutputTokenCount: Without a limit, the model may generate thousands of tokens for a simple question, inflating your costs unnecessarily.
- Creating a new client per request: AzureOpenAIClient manages an HTTP connection pool. Creating one per request wastes resources and can cause socket exhaustion under load.
- Skipping the system message: A well-crafted system prompt dramatically improves response quality. Always include one that defines the assistant's role, constraints, and output format.
- Using API keys in production: Managed Identity is more secure, rotates automatically, and eliminates the risk of key leakage in logs or source control.
- Not implementing content filtering: Azure OpenAI includes built-in content filters, but you should still validate inputs and outputs for your specific use case.
Azure OpenAI vs. OpenAI API — Which Should You Use?
If you are already in the Azure ecosystem, Azure OpenAI is almost always the right choice for C# developers. Here is a quick comparison:
- Azure OpenAI: Enterprise SLAs, private endpoints (VNet integration), data stays in your Azure region, Managed Identity auth, Azure Monitor integration, content filtering built-in.
- OpenAI API directly: Faster access to new models, simpler setup, no Azure subscription needed, pay-as-you-go only.
For startups and personal projects, the OpenAI API is faster to get started with. For enterprise applications where compliance, data residency, and network security matter, Azure OpenAI is the clear winner — and the C# SDK makes switching between them almost trivial since they share the same base types.
Conclusion — Start Building with Azure OpenAI and C# Today
Integrating Azure OpenAI with C# is straightforward once you understand the SDK patterns. You learned how to set up authentication, send chat completions, stream responses, maintain conversation history, and implement function calling — all the building blocks you need for production AI features.
Here are the key takeaways:
- Use the Azure.AI.OpenAI and OpenAI NuGet packages together for the best developer experience.
- Prefer DefaultAzureCredential over API keys for production deployments.
- Stream responses with CompleteChatStreamingAsync for a better user experience.
- Implement retry logic with exponential backoff to handle rate limits gracefully.
- Track token usage to keep costs predictable.
- Register ChatClient as a singleton in your DI container.
Start with the basic chat completion example, get it running, and then layer on streaming and function calling as your application grows. The Azure OpenAI SDK for .NET is well-designed and follows familiar patterns — if you know C#, you already know most of what you need.