ML.NET Tutorial: Build Your First ML Model in C#

Learn ML.NET with this step-by-step C# tutorial. Build, train, and deploy your first machine learning model in .NET with practical code examples.

If you've ever wanted to add machine learning to a .NET application without leaving C#, this ML.NET tutorial is where you start. ML.NET is Microsoft's open-source, cross-platform framework that lets C# and F# developers build, train, and deploy custom machine learning models: no Python required, no context switching, just the language and ecosystem you already know.

In this hands-on guide, you'll build a complete machine learning model in C# from scratch. We'll walk through real, runnable code that loads data, trains a binary classification model, evaluates its accuracy, and makes predictions, all using ML.NET in a standard .NET console application.

By the end, you'll understand the ML.NET pipeline architecture, know how to pick the right algorithm for your problem, and have a working model you can integrate into any .NET application.

What Is ML.NET and Why Should C# Developers Care?

ML.NET is a machine learning framework built specifically for .NET developers. Released by Microsoft, it provides a first-class way to integrate ML into your existing C# applications without relying on external services or learning an entirely new language.

Here's why ML.NET stands out for .NET teams:

No Python dependency: Train and consume models entirely in C#. Your ML code lives alongside your business logic, shares the same types, and deploys the same way.
Production-ready performance: ML.NET models run natively in .NET. No inter-process calls, no REST overhead, no serialization bottlenecks. Inference is fast and memory-efficient.
Broad algorithm support: Classification, regression, clustering, anomaly detection, recommendation, ranking, time series forecasting, and image classification are all built in.
AutoML included: Not sure which algorithm to pick? ML.NET's AutoML automatically searches across algorithms and hyperparameters to find the best model for your data.
ONNX interoperability: Import models trained in TensorFlow, PyTorch, or scikit-learn via ONNX format, then serve them through ML.NET's prediction engine.

If your application already runs on .NET, ML.NET eliminates the operational complexity of maintaining a separate Python microservice just for ML predictions.

ML.NET Tutorial: Setting Up Your Project

Let's build a sentiment analysis model: a binary classifier that predicts whether a product review is positive or negative. This is one of the most practical ML.NET examples because it demonstrates the full pipeline with a simple, understandable dataset.

Prerequisites

.NET 8 SDK or later (works with .NET 9 as well)
Any code editor (Visual Studio, VS Code, or Rider)
Basic C# knowledge

Create the Project and Install ML.NET

Open your terminal and create a new console application:

dotnet new console -n SentimentAnalysis
cd SentimentAnalysis
dotnet add package Microsoft.ML

That single NuGet package gives you the core ML.NET framework: data loading, transformations, trainers, and the prediction engine.

Step 1: Define Your Data Models

ML.NET uses strongly-typed C# classes to represent your data. This is one of its biggest advantages over dynamically-typed ML frameworks: your IDE gives you autocomplete, compile-time checking, and refactoring support on your ML data structures.

using Microsoft.ML.Data;

public class ReviewData
{
    [LoadColumn(0)]
    public string? ReviewText { get; set; }

    [LoadColumn(1), ColumnName("Label")]
    public bool Sentiment { get; set; }
}

public class SentimentPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }

    public float Probability { get; set; }

    public float Score { get; set; }
}

ReviewData maps to your training CSV. The [LoadColumn] attributes tell ML.NET which CSV column maps to which property. The [ColumnName("Label")] attribute exposes Sentiment to ML.NET under the column name Label, which marks it as the value we want the model to predict.

SentimentPrediction is the output shape. ML.NET populates Prediction (true/false), Probability (0.0 to 1.0 confidence), and Score (the raw model output before sigmoid) automatically after inference.
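To make the relationship between Score and Probability concrete, here is a minimal sketch of the logistic (sigmoid) function in plain C#. It assumes the probability is simply the logistic function of the raw score, as the description above suggests; the ScoreMath class and its Sigmoid method are hypothetical names for illustration, not ML.NET types, and a trainer's actual calibration may differ.

```csharp
using System;

// A raw score of 0 sits on the decision boundary; larger scores lean positive.
foreach (var score in new[] { -2.0f, 0.0f, 2.0f })
{
    Console.WriteLine($"Score {score,5:F1} -> Probability {ScoreMath.Sigmoid(score):F3}");
}

// Hypothetical helper for illustration only; not an ML.NET type.
public static class ScoreMath
{
    // Logistic function: squashes any real-valued score into the range (0, 1).
    public static float Sigmoid(float score) =>
        1.0f / (1.0f + MathF.Exp(-score));
}
```

Read this way, a positive Score corresponds to a Probability above 0.5 and a negative Score to one below it.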

Step 2: Prepare Your Training Data

Create a file called reviews.csv in your project directory. In a real project, you'd use thousands of labeled examples. For this C# machine learning tutorial, we'll use a small dataset to demonstrate the pipeline:

ReviewText,Sentiment
"This product is amazing and works perfectly",true
"Terrible quality, broke after one day",false
"Best purchase I've made this year",true
"Complete waste of money, do not buy",false
"Excellent build quality and fast shipping",true
"Stopped working within a week, very disappointed",false
"Love it! Exactly what I needed",true
"Poor design, cheaply made",false
"Great value for the price, highly recommend",true
"Returned it immediately, awful product",false

Set the file to copy to the output directory by adding this to your .csproj:

<ItemGroup>
  <Content Include="reviews.csv">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
  </Content>
</ItemGroup>
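If you'd rather experiment without a CSV file, the same kind of rows can be supplied from memory. The sketch below uses LoadFromEnumerable for that; the InMemoryReviews helper is a hypothetical name introduced only for this example, and the four rows are a subset of the sample data above.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 42);

// Build an IDataView from in-memory objects instead of reading reviews.csv.
IDataView dataView = mlContext.Data.LoadFromEnumerable(InMemoryReviews.Sample());

// List the columns ML.NET inferred from the class; Sentiment appears as "Label".
foreach (var column in dataView.Schema)
{
    Console.WriteLine($"{column.Name}: {column.Type}");
}

public class ReviewData
{
    [LoadColumn(0)]
    public string? ReviewText { get; set; }

    [LoadColumn(1), ColumnName("Label")]
    public bool Sentiment { get; set; }
}

// Hypothetical helper for illustration; not part of ML.NET.
public static class InMemoryReviews
{
    public static ReviewData[] Sample() => new[]
    {
        new ReviewData { ReviewText = "This product is amazing and works perfectly", Sentiment = true },
        new ReviewData { ReviewText = "Terrible quality, broke after one day", Sentiment = false },
        new ReviewData { ReviewText = "Love it! Exactly what I needed", Sentiment = true },
        new ReviewData { ReviewText = "Poor design, cheaply made", Sentiment = false },
    };
}
```

The [LoadColumn] attributes are only used when reading text files, so they are harmless here; the resulting IDataView can be passed to the same split and training calls as one loaded from disk.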

Step 3: Build the ML.NET Pipeline and Train the Model

This is where ML.NET's architecture shines. You define a pipeline of data transformations and a training algorithm, then execute it against your data. Everything is composable and strongly typed.

using Microsoft.ML;

var mlContext = new MLContext(seed: 42);

// Load data
IDataView dataView = mlContext.Data.LoadFromTextFile<ReviewData>(
    path: "reviews.csv",
    hasHeader: true,
    separatorChar: ',',
    allowQuoting: true // review text is quoted and may itself contain commas
);

// Split into training and test sets (80/20)
var splitData = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);

// Build the transformation and training pipeline
var pipeline = mlContext.Transforms.Text
    .FeaturizeText(
        outputColumnName: "Features",
        inputColumnName: nameof(ReviewData.ReviewText))
    .Append(mlContext.BinaryClassification.Trainers
        .SdcaLogisticRegression(
            labelColumnName: "Label",
            featureColumnName: "Features"));

// Train the model
Console.WriteLine("Training the model...");
ITransformer model = pipeline.Fit(splitData.TrainSet);
Console.WriteLine("Training complete.");

Let's break down what each piece does:

MLContext: The entry point for all ML.NET operations. Setting a seed makes results reproducible across runs.
LoadFromTextFile: Reads your CSV and maps each row to the ReviewData schema using the [LoadColumn] attributes.
TrainTestSplit: Randomly splits data so you can train on 80% and evaluate on the 20% the model hasn't seen. This doesn't prevent overfitting, but it lets you detect it.
FeaturizeText: Converts raw text into a numerical feature vector. With its default settings it normalizes the text, tokenizes it, and builds word and character n-gram features; stop-word removal and other n-gram weighting schemes such as TF-IDF can be enabled through its options. This single method call replaces dozens of lines of manual text preprocessing.
SdcaLogisticRegression: A fast, scalable binary classification algorithm. SDCA (Stochastic Dual Coordinate Ascent) handles large datasets efficiently and works well as a starting point.
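For cases where you want more control over the text preprocessing, FeaturizeText also accepts an options object. The following is a sketch rather than a recommended configuration: it turns on English stop-word removal, uses word unigrams and bigrams with TF-IDF weighting, and disables character n-grams. The ReviewRow and TextFeatures names are hypothetical, introduced only so the example runs on its own.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Transforms.Text;

var mlContext = new MLContext(seed: 42);

var samples = new[]
{
    new ReviewRow { Text = "This product is amazing and works perfectly" },
    new ReviewRow { Text = "Terrible quality, broke after one day" },
};
IDataView data = mlContext.Data.LoadFromEnumerable(samples);

// Fit the featurizer on the sample rows and apply it to the same rows.
var featurizer = TextFeatures.BuildFeaturizer(mlContext, nameof(ReviewRow.Text));
IDataView featurized = featurizer.Fit(data).Transform(data);

// Print the type of the generated vector column.
Console.WriteLine(featurized.Schema["Features"].Type);

// Hypothetical input type for this sketch.
public class ReviewRow
{
    public string? Text { get; set; }
}

// Hypothetical helper for illustration; not part of ML.NET.
public static class TextFeatures
{
    public static IEstimator<ITransformer> BuildFeaturizer(MLContext mlContext, string inputColumn)
    {
        var options = new TextFeaturizingEstimator.Options
        {
            // Lower-case everything before tokenizing.
            CaseMode = TextNormalizingEstimator.CaseMode.Lower,
            // Use the built-in English stop-word list.
            StopWordsRemoverOptions = new StopWordsRemovingEstimator.Options
            {
                Language = TextFeaturizingEstimator.Language.English
            },
            // Word unigrams and bigrams, weighted by TF-IDF.
            WordFeatureExtractor = new WordBagEstimator.Options
            {
                NgramLength = 2,
                UseAllLengths = true,
                Weighting = NgramExtractingEstimator.WeightingCriteria.TfIdf
            },
            // Null turns off character n-grams for this sketch.
            CharFeatureExtractor = null
        };

        return mlContext.Transforms.Text.FeaturizeText("Features", options, inputColumn);
    }
}
```

Whether these settings beat the defaults depends on your data, so compare the evaluation metrics before and after changing them.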

Step 4: Evaluate Model Accuracy

Never deploy a model without measuring its performance. ML.NET provides built-in evaluation metrics for every task type:

// Evaluate on the test set
var predictions = model.Transform(splitData.TestSet);
var metrics = mlContext.BinaryClassification.Evaluate(predictions, "Label");

Console.WriteLine($"Accuracy:    {metrics.Accuracy:P2}");
Console.WriteLine($"AUC:         {metrics.AreaUnderRocCurve:P2}");
Console.WriteLine($"F1 Score:    {metrics.F1Score:P2}");
Console.WriteLine($"Precision:   {metrics.PositivePrecision:P2}");
Console.WriteLine($"Recall:      {metrics.PositiveRecall:P2}");

Understanding these metrics matters:

Accuracy: Percentage of correct predictions overall. Misleading when classes are imbalanced (e.g., 95% positive reviews).
AUC (Area Under ROC Curve): Measures how well the model separates classes regardless of threshold. Closer to 1.0 is better. This is often more reliable than accuracy.
F1 Score: The harmonic mean of precision and recall. Use this when you care about both false positives and false negatives equally.
Precision: Of all predictions labeled positive, how many were actually positive? High precision means few false positives.
Recall: Of all actually positive samples, how many did the model find? High recall means few false negatives.
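The definitions above can be worked through by hand from the four confusion-matrix counts. This sketch does that in plain C# with made-up counts for a test set of 100 reviews; the ConfusionCounts type is hypothetical and unrelated to ML.NET's own metrics classes.

```csharp
using System;

// Made-up counts for illustration: 100 test reviews in total.
var counts = new ConfusionCounts(
    TruePositives: 40, FalsePositives: 10, TrueNegatives: 45, FalseNegatives: 5);

Console.WriteLine($"Accuracy:  {counts.Accuracy:P1}");
Console.WriteLine($"Precision: {counts.Precision:P1}");
Console.WriteLine($"Recall:    {counts.Recall:P1}");
Console.WriteLine($"F1 Score:  {counts.F1:P1}");

// Hypothetical helper type; ML.NET computes these values for you in Evaluate.
public readonly record struct ConfusionCounts(
    int TruePositives, int FalsePositives, int TrueNegatives, int FalseNegatives)
{
    private int Total => TruePositives + FalsePositives + TrueNegatives + FalseNegatives;

    // Share of all predictions that were correct.
    public double Accuracy => (double)(TruePositives + TrueNegatives) / Total;

    // Of everything predicted positive, the share that really was positive.
    public double Precision => (double)TruePositives / (TruePositives + FalsePositives);

    // Of everything really positive, the share the model found.
    public double Recall => (double)TruePositives / (TruePositives + FalseNegatives);

    // Harmonic mean of precision and recall.
    public double F1 => 2 * Precision * Recall / (Precision + Recall);
}
```

Note that a denominator of zero (for example, no positive predictions at all) yields NaN here, which is one more reason to check that your test set contains both classes.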

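For a production sentiment classifier, you'd typically want AUC above 0.85 and F1 above 0.80. With our ten-row dataset, the test set holds only about two rows, so the metrics are not meaningful, and evaluation can even fail if both test rows share the same label, because AUC is undefined for a single class. The point here is understanding the pipeline.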

Step 5: Make Predictions with Your Trained Model

Now let's use the model to classify new reviews it has never seen:

// Create a prediction engine for single predictions
var predictionEngine = mlContext.Model
    .CreatePredictionEngine<ReviewData, SentimentPrediction>(model);

// Predict on new data
var sampleReviews = new[]
{
    new ReviewData { ReviewText = "Absolutely love this product, works great!" },
    new ReviewData { ReviewText = "Broke on the first use, total junk" },
    new ReviewData { ReviewText = "Decent product for the price" }
};

foreach (var review in sampleReviews)
{
    var prediction = predictionEngine.Predict(review);
    var sentiment = prediction.Prediction ? "Positive" : "Negative";
    Console.WriteLine($"Review: {review.ReviewText}");
    Console.WriteLine($"  Sentiment: {sentiment} ({prediction.Probability:P1} confidence)");
    Console.WriteLine();
}

The PredictionEngine is optimized for single predictions, which makes it a good fit for real-time scenarios like API endpoints or user input validation. For batch predictions on large datasets, use model.Transform(dataView) instead, which is more efficient for bulk operations.
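As a rough sketch of the batch route, the example below scores several reviews in one model.Transform call and reads the results back as typed objects with CreateEnumerable. The BatchScoring helper and its four-row training set are hypothetical and exist only so the example runs on its own; in the tutorial you would pass in the model trained in Step 3.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 42);
ITransformer model = BatchScoring.TrainTinyModel(mlContext);

var newReviews = new[]
{
    new ReviewData { ReviewText = "Absolutely love this product, works great!" },
    new ReviewData { ReviewText = "Broke on the first use, total junk" },
};

// Score every review in one pass instead of calling Predict per item.
List<SentimentPrediction> results = BatchScoring.ScoreAll(mlContext, model, newReviews);
for (int i = 0; i < results.Count; i++)
{
    Console.WriteLine(
        $"{newReviews[i].ReviewText} -> {results[i].Prediction} ({results[i].Probability:P1})");
}

public class ReviewData
{
    public string? ReviewText { get; set; }

    [ColumnName("Label")]
    public bool Sentiment { get; set; }
}

public class SentimentPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}

// Hypothetical helper for illustration; not part of ML.NET.
public static class BatchScoring
{
    // Trains a throwaway model on four rows so the sketch is self-contained.
    public static ITransformer TrainTinyModel(MLContext mlContext)
    {
        var rows = new[]
        {
            new ReviewData { ReviewText = "This product is amazing and works perfectly", Sentiment = true },
            new ReviewData { ReviewText = "Terrible quality, broke after one day", Sentiment = false },
            new ReviewData { ReviewText = "Love it! Exactly what I needed", Sentiment = true },
            new ReviewData { ReviewText = "Poor design, cheaply made", Sentiment = false },
        };
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(rows);
        var pipeline = mlContext.Transforms.Text
            .FeaturizeText("Features", nameof(ReviewData.ReviewText))
            .Append(mlContext.BinaryClassification.Trainers
                .SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features"));
        return pipeline.Fit(trainingData);
    }

    // Runs the whole batch through the model and materializes typed results.
    public static List<SentimentPrediction> ScoreAll(
        MLContext mlContext, ITransformer model, IEnumerable<ReviewData> reviews)
    {
        IDataView batch = mlContext.Data.LoadFromEnumerable(reviews);
        IDataView scored = model.Transform(batch);
        return mlContext.Data
            .CreateEnumerable<SentimentPrediction>(scored, reuseRowObject: false)
            .ToList();
    }
}
```

With a model trained on four rows the predictions themselves mean little; the point is the shape of the batch call.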

Step 6: Save and Load the Model

A trained model is useless if you have to retrain it every time your application starts. ML.NET lets you serialize models to disk and load them in any .NET application:

// Save the trained model
string modelPath = "SentimentModel.zip";
mlContext.Model.Save(model, dataView.Schema, modelPath);
Console.WriteLine($"Model saved to {modelPath}");

// Load the model in another application or service
MLContext loadedContext = new MLContext();
ITransformer loadedModel = loadedContext.Model.Load(modelPath, out var schema);

// Create a new prediction engine from the loaded model
var loadedEngine = loadedContext.Model
    .CreatePredictionEngine<ReviewData, SentimentPrediction>(loadedModel);

var result = loadedEngine.Predict(
    new ReviewData { ReviewText = "This is fantastic!" });
Console.WriteLine($"Loaded model prediction: {result.Prediction} ({result.Probability:P1})");

The saved .zip file contains the entire pipeline: transformations and trained model weights. You can deploy this file alongside your application, load it at startup, and run predictions without any training infrastructure.

ML.NET Best Practices for Production

Getting a model working is one thing. Getting it working reliably in production is another. Here are the practices that matter:

Use PredictionEnginePool for Web Applications

PredictionEngine is not thread-safe. In ASP.NET Core applications, use PredictionEnginePool from the Microsoft.Extensions.ML package. It manages a pool of engines and handles concurrent requests safely:

// In Program.cs (requires the Microsoft.Extensions.ML NuGet package)
using Microsoft.Extensions.ML;

builder.Services.AddPredictionEnginePool<ReviewData, SentimentPrediction>()
    .FromFile(modelName: "SentimentModel", filePath: "SentimentModel.zip");

// In your controller or endpoint
app.MapPost("/predict", (
    PredictionEnginePool<ReviewData, SentimentPrediction> pool,
    ReviewData input) =>
{
    var prediction = pool.Predict(modelName: "SentimentModel", input);
    return Results.Ok(new { prediction.Prediction, prediction.Probability });
});

Let AutoML Choose the Best Algorithm

If you're unsure whether SdcaLogisticRegression is the best trainer for your data, use AutoML to search automatically:

// Install: dotnet add package Microsoft.ML.AutoML
using Microsoft.ML.AutoML;

var experiment = mlContext.Auto()
    .CreateBinaryClassificationExperiment(maxExperimentTimeInSeconds: 60)
    .Execute(splitData.TrainSet, labelColumnName: "Label");

Console.WriteLine($"Best trainer: {experiment.BestRun.TrainerName}");
Console.WriteLine($"Best accuracy: {experiment.BestRun.ValidationMetrics.Accuracy:P2}");

// Use the best model directly
ITransformer bestModel = experiment.BestRun.Model;

Common Pitfalls to Avoid

Training on too little data: Our example uses 10 rows for demonstration. Real models typically need at least hundreds to thousands of labeled examples, and model quality depends on both the quality and the quantity of that data.
Not shuffling data: If all positive examples come first and negative examples come second, the train/test split won't be representative. TrainTestSplit shuffles by default, but verify your data isn't sorted by label.
Ignoring class imbalance: If 90% of your data is positive, the model can learn to always predict positive and still get 90% accuracy. Use F1 and AUC metrics instead, and consider techniques like oversampling the minority class.
Evaluating on training data: Always evaluate on held-out test data. A model that memorizes training data will look perfect but fail on new inputs.
Using PredictionEngine in multi-threaded code: It's not thread-safe. Use PredictionEnginePool in web applications, or create one engine per thread.
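A quick way to catch the class-imbalance pitfall before training is to count the labels. This is a minimal sketch in plain C#; the LabelBalance helper is a hypothetical name, and the labels array stands in for the Sentiment values of your own dataset.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in labels; in practice, take these from your loaded training rows.
var labels = new[] { true, true, true, true, true, true, true, true, true, false };

var (positive, negative, positiveShare) = LabelBalance.Summarize(labels);
Console.WriteLine($"Positive: {positive}, Negative: {negative}, Positive share: {positiveShare:P0}");

// Hypothetical helper for illustration; not part of ML.NET.
public static class LabelBalance
{
    // Counts each class and reports the fraction of positive labels.
    public static (int Positive, int Negative, double PositiveShare) Summarize(
        IEnumerable<bool> labels)
    {
        var list = labels.ToList();
        int positive = list.Count(label => label);
        int negative = list.Count - positive;
        double share = list.Count == 0 ? 0.0 : (double)positive / list.Count;
        return (positive, negative, share);
    }
}
```

A share far from 50% is a signal to lean on F1 and AUC rather than accuracy when you evaluate.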

Beyond Binary Classification: What Else Can ML.NET Do?

Sentiment analysis is just the starting point. ML.NET supports a wide range of machine learning tasks you can build into your .NET applications:

Regression: Predict continuous values like price, temperature, or delivery time.
Multi-class classification: Categorize items into three or more groups (e.g., support ticket routing).
Recommendation: Build "users who bought X also bought Y" engines using matrix factorization.
Anomaly detection: Flag unusual transactions, server metrics, or sensor readings in real time.
Image classification: Classify images using transfer learning with pre-trained deep learning models.
Object detection: Locate and identify objects within images.
Time series forecasting: Predict future values based on historical patterns (sales, traffic, inventory).

Each task follows the same pipeline pattern: load data, transform features, train, evaluate, predict. Once you understand the pattern from this tutorial, adapting it to other problem types is straightforward.

Conclusion: Getting Started with ML.NET

This ML.NET tutorial walked you through the complete lifecycle of building a machine learning model in C#, from project setup through data loading, pipeline construction, training, evaluation, and deployment. The key takeaways:

ML.NET lets you build and deploy ML models entirely in C# with no Python dependency.
The pipeline architecture (load → transform → train → evaluate → predict) is consistent across all ML task types.
Always evaluate on held-out test data and use metrics appropriate for your problem (AUC and F1 over raw accuracy).
Use PredictionEnginePool for thread-safe predictions in web applications.
Start with a simple model, measure its performance, then iterate. AutoML can help you find better algorithms automatically.

The complete source code from this tutorial runs as-is in any .NET 8+ console application. Install the Microsoft.ML NuGet package, paste the code, add your training data, and you have a working ML model running natively in .NET: no external services, no API costs, no language switching.

