Build a RAG Application with DeepSeek R1, OLLAMA, and Semantic Kernel

Learn step-by-step how to build a RAG application using DeepSeek R1, OLLAMA, and Microsoft Semantic Kernel. In this tutorial, we create an AI-powered expense manager that answers natural-language questions from your own documents.

Are you curious about how to combine your own data with AI so it can answer questions instantly? In this guide, I’ll show you how to build a simple RAG (Retrieval-Augmented Generation) application using DeepSeek R1, OLLAMA, and Microsoft Semantic Kernel.

We’ll create a basic Expense Manager scenario where you can ask questions like:

  • “How much did I spend on coffee?”
  • “What is my total expense amount?”
  • “Which day did I spend the most?”

With RAG, your AI model will not just generate answers — it will pull information from your documents and give accurate, context-aware responses. Let’s dive in 🚀.


Step 1: Prerequisites

Before we start, make sure you have:

  • Ollama running on your machine.
  • DeepSeek R1 model installed (e.g., 1.5B parameters).
  • Semantic Kernel packages installed in your .NET project.

Verify your model with:

ollama list

Confirm that deepseek-r1 appears in the list and that Ollama is serving on its default port, 11434.
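If the model is not listed yet, you can pull it first. A quick sketch, assuming the standard Ollama CLI and the default local endpoint:

```shell
# Download the 1.5B-parameter DeepSeek R1 model (one-time)
ollama pull deepseek-r1:1.5b

# List installed models; deepseek-r1:1.5b should appear
ollama list

# Optional: confirm the HTTP API is reachable on the default port
curl http://localhost:11434/api/tags
```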


Step 2: Create a Console Project

We’ll use a .NET Console Application for this demo.

Install the required NuGet packages:

dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.KernelMemory.Core
dotnet add package Microsoft.KernelMemory

These packages let you embed documents, manage memory, and query with Semantic Kernel.
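After the installs, your project file will contain entries along these lines. The versions are placeholders here — keep whatever `dotnet add package` resolved for you:

```xml
<ItemGroup>
  <!-- Versions are placeholders; use the ones dotnet restore resolved -->
  <PackageReference Include="Microsoft.SemanticKernel" Version="..." />
  <PackageReference Include="Microsoft.KernelMemory.Core" Version="..." />
  <PackageReference Include="Microsoft.KernelMemory" Version="..." />
</ItemGroup>
```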


Step 3: Configure Models

In your program, set up the configuration for text generation and embeddings:

var config = new Config
{
    Endpoint = "http://localhost:11434",
    Model = "deepseek-r1:1.5b",
    MaxTokens = 131072
};

var embeddingConfig = new Config
{
    Endpoint = "http://localhost:11434",
    Model = "deepseek-r1:1.5b",
    MaxTokens = 2048
};

This ensures both text generation and embeddings are handled by DeepSeek R1.


Step 4: Create Memory and Add Documents

We’ll embed a document containing our expense data into Semantic Kernel memory.

Example snippet:

var memory = new MemoryBuilder()
    .WithLlamaTextGeneration(config)
    .WithLlamaTextEmbeddingGeneration(embeddingConfig)
    .WithMemoryServerless()
    .Build();

await memory.AddDocumentAsync("expenses", "budget.txt");
Console.WriteLine("Document loaded. Model is ready to take questions!");

Your budget.txt file might look like this:

Date       Category     Amount
2025-02-10 Coffee       3.75
2025-02-11 Coffee       4.50
2025-02-14 Rent         800
2025-02-16 Groceries    40
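If you'd rather generate the sample file from code so the demo is reproducible, a minimal sketch:

```csharp
using System.IO;

// Write the sample expense data used in this demo to budget.txt
var lines = new[]
{
    "Date       Category     Amount",
    "2025-02-10 Coffee       3.75",
    "2025-02-11 Coffee       4.50",
    "2025-02-14 Rent         800",
    "2025-02-16 Groceries    40",
};
File.WriteAllLines("budget.txt", lines);
```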

Step 5: Ask Questions

Now we can query our memory interactively:

while (true)
{
    Console.Write("Ask a question: ");
    var question = Console.ReadLine();

    // Skip empty input so we never send a null/blank query
    if (string.IsNullOrWhiteSpace(question)) continue;

    var answer = await memory.QueryAsync("expenses", question);

    Console.WriteLine("Answer: " + answer.Result);

    foreach (var source in answer.RelevantSources)
    {
        Console.WriteLine($"Source: {source.SourceName}");
    }
}

Example queries:

  • Q: How much did I spend on coffee?
    A: You spent a total of 8.25 on coffee.

  • Q: Which day did I spend the most?
    A: February 14, 2025 — Rent $800.

  • Q: Which day did I spend the least?
    A: February 16, 2025 — Groceries $40.
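Small models sometimes get arithmetic wrong, so it is worth computing the ground truth directly from budget.txt and comparing it with the model's answers. A minimal sketch, assuming the whitespace-separated format shown above:

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Parse budget.txt (skipping the header row) and aggregate with LINQ
var rows = File.ReadLines("budget.txt")
    .Skip(1)
    .Select(l => l.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    .Select(p => (Date: p[0], Category: p[1],
                  Amount: decimal.Parse(p[2], CultureInfo.InvariantCulture)))
    .ToList();

var coffeeTotal = rows.Where(r => r.Category == "Coffee").Sum(r => r.Amount);
var total = rows.Sum(r => r.Amount);
var maxDay = rows.OrderByDescending(r => r.Amount).First();

Console.WriteLine($"Coffee total: {coffeeTotal}");   // 8.25
Console.WriteLine($"Grand total:  {total}");         // 848.25
Console.WriteLine($"Biggest day:  {maxDay.Date} ({maxDay.Category} {maxDay.Amount})");
```

For the sample data, coffee sums to 8.25 and the biggest single expense falls on 2025-02-14 (Rent), which matches the model's answers above.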


Step 6: Add More Documents (PDFs, etc.)

You can also embed PDFs or multiple files:

await memory.AddDocumentAsync("docs", "micropython-doc.pdf");

Now your model can answer from both your expense file and MicroPython docs, and even show the exact source file in its response.


Conclusion

By combining DeepSeek R1, Ollama, and Semantic Kernel, we built a working RAG application that can:

  • Ingest personal or project documents.
  • Answer natural-language questions with context.
  • Cite sources for reliable results.

This workflow is powerful because you’re not just chatting with an LLM — you’re giving it your own knowledge base.

💡 Want the complete project code? I’ve uploaded it (along with the text and PDF files) to my Buy Me a Coffee page.

