# Privacy-Preserving Fine-tuning

Imagine you’re working on a highly confidential project and need to collaborate with others without revealing your sensitive information. That’s essentially what privacy-preserving fine-tuning does for AI models. It’s like a locked box that authorized parties can contribute to without ever seeing what’s inside.

At the heart of this process is a **blind training protocol**, which ensures that even the entities providing computational resources (such as GPU providers) can’t access or infer the training data. This is achieved through **secure enclaves**: hardware-isolated execution environments built on established technologies like AMD’s Secure Encrypted Virtualization [(SEV)](https://www.amd.com/en/developer/sev.html) and Intel’s Software Guard Extensions [(SGX)](https://www.intel.com/content/www/us/en/download/683952/intel-software-guard-extensions-intel-sgx-driver-and-data-center-attestation-primitives-intel-sgx-dcap.html). These enclaves act as a fortress, keeping both the model architecture and the training data encrypted to everything outside the enclave while still allowing the GPUs to perform their computations.

<figure><img src="/files/WS6blBE4k22c3NShTZOs" alt=""><figcaption><p>GPU Provisioning for anonymous SLM training job</p></figcaption></figure>
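One common way to make sure data only ever reaches a genuine enclave is to release the data decryption key only after the enclave reports the expected code measurement. The sketch below illustrates that gating step and nothing more. It is a simplified assumption, not this network’s actual protocol: the names `measure` and `release_data_key` are hypothetical, and the hardware-signed attestation report that SEV or SGX would supply is replaced by a plain SHA-256 hash.

```python
import hashlib
import hmac
import secrets


def measure(enclave_image: bytes) -> str:
    """Hypothetical stand-in for an enclave measurement: a hash of the code image."""
    return hashlib.sha256(enclave_image).hexdigest()


def release_data_key(reported: str, expected: str, data_key: bytes):
    """Hand over the data key only if the reported measurement matches.

    A real deployment would first verify the hardware vendor's signature
    on the attestation report; that step is omitted in this sketch.
    """
    if hmac.compare_digest(reported, expected):  # constant-time comparison
        return data_key
    return None


# The data owner audits a training image and records its measurement.
expected_measurement = measure(b"audited-trainer-image")
data_key = secrets.token_bytes(32)

# A matching enclave receives the key; a modified image does not.
granted = release_data_key(measure(b"audited-trainer-image"), expected_measurement, data_key)
denied = release_data_key(measure(b"modified-trainer-image"), expected_measurement, data_key)
```

In this sketch the GPU provider never holds `data_key` itself; only code whose measurement matches the audited image does.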

The protocol takes this a step further by integrating additional privacy-preserving techniques. For instance, it uses **threshold Paillier homomorphic encryption** to securely aggregate model parameters: encrypted parameter updates can be summed without being decrypted, and decrypting the result requires a threshold number of key holders to cooperate. It also uses **Intel’s Trust Domain Extensions** [**(TDX)**](https://www.intel.com/content/www/us/en/developer/tools/trust-domain-extensions/overview.html) for cross-platform compatibility. Together, these keep the fine-tuning process confidential and tamper-resistant, even across different hardware environments.
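To make the aggregation idea concrete, here is a minimal sketch of Paillier’s additive homomorphism applied to model parameters. It rests on several assumptions that are not from the protocol itself: it uses a single private key rather than the threshold variant (which would split decryption across several key holders), toy-sized fixed primes, and a hypothetical fixed-point encoding (`SCALE`) for float parameters. All function names are illustrative.

```python
import math
import secrets

# Toy parameters for illustration only: two known Mersenne primes.
# A real deployment would use large, randomly generated primes.
P = 2**31 - 1
Q = 2**61 - 1
SCALE = 10**6  # assumed fixed-point scale for turning floats into integers


def keygen(p, q):
    """Return (n, private_key) for Paillier with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (reduced mod n) under public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext integer in [0, n)."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def aggregate(n, ciphertexts):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    total = 1
    for c in ciphertexts:
        total = (total * c) % (n * n)
    return total


def encode(x):
    """Float parameter -> scaled integer."""
    return round(x * SCALE)


def decode(v, n):
    """Scaled integer mod n -> float, mapping the upper half to negatives."""
    signed = v - n if v > n // 2 else v
    return signed / SCALE


n, priv = keygen(P, Q)

# Three participants each hold a two-parameter update.
client_updates = [[0.25, -0.5], [0.5, 0.125], [-0.125, 0.25]]
encrypted = [[encrypt(n, encode(w)) for w in update] for update in client_updates]

# The aggregator sums each parameter position without seeing any plaintext.
summed = [aggregate(n, column) for column in zip(*encrypted)]
result = [decode(decrypt(n, priv, c), n) for c in summed]
print(result)
```

The aggregator only ever handles ciphertexts. In a threshold setup, the final `decrypt` step would be replaced by combining partial decryptions from enough key holders, so no single party could decrypt an individual update on its own.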

To verify the integrity and quality of the training process, the network employs **zero-knowledge proof systems**, inspired in particular by the Mina Protocol’s recursive zk-SNARKs. These proofs allow the network to confirm that the training was done correctly without revealing any details about the data or the model. It’s like getting a verification stamp on your confidential project without anyone seeing what’s inside.

This approach not only safeguards sensitive information but also helps make the transition from centralized large language models (LLMs) to personalized small language models (SLMs) seamless, efficient, and secure. It also adds extra guardrails for agents, so that an agent trained for Twitter won’t go AWOL on your wallet, and vice versa. The result is strong privacy without giving up high-performance AI capabilities.

