Project Shards

Project Shards is a vision for building personal and community AI from the bottom up rather than relying entirely on large centralised artificial intelligence systems owned and controlled by corporations. The core idea is simple but powerful. Instead of sending private data, thoughts, documents and experiences to distant data centres controlled by unknown actors, individuals and communities build small local language models that run on their own machines. These models become trusted intermediaries between people and the wider AI ecosystem.

In this model a person has their own personal shard, a small AI trained partly on general world knowledge and partly on their own chosen corpus of information. Families or small communities may also build a community shard, a shared intelligence trained on their collective knowledge, values and documents. These shards can communicate with each other and also interact with larger frontier AI systems when needed. The important difference is that the local shard becomes a protective and interpretive layer. It decides what information leaves the local environment, how requests are phrased, and how answers are interpreted before they return to the user.
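
The protective role described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not a Project Shards implementation: the `PRIVATE_PATTERNS` table, the `redact` helper and the `send_to_frontier` stub are hypothetical names, and a real shard would likely use a local model rather than regular expressions to decide what is sensitive.

```python
import re

# Hypothetical patterns a user might mark as private (illustrative only).
PRIVATE_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact(text):
    """Replace private spans with placeholders; return the text and a restore map."""
    restored = {}
    for label, pattern in PRIVATE_PATTERNS.items():
        # dict.fromkeys removes duplicate matches while keeping their order.
        for i, match in enumerate(dict.fromkeys(pattern.findall(text))):
            placeholder = f"[{label}_{i}]"
            restored[placeholder] = match
            text = text.replace(match, placeholder)
    return text, restored

def send_to_frontier(prompt):
    """Stand-in for an external model call (assumption: no real API is used)."""
    return f"External answer to: {prompt}"

def ask_via_shard(question):
    """Redact locally, consult the external model, then restore details locally."""
    safe_prompt, restored = redact(question)
    answer = send_to_frontier(safe_prompt)
    for placeholder, original in restored.items():
        answer = answer.replace(placeholder, original)
    return safe_prompt, answer

safe_prompt, answer = ask_via_shard("Draft a reply to alice@example.org about Friday")
print(safe_prompt)
```

Only `safe_prompt` ever leaves the machine in this sketch; the private details are put back after the answer returns.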

This creates a decentralised AI ecosystem where intelligence grows from individuals and communities outward rather than flowing down from large corporate systems. In practical terms it means a person can run an AI that understands their own thinking style, preferred references, ethical framework and working habits while still being able to consult powerful external models when needed. The result is not isolation from the wider AI world but a new structure of trust where personal and community autonomy remains intact.

Understanding large language models

To understand Project Shards it helps to begin with the structure of modern large language models, commonly called LLMs.

An LLM is a neural network trained to predict the next word or token in a sequence of text. This seemingly simple prediction task allows the model to learn patterns of language, reasoning and knowledge from extremely large datasets. The training data used to build the model is known as its corpus.

A corpus is simply the collection of documents used during training. For example, a foundation model might be trained on a mixture of sources such as Wikipedia, books, academic papers, code repositories and publicly available internet text. The model learns statistical patterns across this enormous body of text and encodes them into numerical parameters.

LLMs typically pass through several stages of development. A base foundation model is the raw pretrained model. It has learned language patterns and general world knowledge but is not yet optimised for conversation. An assistant or chat model is a base model that has been further trained so that it behaves helpfully in dialogue with users. A reasoning model is a model that has undergone additional training to produce stronger logical and analytical responses, often through step by step reasoning processes.

These training stages allow the model to become progressively more capable and aligned with human tasks.

Parameters, capacity and representation

The size of a language model is usually described by the number of parameters it contains. Parameters are the adjustable numerical weights inside the neural network. They determine how the model processes information and makes predictions.

Parameters can be thought of as the capacity of the model’s memory space. They are not knowledge themselves but rather the structure that holds patterns learned from data.

For example, a model with seven billion parameters and a model with seventy billion parameters may both be trained on the same corpus of Wikipedia and books. Both are exposed to the same facts, but the larger model can typically retain more of them, represent them with greater precision and separate concepts more clearly.

This affects the model’s ability to distinguish meanings. For instance the word Jaguar might refer either to a large cat or to a car brand. Larger models often keep these meanings more clearly separated because they have more representational capacity.

Another architectural concept relevant to Project Shards is the Mixture of Experts (MoE) approach. In an MoE model parts of the network, usually the feed-forward layers, are divided into specialised components called experts. When processing a token only a subset of these experts is activated. This means a model might have hundreds of billions of total parameters but use only a small portion for any given token. The result is a lower compute cost per token with little loss of capability, although all of the parameters must still be held in memory.
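
The routing idea can be shown with a toy example. This is a minimal sketch of one common variant (pick the top k experts, then renormalise their weights), not the routing used by any particular model. The four "experts" are stand-in functions and the router scores are made-up numbers; in a real model both are learned neural network components.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k=2):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Four toy experts; a real expert is a feed-forward sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
# Hypothetical router scores for one token.
scores = [0.1, 2.0, -1.0, 2.0]

active = route_top_k(scores, k=2)
output = moe_layer(3.0, experts, scores, k=2)
print(sorted(active), output)
```

Here only two of the four experts run for the token, which is where the saving in computation comes from.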

These ideas matter because they show that smaller models can still be powerful when used intelligently.

Markov chains and how language models work

At a fundamental level, language models generate text through a probabilistic process that resembles a Markov chain. A Markov chain is a mathematical system where the next state depends only on the current state rather than the entire history.

In the context of language modelling, the system predicts the next token based on the tokens that precede it. Unlike a simple Markov chain, which looks only at the most recent word or two, a modern LLM conditions on its whole context window, which may hold thousands of tokens. Over time, the neural network learns patterns in sequences of words and sentences. When prompted with text it continues the sequence in a statistically plausible way.

Although the mathematics behind neural networks is much more complex than simple Markov chains, the conceptual intuition remains helpful. The model predicts the next step in a sequence based on learned probabilities. This predictive ability is what allows language models to generate essays, answer questions, summarise documents and write computer code.
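
To make the intuition concrete, here is a minimal sketch of a word-level Markov chain in which the "current state" is just the previous word. It illustrates the analogy only; it is not how an LLM is implemented, and the tiny corpus is invented for the example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which word follows which word in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_probs(counts, word):
    """Convert follower counts for one word into probabilities."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

def generate(counts, start, length, seed=0):
    """Sample a continuation one word at a time from the learned probabilities."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this word
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")  # "cat" follows "the" twice, "mat" once
print(probs)
print(generate(model, "the", 6))
```

An LLM replaces the lookup table with a neural network and the single previous word with a long context, but the generation loop has the same shape: compute probabilities, sample a token, repeat.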

The DeepSeek breakthrough

Traditional language model development requires enormous computational resources. Training a frontier model often involves trillions of tokens and thousands of specialised processors running for months.

DeepSeek demonstrated an important shift in how reasoning models can be built. Rather than training every model from scratch on massive datasets, its pipeline reuses the intelligence of a large teacher model.

The process works roughly as follows. First, a large base model is trained on a massive corpus. This step still requires significant resources and usually occurs within large research labs. Second, reinforcement learning techniques are used to improve the reasoning ability of this base model. The resulting system becomes a powerful teacher model capable of generating step by step reasoning explanations. Third, the teacher generates thousands or hundreds of thousands of reasoning examples. These examples contain questions, answers and reasoning traces showing how the problem is solved. Fourth, smaller student models are trained on this dataset using supervised learning.

The third and fourth steps, in which a student learns from a teacher's outputs, are known as knowledge distillation.

The remarkable result is that relatively small models can inherit reasoning behaviour from much larger systems. In some cases the smaller student models perform surprisingly well on reasoning tasks even though they contain far fewer parameters. This approach dramatically lowers the cost of building specialised models.

Knowledge distillation and the teacher–student model

Knowledge distillation is central to the Project Shards concept. In this approach a large model acts as a teacher and produces high quality training examples. A smaller model acts as the student and learns from these examples.

The teacher might generate data such as:

questions about a topic
detailed answers
step by step reasoning traces
analogies and explanations
counterexamples and edge cases

The student model is then trained on this synthetic dataset. The student does not merely memorise information. Instead it learns the patterns of reasoning demonstrated by the teacher.

This technique is extremely powerful because it allows complex reasoning behaviours to be transferred into smaller, more efficient models. For example a large model with thirty billion parameters might teach a student model with three billion parameters how to reason about mathematical problems or ethical dilemmas. In certain specialised tasks the student may even outperform the teacher because it has been optimised specifically for that domain.
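
The data side of this process can be sketched as follows. The example assumes a `teacher_generate` function that stands in for a call to a large teacher model; it is a hypothetical stub that returns canned text, and the prompt/completion record layout is one plausible choice rather than a required format.

```python
import json

def teacher_generate(passage):
    """Stand-in for a large teacher model (hypothetical; no real model is called)."""
    return {
        "question": f"What is the main point of: '{passage}'?",
        "reasoning": "Step 1: identify the claim. Step 2: restate it briefly.",
        "answer": passage.split(".")[0] + ".",
    }

def build_distillation_set(passages):
    """Turn teacher outputs into prompt/completion pairs for supervised training."""
    records = []
    for passage in passages:
        sample = teacher_generate(passage)
        records.append({
            "prompt": sample["question"],
            # The student learns to reproduce the reasoning trace and the answer.
            "completion": sample["reasoning"] + "\nAnswer: " + sample["answer"],
        })
    return records

def to_jsonl(records):
    """Serialise one record per line, a common layout for training files."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

passages = ["Shards run locally. They filter what leaves the machine."]
dataset = build_distillation_set(passages)
print(to_jsonl(dataset))
```

Keeping the reasoning trace inside the completion is what lets the student imitate how the teacher reaches an answer and not only the answer itself.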

Personal corpus and the “you-shaped” model

A central idea in Project Shards is the use of a personal corpus. A personal corpus might include:

personal notes
research documents
favourite books
journals or essays
ethical frameworks
community guidelines

When a model is fine tuned on this corpus it becomes “you-shaped”. It learns your vocabulary, preferred references and intellectual interests.

For example if your corpus includes books on mythology, essays on ethics and personal reflections on community building, the model will gradually learn to interpret questions through those lenses. This allows the model to become a deeply personalised assistant rather than a generic chatbot.

The MyShard model workflow

The Project Shards workflow typically involves three main models.

1. Base student model. A small open model, for example one with 1B–7B parameters, is chosen as the base. This model already has language ability and general knowledge.

2. Teacher model. A larger model, such as Qwen 32B or one of the DeepSeek reasoning models, acts as the teacher. It generates reasoning examples about the user’s corpus.

3. Final personal model — MyShard. The final model is trained on both the personal corpus and the teacher generated reasoning dataset. This model becomes the user’s personal AI assistant.

The process looks roughly like this. First the base model is fine tuned on the personal corpus. Then the teacher model generates questions and reasoning examples about that corpus. Next the student model is trained again on this synthetic dataset. Finally the model is quantised so that it runs efficiently on local hardware.
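
The final quantisation step can be illustrated with a toy example. The sketch below uses simple symmetric 8-bit rounding on a handful of made-up weights; tools for local models generally use more elaborate block-wise schemes, often at 4 bits, so treat this as an illustration of the principle only.

```python
def quantise_int8(weights):
    """Symmetric absmax quantisation: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale

def dequantise(q_weights, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in q_weights]

# Hypothetical weights; a real model has billions of them.
weights = [0.12, -0.5, 0.33, 0.0, 0.49]

q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Each weight is stored as a small integer plus one shared scale factor, which trades a small rounding error for a large reduction in memory.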

The resulting system becomes MyShard, a personal AI assistant that understands both general knowledge and the user’s own intellectual environment.

Hardware requirements

One of the most exciting aspects of Project Shards is that it does not require datacentre infrastructure. A reasonably powerful workstation is sufficient.

A typical training configuration might include:

a high end GPU such as an RTX 4090
64–128 GB of system memory
1–2 TB of NVMe storage

With parameter-efficient techniques such as LoRA or QLoRA, fine tuning can occur on a single machine. Once trained, the model can be quantised to reduce memory usage. Quantised small models often require only a few gigabytes of storage and can run on laptops or compact servers. This makes personal AI deployment practical for individuals and small groups.
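
Some back-of-the-envelope arithmetic shows why these techniques matter. The sketch below estimates weight storage at different precisions and counts the trainable parameters LoRA adds to a single weight matrix. The 7B model size, the 4096 by 4096 layer and the rank of 8 are illustrative assumptions, and the storage figure ignores activations and other runtime overhead.

```python
def model_size_gb(n_params, bits_per_weight):
    """Approximate storage for the weights alone, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

def lora_trainable_params(d_in, d_out, rank):
    """LoRA adds two low-rank matrices: (d_out x rank) and (rank x d_in)."""
    return rank * (d_in + d_out)

n_params = 7e9  # assumed model size
size_fp16 = model_size_gb(n_params, 16)  # 14.0 GB at 16 bits per weight
size_4bit = model_size_gb(n_params, 4)   # 3.5 GB at 4 bits per weight

# One assumed 4096 x 4096 weight matrix, adapted with rank 8.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=8)
fraction = lora / full

print(size_fp16, size_4bit, lora, fraction)
```

Under these assumptions the adapter trains well under one percent of the parameters in that layer, and the 4-bit weights fit comfortably within the memory of a single consumer GPU.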

Community shards and networked intelligence

Project Shards is not limited to individuals. Communities can create shared shards trained on their collective knowledge. Examples might include:

guardianship and family support groups
transgender community resources
ethical food sourcing networks
creative collectives
local governance organisations

Each community shard contains specialised knowledge relevant to that group. These shards can communicate with each other while still maintaining local autonomy. In this way a network of small AI systems could emerge that reflect the diversity of human communities rather than a single central authority.

Business model and monetisation

The commercial side of Project Shards emphasises transparency. The idea is not to avoid monetisation but to make it honest and understandable. Possible services include:

personal shard training packages
community shard hosting and support
secure AI routing services
specialised domain models
consulting and training for organisations

Basic personal usage might remain inexpensive or free, while advanced services provide revenue to sustain development. The core promise is that users always know how their data is used and where it is stored. Trust becomes the product.

Ethical vision

Beyond technology and business, Project Shards represents a broader ethical perspective. It imagines a world where artificial intelligence grows from the bottom up rather than being imposed from above. Instead of a single global intelligence controlled by a handful of corporations, many smaller intelligences exist that reflect the needs and values of individuals and communities.

These systems can still collaborate with large frontier models, but they do so through a layer of local trust. In this vision AI becomes a partner in human flourishing rather than a distant authority.

References

No.	Reference	Description
1	arxiv.org/abs/2501.12948	DeepSeek-R1 research paper describing the reasoning-focused model and its reinforcement learning and distillation pipeline.
2	arxiv.org/abs/2412.15115	Technical report for the Qwen model family, explaining architecture, training data and reasoning capabilities.
3	arxiv.org/abs/2106.09685	The LoRA paper introducing low-rank adaptation for efficient fine tuning of large language models.
4	arxiv.org/abs/2305.14314	QLoRA research describing how quantised models can be fine tuned efficiently on consumer hardware.
5	research.google — Mixture of Experts	Google research article explaining Mixture of Experts architectures and conditional computation.
6	github.com/e-p-armstrong/augmentoolkit	Open source project that generates synthetic training datasets from documents for building domain-specific language models.
7	github.com/e-p-armstrong/verustoolkit	Toolkit demonstrating how specialised domain language models can outperform generic models when trained on coherent corpora.
8	github.com/fiatrete/OpenDAN-Personal-AI-OS	Personal AI operating system designed to run local models, tools and data flows on user hardware.
9	nvidia.com — Blackwell architecture	Overview of NVIDIA Blackwell architecture used for large scale AI training and inference.
10	cloud.google.com/tpu	Google Cloud TPU platform documentation describing specialised hardware used for large scale machine learning training.