Project · 2025
Manzoni AI
A LoRA fine-tuning of Gemma 3 4B IT that translates modern Italian into the literary style of Alessandro Manzoni, built on a custom parallel dataset distilled from I Promessi Sposi.
Motivation
The project started from a personal memory: reading I Promessi Sposi at fifteen with a teacher who didn’t just assign the novel, but guided a word-by-word analysis of Manzoni’s syntax, irony, and rhetoric. Years later, the same process of stylistic absorption was replicated — this time in a language model with four billion parameters.
The goal was not to build a general-purpose chatbot, but a style-transfer model capable of rewriting contemporary Italian into prose inspired by the nineteenth-century language of I Promessi Sposi.
The core challenge
No public parallel dataset existed for this task. Training required aligned sentence pairs: the same content expressed once in modern Italian (B1 level) and once in Manzoni’s original register.
Dataset construction
The dataset was built through a reverse distillation pipeline starting from the full text of I Promessi Sposi (38 chapters, introduction excluded to avoid the archaic seventeenth-century register of the original manuscript).
Each chapter was processed with an external LLM (deephermes-3-mistral-24b, accessed via API) instructed to rewrite Manzonian sentences into simple contemporary Italian at B1 level, preserving every sentence and its structure. The prompt was engineered to suppress commentary, avoid summaries, and produce clean output only.
The resulting corpus — modern Italian → Manzonian Italian aligned pairs — was then manually reviewed to remove hallucinations, spurious language switches (including fragments in oriental scripts), and residual prompt artifacts like <to_call> markers. The final dataset was stored in CSV format with one column per register.
Representative excerpt:
| Modern (B1) | Manzonian original |
|---|---|
| Il ramo del lago di Como che si volge a sud… | Quel ramo del lago di Como, che volge a mezzogiorno… |
| Addio, montagne che sorgono dall’acqua… | Addio, monti sorgenti dall’acque, ed elevati al cielo… |
Fine-tuning
The base model is Gemma 3 4B IT (full precision, non-quantized), chosen for its strong Italian support and compatibility with the Google Colab free tier hardware (NVIDIA Tesla T4, ~14.7 GB VRAM). Full fine-tuning was infeasible at this scale, so LoRA (Low-Rank Adaptation) was used. Training was implemented via the Unsloth library, optimized for LoRA on consumer-grade GPUs.
Key hyperparameters:
| Parameter | Value |
|---|---|
| Base model | Gemma 3 4B IT (full precision) |
| Optimizer | AdamW 8-bit |
| Learning rate | 2e-4 |
| Effective batch size | 8 (2 per device × 4 gradient accumulation steps) |
| Training steps | 250 |
| Warmup steps | 5 |
| LR scheduler | Linear decay |
| Weight decay | 0.01 |
| Random seed | 3507 |
The training loss showed a consistent downward trend over 250 steps with expected small-batch fluctuations, stabilizing near the end of training — a good balance between adaptation and overfitting under hardware constraints.
Results
The fine-tuned model reliably rewrites contemporary Italian into a consistently Manzonian register, reproducing complex syntax, archaic lexical choices, and the rhetorical rhythm of the original novel.
Standard configuration (temperature 0.85, top-p 0.82):
Input: Anche se tutti speravano che le cose cambiassero, ogni giorno sembrava uguale al precedente. La gente camminava per le strade con il volto stanco, parlava poco e guardava lontano come se aspettasse qualcosa.
Output: Ognuno, per dir la verità, sognava che le cose potessero cambiare; ma ogni giorno era, pareva a tutti, uguale al giorno che gli precedeva. Passeggiavano per le strade, con il volto stanco, a poca parola, e con lo sguardo lontano, come se aspettassero qualcosa.
Beyond standard literary Italian, the model was evaluated on:
- Short colloquial sentences — stylistic transfer preserved even for proverbs and informal phrases
- Off-topic pragmatic prompts — the model responds partially, but remains anchored to the Manzonian register
- Cross-linguistic inputs (English, French) — archaic lexical shifts appear even outside Italian, though less precisely
- Famous Italian literary texts (Dante, Verga, Leopardi) — successfully recast into a Manzonian register while preserving semantic content
- Temperature / top-p sweeps — lower temperatures produce more conservative outputs; higher temperatures with moderate top-p yield greater stylistic inventiveness
The full methodology, dataset construction process, training pipeline, and evaluation results are documented in the linked academic paper.