Learning a Language Using Only Words You Already Know
December 15, 2025
TL;DR: LangSeed is a proof-of-concept language-learning app that defines new words using only the vocabulary you already know, with emojis bridging semantic gaps. The code is available on GitHub, and you can try it at LangSeed.
I recently bought a simplified version of Journey to the West in Mandarin. It turns out I had overestimated my reading skills: I barely recognised 20% of the words on the first page.
While I could have googled each word (or better yet, signed up for Chinese lessons) and gotten the English definition, doing so felt like it broke the immersion. Instead, I wanted to look up the words I didn't know and see their definitions in my target language. The problem is that there is no dictionary that is 100% tailored to my current level.1
This is where LLMs come into the picture.
Generative Dictionary
The idea is simple: you have a set of words you have already mastered, M, and when you encounter a new word, you ask a model to define it using only words from M. There are two ways to do this: guided decoding and post-generation validation.
In guided decoding, you block the model from generating tokens that don't match words in M. This is slightly complicated, since a token can correspond to a full word, multiple words, or a single character. Guided decoding is easier to set up for local models; frontier models and their APIs offer some support for it, but it is usually centred around CFGs and JSON schemas.
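To make this concrete, here is a rough sketch of guided decoding with a local model, assuming Hugging Face transformers and a placeholder model name (this is not the actual LangSeed setup). It simply whitelists tokens whose surface form appears inside a known word, so it sidesteps rather than solves the token/word mismatch mentioned above.

```python
# Crude guided-decoding sketch (placeholder model, not the LangSeed setup).
# A token is allowed only if its decoded text appears inside a known word,
# plus basic punctuation and the EOS token.
from transformers import AutoModelForCausalLM, AutoTokenizer

known_words = {"我", "喜欢", "吃", "饭", "很", "大", "快"}  # M: words already mastered

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder; any local causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Precompute the whitelist once over the whole vocabulary.
allowed_ids = [tokenizer.eos_token_id]
for token_id in range(len(tokenizer)):
    text = tokenizer.decode([token_id]).strip()
    if not text or text in "，。！？、" or any(text in w for w in known_words):
        allowed_ids.append(token_id)

def allowed_tokens(batch_id, input_ids):
    # Called at every decoding step; we return the same whitelist each time.
    return allowed_ids

prompt = "用简单的词解释“火车”这个词："  # "Explain the word 'train' using simple words:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40,
                        prefix_allowed_tokens_fn=allowed_tokens)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A whitelist this blunt over-restricts the model (any token spanning several words is blocked outright), which is part of why the post-generation route is attractive.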
Post-generation validation is much simpler. After the model has produced an answer, you segment it into words (using Jieba), find all the words that aren't in M, and call the LLM again with the list of words it used that weren't allowed. You repeat this in a loop until the model yields or you give up; it's impossible to explain "melancholy" if the only words you know are "I", "like", and "food". In practice, it averages 1.5 loops for me, with three being the maximum before I give up.2
It's worth noting that I feed the list of words I already know into the context, together with a sentence containing the word, since many words have multiple meanings depending on context.
You can "train" yourself to get stronger, or you can ride a "train".
One thing I noticed was that if the seed vocabulary was too small, it was essentially impossible to break out of it and explain more complicated concepts. At first, I considered generating images to explain the concepts, since, if you squint hard enough, that's similar to how we learn words alongside our visual senses. But then it struck me that I could use emojis, the universal language we all "speak", the Rosetta Stone of our time. Now the model could substitute words and concepts with flags, animals, and the sleeping emoji.
(Interactive example: hover or tap to see the translations.)
I also discovered that a single definition was often not enough to grasp the idea, so I started having the model generate three definitions when possible, which drastically increased my chances of understanding the word.
On top of that, I had the model output the words it wished I already knew, since those would let it write better definitions. For 学习 (study), for example, it recommends that I learn 学校 (school) and 知识 (knowledge), and it initially relied on the emoji sequence 📚✏️🧠💡 to convey some of that meaning (see image below). The model also self-rates its definitions from 0 to 5.
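Put together, a single generated entry looks roughly like this; the field names are my own illustration rather than the app's actual schema, and the rating value is made up.

```python
# Illustrative shape of one generated entry. Field names and the rating value
# are invented for this sketch; the 学习 example comes from the post above.
entry = {
    "word": "学习",                         # the new word being defined
    "definitions": ["…", "…", "…"],         # up to three definitions using only M plus emojis
    "recommended_words": ["学校", "知识"],   # words the model wishes I already knew
    "emoji_hint": "📚✏️🧠💡",                # emoji fallback for meaning outside M
    "self_rating": 4,                        # the model's own 0–5 rating of its definitions
}
```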
But it's still far from perfect. My Chinese friends pointed out multiple grammatical issues, as well as some rather unusual word choices in the example sentences. There are also words I still don't get (for example, 关于), so I've paused them until I have a larger vocabulary, at which point I'll come back and try again.
Drills
The next step was to create a basic training process. I’ve enjoyed spaced repetition in the past (for example, Anki), but for this proof of concept I wanted something simpler. In the end, I landed on two types of questions: a sentence with a gap, where I pick the missing word from four options, and a sentence with a yes-or-no answer. Both use only words you already know (or emojis), relying on the same post-generation verification as before.
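For illustration, the two question types could be represented along these lines; the sentences and field names are my own made-up examples, not output from the app.

```python
# Hypothetical shapes for the two drill types. The Chinese sentences are
# invented examples built from basic vocabulary.
gap_question = {
    "type": "gap_fill",
    "sentence": "我坐___去学校。",           # "I take ___ to school."
    "options": ["火车", "吃", "水", "看"],   # exactly one option fits the gap
    "answer": "火车",
}
yes_no_question = {
    "type": "yes_no",
    "statement": "火车比自行车快。",          # "A train is faster than a bicycle."
    "answer": True,
}
# Both payloads would still run through the same segmentation check against M
# before being shown.
```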
I also made a version that could handle Swedish (my native language) and English, which made it easier to judge how well the LLMs were doing, how sensible the questions were, and how many grammatical problems they had. One big advantage of Chinese for this task is that words don't inflect: there's no tense, no -ing forms, and so on, so checking generated text against M is a straightforward lookup. In Swedish and English, by contrast, words often have multiple stems or inflected forms.
This helped me debug conceptual errors and avoid trusting the models blindly.
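To make the inflection point concrete: for English (or Swedish), the generated text has to be lemmatized before it can be checked against M, whereas in Chinese a plain set lookup on the segmented words is enough. Here is a quick sketch with spaCy, which is an assumption of mine rather than what the app necessarily uses.

```python
# Checking English output against M requires lemmatization first.
# Assumes spaCy with the en_core_web_sm model installed.
import spacy

nlp = spacy.load("en_core_web_sm")
known_words = {"run", "eat", "food", "she", "be", "and", "have", "all", "the"}

doc = nlp("She was running and had eaten all the food.")
unknown = {t.lemma_.lower() for t in doc
           if t.is_alpha and t.lemma_.lower() not in known_words}
print(unknown)  # inflected forms map back to known lemmas, e.g. "running" -> "run"
```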
Implementation
For this project, I decided to use Phoenix LiveView (Elixir), and it was a real joy to work with. I also tried the new req_llm library for unifying provider requests. I used Oban for generating questions in the background, ensuring that each word always had at least four pending questions, since they take some time to generate.
I deployed it on Fly.io, but their hosted Postgres was a bit too expensive for a one-off project, so I gave Neon a try for the first time.
I tried a few models but ended up using Gemini 2.5 Pro as my default model. Gemini Flash wasn't "creative" enough to bridge the gap using emojis. GPT-5 (and 5.1) also did a fairly good job.
Conclusion
I have been using this app for a week now, mostly on my phone while commuting to the office, and I can now read the first page! Next up: figuring out pronunciation; maybe a Christmas break project.
Seeing the definition of a new word feels a bit like solving a puzzle. You have a bunch of clues, you start figuring out what it means, and as you learn more, you can come back and build a clearer and clearer picture.
Sometimes it will mislead you, and reading a clear definition would definitely speed things up, but I wonder whether the process of figuring out a word doesn’t reinforce it much more.
1. There are graded readers for this specific purpose, but I find much more motivation in reading something I actually want to read, rather than something that is uninteresting but at my level. ↩
2. I noticed that M needed to be around 60 very strategically chosen words before the model could actually start teaching me net new things. ↩