Published on November 7th, 2016 | by Luke Turpeinen
Building Your Own World – Language & Names (part 3)
Building Your Own Language: Constructing a Word List
You are finally doing it: you’re building your own world. Whether it’s for an RPG campaign or a board game design, you want to build a setting that is coherent and caters to your specific aesthetic needs. Language, specifically the way you name people, places and things, is going to be a reader’s first introduction to your world and its denizens. Having an internally consistent naming language defines your canon in a simple and direct way.
If you have not read them before, I highly suggest looking at two previous articles also on the subject of language building (this introduction to IPA alphabets, and a step-by-step walkthrough of the first parts of constructing a language). These articles will get you ready to make words for your language by teaching you about sounds in language and how to write them at a fundamental level.
You need to know: 1) which sounds your language can make, 2) how those sounds are written out, and 3) how those sounds can be arranged into syllables. I suggest at this point having at least one way to write your language that uses only one character to represent each sound, as it will make the next steps easier. After you have your phonology and orthography all figured out, you’re going to be ready to start playing with word generation.
These charts represent the consonant and vowel choices I made for this constructed language during the last article. For our purposes right now we’ll be focusing on the “technical”/“in-world” set of characters to represent the phones we’ve chosen, for use with the word generator program.
I like to think of this next stage as “word mining”, the phase where we get the raw materials we’ll need in order to shape actual words for use later on. We’re not worried about perfect results right now, just about getting the right “feel” in the word construction. We’ll finalize the concepts we shape now in the “word smithing” in a future article.
First off, go ahead and open up The Zompist Word Generator. While you’re there, check out his guides and tutorials (buy a book, I don’t know the author at all, but it’s useful stuff and I support that). He’s an actual linguist, not an amateur, and if you have more difficult questions he’ll be a much more knowledgeable resource than I. That said, you might need some help figuring out this program.
When you first pull it up, you should see something like this:
The first two boxes I want to call out for you to notice are the “Categories” box and the “Syllable Types” box. These two are very closely related and we’ll be altering them the most at first. Categories lets you define variables, while Syllable Types allows you to arrange those variables into patterns that your language will use.
Last week during the example we came up with (C(w))V(n, ng, m, s) as our first pass at a syllable structure. To get this idea ready for the Zompist generator, we need to further define these variables. First off, we need four of them. C for “primary consonant” seems fine, as does V for “root vowel”; we can use W for the optional “w” and N for the “nasal endings”.
Now we need to define exactly which phonemes each variable represents, and plug that into the form. You can list letters under more than one variable, if you want to be able to make syllables like “nan”. I put all of my consonants into C, and all of my nasals (plus “s”) into N, but then remembered I wanted the “ng” sound only at the end of syllables and the “ny” sound only at the beginning, so I adjusted the lists.
Note! It is important to understand that Zompist lets you control how common your phonemes are. The further to the left a character appears in the Categories box, the more frequently it’ll appear in your word lists. You can control the disparity between the left-most and right-most letters by adjusting the “Dropoff” rate, which also allows you to make them equally likely to occur, essentially shutting off the feature.
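To make the idea of dropoff concrete, here is a minimal Python sketch of one way a left-weighted pick could work. It is an assumption for illustration, not the generator's actual code: the function name `pick_weighted` and the `dropoff` parameter (a probability from 0 to 1) are hypothetical.

```python
import random
from collections import Counter


def pick_weighted(items, dropoff, rng):
    """Pick one item, favouring items listed further to the left.

    dropoff is the chance of accepting the item currently looked at.
    A higher value favours the left end more strongly; 0 switches the
    weighting off and picks uniformly. (Hypothetical scheme.)
    """
    if not 0 <= dropoff <= 1:
        raise ValueError("dropoff must be between 0 and 1")
    if dropoff == 0:
        return rng.choice(items)
    i = 0
    while True:
        # Accept the current item with probability `dropoff`,
        # otherwise move one step right (wrapping around at the end).
        if rng.random() < dropoff:
            return items[i]
        i = (i + 1) % len(items)


rng = random.Random(2016)  # fixed seed so runs are repeatable
counts = Counter(pick_weighted("ptkms", 0.3, rng) for _ in range(5000))
for letter in "ptkms":
    print(letter, counts[letter])
```

In this sketch each letter should turn up less often than the one to its left, which is the behaviour described above; setting `dropoff` to 0 makes every letter equally likely.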
Now I need to generate a list of possible variable combinations for the program to use. In the Syllable Types box, I write out all possible syllable formations I want the program to output. I can’t just make a formula like above, but it should be fairly simple to generate these yourself, as you’ve already defined the variables you need in the Categories box.
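If you would rather not write the combinations out by hand, a few lines of Python can expand the formula for you. This is only a sketch of the (C(W))V(N) pattern from this article; the slot lists are my own way of encoding it, and you would still paste the results into the Syllable Types box yourself.

```python
from itertools import product

# W may only appear after C, so C and CW are treated as one "onset" slot.
onsets = ["", "C", "CW"]   # no onset, C, or C followed by W
nuclei = ["V"]             # the vowel is required
codas = ["", "N"]          # the ending is optional

# Every combination of one onset, one nucleus and one coda.
syllable_types = ["".join(parts) for parts in product(onsets, nuclei, codas)]

print(syllable_types)
# ['V', 'VN', 'CV', 'CVN', 'CWV', 'CWVN']
```

The list comes out in slot order, not frequency order, so rearrange it by hand before using any dropoff.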
Order your syllable types from most frequent to least frequent, in case you want to use some drop off on them as well. To do so, click the “slow syllable dropoff” button above the “Generate” button. This is what I have so far:
First Run-Through
Make sure you delete everything out of the rewrite rules; we’ll deal with those later. Then choose an output type: “Text Output” for something that looks like text in paragraph form, a word list (big or small), or every single syllable your conlang can produce based on these rules (though it doesn’t have to, and probably shouldn’t, use them all).
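For a rough idea of what happens when you press “Generate”, here is a small Python sketch that builds words from categories and syllable types. It is an assumption about the general approach, not the tool's real code. The letter inventories below are hypothetical stand-ins (my real lists are in the screenshots), and letters are picked uniformly, with no dropoff, to keep the example short.

```python
import random

# Hypothetical inventories, one character per sound.
categories = {
    "C": "ptkmnñsl",
    "V": "aiuy",
    "W": "w",
    "N": "nṅms",
}
syllable_types = ["CV", "CVN", "CWV", "CWVN", "V", "VN"]


def make_syllable(pattern, rng):
    """Replace each category letter in the pattern with one of its sounds."""
    return "".join(rng.choice(categories[slot]) for slot in pattern)


def make_word(rng, max_syllables=3):
    """Chain together one to max_syllables randomly chosen syllables."""
    count = rng.randint(1, max_syllables)
    return "".join(
        make_syllable(rng.choice(syllable_types), rng) for _ in range(count)
    )


rng = random.Random(2016)  # fixed seed so the list is repeatable
words = [make_word(rng) for _ in range(10)]
print(" ".join(words))
```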
My first pass looked like this:
Generating this list taught me a couple of things. One is that I dislike having that extra “w” show up, even though it doesn’t have to be as frequent as the generator makes it. I’m going to just toss the “w” into the C variable and not worry about it. I also reordered the letters a bit and changed the drop off to “slow”.
After tinkering with the output and trying out both word lists and paragraph style text, I decided that having a solo vowel syllable was messing with my “no diphthongs” rule, so I took that option out. My text started to look like this:
What I’m looking for is “proper” vowel/consonant groupings, according to the style in my head. I also try to read the text, to make sure that there aren’t syllable pairings that are absurdly difficult. If there are, I would need to find out why that is, and how it should sound instead.
For example, take the word “mimñuṅnyn” which was generated in the above text. This is an example of a word I’d never want to make because it has too many similar letter forms. These letters correspond to similar sounds, which makes it even harder to say- like a tongue twister. Looking at my options, I note that I can take “m” and “n” out of the C variable to largely fix this issue. Another option instead of (or in addition to) that would be to use a rewrite rule.
The “Rewrite Rules” field is a simple set of commands that change your text after it has been generated normally. Put the text you want to change before what you want to change it to, with a vertical line ( | ) between them (you can even leave the second part blank to delete certain combos).
So I could take the word “mimñuṅnyn” and decide that any time a non-labial nasal (n, ṅ) is followed by “y”, it turns into “ñ” ([nṅ]y|ñy). Then we can say that whenever “ñ” is followed by another nasal, the second one is dropped (ñ[nñṅ]|ñ). You could do that again for other letters and combos (ṅ[ñṅn]|ṅ), to tighten up the way your words look right out of the generator.
Just note that the rules are executed in order, and that may change how your output looks. For example, depending on the order in which you put the above rules, “mimñuṅnyn” could change to “mimñuṅ” or “mimñuñyn”. Depending on how complex your syllable construction and rewrite rules are, this may matter a lot, or not at all.
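You can check the effect of rule order outside the generator. The Python sketch below assumes each rewrite rule behaves like a regular-expression substitution applied to every match, one rule after another; that is my assumption about the mechanics, and the helper name `apply_rules` is made up for this example. The special letters are written as escapes so the patterns do not depend on how your editor stores accented characters.

```python
import re

NY = "\u00f1"  # ñ
NG = "\u1e45"  # ṅ


def apply_rules(word, rules):
    """Apply each (pattern, replacement) pair in order, replacing every match."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


palatalize = (f"[n{NG}]y", f"{NY}y")        # [nṅ]y|ñy
drop_after_ny = (f"{NY}[n{NY}{NG}]", NY)    # ñ[nñṅ]|ñ
drop_after_ng = (f"{NG}[{NY}{NG}n]", NG)    # ṅ[ñṅn]|ṅ

word = f"mim{NY}u{NG}nyn"                   # mimñuṅnyn

# Same three rules, two different orders.
as_listed = apply_rules(word, [palatalize, drop_after_ny, drop_after_ng])
drops_first = apply_rules(word, [drop_after_ng, drop_after_ny, palatalize])

print(as_listed)    # mimñuṅyn
print(drops_first)  # mimñuñyn
```

Under these assumptions the same three rules give two different words, so it is worth trying a couple of orders on your own problem words before settling on one.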
You can also keep a set of rewrite rules handy to switch your finalized text between your technical orthography (which I’ve been using in examples) and a transliteration that is more appealing to native speakers of your own language. This just involves running letter substitutions from my “in-world” orthography to my “English” orthographies, above.
Now you’re ready to generate words to your heart’s content, and they’ll all have the same look and feel.