The One Grammar to Rule Them All (OGTRTA)
Current version: 0.3 (beta)
A specification for a class of languages.
Gloss abbreviations are drawn from Wikipedia's list of glossing abbreviations.
Introduction
OGTRTA is a template for making constructed languages. The idea is that with a short wordlist and a few decisions about grammar, you can create a fully-functioning, self-consistent, complete, and unique conlang, suitable for further elaboration.
I created OGTRTA for my own use, because I suck at finishing conlang grammars. I wanted a way to create a conlang that was "complete" (if short on vocabulary) in a few minutes, so that most of the remaining work would be crafting the lexicon and tweaking the morphology. Essentially, I wanted the terms/walking-skeleton pattern, but for conlangs instead of computer programs.
However, in order to arrive at a system that would work, the way I had to develop OGTRTA was to create several actual (half-finished) conlangs, peruse grammars of natural languages, and then figure out what framework could produce them all. OGTRTA is not some pie-in-the-sky idea I dreamed up. It's derived from observations of actual languages both natural and artificial.
In creating OGTRTA, I had to strike a balance between a maximally-comprehensible system and a maximally-flexible one. As a result, OGTRTA is not right for every language. For example, if you want a language with very free word order, tons of noun classes, and head-marking, OGTRTA is probably not a good choice — but then again, you don't need OGTRTA for a language like that, because the main problem OGTRTA solves is syntax. On the other hand, if you want a language with fairly fixed word order, which is either consistently head-initial or consistently head-final, OGTRTA might work well for you.
OGTRTA comprises two parts: a syntax, and a set of glosses for about 50 morphemes. This document describes both parts, beginning with the syntax.
Syntax
Parts of speech
OGTRTA recognizes five parts of speech:
- nouns
- verbs
- determiners
- pronouns
- conjunctions
Nouns are straightforward: a noun refers to a person, place, thing, or idea. However, verbs in OGTRTA are a little bit different from English verbs. OGTRTA has no adjectives or prepositions, so verbs fill the role of both.
OGTRTA's determiners include words like articles (e.g. "the") and some interrogatives ("which", "whose").
Pronouns function syntactically much like nouns, but have some restrictions. For example, they cannot take determiners. Also, their inflectional morphology may differ from that of ordinary nouns (exactly how it differs is left up to the individual language).
Conjunctions connect syntax nodes (phrases and sentences) as peers — i.e. without subordinating one node to the other. Conjunctions include such useful words as "and", "or", "while", "because", "so", and so on.
Of these parts of speech, nouns and verbs are "open classes," meaning that speakers of a language innovate and borrow new ones all the time. Determiners, pronouns, and conjunctions are "closed classes." Innovation in these parts of speech should happen very rarely.
OGTRTA aims to provide languages with a complete set of determiners, pronouns, and conjunctions at the outset, so the only part of the lexicon you will need to design yourself is the nouns and verbs. OGTRTA also provides a framework for designing preposition-like verbs.
Word order
The basic word order in declarative sentences in OGTRTA is VOS or verb-object-subject. Wait, come back! The subject is usually fronted via a transformation, so SVO word order is more common in actual sentences. Additionally, OGTRTA is a reversible syntax: the right-hand side of all the syntax production rules can be reversed, producing an SOV language. Backing the subject of an SOV sentence produces OVS.
The other possible word orders, VSO and OSV, are not directly supported. If you really want to make a VSO language with OGTRTA, you can probably figure something out, but this guide will not describe how to do it.
To keep things straightforward, this guide assumes a verb-object word order, and all the examples will use that syntax.
Production rules, level 1
The complete set of production rules for OGTRTA is complex, so rather than dump them on you all at once, they will be introduced in stages. This section describes "level 1" syntax—the most minimal, stripped-down version of the grammar.
A sentence is either declarative (making a statement about what is the case), imperative (issuing a command or request) or interrogative (asking a question). OGTRTA views interrogative sentences as a special case of imperatives, since they're making a request for information. So there are only two types of sentences in the formal grammar: declarative sentences, abbreviated DS
and interrogative/imperative sentences, abbreviated IS
.
Here is how we represent the idea that a sentence S
is either a DS
or IS
:
S -> DS
S -> IS
A declarative sentence DS
consists of a verb phrase VP
followed by a subject noun phrase NP
.
DS -> VP NP
The order of nodes in a sentence can also be reversed:
DS -> NP VP
A sentence can be composed of multiple sentences, using a conjunction:
DS -> S CONJ S
An interrogative or imperative sentence IS
is a single noun phrase (see the sections on imperatives and interrogatives for how this works).
IS -> NP
A verb phrase consists of a verb V
with valence n
(represented V/n
), followed by n complement noun phrases (represented NP{n}
).
VP -> V/n NP{n}
What's valence, you ask? The valence of a word is the number of noun phrases you need to put after the word to make it a complete "thought." For example, "Rachel skydives" is a complete thought, because "skydives" has a valence of 0, but "Rachel pokes" is not complete — "pokes" has valence 1 and requires an object. The noun phrases required by the valence of a word are called the word's complements.
All words in OGTRTA have a valence. Most nouns have a valence of 0, but nouns derived from verbs (infinitives) often have a valence greater than 0. Individual languages might also choose to give other nouns a nonzero valence, e.g. kinship terms. A word like "daughter" might require a complement specifying who the referent is the daughter of.
Note that valence is a grammatical concept, not a logical one. In English, a verb like "eat" can have either valence 1 or valence 0, and that's perfectly fine. You can say "I eat", or "I eat beans" and both are grammatical. In OGTRTA, every word must have a well-defined valence when it is actually used: there is no such thing as an optional complement. However, OGTRTA languages often have morphological affixes to modify the valence of a word, so your language can have both eat/1
and eat/0
. The valence-changing affixes can also be realized as null morphemes, so eat/1
and eat/0
need not have distinct forms.
In any case, the important thing to understand is that valence is always about what words are required to be there, not about whether the action represented by a verb conceptually has a direct object or not.
A noun phrase consists of a noun N
with valence n
, followed by its n
complements. The noun is optionally preceded by a determiner.
NP -> DET? N/n NP{n}
A noun phrase can also be composed of other noun phrases, using a conjunction:
NP -> NP CONJ NP
Or it can just be a pronoun:
NP -> PRN
The noun phrase at the root of an interrogative or imperative sentence often takes the form of an interrogative phrase followed by a declarative sentence:
NP -> IP DS
An interrogative phrase is either an interrogative pronoun IPRN
, or a noun phrase with an interrogative determiner IDET
.
IP -> IPRN
IP -> IDET N/n NP{n}
With just these syntax rules, we can already construct sentences of arbitrary length and complexity. Something's missing, though. Where are the adjectives?
Production rules, level 2
Recall that OGTRTA does not have adjectives as a separate lexical class. However, its syntax does have a concept of modifiers. A verb phrase can modify a noun by following it (either before or after any complements):
NP -> DET? N/n VP NP{n} VP
In fact, a noun can be modified by any number of VPs (zero or more), represented by VP*
:
NP -> DET? N/n VP* NP{n} VP*
Words that we think of as adjectives in English are zero-valence verbs in OGTRTA.
A noun phrase that includes a determiner can have zero-valence modifiers between the determiner and the noun:
NP -> DET V/0* N/n VP* NP{n} VP*
This rule is optional, though, and individual languages can safely leave it out.
Verbs can also have modifiers, equivalent to adverbs in English.
VP -> V/n VP* NP{n} VP*
Production rules, level 3
Now we come to the interface between syntax and morphology. The nodes in our syntax tree can have "tags," which can be "inherited" by descendent nodes. Depending on the specific language, tags on a node can influence what morphological affixes are allowed or required. In the syntax metalanguage, I'll represent these tags as a letter after a dot, like .x
.
So, starting back at the beginning: A sentence has a subject noun phrase (NP.s
) and a finite verb phrase (VP.f
).
S -> VP.f NP.s
S -> NP.s VP.f
Any tags on a verb phrase are inherited by its head verb. Here I use .x
to mean "any tags that are present." The modifiers of a phrase get a "modifier tag" .m
, and the complements of a verb may get a "case tag" .c
, the specific nature of which is determined by the lexical verb. The modifiers of a verb phrase also get an "adverbial tag" .a
.
VP.x -> V_n.x VP.m.a* NP.c{n} VP.m.a*
That rule is pretty complicated, but I promise it's the worst one. A conjoined verb phrase passes its tags to both conjuncts:
VP.x -> VP.x CONJ VP.x
Same with a conjoined noun phrase:
NP.x -> NP.x CONJ NP.x
The head noun of an NP inherits the tags:
NP.x -> PRN.x
NP.x -> DET? N.x VP.m*
NP.x -> DET V_0.m* N.x VP.m*
And that's it! That's the entire syntax of OGTRTA.
All together now:
S -> DS
S -> NP
DS -> VP.f NP.s
DS -> NP.s VP.f
DS -> S CONJ S
VP.x -> V_n.x VP.m.a* NP.c{n} VP.m.a*
VP.x -> VP.x CONJ VP.x
NP.x -> PRN.x
NP.x -> DET? N.x VP.m*
NP.x -> DET V_0.m* N.x VP.m*
NP.x -> NP.x CONJ NP.x
NP.x -> IP DS
Get that printed on a t-shirt. I'm sure it'll be a hit at parties.
Production rules appendix: agreement tags
As described above, the syntax of OGTRTA can do many things, but it has one major shortcoming: it cannot represent any constraints on agreement between words. That is, the syntax cannot require a verb to agree in plurality with its subject (as in English "he swims" vs. "they swim"), or a modifier to agree in gender with the noun it's modifying (as in Spanish "el poema bonito" vs. "la casa bonita").
All is not lost, however: we can use tags to represent agreement constraints between nodes. It's not possible to fully specify a formal system for doing this, because it's very dependent on the individual language, but I'll give some examples.
First example: perhaps you want your language to have masculine and feminine articles, like Spanish does, and you want to require that nouns must agree in gender with their article. Here's how you might do that with .m
and .f
tags:
NP.m -> DET.m N.m VP*
NP.f -> DET.f N.f VP*
These rules say, basically, that a noun phrase can be either masculine or feminine. Masculine NPs have to have a masculine determiner and a masculine noun, while feminine NPs have to have a feminine determiner and a feminine noun.
Maybe you also want to head-mark the gender of VP constituents on the verb, as in Swahili. Here's how you'd do that, with tags ms
and fs
for the gender of the subject and m1
and f1
for the gender of the first complement.
S.ms.m1 -> VP.ms.m1 NP.ms
S.ms.f1 -> VP.ms.f1 NP.ms
S.fs.m1 -> VP.fs.m1 NP.fs
S.fs.f1 -> VP.fs.f1 NP.fs
VP.ms.m1 -> V_1.ms.m1 VP* NP.m1 VP*
VP.ms.f1 -> V_1.ms.f1 VP* NP.f1 VP*
VP.fs.m1 -> V_1.fs.m1 VP* NP.m1 VP*
VP.fs.f1 -> V_1.fs.f1 VP* NP.f1 VP*
As you can see, this gets more complicated the more genders and complements you have to deal with. But in theory, this system is capable of expressing all the possible combinations.
This is a sketch, not a formal description. It's hard to describe agreement tags precisely in any system less powerful than an actual programming language. But hopefully you get the idea. If not, don't worry about it. You don't need to know anything about agreement tags unless you, like me, need the security blanket of formality to keep those evil, messy WORDS safely away from you.
Lexicon
Now that we have covered syntax, let's move on to the other half of OGTRTA: the lexicon.
For this section, we will have to introduce some glossing conventions:
- A dash separates bound morphemes, e.g.
cabbage-PL
"cabbages" - A dot separates the constituent meanings of a fusional morpheme, e.g.
3.AN.PL
for the third person animate plural pronoun. - Underscores separate multiple words used to gloss a single morpheme, e.g.
what_kind_of
- The valence of a verb is indicated by a slash followed by a number, as in Erlang, e.g.
swim/0
,eat/1
,give/2
. - Abbreviations are in uppercase, e.g.
DEF
for the definite article,PL
for plural, etc.
Null morphemes
Any OGTRTA language may opt out of any morpheme or inflection, by realizing it as a null morpheme. For example, let's say you wanted your language to have a Celtic-style "genitive of juxtaposition," where to refer to a noun and its possessor you just put the two nouns next to each other. At first, it might seem like OGTRTA can't do this, because nouns cannot directly modify other nouns; only verbs can.
But there is a workaround: use a genitive verb GEN/1
which is realized as null. So:
trymped y brenin
trymped ∅ y brenin
trumpet GEN/1 DEF king
the king's trumpet
Another trick lets you use genitive-case nouns as modifiers, as in German:
die Räder des Busses
die räder ∅ des busses
DEF.NEU.PL wheel-PL GEN/1 DEF.MASC.GEN bus-GEN
the wheels of the bus
Here we suppose there is a verb GEN/1
which governs the genitive case (a constraint that can be expressed via tags) and is realized as null.
When you think OGTRTA can't do something, often it can — you just have to add some null morphemes until the syntax tree is happy. But beware: the more null morphemes you add, the more ambiguous the language (generally) gets.
Determiners
- Do articles agree in number, case, or gender with their noun?
- Does the language have definite and indefinite articles?
- Are demonstratives (this, that) determiners, or verbs?
- Are possessives (my, your) determiners, or verbs?
- Are the quantifiers any, all, no determiners, or are they expressed periphrastically as nouns, e.g. all of the stars, none of the stars?
- Do the interrogative determiners which, whose exist, or are they expressed periphrastically?
Articles
DEF
singular definite articleDEFPL
plural definite article
For indefinite nouns, number is expressed periphrastically, with modifiers.
This difference in number marking strategies stems from an observation about how number is used in practice. In many indefinite noun phrases, the noun is effectively numberless:
- "Are there sheep on the hill?"
- "Everyone should send their children to school."
These examples are morphologically plural (in English) but the actual referents of the highlighted nouns might be singular or plural depending on the circumstances of the utterance.
However, a definite noun cannot be numberless, because a definite noun always refers to a specific set of things.
Quantifiers
NEGDET
negative quantifier ("there are no sheep on the hill")every
universal quantifier ("every sheep is on the hill")
Interrogatives
which
what_kind_of
how_much
(also used for "how many")
Words that aren't determiners
Some types of words are determiners in English but not in OGTRTA. These include:
- possessive pronouns ("my", "your")
- demonstratives ("this (one)", "that (one)")
These concepts are all expressed by verbs in OGTRTA.
Semantics
OGTRTA employs determiners only in cases where the referent of a noun cannot be pinned down by determining modifiers. A phrase like "the bear" cannot be paraphrased by describing a kind of bear — the same goes for "no bear" and "every bear".
Possessives and demonstratives, on the other hand, can be expressed as modifiers. Precedents exist in many natural languages: English ("for the sake of them"), Italian ("la mia famiglia"), and Welsh ("y bore 'ma"), for instance. Note that in each of these cases, the modifier can coexist with the definite article.
"every" / "all" / "none" as modifiers
Since determiners cannot be used with pronouns, a periphrastic construction is needed to express e.g. "all of them".
1SG DIST see1 every being of1#M 3ANPL
"I saw all of them" (lit: I saw every one of them)
NEGDET
uses a different paraphrase:
1SG DIST see1 NEGDET being of1#M 3ANPL
"I saw none of them"
Prepositions
Derived from Wikipedia's List of Grammatical Cases.
A language should have some way of expressing all of these ideas. Each preposition could probably have its own chapter in a grammar textbook.
at1
orLOC1
"at" (the basic locative preposition) - "on/upon" can be merged with this.of1
orGEN1
"of" (genitive)to1
orALL1
"to" (the basic allative preposition)from1
orABL1
"from" (the basic ablative preposition)in1
orINESS1
"in" (the basic inessive preposition)COM1
"with" (comitative)INS1
"with" (instrumental)AGT1
"by" (agentive) - can be merged with "with"LIM1
"by" (limitative of time, e.g. "by 5:00")DAT1
"for" (dative) - can be merged with "to"BEN1
"for" (benefactive)during1
"during"DISTR1
"per" / "for each"CMPR1
"like" / "as"PRIV1
"without"
Intransitive Prepositions
up0
←above1#MID
down0
←below1#MID
away0
,off0
←from1#MID
Less Common Prepositions
between1
"between"touch1
"touching"- "about" / "concerning" - can be merged with "touching"
beside1
"next to" / "beside"before1
"in front of" / "before"behind1
"after" / "behind"under1
"under"over1
"over"near1
"near"ELA1
"out of" (elative)PERL1
"through" / "via"- "according to"
- "because of"
- "for want of"
- "instead of"
BEN1
"for the benefit of"MAL1
"against" / "for the detriment of"
Notes
Prepositions "to" and "from", and variants like "onto" and "from out of" can be replaced by inchoative and cessastive inflections of other prepositions. E.g. "to" can be at1#INCH
. "for" (dative) can be of1#INCH
.
Personal Pronouns
1SG
1PL
2SG
2PL
3ANSG
3ANPL
- animate 3rd person3INANSG
3INANPL
- inanimate 3rd person
Demonstratives
here0
there0
Atomic Verbs
say0
go0
know1
see1
like1
like (be fond of; the comparative preposition isCMPR1
)look1
think1
can1
make1
get1
be1
take
find1
ask2
- 1: askee, 2: thing requestedperceive2
- 1: thing perceived, 2: quality- feel
- turn
- want
- stand
- die
- hear
- try
- open
- begin
- let
allow1
Derived Verbs
have1
←of1#PASS
- tell ←
give2 <someone> know1#INF
become1
←be1#INCH
- give ←
of1#CAUS
orhave1#CAUS
- last
endure0
←endure1#MID
- seem, appear ←
perceive2#PASS
Negation
#NEG
Morphological Tenses
PROX
proximal tense, used for events in the present.DIST
distal tense, used for the past and for hypotheticals.AOR
aorist tense, used for general statements, for timeless truths, and when the speaker does not care to mark tense.
Periphrastic Tenses
PRF
perfect (←ELA1
+ infinitive)FUT
future (←ALL1
+ infinitive)
Aspects
PROG
progressive (←LOC1
+ infinitive)
Valence-Changing
#MID
removes all complement slots.#PASS
swaps the subject and first complement slots.#MIDPASS
removes the subject and promotes the first complement to subject.#CAUS
demotes the subject to first complement, increasing valence by 1, and adds a new subject.
Valence Restoration
#MID
removes all complement slots, but sometimes you don't want that; you want to keep one of the complements of a valence-2 verb but remove the other. The way to accomplish that is via valence restoration.
Each verb has a lexically-determined mapping from slot indices to verbs that restore those slots when used as modifiers. E.g. here ABL1
is used to restore slot 1 of ask2
.
1SG PROX FUT ask2#MID ABL1#M 3ANSG
"I'll ask them."
To restore slot 2, you'd have to use ALL1
:
1SG PROX FUT ask2#MID ALL1#M 3INANSG
"I'll ask for it."
Valence restoration is also useful when you want to swap the complements of a valence-2 verb:
1SG PROX FUT ask2#MID ALL1#M 3INANSG ABL1 3ANSG
"I'll ask for it from them."
Modifiers used for valence restoration never take the adverbial particle.
Part-of-Speech-Changing
Nominalization
#INF
- nominalizes a verb, keeping complement slots#GER
- nominalizes a verb, dropping all complement slotsNZ
- nominalizes a sentence
GEN1
/of1
can be used to attach a subject to a nominalized verb:
1SG PROX want1 GEN1#M 2SG see1#INF 3INANSG
"I want you to see it"
When used in this way, GEN1
never takes the adverbial particle.
Questions
Interrogative sentences are NPs.
Interrogative pronouns:
what
who
Interrogative determiners:
which
whose
Interrogative particles:
whether
how_condition
("How are you?", "How is the course going?")how_degree
(how_degree AOR tall0 ^Barack ^Obama
"How tall is Barack Obama?" = "To what degree is Barack Obama tall?")- also used for
how_much
/how_many
- also used for
why
An interrogative NP is either an interrogative pronoun, or a noun with an interrogative determiner.
TODO: replace interrogative determiners with predicates. Possessives and demonstratives aren't determiners, so why should which
and whose
be?
OGTRTA languages may front interrogative NPs, though some languages leave them in place. When an interrogative NP is fronted, a resumptive pronoun RES
is left in its place. Languages may realize the RES
morpheme as null.
whose vassal 3ANSG FIN PERF kill1#INF RES
"Whose vassal did he kill?"
Agents in non-finite clauses
The semantic agents of a non-finite clause can be added as the complement of a modifier on the main predicate. This modifier is glossed NFSBJ
(non-finite subject) but is typically realized as some other morpheme: either of1
(in languages that mark first modifiers) or ADV
(in languages with an adverbial predicate).
3ANSG PROX bring1 3INANSG
"He brings it." →1SG PROX want1 NFSBJ 3ANSG bring1#INF 3INANSG
"I want him to bring it."
Modifiers on predicates
Modifier phrases can attach to predicates as well as nouns. In complex sentences, this can create undesirable ambiguity. OGTRTA languages use a couple different strategies to resolve the ambiguity.
Nearest modifier marking
In languages that use the "nearest modifier" strategy, a modifier that immediately follows the head of its parent phrase is marked with an inflection #M
. Often, this marking is an initial consonant mutation, but it does not have to be.
1SG DIST eat0 quick0#M
"I ate quickly"bear fierce0#M very0#M
"a very fierce bear" — contrast withbear fierce0#M very0
, which might mean "an actual, fierce bear."
In SOV/OVS languages, a modifier that immediately precedes the head of its parent phrase gets the #M
inflection.
1SG quick0#M eat0 DIST
"I ate quickly"
Adverbial particle
In languages that use the "adverbial particle" strategy, a verb that modifies another optionally has the particle ADV
placed before it.
1SG DIST eat0 ADV quick0
"I ate quickly"bear fierce0 ADV very0
"a very fierce bear"
ADV
might also precede a noun in some cases:
1SG DIST see1 3SG ADV thursday
"I saw it on Thursday"