Let's talk about a rarely mentioned matter: how we can make sense of grammar in a non-statistical way. Many AI models, such as DeepMind's GEM and Google's PARSEVAL, don't rely purely on statistical learning to make sense of grammar. Instead, these hybrid models reintroduce formal grammars, such as Combinatory Categorial Grammar (CCG), into their architecture. This lets them draw on decades of linguistic analysis in only a few lines of code, theoretically allowing them to reach the same level of competence in less time and at less cost. But how do we turn grammar into something a computer can work with?
To understand this, we'll talk about how words turn into functions, the algebra behind their combination, and how a program raising a TypeError is, in many ways, equivalent to a sentence with bad grammar.
Listen, grammar might bring back some bad memories — words crossed out in red, a ruler to the wrist, or a blank stare in the face of "preposition".
To a grammarian, this is aided by a set of prescriptive rules. Commandments like:
- Thou shalt not use "whom" as a subject.
- Thou shalt have a subject and an object in a sentence.
- Thou shalt not end sentences with prepositions (at, by, into, …).
As a writer, I've always found the commandments a bit restrictive—part pain, part medicine. And while I can admit this kind of grammar can clarify your writing, it doesn't help machines understand sentence structure. For that, we will need to discuss Combinatory Categorial Grammar (CCG). Still, we cannot abandon prescriptive grammar entirely, and in this article we'll lean on the 2nd commandment: every sentence must contain a clear subject and predicate.
From NLP to Proof Nets
In the early 2000s, statistical CCG parsers were leaders in providing wide-coverage, high-accuracy syntactic parsing by capturing long-distance dependencies and complex coordination. While not the current hot topic in LLMs, CCG has helped shape question-answering, logical inference, and machine translation systems where structural transparency is desired.
While grammar can now be inferred from sheer data alone, with no need for hand-coded rules, many state-of-the-art models still re-inject syntactic signals because:
- Implicit learning alone can miss corner-case phenomena. A parser can be made to handle triple negatives in legalese or enjambment in poetry, but only if you explicitly encode these patterns.
- Faster learning with less data. Learning grammar from data alone requires billions of examples and is computationally expensive.
- Interpretability and control. When analyzing syntactic errors, it's easier to look at parse-based features than opaque attention weights.
- Consistency in generation. Purely emergent models can drift, flipping verb tenses mid-sentence and mismatching pronouns and antecedents. A syntax-aware parser or grammar module can enforce consistency explicitly.
- Low-resource language constraints. Swahili or Welsh may have less available data for conventional large-scale training. Hand-coded grammar rules make up for that.
Proof Nets
Another reason CCG continues to matter is its deep connection to proof nets (Girard, 1987). Proof nets are a graph-based way of representing proofs in linear logic that strips away bureaucratic detail to reveal the core logical structure. Morrill (1994) and Moot & Retoré (2012) showed that every CCG parse can be translated into one of these canonical proof-net graphs, giving a direct, formal bridge between CCG's syntactic derivations and linear-logic proofs. Long-distance dependencies emerge as explicit paths, derivational redundancies are eliminated, and semantic composition follows graph contractions. When we say linear logic, we mean that every formula must be used exactly once in a derivation.
Think of it this way: as you build a CCG parse (or its proof net), each syntactic combination (e.g. a verb phrase combining with its subject) tells you exactly which semantic operation to perform (function application, function composition, and so on). That sequence of syntax-guided steps then composes the meanings of individual words into the meaning of the whole sentence in a formally precise way.
The C&C parser and EasyCCG (Lewis & Steedman, 2014) are prominent tools in the field of CCG parsing. While both are widely used, EasyCCG is generally recognized for its speed, often achieving faster parsing times, while the C&C parser is frequently noted for its accuracy, particularly on complex sentences.
Where are we exactly?
Formally, Type-1 on the Chomsky hierarchy: right below Turing machines, right above pushdown automata. Type-1 is context-sensitive. The deeper a language sits in the Chomsky hierarchy, the higher the generative power, structural complexity, and computational resources required to parse it.
- Parse: to determine whether a string can be constructed given the grammar rules.
- A language (L) is a set of words (finite strings) composed by taking elements from an alphabet (Σ), the set that contains all the symbols for that language.
With this broader definition, "words" don't have to be words. For example, our "words" could be email addresses, and our alphabet could be numbers, letters, and symbols.
In English, if we want to talk about whole sentences, we can let our alphabet Σ be the set of all words (our vocabulary). Then a sentence is any finite string in Σ*, and a language L ⊆ Σ* is just the set of "well-formed" sentences we care about.
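To make these definitions concrete, here is a minimal Python sketch; the four-word vocabulary and the tiny language L are made up for illustration.

# A minimal sketch of the definitions above: an alphabet, strings over
# it, and a toy language L as a set of "well-formed" strings.

SIGMA = {"the", "dog", "cat", "ran"}          # alphabet Σ: our vocabulary

def in_sigma_star(sentence):
    # A string over Σ is any finite sequence of its symbols.
    return all(word in SIGMA for word in sentence)

# L ⊆ Σ*: the subset of strings we accept as well-formed sentences.
L = {("the", "dog", "ran"), ("the", "cat", "ran")}

print(in_sigma_star(["the", "dog", "ran"]))   # True: a string in Σ*
print(("the", "dog", "ran") in L)             # True: also in L
print(("ran", "dog", "the") in L)             # False: in Σ* but not well-formed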

Given this abstract definition of a language, we can talk about esoteric constructions such as languages where all the words are things like ab, aab, aaabbbb, abb, and so on, formally described as follows:
L = { aⁿbᵐ : n ≥ 1, m ≥ 1 }
Here exponentiation looks more like appending copies to the end of a string, so 3² = 33 ≠ 9.
This language is Type-3 on the hierarchy, a regular expression. While it may be hard to find practical uses for the language above, one of the most ubiquitous real-world uses of regular expressions is email-address validation on web forms: behind the scenes, the form uses a regex like…
^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$
this@…@😡grammar.in_the_email_language.ca
Here the sole purpose of our grammar is to make sure you enter a valid email address. In CCG, our grammar has a more linguistic purpose: it checks whether words combine grammatically.
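Here is a small Python sketch of that check using the standard re module; the addresses are made up, and the pattern is the one above (with the dot before the top-level domain escaped).

import re

# Compile the email pattern. fullmatch requires the entire string to fit
# the pattern, which mirrors the ^...$ anchors in the web-form version.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

print(bool(EMAIL.fullmatch("reader@example.com")))    # True: a valid "word"
print(bool(EMAIL.fullmatch("this@is@not@grammar.ca")))# False: rejected by the grammar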
Chomsky’s Hierarchy in NLP
As you move from Type-3 to Type-0, you relax the constraints on what you can produce. This increases the expressive power at the cost of more computation.
Type-0 (recursively enumerable grammars): full semantic parsing or generation in a Turing-complete formalism (e.g. Prolog DCGs with arbitrary extra arguments, or neural seq2seq models that in principle can simulate any Turing machine).
Type-1 (context-sensitive grammars): Swiss German uses cross-serial dependencies, where you need more information about the surrounding words before rewriting. This can require more computational steps to parse.

When we cover the algebra of CCG later, come back and see how using only forward and backward application might run into trouble with Swiss German (Hint: you have to combine adjacent categories).
Type-2 (context-free grammars): a CCG becomes a pure Type-2 (context-free) grammar exactly when you allow only the two application rules and no higher-order combinators (type-raising or composition).
Type-3 (regular grammars): tokenization and simple pattern-based tagging (e.g. recognizing dates, email addresses, or part-of-speech tags using regular expressions or finite-state transducers); a small sketch of this follows below.
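As a rough illustration of the Type-3 end of that list, a pattern-based tagger can be nothing more than a few regular expressions; the labels and patterns below are invented for the sketch.

import re

# A toy Type-3 (regular) tagger: label a token as DATE, EMAIL, or WORD
# using nothing but regular expressions.
PATTERNS = [
    ("DATE",  re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("WORD",  re.compile(r"[A-Za-z]+")),
]

def tag(token):
    for label, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "OTHER"

for token in ["2024-05-29", "reader@example.com", "parser", "???"]:
    print(token, "->", tag(token))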
The Algebra of CCG
Let's say we have categories A and B. Then forward application and backward application work as follows:
- A/B says that if we have a B to the right of A/B, then the result is A.
- A\B says that if we have a B to the left of A\B, then the result is A.

In practice, A and B become parts of speech.
The algebra of CCG looks a lot like the multiplication of fractions; notice how "numerators" cancel with "denominators". However, unlike multiplication, the order matters: this algebra is not commutative. Don't remember this as a rule, but as a direct consequence of word order. This lack of commutativity is necessary to distinguish between "We go there." as a sentence and "Go we there." as nonsense.
Combining two atomic categories using \ or / (e.g. NP/N) creates a complex category that classifies a word and describes how it can be combined.
In the following illustration, "The" combines with "dog" (noun, N) to make the noun phrase (NP) "The dog". Similarly, the noun phrase "The dog" can combine with "ran" (verb, S\NP) to make the sentence (S) "The dog ran".


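Here is a minimal Python sketch of the two application rules and of the derivation in the illustration; the three-word lexicon and the plain-string atomic categories are assumptions for illustration, not how a real parser stores them.

from dataclasses import dataclass

# A sketch of CCG categories and the two application rules. Atomic
# categories are plain strings ("N", "NP", "S"); complex categories
# like NP/N and S\NP are Slash objects.

@dataclass(frozen=True)
class Slash:
    result: object   # what you get back after combining
    arg: object      # what the category is still waiting for
    dir: str         # "/" wants its argument on the right, "\" on the left

def forward(left, right):
    # Forward application:  A/B  B  =>  A
    if isinstance(left, Slash) and left.dir == "/" and left.arg == right:
        return left.result
    raise TypeError(f"cannot combine {left} with {right}")

def backward(left, right):
    # Backward application:  B  A\B  =>  A
    if isinstance(right, Slash) and right.dir == "\\" and right.arg == left:
        return right.result
    raise TypeError(f"cannot combine {left} with {right}")

LEXICON = {
    "the": Slash("NP", "N", "/"),      # NP/N
    "dog": "N",                        # N
    "ran": Slash("S", "NP", "\\"),     # S\NP
}

np = forward(LEXICON["the"], LEXICON["dog"])   # "The dog"      ->  NP
s  = backward(np, LEXICON["ran"])              # "The dog ran"  ->  S
print(np, s)                                   # NP S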
Constructing Full Sentences
Take something like the word "a"—clearly not a noun, noun phrase, or sentence, but we can describe it in those terms by saying "a" is a word that expects a noun on the right in order to become a noun phrase:
“a” = NP/N
This is how "a ball" (NP/N N → NP) becomes a noun phrase.
Do you see how we can cleverly describe articles (a, an, the) in terms of NP and N, creating a category that describes how they function and how they interact with the words around them? Why call "the" an article when we can call it a function that expects a noun in order to become a noun phrase?
We can do the same thing with verbs. To form a sentence S, we need a subject and a predicate.
- The subject (RED) does the action.
- The action is the verb.
- The predicate (BLUE) receives the action.

By splitting the sentence this way, we can see that the verb acts as a fulcrum between two necessary parts of sentence construction, so you shouldn't be surprised that the verb and adverb take a very special role in CCG: they are categories that contain the atomic category S.
We can describe a verb as something that takes a noun phrase to the left and a noun phrase to the right to become a sentence: (S\NP)/NP. No extra atomic categories needed.
"After debugging for hours" is a subordinate (dependent) adverbial clause, parsable by C&C or EasyCCG.
How This Relates to Programming
The thing I find most elegant about CCG is how it turns the verb "write" into a function (S\NP)/NP that takes a noun phrase to the left and a noun phrase to the right as input and outputs a sentence. By treating words as functions, the CCG parser type-checks a sentence the same way a compiler type-checks a program.
A dreaded TypeError will ensue if you try to make a sentence like "run write walk." It will not compile, the same way sum("word") wouldn't compile. In the first case, you put a verb where a noun phrase was expected, and in the second, you put a string where a number was expected. TypeError.
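Here is a toy Python sketch of that analogy: "write" as a curried function of category (S\NP)/NP that rejects anything other than a noun phrase. The tiny lexicon and the hand-written guards are made up; a real CCG parser does this with categories rather than custom checks.

# "write" wants an object noun phrase first and a subject noun phrase second.
NOUN_PHRASES = {"I", "code", "books"}

def write(obj):
    if obj not in NOUN_PHRASES:
        raise TypeError(f"expected a noun phrase as object, got {obj!r}")
    def with_subject(subj):
        if subj not in NOUN_PHRASES:
            raise TypeError(f"expected a noun phrase as subject, got {subj!r}")
        return f"S: {subj} write {obj}"
    return with_subject

print(write("code")("I"))      # S: I write code  -- the sentence type-checks

try:
    write("walk")              # a verb where a noun phrase was expected
except TypeError as err:
    print("TypeError:", err)   # fails to compile, just like sum("word")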
In lambda calculus, we would write:
λo. λs. write s o  -- wait for an object o, then a subject s
In CCG, every lexical item carries not only a syntactic category but also a small lambda term encoding its meaning — e.g. write may be assigned (S\NP)/NP with semantics λo. λs. write(s, o) to indicate that it first takes an object (o), then a subject (s). As you apply CCG's combinatory rules (like function application), you simultaneously apply these lambda terms, composing the meanings of words step by step into a complete logical form for the whole sentence.
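A minimal Python sketch of that pairing, with made-up names for the word meanings:

# Each word gets a meaning; the verb's meaning is a lambda term.
write_sem = lambda o: lambda s: ("write", s, o)   # λo. λs. write(s, o)
code_sem  = "code"                                # an NP meaning
i_sem     = "I"                                   # another NP meaning

# Syntax:    (S\NP)/NP combines with the object NP, then with the subject NP.
# Semantics: apply the lambda term in the same order.
vp_meaning       = write_sem(code_sem)   # "write code"   : S\NP
sentence_meaning = vp_meaning(i_sem)     # "I write code" : S

print(sentence_meaning)   # ('write', 'I', 'code')  ~  write(I, code)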
Lambda calculus is a very small formal language that does one thing: it describes how to build functions and how to run them. Everything else — numbers, Booleans, data structures, even whole programs — can be encoded in terms of these functions. As a result, the lambda calculus serves as a precise mathematical model of computation itself.
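As a tiny illustration of that claim (nothing CCG-specific), here are Church-encoded Booleans and numbers written as Python lambdas:

# Booleans and numbers encoded purely as functions (Church encodings).
TRUE  = lambda a: lambda b: a          # λa. λb. a
FALSE = lambda a: lambda b: b          # λa. λb. b

ZERO = lambda f: lambda x: x           # apply f zero times
ONE  = lambda f: lambda x: f(x)        # apply f once
TWO  = lambda f: lambda x: f(f(x))     # apply f twice

SUCC = lambda n: lambda f: lambda x: f(n(f)(x))   # successor: one more f

to_int = lambda n: n(lambda k: k + 1)(0)          # decode by counting the f's

print(TRUE("yes")("no"))    # yes
print(to_int(TWO))          # 2
print(to_int(SUCC(TWO)))    # 3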
Conclusion
The power of CCG lies in its ability to transform language into an algebraic system, providing a clear set of compositional instructions. This is extremely useful for revealing the connections between human language and formal computation. Admittedly, the CCG explained here isn't comprehensive enough to parse sentences like:
Parsing such sentences requires much more. When you try to build a comprehensive CCG system that handles real-world English at scale, you need over 1,200 different grammatical categories, revealing how much hidden complexity exists in what seems like "ordinary" language use.
Even the following derivation is a simplified model:
S
├── S
│   ├── NP  CCG
│   └── S\NP
│       ├── (S\NP)/NP  is not
│       └── NP
│           ├── NP/NP  just
│           └── NP
│               ├── NP/N  a
│               └── N
│                   ├── N/N  powerful
│                   └── N
│                       ├── N  way
│                       └── N\N
│                           ├── (N\N)/NP  for
│                           └── NP
│                               ├── NP  computers
│                               └── NP\NP
│                                   ├── (NP\NP)/(S\NP)  to
│                                   └── S\NP
│                                       ├── (S\NP)/NP  understand
│                                       └── NP
│                                           ├── N/N  sentence
│                                           └── N  structure
├── (S\S)/S  ;  (punctuation)
└── S
    ├── NP  it
    └── S\NP
        ├── (S\NP)\(S\NP)  also
        └── S\NP
            ├── (S\NP)/(S[TO]\NP)  seems
            └── S[TO]\NP
                ├── (S[TO]\NP)/(S\NP)  to
                └── S\NP
                    ├── (S\NP)/NP  mirror
                    └── NP
                        ├── NP/(S\NP)  how
                        └── N
                            ├── N/N  our
                            └── N
                                ├── N  brains
                                └── N\N
                                    ├── (N\N)/NP  process
                                    └── NP  language
At its core, CCG provides a methodical and rigorous approach to taking sentences apart, reassembling them, and ensuring grammatical consistency. All the while avoiding incomplete sentences like:

References
Wang, S., et al. (2024). Computational models to study language processing in the human brain: A survey. arXiv preprint arXiv:2403.13368. https://arxiv.org/abs/2403.13368
Lewis, M., & Steedman, M. (2014). A* CCG parsing with a supertagger and practical dynamic programming. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1787–1798).
Girard, J.-Y. (1987). Linear logic. Theoretical Computer Science, 50(1), 1–102.
Morrill, G. (1994). Categorial deduction. Journal of Logic, Language and Information, 3(3), 287–321.
Moot, R., & Retoré, C. (2012). The logic of categorial grammars: A deductive account of natural language syntax and semantics. Springer.
Jurafsky, D., & Martin, J. H. (2023). Speech and language processing (3rd ed.) [Appendix E: Combinatory Categorial Grammar]. Retrieved May 29, 2025, from https://web.stanford.edu/~jurafsky/slp3/E.pdf