Introduction to Theory of Computing
Contents
ToggleTheory of Computation is a foundational branch of Computer Science and mathematics that defines the capabilities and limitations of computational system using mathematical models. It is a domain that studies how computers work, what are the problems computers can solve, capabilities and limitations. It focuses on what can be computed (computability), how efficiently it can be solved (complexity).
It uses abstract models like Turing Machines to explore automata theory, computability and complexity. It provides the foundation for algorithms, compilers and for understanding undecidable problems.
The Three main pillars of Theory of Computing
- Language – A medium for communicating
- Automata – A mathematical model or machine
- Grammar – Rules and regulations for generating a Language.
Language is a medium for Communicating and grammar is a set of rules and regulations for constructing a language. Just like we need an understanding of grammar for creating a sentence properly in English . But before we knew grammar we had to learn alphabets like A,B,C…..X,Y,Z.
Before we move on to what a Language , Grammar or Automata are we need to understand the basics of TOC.
1. Symbol
A symbol is the smallest unit or building block of a language. It is also called a character. It can be anything like { a,b,c,0,1,2,3,4……x,y,z}.
2. Alphabet
A finite set of symbols is called an alphabet. Finite means which can be counted. Denoted by Σ. For example, if x and y are two symbols then the alphabet will be denoted by Σ={x,y} or Σ(x,y). Used to construct strings and languages.
3. String
A finite sequence of alphabets is known as a string. Denoted by w and the length of the string is denoted by |w|. For example, Σ (a,b), then strings possible will be a and b. But if the given length of the strings possible is 2, then the strings will be {aa, ab , ba ,bb}. If the given length is 3 then strings possible will be {aaa,aab,aba,abb,baa,bab,bba,bbb}.
So if the length is n , then the strings will be 2n .
Power of Σ
Lets say given an alphabet Σ(a,b)
- Σ0 = Set of all strings with length 0 (Null String). Denoted by λ or ε. These are known as Identity element.
- Σ1 = Set of all strings with length 1 which is a,b.
- Σ2 = Set of all strings with length 2 which is Σ.Σ = {a,b}.{a,b} = {aa,ab,ba,bb}
- Σ3 = Set of all strings with length 3 which is Σ.Σ.Σ = {a,b}.{a,b}.{a,b} = {aaa,aab,aba,abb,bbb,bba,bab,baa}
.
.
.
- Σ* (Kleene Closure) – Set of all strings of all length possible for certain alphabets from zero to infinite number of times, in which ε (epsilon) is also included. It is also known as Kleene Star.
- Σ+ (Positive Closure) – Set of all strings of all length possible except ε.
Language
Collection of all the strings is called a language. A language is a set of strings formed by concatenating symbols from a specific, alphabet (Σ). It is formally defined as a subset of all possible strings (Σ*) that can be generated, representing valid inputs or computational problems.
A language can be of two types : Finite and Infinite
Finite Language means that language which contains fixed number of strings. Suppose, Σ(a,b) and |w|=2, then language will be {aa,ab,ba,bb}. Here we can count the number of strings (fixed no. of strings). So it is a finite language.
Infinite Language means that language which has infinite number of strings and can not be counted. If Σ(a,b) and length is at least 1 a, then the language will be { a,aa,aaa,aabaaa,ababa,baba …….. }. In this case the strings possible can be infinite and hence it is called a infinite language.
Languages can also be classified into 4 categories based on their complexity which fall under Chomsky Hierarchy.
- Regular Languages
- Context-Free Languages
- Context-Sensitive Languages
- Recursively Enumerable Languages
We will be knowing these four categories of Languages in later part of this TOC Tutorial.
Languages define the syntax of programming languages and the inputs for automata to solve problems. Essentially, a language in TOC is a formal way to describe a set of strings that follow specific structural rules, which can be checked or generated by a machine.