Kitchen Style Tutorials: Set Theory #7

Functions, embeddings, restrictions/extensions, composites, canonical mappings, inversions, in/surjections, kernels, isomorphisms…

“There can be only one”

Functions are special kinds of Relations: those Relations that do not “split”. In our Tropical Island analogy that would mean that we are looking for only one best look-alike coconut on the second island for any coconut from the first island. The same one could be the best copy of a number of coconuts, but we must decisively pick one out of the potentially many candidates on the second island.

We can still use the Relations notation x R y, or, if you will, x F y, understanding that F is a set of ordered pairs f of elements x from set X and elements y of set Y for which condition F holds true, but with one extra restriction, which could be formulated as: if (x, y) ∈ F and (x, z) ∈ F, then y = z. Such pairs ƒ = (x, y) ∈ F are usually denoted as ƒ(x) = y. If we want to talk about a function (or, interchangeably, mapping, or correspondence, or transformation) from the WHOLE set X (dom ƒ = X) to some subset of the set Y, we may use the notation ƒ: X → Y. This notation is a bit counterintuitive and limited in its asymmetry, but it is convenient for the usual view of a Function as a thing that has an unbounded domain X but a potentially bounded range in the set Y. If we happen to work with piecewise functions, we can define a subset A ⊂ X and use the earlier notation ƒ(A), which is not perfect either, since it becomes ambiguous in case A is also a member of X (not a usual, but not an impossible, case). A more generic notation, F(X, Y), usually used for Space transformations, does not actually imply that every element x of the set X appears in some pair x F y in the usual sense. The closest approximation of the idea that a transformation may not map every element of X into Y is implemented there via a “zero” or identity element: mapping to it effectively means the absence of a mapping to a “real” element of Y.
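The uniqueness condition above is easy to check mechanically. A minimal Python sketch (the sample relations and the helper name `is_function` are ours, purely for illustration):

```python
def is_function(F):
    """Check that a relation F (a set of (x, y) pairs) never 'splits':
    if (x, y) and (x, z) are both in F, then y == z."""
    seen = {}
    for x, y in F:
        if x in seen and seen[x] != y:
            return False  # the same x maps to two different y's
        seen[x] = y
    return True

R = {(1, 'a'), (1, 'b'), (2, 'a')}   # splits at x = 1: a relation, not a function
F = {(1, 'a'), (2, 'a'), (3, 'c')}   # every x picks exactly one y (many-to-one is fine)
```

Note that two x’s sharing the same y is allowed, matching the coconut analogy: the same look-alike may serve many coconuts.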

If we want to map elements of set X into itself (i.e. look-alike coconuts on the same island), then we may call such a transformation an operator, and depict it as L: X → X, or L(X). If X is a subset of Y (for example the island-wide look-alikes of the coconuts from a particular patch), then ƒ: X → Y is called an inclusion or imbedding/embedding, or injection (operator). If y = ƒ(x) = x for all (∀) x ∈ X, such an operator is called the identity.

If we have a mapping ƒ: X → Y defined, and A is a subset of X, we may naturally define a “smaller” function g: A → Y by requiring g(x) = ƒ(x) ∀ x ∈ A. Such a function is called the restriction (of ƒ), and it is usually depicted as g = ƒ | A, so the definition above could be written as (ƒ | A)(x) = ƒ(x) ∀ x ∈ A. The function ƒ is, respectively, called an extension (of g).
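With a function stored as a Python dict of x → y pairs, the restriction ƒ | A is just a filtered copy (a sketch; `restrict` and the sample dict are our own illustrative names):

```python
def restrict(f, A):
    """Return f | A: keep only the pairs whose first element lies in A."""
    return {x: y for x, y in f.items() if x in A}

f = {1: 'one', 2: 'two', 3: 'three'}   # the extension
g = restrict(f, {1, 3})                # the restriction f | {1, 3}
```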

Because (∵) we required in the function definition ƒ: X → Y that every x ∈ X should have some mapping into Y, if we have a second function g: Y → Z, we can naturally define a third, composite function h: X → Z such that h(x) = g(ƒ(x)), which is actually a restriction of g(y), where y ∈ Y. I.e. ƒ(x) will exist for any x ∈ X, and ƒ(x) ∈ Y, therefore (∴) g(ƒ(x)) will exist as well. Other notations could be just gƒ(x), (gƒ)(x), or g∘ƒ(x). In our Island analogy, we can imagine that, if all coconuts from island A are carried away to island B and wash ashore there, and then get carried away (by a morning high tide, for example) with all the other coconuts there to the beaches of island C, then we can safely ignore their overnight stay on island B and view the island A to C transit as a function in its own standing.
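The composite h = g ∘ ƒ can be built the same way, as a dict comprehension. A sketch, assuming ran ƒ lies inside dom g (the island labels are made up):

```python
def compose(g, f):
    """Return h = g ∘ f for functions stored as dicts.
    Works whenever every value of f is a key of g (ran f ⊂ dom g)."""
    return {x: g[y] for x, y in f.items()}

f = {'a': 1, 'b': 2}          # island A -> island B
g = {1: 'X', 2: 'Y', 3: 'Z'}  # island B -> island C
h = compose(g, f)             # the overnight stay on island B disappears
```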

Even more interesting are the functions associated with the equivalence relation R we discussed in the previous chapter. A function ƒ: X → X/R that maps elements of a set X, on which the equivalence relation R is defined, onto elements of the quotient (or factor) set X/R, i.e. if a R b then ƒ(a) = ƒ(b) = a/R (or π(a)), is called the canonical mapping. In our analogy, if, for example, all roundish coconuts from the original island happen to be carried away by currents to island A, all elongated coconuts to island B, and all ridged ones to island C, then such a shape division will be an equivalence relation, the collections of coconuts of a particular shape could be seen as equivalence classes, the islands as the quotient set, and the currents would represent the canonical mapping.
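The canonical mapping π can be sketched concretely, with “same shape” as the equivalence relation (the coconut names and shapes below are invented for the example):

```python
from collections import defaultdict

# Each coconut is labeled by its shape; coconuts of the same shape are equivalent.
coconuts = {'c1': 'round', 'c2': 'elongated', 'c3': 'round', 'c4': 'ridged'}

classes = defaultdict(set)            # the quotient set X/R, keyed by shape
for c, shape in coconuts.items():
    classes[shape].add(c)

def pi(c):
    """The canonical mapping: send a coconut to its equivalence class c/R."""
    return frozenset(classes[coconuts[c]])
```

Equivalent coconuts land on the same class, so π collapses the relation exactly as the currents collapse the shapes onto islands.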

This set-theoretic approach of defining a Function as a kind of Relation, which IS a (static) set of ordered pairs x F y that doesn’t DO anything, may be dissatisfying for some mathematicians, and especially for computer users, for its lack of action. Under this approach, we can look at the usual notation y = ƒ(x) as a whole and statically: just a weird way of writing individual relations ƒ between elements x ∈ X and y ∈ Y as pairs ƒ = (x, y). Those who love dynamics rather prefer to look at a Function as some “Magic” Converter or Establisher of Correspondences ƒ that takes elements x ∈ X and DOES something with them, the results of which action, ƒ(x), are elements y ∈ Y. Then the result of that action on the whole domain X, orderly paired with the domain itself (the above-mentioned set F), is called the Graph of the Function and is depicted as Гƒ.

We can call the range of the x F y function set (the y’s for some x in x F y) an image. If we reverse the order of the pairs to (y, x) and call such a set y F⁻¹ x, we get a range set we can call the inverse image, whose elements we may depict as ƒ⁻¹(y) = {x: (y, x) ∈ F⁻¹}. Or, we can see it as the domain of F for a chosen range; then, when talking about the inverse image of some subset B ⊂ Y in set X, we can depict it as ƒ⁻¹(B) = {x ∈ X: ƒ(x) ∈ B}. Because we didn’t require uniqueness of x for a given y in the F function’s ordered pairs, the inverse image of a function is not, in general, given by a function itself, but by a mere relation. In our Tropical Islands analogy, all coconuts from island A which look like a particular chosen coconut on island B would be its inverse image.
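The inverse image ƒ⁻¹(B) = {x ∈ X: ƒ(x) ∈ B} translates into one set comprehension. A sketch (the look-alike table below is invented):

```python
def inverse_image(f, B):
    """f⁻¹(B) = {x in dom f : f(x) in B} for a function stored as a dict."""
    return {x for x, y in f.items() if y in B}

# Several island-A coconuts may share one best look-alike on island B.
looks_like = {'a1': 'b1', 'a2': 'b1', 'a3': 'b2'}
```

Note that ƒ⁻¹({'b1'}) has two elements, which is exactly why the inverse image is a relation rather than a function in general.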

We can always create a special type of functions, though, that does require such uniqueness: if ƒ(a) = y and ƒ(a’) = y, then a = a’. Such functions are called injective (one more terminology ambiguity), or one-to-one. For such functions, the inverse image ƒ⁻¹(y) under ƒ defines an inverse function, i.e. ƒ⁻¹(y) = x iff ƒ(x) = y. Also, we can define another special type of onto, or surjective, functions by requiring Y = ƒ(X) (instead of the generic ƒ(X) ⊂ Y). Which means that there is no coconut on island B that would not be a look-alike of some coconut on island A. If a function is simultaneously injective and surjective, it’s called bijective.
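These three properties are directly testable for a finite function-as-dict. A sketch (helper names are ours):

```python
def is_injective(f):
    values = list(f.values())
    return len(values) == len(set(values))   # no two x's share a y

def is_surjective(f, Y):
    return set(f.values()) == set(Y)         # every y of Y is hit

def is_bijective(f, Y):
    return is_injective(f) and is_surjective(f, Y)

f = {1: 'a', 2: 'b', 3: 'c'}
```

For a bijective f, reversing the pairs (`{y: x for x, y in f.items()}`) yields the inverse function ƒ⁻¹ without any loss.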

If a function is not surjective, ∃ (there exists) B ⊂ Y : ƒ⁻¹(B) = ∅ (i.e. unique coconuts on island B, with nothing like them on island A). What if there exist unique coconuts on island A? Usual set-theoretic notations are a bit tough for expressing that fact; the best we can come up with is: for A ⊂ X : ƒ(A) = ∅ (requiring ∅ to be an element of Y), but the more flexible Space-transformation notation has a special idea for that. It’s called the null space of a transformation T (null T(X, Y)), or the kernel of T(X, Y): kernel T = {x ∈ X: T(x) = ∅}. kernel T = ∅ (requiring ∅ to be an element of X as well) would mean that all other x of set X do have non-empty mappings to set Y.

That is not a usual set-theoretic concept, but it may be useful for understanding what we were doing when we were proving iff (if and only if) theorems, demonstrating the truth of a statement in the forward and then the reverse direction. When we work with Spaces, we often use the concept of isomorphism, which we are not going to discuss in detail, but the general idea here is that if a transformation from one space to another preserves the Structure, i.e. the Relations between elements of the spaces, we can transparently “re-label” them, and whatever observations and conclusions we make about elements and their behavior in one space (the more convenient one for observation and contemplation) will be true in the other, isomorphic space. If we talk about just sets, without relations defined over them, we’ll at least need that all elements of one set be mapped to elements of the other set and vice versa.

That is exactly what we were doing when proving iff theorems. We can envision the left part of the theorem statement as a set X specification (remember Axiom of Specification?), and the right – as a set Y specification. Then we make sure that every element of X is T-mapped into Y (i.e. null T(X) = ∅), and conversely, that every element of Y is inversely mapped into X (i.e. T(X) is onto). Therefore, the left and the right sides of the statements are just different labels of the same thing.



Kitchen Style Tutorials: Set Theory #6


Alone as Islands in the Ocean…

So far we were talking about Sets as collections of isolated objects, like Islands in the Ocean, maybe aggregating into Archipelagos, but still unconnected and unrelated to each other. That is quite a boring picture, and actually an unrealistic one. There are birds flying from Island to Island, there are ocean currents bringing coconuts, branches, and seeds from trees growing on one Island to the shores of others, not even mentioning ocean animals cruising the waters around Archipelagos, or traveling half a Globe from one to another. And deep down at the Ocean-floor level, we can see these Islands are really just one thing: the Earth’s Crust, having ridges and spouts here and there.

It’s a whole Space of Interconnected and Interrelated entities, whatever way we look at it. We already mentioned that Mathematics is based on the Real World and the ways we, humans, perceive it. Relations between objects are what we see everywhere (even if “really” there are none); it is the nature of the human mind. Hence, how is it useful, or even possible, to have such mathematical abstractions as Sets, which have only one type of relation (belonging) that exists only between Somethings (Set elements) and Nothings (Sets themselves, as we defined them in the Axiom of Extension)? Let’s not rush to answer this question yet; instead, let’s fix this obvious shortcoming of Set Theory and define what Relations are.

The surprising answer to the question “What is a Relation?” would be “Don’t know, don’t care (that we don’t know)”. As previously, when we were talking about Sets themselves, we are rather concerned with the ways we can represent those innate concepts as Sets, and now, Relations. If a Relation is a “thing” between two objects x and y that could be held either as True or False, then we can represent it as an Ordered Pair:

r = (x, y)
(as we remember, Ordered Pair means (x, y) ≠ (y, x))

Then relations between elements x belonging to set X, and elements y belonging to set Y, could be represented also as a Set of Ordered Pairs R, and particular relation (x, y) belonging to R could be depicted as:

x R y

and we can say that set R is a relation from set X to set Y.

In our Tropical Islands analogy, we may define set X as all coconuts that have grown on one island, and set Y on another, and the innate relation we want to represent, say, is the look-alikeness relation. Everyone has a feeling for what it means that one object looks like another (though particular judgments may vary from person to person), so a set of pair-wise observations that a coconut from island X looks like another from island Y would be the relation R from X to Y. One coconut from island X may look like many coconuts on island Y, and one coconut on island Y may look like many coconuts on island X; hence a Relation is a many-to-many “thing” in the general case, and is well suited to represent such equations as x² + y² = a, which are not functions (ok, we haven’t defined what all these things are yet, but your innate mathematical understanding is enough to make it clear).

All coconuts from island X that are present in the set of relations R could be named the domain, while all coconuts from island Y found in the set R are the range:

dom R = {x: x R y for some y}

ran R = {y: x R y for some x}
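With a relation stored as a set of pairs, the domain and range fall out of two comprehensions. A sketch (the sample pairs are invented):

```python
def dom(R):
    """dom R = {x : x R y for some y}."""
    return {x for x, y in R}

def ran(R):
    """ran R = {y : x R y for some x}."""
    return {y for x, y in R}

R = {('x1', 'y1'), ('x1', 'y2'), ('x2', 'y1')}  # many-to-many is allowed
```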

We usually agree that a particular object (or coconut) looks exactly like itself, and that holds true for each and every coconut (at least at a particular moment). Such a relation x R x that holds true for all (depicted as ∀) x ∈ X ∪ Y is called reflexive. Also, if one coconut looks like another, that implies that the other looks like the former: x R y ⇒ y R x (∀(x, y) ∈ R). Relations possessing that quality are called symmetric. If we have three coconuts, that usually means that if coconut a looks like b, and b looks like c, then a also looks like c: a R b ∧ b R c ⇒ a R c. Such a relation is called transitive.
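All three properties can be checked by brute force on a finite relation. A Python sketch (the tiny sample relations are ours):

```python
def is_reflexive(R, X):
    return all((x, x) in R for x in X)

def is_symmetric(R):
    return all((y, x) in R for x, y in R)

def is_transitive(R):
    # whenever a R b and b R c, demand a R c
    return all((a, d) in R for a, b in R for c, d in R if b == c)

X = {1, 2}
R_eq = {(1, 1), (2, 2), (1, 2), (2, 1)}   # all three properties hold
R_lt = {(1, 2)}                            # "less than" on X
```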

From our innate experience, we know that not all relations are such. If set X is coconut palms and set Y is coconuts, and the relation between them is the relation of “growth”, then such a relation is not reflexive (a coconut or palm does not grow on itself), not symmetric (if a coconut grew on a palm, that does not mean that the palm grew on that coconut), and not transitive (even if a second-generation coconut grew on a palm that came out of a first-generation coconut, it still did not grow on the original first-generation palm).

We can envision Relations in an even more tangible way, as coconuts floating from one island, X, to another, Y, where the places of palm growth would be elements of these island-sets. Although some particular coconuts may float off and then back to the same palm, or float from island X to Y and back to the same palm-growth place, or float to place c while others hop to b and then to c, those conditions won’t be satisfied for each and every coconut, so such a relation will be neither reflexive, nor symmetric, nor transitive. If we prefer to look at more respectable, mathematics-like examples with real numbers, then the circle equation above is symmetric but isn’t reflexive or transitive. “Greater” or “less” relations are transitive but aren’t reflexive or symmetric. And |x – y| < 1 is reflexive and symmetric but isn’t transitive.

Relations “from X to X” (also called “in X”) that are reflexive, symmetric, and transitive have a special place and are called equivalence relations. For all y for which x R y holds true, these relations create so-called equivalence classes with respect to x (depicted as x/R, or π(x)), which are either disjoint or equal *. The collection or set C of these equivalence classes, depicted as C = X/R, could be called a partition, or quotient, or factor set. On the other hand, we can say that a partition C induces an equivalence relation R in set X, or R = X/C **.

* In a more formal way, we want to prove that if a, b ∈ X and a/R ∩ b/R ≠ ∅, then a/R = b/R. Let’s pick an arbitrary c that belongs to the intersection of the two equivalence classes: c ∈ a/R ∩ b/R. The belonging of c to a/R means a R c, and, likewise, the belonging of c to b/R means b R c. Let’s pick an arbitrary x that belongs just to a/R, which means a R x. From symmetry: a R c ⇒ c R a, and from transitivity: c R a, a R x ⇒ c R x. Also from transitivity, b R c and c R x ⇒ b R x, which means that an arbitrary x belonging to a/R always belongs to b/R, or a/R ⊂ b/R. Applying the same logic to an arbitrary x belonging to b/R, we get b/R ⊂ a/R, so a/R = b/R. Also, by definition, because every x belongs to some equivalence class, ∪ x/R = X. I.e., indeed, the whole set X is partitioned into a collection of disjoint equivalence classes.

** Here we want to show, which is quite obvious, that when we partition set X into a set of subsets C, with each of these subsets having elements x, y, z, …, and we can come up with a relation R (which we indeed always can: for example, the relation of belonging to the same subset) that guarantees reflexivity, symmetry, and transitivity between them, then each such subset is an equivalence class, and all of them together form a quotient set.
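The disjoint-or-equal claim from footnote * can be checked concretely. A small Python sketch, using “same parity” as the equivalence relation on the first ten naturals (names `cls` and `classes` are ours):

```python
X = set(range(10))

def cls(x):
    """x/R: the equivalence class of x under 'same parity'."""
    return frozenset(y for y in X if y % 2 == x % 2)

# The quotient set X/R: collecting x/R over all x leaves just two classes.
classes = {cls(x) for x in X}
```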

The latter fact is quite dramatic: it means that we don’t deal with pure sets; we always work with spaces. When we carve out subsets from the Universe, in accordance with the Axiom of Specification, we willingly or unwillingly partition it, and that partitioning is guided by our perception, and implicitly or explicitly that perception imposes or induces relations (at least the equivalence ones) between elements of the newly created sets. There may exist whole, clean sets out there in the Universe, unpartitioned by our perception, but we don’t have access to them. I guess we, humans, are not upset about that, because we always look for relations between things 🙂

Natural Language Processing examples

An obvious example of a relation between words in a whole sentence is their order. We can represent the order either in the form of immediate ordered pairs or in longer ordered tuples. It could be argued that the majority of syntactic (and corresponding semantic) structures can be represented naturally via binary relations, and those structures that do not naturally map to them (like enumerations or compound words) could still be technically represented as recursive pairwise structures without much structural information loss. If we don’t agree with that, we may still use three-, four-, etc.-wise relations.

Let us start building on the “Dog Bites Man” (usually) example from previous chapters, and extend it to the sentence S1 = “Agitated Brown Dog Bites Tall Man In Large Park”. Such a concept-level statement looks a bit awkward (a daring hasty generalization or unfounded assumption is weird), but let’s stick to simpler grammatical forms to avoid dealing with stop-words and stemming. Should we break up the sentence into a set of ordered pairs, or treat it as one whole long ordered tuple? Or break it up into smaller tuples, still longer than pairs? The answer(s) may be not so obvious when looking for them from the inside of an analytic language such as English.

It doesn’t matter what formal instruments a particular language uses; the ultimate goal is to express a partitioning of a sentence, for example, into Theme (what we speak of) and Rheme (what we speak about it), or, on a finer level, into Subject, Verbal Group, Direct and Indirect Objects, Adverbial Clauses, etc. Purely Synthetic/Flective languages use a Case System to bind nouns, adjectives, and participles into Subject, Object, and Adverbial Clauses and to delineate them from each other. For example, the Accusative Case would mark a Direct Object, the Dative an Indirect one, and so on. The order of all these words (nouns, adjectives, participles) is irrelevant, and they may be intermixed with each other arbitrarily, but knowing their cases (via inflection, for example), we can easily find out to which partition of the sentence they belong.

As we mentioned before, any partitioning or subsetting of a set implies the existence of an Equivalence Relation between members of a subset. The Case partitioning, being more fundamental (at least for Indo-European languages), implies that we have to look for a similar equivalence relation in Analytic languages. Strong Greater or Less relations, applied to the whole sentence set, are obviously not good for that, because they are neither reflexive nor symmetric. What we can do for an analytic language is create an ordered subset of delineating words (in our case S2 = (Bites, In, .), then S3 = S1 – S2), and use the relation R = “in between the same delineating-word ordered pairs (listed in set Q = {(∅, Bites), (Bites, In), (In, .)})” to partition the sentence: S3/R = {{Agitated, Brown, Dog}, {Tall, Man}, {Large, Park}} (which could be implemented as an intersection of the Less relations relative to the delineating words).
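The delimiter-based partition above can be sketched in a few lines of Python (we append the final period explicitly, since the text’s Q includes the pair (In, .)):

```python
# Partition the sample sentence by the delineating words S2 = (Bites, In, .).
S1 = "Agitated Brown Dog Bites Tall Man In Large Park .".split()
delimiters = {"Bites", "In", "."}

partition, current = [], []
for word in S1:
    if word in delimiters:
        partition.append(set(current))  # close the current equivalence class
        current = []
    else:
        current.append(word)
```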

Does the order-based relation partitioning have its useful place in NLP? Of course it does. We mentioned before that not only do relations define a partitioning, but the opposite is also true: the very partition of a set implies there exists a relation defining that partition. For example, suppose our sentence starts with “Monday Morning Agitated Brown Dog…”. There is no helper word here to delineate the Adverbial from the Subject; still, we can partition that fragment, for example using collocation dictionaries, and recreate the elusive relation.

Yet another thing we could be interested in is finding relations between members of the quotient set S3/R = {{Agitated, Brown, Dog}, {Tall, Man}, {Large, Park}}. What can help us in that is a mapping to its dual set Q = {(∅, Bites), (Bites, In), (In, .)}, which we used to partition the original set S3. And about that stuff of functions and mappings, we’ll talk in the next chapter…


Halmos, Paul R., 1960: Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., repr. 2017, Mineola, New York: Dover Publications, Inc.

Mendelson, Bart, 1975: Introduction to Topology. Boston: Allyn and Bacon, first edition 1962, second edition 1968, repr. 1990, New York: Dover Publications, Inc.

Pinter, Charles C., 1971: A Book of Set Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, repr. 2014, Mineola, New York: Dover Publications, Inc.


Kitchen Style Tutorials: Set Theory #5

What exactly box are you thinking outside of?

If you haven’t noticed (likely you have, just have forgiven it), we weren’t rigorous in our definitions of the union of sets, ∪ = {x: x ∈ X for some X in C}, or the intersection of sets, ∩ = {x: x ∈ X for all X in C}, which do not explicitly acknowledge the Axiom of Specification in the form ∪ = {x ∈ S: x ∈ X for some X in C} or ∩ = {x ∈ S: x ∈ X for all X in C}. Of course, we can imply that the sets X in the collection C were defined rigorously, according to the Axiom of Specification, as X = {x ∈ S: some condition}, and get away with that slightly sloppy union or intersection definition.

If we don’t imply these Axiom of Specification things, we are again prone to things incomprehensible to the usual human mind. For example, what if the collection C is the empty set ∅? What is ∪ = {x: x ∈ X for some X in ∅}, or ∩ = {x: x ∈ X for all X in ∅}? What are these things that belong to other things which belong to Nothing? For the union, the answer is still sane: there is no X in ∅ to serve as a witness, so no x qualifies, and ∪ ∅ = ∅. For the intersection, let’s try to answer from the other side: which things do not comply with our definition, and the rest will be our answer. So, there should be some x that does not belong to some X belonging to the empty set. But nothing belongs to ∅, therefore there are no such x’s. Therefore any and every x of the Universe, or even more than the Universe, belongs to the intersection over the empty collection. Ok, AI or ETI, can you explain this thing to humans? No? Then we are back to our human-intelligence-safe Axiom of Specification, which says that all members of the sets we are talking about are members of some superset S, therefore ∩ ∅ = S (while ∪ ∅ stays ∅). So we are fine and sane again.
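This convention is easy to see in code. A sketch with `functools.reduce`, where the seed value plays the role of the Axiom of Specification’s superset S for the intersection (the sets below are made up):

```python
from functools import reduce

S = {1, 2, 3, 4}   # the superset fixed by the Axiom of Specification
C = []             # an empty collection of subsets of S

# With no sets in C, reduce simply returns its seed: the intersection,
# seeded with S, stays S; the union, seeded with the empty set, stays empty.
inter = reduce(set.intersection, C, S)   # ∩ ∅ = S (inside S)
union = reduce(set.union, C, set())      # ∪ ∅ = ∅
```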


So far, in the Unions and Intersections section, we were talking about creating our “chimera” sets by joining (one or another (inner or outer) way) original “genuine” sets we carved out of the Universe. What about other ways of creating “chimera” sets? For example by separating (in some ways) parts of the original “genuine” sets?

Yes, we can define the difference set (or operation over sets, though we haven’t defined yet what an operation is, and we’ll just rely on the innate understanding of that word so far), known also as the relative complement, which we can denote as A – B = {x ∈ A: x ∉ B}. In this notation B is not necessarily a subset of A (B ⊂ A); it could be that B ⊄ A. We can mention here some easily provable facts (in the sense of the Latin root of the word: to make, compose, construct), and let the reader prove them:

A ⊂ B iff A – B = ∅

A – (A – B) = A ∩ B

A ∩ (B – C) = (A ∩ B) – (A ∩ C)

We can also define the symmetric difference, or Boolean sum: A + B = (A – B) ∪ (B – A), which, as could be easily seen (and is left for the reader to prove), is commutative, A + B = B + A, and also A + ∅ = A and A + A = ∅. The symmetric difference is also associative, A + (B + C) = (A + B) + C, which is a bit less transparent. The left part of the equation boils down to A + (B + C) = (A – ((B – C) ∪ (C – B))) ∪ (((B – C) ∪ (C – B)) – A), where (B – C) ∪ (C – B) means that an arbitrary x belongs either to B or to C, and not to both, while A – ((B – C) ∪ (C – B)) means that x belongs either to A alone or to all three of A, B, and C; also, ((B – C) ∪ (C – B)) – A means that x belongs either to B or to C, and at the same time not to A. A careful unwinding of the right side (which is left to the reader) will lead to the same result ■. Actually, often in Set Theory graphic proofs via Venn diagrams (which the reader has already heard of or seen in every place mentioning Set Theory, therefore we don’t reproduce them here) could be faster and easier to understand 🙂
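Python’s `^` operator on sets is exactly this symmetric difference, so the identities above can be spot-checked directly (the three sample sets are arbitrary):

```python
A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}

# A + ∅ = A and A + A = ∅, written with Python's symmetric-difference operator.
identity_laws = (A ^ set() == A) and (A ^ A == set())

# Associativity: A + (B + C) = (A + B) + C.
lhs = A ^ (B ^ C)
rhs = (A ^ B) ^ C
```

Here B ^ C = {2, 5}, and both sides come out to {1, 3, 5}, the elements belonging to an odd number of the three sets.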

We can also assume that set A (and any other set in this paragraph) belongs to a superset S: A ⊂ S, call the complement S – A absolute, denote it as A’, or C(A), and come up (literally make it up :)) with some easily provable facts:

(A’)’ = A

∅’ = S and S’ = ∅

A ∩ A’ = ∅ and A ∪ A’ = S

A ⊂ B iff B’ ⊂ A’

and the most useful facts in everyday life are called the De Morgan laws:

(A ∪ B)’ = A’ ∩ B’ * (or in the collection, or family, form: (∪ {X: X ∈ C})’ = ∩ {X’: X ∈ C})

(A ∩ B)’ = A’ ∪ B’ (or in the collection, or family, form: (∩ {X: X ∈ C})’ = ∪ {X’: X ∈ C})

* Let’s prove the first law for two sets. The left side means that an arbitrary x belongs to S and does not belong to either A or B, which means x does not belong to A and does not belong to B, which is what we see on the right side; therefore we can say at least (A ∪ B)’ ⊂ A’ ∩ B’. But what about the right side? Does something from there not belong to the left side? The right side means that x belongs to S and does not belong to A and does not belong to B, or in other words, doesn’t belong to either one, i.e., at least (A ∪ B)’ ⊃ A’ ∩ B’. Does the left side contain anything that is not on the right side? We have already proven it does not, therefore (A ∪ B)’ = A’ ∩ B’ ■. Why are we doing these things, going from left to right, then from right to left? We have already mentioned that Set Theory will actually tell us why, and that is related to isomorphism (which we haven’t defined yet); here, in particular, we want to make sure that the Null Space (Null T) is ∅, and that T is onto. What all that is, we will discuss later, so stay tuned 🙂
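Both De Morgan laws can be sanity-checked on concrete sets, with the absolute complement taken inside an explicit superset S (the sets below are arbitrary):

```python
S = set(range(8))               # the superset that fixes what "complement" means
A, B = {0, 1, 2, 3}, {2, 3, 4, 5}

def complement(X):
    """Absolute complement X' = S - X."""
    return S - X

demorgan1 = complement(A | B) == complement(A) & complement(B)
demorgan2 = complement(A & B) == complement(A) | complement(B)
```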

Examples in Natural Language and Artificial Intelligence Domains

The Union or Intersection over the Empty Set idiosyncrasy seems like an unimportant peculiarity, but if we look at Unions as ways of creating new texts, the Empty Set as a starting point means the beginning of a conversation. Since the Intersection over the Empty Set is everything in the Universe (well, everything in S), anything is good enough for starting a dialog. That’s quite an interesting point for real, agency-full AI development.

Complements so far look like the most important set-theoretic feature for agency-full AI. It’s really amazing: you want to create a new subset out of the other ones, and “accidentally” you create another, “delta” set you didn’t intend, but it is a text with its own meaning. Of course, we can ignore and throw away that unintended text, but we can also learn from it. From word2vec models we’ve learned there is a delta vector between the words “King” and “Queen”, and not only between the words, but between the texts associated with them; even more, a similar delta vector is seen between texts associated with other gender-related actors: female and male Presidents, PMs, pilots, etc…

When subtracting set “King” from set “Queen” (which of course have a common Intersection Set with such elements as {“power”, “will”, “respect”, …}), we’ll get a Relative Complement {“beauty”, “charm”, “intrigue”, …}, while doing the opposite subtraction we’ll get the other part of the Boolean Sum set: {“hunting”, “fight”, “sword”, …}, which would represent the just-learned concepts of Femininity and Masculinity.

For our “Dog bites man”/“Man bites dog” example, one Relative Complement would be A – B = {{dog}, {bites, dog}}, while B – A = {{man}, {bites, man}}. That means that Relative Complements (or the Boolean Sum) allow us to learn about the difference in relations in these two sentences.

It is Complements that are the really new, i.e. unexpected and unintended, sets, which we really want (and fear) from agency-full AI.





Kitchen Style Tutorials: Set Theory #4

Unions and Intersections of Sets

Can we really do anything?

When we introduced the Axiom of Specification, we allowed ourselves to create new sets from the ones we can already find out there in the Universe. When we introduced the Axiom of Pairing, we agreed we can combine those (sub)sets as elements into other sets (of a higher hierarchy). But what about combining elements of (sub)sets into new sets, effectively “melting” their boundaries? Or, by Descartes’ analogy, creating chimeras from the parts of existing animals? It is quite a debatable question whether chimeras exist in the Universe; therefore, we again have to make a voluntary decision that they do exist, in the Axiom of Unions (or we can call it the Axiom of Chimeras), which says that for every collection of sets C there exists a set that contains all the elements which belong to at least one set of the collection.

We depict the resulting “chimera” set as ∪ = {x: x ∈ X for some X in C}, or we can use shorter notations like:

∪ C, or ∪ {X: X ∈ C}, or ∪_{X ∈ C} X, or ∪ {X: X ∈ {A, B, …, D}} = A ∪ B ∪ … ∪ D

From the definition above, there follow immediately such equations as:

∪ ∅ = ∅ and ∪ {A} = A

Intuitively easily understandable properties of set unions, which may still require simple proofs, are:

A ∪ ∅ = A *

A ∪ B = B ∪ A (commutativity)

A ∪ (B ∪ C) = (A ∪ B) ∪ C (associativity)

A ∪ A = A

A ⊂ B iff A ∪ B = B (or A ⊂ B ⇔ A ∪ B = B ) **

Those simple proofs, following from the definition of the ∪ set, A ∪ B = {x: either x ∈ A or x ∈ B}, would go like these two for the first and last properties (the rest we leave to the reader):

* We have to prove that the left and right sides of the equation A ∪ ∅ = A are indeed equal. Let’s take an arbitrary x belonging to the set A ∪ ∅ on the left side. That “belonging” means that x either belongs to A or doesn’t exist. An arbitrary x belonging to the set on the right side belongs to the set A. Therefore (or ∴), since any arbitrary x on the left side is present also on the right side, all x that belong to the left side belong to the right side, and vice versa. QED (quod erat demonstrandum: which was to be demonstrated), or the notation ■.

** By the definition of A ⊂ B, if x ∈ A, then also x ∈ B. Then an arbitrary x belonging to the left side, A ∪ B, belongs either to A (and also to B, due to the A ⊂ B premise) or to B. An arbitrary x belonging to the right side belongs to B. I.e. both sides are equal, therefore A ⊂ B ⇒ A ∪ B = B. Conversely, if both sides of the expression A ∪ B = B are equal, that means that every x belonging to A also belongs to B, which means that A ⊂ B, i.e. A ⊂ B ⇐ A ∪ B = B. Now we have proved A ⊂ B ⇔ A ∪ B = B in both directions ■. Why do we do that in both directions? The very Set Theory will tell us later; for now, we can just throw in the word isomorphism, and leave the reader guessing what that is 🙂

If we substitute the word “some” with “all” in the union definition, we’ll get another type of “chimera” set, the intersection ∩ = {x: x ∈ X for all X in C}, which could be denoted as ∩ C, or ∩ {X: X ∈ C}, or ∩_{X ∈ C} X, or ∩ {X: X ∈ {A, B, …, D}} = A ∩ B ∩ … ∩ D, with similar (or, one can say, for some of them, “anti-similar”) qualities:

A ∩ ∅ = ∅

A ∩ B = B ∩ A (commutativity)

A ∩ (B ∩ C) = (A ∩ B) ∩ C (associativity)

A ∩ A = A

A ⊂ B iff A ∩ B = A (or A ⊂ B ⇔ A ∩ B = A )

If A ∩ B = ∅, such sets are called disjoint. There are also a couple of useful properties involving both union and intersection, which are called the distributive laws:

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) ***

A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

*** The first one we can prove similarly to the proofs above: if an arbitrary x belongs to the left side, it belongs both to A and to B ∪ C, i.e. to A and to either B or C. Then, if x belongs to A and B, it belongs to the right side, or, in the only other case, if it belongs to A and C, it still belongs to the right side. Conversely, if x on the right side belongs to A ∩ B or to A ∩ C, it always belongs to A and to either B or C, which is exactly the left side ■.
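Both distributive laws can be verified by brute force over every triple of subsets of a tiny universe – again a sanity check rather than a proof (the three-element universe here is an arbitrary choice):

```python
from itertools import combinations

U = {1, 2, 3}
# all 8 subsets of U, including ∅ and U itself
subsets = [set(c) for r in range(len(U) + 1) for c in combinations(U, r)]

for A in subsets:
    for B in subsets:
        for C in subsets:
            assert A & (B | C) == (A & B) | (A & C)   # ∩ distributes over ∪
            assert A | (B & C) == (A | B) & (A | C)   # ∪ distributes over ∩
```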

The other proofs, not presented here, are again left for the reader to practice, following the proof templates above. Do we have any quirkiness with the definitions we came up with, so usual for deep dives into Set Theory, which only AI or ETI can comprehend? Let’s see in the next chapter…

Examples in Natural Language and Artificial Intelligence Domains

Fascinating: we can just grab arbitrary fragments from any texts, paste them together, and get nothing less than a text. You, and me, and everybody sometimes (or maybe more often than sometimes, but we won’t tell anyone, he-he) as students used to cut and paste pretty much random fragments from pompous books of wise authors to quickly bake an essay for a hated mandatory class, and usually got away with that, even occasionally getting high grades for such a Kunststück – a peculiar craft.

But just pause, and think about it… You take a representation of Thought – something unique, complex, carefully crafted – carelessly butcher it, mix it up with other remains of what used to be Thoughts, and voilà – here it is – a brand new shining Thought. Isn’t that remarkable, surprising, …and scary?

In the utilitarian sense, for ChatBots, and generally for Natural Language Generation applications (and here we step into the AI domain in its true meaning – the meaning of agency), that means we do not need to make up Texts – it is enough to combine them from fragments of the existing corpora. Yes, that’s all there is to it – the innate concept of Creativity we have such admiration for is no more than Unions of Subsets, in set-theoretic terms, applied to the agency-full AI.

As for our previous example with the “Dog bites man” and “Man bites dog” sentences, the relation-full set representation would be (we don’t even do lemmatization here), for the first sentence: A = {{dog}, {man}, {bites}, {dog}, {bites, dog}, {bites, dog, man}}, for the second one: B = {{dog}, {man}, {bites}, {man}, {bites, man}, {bites, dog, man}}, and for the Union: A ∪ B = {{dog}, {dog}, {man}, {man}, {bites}, {bites, dog}, {bites, man}, {bites, dog, man}}. The Union doesn’t have the dog-man order – the counts for them are the same – 3. Or, if we write the same set in a more structured way:

A ∪ B = {{dog}, {bites}, {man},    
         {dog,   bites},
                {bites,   man},
         {dog,   bites,   man}}

we will see that the order (and the relations it represents) is bidirectional. I.e. our Union, which represents text Generation and Generalization of the original texts, tries to say that either active-passive relation is possible, while what is more important is the biting action itself. This observation is quite philosophical, and impressive for such a simple set-theoretic representation and operations.

Yet another bizarre, and, again, scary thing about intersections of texts is that they exist at all. Again, stop and think – how in the world can all these unique and personal things, Thoughts (and their duals – Texts), have common parts, even if only empty sets? Nevertheless, they do, and we can exploit that fact for practical purposes – finding commonalities between texts (otherwise that would be hopeless). And, again, that is all we need for another innate, opaque human concept held in high reverence – Search – in application to AI.

Again, for our “Dog bites man”/”Man bites dog” example, the Intersection would be: A ∩ B = {{dog}, {man}, {bites}, {bites, dog, man}}. The Intersection, like the Union, doesn’t have an order – all word counts are the same – 2, but now the relation it represents is not directional at all:

A ∩ B = {{dog}, {bites}, {man},  
         {dog,   bites,   man}}

Which means the text we just generated is indeed a generic Search request text – we are interested in any texts mentioning dogs, men, and their biting relation, regardless of its direction.
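The union and intersection examples above can be reproduced with Python frozensets. The helper `represent` is my own name for building the representation written out in the text – each word as a singleton set plus the cumulative prefix sets; note that a real set collapses the duplicate singletons, which is exactly the kind of sloppiness the PS below warns about:

```python
def represent(sentence):
    """Set-of-sets representation of a sentence: singletons + prefixes."""
    words = sentence.lower().split()
    rep = {frozenset({w}) for w in words}          # singleton word sets
    for i in range(2, len(words) + 1):             # cumulative prefix sets
        rep.add(frozenset(words[:i]))
    return rep

A = represent("Dog bites man")
B = represent("Man bites dog")

union = A | B          # "generation": both biting directions are possible
search = A & B         # "search": direction-free commonality

assert frozenset({"bites", "dog"}) in union        # dog-first ordering
assert frozenset({"bites", "man"}) in union        # man-first ordering
assert search == {frozenset({"dog"}), frozenset({"bites"}),
                  frozenset({"man"}), frozenset({"bites", "dog", "man"})}
```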

PS: There is actually some sloppiness, related to “uniquelessness”, lurking in the examples, which we’ll talk about later; for now it’s left for readers to figure out…



Halmos, Paul R., 1960: Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., repr. 2017, Mineola, New York: Dover Publications, Inc.

Mendelson, Bart, 1975: Introduction to Topology. Boston: Allyn and Bacon, first edition 1962, second edition 1968, repr. 1990, New York: Dover Publications, Inc.

Pinter, Charles C., 1971: A Book of Set Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, repr. 2014, Mineola, New York: Dover Publications, Inc.

Posted in Mathematical Ruminations | Tagged , , , , | Leave a comment

Kitchen Style Tutorials: Set Theory #3

Sets of Sets

You know you are Center of the Universe, don’t you?

When we introduced the Axiom of Specification, the consequences were quite drastic. We said that we may only create new sets from sets already existing out there, in the Universe, and we explicitly prohibited the existence of the Set of All Sets. But what about other Sets of Sets Which Are Not All Sets? Losing such a potentially convenient concept – a set being a member of another set – which we promised initially, would significantly impoverish Set Theory. We cannot produce sets of sets from the Whole Universe due to the Axiom of Specification. We can argue that there exist “things” in the Universe we may create sets from, but what about sets of sets? Do they really exist in the Universe (except the Universe itself, which we prohibited viewing as a set)? A Smaller-Than-Universe Set of Sets is an abstract, debatable concept, hardly perceptible directly, whose acceptance relies on the will of a particular person. Therefore, to avoid potential arguments, we have to come up with another Axiom that would build up sets of sets from the opposite, bottom-to-top side.

It’s called the Axiom of Pairing, and it declares that for any two sets there is a set those sets belong to. Or, unofficially, we can call it the Axiom of the “Universe Thickness” – basically what it says is that, though we can’t have the Whole Universe Set, the Universe is so Vast and so Dense that for any Subsets we are able to come up with, it will contain a Set those Subsets belong to. Is that easily comprehensible by a human (at least the regular one)? Hardly. So, again, let’s wait for AI or ETI to comprehend that and explain it to us.

Meanwhile, while we wait on AI and ETI, we may discuss a few things understandable by humans. Because of the Axiom, if a and b are sets, we can indeed write {x ∈ A: T(x)} (before, we weren’t aware of the existence of set A), where the generating function T(x) holds true for x=a or x=b, or maybe some other x. As a particular case we can build the subset P = {x ∈ A: x=a or x=b} (or {x ∈ A: x=a ∨ x=b}) and call the set P={a,b}={b,a} a pair (the unordered one – a concept we, again, do not define yet). Or, we can build the subset S = {x ∈ A: (x=a or x=b) and a=b} as a special case of the pair P, and call S={a,a}={a} a singleton. Those definitions explain such Set properties as “orderlessness” and “uniquelessness”, which we already briefly mentioned. Also, we can see a relation between the very relations of “belonging” ∈ and “inclusion” ⊂: if we define B = {x ∈ A: x=a}, that means B ⊂ A, or B = {a} ⊂ A when a ∈ A.

Ordering Sets

However, what if we want to order unordered Sets? Should we define a new concept of Order? Not necessarily. For example, take set A = {a, b, c, d}, which, by the Axiom of Extension, is exactly the same as {d, c, b, a} or any other permutation. What if we want to deal with only one ordered permutation, which we denote by round brackets, for example, A’ = (c, a, d, b)? Now, when we have defined the Axiom of Pairing and allowed Sets of Sets to exist, we can associate with our set A a set C of subsets of A – still unordered, but composed by going from one side of the ordered collection of elements of set A and including, step by step, the elements occurring up to the current position in the ordered collection, such as: C = { {d, c, a}, {c}, {c, a, b, d}, {a, c} }. The number of occurrences of a member of set A across these subsets then defines its position in the ordered list A’.
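The encode/decode round trip above can be sketched with Python frozensets: encode the ordered tuple A’ = (c, a, d, b) as the (unordered) set C of its cumulative prefixes, then recover the order purely from occurrence counts. The function names here are my own, not standard:

```python
def encode(ordered):
    """The set C of cumulative prefix sets of an ordered tuple."""
    return {frozenset(ordered[:i]) for i in range(1, len(ordered) + 1)}

def decode(prefix_sets):
    """An element occurring in more prefix sets comes earlier in the order."""
    counts = {}
    for s in prefix_sets:
        for x in s:
            counts[x] = counts.get(x, 0) + 1
    return tuple(sorted(counts, key=counts.get, reverse=True))

C = encode(("c", "a", "d", "b"))
# same collection as { {d,c,a}, {c}, {c,a,b,d}, {a,c} } in the text --
# the subsets themselves are unordered
assert C == {frozenset({"c"}), frozenset({"c", "a"}),
             frozenset({"c", "a", "d"}), frozenset({"c", "a", "d", "b"})}
assert decode(C) == ("c", "a", "d", "b")
```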

The remarkable thing about this representation of Order is that all sets in it remain unordered. Not that we may want to replace our innate human understanding of what Order is by this formalization of combined Set Theory concepts, but the demonstration that basic ontological concepts, easily and implicitly understandable by humans, can be mapped into basic concepts of another space that we as humans obviously have problems grasping (look at those Paradoxes and Axioms of Set Theory) is astonishing. Apparently, computationally and algorithmically (AI, are you there?) it would be easier to work with sets than with the innate, but, because of that, difficult-to-formalize concepts. That makes us guess that maybe we won’t be hugely successful in formalizing human ontologies (because of the blind spot of their innate nature), but instead we may be quite successful in formalizing non-human perception ontologies (and vice versa), for the mutual benefit of humans, AI, and ETI.

The alien nature of some Axiomatic Set Theory concepts may be seen in some mathematicians calling them pathological and hoping to get rid of them, while others, who embraced those concepts, still use them where they are needed and quickly forget them afterward. Axiomatic Set Theory, using the above-mentioned Order representation, does equate set A’ with collection C: A’ = (c, a, d, b) = { {d, c, a}, {c}, {c, a, b, d}, {a, c} }, and mainly uses it for the concept of ordered pairs: OP = (a, b) = { {a}, {a, b} }, in order to show that, unlike the unordered pairs of the Axiom of Extension, the ordered pairs (a, b) and (x, y) are equal iff a = x and b = y. After achieving that goal, this Order representation gets abandoned and replaced by the ≤ relation. The reason is that this Order interpretation leads to the uncomfortable, “pathological” consequence: {a, b} ∈ (a, b).
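The ordered pair OP = (a, b) = { {a}, {a, b} } can be sketched directly with frozensets; the equality behavior and the “pathological” membership both fall out:

```python
def pair(a, b):
    """The set-theoretic ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

assert pair(1, 2) == pair(1, 2)                   # equal iff components match
assert pair(1, 2) != pair(2, 1)                   # order now matters
assert frozenset({1, 2}) in pair(1, 2)            # the pathology: {a, b} ∈ (a, b)
assert pair(1, 1) == frozenset({frozenset({1})})  # (a, a) collapses to {{a}}
```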

Examples in Natural Language Domain

Still, that approach may be not so unwanted but, on the contrary, welcomed, for example in Natural Language Processing – and not only for the Order relation, which is actually a local feature of the Analytical languages, flattening the more general Tree/Hierarchical Structure of the Flective/Synthetic languages, a Hierarchical Structure that could also be represented by a similar approach. And not necessarily in the way Set Theory does it, but maybe in the Topological style, where we add a list of sets as a form of relation representation over the original, untouched Set.

If we were to express the Order relation in a pure Set Theory way, for example for the sentence “Dog bites man”, the implicitly ordered set could be: {{dog}, {bites, dog}, {bites, dog, man}}, while the sentence “Man bites dog” would look like: {{man}, {bites, man}, {bites, dog, man}}. Isn’t it remarkable?

The Axiom of Pairing is even more astonishing in its application to NLP. We may take two arbitrary linguistic entities – words, clauses, sentences, texts – and find or define a higher hierarchical entity they would belong to. It may not be a part of today’s Normative Grammar, but it could have been, or will be, or we may make it so :). One more thing we see here is that we may have not only Flat syntaxes but, as we mentioned before, Hierarchical ones too, which allows the existence of not only Analytical languages. Also, we may say that we can convert any type of syntax tree into a binary one.


Halmos, Paul R., 1960: Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., repr. 2017, Mineola, New York: Dover Publications, Inc.

Mendelson, Bart, 1975: Introduction to Topology. Boston: Allyn and Bacon, first edition 1962, second edition 1968, repr. 1990, New York: Dover Publications, Inc.

Pinter, Charles C., 1971: A Book of Set Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, repr. 2014, Mineola, New York: Dover Publications, Inc.


Kitchen Style Tutorials: Set Theory #2


Does Universe exist?

In the Vector Space definition – “Collection of objects with a mathematical structure defined over them” – “collection” could be thought of as a thing that manifests itself only through a concept of “belonging”. Such a thing, which interacts with other “objects” through the relation of “belonging”, has the mathematical name “Set”, and the very relation of “belonging” is depicted by the stylized Greek letter epsilon: ∈. “Objects”, which may “belong” to Sets, are virtually anything we can reasonably imagine from our experience: numbers, words, sentences, sounds, colors, pictures, physical objects, ideas, planets, galaxies, sets themselves, other mathematical concepts, or whatever you find in your physical or perceived Universe. We write x ∈ A, meaning that object x belongs to set A (or x is a member or element of A). We can describe a set either by simple enumeration of its members: A = {1, 2, 3, …}, or by using a generating function of some sort: A = {x: T(x)} (i.e. set A consists of such objects x that satisfy the conditions of a function T (what a function is we don’t discuss yet, leaving it to the reader’s intuition at this point)).
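The two ways of describing a set map directly onto Python: enumeration vs. a set comprehension with a “generating” condition T(x). (The names A, B, and T below are just for illustration, and, for computability, the comprehension has to range over a concrete source set rather than over “everything”.)

```python
A = {1, 2, 3, 4, 5}                  # enumeration of members

def T(x):                            # some condition T(x): "x is odd"
    return x % 2 == 1

B = {x for x in A if T(x)}           # generating-function style: {x: T(x)}

assert B == {1, 3, 5}
assert 3 in B                        # "belonging": 3 ∈ B
assert B <= A                        # B is also a subset of A
```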

The weird and cautious language about reason, experience, and perception in the definition above is there not for purely stylistic reasons. Attempts to define Sets, their “objects”, and the meaning of “belonging” as vague “everythings”, and to break mathematics free of the limitations of human experience and comprehension, eventually failed. Quite expectedly, considering Immanuel Kant’s simile, in his Critique of Pure Reason, of a Dove complaining about air being an obstacle that slows it down, which, placed into the vacuum of Outer Space and having lost the air’s support, would not be able to fly at all.

The initial Axiom of Comprehension (which, really, could have been named the Axiom of Fantasies) stated that for any concept of “belonging” (no matter whether we can imagine it or not (probably AI or ETI could)) there exists a Set (actually, the generating-function definition of a Set above is the formal expression of the Axiom). The Axiom has been falsified by Russell’s Paradox of the Set of all Sets that do not Belong to Themselves (so, it says T(x) = x ∉ x). If such a set doesn’t belong to itself, then it does belong to itself, and vice versa – we are stuck in this contradiction, or infinite loop.

To save the whole concept of Set, Ernst Zermelo came up with the Axiom of Specification (or the Axiom of Lack of Fantasies): A = {x ∈ S: T(x)}, which basically formalized Descartes’ proposition, mentioned in the previous chapter, that humans cannot create new things, and can only break up and rearrange already existing and known things. Informally, the Axiom means that we can intelligently create and do things only with “Subsets” of the Sets already existing out there. The Axiom resolves Russell’s paradox by simply saying that there exists no such thing as the Set of All Sets (Including Themselves); therefore such a generating function as Not Belonging to Themselves does not produce paradoxes. Other similar things are likewise prohibited if their existence leads to paradoxes: first we have to produce a proof that such things exist, and only then divide and combine them.

At least in the reality of humans; if AI or ETI can experience such things, it’s up to them to resolve the accompanying problems. Though, the way the Axiom of Specification resolves Russell’s paradox is a bit unsettling – the notion that there is no Set of All Sets means that there is no Universe (all things can’t belong to One Thing). Maybe the Universe is not a thing easily comprehended by humans, and we just have to wait for AI or ETI to explain it to us. There were other attempts to save Set Theory, but Zermelo’s Axiomatic Set Theory is, at least to date, a more popular solution than the others.

The new term “Subset” implies yet another relation that involves Sets. If “belonging” was a more “vertical” relation between things, “inclusion” is a more “horizontal” relation – in the sense in which Bertrand Russell tried to resolve his Paradox by introducing different (hierarchical) types of sets. We say that set B includes set A (or set A is a subset of B) if all elements of set A belong also to set B, and we depict that relation as A ⊂ B. The inclusion relation definition implies that a set is always its own subset, A ⊂ A (i.e. inclusion is reflexive, though we don’t define what that is yet). Also, the inclusion definition implies that a Set doesn’t have order or uniqueness constraints.

If for two sets both relations A ⊂ B and B ⊂ A are true, then we may define yet another relation, “equality”, depicted as =, and the Axiom (of Extension): two sets are equal iff (“if and only if”, depicted as ⇔) they have the same elements. The two-way “iff” means that if sets have the same elements then they are equal, and if they are equal then they have the same elements. The Axiom may sound too trivial, but its informal formulation gives more “food for thought” – the Axiom basically says that a Set is nothing more than its members and their “belonging” relations, or that the “pure” Set in itself is Nothing.

If we take a “pure” Set in its Axiom of Specification form with any false generating function, for example A = {x ∈ S: x ≠ x}, we get a “thing” that has no members, which is called the Empty Set and is depicted as ∅. The Empty Set is yet another set that is included in every set (in addition to the set itself): ∅ ⊂ A. We could have tried to falsify that by showing that ∅ has a member that does not belong to A, but because ∅ has no members, we can’t do that. One more thing to mention here is that ∅ ≠ {∅}. The latter set does have one member, which is ∅, which in turn has no members. As an approximate analogy, we can see a Set as an imaginary box; therefore plain Nothing and Nothing in an Imaginary Box are different “Nothingnesses”.
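The ∅ vs. {∅} distinction can be sketched with Python frozensets (frozensets, being hashable, can be members of other sets, which plain sets cannot):

```python
empty = frozenset()                  # ∅: no members at all
box = frozenset({empty})             # {∅}: exactly one member, which is ∅

assert len(empty) == 0 and len(box) == 1
assert empty != box                  # the two "Nothingnesses" differ
assert empty in box                  # ∅ ∈ {∅}
assert empty <= box                  # ∅ ⊂ {∅}: ∅ is a subset of every set
assert empty <= frozenset({1, 2})    # ...including this one
```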

But what about the reflexive relation A ∈ A – is it true or necessary? We established that A ⊂ A and A = A. What does the definition A = {x ∈ S: x ∉ x} do to the Axiom of Extension? Actually, nothing (bad) – the Axiom doesn’t falsify our definition: the non-self-membered elements of S are enough to build a set. We say here that every member of set A is exactly a member of set S that is not a member of itself. Therefore, if A belonged to itself, A ∈ A on the left, then it wouldn’t belong to itself on the right, A ∉ A. Because of this contradiction, unlike the “inclusion” relation ⊂, the “belonging” relation ∈ need not be reflexive. I.e. a Set does not need to belong to itself. It may, in particular cases like A = {x ∈ S: x ∈ x or x ∉ x}, but that cannot add anything extra to its members, which entirely define it according to the Axiom of Extension.

So far we were quite vague on examples of members of Sets after introducing Axioms solving Russell’s Paradox – the only concrete positive example was the Empty Set ∅. Can we say anything more concrete on the other possible Set members? Let’s try to do that in the next chapters…

Examples in Natural Language Domain

When I started writing about Relations in Set Theory, after mentioning that the just-sets part of Set Theory is boring and all the fun and interest comes from the Relations part, I realized that all these tutorial chapters on just-sets-by-themselves, with their dry mathematical and logical language and not many examples, are indeed boring as Hell (BTW, Pope Francis reportedly announced that there is no Hell, therefore it’s literally an Empty Set, hence ultimately boring), and I decided to add some examples. Most of us speak, listen, read, and write to some extent; therefore the Natural Language Understanding Domain, I guess, would be interesting and easily understandable for everybody.

The Axiom of Comprehension, applied to NLU, would mean that we could magically create any texts, any languages, any meta-languages (languages describing the lower-hierarchy languages), just by wishing. That reasoning leads us to the same Russell’s paradox. We could have wished to create a Super Metalanguage (which is not reducible to itself) that can describe all Metalanguages (which can describe lower languages, but not themselves either). Isn’t that the ultimate goal of Linguistics? But that means the Super Metalanguage can describe itself, and we run into a contradiction. So, sorry for breaking your dream – there is no such thing as an Ultimate Metalanguage that would describe all the languages in the Universe 🙂

Therefore, if we want to get rid of this contradiction the way Set Theory does, and adopt the Axiom of Specification, we would say we can come up with only subsets of what already exists out there in the Universe. So far, for linguists, that’s only the human languages of the Earth (ok, ok, maybe some animal ones, too), and all texts, corpora, syntaxes, and grammars that are and ever have been – and nothing more. Brutal? Yes. But if you don’t like it, just throw Set Theory out of your bag of tools 🙂

The Axiom of Extension basically says that in language constructs there is nothing more than their components – quite contrary to the view that a System is more than, and not reducible to, its building blocks. If you like the latter more than the former, then, again, throw Set Theory out of your baggage 🙂 However, the surprising (or “surprising”) efficiency and productivity of word2vec models is an argument for the former statement. You may say that the collection of words in a sentence (or of paragraphs in a larger text) is not enough, and that their arrangement (order and other ways of expressing relations) imposes additional (and maybe even primary) meaning – hence seq2seq models are even better. However, there are ways to express relations in Sets by means of Sets (which we are going to talk about later), and we don’t actually need to look for anything additional inside the Set itself that is not reducible to its members.

Yet another thing we were talking about is the Empty Set, in terms of looking at Nothingness or Emptiness as a Thing we can deal with. It doesn’t take advanced linguistics to come up with the idea of a Null Subject or other Null Parts of a Sentence; however, it’s also not a trivial thing for laypersons, especially speakers of an Analytic Language like English, to realize by themselves. Even for speakers of Flective Languages (like Russian), routinely using such constructs as Дождит (literally “Raining”) in their speech, it’s not an easy task unless they are familiar with Set Theory 🙂 Sorry, speakers of Agglutinating languages, I’m not exposed to them, so I can’t say how things are done there 🙂

The sentence “Raining” is not correct syntax in English. It lacks the mandatory Subject “it”, and even the helper verb “is”; however, these could be viewed as empty sets: “∅ ∅ raining”. So, is it fine now? In a way, yes – in the right context (“He looked out of the window – raining”), sentences created with the use of Grammars of Errors (yup, errors are not random, they follow their own grammar(s)) can be understood even in the context of the Normative Grammar. Isn’t that what we need, especially if we work with Non-Normative verbal or online corpora? As Hegel said: “All that exists is rational…”. We can say that a syntax is not “correct” as many times as we want, but as long as it exists out there, we have to deal with it…


Halmos, Paul R., 1960: Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., repr. 2017, Mineola, New York: Dover Publications, Inc.

Mendelson, Bart, 1975: Introduction to Topology. Boston: Allyn and Bacon, first edition 1962, second edition 1968, repr. 1990, New York: Dover Publications, Inc.

Pinter, Charles C., 1971: A Book of Set Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, repr. 2014, Mineola, New York: Dover Publications, Inc.


Kitchen Style Tutorials: Set Theory #1

Mathematical Space

What we see, is it real? If it’s real, can we see it?

Mathematical space is a collection of objects with a structure defined over them. This definition as a whole, as well as the words used in it and their relations, may seem too vague, broad, and ambiguous. What is “object”? What is “collection” and “structure”? Why is the structure “defined”, and does not intrinsically belong to these objects or their collection? And, if it’s defined, who defines it, and why?

Instead of intuitively guessing answers to these questions, we might have wanted to build our understanding of these vague and complex concepts from a limited basis of simple, narrow, well-defined, clear and distinct concepts, as Rene Descartes, one of the framers of modern philosophy and algebra, wanted. However, at the same time he noted that the human mind is a very poor inventor. It can’t create anything genuinely new. It is very good, though, at mixing and combining blocks already known to it. If we go down the path of excessive reduction of the number, richness, and generality of these blocks, we may end up unable to span complex concepts (if at all) from the simple, distinct, and clear blocks.

Therefore, we may be better off defining simple, well-defined, and clear rules of combination for the potentially opaque, complex, and only intuitively comprehensible building blocks. To add insult to injury, postmodernism pointed out that intuition depends on personal experience; therefore each person will have innate concepts different from others’. The question ‘is your “red” the same as my “red”?’ does not necessarily expect an affirmative answer. Still, we may attach labels or tokens (the word “red”) to real-world phenomena, hoping that everyone will sort out the relations of their inner mental images with those labels by themselves, likely on a sub/unconscious level.

In this context, the Chomskian LAD (Language Acquisition Device) could also be decoded as the “Label” Acquisition Device, because those innate concepts of real-world understanding are embedded into natural human languages. For example, the very concept of the number 1 (and of numbers other than 1) is present in all(?) grammars of human languages in some form of singular or plural grammatical number – or even the dynamic of going from 1 to many via 1+1=2 (through the dual grammatical number, dropped in its explicit form in many modern languages).

Therefore, we may not need to go down the rabbit hole of purely analytical ontologies, risking ending up with nothing: “this is/consists of that and that, which, in their turn, are/consist of even smaller and simpler things”, until they disappear. Instead, at some level we may start thinking in terms of more synthetic ontologies of the type: this could be thought of as that (and that, or that) in their combinations and relations (as well as other, equally true combinations of other “that”s).

Based on those deep, even language-embedded concepts of natural numbers, formal mathematical axioms, like Peano’s axioms, come quite naturally: for each natural number x there is only one successor y=s(x)=x+1; if successors are equal, then so are their predecessors; there is only one number – 1 – that is not a successor of any number; 1 with its successors spans the whole collection of natural numbers N.
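A minimal sketch of this successor construction, with naturals as nested tuples and 1 as the starting point (following the text, which starts from 1 rather than the more common 0; the names here are my own, not standard):

```python
ONE = ()                        # the one number that is not a successor

def s(x):                       # the successor function, x -> x + 1
    return (x,)

def to_int(x):                  # decode a nested tuple back into an int
    n = 1
    while x != ONE:
        x, n = x[0], n + 1
    return n

two, three = s(ONE), s(s(ONE))
assert two != ONE and three != ONE    # no successor equals 1
assert two != three                   # distinct numbers, distinct successors
assert to_int(three) == 3             # 1, s(1), s(s(1)) span 1, 2, 3, ...
```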

Mathematics has always been an object of amusement: how can such a seemingly artificial construct out of the fantasy world be so useful when applied to problems of the real world? Meanwhile, mathematical concepts are always based, on the one hand, on real-world phenomena, and on the other, on humans’ patterns of perceiving and comprehending the world. One can reasonably expect that wherever those concepts lead us, their range will be the same – humans’ understanding of the world – i.e. humans may always find areas of their application.

The above implies that extraterrestrial or artificial intelligence, with potentially different ways of observing and comprehending the real world, may develop quite different mathematics. Even humans, using axioms different from the “usual” set axioms, have developed quite different mathematical theories, but we may expect ETI’s and AI’s mathematics to differ on a much larger scale.

However, let’s come back in the next chapter to ways of thinking about “collections of objects” and the “mathematical structure defined over them” from the original definition…


Halmos, Paul R., 1960: Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., repr. 2017, Mineola, New York: Dover Publications, Inc.

Mendelson, Bart, 1975: Introduction to Topology. Boston: Allyn and Bacon, first edition 1962, second edition 1968, repr. 1990, New York: Dover Publications, Inc.

Pinter, Charles C., 1971: A Book of Set Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, repr. 2014, Mineola, New York: Dover Publications, Inc.
