Thinking in terms of filters

In dealing with topological vector spaces it is more convenient to define a topology by specifying what the neighbourhoods of each point are going to be. It is well-known that the two approaches are equivalent: an open set will be a set, which, whenever it contains a point, contains a neighbourhood of this point; one can also say that an open set is a set which is a neighbourhood of each one of its points; on the other hand, a neighbourhood of a point x of E is simply a set which contains some open set containing x.

In order to define a topology by the system of the neighbourhoods of the points, it is convenient to use the notion of a filter. This is a very primitive notion, and the student should find it easy to become familiar with it, and to learn how to use filters, just as he learned how to use sequences. The notion of filter is perfectly independent of topology. A filter is given on a set which need not be carry any other structure. Let E be the set. A filter \mathcal{F} is a family of subsets of E, submitted to three conditions:

(F_{1}) The empty set \phi should not belong to the family \mathcal{F}.

(F_{2}) The intersection of any two sets, belonging to the family, also belongs to the family \mathcal{F}.

(F_{3}) Any set, which contains a set belonging to \mathcal{F} should also belong to \mathcal{F}.

The simplest example of a filter on a set E is the family of all subsets of E which contain a given subset A, provided the latter is non empty. With every infinite sequence of points of E is associated a filter. Let x_{1}, x_{2}, \ldots be the sequence under consideration. The associated filter is the family of all subsets of E which have the following property:

(AF) The subset of E contains all elements x_{1}, x_{2, \ldots} except possibly a finite number of them.

A family \mathcal{B} of subsets of E is a basis of a filter on E if the following two conditions are satisfied:

(BF_{1}) \mathcal{B} \subset \mathcal{F} that is, any subset which belongs to \mathcal{B} must belong to \mathcal{F}.

(BF_{2}) Every subset of E belonging to \mathcal{F} contains some subset of E which belongs to \mathcal{B}.

A familiar example of a basis of filter on the straight line is given by the family of all intervals (-a,a) with a>0: it is a basis of the filter of the neighbourhoods of zero in the usual topology on the real line. Another useful example is the following one: Let \mathcal{F} be the filter associated with a sequence S = \{ x_{1}, x_{2}, \ldots, x_{n}\} For each n=1,2,\ldots let us set S_{n}=\{ x_{n}, x_{n+1}, \ldots\}

and view S_{n} as a subset of E.

Then the sequence of subsets S = S_{1} \supset S_{2} \supset \ldots S_{n} \supset \ldots is a basis of \mathcal{F}.

Let \mathcal{A} be some family of subsets of our set E. We may ask the question: is there a filter \mathcal{F} having \mathcal{A} as a basis (note that a filter can have several different bases) ? In view of the filter axioms, (F_{1}, F_{2}, F_{3}), that filter \mathcal{F}, if it exists, is completely and uniquely determined: it is the family of subsets of E which contains some subset belonging to \mathcal{A}. Observe that the latter property defines perfectly well a certain family, which we have called \mathcal{F} of subsets of E. Then our question can be rephrased as follows: is \mathcal{F} a filter? Obviously, \mathcal{F} satisfies (F_{3}); it also satisfies (F_{1}) if we take care of requiring that no set belonging to \mathcal{A} be the empty set. As for (F_{2}) it is equivalent as we see easily with the following property of \mathcal{A}:

(BF) The intersection of any two sets, belonging to \mathcal{A} contains a set which belongs to \mathcal{A}.

The difference with condition (F_{2}) is that the intersection of two subsets which belong to \mathcal{A} is not requested to belong to \mathcal{A}. but only to the contain some set belonging to \mathcal{A}. Thus, we may state : a basis of fliter on E is a family of nonempty subsets of E satisfying Condition (BF). The filter generated by the basis is uniquely determined by Condition (BF_{2}).

Next step: comparison of fliters. We want to be able to say: this filter is finer than this other filter. Keep in mind that filters are are sets of sets, or rather of subsets. In other words, filters are subsets of the set of subsets of of E, usually denoted by \ss(E). As filters are subsets, (of some sets, in this case \ss{E}, there is a natural order relation among them: the inclusion relation. We can write \mathcal{F} \subset \mathcal{F}^{'} if \mathcal{F} and \mathcal{F}^{'} are two filters on the same set E. It means that every subset of E which belongs to \mathcal{F} also belongs to \mathcal{F}^{'} (the converse being, in general, false). Instead of saying that \mathcal{F} is contained in \mathcal{F}^{'}, one usually says that \mathcal{F}^{'} is finer than \mathcal{F} or that \mathcal{F} is less fine than \mathcal{F}^{'}. Let \mathcal{F} (respectively \mathcal{F}^{'}) be the family of all subsets of E which contains a given subset A (respectively A^{'}) of E; \mathcal{F}^{'} is finer than \mathcal{\mathcal{F}} if and only if A^{'} \subset A.

A topology on the set E is the assignment, to each point x of E, of a filter \mathcal{F}(x) on E, with the additional requirement that the following two conditions be satisfied:

(N_{1}) If a set belongs to \mathcal{F}(x), it contains the point x.

(N_{2}) If a set U belongs to \mathcal{F}(x), there is another set V belonging also to \mathcal{F}(x) such that given any point y of V, U belongs to \mathcal{F}(y).

When these conditions are satisfied, we say that we have a topology on E and we call \mathcal{F}(x) the filter of neighbourhoods of the point x. At frst sight Condition (N_{2}) may seem involved. It expresses, however, a very intuitive fact. Roughly speaking, it says that given any point z near x (that is, z is generic element of U), if a third point y lies sufficiently near to x (the sufficiently near is made precise by the neighbourhood of x, V, of which y is an element), then z lies near to y (that is, z \in U \in \mathcal{F}(y)). In the language of open sets (N_{2}) becomes evident: since U is a neighbourhood of x, U contains an open set containing x; let V be such an open set. Since V is open, and V \subset U ; U is obviously a neighbourhood of each point of V. A basis of the filter \mathcal{F}(x) is called a basis of neighbourhoods of x. This simple notion will play an important role in the forthcoming definitions.

Once we have the notion of filter of neighbourhoods of a point, hence of neighbourhood of a point (any subset of E belonging to the filter of neighbourhoods), we can quickly review the concepts that are used to describe a topology. As we have already said, an open set is a set which is a neighbourhood of each one of its points. A subset of E is closed if its complement is open. The closure of a set A \subset E is the smallest closed set containing A. It will be denoted by \overline{A}. The following is easy to check: a point belongs to \overline{A} if and only if everyone of its neighbourhoods meets A (that is, has a nonempty intersection with A). The interior of a set is the largest open set contained in it; if A is the set, its interior will be denoted by \AA.

A very important notion is the one of a set A dense in another set B; both A and B are subsets of the same topological space E. Then, one says that A is dense in B if the closure \overline{A} of A contains B. In particular, A is said to be dense in E (or everywhere dense) if \overline{A}=E. To say that A is dense in B means that given any neighbourhood of any point x of B, U(x), there is a point y of A which belongs to U(x), that is, A \bigcap U(x) \neq 0. A standard example of a set everywhere dense is the set of rational numbers Q, when regarded as a subset of the real line R (with the usual topology); note that the complement R-Q of Q is also dense in R. Examples of sets which are dense and open are given by the complement of a straight line in the plane or in space, by the complements of a plane in space, etc. Easy to check are the basic intersection and union properties about open or closed sets: that the intersection of a finite number of open sets is open (this follows immediately from the fact, itself obvious in virtue of Axiom F_{2}), that the intersection of a finite number of neighbourhoods of a point is again a neighbourhood of that point); that the union of any number of open sets, be that number finite or infinite, is open (this follows from the fact that the union of a neighbourhood of a point with an arbitrary set is a neighbourhood of the same point: Axiom F_{3}). Byy going to the complements, one concludes that finite unions of closed sets are closed, arbitrary intersections of closed sets are also closed, etc

Observe that a set E may very well carry severall different topologies. When dealing with topological vector spaces, we shall very often encounter this situation of a set, in fact a vector space, carrying several topologies (all compatible with the linear structure, in a sense that is going to be specified soon). For instance, any set may carry the following two topologies (which in practice are almost never used):

(1) the trivial topology: every point of E has only one neighbourhood, the set E itself; (note that in this case every point of E is a limit point and so there are simply too many points !)

(2) the discrete topology given any point x of E, every subset of E is a neighbourhood of x provided that it contains x; in particular, {x} is a neighbourhood of x, and constitutes in fact a basis of the filter of neighbourhoods of x. (note that in this case no point of E is a limit point of any subset of E.)

We may compare topologies, in analogy with the way we have compared filters. Let Let \mathcal{T}, \mathcal{T}^{'} be two topologies on the same set E. We say that \mathcal{T} is finer than \mathcal{T}^{'} if every subset of E which is open for \mathcal{T}^{'} is also open for \mathcal{T}, or equivalent, if every subset of E which is a neighbourhood of a point for \mathcal{T}^{'} is also a neighbourhood of that same point for the topology \mathcal{T}. Let \mathcal{F}(x) (respectively \mathcal{F}^{'}(x)) be the filter of neighbourhoods of an arbitrary point x of E in the topology \mathcal{T} (respectively \mathcal{T}^{'}) : \mathcal{T} is finer than \mathcal{T}^{'}, which we shorten into \mathcal{T} \geq \mathcal{T}^{'}, if, for every x \in E, \mathcal{F}(x) is finer than \mathcal{F}^{'}(x). Given two topologies on the same set, it may very well happen that none is finer than the other. If one is finer than the other, one says sometimes that they are comparable. The discrete topology is finer on a set E than any other topology on E; the trivial topology is less fine than all the others. Topologies on a set form thus a partially ordered set, having a maximal and a minimal element, respectively the discrete and the trivial topology.

The notion of a topology has been introduced in order to provide a solid ground for the notions of convergence and of continuity. Of course, the latter were correctly manipulated (or most of the time, at least) well before anybody thought of topology. We proceed now to give their general definition.


This concerns filters: filters are the “objects” which may (or may not) converge. When do we say that a filter \mathcal{F} on a topological space E converges? We should recall that \mathcal{F} is a family of subsets of E. If \mathcal{F} is to converge to a point x of E, it means that elements of \mathcal{F}, which we repeat again, are subsets of E, get “smaller and smaller” about x, and that points of these subsets get “nearer and nearer” to x. This can be made precise in terms of neighbourhoods of x, which we have at our disposal, since E is a topological space: we must express the fact that, however small a neighbourhood of x is, it should contain some subset of E belonging to the filter \mathcal{F} and consequently, all the elements of \mathcal{F} which are contained in that particular one. But in view of Axiom (F_{3}), this means that the neighbourhood of x under consideration must itself belong to the filter \mathcal{F}, since it must contain some element of \mathcal{F}. The phrase “however small a neighbourhood of x is” has to be made mathematically meaningful: it simply means “whatever is the neighbourhood of x.” In brief, we see that the filter \mathcal{F} converges to the point x if every neighbourhood of x belongs to \mathcal{F}, in other words, if \mathcal{F} if finer than the filter of neighbourhoods of x, \mathcal{F}(x). This is what the convergence to a point of a filter means.

We recall how the convergence of a sequence to a point is defined. Let S = \{ x_{1}, x_{2}, \ldots \} be the sequence. We say that S converges to x if, given an arbitrary neighbourhood U of x, there is an integer n(U) such that n \geq n(U) implies x_{n} \in U. Let S = S_{1} \supset S_{2} \supset S_{3} \supset \ldots S_{n} \ldots be the subsequences introduced earlier: S converges to x if to every U \in \mathcal{F}(x) there is an integer n(U) such that S_{n(U)} \subset U. As the subsets of S_{n} of E form a basis of the filter associated with the sequence S, we see immediately that a sequence S converges to x if and only if the associated filter converges to x.

Note that a filter may converge to several different points. Suppose, for instance, that E carries the trivial topology: then every filter on E converges to every point of E. Note also that a filter may not converge: for instance, if it is the filter associated with some sequence and if this sequence does not converge. Another example is given by a filter on E which is not the filter of all subsets of E which contain a given point x — when E carries the discrete topology: in this topology, the only converging filters are the filters of neighbourhoods of the points. So much for convergence in general topological spaces.


This concerns mappings. In point set topology, a map f: E \rightarrow F, this is to say a map from a topological space E into another topological space F, is said to be continuous if any one of the following two conditions is satisfied:

(a) given any point x of E and any neighbourhood V of the image f(x) \in F of x, the preimage of V, that is to say the set

f^{-1}(V) = \{ x \in E: f(x) \in V\}

is a neighbourhood of x. In short,

\forall {x} \in E, V \in \mathcal{F}(f(x)) implies f^{-1}(V) \in \mathcal{F}(x).

(b) the preimage of any open subset \mathcal{O} of F, f^{-1}(\mathcal{O}) = \{ x \in E: f(x) \in \mathcal{O}\} is an open subset of E.

The student can easily check the equivalence of (a) and (b). As for the intuitive meaning of these conditions, we may say the following. If the mapping f is to be continuous at the point x, it should mean that if x^{'} \in E “converges to x”, then f(x^{'}) should converge to f(x). Note that “f(x^{'}) converges to f(x)” can be made precise in the following way: given an arbitrary neighbourhood of f(x), f(x) should eventually belong to it; and the “eventually” means here: provided that x^{'} is sufficiently near to x. Thus, given an arbitrary neighbourhood V of f(x), if x^{'} belongs to a sufficiently small neighbourhood of x, then f(x^{'}) \in V. The “sufficiently small” can only be determined by the existence of a certain neighbourhood U of x, such that, as soon as x^{'} \in U then f(x^{'}) \in V. This is exactly property (a); to every neighbourhood V of f(x) there is a neighbourhood U of x such that

x^{'} \in U implies f(x^{'}) \in V.

It is immediately seen that if a sequence \{ x_{1}, x_{2}, \ldots\} converges in E to a point x, and if f is a continuous function from E into F, then the sequence \{ f(x_{1}), f(x_{2}), \ldots\} converges to f(x) in F. Convergence of filters is also easily related to continuity of mappings. Let

f: E \rightarrow F

be a mapping from a set E into a set F. Let \mathcal{F} be a filter on E. The image f\mathcal{F} of \mathcal{F} under f is defined as being the filter having the basis

(f\mathcal{F})_{0} = \{ (f(U) \in F); U \in \mathcal{F}\}

Observe that, in general, (f\mathcal{F})_{0} is not itself a filter; it is always the basis of a filter (this is left as an exercise). Now, if the filter \mathcal{F} converges to a point x in E and if f is a continuous function, then f\mathcal{F} converges to f(x) in F. Indeed, the continuity of f implies that f\mathcal{F}(x) is finer than \mathcal{F}((x)); this is simply a restatement of Property (a) above. If then \mathcal{F} is finer than \mathcal{F}(x) (which means that \mathcal{F} converges to x), f\mathcal{F} is finer than f\mathcal{F}(x) and a fortiori finer than \mathcal{F}(f(x)).

We have only considered continuous funtions, which is to say functions defined everywhere and continuous everywhere. Of course, one may prefer to talk about functions continuous at a point. This is defined by the condition (where x is the point under consideration):

for every V \in \mathcal{F}(f(x)) then f^{-1}(V) belongs to \mathcal{F}(x),

or equivalently,

f\mathcal{F}(x) \geq \mathcal{F}(f(x)).

Let us insist on the fact that all the functions or mapping which will be considered in this book/blog (the series of topological vector spaces) are defined everywhere.

As a last remark, let us consider the case where F is identical with E as a set, but carries a different topology from the one given on E, and where f is the identity mapping of E onto F, I. The following two properties are obviously equivalent:

(i) I: E \rightarrow F is continuous.

(ii) the topology of E is finer than the topology of F (these two topologies are defined on the same set).


Nalin Pithwa.

Reference: Topological Vector Spaces, Distributions and Kernels. Francois Treves. Dover Publications.

Compact sets, perfect sets, connected sets

Reference: Principles of Mathematical Analysis by Walter Rudin

2.31 Definition: By an open cover of a set E in a metric space X we mean a collection \{ G_{\alpha} \} of open subsets of X such that E \subset \bigcup_{\alpha}G_{\alpha}.

2.32 Definition A subset K of a metric space X is said to be compact if every open cover of K contains a finite subcover.

More explicitly, the requirement is that if \{ G_{\alpha}\} is an open cover of K, then there are finitely many indices \alpha_{1}, \alpha_{2}, \ldots, \alpha_{n} such that

K \subset G_{\alpha_{1}} \bigcup G_{\alpha_{2}} \ldots \bigcup G_{\alpha_{n}}

The notion of compactness is of great importance in analysis, especially in connection with continuity.

It is clear that every finite set is compact. The existence of a large class of infinite compact sets in R^{k} will follow with Theorem 2.41

We observed earlier (section 2.29) that if E \subset Y \subset X then E may be open relative to Y without being open relative to Y. The property of being open thus depends on the space in which it is embedded. Note that the same is true of the property of being closed.

Compactness, however, behaves better as we shall see now. To formulate the next theorem, let us say, temporarily that K is compact relative to X if the requirements of Definition 2.32 are met.

2.33 Theorem: Suppose K \subset Y \subset X. Then K is compact relative to X if and only if K is compact relative to Y.

By the virtue of the theorem we are able, in many situations, to regard compact sets as metric spaces in their own right, without any paying any attention to any embedding space. In particular, although it makes little sense to talk of open spaces or of closed spaces (every metric space X is an open subset of itself, and is a closed subset of itself), it does make sense to talk of compact metric spaces.


Suppose K is compact relative to X, and let \{  V_{\alpha}\} be a collection of sets, open relative to Y, such that K \subset \bigcup_{\alpha}V_{\alpha}. By theorem 2.30 there are sets G_{\alpha} open relative to X, such that V_{\alpha} = Y \bigcap G_{\alpha} for all x, and since K is compact relative to X, we have

(22)…..K \subset \subset G_{\alpha_{1}} \bigcup \ldots \bigcup G_{\alpha_{n}}

for some choice of finitely many indices \alpha_{1}, \alpha_{2}, \ldots, \alpha_{n}. Since K \subset Y, the above equation 22 implies that

(23) K \subset V_{\alpha_{1}} \bigcup \ldots \bigcup V_{\alpha_{n}}

This proves that K is compact relative to Y.

Conversely, suppose K is compact relative to Y and let \{ G_{\alpha}\} be a collection of open subsets of X which covers K, and put V_{\alpha} = Y \bigcap G_{\alpha}. Then (23) will hold for some choice of \alpha_{1} of \alpha_{1}, \alpha_{2}, \ldots, \alpha_{n}; and since V_{\alpha} \subset G_{\alpha} (23) implies (22).

This completes the proof. QED.

2.34 Theorem Compact subsets of metric spaces are closed.


Let K be a compact subset of a metric space X. We shall prove that the complement of K is an open subset of qX.

Suppose p \in X, and p \notin K. If q \in K, let $V_{q}$ and W_{q} be neighbourhoods of p and q, respectively, of radius less than \frac{1}{2}d(p,q). Since K is compact, there are finitely many points q_{1}, \ldots, q_{n} in K such that

K \subset W_{q_{1}} \bigcup \ldots \bigcup W_{q_{n}}

If V = V_{q_{1}} \bigcap \ldots V_{q_{n}}, then V is a neighbourhood of p which does not intersect W. Hence, V \subset K^{c} so that p is an interior point of K^{c}. Thus, we have proved the theorem. QED.

2.35 Theorem Closed subsets of compact sets are compact.

Proof: Suppose F \subset K \subset X, F is closed (relative to X), and K is compact. Let \{ V_{\alpha}\} be an open cover of F. If F^{c} is adoined to \{ V_{\alpha}\}, we obtain an open cover \Omega of K. Since K is compact, there is a finite subcollection \Phi of \Omega which covers K, and hence, F. If F^{c} is a member of \Phi, we may remove it from \Phi and still retain an open cover of F. We have thus proved that a finite subcollection of \{ V_{\alpha} covers F.

PS: Remark: Note the technique of proof here.

Corollary: If F is closed and K is compact, then F \bigcap K is compact.

Proof: Theorem 2.24b and 2.34 show that F \bigcap K is closed since F \bigcap K \subset K. Theorem 2.35 shows that F \bigcap K is compact. Note: Theorem 2.24b is as follows: For any collection \{ F_{\alpha}\} of closed sets, \bigcap_{\alpha}F_{\alpha} is closed. Theorem 2.34 just proved above says that compact subsets of metric spaces are closed.

2.36 Theorem If \{ K_{\alpha}\} is a collection of compact subsets of a metric space X such that the intersection of every finite subcollection of \{ K_{\alpha}\} is nonempty, then \bigcap K_{\alpha} is nonempty.


PS: This proof of Rudin is not v clear to me. If one of my readers can help, I would be obliged.

2.37 Theorem: If E is an infinite subset of a compact set K, then E has a limit point in K.

2.37 Proof: If no point of K were a limit point of E, (proof by contradiction), then each q \in K would have a neighbourhood which contains at most one point of E (namely, q, if q \in E). (This follows just from the definition of limit point). It is clear that no finite subcollection of V_{q} can cover E and the same is true of K, since E \subset K. This contradicts the compactness of K. (recall definition of a compact set).

2.38 Theorem If \{ I_{n} \} is a sequence of intervals in R^{1}, such that I_{n} \supset I_{n+1} where n \in \mathcal{N}, then \bigcap_{1}^{\infty} I_{n} is not empty.

2.38 Proof: If I_{n} = [a_{n}, b_{n}] let E be the set of all a_{n}. Then E is nonempty and bounded above by b_{1}. Let x be the sup of E. If m and n are positive integers, then

a_{n} \leq a_{m+n} \leq b_{m+n} \leq b_{m}

so that x \leq b_{m} for each m. Since it is obvious that a_{m} \leq x, we see that x \in I_{m} for m=1,2,3….


2.39 Theorem: Let k be a positive integer. If \{ I_{n}\} is a sequence of k-cells such that I_{n} \supset I_{n+1} (n=1,2,3,…) then \bigcap_{1}^{n} I_{n} is not empty.

Proof: Let I_{n} consist of all points x = (x_{1}, x_{2}, \ldots, x_{n}) such that

a_{n,j} \leq x_{j} \leq b_{n,j} (1 \leq j \leq k) (where n = 1,2,3,\ldots)

and put I_{n,j} = [a_{n,j}. b_{n,j}]

For each j, the sequence \{ I_{n,j}\} satisfies the hypotheses of previous theorem 2.38. Hence there are real numbers x_{j}^{*} with 1 \leq j \leq k such that

a_{i,j} \leq x_{j}^{*} \leq b_{n,j} with 1 \leq j \leq k and n=1,2,3,\ldots

Setting x^{*} = (x_{1}^{*}, \ldots, x_{k}^{*}) we see that x^{*} \in I_{n} for n=1,2,3,\ldots. Thus the theorem is proved. QED.

2.40 Theorem Every k-cell is compact.


Let I be a k-cell, consisting of all points x = (x_{1}, \ldots, x_{k}) such that a_{j} \leq x_{j} \leq b_{j} with 1 \leq j \leq k.

Put \delta = (\Sigma_{1}^{k}(b_{j}-a_{j})^{2})^{1/2}

Then |x-y| \leq \delta if x \in I, y \in I.

Suppose, to get a contradiction, that there exists an open cover \{ G_{\alpha} \} of I which contains no finite subcover of I. Put c_{j} = \frac{(a_{j}+b_{j})}{2}. The intervals [a_{j}, c_{j}] and [c_{j}, b_{j}] then determine 2^{k} k-cells Q_{i} whose union is I. At least one of these sets Q_{i}, call it I_{1}, cannot be covered by any finite subcollection of \{ G_{\alpha}\} (otherwise I could be so covered). We next subdivide I_{1} and continue the process. We obtain a sequence I_{n} with the following properties:

(a) I \supset I_{1} \supset I_{2} \supset I_{3} \supset \ldots

(b) I_{n} is not covered by any finite subcollection of \{ G_{\alpha}\}

(c) if x \in I_{n} and y \in I_{n}, then |x-y| \leq 2^{-n} \delta

By (a) and Theorem 2.39 there is a point x^{*} which lies in every I_{n}. For some x, x^{*} \in G_{\alpha}. Since G_{\alpha} is open, there exists r >0 such that |y-x^{*}| <r implies that y \in G_{\alpha}. If n is so large that 2^{-n} \delta <r (there is such an n, for otherwise 2^{n} \leq \frac{\delta}{r} for all positive integers n, which is absurd since R is archimedean), then (c) implies that I_{n} \subset G_{\alpha} which contradicts (b).

This completes the proof.


The equivalence of (a) and (b) in the next theorem is known as the Heine Borel Theorem.

2.41 Theorem: If a set E in R^{k} has one of the following three properties then it has the other two:

(a) E is closed and bounded;

(b) E is compact.

(c) Every infinite subset of E has a limit point in E.


Let us assume (a) is true.

Then E \subset I for some k-cell I, and (b) follows from Theorems 2.40 and 2.35. Theorem 2.37 shows that (b) implies (c). It remains to be shown that (c) implies (a).

If E is not bounded, then E contains points x_{n} with

|x_{n}| >n where n=1,2,3,\ldots

The set S consisting of these points x_{n} is infinite and clearly has no limit point in R^{k} hence has none in E. Thus (c) implies that E is bounded.

If E is not closed, then there is a point x_{0} \in R^{k} which is a limit point of E but not a point of E. For n=1,2,3,\ldots there are points x_{n} \in E of E such that |x_{n}-x_{0}|<\frac{1}{n}. Let S be the set of these points x_{n}. Then S is infinite (otherwise |x_{n}-x_{0}| would have a constant positive value for infinitely many n). S has |x_{0}| as a limit point, and S has no other limit point in R^{k}. For if y \in R^{k}, y \neq 0, then

|x_{n}-y| \geq |x_{n}-y|-|x_{n}-x_{0}| \geq |x_{n}-y| - \frac{1}{n} \geq \frac{1}{2}|x_{n}-y|

for all but finitely many n, this shows that y is not a limit point of S (use the following Theorem 2.20: if p is a limit point of E, then every neighbourhood of p contains infinitely many points of E).

Thus, S has no limit points in E, hence E must be closed if (c) holds.


Remarks: (b) and (c) are equivalent in any metric space (prove this as an exercise ) but that (a) does not, in general imply, (b) and (c) both. Examples are furnished in Exercise 16 of this chapter and by the space L^{2} which is discussed in Chapter 11.

2.42 Theorem (Weierstrass) : Every bounded infinite subset of R^{k} has a limit point in R^{k}.

Proof: Being bounded, the set E in question is a subset of a k-cell I \subset R^{k}. By Theorem 2.40, I is compact, and so E has a limit point in I, by Theorem 2.37 (If E is an infinite subset of a compact set K, then E has a limit point in K).

Perfect Sets:

2.41 Theorem: Let P be a nonempty perfect set in R^{k}. Then P is uncountable.


((Note: Definition of a perfect set: E is perfect if E is closed and if every point of E is a limit point of E. ))

Since P has limit points, P must be infinite. Suppose P is countable and denote the points of P by x_{1}, x_{2}, x_{3}, \ldots We shall construct a sequence \{ V_{n}\} of neighbourhoods, as follows:

Let V_{1} be any neighbourhood of x_{1}. If V_{1} consists of all y \in R^{k} such that |y-x_{1}|<r, the closure \overline{V_{1}} of V_{1} is the set of all y \in R^{k} such that |y-x| \leq r. (note this) (this is true or makes sense : we have used the fact that P is also closed, being perfect so the limit point of P can belong to E itself which is the case when y=x_{1}).

Suppose V_{n} has been constructed so that V_{n} \bigcap P is not empty. Since every point of P is a limit point of P (note that here we have used the other part of the definition a perfect set), there is a neighbourhood V_{n+1} such that (i) \overline{V_{n+1}} \subset V_{n} (ii) x_{n} \notin \overline{V_{n+1}} (iii) V_{n+1} \bigcap P is not empty. By (iii) V_{n+1} satisfies our induction hypothesis, and the construction can proceed.

Put K_{n}=\overline{V_{n}} \bigcap P. Since \overline{V_{n}} is closed and bounded. \overline{V_{n}} is compact. And, by definition of K_{n}, it is an infinite set. Since V_{n} is closed and bounded, \overline{V_{n}} is compact. Since x_{n} \notin K_{n+1} no point of P lies in \bigcap_{1}^{\infty}. But each K_{n} is nonempty, by (iii) and K{n} \supset K_{n+1} by (i), this contradicts the Corollary to theorem 2.36 (If \{ K_{n}\} is a sequence of nonempty compact sets such that K_{n} \supset K_{n+1} (where n \in \mathcal{N}) then \bigcap_{1}^{\infty}K_{n} is not empty. )

Corollary : Every interval [a,b] where a < b is uncountable. In particular, the set of all real numbers is uncountable.

2.44 The Cantor Set: The set which we are now going to construct shows that there exist perfect sets in R^{1} which contain no segment.

Let E_{0} be the interval [0,1].. Remove the segment (\frac{1}{3}, \frac{2}{3}) and and let E_{1} be the union of the intervals

[0,\frac{1}{3}] and [\frac{2}{3}, \frac{1}{1}]

Remove the middle thirds of these intervals, and let E_{2} be the union of the intervals

[0,\frac{1}{9}] and [\frac{2}{9}, \frac{3}{9}] and [\frac{6}{9}, \frac{7}{9}] and [\frac{8}{9}, 1]

Continuing in this way, we obtain a sequence of compact sets E_{n} such that

(a) E_{1} \supset E_{2} \supset E_{3} \ldots

(b) E_{n} is the union of 2^{n} intervals each of length 3^{-n}

The set

P  = \bigcap_{n=1}^{\infty}E_{n}

is called the Cantor set. P is clearly compact, and Theorem 2.36 shows that P is not emtpy. Also, we have shown that P is closed being compact. Now, we have to prove that every point of P is a limit point of P: we can also show that P contains no isolated points: let us therefore look at the construction of P:

No segment of the form

(24) (\frac{3k+1}{3^{n}}, \frac{3l+2}{3^{n}})

where k and m are positive integers, has a point in common with P. Since every segment (\alpha, \beta) contains a segment of the form (24) if 3^{-m} < \frac{\beta - \alpha}{6}, P contains no segment.

To show that P is perfect, it is enough to show that P contains no isolated point. Let x \in P and let S be any segment containing x. Let I_{n} be that interval of E_{n} which contains x. Choose n large enough so that I_{n} \subset S. Let x_{n} be an endpoint of I_{n} such that x_{n} \neq x.

It follows from the construction of P that x_{n} \in P. Hence, x is a limit point of P and P is perfect.


One of the most interesting properties of the Cantor set is that it provides us with an example of an uncountable set of measure zero. (the concept of measure will be discussed in Chapter 11).


2.45 Definition: Two subsets A and B of a metric space X are said to be separated if both A \bigcap \overline{B} and \overline{A} and \bigcap B are both empty; that is, if no point of A lies in the closure of B and no point of B lies in the closure of A.

A set E \subset X is said to be connected if E is not a union of two nonempty separated sets.

2.46 Remark: Separated sets are of course disjoint, but disjoint sets need not be separated. For example, the interval [0,1] and the segment (1,2) are not separated since 1 is a limit point of (1,2). However, the segments (0,1) and (1,2) are separated.

The connected subsets of the line have a particularly simpe structure.

2.47 Theorem: A subset E of the real line R^{1} is connected if and only if it has the following property: if x \in E, y \in E and x<z<y, then z \in E.


If there exist x \in E, y \in E, and soe z \in (x,y) such that z \notin E, then E = A_{1} \bigcup B, where

A_{z} = E \bigcap (-\infty, z) and B_{z} = E \bigcap (z,\infty)

Since A_{z} \subset (-\infty, z) and B_{z} \subset (z, \infty), they are separated. Hence E is not connected.

To prove the converse (note we prove the contrapositive of the converse) suppose E is not connected. So is separated. Then there are nonempty separated sets A and B such that A \bigcup B = E. Choose x \in A and y \in B and assume WLOG that x < y. Define

z = \sup {(A \bigcap [x,y])}

By theorem 2.28 (Let E be a nonempty set of real numbers which is bounded above Let y=\sup {E}. Then y \in E if E is closed), z \in \overline{A} hence z \notin B. In particular x \leq z <y.

If z \notin A, it follows that x < z <y and z \notin E.

If z \notin A, then z \notin \overline{B}, hence there exists z_{1} such that z < z_{1} <y and z_{1} \notin B. Then x <z_{1}<y and z_{1} \notin E.



Nalin Pithwa

Is Math really abstract? I N Herstein answers…

Reference: Chapter 1: Abstract Algebra Third Edition, I. N. Herstein, Prentice Hall International Edition:

For many readers/students of pure mathematics, such a book will be their first contact with abstract mathematics. The subject to be discussed is usually called “abstract algebra,” but the difficulties that the reader may encounter are not so much due to the “algebra” part as they are to the “abstract” part.

On seeing some area of abstract mathematics for the first time,be it in analysis, topology, or what not, there seems to be a common reaction for the novice. This can best be described by a feeling of being adrift, of not having something solid to hang on to. This is not too surprising, for while many of the ideas are fundamentally quite simple, they are subtle and seem to elude one’s grasp the first time around. One way to mitigate this feeling of limbo, or asking oneself “What is the point of all this?” is to take the concept at hand and see what it says in particular concrete cases. In other words, the best road to good understanding of the notions introduced is to look at examples. This is true in all of mathematics.

Can one, with a few strokes, quickly describe the essence, purpose, and background for abstract algebra, for example?

We start with some collection of objects S and endow this collection with an algebraic structure by assuming that we can combine, in one or several ways (usually two) elements of this set S to obtain, once more, elements of this set S. These ways of combining elements of S we call operations on S. Then we try to condition or regulate the nature of S by imposing rules on how these operations behave on S. These rules are usually called axioms defining the particular structure on S. These axioms are for us to define, but the choice made comes, historically in mathematics from noticing that there are many concrete mathematical systems that satisfy these rules or axioms. In algebra, we study algebraic objects or structures called groups, rings, fields.

Of course, one could try many sets of axioms to define new structures. What would we require of such a structure? Certainly we would want that the axioms be consistent, that is, that we should not be led to some nonsensical contradiction computing within the framework of the allowable things the axioms permit us to do. But that is not enough. We can easily set up such algebraic structures by imposing a set of rules on a set S that lead to a pathological or weird system. Furthermore, there may be very few examples of something obeying the rules we have laid down.

Time has shown that certain structures defined by “axioms” play an important role in mathematics (and other areas as well) and that certain others are of no interest. The ones we mentioned earlier, namely, groups, rings, fields, and vector spaces have stood the test of time.

A word about the use of “axioms.” In everyday language, “an axiom means a self-evident truth”. But we are not using every day language; we are dealing with mathematics. An axiom is not a universal truth — but one of several rules spelling out a given mathematical structure. The axiom is true in the system we are studying because we forced it to be true by “force” or “our choice” or “hypothesis”. It is a licence, in that particular structure to do certain things.

We return to something we said earlier about the reaction that many students have on their first encounter with this kind of algebra, namely, a lack of feeling that the material is something they can get their teeth into. Do not be discouraged if the initial exposure leaves you in a bit of a fog.Stick with it, try to understand what a given concept says and most importantly, look at particular, concrete examples of the concept under discussion.

Follow the same approach in linear algebra, analysis and topology.

Cheers, cheers, cheers,

Nalin Pithwa

Exercises: Set theoretic construction of real numbers


  1. If p is a prime number, show that \sqrt{p} is irrational.
  2. Show that \sum_{n=2}^{\infty} \frac{1}{n!}<1
  3. Show that if e_{n} = \pm {1}, then the number \sim_{n=1}^{\infty} \frac{e_{n}}{n!} is an irrational number.
  4. Show that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}.
  5. If A and B are non empty subsets of \Re that is bounded below. Then, s = \inf{A} if and only if (i) s \leq a for all a \in A and (ii) for any \epsilon>0, A \bigcap [s, s +\epsilon) \neq \phi.

Reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.


Nalin Pithwa

Set theoretic construction of real numbers

Continued from previous blog. Same reference: A Second course in analysis by M Ram Murty, Hindustan Book Agency.

Section 1.3

The rational number is insufficient to “measure” all the lengths that arise in the “real world.” This was discovered by the ancient school of Pythagoras which viewed the world through a strange mix of mathematics and mysticism. The aphorism “all is number” seems to have been the underlying mantra of the Pythagoreans. In our modern digital world, this mantra imposes the universality more than in any earlier age.

Pythagoras seems to have been a contemporaryof the Buddha in India, of Confucius and Lao-Tzu in China. He seems to have travelled widely in Egypt, Babylon and India. The Pythagoreans believed in reincarnation and this was wedded to their strict adherence to vegetarianism for by eating meat they might be eating a friend!They are credited to the discovery of both the words philosophy and mathematics. The word “philosophy” literally means the love of wisdom and the word “mathematics” means “that which is learned.” Philosophy then, for the Pythagoreans was seen as a byproduct of mathematics.

The celebrated theorem of Pythagoras dealing with the length of the hypotenuse of a right angled triangle was known to earlier cultures stretching far back in time to Babylon, China and India. It is possible that Pythagoras learned of this theorem during his extensive peregrinations. But the Pythagoreans could see the mysticism inherent in the theorem, for the rule applied to all right angled triangles and there was an element of universality in it and in mathematics in general.

But when the Pythagoreans declared “all is number” they were referring to whole numbers and it is accurate to say that the concept of number had not been clearly articulated at that time. With the underlying idea that whole numbers govern the universe, their theory of ratio and proportion was in for a rude shock. Legend has it that the Pythagorean who discovered that \sqrt{2} is an irrational number was drowned !

The irrationality of \sqrt{2} represents a major episode in the history of mathematics and we owe this to the unfortunate Pythagorean who lost his life by discovering it. The proof is by contradiction and invokes the rudimentary arithmetical idea of a prime number, a concept first enunciated in Euclid’s Elements. A natural number is called a prime number if it has no proper divisors. Thus, the sequence of prime numbers begins as 2,3,5,7,11, \ldots (The number one is not included in the list of prime numberss. It it called a unit, a concept which generalizes to rings.) Two natural numbers a and b are said to be coprime if they have no common prime factor. When we write a fraction a/b in “lowest termss”, we are writing it such that a and b are coprime. We sometimes write (a,b)=1 to indicate that a and b are coprime. This is not to be confused with the concept of an ordered pair introduced earlier.

The irrationality of \sqrt{2} is established by using the method of contradiction. It is instructive to see all the details. If \sqrt{2} is a rational number, then we can write

\sqrt{2}=\frac{a}{b} such that (a,b)=1

Upon squaring both sides, we see


which implies that a is even since the left hand side of the equation is patently an even number. Writing a=2c, we now get


and so b is even. But this contradicts the coprimality of a and b.

This was the theorem that led to the demise of the heroic Pythagorean for it violated the aphorism “all is number” because the ancients viewed numbers as only whole numbers. They also allowed for rational numbers since they can be represented as ratio and proportion of whole quantities. This innocuous theorem opens the door for the discovery of real numbers. For it signals the lack of the “least upper bound property” in the realm of rational numbers.

The student should keep in mind that so far, we have only constructed the universe of rational numbers and thus any further definitions must be given only in terms of rational numbers. A set A of rational numbers is said to be bounded if there is a rational number M such that |x| \leq M for all x \in A. An upper bound for A is any rational number u such that x \leq u for all x \in A. A lower bound for A is a rational number l such that l \leq x for all x \in A. A rational number v is called a least upper bound for A if v \leq u for every upper bound u of A. A rational number m is called a greatest lower bound for A if l \leq m for every lower bound l of A. Thus, a set A is bounded if and only if it has both an upper bound and a lower bound.

Several questions now arise: given a set A of rational numbers, does it always have a least upper bound? Does it have a greatest lower bound? If so, are they unique?

The theorem discovered by the Pythagoreans can be explained as follows. Nascent in this example are many of the fundamental properties we would expect of the real numbers. Essential to the understanding is the familiar fact that the sequence of rational numbers \frac{1}{n} gets smaller and smaller and tends to zero.

If we consider the set

A = \{ x \in \mathcal{Q}: x^{2} \leq 2\}

then A has no least upper bound. Indeed if u is a least upper bound, x \leq u for any x \in A. By definition, u is a rational number and so cannot first equal \sqrt{2}. If u < \sqrt{2}, then we can choose a natural number N such that

\frac{1}{N} < \sqrt{2}-u

so that u + \frac{1}{N}< \sqrt{2}

contradicting that u is an upper bound for the set A. If u>\sqrt{2}, then we can similarly find N such that

u - \frac{1}{N}> \sqrt{2},

which again contradicts that u is the least upper bound for A. Thus, A does not have a least upper bound in the realm of rational numbers.

Digress: We digress a bit here from Dr. M Ram Murty’s above text and reproduce a nice “little proof”‘ from Rudin’s Analysis text illustrating that the rational number line has gaps or holes in it..


As we know there exists no rational solution to p^{2}=2. We now examine this situation a little more closely. Let A be the set of all positive rationals p such that p^{2}<2 and let B consist of all positive rationals p such that p^{2}>2. We shall show that A contains no largest rational number and B contains no smallest rational number.

More explicitly, for every p \in A we can find a rational q in A such that p<q, and for every p in B, we can find a rational q in B such that q < p.

To do this, we associate with each rational number p>0 the number q such that:

q = p - \frac{p^{2}-2}{p+2}=\frac{2p+1}{p+2}…call this I.

Thus, q^{2}-2 = \frac{2(p^{2}-2)}{(p+2)^{2}}….call this II.

If p is in A, then p^{2}-2<0, (1) above shows that q >p, and II shows that q^{2}<2. Thus, q is in A.

If p is in B, then p^{2}-2>0, (I) shows that 0<q<p, and II shows that q^{2}>2. Thus, q is in B.

Conclusion: the purpose of the above discussion is to show that the rational number system has gaps, in spite of the fact that in between any two rational numbers, there is another: if r, s \in \mathcal{Q}, and if r<s, then r < \frac{r+s}{2}<s. The real number system fills these gaps. This is the principle reason for the fundamental role which it plays in analysis.


Two ways of constructing the real numbers were proposed independently by Richard Dedekind (1831-1916) and Augustin Cauchy (1789-1857). Each method has its virtues. The method of Dedekind, using what are called Dedekind cuts is closer to the axiomatic foundations we have been discussing and is suggested by the example discussing \sqrt{2} above. The method of Cauchy using what are now called Cauchy sequences has a wider applicability. In our example above, an essential role is played by the ordering of the rational numbers. The method of Dedekind cuts uses this idea in a fundamental way. It coincides with our intuitive and visual idea of the real numbers as being “points on the number line stretching from minus infinity to plus infinity.”

A Dedekind cut in Q is a partition of Q into two sets A and B with the following properties:

  1. A and B are both non-empty.
  2. A  \bigcup B = \mathcal{Q}
  3. A is “closed downwards”, that is, if r \in A and q<r, then q \in A;
  4. B is “closed upwards”, that is, if q \in B and q<r, then r \in B.
  5. A contains no greatest element, that is, there is no q \in A such that r \leq q for all r \in A.

The set R of real numbers is then defined as the set of all Dedekind cuts.

For example, the irrational number \sqrt{2} is defined by the Dedekind cut:

\mathcal{Q} = A \bigcup B, and A = \{ x \in \mathcal{Q}: x^{2}<2\}, and B = \mathcal{Q}-A.

For example, the number zero is represented by A \bigcup B with A being the set of all non-zero negative rational numbers and B being the set of all non-zero positive rational numbers. As the Dedekind cut A \bigcup B is uniquely defined by A, with B being the complement of A in the set of rationals, we may as well associate A to each real number, or even think of A as the real number. One would then have to define how one would work with this definition from the perspective of addition, multiplication and so on. This is not difficult to do. For instance, we have

A_{1}< A_{2} \Longleftrightarrow A_{1} \subset A_{2}, and A_{1} \leq A_{2} \subseteq A_{2}.

Addition is simple enough:

A_{1}+ A_{2} = \{ a_{1}+a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}\}

Multiplication is defined (rather cumbersomely) by setting A_{1}.A_{2} to be

\{ a_{1}.a_{2}: a_{1} \in A_{1}, a_{2} \in A_{2}, a_{1}, a_{2} \geq 0\} \bigcup \{ q \in Q: q \leq 0\} if A_{1}, A_{2} \geq 0

-(A_{1}.(-A_{2})), if A_{1} \geq 0, and A_{2}<0

-((-A_{1}).A_{2}), if A_{1}<0, A_{2} \geq 0

(-A_{1}).(-A_{2}), if A_{1},A_{2}<0

The absolute value function is then

|A| = A, if A>0; |A|=0, if A=0; |A|=-A, if A <0.

One can prove that all the usual operations with numbers satisfy the expected properties of commutativity, associativity, and distributivity.

By contrast, the method of Cauchy sequences begins with a definition of convergence. In some ways, it is motivated by our discussion of \sqrt{2} earlier. If A is the Dedekind cut representing \sqrt{2} , then the least upper bound of A “is” \sqrt{2}. Thus, the number \sqrt{2} can be thought of as a limit of a sequence of rational numbers. But there can be many such sequences and so one needs a more formal approach. The underlying observation is that a limit of a sequence of rational numbers need not be rational.

This is perhaps best illustrated by Euler’s series for e. Students of calculus are familiar with

e = \sum_{j=0}^{\infty}\frac{1}{j!}

and the right hand side can be thought of as a limit of a sequence of rational numbers


The irrationality of e is easily proved by the method of contradiction. Suppose that e=a/b for two coprime natural numbers a and b. Then,

b!e = \sum_{j=0}^{b}\frac{b!}{j!} + \sum_{j=b+1}^{\infty}\frac{b!}{j!} …call this 1.1

and the second term on the right hand side is strictly less than the geometric series

\sum_{k=1}^{\infty} \frac{1}{(b+1)^{k}}=\frac{1}{b}

which is less than or equal to 1. But the left hand side of 1.1 is a positive integer, the first term on the right hand side of 1.1 is a positive integer and the second term is not an integer, which is a contradiction.

What this example shows is that the “limit” of the sequence of partial sums S_{n} (each of which is a rational number) is the irrational number e.

The reader is familiar with the usual notion of convergence. We say a sequence of real numbers x_{n} converges to L if given \epsilon>0 there exists N such that

|x_{n}-L|< \epsilon for all n \geq N

One may want to view real numbers as “limits of sequences of rational numbers.”But in our formal construction of numbers, several difficulties arise with this definition. The first is that as we have so far only constructed the rational numbers, we need to take the x_{n}‘s to be all rational numbers. Secondly, the limit L need not be a rational number as the example with e shows. This motivates the formal definition of a Cauchy sequence of rational numbers.

A sequence q_{n} in \mathcal{Q} is called a Cauchy sequence if given \epsilon>0, there is an N such that

|q_{n}-q_{m}|< \epsilon for all m,n \geq N

Now it is easy to see that any convergent sequence is a Cauchy sequence. For reference, we record this below:

Theorem 1.2: Any convergent sequence q_{n} is a Cauchy sequence.

Proof of 1.2: Suppose that q_{n} converges to L. Then, choosing N such that

|q_{n}-L|< \frac{\epsilon}{2} for all n \geq N

we have for all m,n\geq N, by the triangle inequality

|q_{n}-q_{m}| \leq |q_{m}-L|+|L-q_{n}|<\epsilon.


Later, we will show that every Cauchy sequence converges. Thus, a sequence is Cauchy if and only if it is convergent and the two notions are equivalent. But the advantage of the notion of a Cauchy sequence is that the limit value L is not mentioned in the definition.

The construction of real numbers is now brought about by defining an equivalence relation on the set of all Cauchy sequences of rational numbers. Since we have constructed rational numbers, we are now allowed to construct the set of all Cauchy sequences of rational numbers. On this set, we say two Cauchy sequences \{ q_{n}\} and \{ r_{n} \} are equivalent if for any \epsilon>0, there exists a natural number N such that

|q_{n}-r_{n}|< \epsilon for all n \geq N.

This is easily seen to define an equivalence relation on the set of all Cauchy sequences of rational numbers. The set of real numbers \Re is then defined as the set of all equivalence classes. We will denote the equivalence class of the sequence \{ q_{n}\} by |\{ q_{n}\}|.

We leave as an exercise to verify that if \{ q_{n}\} and \{ r_{n}\} are two Cauchy sequences, then so are \{ q_{n}+r_{n}\} and \{ q_{n}.r_{n}\}. Thus, the usual operations of addition and multiplication are easily defined as:

[\{ q_{n}\}]+[\{ r_{n}\}] = [\{q_{n}+r_{n} \}]

[\{ q_{n}\}].[\{ r_{n}\}] = [\{ q_{n}.r_{n}\}]

One needs to verify that these are well-defined by checking that the definition does not depend on the choice of representative of the equivalence class.

The order relation of the real numbers can also be defined. Thus, viewing real numbers as equivalence classes or Cauchy sequences of rational numbers, we define [\{ q_{n}\}] < [\{ r_{n}\}] if there is an N such that q_{n}<r_{n} for all n \geq N. We also write [\{ q_{n}\}] \leq [\{ r_{n}\}] if either [\{ q_{n}\}] < [\{ r_{n}\}] or [\{ q_{n}\}] = [\{ r_{n}\}]. Again, these order relations have the expected properties.

Finally, the absolute value function can also be defined. We set

$latex|[\{ q_{n}\}]| = [\{ |q_{n}|\}]$

and the reader can easily verify that indeed the right hand side is a Cauchy sequence if \{ q_{n}\} is Cauchy.

The absolute value has the usual properties, the most notable being the triangle inequality

|[\{ q_{n}\}] + [\{ r_{n}\}]| \leq |[\{ q_{n}\}]|+|[\{ r_{n}\}]|

Having given this formal definition of the real numbers using Cauchy sequences, it is convenient to drop the sequence notation and just represent the real numbers as single letters. Thus, if a, b are real numbers, the triangle inequality is the familiar: |a+b| \leq |a|+|b|

It is now convenient to visualize the set of real numbers in the usual way as points of the line stretching from -\infty to +\infty.

By the very construction, the set of real numbers has the property of completeness in that every Cauchy sequence converges. The reader will also observe that the construction depended only on the property of the absolute value viewed as a metric to measure the distance between two rational numbers. This signals a wider applicability of the method of Cauchy sequences valid more generally in metric spaces which we discuss later.

With the construction of the real numbers, we have restored the least upper bound property and can now define that a sequence of numbers (rational or real) x_{n} converges to a number L \in \Re if given any \epsilon >0, there is an N such that

|x_{n}-L|<\epsilon for all n \geq N.

A non-empty subset A of \Re is said to be bounded below if there is a number s \in \Re such that s \leq a for all a \in A. Any such number s is called a lower bound for A. Analogously, A is said to be bounded above if there is an upper bound for A. The set A is said to be bounded if it is both bounded below and bounded above and unbounded if it is not bounded.

The construction of the real numbers using the Dedekind cuts shows that every non-empty subset of \Re that is bounded above has a least upper bound denoted by \sup {A}. The least upper bound is also called the supremum. Every non-empty subset which is bounded below has a greatest lower bound denoted by \inf {A}. The greatest lower bound is also called the infimum. It is clear that if A and B are non-empty subsets of \Re such that A \subseteq B, then \sup{A} \leq  \sup{B}(see exercises). The following theorem gives a convenient characterization of the supremum.

Theorem 1.3 Let A be a non-empty subset of \Re that is bounded above. Then s=\sup {A} if and only if

(i) a \leq s for all a \in A, and

(ii) for any \epsilon>0, A \bigcap (s-\epsilon, s]\neq \phi.

Proof :

Let s=\sup {A}. Then s is an upper bound for A so that (i) holds. Now if (ii) were false, there would be an \epsilon>0 such that A \bigcap (s-\epsilon, s] = \phi. But then s-\epsilon would also be an upper bound for A, a contradiction. Conversely, suppose thatt (i) and (ii) hold. Then (i) implies s is an upper bound for A. If s were not the least upper bound for A, there would bea t <ssuch that a \leq t for alla \in A. Taking \epsilon = \frac{s-t}{2}>0 in (ii) we get that A \bigcap (s-\epsilon, s] \neq \phi. But s-\epsilon= t+\epsilonso that A \bigcap (t+\epsilon, s] \neq \phi which is a contradiction since a \leq t for all a \in A.


Theorem 1.3 can also be stated as follows. If A is a non-empty subset of \Re that is bounded above ands=\sup{A}, then either s \in A or A \bigcap (s-\epsilon, s] \neq \phi for all \epsilon >0. A similar theorem can be written for the infimum of a set.


Nalin Pithwa

Constructing numbers from sets

Continued from previous blog: A fast review of set theory; same reference: A Second Course in Analysis by M Ram Murty, Hindustan Book Agency.

Mathematicians and philosophers of the nineteenth century pondered deeply into the nature of a number. The question of “what is a number?” is not a simple one. But since mathematicians decided to give foundations of mathematics using the axiomatic method and sets as the basic building blocks, we are led to define numbers using sets. We follow Richard Dedekind (1831-1916) and Giuseppe Peano (1858-1932) in the following construction. It was as late as 1888 and 1889 when this construction was described in two papers written independently by Dedekind and Peano.

We construct a sequence of sets to represent the natural numbers. As noted earlier, zero is represented by the empty set. We have already described the construction of the natural numbers using the empty set. For each natural number n, the successor of n is denoted by n+1 (and sometimes as n^{'}) and defined as

n \bigcup \{ n \}

Thus, each natural number is a set with n elements, namely,

\{ 0,1,2, \ldots, n-1\}

We designate the set of natural numbers by the symbol \mathcal{N}. (It is a matter of personal convenience whether to include zero as a natural number or not. In this discussion, zero is a natural number. In other settings, it may not be. There is no universal convention regarding this and the student is expected to understand depending on the context. Some authors use the term “whole numbers” to indicate that zero is included in the discussion.)

The arithmetic operations on \mathcal{N} are now defined recursively. Addition is defined as a function from \mathcal{N} \times \mathcal{N} to \mathcal{N}:

+ (m,n) = m+n

where m+n is defined recursively by 0+n=n and m'+n = (m+n)^{'}. A similar definition is given for multiplication x by defining 0 \times n = 0 and

m^{'} \times n = (m \times n) + n

We also define m \times n as simply mn which is the familiar symbology.

An equivalence relation on a set S is a subset R of S \times S satisfying:

  1. (reflexive axiom) (a,a) \in R \forall {x} \in S.
  2. (symmetry axiom) (a,b) \in R \Longleftrightarrow (b,a) \in R.
  3. (transitive axiom) (a,b) \in R and (b,c) \in R implies (a,c) \in R.

The notion of an equivalence relation is an abstaction of our concept of equality, or at least what we implicitly expect of the notion of equality. It is more suggestive to write the equivalence relationn, not as a subset of S \times S as indicated above, but rather more symbolically as \sim that our axioms become:

  1. (reflexive) a \sim a, \forall {a} \in .
  2. (symmetry) a \sim b if and only if b \sim a.
  3. (transitive) a \sim b and b \sim c implies a \sim c.

Equivalence relations play a fundamental role in all of mathematics. They allow us to understand aspects of sets by grouping them using certain properties.

To construct negative integers, we define an equivalence relation on \mathcal{N} \times \mathcal{N}. We write

(m,n)\sim (j,k) \Longleftrightarrow m+k=j+n.

Intuitively, we think of (m,n) as m-n so that it becomes evident that our definition is now in terms of concepts that have been defined earlier. This is very similar to how the ancients worked with negative numberss that appeared in an equation. They usually moved them to the other side so that the equation became an equation of non-negative numbers. However, with our set theoretic definition, we have reached a more fundamental and higher level of abstraction. Thus, with our equivalence relation above on the natural numbers, we define the set of integers as the set of equivalence classes of such ordered pairs. It is now easy to see that the following lemma holds:

Lemma 1.1 If (j,k) is an ordered pair of non-negative integers, then exactly one of the following statements holds:

(a) (j,k) is equivalent to (m,0) for a unique non-negative integer m;

(b) (j,k) is equivalent to (0,m) for a unique non-negative integer m;

(c) (j,k) is equivalent to (0,0).

Sometimes, we denote by |(j,k)| the equivalence class of (j,k). With this lemma in place, we now denote by m the set of pairs of non-negative integers equivalent to (m,0) ; by -m the set of pairs equivalent to (0,m) and by 0 the set of pairs equivalent to (0,0). We denote these equivalence classes by \mathcal{Z}.This gives us set theoretic construction of the set of integers.

We can define the operations of addition and multiplication by setting:

|(j_{1},  k_{1})|+|(j_{2}, k_{2})|=|(j_{1}+j_{2}, k_{1}+k_{2})|

|(j_{1}, k_{1})| \times |(j_{2}, k_{2})| = |(j_{1}j_{2}+k_{1}k_{2}, j_{1}k_{2}+j_{2}k_{1})|

This latter definition is best understood if we recall that the symbol (j,k) represents j-k so that the left hand side of the above equation is

(j_{1}-k_{1})(j_{2}-k_{2}) = j_{1}j_{2}+k_{1}k_{2}-(j_{1}k_{2}+j_{2}k_{1})

One needs to check that these definitions are “well-defined” in the sense that they are independent of the representatives chosen for the equivalence class. This can be done as exercises.

In this way, we have now extended the notion of addition and multiplication from the set of natural numbers to the set of integers. Subtraction of integers can be defined by

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})| +(-1)|(j_{2}, k_{2})|

where -1 represents the equivalence class (0,1). All of these definitions correspond to our usual notion of addition, subtraction and multiplication. Their virtue lies in their pure set-theoretic formulation.

We can also order the set of integers in the usual way. Thus,

j_{1}+k_{2} < k_{1}+j_{2} \Longrightarrow |(j_{1}, k_{1})|< |(j_{2}, k_{2})|


j_{1}+k_{2} \leq k_{1}+j_{2} \Longleftrightarrow |(j_{1}, k_{1})| \leq |(j_{2}, k_{2})|.

This corresponds to our usual notion of “less than” and “less than or equal to”.

Finally, we can define the absolute value on the set of integers by setting

|k|=k, if 0<k

|k|=0, if k=0

|k|=-k, if k<0

We can now construct the rational numbers \mathcal{Q} from the set of integers. We do this by defining an equivalence relation on the set \mathcal{Z} \times \mathcal{Z}^{+} by stating that two pairs (j_{1}, k_{1}) and (j_{2}, k_{2}) are equivalent if and only if j_{1}k_{2}=j_{2}k_{1}. Intuitively, we think of (j_{1}, k_{1}) as representing the “fraction” \frac{j_{1}}{k_{1}} and examining what we would mean by \frac{j_{1}}{k_{1}} = \frac{j_{2}}{k_{2}} by reducing it to notions already defined. The set of rational numbers \mathcal{Q} is then defined as the set of such equivalence classes.

The expected operations of addition and multiplication are now evident:

|(j_{1}, k_{1})|+|j_{2}, k_{2}| = |(j_{1}k_{2}+j_{2}k_{1}, k_{1}k_{2})|

|(j_{1}, k_{1})||(j_{2}, k_{2})| = |(j_{1}j_{2}, k_{1}k_{2})|

Again, these definitions are easily verified to be well-defined. Finally, we can now define “division”.If |(j_{1}, k_{1})|, |(j_{2}, k_{2})| \in \mathcal{Q} with j_{2} \neq 0, we define:

\frac{|(j_{1}, k_{1})|}{|(j_{2}, k_{2})|} = |(j_{1}k_{2}, j_{2}k_{1})|

These operations satisfy the familiar laws of associativity, commutativity and distributivity. Subtraction of rational numbers then can be written as :

|(j_{1}, k_{1})|-|(j_{2}, k_{2})| = |(j_{1}, k_{1})|+(-1,1)|(j_{2}, k_{2})|

The ordering of rational numbers can also be written as:

|(j_{1}, k_{1})|< |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2}< j_{2}k_{1}

|(j_{1}, k_{1})|   \leq  |(j_{2}, k_{2})| \Longleftrightarrow j_{1}k_{2} \leq j_{2}k_{1}.

These definitions agree with out usual notions of ordering of the rational numbers.

Finally, the definition of absolute value can be extended as:

|[(j,,k)]|= |(j,k)| if [(0,1)] < [(j,k)]

|[(j,k)]| = |(0,1)| if |(j,k)|=|(0,1)|

|[(j,k)]| = -|(j,k)|<  |(0,1)|.

Again, our familiar properties of the absolute value of rational numbers hold. With this foundational construction in place, we can conveniently represent the equivalence class of (j,k) as simply the fraction j/k and continue to work with these numbers as we were hopefully taught from childhood.

In the next sections/blogs we construct the real numbers from this axiomatic framework.


Hint (generic): keep the meaning of the symbols in mind and meaning of equivalence relations and equivalence classes. Also note that our basic object is a class and a set is a member of a class.

  1. let |(j_{1}, k_{1})|, |(j_{2}, k_{2})| be two elements of \mathcal{Z}. Show that the addition:

|(j_{1}, k_{1})|+|(j_{2}, k_{2})| = |(j_{1}+j_{2}, k_{1}+k_{2})| is well-defined. That is, prove that for any (j_{1}^{'}, k_{1}^{'}) \in |(j_{1}, k_{1})| and (j_{2}^{'},k_{2}^{'}) \in |(j_{2}, k_{2})|, we have that (j_{1}^{'}+j_{2}^{'}, k_{1}^{'}+k_{2}^{'}) is equivalent to (j_{1}+j_{2}, k_{1}+k_{2}).

2. For j_{1}, j_{2}, k \in \mathcal{Z}, prove the distributive law: (j_{1}+j_{2}).k = j_{1}k+j_{2}k.

3. Show that the relations < and \leq on \mathcal{Z} have the following properties:

(a) |(0,j)|< |(0,0)| for all j \in \mathcal{Z}^{+}

(b) |(0,j)|< |(k,0)| for all j, k \in \mathcal{Z}^{+}

(c) |(0,j)|< |(0,k)|, j ,k \in \mathcal{Z}^{+} if and only if k<j

(d) |(0,0)| < |(j,0)| for all j \in \mathcal{Z}^{+}

(e) |(j,0)|<|(k,0)|, j, k \in \mathcal{Z}_{\geq 0}if and only if j<k.

(f) |(0,j)| \leq |(0,0)| for all j \in \mathcal{Z}_{\geq 0}.

(g) |(0,j)| \leq |(k,0)| for all j,k \in \mathcal{Z}_{\geq 0}

(h) |(0,j)| \leq |(0,k)| for j,k \in \mathcal{Z}_{\geq 0} if and only if k \leq j

(i) |(0,0)| \leq |(j,0)| for all j \in \mathcal{Z}_{\geq 0}

(j) |(j,0)| \leq |(k,0)| where j, k \in \mathcal{Z}_{\geq 0} if and only if j \leq k.


Nalin Pithwa.

An example: continuous map which is not homeomorphic

Reference: Topology by Hocking and Young. Dover Publications.

Let us present an example of a continuous mapping (one-to-one) which is not a homeomorphism:

Let S be the set of all non-negative real numbers with their usual metric topology, and let T be the unit circle in its metric topology. Consider $f: S \rightarrow T$. For each x in S, let $f(x) = (1, \frac{2\pi x^{2}}{(1+x^{2})})$, a point in polar coordinates on T.

It is easily shown that f is continuous and one-to-one.

But the set of all x in S such that $x<1$ is open in S while its image is not open in T. Hence, f is not interior. (meaning that: A transformation $f: S \rightarrow T$ of the space S into the space T is said to be interior if f is continuous and and if the image of every open subset of S is open in T) , and is not a homeomorphism (because of the following theorem: A necessary and sufficient condition that the one-to-one continuous map $f: S \rightarrow T$ of the space S onto the space S be a homeomorphism is that f be interior).


Nalin Pithwa.

Some basic facts about connectedness and compactness

Reference: Hocking and Young’s Topology, Dover Publishers. Chapter 1: Topological Spaces and Functions.

Definition : Separated Space: A topological space is separated if it is the union of two disjoint, non empty, open sets.

Definition: Connected Space: A topological space is connected if it is not separated.

PS: Both separatedness and connectedness are invariant under homeomorphisms.

Lemma 1: A space is separated if and only if it is the union of two disjoint, non empty closed sets.

Lemma 2: A space S is connected if and only if the only sets in S which are both open and closed are S and the empty set.

Theorem 1: The real line E^{1} is connected.

Theorem 2: A subset X of a space S is connected if and only if there do not exist two non empty subsets A and B of X such that X = A \bigcup B, and such that (\overline{A} \bigcap B) \bigcup (A \bigcap \overline{B}) is empty.

Note the above is Prof. Rudin’s definition of connectedness.

Theorem 3: Suppose that C is a connected subset of a space S and that \{ C_{\alpha}\} os a collection of connected subsets of S, each of which intersects C. Then, S = C \bigcup (\bigcup_{\alpha}C_{\alpha}) is connected.

Corollary of above: For each n, E^{n} is connected.

Theorem 4:

Every continuous image of a connected space is connected.

Lemma 3: For n>1, the complement of the origin in E^{n} is connected.

Theorem 5: For each n>0, S^{n} is connected.

Theorem 6: If both f: S \rightarrow T and g: T \rightarrow X are continuous, then the composition gf is also continuous.

Lemma 4: A subset X of a space S is compact if and only if every covering of X by open sets in S contains a finite covering of X.

Theorem 7: A closed interval [a,b] in E^{1} is compact.

Theorem 8: Compactness is equivalent to the finite intersection property.

Theorem 9: A compact space is countably compact.

Theorem 10: Compactness and countable compactness are both invariant under continuous transformations.

Theorem 11: A closed subset of a compact space is compact.


Nalin Pithwa.