cs0601080/papi.tex
1: 
2: 
3: %----------------------------------------------------------------
4: %%%%%%%%%%%%%%%%%%%%5Check-
5: 
6: % check whether to use pseudo-additivity or nonextensive additivity
7: 
8: 
9: 
10: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
11: %    INSTITUTE OF PHYSICS PUBLISHING                                   %
12: %                                                                      %
13: %   `Preparing an article for publication in an Institute of Physics   %
14: %    Publishing journal using LaTeX'                                   %
15: %                                                                      %
16: %    LaTeX source code `ioplau2e.tex' used to generate `author         %
17: %    guidelines', the documentation explaining and demonstrating use   %
18: %    of the Institute of Physics Publishing LaTeX preprint files       %
19: %    `iopart.cls, iopart12.clo and iopart10.clo'.                      %
20: %                                                                      %
21: %    `ioplau2e.tex' itself uses LaTeX with `iopart.cls'                %
22: %                                                                      %
23: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
24: %
25: %
26: % First we have a character check
27: %
28: % ! exclamation mark    " double quote  
29: % # hash                ` opening quote (grave)
30: % & ampersand           ' closing quote (acute)
31: % $ dollar              % percent       
32: % ( open parenthesis    ) close paren.  
33: % - hyphen              = equals sign
34: % | vertical bar        ~ tilde         
35: % @ at sign             _ underscore
36: % { open curly brace    } close curly   
37: % [ open square         ] close square bracket
38: % + plus sign           ; semi-colon    
39: % * asterisk            : colon
40: % < open angle bracket  > close angle   
41: % , comma               . full stop
42: % ? question mark       / forward slash 
43: % \ backslash           ^ circumflex
44: %
45: % ABCDEFGHIJKLMNOPQRSTUVWXYZ 
46: % abcdefghijklmnopqrstuvwxyz 
47: % 1234567890
48: %
49: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
50: %
51: \documentclass[12pt]{iopart}
52: \newcommand{\gguide}{{\it Preparing graphics for IOP journals}}
53: 
54: %==============================
55: %Mine
56: 
57: \usepackage{amssymb}
58: \usepackage{amsthm}
59: 
60: %------------------theorem env---------------
61:  \newtheorem{theorem}{Theorem}[section]
62:         \newtheorem{lemma}[theorem]{Lemma}
63:         \newtheorem{proposition}[theorem]{Proposition}
64:         \newtheorem{corollary}[theorem]{Corollary}
65:      \newtheorem{definition}[theorem]{Definition}
66:         \newtheorem{remark}[theorem]{Remark}
67: % \def\QED{\mbox{\rule[0pt]{1.5ex}{1.5ex}}}
68: % \def\proof{\noindent\hspace{2em}{\it Proof: }}
69: % \def\endproof{\hspace*{\fill}~\QED\par\endtrivlist\unskip}
70: %--------------------------------------------
71: \newcommand{\ud}{\mathrm{d}}
72: %=====================================
73: 
74: %Uncomment next line if AMS fonts required
75: %\usepackage{iopams}  
76: \begin{document}
77: 
78: \title[]{On Measure Theoretic definitions of Generalized Information
79:   Measures and Maximum Entropy Prescriptions}
80: 
81: \author{Ambedkar Dukkipati, M Narasimha Murty\footnote{Corresponding author} and
82: Shalabh Bhatnagar}
83: 
84: \address{Department of Computer Science and Automation,
85: Indian Institute of Science, Bangalore-560012, India.}
86: \ead{\mailto{ambedkar@csa.iisc.ernet.in},
87: \mailto{mnm@csa.iisc.ernet.in}, \mailto{shalabh@csa.iisc.ernet.in}}
88: 
89: 
90: %----------------------------------------
91: \begin{abstract}
92:  	Though Shannon entropy of a probability measure $P$, defined
93:         as $- \int_{X} \frac{\ud P}{\ud \mu} \ln \frac{\ud P}{\ud
94:         \mu} \, \ud \mu$ on a measure space $(X, \mathfrak{M},\mu)$, does not
95:         qualify itself as an information measure (it is not a natural
96:         extension of the discrete case), maximum entropy (ME)
97:         prescriptions in the measure-theoretic case are consistent with those of
98:         the discrete case.  
99:         In this paper, we study the
100:         measure-theoretic definitions of generalized information
101:         measures and discuss the ME prescriptions. We present two
102:         results in this regard: (i) we prove that, as in
103:         the case of classical relative-entropy, the measure-theoretic
104:         definitions of generalized relative-entropies, R\'{e}nyi and
105:         Tsallis, are natural extensions of their respective discrete
106:         cases, and (ii) we show that ME prescriptions of
107:         measure-theoretic Tsallis entropy are consistent with the
108:         discrete case.
109: \end{abstract}
110: 
111: %Uncomment for PACS numbers title message
112: \pacs{}
113: % Keywords required only for MST, PB, PMB, PM, JOA, JOB? 
114: %\vspace{2pc}
115: %\noindent{\it Keywords}: Article preparation, IOP journals
116: % Uncomment for Submitted to journal title message
117: %\submitto{\JPA}
118: % Comment out if separate title page not required
119: \maketitle
120: 
121: %=========================Introduction===========================
122: \section{Introduction}
123: \label{Section:Introduction}
124:         Shannon measure of information was developed
125:         essentially for the case when the random variable takes a
126:         finite number of values. However, in the literature, one often
127:         encounters an extension of Shannon entropy in the discrete 
128:         case to the case
129:         of a one-dimensional random variable with density function $p$ 
130:         in the form (e.g.,~\cite{ShannonWeawer:1949:TheMathematicalTheoryOfCommunication,Ash:1965:InformationTheory})
131:         \begin{displaymath}
132:           S(p) = - \int_{- \infty}^{+ \infty} p(x) \ln p(x)\, \ud x \enspace.
133:         \end{displaymath}
134:         This entropy in the continuous case 
135:         as a pure-mathematical formula (assuming convergence of
136:         the integral and absolute continuity of the density $p$ with
137:         respect to Lebesgue measure) resembles Shannon entropy in the
138:         discrete case, but cannot be used as a measure of
139:         information. First, it is not a natural extension of Shannon
140:         entropy in the discrete case, since it is not the limit of the sequence of
141:         finite discrete entropies corresponding to pmfs that
142:         approximate the pdf $p$. Second, it can take negative values.
143: 
144:         In spite of these shortcomings, one can still use the
145:         continuous entropy functional in conjunction with the principle of maximum
146:         entropy where one wants to find a probability density function
147:         that has greater uncertainty than any other distribution
148:         satisfying a set of given constraints. Thus, in this use of
149:         continuous measure one is interested in it as a measure of
150:         relative uncertainty, and not of absolute uncertainty. This
151:         is where one can relate maximization of Shannon entropy to the
152:         minimization of Kullback-Leibler relative-entropy
153:         (see~\cite[p.~55]{KapurKesavan:1997:EntropyOptimizationPrinciples}).
154: %        It
155: %        is well known that the continuous version of
156: %        KL-entropy defined for two probability density functions $p$
157: %        and $r$ as,
158: %        \begin{displaymath}
159: %         I(p\|r) = \int_{- \infty}^{+ \infty} p(x) \ln
160: %        \frac{p(x)}{r(x)} \, \ud x \enspace,
161: %        \end{displaymath}   
162: %        is indeed a natural generalization of same in the discrete
163: %        case.
164: 
165:         Indeed, during the early stages of development of
166:         information theory, the important paper 
167:         by Gelfand, Kolmogorov and Yaglom~\cite{GelfandKolmogorovYaglom:1956:OnTheGeneralDefinitionOfTheAmountOfInformation} 
168:         called attention to the case of defining entropy functional on
169:         an arbitrary measure space $(X, \mathfrak{M},\mu)$. 
170: 	In this respect, Shannon entropy of a probability density function $p:X 
171:         \rightarrow {\mathbb{R}}^{+}$ can be written as,
172:         \begin{displaymath}
173:           S(p) = - \int_{X} p(x) \ln p(x) \, \ud \mu \enspace.
174:         \end{displaymath}  
175:         One can see from the above definition that the concept of
176:         ``the entropy of a pdf'' is a misnomer: there
177:         is always another measure $\mu$ in  the background. In the
178:         discrete case considered by Shannon, $\mu$ is the cardinality
179:         measure\footnote{Counting or cardinality measure $\mu$ on a
180:           measurable space $(X,\mathfrak{M})$, when $X$ is a
181:           finite set and $\mathfrak{M} = 2^{X}$, is defined as $\mu(E)
182:           = \# E$, $\forall E \in \mathfrak{M}$.}~\cite[pp. 19]{ShannonWeawer:1949:TheMathematicalTheoryOfCommunication};
183:         in the continuous case considered by both Shannon and Wiener,
184:         $\mu$ is the Lebesgue
185:         measure cf.~\cite[pp. 54]{ShannonWeawer:1949:TheMathematicalTheoryOfCommunication}
186:         and 
187:         \cite[pp. 61, 62]{Wiener:1948:Cybernetics}.
188:          All entropies are
189:         defined with respect to some measure
190:         $\mu$,
191:         as Shannon and Wiener both emphasized in~\cite[pp.57,
192:         58]{ShannonWeawer:1949:TheMathematicalTheoryOfCommunication} 
193:         and~\cite[pp.61, 62]{Wiener:1948:Cybernetics} respectively.
194: 
195:         This case was studied independently
196:         by Kallianpur~\cite{Kallianpur:1960:OnTheAmountOfInformationContainedInASingmaField} 
197:         and Pinsker~\cite{Pinsker:1960:InformationAndInformationStability},
198:         and perhaps others were guided by the earlier work
199:         of Kullback~\cite{KullbackLeibler:1951:OnInformationAndSufficiency},
200:         where one would define entropy in terms of Kullback-Leibler
201:         relative entropy. Unlike Shannon entropy, the measure-theoretic
202:         definition of KL-entropy is a natural extension of the definition
203:         in the discrete case. 
204: 
205: 	%In this respect,
206:         %the Gelfand-Yaglom-Perez  theorem
207:         %(GYP-theorem)~\cite{GelfandYaglom:1959:CalculationOfTheAmountOfInformation_Etc,Perez:1959:InformationTheoryWithAbstractAlphabets,Dobrushin:1959:GeneralFormulationsOfShannonsbasicTheorems}
208:         %plays an important role, which equips measure-theoretic
209:         %KL-entropy with a fundamental definition. The main
210:         %contribution of this chapter is to prove GYP-theorem for 
211: 	%R\'{e}nyi relative-entropy of order $\alpha >1$, which can be
212:         %extended to Tsallis relative-entropy.
213: 
214: 	%Before proving GYP-theorem for R\'{e}nyi relative-entropy,
215: 	In this paper we present the measure-theoretic definitions of
216: 	generalized information measures and show that as in
217:         the case of KL-entropy, the measure-theoretic
218:         definitions of generalized relative-entropies, R\'{e}nyi and
219:         Tsallis, are natural extensions of their respective discrete
220:         cases. We discuss the ME prescriptions for generalized
221: 	entropies and show that ME prescriptions of
222:         measure-theoretic Tsallis entropy are consistent with the
223:         discrete case, as is also true for measure-theoretic
224:         Shannon entropy. 
225: 
226: 	Rigorous studies of the Shannon and KL entropy functionals in
227: 	measure spaces can be found in the papers by
228:         Ochs~\cite{Ochs:1976:BasicPropertiesOfTheGeneralizedBoltzmann-Gibbs-ShannonEntropy}
229:         and by
230:         Masani~\cite{Masani:1992:TheMeasureTheoreticAspectsOfEntropy_Part_1,Masani:1992:TheMeasureTheoreticAspectsOfEntropy_Part_2}.
231:         Basic measure-theoretic aspects of classical information measures can be
232:         found
233:         in~\cite{Pinsker:1960:InformationAndInformationStability,Guiasu:1977:InformationTheoryWithApplications,Gray:1990:EntropyAndInformationTheory}.
234: %        in~\cite[Chapter~2]{Guiasu:1977:InformationTheoryWithApplications}
235: %        and~\cite[Chapter~5]{Gray:1990:EntropyAndInformationTheory}.
236: 
237:         We review the measure-theoretic formalisms for classical
238:         information measures in 
239:         \S~\ref{Section:ME:MeasureTheoreticDefinitionsOfInformationMeasures}
240:         and extend these definitions to generalized
241:         information measures in
242:         \S~\ref{Section:ME:MeasureTheoreticDefinitionsOfGeneralizedInformationMeasures}. In
243:         \S~\ref{Section:ME:MaximumEntropyAndCanonicalDistributions} we
244:         present the ME prescription for Shannon entropy followed by
245:         prescriptions for 
246:         Tsallis entropy in
247:         \S~\ref{Section:ME:ME-prescriptionForTsallisEntropy}. We
248:         revisit measure-theoretic definitions of generalized entropic
249:         functionals in
250:         \S~\ref{Section:ME:MeasureTheoreticDefinitions_Revisited} and
251:         present some results.
252: 
253: %================================Section:==========================================
254: \section{Measure-Theoretic definitions of Classical Information Measures}
255: \label{Section:ME:MeasureTheoreticDefinitionsOfInformationMeasures}
256: %	Information measures like entropy, mutual information,
257: %	conditional entropy, and conditional mutual information
258: %	etc., can be expressed in terms of KL-entropy and hence
259: %	the measure-theoretic analogs of these measures will follow
260: %	from the measure-theoretic definition of KL-entropy.
261: %	In this section, we study the measure-theoretic
262: %	definitions of KL-entropy and its relation to entropy in this
263: %	case.  
264:   %-----------------------------SubSection-------------------------------------
265:   \subsection{Discrete to Continuous}
266:    \label{SubSection:ME:DiscreteToContinuous}
267:         \noindent
268:         Let $p:[a,b] \rightarrow {\mathbb{R}}^{+}$ be a probability
269:         density function,  where $[a,b] \subset \mathbb{R}$. That is,
270:         $p$ satisfies
271:         \begin{displaymath}
272:         p(x) \geq 0, \:\:\: \forall x \in [a,b] \:\:\: \mathrm{and}\:\:\:
273:         \int_{a}^{b} p(x) \, \ud x =1 \enspace.
274:         \end{displaymath}
275:         In trying to define entropy in the continuous case, the
276:         expression of Shannon entropy was automatically extended by
277:         replacing the sum in the 
278:         discrete Shannon entropy by the
279:         corresponding integral. We obtain, in this way, Boltzmann's
280:         H-function (also known
281:   	as differential entropy in information theory),
282: %	~\cite{Grad:1965:OnBoltzmannsH-Theorem}: reference for
283: %        Boltzmann-H function
284:         \begin{equation}
285:         \label{Equation:ME:ContinuousEntropy}
286:         S(p) = - \int_{a}^{b} p(x) \ln p(x) \, \ud x \enspace.
287:         \end{equation}
288:         But the ``continuous entropy'' given
289:         by~(\ref{Equation:ME:ContinuousEntropy}) is not a natural
290:         extension of the definition in the discrete case, in the sense that it
291:         is not the limit of
292:         the finite discrete entropies corresponding to a sequence of
293:         finer partitions of the interval $[a,b]$ whose norms tend to
294:         zero. We can show this by a counterexample. 
295:         Consider a uniform probability distribution 
296:         on the interval $[a,b]$, having the probability density
297:         function
298:         \begin{displaymath}
299:         p(x) = \frac{1}{b-a}\enspace, \:\:\:\:\: x \in [a,b] \enspace.
300:         \end{displaymath}
301:         The continuous
302:         entropy~(\ref{Equation:ME:ContinuousEntropy}), in this case will be
303:         \begin{displaymath}
304:         S(p) = \ln (b - a) \enspace. 
305:         \end{displaymath}
306:         On the other hand, let us consider a finite partition of the interval
307:         $[a,b]$ which is composed of $n$ equal subintervals, and let
308:         us attach to this partition the finite discrete uniform
309:         probability distribution whose corresponding entropy will be,
310:         of course,
311:         \begin{displaymath}
312:         S_{n}(p) = \ln n \enspace.
313:         \end{displaymath}
314:         Obviously, if $n$ tends to infinity, the discrete entropy
315:         $S_{n}(p)$ will tend to infinity too, and not to $\ln (b-a)$;
316:         therefore $S(p)$ is not the limit of $S_{n}(p)$, when $n$ tends
317:         to infinity. Further, one can observe that $\ln (b-a)$ is negative  
318:         when~$b-a <1$.
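        As a sketch of why this failure is not special to the uniform
        density, assume for illustration that $p$ is continuous on $[a,b]$
        and write $\Delta_{n} = (b-a)/n$ for the common subinterval length
        (the symbols $\Delta_{n}$, $p_{i}$ and $\xi_{i}$ are introduced
        only for this remark, and $0 \ln 0$ is read as $0$). By the mean
        value theorem for integrals, the probability of the $i$th
        subinterval is $p_{i} = p(\xi_{i}) \Delta_{n}$ for some point
        $\xi_{i}$ in it, so that, using $\sum_{i} p_{i} = 1$,
        \begin{displaymath}
        S_{n}(p) = - \sum_{i=1}^{n} p_{i} \ln p_{i}
        = - \sum_{i=1}^{n} p(\xi_{i}) \ln p(\xi_{i}) \, \Delta_{n}
          - \ln \Delta_{n} \enspace.
        \end{displaymath}
        The first term is a Riemann sum that tends to $S(p)$, while
        $- \ln \Delta_{n}$ tends to $+\infty$. For the uniform density this
        reduces to $\ln n = \ln (b-a) + \ln \frac{n}{b-a}$.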
319: 
320:         Thus, strictly speaking, the
321:         continuous entropy~(\ref{Equation:ME:ContinuousEntropy}) cannot 
322:         represent a measure of uncertainty, since uncertainty should
323:         in general be nonnegative.
324:         We are able to prove the ``nice'' properties only for the
325:         discrete entropy; therefore, it
326:         qualifies as a ``good'' measure of information (or
327:         uncertainty) supplied by a random experiment. Since the ``continuous
328:         entropy'' is not the limit of the discrete
329:         entropies, we cannot extend the so-called nice properties to
330:         it.
331: 
332:         Also, in physical applications, the coordinate $x$ in
333:         (\ref{Equation:ME:ContinuousEntropy}) represents an abscissa,
334:         a distance from a fixed reference point. This distance $x$ has
335:         the dimensions of length. Since the density function
336:         $p(x)$ specifies the probability of an event $[c,d)
337:         \subset [a,b]$ as $\int_{c}^{d} p(x) \, \ud x$, one has to
338:         assign to $p(x)$ the dimensions ${(\mbox{length})}^{-1}$, because
339:         probabilities are dimensionless. Moreover, since for $0 \leq z < 1$ one
340:         has the series expansion
341:         \begin{equation}
342:           - \ln (1-z) = z + \frac{1}{2}z^{2} + \frac{1}{3}z^{3}+
343:           \ldots \enspace,
344:         \end{equation}
345:         in which different powers of $z$ are added, it is necessary that the argument of the logarithm function
346:         in~(\ref{Equation:ME:ContinuousEntropy}) be
347:         dimensionless.
348: 	Hence the formula (\ref{Equation:ME:ContinuousEntropy}) is
349:         then seen to be dimensionally incorrect, since the argument of
350:         the logarithm on its right hand side has the dimensions of a
351:         probability
352:         density~\cite{Smith:2001:SomeObservationsOnTheConceptsOfInformationTheoreticEntropy}.
353:         Although
354:         Shannon~\cite{Shannon:1948:MathematicalTheoryOfCommunication_BellLabs} 
355:         used the formula (\ref{Equation:ME:ContinuousEntropy}), he
356:         does note its lack of invariance with respect to changes in
357:         the coordinate system.
358: 
359:         In the context of the maximum entropy principle,
360:         Jaynes~\cite{Jaynes:1968:PriorProbabilities} 
361:         addressed this problem and suggested the formula,
362:         \begin{equation}
363:         \label{Equation:ME:JaynesSuggestion}  
364:           S'(p) = - \int_{a}^{b} p(x) \ln \frac{p(x)}{m(x)}\, \ud x \enspace,
365:         \end{equation}
366:         in the place of (\ref{Equation:ME:ContinuousEntropy}),  
367:         where $m(x)$ is a prior function. Note that when $m(x)$ is a probability density
368:         function, (\ref{Equation:ME:JaynesSuggestion}) is nothing but
369:         the negative of the relative-entropy. However, if we choose $m(x) = c$, a positive constant
370:         (e.g.,~\cite{ZellnerHighfield:1988:CalculationOfMaximumEntropyDistributions}),
371:         we get 
372:         \begin{displaymath}
373:           S'(p) = S(p) + \ln c \enspace,
374:         \end{displaymath}
375:         where $S(p)$ refers to the continuous
376:         entropy (\ref{Equation:ME:ContinuousEntropy}).
377:         Thus, maximization of $S'(p)$ is equivalent to maximization of
378:         $S(p)$.
379: 	Further discussion on estimation of probability
380:         density functions by ME-principle in the continuous case can be found in  
381:         \cite{LazoRathie:1978:OnTheEntropyOfContinuousProbabilityDistributions,ZellnerHighfield:1988:CalculationOfMaximumEntropyDistributions,Ryu:1993:MaximumEntropyEstimationOfDensityAndRegressionFunction}.
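        To illustrate how the ratio inside the logarithm
        in~(\ref{Equation:ME:JaynesSuggestion}) removes both the
        dimensional problem and the lack of invariance, consider, as an
        assumption made only for this sketch, a smooth and strictly
        increasing change of coordinates $y = \phi(x)$ under which both
        $p$ and $m$ transform as densities, $\tilde{p}(y) =
        p(x)/\phi'(x)$ and $\tilde{m}(y) = m(x)/\phi'(x)$ (the symbols
        $\phi$, $\tilde{p}$ and $\tilde{m}$ are not used elsewhere in
        this paper). Since $\tilde{p}(y) \, \ud y = p(x) \, \ud x$ and
        the Jacobian cancels in the ratio, we get
        \begin{displaymath}
        - \int_{\phi(a)}^{\phi(b)} \tilde{p}(y) \ln
          \frac{\tilde{p}(y)}{\tilde{m}(y)} \, \ud y
        = - \int_{a}^{b} p(x) \ln \frac{p(x)}{m(x)} \, \ud x \enspace,
        \end{displaymath}
        whereas the same substitution
        in~(\ref{Equation:ME:ContinuousEntropy}) gives
        \begin{displaymath}
        S(\tilde{p}) = S(p) + \int_{a}^{b} p(x) \ln \phi'(x) \, \ud x \enspace,
        \end{displaymath}
        which in general differs from $S(p)$.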
382: 
383:         Prior to that, Kullback~\cite{KullbackLeibler:1951:OnInformationAndSufficiency} too
384:         suggested that in the measure-theoretic definition of entropy,
385:         instead of examining the entropy 
386:         corresponding to only one given measure, we have to compare the
387:         entropy inside a whole class of measures.
388: 
389:   %-----------------------SubSection------------------------------------
390:   \subsection{Classical information measures}
391:   \label{SubSection:ME:ClassicalInformationMeasures}
392: 
393:         \noindent
394:         Let $(X,\mathfrak{M},\mu)$ be a measure space. $\mu$
395:         need not be a probability measure unless otherwise specified.
396:         Symbols $P$, $R$ will denote probability measures on
397:         measurable space $(X,\mathfrak{M})$ and $p$, $r$  
398:         denote $\mathfrak{M}$-measurable functions on $X$.
399:         An $\mathfrak{M}$-measurable function $p:X \rightarrow
400:         {\mathbb{R}}^{+}$ is said to be a probability 
401:         density function (pdf) if $\int_{X} p \, \ud \mu = 1$.
402: 
403:         In this general setting, Shannon entropy $S(p)$ of pdf $p$ is
404:         defined as follows~\cite{Athreya:1994:EntropyMaximization}. 
405:         %DEFINITION: Shannon entropy for pdf
406:         \begin{definition}
407:         \label{Definition:ME:ShannonEntropy_Measuretheroetic_pdf}
408:         Let $(X,\mathfrak{M},\mu)$ be a measure space and let the
409:         $\mathfrak{M}$-measurable function $p:X \rightarrow  
410:         {\mathbb{R}}^{+}$ be a pdf. The Shannon entropy of $p$
411:         is defined as
412:         \begin{equation}
413:          \label{Equation:ME:ShannonEntropyOf-pdf} 
414:         S(p) = - \int_{X} p \ln p \, \ud \mu \enspace,
415:         \end{equation}
416:         provided the integral on the right exists.
417:         \end{definition}%EndDefinition
418:         Entropy functional $S(p)$ defined in (\ref{Equation:ME:ShannonEntropyOf-pdf}) can be
419:         referred to as entropy of the probability measure 
420:         $P$, in the sense that the measure $P$ is induced by $p$,
421:         i.e.,
422:         \begin{equation}
423:         \label{Equation:ME:ProbabilityMeasureInducedByaPdf}  
424:           P(E) = \int_{E} p(x) \, \ud \mu(x) \enspace, \:\:\:\:\:
425:           \forall E \in \mathfrak{M} \enspace.
426:         \end{equation}
427: 	This reference is consistent\footnote{Say
428:         $p$ and 
429:         $r$ are two pdfs and $P$ and $R$ are corresponding
430:         induced measures on measurable space $(X,\mathfrak{M})$ such
431:         that $P$ and $R$ are identical, i.e., $\int_{E} p \,
432:         \ud \mu = \int_{E} r \, \ud \mu$, $\forall E \in \mathfrak{M}$. Then
433:         we have $p \stackrel{\mathrm{a.e}}{=} r$ and hence
434:         $ -\int_{X} p \ln p \, \ud \mu = -\int_{X} r \ln r \, \ud
435:         \mu$.} because the probability measure
436:         $P$ can be identified {\it a.e} by the pdf $p$.
437: 
438:         Further, the definition of the probability measure $P$
439:         in (\ref{Equation:ME:ProbabilityMeasureInducedByaPdf}), allows us
440:         to write entropy functional
441:         (\ref{Equation:ME:ShannonEntropyOf-pdf}) 
442:         as, 
443:         \begin{equation}
444:         \label{Equation:ME:ShannonEntropyOf-PM-inducedBy-pdf}
445:         S(p) = - \int_{X} \frac{\ud P}{\ud \mu} \ln \frac{\ud P}{\ud
446:         \mu} \, \ud \mu \enspace,
447:         \end{equation}
448:         since (\ref{Equation:ME:ProbabilityMeasureInducedByaPdf})
449:         implies\footnote{If a 
450:         nonnegative measurable function $f$ induces a measure $\nu$ on
451:         measurable space $(X,\mathfrak{M})$ with respect to a measure
452:         $\mu$, defined as $\nu(E) = \int_{E} f \, \ud \mu, \:\:\: \forall E \in
453:         \mathfrak{M}$ then $\nu \ll \mu$. The converse is given by the
454:         Radon-Nikodym theorem~\cite[p.~36, Theorem
455:           1.40(b)]{Kantorovitz:2003:IntroductionToModernAnalysis}.} $P
456:         \ll \mu$, and pdf $p$ is the
457:         Radon-Nikodym derivative of $P$ w.r.t $\mu$. 
458: 
459:         Now we proceed to the definition of Kullback-Leibler
460:         relative-entropy or KL-entropy for probability measures.
461:         %Definition:Kullback-Leibler Relative-Entropy1
462:         \begin{definition}
463:         \label{Definition:ME:RelativeEntropy_1}
464:         Let $(X,\mathfrak{M})$ be a measurable space. Let $P$ and $R$
465:         be two probability measures on $(X,\mathfrak{M})$. The Kullback-Leibler
466:         relative-entropy, or KL-entropy, of $P$ relative to $R$ is
467:         defined as
468:         \begin{equation}
469:         \label{Equation:ME:RelativeEntropyOfProbabilityMeasures}
470:         I(P\|R) = \left\{ \begin{array}{ll}
471:         \displaystyle{\int_{X} \ln \frac{\ud P}{\ud R} \, \ud P }     &
472:         \:\:\:\:\:\textrm{if}\:\:\:\:\:  P \ll R \enspace, \\ \\
473:           +\infty   & \:\:\:\:\:\textrm{otherwise.}
474:            \end{array} \right.
475:         \end{equation}
476:         \end{definition}%EndDefinition:Kullback-Leiber Relative-Entropy1
477:         The divergence inequality
478:         $I(P\|R) \geq 0$, with $I(P\|R) =0$ if and only if $P=R$, can be
479:         shown in this case too.
480:         KL-entropy~(\ref{Equation:ME:RelativeEntropyOfProbabilityMeasures})
481:         can also be written as 
482:         \begin{equation}
483:         \label{Equation:ME:AnotherFormForRelativeEntropyOfProbabilityMeasures}  
484:         I(P\|R) = \int_{X} \frac{\ud P}{\ud R} \ln \frac{\ud P}{\ud R}
485:         \, \ud R \enspace.
486:         \end{equation}
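        The passage
        from~(\ref{Equation:ME:RelativeEntropyOfProbabilityMeasures})
        to~(\ref{Equation:ME:AnotherFormForRelativeEntropyOfProbabilityMeasures})
        can be sketched as a change of measure: when $P \ll R$, one has
        $\int_{X} f \, \ud P = \int_{X} f \, \frac{\ud P}{\ud R} \, \ud R$
        for every measurable $f$ for which either side is defined, and
        the choice $f = \ln \frac{\ud P}{\ud R}$ gives the second form.
        As a hedged aside, this form also makes the divergence inequality
        visible. Writing $g(t) = t \ln t$ (a symbol introduced only
        here), which is convex on $[0,\infty)$, Jensen's inequality for
        the probability measure $R$ gives, whenever the integral exists,
        \begin{displaymath}
        I(P\|R) = \int_{X} g \left( \frac{\ud P}{\ud R} \right) \ud R
        \geq g \left( \int_{X} \frac{\ud P}{\ud R} \, \ud R \right)
        = g(1) = 0 \enspace.
        \end{displaymath}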
487:         
488:         Let $\mu$ be a $\sigma$-finite measure on $(X,\mathfrak{M})$
489:         such that $P \ll R \ll \mu$. Since $\mu$ is $\sigma$-finite, by the
490:         Radon-Nikodym theorem there exist non-negative 
491:         $\mathfrak{M}$-measurable functions $p: X \rightarrow
492:         \mathbb{R}^{+}$ and $r: X \rightarrow \mathbb{R}^{+}$, unique
493:         $\mu$-{\em a.e}, such that
494:         \begin{equation}
495: 	\label{Equation:ME:DefinitionOfPdf_p}
496:         P(E) = \int_{E} p \, \ud \mu \enspace, \:\:\: \forall E \in \mathfrak{M} \enspace,
497:         \end{equation}
498:         and
499:         \begin{equation}
500: 	\label{Equation:ME:DefinitionOfPdf_r}
501:         R(E) = \int_{E} r \, \ud \mu \enspace, \:\:\: \forall E \in
502:         \mathfrak{M} \enspace.
503:         \end{equation}
504:         The pdfs $p$ and $r$ in (\ref{Equation:ME:DefinitionOfPdf_p})
505:         and (\ref{Equation:ME:DefinitionOfPdf_r}) (they are indeed 
506:         pdfs) are Radon-Nikodym 
507:         derivatives of probability measures $P$ and $R$ with respect
508:         to $\mu$, respectively, i.e., $p =\frac{\ud P}{\ud \mu}$ and 
509:         $r=\frac{\ud R}{\ud \mu}$.
510:         Now one can define relative-entropy of pdf $p$ w.r.t $r$ as
511:         follows\footnote{This follows from the chain rule for
512:         Radon-Nikodym derivative:
513:          \begin{displaymath}
514:            \frac{\ud P}{\ud R} \stackrel{\mathrm{a.e}}{=} \frac{\ud
515:              P}{\ud \mu} {\left( \frac{\ud R}{\ud \mu} \right)}^{-1}\enspace.
516:          \end{displaymath}  
517:         }.
518:         
519:        %Definition:KullbackLeibler Relative-Entropy2
520:         \begin{definition}
521:         \label{Definition:ME:RelativeEntropy_of_pdf}
522:         Let $(X,\mathfrak{M},\mu)$ be a measure space. Let
523:        $\mathfrak{M}$-measurable functions $p,r:X \rightarrow 
524:         {\mathbb{R}}^{+}$ be two pdfs. The KL-entropy of $p$
525:        relative to $r$ 
526:         is defined as
527:         \begin{equation}
528:          \label{Equation:ME:RelativeEntropy_of_pdf} 
529:         I(p\|r) = \int_{X} p(x) \ln \frac{p(x)}{r(x)} \, \ud \mu(x) \enspace,
530:         \end{equation}
531:         provided the integral on the right exists.
532:         \end{definition}%EndDefinition:KullbackLeibler Relative-Entropy2
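        As a worked step, under the assumption $P \ll R \ll \mu$ with
        $\mu$ $\sigma$-finite, the KL-entropy of the pdfs agrees with the
        KL-entropy of the probability measures they induce. The set
        where $r = 0$ has $R$-measure zero and hence $P$-measure zero, so
        $\frac{\ud P}{\ud R} = \frac{p}{r}$ holds $P$-{\em a.e}, and
        using $\ud P = p \, \ud \mu$ we get
        \begin{displaymath}
        I(P\|R) = \int_{X} \ln \frac{\ud P}{\ud R} \, \ud P
        = \int_{X} p \ln \frac{p}{r} \, \ud \mu
        = I(p\|r) \enspace.
        \end{displaymath}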
533: 
534:         As we have mentioned earlier, KL-entropy
535:         (\ref{Equation:ME:RelativeEntropy_of_pdf}) exists if the two 
536:         densities are absolutely continuous with respect to one
537:         another. On the real line the same definition can be written
538:         as
539:         \begin{displaymath}
540:         I(p\|r) = \int_{\mathbb{R}} p(x) \ln \frac{p(x)}{r(x)} \, \ud x \enspace,
541:         \end{displaymath}
542:         which exists if the densities $p(x)$ and $r(x)$ share the same support.
543:         Here and in the sequel, we use the convention
544:         \begin{equation}
545:         \ln 0 = - \infty, \:\:\:\:\:\:\:\:\:\:\: \ln \frac{a}{0} = + \infty\:\:
546:         \mbox{for any}\:\: a \in \mathbb{R}, \:\:\:\:\:\:\:\:\:\:\:
547:         0 \cdot (\pm \infty) = 0.
548:         \end{equation}
549:         
550:         Now we turn to the definition of entropy functional on a
551:         measure space.
552:         The entropy functional in 
553:         (\ref{Equation:ME:ShannonEntropyOf-PM-inducedBy-pdf}) is defined
554:         for a probability measure
555:         that is induced by a pdf. By the Radon-Nikodym theorem, one can
556:         define Shannon entropy for an arbitrary $\mu$-continuous probability measure as follows.
557:         %Definition: Shannon entropy of Probability measure
558:         \begin{definition}
559:          \label{Definition:ME:ShannonEntropy_of_ProbabiliyMeasure} 
560:          Let $(X,\mathfrak{M},\mu)$ be a $\sigma$-finite measure
561:         space. Entropy of any $\mu$-continuous probability measure $P$
562:         ($P \ll \mu$) is defined as
563:         \begin{equation}
564:         \label{Equation:ME:ShannonEntropy_of_ProbabilityMeasure}  
565:         S(P) = - \int_{X} \ln \frac{\ud P}{\ud \mu} \, \ud P  \enspace.
566:         \end{equation}
567:         \end{definition}
568:         Properties of entropy of a probability measure in the
569:         Definition~\ref{Definition:ME:ShannonEntropy_of_ProbabiliyMeasure} are
570:         studied in detail by
571:         Ochs~\cite{Ochs:1976:BasicPropertiesOfTheGeneralizedBoltzmann-Gibbs-ShannonEntropy} 
572:         under the name generalized Boltzmann-Gibbs-Shannon
573:         Entropy. In the literature, one can find notation of the form
574:         $S(P|\mu)$ to represent the entropy functional in
575:         (\ref{Equation:ME:ShannonEntropy_of_ProbabilityMeasure}) viz., the
576:         entropy of a  
577:         probability measure, to stress the role of the measure
578:         $\mu$ (e.g.,~\cite{Ochs:1976:BasicPropertiesOfTheGeneralizedBoltzmann-Gibbs-ShannonEntropy,Athreya:1994:EntropyMaximization}). Since
579:         all the information measures we define are with 
580:         respect to the measure $\mu$ on $(X, \mathfrak{M})$, we omit
581:         $\mu$ in the entropy 
582:         functional notation.
583: 
584:         By assuming $\mu$ to be a probability measure in
585:         Definition~\ref{Definition:ME:ShannonEntropy_of_ProbabiliyMeasure},
586:         one can relate Shannon entropy with Kullback-Leibler entropy
587:         as,
588:         \begin{equation}
589:         \label{Equation:ME:RelationBetweenMeasureTheoreticEntropyAndKullback} 
590:         S(P) = - I(P\|\mu) \enspace.
591:         \end{equation}
592: 	Note that when $\mu$ is not a probability measure, the
593:         divergence inequality $I(P\|\mu) \geq 0$ need not be
594:         satisfied.
595: 
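	As a simple illustration (the particular space is our own
        choice, made only for this example), let $X = [0,2]$ with
        $\mu$ the Lebesgue measure, so that $\mu(X) = 2$, and let $P$
        be the uniform probability measure on $X$. Then $\frac{\ud
        P}{\ud \mu} = \frac{1}{2}$ on $X$ and, writing $I(P\|\mu)$ for
        the integral $\int_{X} \ln \frac{\ud P}{\ud \mu} \, \ud P$,
        \begin{displaymath}
        I(P\|\mu) = \ln \frac{1}{2} = - \ln 2 < 0 \enspace,
        \end{displaymath}
        so that the divergence inequality indeed fails for this
        reference measure, while $S(P) = \ln 2 > 0$.
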
	We now remark on the
        $\sigma$-finiteness of the measure $\mu$. In the definition of the
        entropy functional we assumed that $\mu$ is a $\sigma$-finite
        measure. This condition was used by
        Ochs~\cite{Ochs:1976:BasicPropertiesOfTheGeneralizedBoltzmann-Gibbs-ShannonEntropy}, 
        Csisz\'{a}r~\cite{Csiszar:1969:OnGeneralizedEntropy}
        and
        Rosenblatt-Roth~\cite{Rosenblatt-Roth:1964:TheConceptOfEntropyInProbabilityTheory} 
        to tailor the measure-theoretic definitions. For all practical
        purposes and for most applications, this assumption is
        satisfied. (See
        \cite{Ochs:1976:BasicPropertiesOfTheGeneralizedBoltzmann-Gibbs-ShannonEntropy}
        for a discussion of the physical interpretation of a measurable space
        $(X,\mathfrak{M})$ with a $\sigma$-finite measure $\mu$ for
        entropy measures of the
        form~(\ref{Equation:ME:ShannonEntropy_of_ProbabilityMeasure}),
        and of the relaxation of the $\sigma$-finiteness 
        condition.) By relaxing this condition, more universal
        definitions of entropy functionals are studied
        by Masani~\cite{Masani:1992:TheMeasureTheoreticAspectsOfEntropy_Part_1,Masani:1992:TheMeasureTheoreticAspectsOfEntropy_Part_2}.
616: 
617: %        In this thesis we will not go into those details.  
618: 
619:   %---------------------------------------------------
620:   \subsection{Interpretation of Discrete and Continuous Entropies in
621:   terms of KL-entropy}
622:   \label{SubSection:ME:MeasureTheoreticCasesinDiscrete}	
623: 	\noindent
        First, let us consider the discrete case of $(X, \mathfrak{M},
	\mu)$, where $X= \{x_{1}, \ldots, x_{n} \} $, $\mathfrak{M} =
	2^{X}$ and $\mu$ is a cardinality probability measure. Let $P$
	be any probability measure on $(X, \mathfrak{M})$. Then $\mu$
  	and $P$ can be specified as follows:
629:         \begin{displaymath}
630:         \mu \mbox{:} \:\:\: {\mu}_{k} = \mu(\{x_{k}\})  \geq 0, \:\:k = 1,
631:         \ldots, n, \:\:\:\sum_{k=1}^{n} 
632:         \mu_{k} =1 \enspace, \:\:\: %\mbox{and}
633:         \end{displaymath}
634: 	and
635:         \begin{displaymath}
636:         P \mbox{:}\:\:\:  P_{k} = P(\{x_{k}\}) \geq 0 , \:\:k =1,
637:         \ldots, n, \:\:\: \sum_{k=1}^{n} P_{k} =1 \enspace.
638:         \end{displaymath}
        The probability measure $P$ is absolutely
        continuous with respect to the probability measure $\mu$ if
        $\mu_{k} =0$ implies $P_{k} =0$ for every $k=1,\ldots, n$. The
        corresponding Radon-Nikodym 
        derivative of $P$ with respect to $\mu$ is given by
644:         \begin{displaymath}
                \frac{\ud P}{\ud \mu}(x_{k}) = \frac{P_{k}}{\mu_{k}}, \:\:\:
                k = 1, \ldots, n \enspace.
647:         \end{displaymath}
648:         The measure-theoretic entropy $S(P)$
649:         (\ref{Equation:ME:ShannonEntropy_of_ProbabilityMeasure}),
650:         in this case, can be written as 
651:         \begin{displaymath}
652:         S(P) = - \sum_{k=1}^{n} P_{k}\ln \frac{P_{k}}{\mu_{k}} =
653:         \sum_{k=1}^{n} P_{k} \ln \mu_{k} - \sum_{k=1}^{n} P_{k} \ln
654:         P_{k} \enspace.
655:         \end{displaymath}
        If we take the reference
        probability measure $\mu$ to be the uniform probability
        distribution on the set $X$, i.e., $\mu_{k} = \frac{1}{n}$, we obtain
659:         \begin{equation}
660:         \label{Equation:ME:RelativionBetweenMeasureTheoreticAndDiscreteEntropies}
661:         S(P) = S_{n}(P) - \ln n \enspace,
662:         \end{equation}
	where $S_{n}(P)$ denotes the Shannon entropy of the pmf $P =
        (P_{1}, \ldots, P_{n})$ and $S(P)$ denotes
	the 
        measure-theoretic entropy in the discrete case.
667: 
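        For instance (a toy example of our own, not taken from the
        references), let $n = 2$ and $P = (\frac{1}{4},
        \frac{3}{4})$. Then
        \begin{displaymath}
        S_{2}(P) = \frac{1}{4} \ln 4 + \frac{3}{4} \ln \frac{4}{3} =
        \ln 4 - \frac{3}{4} \ln 3 \approx 0.562 \enspace,
        \end{displaymath}
        and, with the uniform reference measure $\mu_{1} = \mu_{2} =
        \frac{1}{2}$,
        \begin{displaymath}
        S(P) = S_{2}(P) - \ln 2 = \ln 2 - \frac{3}{4} \ln 3 \approx
        -0.131 \enspace.
        \end{displaymath}
        The measure-theoretic entropy is negative here, as expected
        from
        (\ref{Equation:ME:RelationBetweenMeasureTheoreticEntropyAndKullback})
        together with $I(P\|\mu) \geq 0$ for a probability measure
        $\mu$.
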
	Now, let us consider the continuous case of
	$(X,\mathfrak{M},\mu)$, where $X = [a,b] \subset \mathbb{R}$,
	$\mathfrak{M}$ is the $\sigma$-algebra of Lebesgue measurable subsets of $[a,b]$,
	and $\mu$ is the Lebesgue probability measure. In this case
	$\mu$ and $P$ can be specified as follows:
673:         \begin{displaymath}
674:         \mu \mbox{:}\:\:\: \mu(x) \geq 0 , x \in
675: 	[a,b], \ni \mu(E) = \int_{E} \mu(x) \, \ud x, \forall E \in
676: 	\mathfrak{M}, \: \int_{a}^{b} \mu(x)\, \ud x  =1 \enspace, 
677:         \end{displaymath}
678: 	and
679:         \begin{displaymath}
680:         P \mbox{:}\:\:\:  P(x) \geq 0 , x \in
681: 	[a,b], \ni  P(E) = \int_{E} P(x) \, \ud x, \forall E \in \mathfrak{M}, \:\int_{a}^{b} P(x)\, \ud x =1 \enspace.
682:         \end{displaymath}
683: 	Note the abuse of notation in the above specification of
684: 	probability measures $\mu$ and $P$, where we have used the same
685: 	symbols for both measures and pdfs. 
686: 
687: 
688:         The probability measure $P$ is absolutely continuous with
689:         respect to the probability measure $\mu$, if $\mu(x)=0$ on a
690:         set of a positive Lebesgue measure implies
691:         that $P(x)=0$ on the same 
692:         set. The Radon-Nikodym derivative of the probability measure
693:         $P$ with respect to the probability measure $\mu$ will be 
694:         \begin{displaymath}
695:                 \frac{\ud P}{\ud \mu}(x) = \frac{P(x)}{\mu(x)} \enspace.
696:         \end{displaymath}
697:         Then the measure-theoretic entropy $S(P)$ in this case
698: 	can be written as 
699:         \begin{displaymath}
700:         S(P) = - \int_{a}^{b} P(x) \ln \frac{P(x)}{\mu(x)} \, \ud x
701:         \enspace. 
702:         \end{displaymath}
        If we take the reference probability measure $\mu$ to be the uniform
	distribution, i.e., $\mu(x) = \frac{1}{b-a}$, $x \in [a,b]$,
        then we obtain
        \begin{equation}
        \label{Equation:ME:RelativionBetweenMeasureTheoreticAndContinuousEntropies}
        S(P) = S_{[a,b]}(P) - \ln (b-a) \enspace,
        \end{equation}
710: 	where $S_{[a,b]}(P)$ denotes the Shannon entropy of pdf
711:         $P(x)$, $x \in [a,b]$ (\ref{Equation:ME:ContinuousEntropy})
712: 	and $S(P)$ denotes the measure-theoretic entropy in the
713: 	continuous case. 
714: 
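        As an illustration (the density below is our own choice, used
        only as an example), consider the triangular pdf $P(x) =
        \frac{2(x-a)}{(b-a)^{2}}$, $x \in [a,b]$. With the
        substitution $t = \frac{x-a}{b-a}$ we get
        \begin{displaymath}
        S_{[a,b]}(P) = \ln (b-a) - \int_{0}^{1} 2t \ln (2t) \, \ud t =
        \ln (b-a) + \frac{1}{2} - \ln 2 \enspace,
        \end{displaymath}
        since $\int_{0}^{1} 2t \ln t \, \ud t = -\frac{1}{2}$. With
        the uniform reference measure on $[a,b]$ this gives
        \begin{displaymath}
        S(P) = S_{[a,b]}(P) - \ln (b-a) = \frac{1}{2} - \ln 2 \approx
        -0.193 \enspace,
        \end{displaymath}
        which does not depend on $a$ and $b$, whereas $S_{[a,b]}(P)$
        changes with the length of the interval.
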
        Hence, one can conclude that the
        measure-theoretic entropy $S(P)$, defined for a probability measure $P$ on
        the measure space $(X,\mathfrak{M},\mu)$, is equal to the Shannon 
        entropy in both the discrete and the continuous case, up to an
        additive constant, when the reference measure $\mu$ is chosen as a uniform
        probability distribution.
	On the other hand, one can see that the measure-theoretic KL-entropy
        reduces exactly to its discrete and continuous definitions in the
        respective cases.
724:         
        Further, from
        (\ref{Equation:ME:RelationBetweenMeasureTheoreticEntropyAndKullback}) and
        (\ref{Equation:ME:RelativionBetweenMeasureTheoreticAndDiscreteEntropies}),
        we can write Shannon entropy in terms of Kullback-Leibler
        relative-entropy as
730:         \begin{equation}
731:         S_{n}(P) = \ln n - I(P \| \mu) \enspace.
732:         \end{equation}
        Thus, Shannon entropy appears as being (up to an additive
        constant) the variation of information when we pass from the
        initial uniform probability distribution to the new probability
        distribution given by $P_{k} \geq 0$, $\sum_{k=1}^{n} P_{k}
        =1$, as any such probability distribution is obviously
        absolutely continuous with respect to the uniform discrete
        probability distribution.
        Similarly, by
        (\ref{Equation:ME:RelationBetweenMeasureTheoreticEntropyAndKullback})
        and
        (\ref{Equation:ME:RelativionBetweenMeasureTheoreticAndContinuousEntropies}),
        we can write the Boltzmann H-function in terms of relative-entropy 
        as
        \begin{equation}
        S_{[a,b]}(P) = \ln (b-a) - I(P \| \mu) \enspace.
        \end{equation}
        Therefore, the continuous entropy or Boltzmann H-function
        $S_{[a,b]}(P)$ may be interpreted as being (up to an additive
        constant) the variation of information when we pass from the
        initial uniform probability distribution on the interval
        $[a,b]$ to the new probability measure defined by the
        probability density function $P(x)$ (any such 
        probability measure is absolutely continuous with respect to
        the uniform probability distribution on the interval
        $[a,b]$).
760: 
	Thus, KL-entropy equips one with a unified interpretation of both
	discrete entropy and continuous entropy.
        One can utilize Shannon entropy in the continuous case,
        as well as Shannon entropy in the discrete
        case, both being interpreted as the variation of information
        when we pass from the initial uniform distribution to the
        corresponding probability measure.

        Also,
        since the measure-theoretic entropy is equal to the discrete and
        continuous entropy up to an additive constant, ME prescriptions
        of measure-theoretic Shannon entropy are consistent with
        both the discrete case and the continuous case.
774: 
775: %=======================Section:================================
776: \section{Measure-Theoretic Definitions of Generalized Information
777:   Measures}
778: \label{Section:ME:MeasureTheoreticDefinitionsOfGeneralizedInformationMeasures}
779:         \noindent
780: %        In this section we extend the measure-theoretic definitions to
781: %        generalized information measures discussed in
782: %        Chapter~\ref{Chapter:KN}.
	We begin with a brief note on the notation and assumptions
        used. 
        We define all the information measures 
        on the measurable space $(X,\mathfrak{M})$, and the default reference
        measure is $\mu$ unless otherwise stated. 
        To avoid clumsy formulations, we do not
        distinguish between functions that differ only on a $\mu$-null
        set; accordingly, equations between
        $\mathfrak{M}$-measurable functions on $X$ are to be
        understood as holding $\mu$-almost everywhere ($\mu$-a.e.\ or
        a.e.).
        Further, we assume that all the quantities of interest
        exist and assume, implicitly, the $\sigma$-finiteness of $\mu$ and the
        $\mu$-continuity of probability measures whenever
        required. Since these assumptions occur repeatedly in various
        definitions and formulations, they will not be mentioned in
        the sequel.
        With these assumptions we do not distinguish between 
        an information measure of a pdf $p$ and that of the corresponding probability
        measure $P$. Hence, when we give definitions of
        information measures for pdfs, we also use the corresponding
        definitions for probability measures whenever it is
        convenient or required, with the understanding that $P(E) = \int_{E} p\,
        \ud \mu $; conversely, $p =
        \frac{\ud P}{\ud \mu}$ exists by the Radon-Nikodym theorem. In both cases we have $P \ll \mu$.
808: 
        First we consider the R\'{e}nyi generalizations.
        The measure-theoretic definition of R\'{e}nyi entropy can be given
        as follows.
812:         %DEFINITION: Measure-theoretic definition of Renyi entropy
813:         \begin{definition}
814:         \label{Definition:ME:Measure-TheoreticRenyiEntropy}
815:         R\'{e}nyi entropy 
816:         of a pdf $p:X \rightarrow {\mathbb{R}}^{+}$ on a measure space
817:         $(X,\mathfrak{M},\mu)$ is defined as 
818:         \begin{equation}
819:         \label{Equation:ME:RenyiEntropyOf-pdf}  
820:         S_{\alpha}(p) = \frac{1}{1-\alpha} \ln 
821:         \int_{X}p(x)^{\alpha}\, \ud \mu(x) \enspace, 
822:         \end{equation}
        provided the integral on the right exists and $\alpha \in
        \mathbb{R}$, $\alpha > 0$, $\alpha \neq 1$.
825:         \end{definition}%EndDEFINITION: Measure-theoretic definition of RenyiEntropy
826:         The same can be defined for any $\mu$-continuous probability
827:         measure $P$ as
828:         \begin{equation}
829:         \label{Equation:ME:RenyiEntropyOf-PM}
830:           S_{\alpha}(P) = \frac{1}{1-\alpha} \ln  \int_{X}
831:           {\left( \frac{\ud P}{\ud \mu} \right)}^{\alpha -1} \, \ud P \enspace.
832:         \end{equation}  
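        That (\ref{Equation:ME:RenyiEntropyOf-pdf}) and
        (\ref{Equation:ME:RenyiEntropyOf-PM}) agree can be checked in
        one line: with $p = \frac{\ud P}{\ud \mu}$,
        \begin{displaymath}
        \int_{X} {\left( \frac{\ud P}{\ud \mu} \right)}^{\alpha -1} \,
        \ud P = \int_{X} p(x)^{\alpha -1} p(x) \, \ud \mu(x) =
        \int_{X} p(x)^{\alpha} \, \ud \mu(x) \enspace.
        \end{displaymath}
        As a further check, sketched here under the assumption that
        differentiation under the integral sign is permitted,
        l'H\^{o}pital's rule gives
        \begin{displaymath}
        \lim_{\alpha \to 1} S_{\alpha}(p) = \lim_{\alpha \to 1}
        \left( - \frac{\int_{X} p(x)^{\alpha} \ln p(x) \, \ud
        \mu(x)}{\int_{X} p(x)^{\alpha} \, \ud \mu(x)} \right) = -
        \int_{X} p(x) \ln p(x) \, \ud \mu(x) \enspace,
        \end{displaymath}
        which is the entropy
        (\ref{Equation:ME:ShannonEntropy_of_ProbabilityMeasure})
        written in terms of the pdf $p$.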
833:         On the other hand, R\'{e}nyi relative-entropy can be defined as
834:         follows.
        %DEFINITION: Measure-theoretic definition of Renyi relative entropy
836:         \begin{definition}
837:         Let $p,r:X \rightarrow
838:         {\mathbb{R}}^{+}$ be two pdfs on measure space $(X,\mathfrak{M},\mu)$. The
839:         R\'{e}nyi relative-entropy of $p$ relative to $r$ 
840:         is defined as
841:         \begin{equation}
842:         \label{Equation:ME:RenyiRelativeEntropyOf-pdf}              
843:         I_{\alpha}(p\|r) = \frac{1}{\alpha -1} \ln \int_{X}
844:         \frac{p(x)^{\alpha}}{r(x)^{\alpha -1}} \, \ud \mu(x) \enspace,
845:         \end{equation}
        provided the integral on the right exists and $\alpha \in
        \mathbb{R}$, $\alpha > 0$, $\alpha \neq 1$.
        \end{definition}%EndDEFINITION: Measure-theoretic definition of Renyi
         %relative entropy
850:         The same can be written in terms of probability measures as,
851:         \begin{eqnarray}
852: 	\label{Equation:ME:RenyiRelativeEntropyOf-PMs}
853:           I_{\alpha}(P\|R) &=& \frac{1}{\alpha -1} \ln   \int_{X}
854:           {\left( \frac{\ud P}{\ud R} \right)}^{\alpha -1} \, \ud P
855:           \nonumber \\
856:           &=& \frac{1}{\alpha -1} \ln   \int_{X}
857:           {\left( \frac{\ud P}{\ud R} \right)}^{\alpha} \, \ud R
858:           \enspace,
859:         \end{eqnarray}
860: 	whenever $P \ll R$; $I_{\alpha}(P \|R) = + \infty$, otherwise.
861: 	 Further if we assume $\mu$ in
862:         (\ref{Equation:ME:RenyiEntropyOf-PM}) is a probability measure
863:         then 
864: 	\begin{equation}
865: 	\label{Equation:ME:Renyi_EntropyandRelativeEntropy}
	S_{\alpha}(P) = - I_{\alpha}(P\|\mu) \enspace.
867: 	\end{equation}
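        To see what the measure-theoretic R\'{e}nyi entropy gives in
        the simplest setting, consider again the discrete space $X =
        \{x_{1}, \ldots, x_{n}\}$ with the uniform reference measure
        $\mu_{k} = \frac{1}{n}$ (the symbol $S_{\alpha}^{(n)}$ below
        is notation introduced only for this illustration). Then
        $\frac{\ud P}{\ud \mu}(x_{k}) = n P_{k}$ and
        (\ref{Equation:ME:RenyiEntropyOf-PM}) becomes
        \begin{displaymath}
        S_{\alpha}(P) = \frac{1}{1-\alpha} \ln \sum_{k=1}^{n}
        {(n P_{k})}^{\alpha -1} P_{k} = \frac{1}{1-\alpha} \ln \left(
        n^{\alpha -1} \sum_{k=1}^{n} P_{k}^{\alpha} \right) =
        S_{\alpha}^{(n)}(P) - \ln n \enspace,
        \end{displaymath}
        where $S_{\alpha}^{(n)}(P) = \frac{1}{1-\alpha} \ln
        \sum_{k=1}^{n} P_{k}^{\alpha}$ is the R\'{e}nyi entropy of the
        pmf $P$. This has the same additive form as
        (\ref{Equation:ME:RelativionBetweenMeasureTheoreticAndDiscreteEntropies}).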
868:         
869:         Tsallis entropy in the measure theoretic setting can be defined as
870:         follows.   
871:         %DEFINITION: Measure-theoretic definition of Tsallis entropy
872:         \begin{definition}
873:         \label{Definition:ME:Measure-TheoreticTsallisEntropy}
874:         Tsallis entropy of a pdf $p$ on $(X,\mathfrak{M},\mu)$ is
875:         defined as 
876:         \begin{equation}
877:         \label{Equation:ME:TsallisEntropyOf-pdf}  
878:         S_{q}(p) = \int_{X} p(x) \ln_{q} \frac{1}{p(x)}\, \ud \mu(x) =
879:         \frac{1 - \int_{X} p(x)^{q}\, \ud \mu(x) }{q-1}
880:         \enspace, 
881:         \end{equation}
        provided the integral on the right exists and $q \in
        \mathbb{R}$, $q > 0$, $q \neq 1$.
884:         \end{definition}%EndDEFINITION: Measure-theoretic definition
885:         %of TsallisEntropy
886: 
	Here $\ln_{q}$ in
            (\ref{Equation:ME:TsallisEntropyOf-pdf}) is referred to as the
            $q$-logarithm and is defined as $\ln_{q} x = \frac{\displaystyle
            x^{1-q} -1}{\displaystyle 1-q} 
        \:\:\: (x >0, q \in {\mathbb{R}}, q \neq 1)$.
        The same can be defined for a $\mu$-continuous probability
        measure $P$, and can be written as 
894:         \begin{equation}
895: 	\label{Equation:ME:TsallisEntropyOf-PM}
896:            S_{q}(P) = \int_{X} \ln_{q}  {\left(\frac{\ud P}{\ud \mu}\right)}^{-1}
897:           \, \ud P \enspace.
898:         \end{equation}  
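        The equality of the two expressions in
        (\ref{Equation:ME:TsallisEntropyOf-pdf}), and their agreement
        with (\ref{Equation:ME:TsallisEntropyOf-PM}), follows directly
        from the definition of the $q$-logarithm. With $p = \frac{\ud
        P}{\ud \mu}$ we have $\ln_{q} \frac{1}{p} = \frac{p^{q-1} -
        1}{1-q}$, so that
        \begin{displaymath}
        \int_{X} \ln_{q} {\left(\frac{\ud P}{\ud \mu}\right)}^{-1} \,
        \ud P = \int_{X} p(x) \ln_{q} \frac{1}{p(x)} \, \ud \mu(x) =
        \frac{\int_{X} p(x)^{q} \, \ud \mu(x) - \int_{X} p(x) \, \ud
        \mu(x)}{1-q} = \frac{1 - \int_{X} p(x)^{q} \, \ud \mu(x)}{q-1}
        \enspace,
        \end{displaymath}
        where we used $\int_{X} p \, \ud \mu = 1$. Since $\ln_{q} x
        \to \ln x$ as $q \to 1$, the integrand formally reduces to
        that of
        (\ref{Equation:ME:ShannonEntropy_of_ProbabilityMeasure}) in
        this limit.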
899:         
900:         The definition of Tsallis relative-entropy is given below. 
901:         %DEFINITION: Measure-theoretic definition of Tsallis relative entropy
902:         \begin{definition}
903:         Let $(X,\mathfrak{M},\mu)$ be a measure space. Let $p,r:X \rightarrow
904:         {\mathbb{R}}^{+}$ be two probability density functions. The
905:         Tsallis relative-entropy of $p$ relative to $r$ 
906:         is defined as
907:         \begin{equation}
908:         \label{Equation:ME:TsallisRelativeEntropyOf-pdf}            
909:         I_{q}(p\|r) = - \int_{X} p(x) \ln_{q} \frac{r(x)}{p(x)}\, \ud
910:         \mu(x)    = \frac{\int_{X} \frac{p(x)^{q}}{r(x)^{q-1}}\,
          \ud \mu(x) -1 }{q-1} \enspace,
912:         \end{equation}
        provided the integral on the right exists and $q \in
        \mathbb{R}$, $q > 0$, $q \neq 1$.
915:         \end{definition}%EndDEFINITION: Measure-theoretic definition of Tsallis
916:          %relative entropy
917:         The same can be written for two probability measures $P$ and
918:         $R$, as
919:         \begin{equation}
920:           I_{q}(P\|R)= - \int_{X} \ln_{q} {\left(\frac{\ud P}{\ud R}\right)}^{-1}\,
921:           \ud P \enspace,
922:         \end{equation}
923: 	whenever $P \ll R$; $I_{q}(P \|R) = + \infty$, otherwise.
924: 	If $\mu$ in
925:         (\ref{Equation:ME:TsallisEntropyOf-PM}) is a probability measure
926:         then 
927: 	\begin{equation}
928: 	\label{Equation:ME:Tsallis_EntropyandRelativeEntropy}
	S_{q}(P) = - I_{q}(P\|\mu) \enspace.
930: 	\end{equation}
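        In contrast with the classical case, the measure-theoretic
        Tsallis entropy with a uniform reference measure does not
        differ from its discrete counterpart by a mere additive
        constant. As a sketch (the symbol $S_{q}^{(n)}$ is notation
        introduced only for this illustration), let $X = \{x_{1},
        \ldots, x_{n}\}$ and $\mu_{k} = \frac{1}{n}$, so that
        $\frac{\ud P}{\ud \mu}(x_{k}) = n P_{k}$. Then, by
        (\ref{Equation:ME:TsallisEntropyOf-pdf}),
        \begin{displaymath}
        S_{q}(P) = \frac{1 - \sum_{k=1}^{n} {(n P_{k})}^{q}
        \frac{1}{n}}{q-1} = \frac{1 - n^{q-1} \sum_{k=1}^{n}
        P_{k}^{q}}{q-1} \enspace.
        \end{displaymath}
        Writing $S_{q}^{(n)}(P) = \frac{1 - \sum_{k=1}^{n}
        P_{k}^{q}}{q-1}$ for the Tsallis entropy of the pmf $P$ and
        substituting $\sum_{k=1}^{n} P_{k}^{q} = 1 - (q-1)
        S_{q}^{(n)}(P)$, we obtain
        \begin{displaymath}
        S_{q}(P) = n^{q-1} S_{q}^{(n)}(P) + \ln_{q} \frac{1}{n}
        \enspace,
        \end{displaymath}
        which reduces to
        (\ref{Equation:ME:RelativionBetweenMeasureTheoreticAndDiscreteEntropies})
        as $q \to 1$.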
931: 
932: %        We discuss the relations between generalized entropic
933: %        functionals in measure-theoretic case to discrete or continuous
934: %        case in
935: %        \S~\ref{Section:ME:MeasureTheoreticDefinitions_Revisited}. The
936: %        reason for this is the various relations discussed for
937: %        classical information measures cannot be extended to the
938: %        generalized case. As we are going to see contrary to the
939: %        classical case, where consistency of ME-prescriptions of measure-theoretic
940: %        definitions with discrete or continuous case can be argued
941: %        without invoking ME-prescriptions, consistent arguments for measure-theoretic
942: %        generalized entropy functionals involve explicitly
943: %        ME-prescriptions. Hence it is important for us to discuss the
944: %        ME-prescriptions in generalized case. First we briefly review
945: %        the ME-prescriptions in the classical case.
946: 
947: %=========================Section:=====================================
948: \section{Maximum Entropy and Canonical Distributions}
949: \label{Section:ME:MaximumEntropyAndCanonicalDistributions}
950:         \noindent
        For all the ME prescriptions of classical information measures
        we consider a set of constraints of the form
953:         \begin{equation}
954:         \label{Equation:ME:ExpectationConstraints}
955:         \int_{X} u_{m} \, \ud P = \int_{X} u_{m}(x) p(x) \, \ud \mu(x) =
956:         \langle u_{m} \rangle \enspace, \:\:\:m = 
957:         1, \ldots , M \enspace,
958:         \end{equation}
959:         with respect to $\mathfrak{M}$-measurable functions $u_{m}: X
        \rightarrow \mathbb{R}, \:\: m = 1, \ldots, M$, whose expectation
        values $\langle u_{m} \rangle, \, m=1,\ldots, M$ are (assumed
962:         to be) {\it a priori} known, along with the normalizing
963:         constraint $\int_{X} \, \ud P =1$.
964:         (From now on we assume that any set of constraints on
965:         probability distributions implicitly includes this
966:         constraint, which will not be mentioned in the sequel.)
967: 
968: %-----Note on the notation for next chapter...	
969: %        A note on the notation: To avoid proliferation of symbols we
970: %        use the same notation for the minimum or maximum entropy
971: %        distributions and Lagrange multipliers in the various case;
972: %        the correspondence should be clear from the context. In the
973: %        maximum entropy case use $Z$ for the partition function and in
974: %        minimum entropy case we $\widehat{Z}$. 
975: 
976:         To maximize the
977:         entropy~(\ref{Equation:ME:ShannonEntropyOf-pdf})
978:         with respect
979:         to the constraints~(\ref{Equation:ME:ExpectationConstraints}), the
980:         solution is calculated via the Lagrangian:
981:         {\setlength\arraycolsep{0pt}
982:         \begin{eqnarray}
983:         \label{Equation:ME:LagranginForMaximumEntropy}
984:         \mathcal{L}(x, \lambda, \beta) = - \int_{X} \ln \frac{\ud
985:         P}{\ud \mu}(x)&& \, \ud P(x) - \lambda \left(\int_{X}\, \ud P(x) - 1
986:         \right) \nonumber \\
987:         && - \sum_{m=1}^{M} \beta_{m} \left(\int_{X} u_{m}(x)\, \ud P(x) -
988:         \langle u_{m} \rangle \right) \enspace, 
989:         \end{eqnarray}}
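        where $\lambda$ and $\beta_{m}$, $m=1,\ldots,M$, are Lagrange
        parameters (we use the notation $\beta = (\beta_{1}, \ldots, \beta_{M})$).
        Setting the variation of $\mathcal{L}$ with respect to $P$ to
        zero, and absorbing the resulting additive constant into
        $\lambda$, gives the stationarity condition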
994:         \begin{displaymath}
995:         \ln \frac{\ud P}{\ud \mu}(x) + \lambda + \sum_{m=1}^{M}
996:         \beta_{m} u_{m}(x) = 0 \enspace.
997:         \end{displaymath}
        Solving for $P$ and fixing $\lambda$ through the normalizing constraint, we obtain
999:         \begin{equation}
1000:         \ud P(x, \beta) = \exp \left( -\ln Z(\beta) - \sum_{m=1}^{M}
1001:         \beta_{m} u_{m}(x)\right) \ud \mu(x)
1002:         \end{equation}
1003:         or
1004:         \begin{equation}
1005:         p(x) = \frac{\ud P}{\ud \mu} (x) = \frac{e^{ -
1006:             \sum_{m=1}^{M} \beta_{m} 
1007:         u_{m}(x)}}{Z(\beta)}  \enspace,
1008:         \end{equation}
1009:         where the partition function $Z(\beta)$ is written as
1010:         \begin{equation}
1011:         \label{Equation:PartitionFunctionForMaximumEntropy}
1012:         Z(\beta) = \int_{X} \exp \left( - \sum_{m=1}^{M} \beta_{m}
1013:         u_{m}(x)\right) \ud \mu(x) \enspace.
1014:         \end{equation}
1015:         The Lagrange parameters $\beta_{m},\: m = 1, \ldots M$ are
1016:         specified by the set of
1017:         constraints (\ref{Equation:ME:ExpectationConstraints}).
1018: 
1019:         The maximum entropy, denoted by $S$, can be calculated as
1020:         \begin{equation}
1021:         \label{Equation:ME:MaximumEntropy}
1022:         S = \ln Z + \sum_{m=1}^{M} \beta_{m} \langle u_{m} \rangle \enspace.
1023:         \end{equation}
1024: 
        The Lagrange parameters $\beta_{m},\: m = 1, \ldots, M$, are
        obtained as the unique solution (if it exists) of the
        following system of nonlinear equations:
1028:         \begin{equation}
1029:         \label{Equation:ME:MaximumEntropy_ThermodynamicEquation_1}
1030:           \frac{\partial}{\partial \beta_{m}} \ln Z(\beta) = - \langle
1031:         u_{m} \rangle \enspace, \:\:\:m = 1, \ldots M \enspace. 
1032:         \end{equation}
1033:         We also have
1034:         \begin{equation}
1035:         \label{Equation:ME:MaximumEntropy_ThermodynamicEquation_2}      
        \frac{\partial S}{\partial \langle u_{m} \rangle} = 
        \beta_{m} \enspace, \:\:\: m = 1, \ldots, M \enspace. 
1038:         \end{equation}
        Equations
        (\ref{Equation:ME:MaximumEntropy_ThermodynamicEquation_1}) and 
        (\ref{Equation:ME:MaximumEntropy_ThermodynamicEquation_2}) are
        referred to as the thermodynamic equations.
1043: 
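        As a worked example (the space and the constraint are our own
        choice, used only for illustration), let $X = [0, \infty)$
        with $\mu$ the Lebesgue measure, and consider the single
        constraint $u(x) = x$ with prescribed mean $\langle u \rangle
        = m > 0$. For $\beta > 0$ the partition function
        (\ref{Equation:PartitionFunctionForMaximumEntropy}) is
        \begin{displaymath}
        Z(\beta) = \int_{0}^{\infty} e^{- \beta x} \, \ud x =
        \frac{1}{\beta} \enspace,
        \end{displaymath}
        and
        (\ref{Equation:ME:MaximumEntropy_ThermodynamicEquation_1})
        reads $- \frac{1}{\beta} = - m$, i.e., $\beta =
        \frac{1}{m}$. The maximum entropy distribution is therefore
        the exponential density $p(x) = \frac{1}{m} e^{-x/m}$, and
        (\ref{Equation:ME:MaximumEntropy}) gives
        \begin{displaymath}
        S = \ln Z + \beta m = \ln m + 1 \enspace.
        \end{displaymath}
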
1044: %================================Section:===================================
1045: \section{ME prescription for Tsallis Entropy}
1046: \label{Section:ME:ME-prescriptionForTsallisEntropy}
1047:         \noindent
         The great success of Tsallis entropy is
         attributed to the power-law distributions one can derive as
         maximum entropy distributions by maximizing Tsallis entropy
         with respect to the moment constraints. But there are
         subtleties involved in the choice of constraints
         for ME prescriptions of these
         entropy functionals. These subtleties are still a subject of
         major discussion in the nonextensive formalism~\cite{FerriMartinezPlastino:2005:TheRoleOfConstraintsInTsallisNonextensiveTreatmentRevisited,AbeBagci:2005:NecessityOfqExpectation,WadaScarfone:2005:ConnectionsBetweenTsallisFormalismEtc}. 
1056:          
        In the nonextensive formalism, maximum entropy distributions
        are derived with respect to constraints that
        differ from the constraints (\ref{Equation:ME:ExpectationConstraints})
        used for classical information measures. Constraints of the
        form~(\ref{Equation:ME:ExpectationConstraints}) are
        inadequate for handling the serious mathematical difficulties
        that arise
        (see~\cite{TsallisMendesPlastino:1998:TheRoleOfConstraints}). To
        handle these difficulties, constraints of the form
1066:         \begin{equation}
1067:         \label{Equation:ME:Normalized-q-ExpectationConstraints}  
1068:         \frac{\int_{X} u_{m}(x) p(x)^{q} \, \ud \mu(x)}{\int_{X}
1069:           p(x)^{q}\, \ud \mu(x)} = {\langle\langle u_{m} \rangle\rangle}_{q} \enspace, m = 
1070:         1, \ldots , M
1071:         \end{equation}
	are proposed.
        The left-hand side of
        (\ref{Equation:ME:Normalized-q-ExpectationConstraints}) can
          be considered as the expectation of $u_{m}$ with respect to the
          modified probability measure $P_{(q)}$ (it is indeed a
          probability measure) defined as
1077:           \begin{equation}
1078:             P_{(q)}(E) = {\left( \int_{X} p(x)^{q} \, \ud \mu
1079:               \right)}^{-1} \int_{E} p(x)^{q} \, \ud \mu \enspace.
1080:           \end{equation}
          The measure $ P_{(q)}$ is known as the escort probability
          measure. 
1083: 
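          That $P_{(q)}$ is a probability measure can be seen
          directly: it is non-negative and countably additive as an
          integral of the non-negative function $p^{q}$, and
          \begin{displaymath}
          P_{(q)}(X) = {\left( \int_{X} p(x)^{q} \, \ud \mu
            \right)}^{-1} \int_{X} p(x)^{q} \, \ud \mu = 1 \enspace.
          \end{displaymath}
          As a small numerical illustration (our own toy example),
          let $X = \{x_{1}, x_{2}\}$ with $\mu$ the counting measure,
          $p = (\frac{1}{4}, \frac{3}{4})$ and $q = 2$. Then $p^{q} =
          (\frac{1}{16}, \frac{9}{16})$, $\int_{X} p^{q} \, \ud \mu =
          \frac{10}{16}$, and the escort probabilities are
          $(\frac{1}{10}, \frac{9}{10})$, so that
          ${\langle\langle u \rangle\rangle}_{q} = \frac{1}{10}
          u(x_{1}) + \frac{9}{10} u(x_{2})$.
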
1084:           The variational principle for Tsallis entropy maximization
1085:           with respect to
1086:           constraints~(\ref{Equation:ME:Normalized-q-ExpectationConstraints})
1087:           can be written as
1088:           \begin{eqnarray}
1089:           \label{Equation:ME:Lagrangin_TsallisMaximumEntropy_wrt_Norm-q-Expt}
1090:           \mathcal{L}(x, \lambda, \beta) =  &&\int_{X} \ln_{q}
1091:           \frac{1}{p(x)} \, \ud P(x) - \lambda \left(\int_{X}\, \ud P(x) - 1
1092:           \right) \nonumber \\
1093:           && - \sum_{m=1}^{M} \beta^{(q)}_{m} \left(\int_{X} {p(x)}^{q-1}
1094:           \left(u_{m}(x) - {\langle\langle u_{m}  \rangle\rangle}_{q}
1095:           \right) \, \ud P(x) \right) \enspace,
1096:           \end{eqnarray}
1097:           where the parameters $\beta_{m}^{(q)}$ can be defined in
1098:           terms of true Lagrange parameters $\beta_{m}$ as
1099:          \begin{equation}
1100:            \beta_{m}^{(q)} = {\left(\int_{X} p(x)^{q}\, \ud \mu
1101:              \right)}^{-1} \beta_{m}\enspace, \, m = 1, \ldots, M.
1102:           \end{equation}
1103:           The maximum entropy distribution in this case can be written
1104:           as
          \begin{equation}
          \label{Equation:ME:TsallisMaximumEntropyDistribution_wrt_q-Expt}   
          p(x) = \frac{\displaystyle {\left[ 1 - (1-q)  {\left( \int_{X}
            p(x)^{q}\, \ud \mu \right)}^{-1}  \sum_{m=1}^{M} \beta_{m} \left( u_{m}(x) - 
          {\langle\langle {u}_{m} \rangle\rangle}_{q} \right) \right]}^{\frac{1}{1-q}}}
          {\displaystyle {\overline{Z_{q}}}  } \enspace,
          \end{equation}  
          which, in terms of the $q$-exponential function
          $e_{q}^{x} = {[1 + (1-q)x]}^{\frac{1}{1-q}}$ (defined
          whenever $1 + (1-q)x > 0$), can be expressed as
1114:          \begin{equation}
1115:          \label{Equation:ME:TsallisMaximumEntropyDistribution_wrt_q-Expt_q-exponentialForm} 
1116:          p(x) = \frac{\displaystyle e_{q}^{-   {\left(\int_{X} p(x)^{q}\, \ud \mu
1117:              \right)}^{-1}   \sum_{m=1}^{M} \beta_{m} (u_{m}(x) -
1118:              {\langle\langle u_{m}\rangle\rangle}_{q}  )
1119:          }}{\displaystyle \overline{Z_{q}}} \enspace,         
1120:          \end{equation}
1121:          where
1122:          \begin{equation}
1123:            \overline{Z_{q}} = \int_{X} {e_{q}^{- {\left(\int_{X} p(x)^{q}\, \ud \mu
1124:              \right)}^{-1}   \sum_{m=1}^{M} \beta_{m} (u_{m}(x) -
1125:              {\langle\langle u_{m}\rangle\rangle}_{q}  ) }} \, \ud \mu(x) \enspace.
1126:          \end{equation}
1127: 
1128:         Maximum Tsallis entropy in this case satisfies
1129:         \begin{equation}
1130:         S_{q} = \ln_{q}\overline{{Z}_{q}} \enspace,
1131:         \end{equation}
1132:         while corresponding thermodynamic equations can be written
1133:         as 
1134:         \begin{equation}
1135:         \frac{\partial}{\partial \beta_{m}} \ln_{q} Z_{q}  =  -
1136:         {\langle\langle{{u}_{m}}\rangle\rangle}_{q} \enspace, \:\:\: m = 1, \ldots M
1137:         \enspace,
1138:         \end{equation}
1139:         \begin{equation}
        \frac{\partial S_{q}}{\partial
        {\langle\langle{{u}_{m}}\rangle\rangle}_{q}  }  =  
        \beta_{m} \enspace, \:\:\: m =1, \ldots, M \enspace,
1143:         \end{equation}
1144:         where
1145:         \begin{equation}
1146:         \ln_{q} Z_{q} = \ln_{q} \overline{{Z}_{q}}
1147:         - \sum_{m=1}^{M} \beta_{m}
1148:         {\langle\langle{{u}_{m}}\rangle\rangle}_{q} \enspace.
1149:         \end{equation}
1150: 
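        As a consistency check, sketched only formally (we assume
        that the limit $q \to 1$ may be taken inside the integrals),
        recall that the $q$-exponential $e_{q}^{x} = {[1 +
        (1-q)x]}^{\frac{1}{1-q}}$ tends to $e^{x}$, that $\int_{X}
        p(x)^{q} \, \ud \mu \to 1$, and that ${\langle\langle u_{m}
        \rangle\rangle}_{q} \to \langle u_{m} \rangle$ as $q \to
        1$. In this limit
        (\ref{Equation:ME:TsallisMaximumEntropyDistribution_wrt_q-Expt_q-exponentialForm})
        becomes
        \begin{displaymath}
        p(x) = \frac{e^{- \sum_{m=1}^{M} \beta_{m} (u_{m}(x) - \langle
        u_{m} \rangle )}}{\overline{Z_{1}}} \enspace, \:\:\:
        \overline{Z_{1}} = e^{\sum_{m=1}^{M} \beta_{m} \langle u_{m}
        \rangle} Z(\beta) \enspace,
        \end{displaymath}
        with $Z(\beta)$ as in
        (\ref{Equation:PartitionFunctionForMaximumEntropy}). The
        common factor cancels, so that $p(x) = e^{- \sum_{m=1}^{M}
        \beta_{m} u_{m}(x)} / Z(\beta)$, and
        \begin{displaymath}
        \lim_{q \to 1} \ln_{q} \overline{Z_{q}} = \ln
        \overline{Z_{1}} = \ln Z(\beta) + \sum_{m=1}^{M} \beta_{m}
        \langle u_{m} \rangle \enspace,
        \end{displaymath}
        which agrees with the classical maximum entropy
        (\ref{Equation:ME:MaximumEntropy}).
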
1151: %=============================================================================
1152: \section{Measure-Theoretic Definitions: Revisited}
1153: \label{Section:ME:MeasureTheoreticDefinitions_Revisited}
1154:        \noindent
	It is well known that, unlike Shannon entropy, Kullback-Leibler
       relative-entropy in the discrete 
       case can be extended naturally to the measure-theoretic
       case. 
       In this section, we show
       that this is true for generalized relative-entropies
       too. R\'{e}nyi relative-entropy on the continuous valued space
       $\mathbb{R}$ and its  
       equivalence with the discrete case is studied
       by R\'{e}nyi~\cite{Renyi:1960:SomeFundamentalQuestionsOfInformationTheory}. Here,
       we present the result in the measure-theoretic case and
       conclude that the measure-theoretic definitions of both Tsallis and
       R\'{e}nyi relative-entropies are equivalent to their discrete
       counterparts. 
1169: 
1170:        We also present a result pertaining to ME of
1171:        measure-theoretic Tsallis entropy. We show that ME of Tsallis
1172:        entropy in the measure-theoretic case is consistent with the
1173:        discrete case.
1174: 
1175:    %-----------------------Sub Section------------------     
1176:   \subsection{On Measure-Theoretic Definitions of Generalized Relative-Entropies}
1177:        \noindent
        Here we show that generalized relative-entropies in the
        discrete case can be naturally extended to the measure-theoretic
        case, in the sense that the measure-theoretic definitions can
        be obtained as limits of sequences of finite discrete
        entropies of pmfs which approximate the pdfs involved.
        We call such a
        sequence of pmfs an ``approximating sequence of pmfs of a
        pdf''. To formalize these aspects we need the following 
        lemma. 
1187:         %--------------Lemmma-------------
1188:         \begin{lemma}
1189:         \label{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}  
1190:         Let $p$ be a pdf defined on measure space
1191:         $(X,\mathfrak{M},\mu)$. Then there exists a sequence of simple
1192:         functions $\{f_{n}\}$ (we refer to them as approximating sequence of
1193:         simple functions of $p$) such that $\lim_{n \to \infty} f_{n} = p$
1194:         and each $f_{n}$ can be written as
1195:         \begin{equation}
1196:         \label{Equation:ME:ActualDefinitionOfSeqenceOfSimpleFunctions} 
1197:          f_{n}(x) = \frac{1}{\mu(E_{n,k})} \int_{E_{n,k}} p \, \ud 
1198:         \mu \enspace, \:\:\:\:\:\:\: \forall x \in E_{n,k},
1199: 	 \:\:\: k = 1, \ldots m(n) \enspace,
1200:         \end{equation}  
1201:         where $(E_{n,1}, \ldots, E_{n,m(n)})$ is the measurable 
1202:         partition corresponding to $f_{n}$ (the notation $m(n)$
1203:         indicates that $m$ varies with $n$).  Further each $f_{n}$
1204:   	satisfies 
1205:         \begin{equation}
1206:          \int_{X} f_{n} \, \ud \mu = 1 \enspace.
1207:         \end{equation}  
1208:         \end{lemma}
1209:         %Proof----
1210:         \proof
1211: %	\footnote{$ \cup_{k=1}^{m(n)} E_{n,k} = X$ and $E_{n,i}
1212: %        \cap E_{n,j} = \emptyset$, $\forall i \neq j$} 
1213:          Define a sequence of simple functions $\{f_{n}\}$ as
1214:         \begin{equation}
1215:          f_{n}(x) = \left\{ \begin{array}{ll}
1216:           \frac{1}{ \mu p^{-1} \left(
1217:             \left[ \frac{k}{2^{n}}, \frac{k+1}{2^{n}} \right) \right)}
1218:             \displaystyle \int_{  p^{-1} \left(
1219:             \left[ \frac{k}{2^{n}}, \frac{k+1}{2^{n}} \right) \right)
1220:             } p \, \ud \mu \enspace,& \: \:
1221:          \:\:\textrm{if}\:\:  \frac{k}{2^{n}} \leq p(x) <
1222:          \frac{k+1}{2^{n}} , \\
1223:          & \:\:\:k = 0, 1, \ldots n 2^{n}-1
1224:          \\ \\
1225:          \frac{1}{ \mu p^{-1} \left(
1226:             \left[ n, \infty \right) \right)}
1227:             \displaystyle \int_{  p^{-1} \left(
1228:             \left[ n , \infty \right) \right)
1229:             } p \, \ud \mu \enspace,& \: \:
1230:          \:\:\textrm{if}\:\: n \leq p(x),
1231:            \end{array} \right.
1232:          \end{equation}
1233:          Each $f_{n}$ is indeed a simple function and can be written as
1234:          \begin{equation}
1235:           f_{n} = \sum_{k=0}^{n2^{n}-1} \left( \frac{1}{\mu E_{n,k}} 
1236:           \int_{E_{n,k}} p\, \ud \mu \right) \chi_{E_{n,k}} + \left( \frac{1}{\mu
1237:             F_{n}} \int_{F_{n}} p \, \ud \mu \right) \chi_{F_{n}} \enspace, 
1238:          \end{equation}
1239:          where $E_{n,k} =
1240:          p^{-1}\left(\left[\frac{k}{2^{n}},\frac{k+1}{2^{n}}\right)
1241:           \right)$, $k= 0, \ldots, n2^{n}-1$ and $F_{n} = p^{-1} \left(
1242:             \left[ n, \infty \right) \right)$. 
1243:          Since $\int_{E} p \, \ud \mu < \infty$ for any $E \in
1244:          \mathfrak{M}$, we have $\int_{E_{n,k}} p\, \ud \mu = 0$
1245:          whenever $\mu E_{n,k} =0$, for $k = 0, \ldots n2^{n} -1$. Similarly 
1246:          $\int_{F_{n}} p\, \ud \mu = 0$ whenever $\mu F_{n} =0$.
1247:          Now we show that $\lim_{n \to \infty} f_{n} = p$, point-wise.
1248:          
         First assume that $p(x) < \infty$. Then there exists $N \in
         {\mathbb{Z}}^{+}$ such that $p(x) < n$ for all $n \geq N$. For
         each such $n$ there exists an integer $k$, $0 \leq k 
         \leq n2^{n}-1$,
         such that $\frac{k}{2^{n}} \leq p(x) < 
         \frac{k+1}{2^{n}}$ and, since $f_{n}(x)$ is the mean value of
         $p$ over $E_{n,k}$, also $\frac{k}{2^{n}} \leq f_{n}(x) <
         \frac{k+1}{2^{n}}$. This implies $0 \leq |p(x) - f_{n}(x) | <
         \frac{1}{2^{n}}$ as required.
1257: 
1258:          If $p(x) = \infty$, for some $x \in X$, then $x \in F_{n}$ for
1259:          all $n$, and therefore $f_{n}(x) \geq n$ for all $n$; hence
1260:          $\lim_{n \to \infty} f_{n}(x) = \infty = p(x) $.
1261: 
1262:          Finally we have
         \begin{eqnarray}
           \int_{X} f_{n} \, \ud \mu &=& \sum_{k=1}^{m(n)} \left[
           \frac{1}{\mu(E_{n,k})} \int_{E_{n,k}} p \,\ud \mu \right]
            \mu(E_{n,k}) \nonumber \\
            &=& \sum_{k=1}^{m(n)} \int_{E_{n,k}} p \,\ud \mu \nonumber \\
            &=& \int_{X} p \, \ud \mu =1 \enspace. \nonumber
1269:          \end{eqnarray}  
1270:          \endproof
1271:          %-------------End: lemmma-----------------
         The above construction of a sequence of simple functions which
         approximate a measurable function is similar to the
         approximation theorem~\cite[p.~6, Theorem
           1.8(b)]{Kantorovitz:2003:IntroductionToModernAnalysis} in
         the theory of integration. However, the approximation in
         Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}
         can be seen as a mean-value approximation, whereas in the latter
         case it is a lower approximation. Further, unlike in the case
         of the lower approximation, the sequence of simple functions 
         which approximates $p$ in
         Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}
         is neither monotone nor satisfies $f_{n} \leq p$.
1284:              
1285:         Now one can define a sequence of pmfs $\{\tilde{p}_{n}\}$ corresponding
1286:         to the sequence 
1287:         of simple functions constructed in
1288:         Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf},
1289:         denoted by $\tilde{p}_{n} = (\tilde{p}_{n,1}, \ldots,\tilde{p}_{n,m(n)})$, as 
1290:         \begin{equation}
1291:         \label{Equation:ME:ActualDefinitionOfSeqenceOfPmfs} 
1292:          \tilde{p}_{n,k} = \mu(E_{n,k})f_{n}\chi_{E_{n,k}} = \int_{E_{n,k}} p \, \ud 
1293:          \mu \enspace, k = 1, \ldots m(n),
1294:         \end{equation}
1295:         for any $n$.
1296:         We have
1297:         \begin{equation}
1298:          \sum_{k=1}^{m(n)} \tilde{p}_{n,k} = \sum_{k=1}^{m(n)} \int_{E_{n,k}} p
1299:          \, \ud \mu
1300:          = \int_{X} p \, \ud \mu =1 \enspace,
1301:         \end{equation}
1302:         and hence $\tilde{p}_{n}$ is indeed a pmf. 
        We call $\{\tilde{p}_{n}\}$ the approximating sequence of pmfs of the pdf
        $p$.
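
        As a small illustration (the choices $X = [0,1]$, $\mu$ the
        Lebesgue measure and $p(x) = 2x$ are assumptions made only for
        this example), take $n = 1$ in the construction used in the
        proof of
        Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}.
        The level sets are $p^{-1}\left(\left[0,\frac{1}{2}\right)\right) =
        \left[0,\frac{1}{4}\right)$,
        $p^{-1}\left(\left[\frac{1}{2},1\right)\right) =
        \left[\frac{1}{4},\frac{1}{2}\right)$ and
        $p^{-1}\left(\left[1,\infty\right)\right) =
        \left[\frac{1}{2},1\right]$, so the corresponding pmf is
        \begin{displaymath}
         \tilde{p}_{1} = \left( \int_{0}^{1/4} 2x \, \ud x, \:
         \int_{1/4}^{1/2} 2x \, \ud x, \: \int_{1/2}^{1} 2x \, \ud x
         \right) = \left( \frac{1}{16}, \frac{3}{16}, \frac{3}{4}
         \right) \enspace,
        \end{displaymath}
        whose components sum to one. Dividing each component by the
        measure of its set gives the values $\frac{1}{4}$,
        $\frac{3}{4}$ and $\frac{3}{2}$ taken by the simple function
        $f_{1}$ on the three sets.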
1305:        
1306: %        We say an measure-theoretic definition of an information
1307: %        measure $\overline{S}$ is exact if
1308: %        \begin{equation}
1309: %         \lim_{n \to \infty} \overline{S}(P_{n}) = \overline{S}(p) \enspace.
1310: %        \end{equation}
1311: 
          Now we present our main theorem, where we assume that $p$ and
	  $r$ are bounded. The
          assumption of boundedness of $p$ and $r$ simplifies the
	  proof. However, the result can be
          extended to the unbounded
          case. See~\cite{Renyi:1959:OnTheDimensionAndEntropyOfProbabilityDistributions}
          for an analysis of Shannon entropy and relative entropy on $\mathbb{R}$.
1319:          %THEOREM:Measure-theoretic definition of generalized relative entropies.
1320:          \begin{theorem}
1321:          \label{Theorem:ME:MeasureTheoreticDefinitionsOfGeneralizedRelative-Entropies}
            Let $p$ and $r$ be bounded pdfs defined on a
            measure space $(X,\mathfrak{M}, \mu)$. Let $\{\tilde{p}_{n}\}$
            and $\{\tilde{r}_{n}\}$ be the approximating sequences of pmfs of $p$ and $r$
            respectively. Let $I_{\alpha}$ denote the R\'{e}nyi relative-entropy as
            in~(\ref{Equation:ME:RenyiRelativeEntropyOf-pdf}) and
	  $I_{q}$ denote the Tsallis 
            relative-entropy as
            in~(\ref{Equation:ME:TsallisRelativeEntropyOf-pdf}).
            Then
1331:             \begin{equation}
1332:             \label{Equation:ME:InRenyisTheoremStatement_2}              
1333:             \lim_{n \to \infty} I_{\alpha}(\tilde{p}_{n} \| \tilde{r}_{n}) = I_{\alpha}(p\|r)
1334:             \end{equation}
1335: 	     and
1336:             \begin{equation}
1337:             \label{Equation:ME:InRenyisTheoremStatement_1}  
1338:             \lim_{n \to \infty} I_{q}(\tilde{p}_{n} \| \tilde{r}_{n}) = I_{q}(p\|r)
1339:             \end{equation}
1340:          \end{theorem}  
1341:          \proof
         It is enough to prove the result for either Tsallis or
         R\'{e}nyi, since each is a monotone and continuous function of
         the other. Hence we write down the proof for the case of R\'{e}nyi
         and we use the entropic index $\alpha$ in the proof.
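
         For reference, the functional relation can be made explicit
         under the assumption that the two relative-entropies take
         their usual forms (the definitions
         (\ref{Equation:ME:RenyiRelativeEntropyOf-pdf}) and
         (\ref{Equation:ME:TsallisRelativeEntropyOf-pdf}) are not
         restated in this section), namely
         \begin{displaymath}
           I_{\alpha}(p\|r) = \frac{1}{\alpha-1} \ln \int_{X}
           \frac{p^{\alpha}}{r^{\alpha-1}} \, \ud \mu \enspace,
           \:\:\:\:\:
           I_{q}(p\|r) = \frac{1}{q-1} \left[ \int_{X}
           \frac{p^{q}}{r^{q-1}} \, \ud \mu - 1 \right] \enspace.
         \end{displaymath}
         With $q = \alpha$ these give
         \begin{displaymath}
           I_{\alpha}(p\|r) = \frac{1}{\alpha-1} \ln \left[ 1 +
           (\alpha -1) \, I_{q}(p\|r) \Big|_{q=\alpha} \right] \enspace,
         \end{displaymath}
         and the map $t \mapsto \frac{1}{\alpha-1} \ln \left[ 1 +
         (\alpha-1) t \right]$ is continuous and increasing wherever
         $1 + (\alpha-1)t > 0$, so convergence of one quantity carries
         over to the other.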
1346: 
1347:          Corresponding to pdf $p$, let $\{f_{n}\}$ be the approximating 
1348:          sequence of simple functions such that $\lim_{n \to \infty}
1349:          f_{n} = p$ as in
1350:          Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}.
         Let $\{g_{n}\}$ be the approximating sequence of simple
         functions for $r$ such that $\lim_{n \to \infty} g_{n} = r$.  
         Corresponding
         to simple functions $f_{n}$ and $g_{n}$ there exists a common
         measurable partition\footnote{Let $\varphi$ and $\phi$ be two
         simple functions defined on $(X,\mathfrak{M})$. Let $\{E_{1},
         \ldots, E_{n}\}$ and $\{F_{1},\ldots, F_{m}\}$ be the measurable
         partitions corresponding to $\varphi$ and $\phi$
         respectively. Then the partition defined as $\{E_{i} \cap F_{j} |
         i = 1, \ldots, n,\:\: j =1, \ldots, m\}$ is a common measurable
         partition for both $\varphi$ and $\phi$.}
         $\{ E_{n,1}, \ldots, E_{n,m(n)}\}$ such
         that $f_{n}$ and $g_{n}$ can be written as
1364:          \begin{equation}
1365:          \label{Equation:ME:InRenyisTheorem_1_a}  
1366:            f_{n}(x) = \sum_{k=1}^{m(n)} (a_{n,k})
1367:            \chi_{E_{n,k}}(x) \enspace, \:\:\: a_{n,k} \in
1368:                {\mathbb{R}}^{+}, \, \forall k = 1, \ldots m(n) \enspace,
1369:          \end{equation}
1370:          \begin{equation}
1371:          \label{Equation:ME:InRenyisTheorem_1_b}             
1372:            g_{n}(x) = \sum_{k=1}^{m(n)} (b_{n,k})
1373:            \chi_{E_{n,k}}(x) \enspace, \:\:\: b_{n,k} \in
1374:                {\mathbb{R}}^{+}, \, \forall k = 1, \ldots m(n) \enspace,
1375:          \end{equation}
1376:          where  $\chi_{E_{n,k}}$ is the characteristic function of
1377:          $E_{n,k}$, for $k=1,\ldots m(n)$. By
1378:          (\ref{Equation:ME:InRenyisTheorem_1_a}) and
1379:          (\ref{Equation:ME:InRenyisTheorem_1_b}) the approximating
1380:          sequences of pmfs $\{\tilde{p}_{n} = (\tilde{p}_{n,1},
1381:          \ldots, \tilde{p}_{n,m(n)})\}$  
1382:           and $\{\tilde{r}_{n} = (\tilde{r}_{n,1}, \ldots,
1383:          \tilde{r}_{n,m(n)})\}$ can be written as
1384: %	corresponding 
1385: %          to pdfs $p$ and $r$ respectively can be written as $\tilde{p}_{n,k}
1386: %          =  (a_{n,k}) \mu(E_{n,k}),\, k = 1, \ldots , m(n) $ and
1387: %          $ \tilde{r}_{n,k}= (b_{n,k}) \mu(E_{n,k}), \, k = 1, \ldots ,
1388: %          m(n)$
1389:           (see (\ref{Equation:ME:ActualDefinitionOfSeqenceOfPmfs}))
1390:          \begin{equation}
1391: 	 \label{Equation:ME:InRenyisTheorem_2_a}
1392:            \tilde{p}_{n,k} =  a_{n,k} \mu(E_{n,k})\:\:\: k = 1, \ldots , m(n) \enspace,
1393:          \end{equation}
1394:          \begin{equation}
1395: 	 \label{Equation:ME:InRenyisTheorem_2_b}
1396:            \tilde{r}_{n,k} =  b_{n,k} \mu(E_{n,k})\:\:\: k = 1, \ldots , m(n) \enspace.
1397:          \end{equation}
      	 Now the R\'{e}nyi
         relative-entropy for $\tilde{p}_{n}$ and 
         $\tilde{r}_{n}$ can be written as
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_2}  
           I_{\alpha}(\tilde{p}_{n} \| \tilde{r}_{n}) =
         \frac{1}{\alpha-1} \ln \sum_{k=1}^{m(n)} 
           \frac{a_{n,k}^{\alpha}}{b_{n,k}^{\alpha -1}}
           \mu(E_{n,k}) \enspace.
         \end{equation}

         To prove $\lim_{n \rightarrow \infty} I_{\alpha}(\tilde{p}_{n} \|
         \tilde{r}_{n}) = I_{\alpha}(p \| r) $ it is enough to prove that
1411:          \begin{equation}
1412:          \label{Equation:ME:InRenyisTheorem_2}             
1413:            \lim_{n \rightarrow \infty} \frac{1}{\alpha-1} \ln
1414:            \int_{X} \frac{ {f_{n}(x)}^{\alpha} }{
1415:              {g_{n}(x)}^{\alpha-1}} \, \ud \mu(x)
1416:             =   \frac{1}{\alpha-1} \ln
1417:            \int_{X} \frac{ {p(x)}^{\alpha} }{
1418:              {r(x)}^{\alpha-1}} \, \ud \mu(x) \enspace,
1419:           \end{equation}  
1420:            since we have\footnote{ Since simple functions
1421:            ${\left(f_{n}\right)}^{\alpha}$ and ${\left(g_{n}\right)}^{\alpha-1}$ can be
1422:            written as
1423:            \begin{displaymath}
1424:              {\left(f_{n}\right)}^{\alpha}(x) = \sum_{k=1}^{m(n)} 
1425:              \left( a_{n,k}^{\alpha} \right) \chi_{E_{n,k}}(x)
1426:             \enspace, \:\:\:\:\:\mbox{and}
1427:            \end{displaymath}
1428:            \begin{displaymath}
1429:              {\left(g_{n}\right)}^{\alpha-1}(x) = \sum_{k=1}^{m(n)} 
1430:              \left( b_{n,k}^{\alpha-1} \right) \chi_{E_{n,k}}(x) \enspace.
1431:            \end{displaymath}
1432:            Further,
1433:            \begin{displaymath}
1434:            \frac{f_{n}^{\alpha}}{g_{n}^{\alpha-1}}(x)
1435:             =    \sum_{k=1}^{m(n)} \left( \frac{ 
1436:              a_{n,k}^{\alpha} }{b_{n,k}^{\alpha-1}} \right)   \chi_{E_{n,k}}(x) \enspace.
1437:            \end{displaymath}
1438:          }%Endfootnote 
1439:          \begin{equation}
1440:          \label{Equation:ME:InRenyisTheorem_3}              
1441:            \int_{X} \frac{{f_{n}(x)}^{\alpha}}{{g_{n}(x)}^{\alpha -1} } \,
1442:            \ud \mu(x) =
1443:            \sum_{k=1}^{m(n)}
1444:            \frac{a_{n,k}^{\alpha}}{b_{n,k}^{\alpha-1}} \mu(E_{n,k}) \enspace.  
1445:          \end{equation}  
1446:           Further it is enough  to prove that 
1447:          \begin{equation}
1448:          \label{Equation:ME:InRenyisTheorem_3}                         
1449:            \lim_{n \rightarrow \infty} 
1450:            \int_{X}  {h_{n}(x)}^{\alpha} g_{n}(x)   \, \ud \mu(x)
1451:             =   
1452:            \int_{X} \frac{{p(x)}^{\alpha} }{
1453:              {r(x)}^{\alpha-1}} \, \ud \mu(x) \enspace,
1454:           \end{equation}
1455:          where $h_{n}$ is defined as $h_{n}(x) =
1456:          \frac{f_{n}(x)}{g_{n}(x)} $.\\ 
1457:         %Case 1---------
1458:         \noindent
1459:         {\em \underline {Case 1: $0 < \alpha < 1$}}
1460:         
        In this case
        the {\em Lebesgue dominated convergence
          theorem}~\cite[p.~26]{Rudin:1966:RealAndComplexAnalysis}
        gives
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_4}                                    
           \lim_{n \to \infty} \int_{X}
           \frac{f_{n}^{\alpha}}{g_{n}^{\alpha -1}} \, \ud \mu =
           \int_{X} \frac{p^{\alpha}}{r^{\alpha -1}} \, \ud \mu \enspace,
         \end{equation}
         and hence (\ref{Equation:ME:InRenyisTheoremStatement_2}) follows.
1472: 
1473:          %Case 2-----------
1474:          \noindent
1475:          {\em \underline {Case 2: $\alpha  > 1$}}
1476: 
         We have $h_{n}^{\alpha} g_{n}
         \rightarrow \frac{p^{\alpha}}{r^{\alpha-1}}$ {\em
           a.e.} By {\em Fatou's
           Lemma}~\cite[p.~23]{Rudin:1966:RealAndComplexAnalysis} we
         obtain 
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_LimInfInequality}  
           \liminf_{n \to \infty} \int_{X}
1485:            h_{n}(x)^{\alpha} g_{n}(x) \, \ud \mu(x) \geq
1486:            \int_{X} \frac{{p(x)}^{\alpha} }{
1487:              {r(x)}^{\alpha-1}} \, \ud \mu(x) \enspace.
1488:          \end{equation}
1489:          From the construction of $f_{n}$ and $g_{n}$
1490:          (Lemma~\ref{Lemma:ME:ExistenceOfApproximatingSequenceOfSimpleFunctionsForPdf}) 
1491:          we have
1492:          \begin{equation}
1493:          \label{Equation:ME:InRenyisTheorem_5}                                               
         h_{n}(x) g_{n}(x) = \frac{1}{\mu(E_{n,i})} \int_{E_{n,i}} 
         \frac{p(x)}{r(x)} r(x) \, \ud \mu \enspace, \:\:\: \forall x
         \in E_{n,i} \enspace.
         \end{equation}
         By Jensen's inequality, applied to the convex function
         $t \mapsto t^{\alpha}$, we get
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_6}
         h_{n}(x)^{\alpha} g_{n}(x) \leq \frac{1}{\mu(E_{n,i})}
           \int_{E_{n,i}}  \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}}   \,
           \ud \mu \enspace, \:\:\: \forall x \in E_{n,i} \enspace.
1504:          \end{equation}  
1505:          By (\ref{Equation:ME:InRenyisTheorem_1_a}) and
1506:          (\ref{Equation:ME:InRenyisTheorem_1_b}) we can write
1507:          (\ref{Equation:ME:InRenyisTheorem_6}) as
1508:          \begin{equation}
1509:          \label{Equation:ME:InRenyisTheorem_7}           
1510:            \frac{a_{n,i}^{\alpha}}{b_{n,i}^{\alpha-1}}  \mu(E_{n,i})   \leq
1511:            \int_{E_{n,i}}   \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}}
1512:            \, \ud \mu \enspace, \:\:\: \forall i = 1, \ldots m(n) \enspace.
1513:          \end{equation}  
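         For completeness, the Jensen step behind
         (\ref{Equation:ME:InRenyisTheorem_7}) can be sketched as
         follows, under the assumptions (made explicit here) that
         $\mu(E_{n,i}) > 0$, that $r > 0$ on $E_{n,i}$, and that
         $a_{n,i}$ and $b_{n,i}$ are the mean values of $p$ and $r$
         over $E_{n,i}$. Then $\ud \nu = \frac{r}{b_{n,i}\,
         \mu(E_{n,i})} \, \ud \mu$ is a probability measure on
         $E_{n,i}$ with $\int_{E_{n,i}} \frac{p}{r} \, \ud \nu =
         \frac{a_{n,i}}{b_{n,i}}$, and convexity of $t \mapsto
         t^{\alpha}$ for $\alpha > 1$ gives
         \begin{displaymath}
           {\left( \frac{a_{n,i}}{b_{n,i}} \right)}^{\alpha} =
           {\left( \int_{E_{n,i}} \frac{p}{r} \, \ud \nu
           \right)}^{\alpha} \leq \int_{E_{n,i}} {\left( \frac{p}{r}
           \right)}^{\alpha} \ud \nu = \frac{1}{b_{n,i} \,
           \mu(E_{n,i})} \int_{E_{n,i}}
           \frac{p^{\alpha}}{r^{\alpha-1}} \, \ud \mu \enspace.
         \end{displaymath}
         Multiplying both sides by $b_{n,i} \, \mu(E_{n,i})$ yields
         (\ref{Equation:ME:InRenyisTheorem_7}).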
         Summing both sides of
         (\ref{Equation:ME:InRenyisTheorem_7}) over $i$ we get 
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_8}                      
          \sum_{i=1}^{m(n)}  \frac{a_{n,i}^{\alpha}}{b_{n,i}^{\alpha-1}}  \mu(E_{n,i})   \leq
          \sum_{i=1}^{m(n)} \int_{E_{n,i}}
          \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}}  \, \ud \mu \enspace.
         \end{equation}
         Inequality (\ref{Equation:ME:InRenyisTheorem_8}) is nothing but
         \begin{displaymath}
          \int_{X} h_{n}^{\alpha}(x) g_{n}(x) \, \ud \mu(x)   \leq
          \int_{X}  \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}}
            \, \ud \mu \enspace, \:\:\: \forall n \enspace,
         \end{displaymath}
         and hence
         \begin{displaymath}
         \sup_{i > n } \int_{X} h_{i}^{\alpha}(x) g_{i}(x) \, \ud \mu(x)
         \leq \int_{X}  \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}} 
           \, \ud \mu \enspace, \:\:\: \forall n \enspace.
         \end{displaymath}
         Finally we have
         \begin{equation}
         \label{Equation:ME:InRenyisTheorem_LimSupInequality}  
           \limsup_{n \to \infty} \int_{X}
           h_{n}^{\alpha}(x) g_{n}(x) \, \ud \mu(x)   \leq \int_{X}
            \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}} \, \ud \mu \enspace.
1541:          \end{equation}
1542:          From (\ref{Equation:ME:InRenyisTheorem_LimInfInequality}) and
1543:          (\ref{Equation:ME:InRenyisTheorem_LimSupInequality}) we have
1544:          \begin{equation}
1545:          \label{Equation:ME:InRenyisTheorem_LimEquality}  
           \lim_{n \to \infty} \int_{X}
             \frac{f_{n}(x)^{\alpha}}{g_{n}(x)^{\alpha-1}}  \, \ud \mu(x) = \int_{X}
            \frac{p(x)^{\alpha}}{r(x)^{\alpha-1}} \, \ud \mu \enspace,
         \end{equation}
         and hence (\ref{Equation:ME:InRenyisTheoremStatement_2}).
1551:          \endproof 
1552:           %EndProof--------------
1553:          
1554:   %--------------------Sub Section-----------------------------
  \subsection{On ME of Measure-Theoretic Definition of Tsallis Entropy}
1556:          \noindent
         Although Shannon entropy cannot be
         naturally extended to the non-discrete case, we have observed
         that Shannon entropy in its general form on a measure space can
         be used consistently for the ME-prescriptions. One can easily
         see that the generalized information measures of R\'{e}nyi and Tsallis
         cannot be extended naturally to the measure-theoretic case either,
         i.e., the measure-theoretic definitions are not equivalent to the 
         discrete case in the sense that they cannot be defined as a
         limit of a sequence of finite discrete entropies corresponding to
         pmfs, defined on measurable partitions, which approximate the
         pdf. One can use the same counterexample we discussed in
         \S~\ref{SubSection:ME:DiscreteToContinuous}. We have already
         given the ME-prescriptions of Tsallis entropy in the
         measure-theoretic case. In this section, we show that the
         ME-prescriptions in the measure-theoretic case are consistent 
         with the discrete case.
1573: 
1574:  	 Proceeding as in the case of measure-theoretic entropy in
1575:          \S~\ref{SubSection:ME:MeasureTheoreticCasesinDiscrete},
1576:          measure-theoretic Tsallis
1577:          entropy $S_{q}(P)$~(\ref{Equation:ME:TsallisEntropyOf-PM}) in
1578:          the discrete case can be written as
1579:          \begin{equation}
1580: 	 \label{Equation:ME:MeasureTheoreticTsallisEntropyInDiscreteForm}
1581:          S_{q}(P) = \sum_{k=1}^{n} P_{k} \ln_{q} \frac{\mu_{k}}{P_{k}} \enspace.
1582:          \end{equation}
1583:          By (\ref{Equation:KN:PropertyOflnq(x/y)}) we get
1584:          \begin{equation}
1585: 	 \label{Equation:ME:MeasureTheoreticTsallisEntropyInDiscreteForm_1}
1586:          S_{q}(P) = \sum_{k=1}^{n} P_{k}^{q} \left[ \ln_{q} \mu_{k} -
1587:          \ln_{q} P_{k} \right] = S_{q}^{n}(P) + \sum_{k=1}^{n} P_{k}^{q}
1588:          \ln_{q} \mu_{k} \enspace,
1589:          \end{equation}
1590:          where $S_{q}^{n}(P)$ is the Tsallis entropy in discrete case.
         When $\mu$ is the uniform distribution, i.e., $\mu_{k} =
         \frac{1}{n}\:\: \forall k = 1, \ldots, n$, we get
1593:          \begin{equation}
1594: 	 \label{Equation:ME:MeasureTheoreticTsallisEntropyInDiscreteForm_1}
1595:          S_{q}(P) = S_{q}^{n}(P) - n^{q-1} \ln_{q} n \sum_{k=1}^{n}
1596:          P_{k}^{q} \enspace.
1597:          \end{equation}
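         The coefficient in the uniform case can be checked directly,
         assuming the usual $q$-logarithm $\ln_{q} x =
         \frac{x^{1-q} - 1}{1-q}$ (its definition is not restated in
         this section). Under that assumption,
         \begin{displaymath}
         \ln_{q} \frac{1}{n} = \frac{n^{q-1} - 1}{1-q} = - n^{q-1} \,
         \frac{n^{1-q} - 1}{1-q} = - n^{q-1} \ln_{q} n \enspace,
         \end{displaymath}
         so that substituting $\mu_{k} = \frac{1}{n}$ into
         $\sum_{k=1}^{n} P_{k}^{q} \ln_{q} \mu_{k}$ gives $- n^{q-1}
         \ln_{q} n \sum_{k=1}^{n} P_{k}^{q}$.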
1598:          Now we show that the quantity $\sum_{k=1}^{n} P_{k}^{q}$ is
1599:          constant in maximization of $S_{q}(P)$ with respect to the
1600:          set of constraints
1601:          (\ref{Equation:ME:Normalized-q-ExpectationConstraints}).
1602: 
1603:          The claim is that
1604:          \begin{equation}
1605:          \label{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation}
1606:          \int p(x)^{q}\, \ud \mu(x) = {(\overline{Z_{q}})}^{1-q} \enspace,
1607:          \end{equation}
1608: 	 which holds for Tsallis maximum entropy distribution
1609:          (\ref{Equation:ME:TsallisMaximumEntropyDistribution_wrt_q-Expt})
1610:          in general. This can be shown as follows. From
1611:          the maximum entropy 
1612:          distribution~(\ref{Equation:ME:TsallisMaximumEntropyDistribution_wrt_q-Expt}),
1613:          we have 
1614:          \begin{displaymath}
1615:          p(x)^{1-q} = \frac{\displaystyle 1 - (1-q)  {\left( \int_{X}
1616:             {p(x)}^{q}\, \ud \mu(x) \right)}^{-1}  \sum_{m=1}^{M}
1617:          \beta_{m} \left( u_{m}(x) -  
1618:          {\langle\langle {u}_{m} \rangle\rangle}_{q} \right)}
1619:          {\displaystyle ({\overline{Z_{q}}})^{1-q}  } \enspace,
1620:          \end{displaymath}
1621:          which can be rearranged as
1622:          \begin{displaymath}
         ({\overline{Z_{q}}})^{1-q} p(x) = \left[ 1 - (1-q)
         \frac{\sum_{m=1}^{M} \beta_{m} \left( u_{m}(x) -  
         {\langle\langle {u}_{m} \rangle\rangle}_{q} \right)}{\int_{X}
         {p(x)}^{q} \, \ud \mu(x)} \right] p(x)^{q} \enspace.
1627:          \end{displaymath}
1628:          By integrating both sides in the above equation, and by
1629:          using~(\ref{Equation:ME:Normalized-q-ExpectationConstraints})
1630:          we get (\ref{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation}).
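         In more detail, and assuming that the constraints
         (\ref{Equation:ME:Normalized-q-ExpectationConstraints}) have
         the usual normalized $q$-expectation form $\int_{X} u_{m} \,
         p^{q} \, \ud \mu = {\langle\langle {u}_{m} \rangle\rangle}_{q}
         \int_{X} p^{q} \, \ud \mu$, $m = 1, \ldots, M$ (they are not
         restated here), integration of the rearranged identity
         together with $\int_{X} p \, \ud \mu = 1$ gives
         \begin{displaymath}
         ({\overline{Z_{q}}})^{1-q} = \int_{X} p^{q} \, \ud \mu -
         (1-q) \sum_{m=1}^{M} \beta_{m} \, \frac{\displaystyle \int_{X}
         \left( u_{m} - {\langle\langle {u}_{m} \rangle\rangle}_{q}
         \right) p^{q} \, \ud \mu}{\displaystyle \int_{X} p^{q} \, \ud
         \mu} = \int_{X} p^{q} \, \ud \mu \enspace,
         \end{displaymath}
         since each integral in the numerator vanishes by the
         constraints.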
1631: 
1632:          Now, (\ref{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation}) can
1633:          be written in its discrete form as
1634:          \begin{equation}
1635:          \label{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation_Discrete_1}
1636:           \sum_{k=1}^{n} \frac{P_{k}^{q}}{\mu_{k}^{q-1}} =
1637:          {(\overline{Z_{q}})}^{1-q} \enspace.
1638:          \end{equation}
         When $\mu$ is the uniform distribution we get
         \begin{equation}
         \label{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation_Discrete_2}
          \sum_{k=1}^{n} P_{k}^{q} = n^{1-q}  {(\overline{Z_{q}})}^{1-q} \enspace,
         \end{equation}
         which is a constant.
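
         The passage to the discrete form can be sketched as follows,
         assuming that $X$ consists of $n$ atoms with $\mu(\{k\}) =
         \mu_{k}$ and $P(\{k\}) = P_{k}$, so that the density takes the
         value $\frac{P_{k}}{\mu_{k}}$ on the $k$th atom. Then
         \begin{displaymath}
          \int_{X} p^{q} \, \ud \mu = \sum_{k=1}^{n} {\left(
          \frac{P_{k}}{\mu_{k}} \right)}^{q} \mu_{k} = \sum_{k=1}^{n}
          \frac{P_{k}^{q}}{\mu_{k}^{q-1}} \enspace,
         \end{displaymath}
         and for $\mu_{k} = \frac{1}{n}$ the right-hand side equals
         $n^{q-1} \sum_{k=1}^{n} P_{k}^{q}$.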
1645: 
         Hence by
         (\ref{Equation:ME:MeasureTheoreticTsallisEntropyInDiscreteForm_1})
         and
         (\ref{Equation:ME:SumOfpPowerqs_ForNormalizedExpectation_Discrete_2}),
         one can conclude that, with respect to a particular instance of
         ME, the measure-theoretic Tsallis entropy $S_{q}(P)$ defined for a
         probability measure $P$ on 
        the measure space $(X,\mathfrak{M},\mu)$ is equal to the
        discrete Tsallis entropy up to an 
        additive constant, when the reference measure $\mu$ is chosen as a uniform
        probability distribution. Thereby, one can further conclude
         that, with respect to a particular instance of ME,
         measure-theoretic Tsallis entropy is consistent with its 
         discrete definition. 
1660: 
1661: %=======================Section: Conclusition===================
1662: \section{Conclusions}
1663: \label{Section:Conclusions}
1664: 	\noindent
1665: 	In this paper we presented measure-theoretic definitions of
1666: 	generalized information measures. We proved that the measure-theoretic
1667:         definitions of generalized relative-entropies, R\'{e}nyi and
1668:         Tsallis, are natural extensions of their respective discrete
        cases. We also showed that the ME prescriptions of
        measure-theoretic Tsallis entropy are consistent with the
        discrete case.
1672: 
1673: 
1674: %========================Bibliography===================================
1675: \section*{References}
1676: 
1677: \bibliographystyle{unsrt}
1678: \bibliography{papi}
1679: 
1680: 
1681: \end{document}
1682: 
1683: 
1684: 
1685: 
1686: 
1687: