www.win-vector.com/dfiles/LogisticRegressionMaxEnt.pdf
According to Eq.(5), the Lagrange dual function of the maximum entropy model is the negative log-likelihood

$$L(p^*, \Lambda, \gamma^*) = -\tilde{L}_{\tilde{p}}\bigl(p_\Lambda(y \mid x)\bigr).$$
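For readers without Eq.(5) at hand, the primal problem being referred to is the standard maximum entropy formulation (stated here in its usual textbook form, not quoted from the PDF): choose the conditional distribution $p(y \mid x)$ that maximizes the conditional entropy

$$H(p) = -\sum_{x,y} \tilde{p}(x)\, p(y \mid x) \log p(y \mid x)$$

subject to normalization, $\sum_y p(y \mid x) = 1$, and to the balance (feature-expectation) constraints

$$\sum_{x,y} \tilde{p}(x)\, p(y \mid x)\, f_i(x,y) = \sum_{x,y} \tilde{p}(x,y)\, f_i(x,y) \quad \text{for all } i,$$

where $\tilde{p}$ denotes the empirical distribution.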
Some comments: we usually have two systems. In the first system, we know the analytical form of the solution function, and we aim to optimize the objective function. In the second system, we do not know the form of the solution; instead we have a collection of constraints (e.g., the balance equations), and we seek to optimize the objective function under this set of constraints. System 2 provides the primal problem, and system 1 plays the role of the dual problem. Since system 1 is an unconstrained optimization problem, it is easier to solve. In the next section, we will show that maximum likelihood given

$$p_\Lambda(y \mid x) = \frac{\exp\left(\sum_i \lambda_i f_i(x,y)\right)}{Z_\Lambda(x)}$$

is actually the basic idea of multi-class logistic regression.
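To make the parametric form concrete, here is a minimal Python sketch of the conditional distribution $p_\Lambda(y \mid x)$; the feature functions, weights, and example values are hypothetical illustrations, not taken from the paper:

    import numpy as np

    def maxent_conditional(lambdas, features, x, labels):
        # Unnormalized log-score sum_i lambda_i * f_i(x, y) for each candidate y.
        scores = np.array([sum(l * f(x, y) for l, f in zip(lambdas, features))
                           for y in labels])
        scores -= scores.max()   # shift by the max for numerical stability
        w = np.exp(scores)
        return w / w.sum()       # dividing by the sum implements 1 / Z_Lambda(x)

    # Hypothetical indicator features and weights for a three-label problem.
    features = [lambda x, y: float(y == 0 and x > 0),
                lambda x, y: float(y == 1 and x <= 0)]
    print(maxent_conditional(lambdas=[1.5, 0.7], features=features,
                             x=2.0, labels=[0, 1, 2]))

With one weight vector per class, i.e., features of the form $f_{c,j}(x,y) = x_j \cdot 1[y = c]$, this expression reduces to the familiar multi-class logistic (softmax) model, which is the equivalence the next section develops.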
What Is Entropy? Boltzmann’s Interpretation
Posted by John under Physics | Tags: Boltzmann, Clausius, Entropy, Planck
I have in previous blogs explained a little bit about what entropy is and its relationship with the amount of heat exchanged. We have also seen that Clausius discovered that in isolated systems the entropy always increases, and that there is no way we can reduce the entropy of a given isolated system. We also saw that an increase in entropy points towards a degradation of the quality of energy, such that low-quality (= high-entropy) energy can no longer be converted into work.
So far this has been a more or less phenomenological description of what entropy is. You can compare this with the observation that every time you toss a stone up, it falls back to earth. If you make a law of gravity that says that bodies will always fall back to earth, then that law is probably very accurate, but it does not add to your understanding of what gravity really is! We are in a similar situation with Clausius's definition of entropy: we still do not understand what entropy is. This changed when, around 1900, Ludwig Boltzmann[1], an Austrian scientist, started to think about what entropy actually was.
Around 1900 there was still a fierce debate going on among scientists about whether atoms really existed. Boltzmann was convinced that they existed and realized that models relying on atoms and molecules, their energy distribution, and their speed and momentum could be of great help in understanding physical phenomena. Because atoms were supposed to be very small, even relatively small systems already contain a tremendous number of them. For example, one milliliter of water contains about 3×10²² molecules! Clearly it is impossible to keep track of quantities like energy and velocity for each individual atom. Boltzmann therefore introduced a mathematical treatment using statistical mechanical methods to describe the properties of a given physical system (for example, the relationship between the temperature, pressure, and volume of one liter of air). Boltzmann's idea behind statistical mechanics was to describe the properties of matter from the mechanical properties of atoms or molecules. In doing so, he was finally able to derive the Second Law of Thermodynamics around 1890 and showed the relationship between the atomic properties and the value of the entropy of a given system. It was Max Planck who, based on Boltzmann's results, formulated what was later called the Boltzmann expression:
S = k ln W
Here S is the entropy, k is the Boltzmann constant, ln is the natural logarithm, and W is the number of ways the system can be realized. This last point typically causes some trouble when we try to understand it: the value of W is basically a count of how many microscopic configurations are compatible with a system's given macroscopic characteristics. Let me give an example. Imagine you have a deck of 4 identical cards. The deck as a whole can be described with parameters such as the number of cards, the thickness of the deck, its weight, and so on. With four cards we have 4×3×2×1 = 24 possible orderings that all lead to the same deck of cards (in terms of the parameters above). Therefore in this case W = 24. The Boltzmann constant k equals 1.4×10⁻²³ J/K, and the entropy S is then k ln 24 = 4.4×10⁻²³ J/K. The more possibilities a given system has to establish itself (and with the many atoms in one gram of material there are very many possibilities!), the more likely it is that we will indeed observe that system, and the higher its entropy will be.
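To make the arithmetic easy to check, here is a small Python sketch of the card-deck calculation (using a more precise value of k; the 1.4×10⁻²³ above is a rounding):

    import math

    k = 1.380649e-23        # Boltzmann constant in J/K
    W = math.factorial(4)   # 4 cards give 4*3*2*1 = 24 orderings (microstates)
    S = k * math.log(W)     # S = k ln W; math.log is the natural logarithm
    print(W, S)             # 24, roughly 4.4e-23 J/K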
Now it is easier to understand Clausius's observation that the entropy increases all the time: a given (isolated) system will tend to become more disordered, and disordered states are more likely to occur. Unfortunately, the more disorder a given system has, the less useful it is from a human perspective. Energy is much more useful when it is captured in a liter of fuel than when that same amount of energy, after we have burned the fuel, is distributed all over the environment! Clearly the entropy went up because the disorder increased after burning.
Copyright © 2007 John E.J. Schmitz
[1] Ludwig Boltzmann was born in 1844 in Vienna. He was a theoretical physicist who worked in various places: Graz, Heidelberg, Berlin, and Vienna. From 1902 he taught mathematical physics and philosophy in Vienna, for which he became very famous. His statistical mechanical theory received a lot of criticism from peers such as Wilhelm Ostwald. Because of these continuous attacks and his depression, he committed suicide in 1906 in Trieste (Italy). On his tomb one can find the famous formula S = k log W.
Comments

October 14, 2007 at 3:19 pm
I bought the book two months ago. It is an interesting book, except that there is not much physics of entropy in it; in particular, it lacks an extensive explanation of some popular misunderstandings, which were discussed a lot on Professor Lambert's website. I was also expecting more words regarding the new concepts of high- (or low-) entropy energies.
October 16, 2007 at 6:54 pm
Dear Ray,
Thanks for your comment.
I take your remark very seriously and would like to learn more. Can you be more specific, or do you perhaps have suggestions? Of course one has to make choices while preparing a book with this limited scope and keeping it digestible for the non-physicist. But maybe I made the wrong choices. On the other hand, I can be more extensive at the blog site and will be happy to do that.
Thanks for your time and interest,
John
August 5, 2008 at 12:39 pm
This explanation is very interesting. Everything is clearly described, step by step. But it is not so easy to understand when we look at the given examples, so I request you to give your kind attention to those examples. I look forward to seeing your updates of this page.
Suchandan
August 5, 2008 at 8:21 pm
Dear Suchandan,
Thanks for your comment. I am certainly willing to elaborate more and to update the article to make it clearer. But I am not sure which examples you refer to. Can you be a bit more specific?
Best regards, John
February 9, 2011 at 7:08 pm
I suppose it is about this example with the deck of cards. It is a little too enigmatic for me to understand that the more possibilities there are of arranging these cards, “the more likely it will be that we will indeed observe that system”. I think that even if I have two cards (so only 2 microstates) I will see them just as well as those 4 cards :D
greetings, X
May 22, 2014 at 3:01 pm
Thank you for this post. Please help an idiot out. I have seen the formula written either with log (as in S = k log W) or, as you state it, as S = k ln W. You state that ln is the “natural logarithm”; unfortunately I do not understand this evidently “self-explanatory” statement. I mostly slept through high school and college. Can you give me a hint as to what “ln” means?
August 7, 2014 at 10:51 am
In both formulas, k or K stands for the Boltzmann constant; normally it is written in lower case.