Sunday, August 17, 2014

volatility01 variance01: rate of change of the prediction standard error under different residuals (log-normal model); log returns (log normalized) assume that investors hate variance per se, whereas in fact investors hate drawdowns

[PDF]

http://quantivity.wordpress.com/2011/02/21/why-log-returns/

human mathematics

August 23, 2011 11:48 am

Regarding the paper links: There is no perfect objective metric. Log returns assumes that investors hate variance per se, whereas in fact investors hate drawdowns. Investors also hate some integral of drawdowns convolved with a convex function of time. Unless they subscribe to some philosophy, e.g. buy-and-hold-and-never-let-go, that has taught them otherwise.

[DOC]

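Error Theory and Data Processing - Hubei Provincial Education Examinations Authority

www.hbea.edu.cn/files/.../06018误差理论与数据处理.doc

... concepts; for equal-precision and unequal-precision measurements, the meaning and calculation of the arithmetic mean (or average), the weighted arithmetic mean, the standard deviation (root-mean-square error), the standard deviation of the weighted arithmetic mean, and the limit error; residual errors ...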

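Square each sample value, then add them all up and take the mean; once you have that mean, take its square root.

For an amplitude, the plain average should be 0 (because values of opposite sign cancel out). The root mean square instead shows the size of the amplitude's fluctuation (but not its direction).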
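The recipe above can be sketched in a few lines of Python. The sine wave, its unit amplitude, and the 1000-sample grid are assumptions chosen for illustration and do not come from the original notes.

```python
import math

def mean(xs):
    # Plain arithmetic average; signed values can cancel.
    return sum(xs) / len(xs)

def rms(xs):
    # Square each sample, average the squares, then take the square root.
    return math.sqrt(sum(x * x for x in xs) / len(xs))

# One full period of a unit-amplitude sine wave (hypothetical test signal).
n = 1000
wave = [math.sin(2 * math.pi * k / n) for k in range(n)]

print(f"mean = {mean(wave):.6f}")
print(f"rms  = {rms(wave):.6f}")
```

For this signal the mean is zero up to rounding, while the RMS comes out at about 0.7071 (one over the square root of two), which is the size of the swing without its sign.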

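Root mean square: square each sample value, then sum them all and take the mean; surge current or surge voltage (bose, 8/10/14)
tw01 huang01 "root-mean-square speed at room temperature, electron volt": some quantities are averages over a cross-sectional area, such as current or pressure, while others are averages over a volume, such as density (bose, 8/10/14)
Root-mean-square speed and mean translational kinetic energy of oxygen molecules (O2) at room temperature (20 °C); the temperatures of some stars reach the order of $10^8$ K, at which atoms no longer exist and only protons remain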

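Master's thesis, Graduate Institute of Statistics and Actuarial Science, Feng Chia University - Feng Chia University thesis submission ...

ethesys.lib.fcu.edu.tw/ETD-search/getfile?URN=etd-0222110...etd...

μ (log-normal model) ... Table 8-4-13: Rate of change of the prediction standard error under different residuals (log-normal model) ... Table 8-4-14: Under the two kinds of Pearson residuals ...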

Logarithmic growth

From Wikipedia, the free encyclopedia


A graph of logarithmic growth

In mathematics, logarithmic growth describes a phenomenon whose size or cost can be described as a logarithm function of some input. e.g. y = C log (x). Note that any logarithm base can be used, since one can be converted to another by multiplying by a fixed constant.^[1] Logarithmic growth is the inverse of exponential growth and is very slow.^[2]
A familiar example of logarithmic growth is the number of digits needed to represent a number, N, in positional notation, which grows as log_b (N), where b is the base of the number system used, e.g. 10 for decimal arithmetic.^[3] In more advanced mathematics, the partial sums of the harmonic series

     $\displaystyle 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\cdots$

grow logarithmically.^[4] In the design of computer algorithms, logarithmic growth, and related variants, such as log-linear, or linearithmic, growth are very desirable indications of efficiency, and occur in the time complexity analysis of algorithms such as binary search.^[1]
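Two of the examples above, digit counts and harmonic partial sums, are easy to check numerically. The following is a minimal sketch; the helper names `num_digits` and `harmonic` are made up for illustration.

```python
import math

def num_digits(n, base=10):
    # Count the digits of a positive integer by repeated integer division.
    count = 0
    while n > 0:
        n //= base
        count += 1
    return count

def harmonic(n):
    # Partial sum 1 + 1/2 + ... + 1/n of the harmonic series.
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    print(n, num_digits(n), round(harmonic(n), 4), round(math.log(n), 4))
```

Each hundredfold increase in n adds only two decimal digits, and the partial sum stays close to ln(n) plus a constant of roughly 0.5772 (the Euler-Mascheroni constant).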
Logarithmic growth can lead to apparent paradoxes, as in the martingale roulette system, where the potential winnings before bankruptcy grow as the logarithm of the gambler's bankroll.^[5] It also plays a role in the St. Petersburg paradox.^[6]
In microbiology, the rapidly growing exponential growth phase of a cell culture is sometimes called logarithmic growth. During this bacterial growth phase, the number of new cells appearing is proportional to the population. This terminological confusion between logarithmic growth and exponential growth may possibly be explained by the fact that exponential growth curves may be straightened by plotting them using a logarithmic scale for the growth axis.^[7]

Quantivity

Uncommon Returns through Quantitative and Algorithmic Trading

Why Log Returns

February 21, 2011

A reader recently asked an important question, one which often puzzles those new to quantitative finance (especially those coming from technical analysis, which relies upon price pattern analysis):

Why use the logarithm of returns, rather than price or raw returns?

The answer is several fold, each of whose individual importance varies by problem domain.

Begin by defining a return: $r_i$ at time $i$ , where $p_i$ is the price at time $i$ and $j \equiv (i - 1)$ :
     $r_i = \frac{p_i - p_j}{ p_j }$
Benefit of using returns, versus prices, is normalization: measuring all variables in a comparable metric, thus enabling evaluation of analytic relationships amongst two or more variables despite originating from price series of unequal values. This is a requirement for many multidimensional statistical analysis and machine learning techniques. For example, interpreting an equity covariance matrix is made sane when the variables are both measured in percentage.
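To make the normalization point concrete, here is a small sketch that applies the definition of $r_i$ to two price series at very different price levels. The prices are invented for illustration.

```python
def simple_returns(prices):
    # r_i = (p_i - p_j) / p_j with j = i - 1
    return [(p - q) / q for q, p in zip(prices, prices[1:])]

# Hypothetical prices: one stock trading near 10, another near 500.
cheap = [10.0, 10.5, 10.29]
pricey = [500.0, 525.0, 514.5]

print(simple_returns(cheap))
print(simple_returns(pricey))
```

Both series give returns of about +5% and then -2%, even though their price changes differ by a factor of fifty, which is what makes the two comparable in something like a covariance matrix.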
There are several benefits of using log returns, both theoretic and algorithmic.
First, log-normality: if we assume that prices are distributed log normally (which, in practice, may or may not be true for any given price series), then $\log(1 + r_i)$ is conveniently normally distributed, because:
     $1 + r_i = \frac{p_i}{p_j} = \exp\left(\log\left(\frac{p_i}{p_j}\right)\right)$
This is handy given much of classic statistics presumes normality.
Second, approximate raw-log equality: when returns are very small (common for trades with short holding durations), the following approximation ensures that log returns are close in value to raw returns:
     $\log(1 + r) \approx r$ , $|r| \ll 1$
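A quick numerical check shows how fast the approximation degrades as returns grow. The helper name `approximation_gap` and the sample values of r are illustrative choices.

```python
import math

def approximation_gap(r):
    # Difference between the raw return and the log return log(1 + r).
    return r - math.log1p(r)

for r in (0.0001, 0.001, 0.01, 0.05, 0.10, 0.50):
    print(f"r={r:<7} log(1+r)={math.log1p(r):.6f} gap={approximation_gap(r):.6f}")
```

The gap behaves like $r^2/2$ for small $r$, so it is negligible for basis-point moves but already close to ten percentage points for a 50% return.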
Third, time-additivity: consider an ordered sequence of $n$ trades. A statistic frequently calculated from this sequence is the compounding return, which is the running return of this sequence of trades over time:
     $\displaystyle (1 + r_1)(1 + r_2) \cdots (1 + r_n) = \prod_i (1+r_i)$
This formula is fairly unpleasant, as probability theory reminds us that the product of normally-distributed variables is not normal. Instead, the sum of normally-distributed variables is normal (important technicality: this is guaranteed only when the variables are independent or, more generally, jointly normal), which is useful when we recall the following logarithmic identity:
     $\log(1 + r_i) = \log(\frac{p_i}{p_j}) = \log(p_i) - \log(p_j)$
Thus, the log of the compounding return is normally distributed. Finally, this identity leads us to a pleasant algorithmic benefit: a simple formula for calculating compound returns:
     $\displaystyle \sum_i \log(1+r_i) = \log(1 + r_1) + \log(1 + r_2) + \cdots + \log(1 + r_n) = \log(p_n) - \log(p_0)$
Thus, the compound return over n periods is merely the difference in log between initial and final periods. In terms of algorithmic complexity, this simplification reduces O(n) multiplications to O(1) additions. This is a huge win for moderate to large n. Further, this sum is useful for cases in which returns diverge from normal, as the central limit theorem reminds us that the sample average of this sum will converge to normality (presuming finite first and second moments).
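The telescoping sum can be verified directly. This sketch uses a made-up price path and compares three routes to the same compound return: the product of $(1 + r_i)$, the sum of log returns, and the difference of the first and last log prices.

```python
import math

prices = [100.0, 102.0, 99.96, 104.958]   # hypothetical p_0 .. p_n

simple = [(p - q) / q for q, p in zip(prices, prices[1:])]
log_rets = [math.log(p / q) for q, p in zip(prices, prices[1:])]

# Route 1: multiply the gross returns, O(n) multiplications.
compound = math.prod(1 + r for r in simple) - 1

# Route 2: add the log returns.
sum_logs = sum(log_rets)

# Route 3: only the endpoints are needed.
shortcut = math.log(prices[-1]) - math.log(prices[0])

print(compound, math.expm1(sum_logs), math.expm1(shortcut))
```

All three agree up to floating-point rounding; `math.expm1` converts a log return back to a simple return.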
Fourth, mathematical ease: from calculus, we are reminded (ignoring the constant of integration):
     $e^x = \int e^x dx = \frac{d}{dx} e^x = e^x$
This identity is tremendously useful, as much of financial mathematics is built upon continuous time stochastic processes which rely heavily upon integration and differentiation.
Fifth, numerical stability: addition of small numbers is numerically safe, while multiplying small numbers is not as it is subject to arithmetic underflow. For many interesting problems, this is a serious potential problem. To solve this, either the algorithm must be modified to be numerically robust or it can be transformed into a numerically safe summation via logs.
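The underflow risk is easiest to see with factors much smaller than one, such as per-step probabilities; gross returns $(1 + r_i)$ sit near one and are far less prone to it. The following sketch assumes 400 factors of 1e-3 each, numbers picked purely to force the effect.

```python
import math

# 400 hypothetical small factors (for example per-step probabilities).
factors = [1e-3] * 400

product = 1.0
for f in factors:
    product *= f   # true value is 1e-1200, far below the smallest double

# The same quantity carried in log space stays well within range.
log_sum = sum(math.log(f) for f in factors)

print(product)
print(log_sum)
```

The running product collapses to exactly 0.0, while the sum of logs remains an ordinary finite number that can still be compared or differenced.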
As suggested by John Hall, there are downsides to using log returns. Here are two recent papers to consider (along with their references):

Comparing Security Returns is Harder than You Think: Problems with Logarithmic Returns, by Hudson (2010)

Quant Nugget 2: Linear vs. Compounded Returns – Common Pitfalls in Portfolio Management, by Meucci (2010)


30 Comments

John Hall

February 21, 2011 1:23 am

I’m glad you posted this. It certainly was something I struggled with for a long time. This is one of those things that I never learned in school or in the CFA curriculum, and if I hadn’t started reading academic papers I never would have figured it out.
In my own study, I was convinced most by the explanation in Meucci’s Risk and Asset Allocation book. There are hints of what he says in what you say. Basically, his argument is that you should take some invariants and then map them to expected market prices. Since you should be concerned about how these invariants move forward into time and how they can be combined into market prices, the properties of the invariants are more important than the properties of the final market prices (arithmetic returns are easy to aggregate for 1 point in time, but geometric returns are better to aggregate through time). Also, as you note, there is an easy formula to convert one to the other.
I was a bit tripped up in his analysis b/c if the geometric returns follow some garch or regime-switching process, then they aren’t IID. However, the log returns of these variables can still be projected in each period following these processes and then mapped to market prices for use in optimization. Meucci has a good short paper on why you shouldn’t use the projection of log returns in optimization on ssrn.
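The aggregation point in this comment (simple returns combine across assets at one point in time, log returns combine through time) can be illustrated with a two-asset example. The weights and returns are hypothetical, and this is only a sketch of the general idea, not a reproduction of Meucci's framework.

```python
import math

weights = [0.6, 0.4]      # hypothetical portfolio weights
simple = [0.10, -0.05]    # hypothetical one-period simple returns per asset

# Across assets, simple returns aggregate linearly.
portfolio_simple = sum(w * r for w, r in zip(weights, simple))

# The weighted sum of log returns is NOT the portfolio's log return.
logs = [math.log1p(r) for r in simple]
weighted_logs = sum(w * x for w, x in zip(weights, logs))
portfolio_log = math.log1p(portfolio_simple)

print(portfolio_simple, portfolio_log, weighted_logs)

# Conversion formulas: log = log(1 + simple), simple = exp(log) - 1
print(math.expm1(portfolio_log))
```

The weighted log figure understates the true portfolio log return because the logarithm is concave, so it is usually safer to aggregate across assets in simple returns and convert afterwards.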

quantivity

February 21, 2011 1:50 am
Good point to highlight the downsides. Updating post now to include links to several relevant papers.

- Paul Grimoldi
  
  February 25, 2011 8:28 pm
  With respect to the paper on high frequency trading, the variable that is modelled as iid is the cross-sectional volatility of principal components. There is absolutely no relationship between this independence and the returns of stocks in the original dimension. It is intuitive to model on a short term horizon the shocks that dislocate the relative relationships of principal components as iid events. This characteristic makes the Euclidean distance a good tool to measure aggregate change in these dislocations over a period of time H. Under this scenario, returns in the original dimension can very well show autoregressive behavior, and there is by no means an assumption of independence between them. I think the main point of the paper lies here: the real model is on the cross-sectional vol of principal components, and not in the returns themselves. Hope this helps. P.-

human mathematics

August 23, 2011 11:48 am

quantivity

August 23, 2011 12:55 pm
@human: agreed, thanks for your comment; more formally, investors appear to hate negative semi-variance and express that temporal preference non-linearly. I am working on an Asset Allocation post that introduces more formality of both ideas; would be interested to get your ideas / comments on that, after it is posted.
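The distinction drawn in this exchange between variance, negative semi-variance, and drawdown can be sketched as follows. The definitions used here (population variance, squared shortfall below a zero target, peak-to-trough loss as a fraction of the peak) are one common convention each, and the return sequences are hypothetical.

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def downside_semivariance(xs, target=0.0):
    # Average squared shortfall below the target; gains contribute zero.
    return sum(min(x - target, 0.0) ** 2 for x in xs) / len(xs)

def price_path(returns, start=100.0):
    prices = [start]
    for r in returns:
        prices.append(prices[-1] * (1.0 + r))
    return prices

def max_drawdown(prices):
    # Largest peak-to-trough decline, as a fraction of the running peak.
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst

# Same returns, different order (hypothetical).
returns_a = [0.05, -0.05, 0.05, -0.05]
returns_b = [-0.05, -0.05, 0.05, 0.05]

for name, rets in (("a", returns_a), ("b", returns_b)):
    print(name, variance(rets), downside_semivariance(rets),
          max_drawdown(price_path(rets)))
```

Both sequences have identical variance and semi-variance, yet the second one suffers nearly twice the drawdown, because variance ignores the order in which losses arrive.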

- human mathematics
  
  August 24, 2011 7:42 am
  Are you a former geologist? I was using the word “quasimetric” rather than “semivariance”
- human mathematics
  
  August 24, 2011 7:51 am
  Did you read Danny Kahneman’s paper on remembered utility? That gave me some ideas on what kind of function to convolve the downside against.
  It might be even more complicated than downsides — maybe brief upsides vs spiky upsides vs long-and-steady affects how investors take the downside. I’m reading The Big Short right now. It’s incredible how Michael Burry’s investors lampooned him, even when he was RIGHT, calling the biggest short of the last decade with surgical precision. http://books.google.com/books?id=eParwQ0YdrcC&printsec=frontcover&dq=the+big+short&hl=en&src=bmrr&ei=vQ9VTpCDN8K_tgfDl_CPAg&sa=X&oi=book_result&ct=book-thumbnail&resnum=1&ved=0CEIQ6wEwAA#v=onepage&q=his%20investors&f=false
  Sure – you have my email now – so beep at me when you’ve finished it.

Pat Burns

December 27, 2011 10:28 am

Another answer to the question of the title is http://www.portfolioprobe.com/2010/10/04/a-tale-of-two-returns/

Aykut

April 7, 2012 2:54 am

Well, with due respect, I opine that the real answer to the ‘why ln(x), instead of x’ question is more elegant than the article suggests:
‘X’ amount of profit is a quantity in a 10-digit space, which gives a distorted view of natural quantities. One has to take the natural logarithm of a naturally occurring quantity to bring it down to the undistorted/real scale. This is why it is called the ‘Natural’ Logarithm. Hence, we take the natural logarithm of the returns to apply summation and subtraction operations on them.
This is also why the ln() of the returns has a normal distribution. This is also why linear interpolation works on the ln() of the returns.

Pablo_Garg

July 18, 2012 7:02 am

Hello, let me ask something. Do you know if I can use excess returns with logs? Moreover, if I decide to use logs, how can I derive the difference with the risk-free rate?

Michael

September 27, 2012 2:29 am

Do you really mean $\log(p_n) - \log(p_0)$?
Why don’t you write $\log(r_n) - \log(r_0)$ instead?

quantivity

September 27, 2012 11:19 pm
@Michael: expression is correct, due to identity:
$\log(1 + r_i) = \log(p_i) - \log(p_j)$


casablanca

May 2, 2013 1:18 am

I’m a writer from Stonefield, Great Britain just forwarded this onto a coworker who was running some research on this. And she actually ordered me lunch just because I came across it for her… lol. So allow me to reword this…. Thanks for the meal… But yeah, thanx for spending some time to talk about this issue here on your blog.

tallahassee Lawyer

May 2, 2013 2:54 pm

At last, after surfing http://quantivity.wordpress.com/2011/02/21/why-log-returns/ for quite some time, I got a site from which I was able to genuinely discover worthwhile facts in regard to the studies and the knowledge that I want. There need to be more things like this on WordPress

Nick Vintila

January 12, 2014 2:03 pm

@quantivity: I am a discretionary trader seeking a quantitative edge.
Have a related question and maybe you can provide guidance.
You said:
“… if we assume that prices are distributed log normally (which, in practice, may or may not be true for any given price series)”
“… given much of classic statistics presumes normality”
Question: How material is the “normality assumption” in creating profitable quant models and strategies?
How can one discern when this assumption is realistic (worth working with as a practitioner) and when it is purely for academic reasons?
Allow me to elaborate based on my limited knowledge so far.
It seems that the vast majority of quantitative education is founded on the normality assumption and on the creation of models from historical distributions.
“Risk” seems to be about going (or not going) outside the first/second standard deviation of the Gaussian curve.
As a trader, it almost sounds like this was born in a “buy and hold” world where time works in an investor’s favor.
Also as a trader, I know that outside the first standard deviations is where the best trades are in this volatile world.
On the other hand (from the Gaussian assumption), quant practitioners like Nassim Taleb advocate for non Gaussian ways.
Some highlights below:
“Granted, it has been tinkered with, using such methods as complementary “jumps”, stress testing, regime switching or the elaborate methods known as GARCH, but while they represent a good effort, they fail to address the bell curve’s fundamental flaws.
…
These two models correspond to two mutually exclusive types of randomness: mild or Gaussian on the one hand, and wild, fractal or “scalable power laws” on the other. Measurements that exhibit mild randomness are suitable for treatment by the bell curve or Gaussian models, whereas those that are susceptible to wild randomness can only be expressed accurately using a fractal scale. The good news, especially for practitioners, is that the fractal model is both intuitively and computationally simpler than the Gaussian, which makes us wonder why it was not implemented before.
…
Indeed, this fractal approach can prove to be an extremely robust method to identify a portfolio’s vulnerability to severe risks. Traditional “stress testing” is usually done by selecting an arbitrary number of “worst-case scenarios” from past data. It assumes that whenever one has seen in the past a large move of, say, 10 per cent, one can conclude that a fluctuation of this magnitude would be the worst one can expect for the future. This method forgets that crashes happen without antecedents. Before the crash of 1987, stress testing would not have allowed for a 22 per cent move.
…
Any attempts to refine the tools of modern portfolio theory by relaxing the bell curve assumptions, or by “fudging” and adding the occasional “jumps” will not be sufficient. We live in a world primarily driven by random jumps, and tools designed for random walks address the wrong problem. It would be like tinkering with models of gases in an attempt to characterise them as solids and call them “a good approximation”.
Pasted from the “A focus on the exceptions that prove the rule” article on ft dot com.
Like I said at the beginning, I am looking for a quant edge and need a way to navigate through the vast knowledge and avoid assumptions that are incompatible with a practitioner’s reality.
Can you please offer some criteria and/or references that would help me navigate?
Is the Gaussian assumption safe as foundation for profitable models?
How to verify this?

quantivity

January 12, 2014 4:00 pm
@Nick: As P&L is what matters for trading, epistemological debates have little productive to offer beyond intellectual amusement (and selling books for Taleb). Day-to-day practice of trading is agnostic to these academic debates.
In practice, the single most important concept to understand is the existence and distinction between alpha model and risk model; and use of the correct corresponding quantitative methodology for each.
- Alpha model: describes how you make money; use whatever model makes the best money
- Risk model: describes your downside risk exposure; use whatever model best describes reality of the alpha phenomenon
Many folks unknowingly conflate the two. Avoid that.
To your specific questions:
Q: How material is the “normality assumption” in creating profitable quant models and strategies?
A: Irrelevant for alpha model; if a model makes money (whatever the distribution), trade it. Maybe relevant for risk model, if alpha phenomenon is best described by a non-Gaussian distribution.
Q: How can one discern when this assumption is realistic (worth working with as a practitioner) and when it is purely for academic reasons?
A: Build and then measure: build a risk model, trade over time, and then measure whether you lose more money than you expect. If you are losing more than your model says, then one of your assumptions is wrong.
Q: Vast majority of quantitative education is founded on the normality assumption and on the creation of models from historical distributions.
A: Historical accident. Gaussian distributions (and the exponential family, more generally) are mathematically convenient, so much of the closed-form Q world is built on them. Modern computational finance and ML methods, such as Monte Carlo methods (e.g. particle filters), are mostly agnostic to distribution assumptions.
Q: Is the Gaussian assumption safe as foundation for profitable models?
A: Maybe or maybe not. Models should be built to describe reality. Blind adherence to any unverified mathematical assumption is likely to lead to hardship in trading.
Q: How to verify this?
A: Many diverse statistical techniques exist. QQ-plot is an elementary technique from exploratory analysis which may be applicable.
Q: Can you please offer some criteria and/or references that would help me navigate?
A: Strive to build models that describe reality. Use the best tools at your disposal and make as few unverified assumptions as possible. Continuously check reality to verify it matches your assumptions.
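The QQ-plot mentioned in the answers above can be approximated without a plotting library by pairing sorted observations with the quantiles of a fitted normal. This is a minimal sketch using the standard library's `statistics.NormalDist`; the sample returns and the plotting positions `(k + 0.5) / n` are assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (theoretical, observed) quantile pairs against a fitted normal."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (k + 0.5) / n are one common convention.
    theo = [fitted.inv_cdf((k + 0.5) / n) for k in range(n)]
    return list(zip(theo, xs))

# Hypothetical daily returns with one crash-like observation.
returns = [0.01, -0.02, 0.005, 0.0, -0.25, 0.015, -0.01, 0.02, 0.003, -0.004]

for theoretical, observed in qq_points(returns):
    print(f"{theoretical:+.4f}  {observed:+.4f}")
```

If the data were close to normal, the two columns would track each other; here the smallest observation sits far below its theoretical quantile, which is the fat-tail signature a QQ-plot is meant to expose.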


Johannes

January 22, 2014 12:43 pm

Thank you for this very convincing explanation. A little point came to my mind regarding the sum of small numbers which you mentioned as a pro of log returns. It is certainly right that arithmetic underflow can occur when multiplying small numbers but also the difference of small numbers is subject to numerical problems, cf. for example here: http://en.wikipedia.org/wiki/Loss_of_significance
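The loss-of-significance concern raised here shows up when computing $\log(1 + r)$ for a very small $r$. A minimal sketch, assuming a return of 1e-12 picked only to make the effect visible:

```python
import math

r = 1e-12                     # hypothetical, very small simple return

naive = math.log(1.0 + r)     # 1.0 + r is rounded before the log is taken
stable = math.log1p(r)        # evaluates log(1 + r) without forming 1 + r

print(naive)
print(stable)
print(abs(naive - r) / r, abs(stable - r) / r)
```

The naive form loses most of the significant digits of `r` when it is added to 1.0, whereas `math.log1p` keeps them. The same caution applies to differencing two nearly equal log prices.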