Wednesday, December 28, 2016

An introduction to basic statistics principle behind the A/B testing

Part I. what is A/B testing?

"A/B testing is comparing two versions of a web page to see which one performs better. You compare two web pages by showing the two variants (let's call them A and B) to similar visitors at the same time. The one that gives a better conversion rate, wins!"

For example, we are doing A/B testing for Holberton School webpage. In one version (A version, the old one), we have the 'click to apply immediately" on the menu, and another version (B version, the new one) the button is right next to the picture of "cisfun$" (coding your own shell). 2500 views are tested and the application rate is 12.5% in version A and 14% in version B.

Now, Julien asks us, "Hi, guys, how do we know if version B is really better?"

https://www.holbertonschool.com/

click on link #1: https://www.optimizely.com/ab-testing/

Part II. statistics as we know is all about a null distribution

A / B test is a comparative test, in the course of the experiment, we extract some samples from the population to collect data, and then come to an overall population parameter estimation. The scientific basis from which we can derive valid conclusions from experimental data is based on statistical principles.

1. statistical hypothesis testing and significance testing: basic concepts

In order to answer Julien's question, we need to set up two hypotheses and then test them.

Null Hypothesis (Ho): There is no difference between two webpages. We hope to overturn this hypothesis by our test results.
Alternative Hypothesis (Ha): We wish to validate the hypothesis by our test results

"Statistical Methodology - General A) Null hypotheses - Ho 1) In most sciences, we are faced with: a) NOT whether something is true or false (Popperian decision) b) BUT rather the degree to which an effect exists (if at all) - a statistical decision. 2) Therefore 2 competing statistical hypotheses are posed: a) Ha: there is a difference in effect between (usually posed as < or >) b) Ho: there is no difference in effect between"

click on link#2, page 13:
http://courses.pbsci.ucsc.edu/eeb/bioe286/Lecture%20Handouts/The%20linkage%20between%20Popperian%20Science%20and%20Statistical%20Analysis%20-%202016.pdf

2. The most important concept behind is null distribution

Page 8:

"Almost all ordinary statistics are based on a null distribution • If you understand a null distribution and what the correct null distribution is then statistical inference is straight-forward. • If you don’t, ordinary statistical inference is bewildering • A null distribution is the distribution of events that could occur if the null hypothesis is true"

https://www.google.com/search?q=oak+seedings+controlled+studies&biw=1600&bih=794&tbm=isch&source=lnms&sa=X&ved=0ahUKEwirp6eglZrRAhWG2yYKHeI0CpoQ_AUICCgD&dpr=1#imgrc=aczLFFR0vh9JsM%3A

3. We, of course, do hope to reject the null hypothesis Ho by our test results

But how do we do that statistically?

Logic: The null hypotheses and alternative hypotheses are a complete set of events, and they are opposite of each other. In a hypothesis test, either the null hypothesis H0 or the alternative hypothesis Ha must hold, and if one does not hold, one must accept the other one unconditionally. So, it is either Ho true, or Ha true.

In the A / B test, the purpose of our experiment is to demonstrate that the new version and the old version are statistically significantly different, in terms of application rate.

Again, the original null hypothesis Ho in this scenario is that there is no difference between the old version and the new version of webpage, Yes, there may be some differences in terms of application data we collected, but this difference in data is due to random fluctuations in a null distribution: meaning the webpage viewers as an population has certain "statistical and random fluctuations" in terms of their visits and filling out of applications, and this kind of fluctuations is bidirectional or with two -tails, sometime is more positive than negative, other times is more negative than positive.

Part III. Type I error is critical

Now we have a much better understanding of A/B testing: we want to do a A / B test to reject the null hypothesis Ho of no difference, proving Ho is false, and therefore proving that the alternative hypothesis Ha is true.

1. What is Type I error?

Type I error: The null hypothesis is rejected when the null hypothesis is true

This is saying that we did all the A/B testing, and in our report we rejected Ho, proving Ho is false, and therefore proving that the alternative hypothesis (Ha) is true.

But there is a probability that we could have made some sampling errors somewhere, and the probability of making this kind of error is called α.

"An analogy that some people find helpful (but others don't) in understanding the two types of error is to consider a defendant in a trial. The null hypothesis is "defendant is not guilty;" the alternate is "defendant is guilty."⁴A Type I error would correspond to convicting an innocent person; a Type II error would correspond to setting a guilty person free."

click on link #3:
https://www.ma.utexas.edu/users/mks/statmistakes/errortypes.html

Again, the logic is: The null hypotheses and alternative hypotheses are a complete set of events, and they are opposite of each other. In a hypothesis test, either the null hypothesis H0 or the alternative hypothesis Ha must hold, and if one does not hold, one must accept the other one unconditionally. So, it is either Ho true, or Ha true.

2. Type I error and p-value

A standard value of 0.05 is the generally accepted probability that we may commit Type I error.

This 5% probability or significance level in statistics is called α, also represents the confidence level of our test results. If α is 0.05, then the confidence is 0.95, that is, if our A/B testing shows that our chance of Type I error is <5%, then we are confident with a > 95% probability that the positive feedback we get from new our new web page is due to our newly improved webpage design.

If we set α to 0.01, then we have a much tougher job proving that our new webpage actually made a difference.

α is set by the industry standard, and we compare to it with out own p-value from our sample data.

3. Definition of p-value.

For a p-value (significance level) of 0.05, one expects to obtain sample means in the critical region 5% of the time when the null hypotheses is true.

Or in other words, for every 100 tests (independent and random) we conduct, there are 5 chances that we actually get the positive test results we hope for, but these 5 positive test results are not statistically significant at all for us to reject the null hypotheses Ho.

If p-value calculated from our sample data <= α (set by industry standard) , we can say that our tests have yielded statistically significant positive results we hope for, and we therefore can reject null hypothesis Ho, and accept alternative hypothesis Ha.

4. P-value calculation.

click on link #4: page 17, page 20, page 23, page 26.

http://courses.pbsci.ucsc.edu/eeb/bioe286/Lecture%20Handouts/The%20linkage%20between%20Popperian%20Science%20and%20Statistical%20Analysis%20-%202016.pdf

Part IV. Type II Error: Ho is false, but we have accepted it, & therefore wrongly rejected Ha

There is real difference between the two versions of Holberton school webpage, but we mistakenly believe that there is no real difference, and we think any difference is accidental, due to general viewer population's "random fluctuations". The probability of committing Type II error or β can be relatively large, with an industry's standard 10% and 20% , meaning we are more likely to underestimate the the probability that we can actually redesign and improve our webpage.

click on link #5: page 24

http://courses.pbsci.ucsc.edu/eeb/bioe286/Lecture%20Handouts/The%20linkage%20between%20Popperian%20Science%20and%20Statistical%20Analysis%20-%202016.pdf

As with the significance level or Type I error, in order to avoid Type II error, we need to compute β by calculating another parameter to give us a reference, that is, statistical power, similar to the calculating confidence interval, the statistical power to be calculated is 1 - β.

Suppose that the two versions of webpage do differ, and we can correctly reject the null hypothesis and obtain the probability of a statistically significant result, with a statistical power of 80-90 percent.

We calculate statistical power by analyzing sample size, variance, α, and minimum variance or lower confidence interval.

click on link #6: page 27

http://courses.pbsci.ucsc.edu/eeb/bioe286/Lecture%20Handouts/The%20linkage%20between%20Popperian%20Science%20and%20Statistical%20Analysis%20-%202016.pdf

Now, we have finally accomplished the A / B test, and out test results indicate that our newly improved webpage has significantly more application clicks from viewers, with a 95% confidence level, and 80% -90% of the statistical power,

Appendix

1.calculating p values

http://www.cyclismo.org/tutorial/R/pValues.html

Here we look at some examples of calculating p values. The examples are for both normal and t distributions. We assume that you can enter data and know the commands associated with basic probability. We first show how to do the calculations the hard way and show how to do the calculations. The last method makes use of the t.test command and demonstrates an easier way to calculate a p value.

2. Standard deviations and standard errors

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1255808/

Saturday, October 22, 2016

TravelSky Technology(China Airline): Predicting Airfare Price by SVM

TravelSky Technology(China Airline): Predicting Airfare Price by SVM

We were faced with the large dataset with no explicit links between records, making it a very challenging task to analyze price changes of an individual round-trip.

It was much more practical to develop a model that generalizes the properties of all records in the dataset, and to train a SVM as a binary pricing classifier to distinguish between ”expensive” and ”cheap” of all tickets (transaction records) processed.

Part I. General Introduction

Travelsky technologies is one of the largest global distribution system in the travel/tourism industry: it sells tickets for all airlines (also hotels) and processes millions billable transactions per month.

Project Goals

1 Construct and train a general classifier so that it can distinguish between expensive and cheap tickets.

2. Use this classifier to predict the prices of future tickets.

3. Determine which factors have the greatest impact on price by analyzing the trained classifier.

Exploratory data analysis

Extent of the dataset: billion records, 132.2 GiB (uncompressed) , hundreds departure airports, hundreds destinations, hundreds routes, hundreds airlines.

Lots of fields: “Buy” date: When was this price current? “Fly” date: When does the flight leave? Price. Cabin class Economy/Business/First (98% economy tickets) . Booking class A-Z … Airline The airline selling the ticket. some data looks like a time series, tickets are linked over time

Classification & Prediction methods

Implemented two different classifiers: Support vector machine (SVM), L1- regularized linear regression. Both are convex minimization problems that can be solved online by employing the stochastic gradient descent (SGD) method.

SVM: binary linear classifier. Goal: Find maximum-margin hyperplane that divides the points with label y “+1” from those with label y “-1”.

Training: Generate training label yi for i-th data point xi. Choose hyperplane parameters so the margin is maximal and the training data is still correctly classified.

Preprocessing

For each route r, calculate the arithmetic mean (and standard deviation) of the price over all tickets.

Assign labels: Label +: “Above mean price for this route”. Label -: “Below mean price for this route” .Only store mean/std-dev, do not actually store labels.

Feature Selection

Extract features from plaintext records (x).

Each plaintext record is transformed into a 990-dimensional vector.

Each dimension contains a numerical value corresponding to a feature such as: Number of days between “Buy” and “Fly” dates, Week of day (for all dates) , Is the day on a weekend (for all dates), Dates isMonday, isWeekend, isWinter, weekOfYear, . . . .

Each dimension is normalized to zero mean and unit variance (per route r).

Part II. More Detailed Descriptions of Our Model

Classification methods

In order to identify which records represent cheap tickets and which records have traits identifying them as expensive tickets, a classifier able to distinguish between ”expensive” and ”cheap” records is necessary.

It should be possible to train such a classifier on all records at once, identifying the features making a record cheaper or more expensive than other records. As some routes are more expensive than others, it does not make sense to include the route as a feature, but rather normalize prices per route. This enables comparison of prices across all routes without simply marking all records of a particular route as expensive. Each record is then labeled according to the normalized price.

In short, a record for a particular route is labeled as ”expensive” (+1) if its price is higher than the average price of all records for that specific route. Otherwise it is labeled as ”cheap” (-1).

After training the classifier, it should be able to predict a label from and assign this label to a new record with an unknown price. As the route of the new record is known, a numerical minimal or maximal price (the afore-mentioned average price per route) can be directly inferred from the predicted label.

Additionally, the model parameters of the trained classifier should contain information on how much each feature contributes to a record being cheap or expensive.

Online algorithms for classification

Due to the large amount of data, algorithms using more than a constant amount of memory are not suitable. Two algorithms were implemented, one for online support vector machines and the other for online-regularized logistic regression. This allows efficient training of the classifier on a parallel system with limited memory.

Some definitions and terminology as they are used in the following sections:

Each data point Xi represents the features of a single record Ri and is also called the feature

vector for record Ri

Each component contains information about a single aspect of the record Ri . The contents are described previously are derived from the fields of Ri

Each label yi represents the label (classification) of a single record Ri, with two values: ”expensive” (+1), “cheap” (-1).

Record Ri always consists of a pair (Xi Yi) , both values are known for trading dataset.

For new data points Xi, the value of the labels Yi is initially unknown, and is the result of the classification/prediction. A label has only two possible values: −1 and 1.

The weight vector w is the model parameter of the classifier to be estimated and is initially

unknown. In both classifiers discussed below, w has the same number of dimensions as a data

point Xi and determines the effect each value in Xi has on the classification result.

Feature vector generation

For each record, a feature vector consisting of 990 features was created. Before normalization, each entry was set to either 1.0 (boolean true), 0 (boolean false) or a value associated with a numerical field in the record.

The feature vector represents each record as a 990-dimensional vector.

Some examples of features, and record fields utilized:

Dates Request Date, Departure Date, Return Date

Date differences Return-Departure Date, Departure-Request Date

Categorical values Passenger Type, Airline

Numerical values Number of passengers, Number of hops

Sequences of categorical values Cabin Classes, Booking Classes, Availabilities

Sequences of numerical values Flight numbers

Feature vector normalization

As with price normalization, each of the 990 features fm was normalized in two steps. In a first MapReduce job, the arithmetic means µfm and standard deviations σfm were calculated using the same methods as for price normalization and subsequently stored to disk.

All following MapReduce jobs loaded the 990 means and standard deviations from disk and calculated the normalized feature vector x 0 i on-the-fly by calculating the standard score of each feature fm: f 0 m = fm − µfm σfm , m ∈ 1, . . . , 990

Stochastic gradient descent (SGD)

Given a convex set S and a convex function f, we can estimate the parameter w in min

f(w) is of the form f(w) = Pn

a single observed data point from the dataset. Finding w is done iteratively, by using one random

sample data point from the dataset per iteration. For regularization, w ∈ S needs to be ensured,

thus a projection onto S is necessary.

Let w0 ∈ S be the starting value. Then each iteration t consists of the update step

t=1 ft(w). Usually, each summand ft represents the loss function for

wt+1 = P rojS(wt − ηt∇ft(wt))

where P rojS is a projection onto the set S, ηt is the current step size (learning rate), and ∇ft is the

gradient of f approximated at the sample data point for iteration t.

It is possible to only use a subsample of the full dataset if the data points used for training are picked

at random from the dataset. Training can then either be halted after a fixed number of iterations or as soon as sufficient accuracy is achieved.

Friday, May 6, 2016

Rule learner (or Rule Induction)

It is also known as Separate-And-Conquer method. This method apply an iterative process consisting in first generating a rule that covers a subset of the training examples and then removing all examples covered by the rule from the training set. This process is repeated iteratively until there are no examples left to cover. The final rule set is the collection of the rules discovered at every iteration of the process [13]. Some examples of these kinds of systems are:

OneR

OneR or “One Rule” is a simple algorithm proposed by Holt. The OneR builds one rule for each attribute in the training data and then selects the rule with the smallest error rate as its ‘one rule’. To create a rule for an attribute, the most frequent class for each attribute value must be determined. The most frequent class is simply the class that appears most often for that attribute value. A rule is simply a set of attribute values bound to their majority class. OneR selects the rule with the lowest error rate. In the event that two or more rules have the same error rate, the rule is chosen at random.

R.C. Holte (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning. 11:63-91.

Ridor

Ridor algorithm is the implementation of a RIpple-DOwn Rule learner proposed by Gaines and Compton. It generates a default rule first and then the exceptions for the default rule with the least (weighted) error rate. Then it generates the “best” exceptions for each exception and iterates until pure. Thus it performs a tree-like expansion of exceptions. The exceptions are a set of rules that predict classes other than the default. IREP is used to generate the exceptions.

Brian R. Gaines, Paul Compton (1995). Induction of Ripple-Down Rules Applied to Modeling Large Databases. J. Intell. Inf. Syst.. 5(3):211-228.

PART

PART is a separate-and-conquer rule learner proposed by Eibe and Witten. The algorithm producing sets of rules called ‘decision lists’ which are ordered set of rules. A new data is compared to each rule in the list in turn, and the item is assigned the category of the first matching rule (a default is applied if no rule successfully matches). PART builds a partial C4.5 decision tree in each iteration and makes the “best” leaf into a rule. The algorithm is a combination of C4.5 and RIPPER rule learning.

Eibe Frank, Ian H. Witten: Generating Accurate Rule Sets Without Global Optimization. In: Fifteenth International Conference on Machine Learning, 144-151, 1998.

JRip (RIPPER)

JRip implements a propositional rule learner, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), which was proposed by William W. Cohen as an optimized version of IREP. Ripper builds a ruleset by repeatedly adding rules to an empty ruleset until all positive examples are covered. Rules are formed by greedily adding conditions to the antecedent of a rule (starting with empty antecendent) until no negative examples are covered. After a ruleset is constructed, an optimization postpass massages the ruleset so as to reduce its size and improve its fit to the training data. A combination of cross-validation and minimum-description length techniques is used to prevent overfitting.

Cohen, W. W. 1995. Fast effective rule induction. In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, California.http://citeseer.ist.psu.edu/cohen95fast.html

DecisionTable

DecisionTable algorithm builds and using a simple decision table majority classifier as proposed by Kohavi. It summarizes the dataset with a ‘decision table’ which contains the same number of attributes as the original dataset. Then, a new data item is assigned a category by finding the line in the decision table that matches the non-class values of the data item. DecisionTable employs the wrapper method to find a good subset of attributes for inclusion in the table. By eliminating attributes that contribute little or nothing to a model of the dataset, the algorithm reduces the likelihood of over-fitting and creates a smaller and condensed decision table.

Ron Kohavi: The Power of Decision Tables. In: 8th European Conference on Machine Learning, 174-189, 1995.

ConjunctiveRule

ConjuctiveRule algorithm implements a single conjunctive rule learner that can predict for numeric and nominal class labels. A rule consists of antecedents “AND”ed together and the consequent (class value) for the classification/regression. In this case, the consequent is the distribution of the available classes (or mean for a numeric value) in the dataset. If the test instance is not covered by this rule, then it’s predicted using the default class distributions/value of the data not covered by the rule in the training data. This learner selects an antecedent by computing the Information Gain of each antecedent and prunes the generated rule using Reduced Error Pruning (REP) or simple pre-pruning based on the number of antecedents. For classification, the Information of one antecedent is the weighted average of the entropies of both the data covered and not covered by the rule.

http://www.dbs.informatik.uni-muenchen.de/Lehre/KDD_Praktikum/weka/doc/weka/classifiers/rules/ConjunctiveRule.html

This entry was posted on Monday, April 14th, 2008 at 12:39 pm and is filed under Data Mining. You can follow any responses to this entry through theRSS 2.0 feed. You can leave a response, or trackback from your own site.

5 Responses to Rule learner (or Rule Induction)

Atif Abdul-Rahman says:

May 8, 2008 at 7:28 pm

Salaam,
very nice blog.
i found the summaries to b well written

Reply
Ridz says:

May 8, 2008 at 8:59 pm

Thank you. You have a very interesting blog too. I will add it to my list

Ridz

Reply
Spyder says:

May 13, 2010 at 7:54 am

Thank You. Excellently Summarized.

Reply
Anonymous says:

October 20, 2014 at 11:57 pm

hello,
nice blog brother

I have a request please

I need deep information about “JRip” for a presentation
Can you help me to find good resources ?

Reply
Manal says:

October 20, 2014 at 11:58 pm

hello,
nice blog brother

I have a request please

I need deep information about “JRip” for a presentation
Can you help me to find good resources ?

Reply

洛仑兹规范6_百度文库

wenku.baidu.com/.../5133dcfbaef8941ea76e0532.htm... Translate this page

Apr 15, 2011 - 本文通过麦克斯韦方程组引入电磁场规范,指出库伦规范和洛仑兹规范只是众多电磁场规范中的两种较特殊的规范,最后推导出在静态场中库伦规范和 ...

库仑规范_图文_百度文库

wenku.baidu.com/.../cd58693631126edb6f1a1051.ht... Translate this page

Nov 10, 2011 - 这是讲的比较清楚的一个,且还包含其他规范,推荐 .... 项对应库仑场E ，? ?t 对应着感应库 r 场E 。感b) 洛仑兹规范(Lorentz gauge) 洛仑兹

电磁辐射中，需要根据激发源来决定电磁场的性质，而麦克斯韦方程组用电场强度 E r 和电磁感应强度 B 对电磁场进行描述，与激发源没有直接的联系，难以直接的描述电磁场，因此，为了能够由激发源直接描述电磁场，引入了势函数的概念。

洛仑兹规范6_百度文库

库仑规范_图文_百度文库

暂无评价|0人阅读|0次下载|举报文档

洛仑兹规范6_自然科学_专业资料。本文通过麦克斯韦方程组引入电磁场规范 ,指出库伦规范和洛仑兹规范只是众多电磁场规范中的两种较特殊的规范,最后推导出在静态场中库伦规范和洛仑兹规范具有相同的非齐次方程,说明了麦克斯韦方程组、洛仑兹规范与库伦规范都只是从不同的角度描述电磁场的运动规律,因此无论用何种方式描述电磁场,电磁场本身都没有改变。

论洛仑兹规范与库伦规范在静态场中的自恰性洛仑兹规范与库伦规范在静态场中的在静态场中孙锴（西安建筑科技大学机电工程学院，陕西西安 710055）摘要：摘要：本文通过麦克斯韦方程组引入电磁场规范 A, φ ，指出库伦规范和洛仑兹规范只是众多电磁场规范 A, φ 中的两种较特殊的规范，最后推导出在静态场中库伦规范和洛仑兹规范具有相同的非齐次方程，说明了麦克斯韦方程组、洛仑兹规范与库伦规范都只是从不同的角度描述电磁场的运动规律，因此无论用何种方式描述电磁场，电磁场本身都没有改变。关键词关键词：洛仑兹规范；库伦规范；自恰性；矢量势；标量势中图分类号：中图分类号：O441.4； 0．引言． ( ) r ( ) r r 在电磁辐射中，需要根据激发源来决定电磁场的性质，而麦克斯韦方程组用电场强度 E r 和电磁感应强度 B 对电磁场进行描述，与激发源没有直接的联系，难以直接的描述电磁场，因此，为了能够由激发源直接描述电磁场，引入了势函数的概念。 1．电磁场的规范 A, φ 的引入．辐射电磁场中为了便于根据电荷电流计算场，常常使用标量势 φ 和矢量势 A 而非电场 ( ) r r r r 强度 E 和电磁感应强度 B 来描述电磁场。真空中电磁场的麦克斯韦方程组的微分形式为： r ρ ??E = ε0 （1） r r ?B ?× E = ? ?t （2） r ??B = 0 r r r J ?E c 2? × B = + ε 0 ?t r ? ? (? × A) ≡ 0 （3）（4）由矢量分析知旋度的散度为零，即：（5） 1 将（5）式代入（3）式，得 r r B = ?× A r （6） r 引入的矢势 A 只有横场部分具有确定的意义，而其纵场部分可以任意取。假定矢势 A 是一个关于空间和时间的连续函数， ? 和 ? 可以交换微分次序。将（6）式代入（2）式，得 ?t r r r ? ?A ? × E = ? (? × A) = ?? × ?t ?t 整理，得 r ? r ?A ? ?=0 ?×?E + ? ?t ? ? ? ? × ?φ ≡ 0 （7）由矢量分析知梯度的旋度度衡为零，即（8）比较（7）（8）式，引入标量势 φ ，得、 r r ?A E=? ? ?φ ?t （9）从（9）式可以看出，电场强度 E 不仅与标量势 φ 有关，还与矢量势 A 有关。矢量势 A 和标量势 φ 作为一组势函数，可以完备的描述一个辐射场，并且称 A, φ 为电磁场的规范。 2. 用电磁场的规范 A, φ 描述电磁场将（9）式代入（1）式，得 r r r ( ) r ( ) r r ? ?A ? ρ ? ? ??? ? ?t ? ?φ ? = ε 0 ? ? 整理得 ? 2φ + r ? ρ ?? A = ? ε0 ?t （10）将（9）式和（6）式同时代入（4）式，得 r r r J ? ?A c ?× ?× A = + (? ? ?φ ) ε 0 ?t ?t 2 ( ) 整理，得 r r 1 ?2 A r? r ? 1 ?φ ? A ? 2 2 = ? ? 0 J + ?? 2 + ? ? A? c ?t ? c ?t ? 2 （12） 2 从上面的推导我们可以看出，麦克斯韦方程组中的四个方程分别独立推导出了四个标量势 φ 和矢量势 A 的方程，他们分别是： r r r B = ?× A r r ?A E=? ? ?φ ?t ? 2φ + r ? ρ ?? A = ? ?t ε0 （6）（9）（10） r r 1 ?2 A r? r ? 1 ?φ ? A ? 2 2 = ? ? 0 J + ?? 2 + ? ? A? c ?t ? c ?t ? 2 （12）势函数方程与麦克斯韦方程组的对应关系见表 1-1。表 1-1. A, φ 规范的势函数方程的麦克斯韦方程组来源麦克斯韦方程组 ( ) r (A,φ )规范的是函数方程 r r ρ ??E = ε0 ? 2φ + r ? ρ ?? A = ? ?t ε0 r r ?B ?× E = ? ?t r r ?A E=? ? ?φ ?t r ??B = 0 r r r J ?E c ?× B = + ε 0 ?t 2 2 r r B = ?× A r r 1 ?2 A r? r ? 1 ?φ ? A ? 2 2 = ? ? 0 J + ?? 2 + ? ? A? c ?t ? c ?t ? r r ?A 在静态场中，有 = 0 ， E = ??φ ，因此 ?φ 是静电场的梯度，标量势 φ 是静电场的 ?t 电位函数。 3．静电场中洛仑兹规范过渡为库伦规范．静电场中洛仑兹规范过渡为库伦规范洛仑兹规范过渡由亥姆霍兹定理知，在无限空间中处处单值，且导数连续有界而源分布在有限区域中的矢量场 F 由其散度和旋度唯一确定。已知 B = ? × A ，若 ? ? A 也确定的话，矢量势 A 就可以唯一确定。此时，规范 A, φ 可以唯一确定电磁场。 r r r r r ( ) r 3 由库伦规范： ? ? A = 0 ，得 r ? 2φ = ? ρ ε0 （13） r r 1 ?2 A r 1 ?φ ? 2 A ? 2 2 = ?? 0 J + ? 2 c ?t c ?t （14） r r ?A 在静态场中有 = 0 ，比较（9）式可知 E = ??φ 。因此 ?φ 是电场的梯度，标量势 φ ?t 是静电场的电位函数。（13）式正是静电场的泊松方程，说明标量势 φ 是静电场的电位函数，因此有 ?φ =0 ?t （14）所以，在静态场中洛仑兹规范 ? ? A = ? 渡为库伦规范。 4．结束语． r 1 ?φ = 0 。可见在静电场下，洛仑兹规范过 c 2 ?t 由以上推导可以看出麦克斯韦方程组，洛仑兹规范和库伦规范只是不同的角度描述同一 r r r 个电磁场。无论是选用麦克斯韦方程组的电场强度 E 和电磁感应强度 B ，还是选用洛仑兹规范和库伦规范的 A, φ ，只要它们描述的是同一个电磁场，那么电场强度 E 、电磁感应强度 B 、标量势 φ 和矢量势 A 就可以相互推导，其表现表现形式将是一致的。也就是说，在同一个电磁场中标量势 φ 和矢量势 A 具有自恰性，表现的是同一个电磁场的不同表现形式。 ( ) r r r r 参考资料 [1] 郭硕鸿.电动力学[M].北京:人民教育出版社，1979. [2] 虞国寅，周国全.电动力学[M].武昌:武汉大学出版社，2008. Self-Consistency of Coulomb’s gauge and Lorentz’s gauge in the electromagnetostatic field Kai Sun (College of Mechanical and Electrical Engineering,Xi’an university of Architecture & Technology, Xi’an, Shannxi 710055, China) 4 Abstract: This paper proves that both Coulomb’s gauge and Loentz’s gauge are one of special forms of electromagnetic gauge (A,φ ) which discribes the same characteristics of motion of r Coulomb’s gauge and Loentz’s gauge electromagnetic field from different perspectives. Hence it derives the same form of nonhomogeneous equations from r electromagnetostatic field which proves that the selection of electromagnetic gauge A, φ has no effect to the electromagnetic field. Key words： Coulomb’s gauge； Loentz’s gauge； Self-Consistency； vector potential; scalar potential ：作者简介：孙锴，女，（1977-）西安建筑科技大学机电工程学院教师，主讲课程：电磁场与电磁波 ( ) in the 5

第五章电磁波的辐射 Electromagnetic Wave Radiation 本章所研究的问题是电磁波的辐射。本章所研究的问题是电磁波的辐射。方法和稳恒场情况一样，当考虑由电荷、法和稳恒场情况一样，当考虑由电荷、电流分布激发电磁场的问题时，流分布激发电磁场的问题时，引入势的概念来描述电磁场比较方便。念来描述电磁场比较方便。本章首先把势的概念推广到一般变化电磁场情况，然后通过势来解辐射问题。磁场情况，然后通过势来解辐射问题。本章主要内容电磁场的矢势和标势推迟势电偶极辐射电磁波的干涉和衍射电磁场的动量 §5. 1 电磁场的矢势和标势 Vector and Scalar Potential of Electromagnetic r 1、用势 A, ?描述电磁场为简单起见，讨论真空中的电磁场：为简单起见，讨论真空中的电磁场： r ??? D= ρ r ? r ??×E = ? ?B ? ? ?t ? r ??? B = 0 r ? r r ?D ??×H = j + ? ?t ? r r r r D=ε0E, B = ?0H . 针对磁场引入 r ?? B = 0 r r B =?× A r 的物理意义可由下式看出： A的物理意义可由下式看出： r S 即在任一时刻，沿任一闭合回路L的线积即在任一时刻，矢量 A沿任一闭合回路的线积分等于该时刻通过以L为边线的曲面的磁通量。为边线的曲面S的磁通量分等于该时刻通过以为边线的曲面的磁通量。 ∫ L r v r v A? dl = ∫∫ B? ds r 不能像静电场那样直接引入电势。对于电场 E不能像静电场那样直接引入电势。由 Faraday电磁感应定律可得：电磁感应定律可得：电磁感应定律可得 r r r r ?B ? ?A ?×E = ? = ? (?× A) = ??× ?t ?t ?t r ? r ?A? ?×? E+ ? = 0 ? ?t ? ? ? r r ?A E + = ?? ? ?t 是标势不是静电势即 r r ?A E = ?? ? ? ?t r r ?B =?× A ? r ?r ?A ? ?E = ?? ? ?t ? 电磁场和势之间的关系如下 r ?A r r = 0时,且 E = ?? ? a) 当 A 与时间无关，与时间无关，即且 ?t 就直接归结为电势；这时 ?就直接归结为电势；注意：注意： ? ? 混为一谈。与电势 r (E = ?? ) 混为一谈。因为在非稳恒情况下，不再是保守力场，不存在势能的概念，况下， E 不再是保守力场，不存在势能的概念，这就是说现在的 ?，在数值上不等于把单位正电荷从空间一点移到无穷远处电场力所做的功。荷从空间一点移到无穷远处电场力所做的功。为了区别于静电场的电势，了区别于静电场的电势，把这里的 ? 称为标势（Scalar potential)。。 c) 在时变场中r 磁场和电场是相互作用着的在时变场中，，整体，整体，必须把矢势 A 和标势 ? 作为一个整体来描述电磁场。述电磁场。 r r ?A b) 绝对不要把 E = ?? ? 中的标势 ? ? ?t r ? 种等价的方式，种等价的方式，但由于 E 、B 和 A、之间是微分方程的关系，方程的关系，所以它们之间的关系不是一一对应 r 的，这是因为矢势 A 可以加上一个任意标量函数 r 的梯度，的梯度，结果不影响 B，而这个任意标量函数 r r r ?A ? 要发生影响，的梯度在 E = ?? ? 中对 E 要发生影响，但 r ?t r ?A ? 将 E = ?? ? 中的?与此融合也作相应的 ?t r 变换，保持不变。变换，则仍可使 E 保持不变。 2、规范变换和规范不变性 r r r ? r 虽然 E 和 B，以及A 和是描述电磁场的两 r r 述变换式：述变换式： r ψ为任意的标量函数，即ψ =ψ(x,t)，作下设为任意的标量函数， r r r ?A→A = A+? ′ ψ ? ? ?ψ ? ? →?′ =? ? ?t ? r ′ 很容易证明：于是我们得到了一组新的 A . ?′ ，很容易证明： r r r ′ ψ ψ ?× A = ?×(A+? ) = ?× A+?×(? ) r r = ?× A= B r ′ ?A ?ψ ? r ?? ′ ? ? = ?? ? ? ( ) ? (A+? ) ψ ?t ?t ?t r ? ?A ? = ?? + (? ) ? ? (? ) ? ψ ψ ?t ?t ?t r ?A r = ?? ? = E ? ?t r r 由此可见，描述同一电磁场。由此可见，(A′ . ?′) 和 (A. ?) 描述同一电磁场。 a) 库仑规范(Coulomb gauge) 库仑规范(Coulomb r r 库仑规范条件为 ?? A= 0，即规定 A 是一个 r 有旋无源场（横场）。）。这个规范的特点是有旋无源场（横场）。这个规范的特点是 E的纵 ?具有无旋性，场部分完全由? 描述（即 ?? 具有无旋性)，横描述（ r r ?A 描述（具有无源性）。）。由场部分由 A描述（即具有无源性）。由 ?t r r ?A ? E = ?? ? ?t r r ?A 可见，可见，??? 项对应库仑场 E ，? ?t 对应着感应库 r 场E 。感 b) 洛仑兹规范(Lorentz gauge) 洛仑兹规范(Lorentz 是一个有旋有源场（定 A是一个有旋有源场（即 A 包含横场和纵场两部分）部分），这个规范的特点是把势的基本方程化为特别简单的对称形式。特别简单的对称形式。 r 1 ?? 洛仑兹规范条件为 ?? A+ 2 = 0 ，即规 C ?t r r ?t 3、达朗贝尔(d’ Alembert)方程达朗贝尔(d’ Alembert)方程从Maxwell’s equations r ? ? D= ρ ? ? r ?r ?A ? ?E = ?? ? ?t ? 2 r r D=ε0E r 所满足的方程，得到：出发推导矢势 A 和标势 ?所满足的方程，得到： r r 1 ?? r ? 2r 1 ? A ) = ??0 j ?? A? 2 2 ??(?? A+ 2 ? c ?t c ?t ? r ? 2 ?? ? + ?? A= ? ρ ? ?t ε0 ? a) 采用库仑规范上述方程化为 r (?? A= 0) ρ ? 2 ?? ? = ?ε ? 0 r ? 2 r ??2 A? 1 ? A ? 1 ? (? ) = ?? r ? 0j 2 2 2 ? c ?t c ?t ? r 1 ?? b) 采用洛仑兹规范( ?? A+ 2 采用洛仑兹规范( = 0) c ?t 上述方程化为 ? 2 ρ 1 ?2? ?? ? ? 2 2 = ? c ?t ε0 ? r ? 2 r ? 2r 1 ? A ?? A? c2 ?t2 = ??0 j ? 这就是所谓达朗贝尔达朗贝尔（方程。这就是所谓达朗贝尔（ d’ Alembert ）方程。 4、举例讨论试求单色平面电磁波的势 Solution: Solution: 单色平面电磁波在没有电荷，单色平面电磁波在没有电荷，电流分布的自由空间中传播，因而势方程（达朗贝尔方程在 Lorentz规范条件下）变为波动方程：规范条件下）规范条件下变为波动方程： 2 ? 2 1 ?? ?? ? ? 2 2 = 0 ? c ?t r ? 2 r 1?A ??2 A? =0 2 2 ? c ?t ? 其解的形式为：其解的形式为： ? =?0e ? ? rr ? r r i(k?x?ωt) ?A= A e 0 ? r 1 ?? 由Lorentz规范条件 ?? A+ 规范条件 = 0，即得 2 c ?t r r rr i(k?x?ω ) t 1 ik ? A+ 2 (?iω?) = 0 c c2 r r ?= k?A ω 磁波，这是因为：磁波，这是因为： r 这表明，这表明，只要给定了 A ，就可以确定单色平面电 r r r r r r r B = ?× A= ik × A= ik ×(A + A ) 纵横 r r r r = ik × A +ik × A 纵横 r r 对于单色平面波而言） 0（对于单色平面波而言） = ik × A r 横 r r r ?A E = ?? ? ? = ?ik? + iω A ?t r c2 r r r = ik( k ? A + iω ) A = ?i c [ ω 2 ω r r r r 2 k(k ? A ? k A ) ] c2 r r r = ?i k ×(k × A ) ω c r r = ? k ×B ω r r ? = ?cn×B 2 r r r 具有横向分量，如果取 A = A ，即只取 A具有横向分量，那么横有 r r r r k ? A= k ? A = 0 横 c2 r r ? = k ? A= 0 从而得到：从而得到：因此有：因此有： ω r r r r r r ?B = ?× A= ik × A= ik × A 横 ? r r ?r r r ?A ?A = ? = iω = iω 横 ? A A ?E = ?? ? ?t ?t ? r r 其中：其中： (k ? A = 0) 如果采用库仑规范条件，势方程在自由空间中变如果采用库仑规范条件，为 ??2? = 0 ? r ? 2 r 1 ?2 A 1 ? ? ?? A? 2 2 ? 2 ? = 0 c ?t c ?t ? 当全空间没有电荷分布时，当全空间没有电荷分布时，库仑场的标势 ? = 0 ，则只有 r r 1?A 2 ? A? 2 2 = 0 c ?t 2 其解的形式为 rr r r i(k?x?ωt) A= A e 0 由库仑规范条件得到 r r r 即保证了 A 只有横向分量，即 A= A ，从而得到只有横向分量，横 r r r ?? A=ik ? A= 0 r r r r r r ?B = ?× A=ik × A= ik × A横 ? r r ?r r r ?A ?A ?E = ??? ? = ? = iωA= iωA横 ?t ?t ? r (?? A= 0) 通过例子可看到：通过例子可看到：库仑规范的优点是：库仑规范的优点是：它的标势 ? 描述库仑作 r 求出，用，可直接由电荷分布 ρ 求出，它的矢势 A只有横向分量，横向分量，恰好足够描述辐射电磁波的两种独立偏振。偏振。 r 洛仑兹规范的优点是：洛仑兹规范的优点是：它的标势 ? 和矢势 A r 构成的势方程具有对称性。构成的势方程具有对称性。它的矢势 A的纵向部的选择还可以有任意性，分和标势 ? 的选择还可以有任意性，即存在多余的自由度。尽管如此，的自由度。尽管如此，它在相对论中显示出协变因此，本书以后都采用洛仑兹规范。性。因此，本书以后都采用洛仑兹规范。 Class is Over! Thank you! Boys and girls!

Wednesday, May 4, 2016

Potential Theory in Classical Probability, Potential theory, harmonic functions, Markov processes, stochas- ... of classical analytic potential theory: Green kernels, Laplace and Poisson.

Potential Theory--ICPT 94: Proceedings of the International ...

https://books.google.com/books?isbn=3110146541

Josef Král - 1996 - ‎Mathematics

Proceedings of the International Conference on Potential Theory, Held in Kouty, Czech Republic, August 13-20, 1994 Josef Král ...

[PDF]Lectures on Potential Theory - Tata Institute of Fundamental Research

www.math.tifr.res.in/~publ/ln/tifr19.pdf

by M Brelot - ‎Cited by 231 - ‎Related articles

Lectures on Potential Theory. By. M. Brelot. Notes by. K. N. Gowrisankaran and. M. K. Venkatesha Murthy. Second edition, revised and enlarged with the help of ...

Missing: ~~sgd~~

A little history of Submarine Groundwater Discharge (SGD ...

https://planetgeogblog.wordpress.com/.../a-little-history-of-submarine-gr...

Jan 28, 2014 - Although people already used SGD as an important water source for ... these deviations and created the basis for the Potential Theory and the ...

[PDF]Potential Theory in Classical Probability

www.ntu.edu.sg/.../greifswald_potenti...

Nanyang Technological University

by N Privault - ‎Cited by 5 - ‎Related articles

Key words: Potential theory, harmonic functions, Markov processes, stochas- ... of classical analytic potential theory: Green kernels, Laplace and Poisson.

Missing: ~~sgd~~

phymath999

Wednesday, December 28, 2016

An introduction to basic statistics principle behind the A/B testing

2. Standard deviations and standard errors

Saturday, October 22, 2016

TravelSky Technology(China Airline): Predicting Airfare Price by SVM

Friday, May 6, 2016

Rule learner (or Rule Induction)

Rule learner (or Rule Induction)

5 Responses to Rule learner (or Rule Induction)

洛仑兹规范6_百度文库

库仑规范_图文_百度文库

洛仑兹规范6_百度文库

wenku.baidu.com/.../5133dcfbaef8941ea76e0532.htm...

Translate this page
Apr 15, 2011 - 本文通过麦克斯韦方程组引入电磁场规范,指出库伦规范和洛仑兹规范只是众多电磁场规范中的两种较特殊的规范,最后推导出在静态场中库伦规范和 ...

库仑规范_图文_百度文库

wenku.baidu.com/.../cd58693631126edb6f1a1051.ht...

Translate this page
Nov 10, 2011 - 这是讲的比较清楚的一个,且还包含其他规范,推荐 .... 项对应库仑场E ，? ?t 对应着感应库 r 场E 。感b) 洛仑兹规范(Lorentz gauge) 洛仑兹

洛仑兹规范6_自然科学_专业资料

库仑规范_理学_高等教育_教育专区

Wednesday, May 4, 2016

Potential Theory in Classical Probability, Potential theory, harmonic functions, Markov processes, stochas- ... of classical analytic potential theory: Green kernels, Laplace and Poisson.

Potential Theory--ICPT 94: Proceedings of the International ...

[PDF]Lectures on Potential Theory - Tata Institute of Fundamental Research

A little history of Submarine Groundwater Discharge (SGD ...

[PDF]Potential Theory in Classical Probability

Wednesday, December 28, 2016

2. Standard deviations and standard errors

Saturday, October 22, 2016

Friday, May 6, 2016

Rule learner (or Rule Induction)

5 Responses to Rule learner (or Rule Induction)

wenku.baidu.com/.../5133dcfbaef8941ea76e0532.htm... Translate this page Apr 15, 2011 - 本文通过麦克斯韦方程组引入电磁场规范,指出库伦规范和洛仑兹规范只是众多电磁场规范中的两种较特殊的规范,最后推导出在静态场中库伦规范和 ...

wenku.baidu.com/.../cd58693631126edb6f1a1051.ht... Translate this page Nov 10, 2011 - 这是讲的比较清楚的一个,且还包含其他规范,推荐 .... 项对应库仑场E ，? ?t 对应着感应库 r 场E 。 感b) 洛仑兹规范(Lorentz gauge) 洛仑兹

洛仑兹规范6_自然科学_专业资料

库仑规范_理学_高等教育_教育专区

Wednesday, May 4, 2016

[PDF]Lectures on Potential Theory - Tata Institute of Fundamental Research

[PDF]Potential Theory in Classical Probability

wenku.baidu.com/.../5133dcfbaef8941ea76e0532.htm...

Translate this page
Apr 15, 2011 - 本文通过麦克斯韦方程组引入电磁场规范,指出库伦规范和洛仑兹规范只是众多电磁场规范中的两种较特殊的规范,最后推导出在静态场中库伦规范和 ...

wenku.baidu.com/.../cd58693631126edb6f1a1051.ht...

Translate this page
Nov 10, 2011 - 这是讲的比较清楚的一个,且还包含其他规范,推荐 .... 项对应库仑场E ，? ?t 对应着感应库 r 场E 。感b) 洛仑兹规范(Lorentz gauge) 洛仑兹