Monday, April 25, 2016

Logistic regression (or logit regression): the dependent variable (DV) is categorical. Analytic hierarchy process and conjoint analysis. When explaining each sample, you must state whether each step of the sampling is random or nonrandom.

http://www.theanalysisfactor.com/confusing-statistical-term-4-hierarchical-regression-vs-hierarchical-model/

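Logistic regression (also logit regression), also known as the logit model (in Chinese also rendered 评定模型 or 分类评定模型), is one of the discrete choice models. It belongs to multivariate analysis and is a common method of empirical statistical analysis in sociology, biostatistics, clinical research, quantitative psychology, econometrics, and marketing.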

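Logistic distribution formula

[Figure: graph of the logistic distribution function]

P(Y=1 | X=x) = \frac{e^{x'\beta}}{1+e^{x'\beta}}.
where the parameter \beta is usually estimated by maximum likelihood.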
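
The formula above can be evaluated directly. The following is a minimal sketch, not a fitted model: the function name `logistic_probability` and the coefficient values are made up for illustration, and the first element of `x` is assumed to be 1.0 so that the first coefficient acts as an intercept.

```python
import math

def logistic_probability(x, beta):
    """Return P(Y=1 | X=x) = exp(x'beta) / (1 + exp(x'beta))."""
    z = sum(xi * bi for xi, bi in zip(x, beta))
    # Numerically stable form: avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coefficients (intercept, slope) and one observation.
beta = [-1.0, 0.5]
x = [1.0, 2.0]          # the leading 1.0 multiplies the intercept
p = logistic_probability(x, beta)
print(p)
```

Here x'β = -1.0 + 0.5 * 2.0 = 0, so the probability sits at the midpoint of the curve. In practice β would come from a maximum likelihood fit rather than being chosen by hand.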

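IIA assumption

The "independence of irrelevant alternatives" assumption, also called the "IIA effect", states that the alternatives in a logit model are independent of and unrelated to one another.

IIA assumption example

Three products A, B, and C compete in a market with shares of 60%, 30%, and 10%, a ratio of 6:3:1.
A new product D enters the market and is able to take a 20% share.
If the IIA assumption holds, the products act independently: D takes 20% of the market, and the remaining 80% is divided among A, B, and C in the ratio 6:3:1, giving them 48%, 24%, and 8%.
If the IIA assumption does not hold, for example because D is nearly identical to B and therefore strongly correlated with it, D takes its 20% of the total market from B alone: B keeps the remaining 10%, while A and C keep their shares of 60% and 10%.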
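
The two outcomes in the example can be reproduced with a few lines of arithmetic. This is only an illustrative sketch; the function names are hypothetical, and the second function assumes the extreme case in which the entrant draws its entire share from one close substitute.

```python
def shares_after_entry_iia(shares, new_name, new_share):
    """IIA case: existing shares shrink proportionally to make room for the entrant."""
    scale = 1.0 - new_share
    result = {name: s * scale for name, s in shares.items()}
    result[new_name] = new_share
    return result

def shares_after_entry_substitute(shares, new_name, new_share, rival):
    """Non-IIA case: the entrant takes its whole share from one close substitute."""
    if new_share > shares[rival]:
        raise ValueError("entrant share exceeds the rival's share")
    result = dict(shares)
    result[rival] = shares[rival] - new_share
    result[new_name] = new_share
    return result

market = {"A": 0.60, "B": 0.30, "C": 0.10}
iia = shares_after_entry_iia(market, "D", 0.20)
non_iia = shares_after_entry_substitute(market, "D", 0.20, rival="B")
print(iia)
print(non_iia)
```

Under IIA the ratio A:B:C stays 6:3:1 after D enters; in the substitute case the ratio changes because only B loses share.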

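Advantages when the IIA assumption holds

  • Consistent parameter estimates can be obtained for each individual's choice set
  • Generalized estimates can be obtained for subsets of the alternatives
  • Computation time is greatly reduced
  • This is especially true when the number of alternatives is large

Testing the IIA assumption

Hausman test

Proposed by Hausman and McFadden.

Test against a generalized model

Remedies for the IIA problem

Multinomial probit model

Generalized extreme value model

Can model the correlation among the alternatives.

Nested logit model

"Nested" means that the alternatives are divided into groups. Alternatives in different groups are uncorrelated, alternatives within a group are correlated, and the degree of correlation is expressed by 1-λg (the larger 1-λg is, the stronger the correlation).

Paired combinatorial logit model
Generalized nested logit model

Mixed logit model

Binary logit model

  • Only two alternatives: V1n, V2n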

Variable type   Statistic                       Group comparison  Regression model
numerical       mean                            t-test/ANOVA      Linear regression
categorical     percentage                      Chi-square test   Logistic regression
person-time     KM estimates (survival curves)  Log-rank test     Cox regression


analytic hierarchy process and conjoint analysis

www.ispor.org/congresses/Spain1111/presentations/IP15_AllAuthors.pdf

Nov 8, 2011 - CONJOINT ANALYSIS: TWO APPROACHES TO INCLUDE .... The AHP structures a decision into a hierarchy of criteria, sub criteria and ...

Confusing Statistical Term #4: Hierarchical Regression vs ...

www.theanalysisfactor.com/confusing-statistical-term-4-hierarchical-regr...

Hierarchical regression is the practice of building successive linear ... Logistic Regression for Binary, Ordinal, and Multinomial Outcomes ...

The Hierarchical Logistic Regression Model for ... - JStor

www.jstor.org/stable/2288464

JSTOR
by GY Wong, 1985
hierarchical logistic regression model is proposed for studying data with group ... "Multilevel Comparative Analysis of the World Fertility Survey," and National.

[PDF]Hierarchical Models

https://www.cs.princeton.edu/.../hierarchical-models....

Princeton University
by DM Blei, 2011
Oct 17, 2011 - The causal questions in the descriptive analysis are difficult to ... Linear and logistic regression are examples of generalized linear models.


3.3 Regression Analysis

Regression analysis is widely used for prediction and for estimating the relationships among variables. Variables are divided into dependent and independent variables, and regression analyses are either single-variable or multivariate. If the data are analyzed using a single independent variable, the analysis is called one-variable regression; otherwise it is called multivariate regression analysis [10]. Multivariate regression analysis is further divided into standard and hierarchical regression analysis. We used standard multivariate regression analysis because our study had more than one independent variable. Our goal is to perform multivariate regression analysis on the relationships between the independent and dependent variables. The multivariate regression formula is:

y = \beta_0 + \sum_{j=1}^{n} \beta_j x_j + \varepsilon. (1)

where x_j, y, \beta_j, and \varepsilon are the independent values, the dependent values, the parameters, and the error term. The estimated y value is calculated from the values obtained from the regression results, and the error term (\varepsilon) is obtained as the absolute difference between the actual y values and the estimated y values. n is the number of independent values.
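
As a rough illustration of the regression formula in this excerpt, the sketch below estimates the parameters by ordinary least squares using the normal equations. Least squares is an assumption here (the excerpt does not name its estimation method), the function names are hypothetical, and the data are made up so that y = 1 + 2*x1 + 3*x2 holds exactly.

```python
def fit_linear_regression(X, y):
    """Least-squares estimate of (beta_0, beta_1, ..., beta_n) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]      # prepend 1.0 for the intercept beta_0
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(beta, x):
    """Estimated y for one observation x = (x_1, ..., x_n)."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))

# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
beta = fit_linear_regression(X, y)
# Absolute differences between actual and estimated y, as described in the text.
residuals = [abs(yi - predict(beta, xi)) for xi, yi in zip(X, y)]
print(beta, residuals)
```

With noise-free data the residuals are zero up to rounding; with real data they would measure how far each observation lies from the fitted plane.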


http://www.jds-online.com/files/JDS-647.pdf
2. The Multilevel Model

2.1 Multilevel analysis for multistage clustered data

In multilevel research, the structure of data in the population is hierarchical, and a sample from such a population can be viewed as a multistage sample. Because of cost, time and efficiency considerations, stratified multistage samples are the norm for sociological and demographic surveys. For such samples the clustering of the data is, in the phase of data analysis and data reporting, a nuisance which should be taken into consideration. However, these samples, while efficient for estimation of the descriptive population quantities, pose many challenges for model-based statistical inference.

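Fall 2006, Instructor: 陳正慧
Social Research Methods (社會學研究方法) 2117 & 6501
Homework #5: Sampling
Due date: December 6 (Division II) or December 7 (Division I), in class

Instructions: The purpose of this assignment is to test how well you understand sampling theory. In answering each question you must give an example; answers that merely define terms will not be accepted. You do not need to write at length; answer briefly and to the point.

Note: when you explain each sample with an example, you must state whether each step of the sampling is random or nonrandom.

Answer the following questions:

1. Give an example that clearly explains the meaning of the following terms and the relationships among them: a population, a target population, a sampling frame, and a sample.

Answer: Suppose we want to study students at Tunghai University and plan to draw 500 students for a survey. The concept "all Tunghai University students" is the population. If the researcher can obtain a list of Tunghai University students from the registrar's office and select the sample from it, that list is the target population. Note that because the student body is in fact constantly changing (for example, students withdraw, take leave, or are absent), the list obtained may not be fully accurate; that is, the target population differs somewhat from the population. This list is also the sampling frame. The 500 students drawn are the sample.

2. Using the same population or sampling frame, illustrate simple random sampling and systematic random sampling, and explain how the two are alike or different.

Answer: Take the undergraduate students of the Department of Sociology at Tunghai University as the population, and the registration list of first-year through fourth-year students as the sampling frame. Number the list, giving all students consecutive numbers from 1 to N; this step is nonrandom. Then, to draw a sample of 100 students:

Simple random sampling: use a random number table or computer-generated random numbers to draw 100 random three-digit numbers, and take the students on the list who match those 100 numbers as the sample. This step is random.

Systematic random sampling: randomly pick a number k as the starting point (random start; this step is random), then take 100 numbers at a regular interval of N/100.

If the registration list was generated in random order before sampling, systematic sampling and simple random sampling are essentially the same. In practice, however, systematic sampling is simpler to carry out. The one caution with systematic sampling is to check whether the ordering of the list contains some periodicity.

3. Give two related examples to illustrate how stratified sampling and cluster sampling are alike or different, and in each example explain why that sampling method is used.

Answer: Suppose we study the study habits of high school students:
1) Use the school-wide ranking on a mock exam to divide students into score strata, then draw the student sample at random by systematic sampling for interviews.
2) Build a list of all classes in the school, randomly draw several classes, then draw a random sample within the selected classes (both of these steps are random).

The former is stratified random sampling: because students with similar scores may have similar study habits, dividing by score cuts the students into highly homogeneous subsets, and this method can reduce sampling error. The latter is random cluster sampling: students in the same class may have similar study habits, and this method does not require building a record of student scores, so it is easier to carry out.

4. Illustrate multistage stratified cluster sampling and explain under what circumstances it is used.

Answer: Suppose a researcher wants to study university students in Taiwan to learn their views on the high-tuition policy. Because universities differ in character (for example, comprehensive research universities, universities strong in liberal arts, law, and business or in science and engineering, and universities or colleges of technology converted from vocational junior colleges), the researcher can first divide all universities into clusters, with the universities within each cluster highly homogeneous. The researcher can then stratify the universities in each cluster by variables such as enrollment, location, and average entrance exam (學測) score, and select the universities to study from the stratified cluster lists by systematic sampling (this step is random). Next, supposing the selected universities are unwilling to provide student lists, the researcher can use cluster sampling within them to select colleges and departments, and then draw the students to interview (these steps are all random).

The main reason for using multistage stratified cluster sampling is, on the one hand, to keep the practical convenience of cluster sampling (in this example no list of all university students in Taiwan is needed) and, on the other hand, to reduce possible sampling error by marking off highly homogeneous strata.

5. Give an example of a situation in which you would use nonrandom purposive sampling to select your research subjects, and explain why.

Answer: (Nonrandom purposive sampling is a nonrandom sampling method.) Suppose a researcher wants to understand the activities of gay students at Tunghai University, but there is no registered student club on campus and no public information or official statistics. The researcher can try several channels: searching online for message boards or forums about homosexuality at Tunghai University, contacting teachers and students in courses on gender issues, and interviewing gender-issue experts on campus (instructors, student counselors, and so on). If the researcher finds one gay student and builds a friendship, that person can introduce other gay students, and the researcher will gain a deeper understanding of the activities of gay students at Tunghai University.

Nonrandom purposive sampling selects the sample according to the purpose the researcher has in mind. The researcher cannot build a population list of "gay students at Tunghai University" and is not concerned with whether the cases studied represent the population. This sampling method suits research subjects who are specific or hard to find.

6. Many research teams say they want to sample Tunghai University students for a questionnaire survey. Suppose you plan to interview only 200 undergraduates enrolled at Tunghai University. Using any random sampling method, select a representative sample in a practically feasible way. (Note: in practice you cannot obtain a list of undergraduates.)

Answer: Suppose the researcher takes Tunghai University students as the subjects, surveys the types of after-class leisure activities, and plans to interview 200 students. Without a student list, multistage cluster sampling can be used to select a representative sample. Tunghai University has 6 colleges, so first randomly draw 3 colleges; because the colleges differ in enrollment, PPS (probability proportional to size) can be used to adjust each college's probability of selection so that every student has an equal chance of being drawn. Then randomly draw 3 departments from each selected college, 9 departments in all. From each department randomly draw 2 classes (covering the first through fourth years), 18 classes in all, and interview 12 students per class (216 students in total, slightly above the planned 200). Because the departments and classes all differ in size, PPS can be used at every stage to adjust the selection probabilities of the departments and classes. Finally, 12 students can be drawn at random from each class by systematic sampling according to class size; the department office can be asked to notify them, or the researcher can go to a required course for that class to contact the students.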
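
The difference between simple random sampling and systematic random sampling discussed in question 2 can be sketched in a few lines. This is an illustration only: the frame of 500 numbered students, the seed, and the function names are hypothetical, and the interval is taken as N/n rounded down.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every subset of size n is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(frame) // n                 # sampling interval (N / n, rounded down)
    start = rng.randrange(k)            # the only random step
    return [frame[start + i * k] for i in range(n)]

rng = random.Random(2006)               # fixed seed so the run is repeatable
frame = list(range(1, 501))             # hypothetical frame: students numbered 1..500
srs = simple_random_sample(frame, 100, rng)
sys_sample = systematic_sample(frame, 100, rng)
print(sorted(srs)[:10])
print(sys_sample[:10])
```

The systematic sample is fully determined once the start is chosen, which is why any periodicity in the ordering of the list can bias it.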


What are type I and type II errors?


When you do a hypothesis test, two types of errors are possible: type I and type II. The risks of these two errors are inversely related and determined by the level of significance and the power for the test. Therefore, you should determine which error has more severe consequences for your situation before you define their risks.
No hypothesis test is 100% certain. Because the test is based on probabilities, there is always a chance of drawing an incorrect conclusion.
Type I error
When the null hypothesis is true and you reject it, you make a type I error. The probability of making a type I error is α, which is the level of significance you set for your hypothesis test. An α of 0.05 indicates that you are willing to accept a 5% chance that you are wrong when you reject the null hypothesis. To lower this risk, you must use a lower value for α. However, using a lower value for alpha means that you will be less likely to detect a true difference if one really exists.
Type II error
When the null hypothesis is false and you fail to reject it, you make a type II error. The probability of making a type II error is β, which depends on the power of the test. You can decrease your risk of committing a type II error by ensuring your test has enough power. You can do this by ensuring your sample size is large enough to detect a practical difference when one truly exists.
The probability of rejecting the null hypothesis when it is false is equal to 1–β. This value is the power of the test.
                Null hypothesis
Decision        True                                                                False
Fail to reject  Correct decision (probability = 1 - α)                              Type II error: failing to reject the null when it is false (probability = β)
Reject          Type I error: rejecting the null when it is true (probability = α)  Correct decision (probability = 1 - β)
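
The probabilities in the table can be checked by simulation. The sketch below is a simplified setup rather than the procedure of any particular statistics package: it uses a one-sample two-sided z-test with known standard deviation, and the sample size, effect size, seed, and function names are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean

def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: mu = mu0 with known sigma; True means reject H0."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def rejection_rate(true_mu, mu0, sigma, n, alpha, trials, rng):
    """Fraction of simulated samples for which H0 is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += z_test_rejects(sample, mu0, sigma, alpha)
    return rejections / trials

rng = random.Random(0)
alpha = 0.05
# H0 true: the rejection rate estimates the type I error rate (close to alpha).
type1_rate = rejection_rate(0.0, 0.0, 1.0, 30, alpha, 4000, rng)
# H0 false (true mean 0.5): the rejection rate estimates the power, 1 - beta.
power = rejection_rate(0.5, 0.0, 1.0, 30, alpha, 4000, rng)
type2_rate = 1 - power
print(type1_rate, power, type2_rate)
```

Lowering `alpha` in this sketch reduces the type I rate but also lowers the power, while raising the sample size `n` increases the power without changing the type I rate.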

Example of type I and type II error

To understand the interrelationship between type I and type II error, and to determine which error has more severe consequences for your situation, consider the following example.
A medical researcher wants to compare the effectiveness of two medications. The null and alternative hypotheses are:
  • Null hypothesis (H0): μ1 = μ2
    The two medications are equally effective.
  • Alternative hypothesis (H1): μ1 ≠ μ2
    The two medications are not equally effective.
A type I error occurs if the researcher rejects the null hypothesis and concludes that the two medications are different when, in fact, they are not. If the medications have the same effectiveness, the researcher may not consider this error too severe because the patients still benefit from the same level of effectiveness regardless of which medicine they take. However, if a type II error occurs, the researcher fails to reject the null hypothesis when it should be rejected. That is, the researcher concludes that the medications are the same when, in fact, they are different. This error is potentially life-threatening if the less-effective medication is sold to the public instead of the more effective one.
As you conduct your hypothesis tests, consider the risks of making type I and type II errors. If the consequences of making one type of error are more severe or costly than making the other type of error, then choose a level of significance and a power for the test that will reflect the relative severity of those consequences.

Multilevel model - Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Multilevel_model

Wikipedia
Multilevel models are statistical models of parameters that vary at more than one level. ... The units of analysis are usually individuals (at a lower level) who are nested within ... refers to the overall regression coefficient, or the slope, between the .... Multilevel models are a subclass of hierarchical Bayesian models, which are ...

[PDF]Hierarchical Logistic Regression with SAS GLIMMIX

www.lexjansen.com/wuss/2006/Analytics/ANL-Dai.pdf

by J Dai
focused on hierarchical logistic regression modeling with GLIMMIX. ... studies often involve the analysis of data with complex patterns of variability, such as ...

Comparing hierarchical modeling with traditional logistic ...

www.ncbi.nlm.nih.gov/...

National Center for Biotechnology Information
by PC Austin, 2003
Comparing hierarchical modeling with traditional logistic regression analysis among patients hospitalized with acute myocardial infarction: should we be ...

[PDF]Fitting and understanding multilevel (hierarchical) models

www.stat.columbia.edu/~gelman/.../mlmtalk.pdf

Columbia University
by A Gelman, 2004
Dec 8, 2004 - State-level opinions from national polls. Poststratification. Validation. Multilevel modeling of opinions. Logistic regression: Pr(yi = 1) = logit.

[PDF]Multilevel Logistic Regression Analysis Applied to Binary ...

www.jds-online.com/files/JDS-647.pdf

by MHR Khan, 2011
often follow a hierarchical data structure as the surveys are based on mul- ... used to exemplify all aspects of working with multilevel logistic regression models ...

How to conduct a multilevel (hierarchical) binary logistic ...

stats.stackexchange.com/.../how-to-conduct-a-multilevel-hierarchical-bin...

Aug 22, 2013 - However, I am not familiar with the multilevel model for logistic regression. Please give me some names of necessary multilevel analyses for ...

Hierarchical Multiple Regression vs Ordinal (logistics ...

stats.stackexchange.com/.../hierarchical-multiple-regression-vs-ordinal-lo...

Nov 17, 2013 - However, ordinal logistic regression can also be hierarchical and multiple: Those terms refer to ... Multiple logistic regression power analysis.
