Monday, April 11, 2016


http://www.r-bloggers.com/agent-based-models-and-rnetlogo/

Agent Based Models and RNetLogo

July 24, 2014
(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Joseph Rickert
If I had to pick just one application to be the “killer app” for the digital computer I would probably choose Agent Based Modeling (ABM). Imagine creating a world populated with hundreds, or even thousands of agents, interacting with each other and with the environment according to their own simple rules. What kinds of patterns and behaviors would emerge if you just let the simulation run? Could you guess a set of rules that would mimic some part of the real world? This dream is probably much older than the digital computer, but according to Jan Thiele’s brief account of the history of ABMs that begins his recent paper, R Marries NetLogo: Introduction to the RNetLogo Package in the Journal of Statistical Software, academic work with ABMs didn’t really take off until the late 1990s.
Now, people are using ABMs for serious studies in economics, sociology, ecology, socio-psychology, anthropology, marketing and many other fields. No less of a complexity scientist than Doyne Farmer (of Dynamic Systems and Prediction Company fame) has argued in Nature for using ABMs to model the complexity of the US economy, and has published on using ABMs to drive investment models. In the following clip of a 2006 interview, Doyne talks about building ABMs to explain the role of subprime mortgages in the housing crisis. (Note that when asked how one would calibrate such a model, Doyne explains the need to collect massive amounts of data on individuals.)

Fortunately, the tools for building ABMs seem to be keeping pace with the ambition of the modelers. There are now dozens of platforms for building ABMs, and it is somewhat surprising that NetLogo, a tool with some whimsical terminology (e.g., agents are called turtles) that was designed for teaching children, has apparently become a de facto standard. NetLogo is Java-based, has an intuitive GUI, ships with dozens of useful sample models, is easy to program, and is available under the GPL 2 license.
As you might expect, R is a perfect complement for NetLogo. Doing serious simulation work requires a considerable amount of statistics for calibrating models, designing experiments, performing sensitivity analyses, reducing data, exploring the results of simulation runs and much more. The recent JASSS paper Facilitating Parameter Estimation and Sensitivity Analysis of Agent-Based Models: A Cookbook Using NetLogo and R by Thiele and his collaborators describes the R / NetLogo relationship in great detail and points to a decade's worth of reading. But the real fun is that Thiele's RNetLogo package lets you jump in and start analyzing NetLogo models in a matter of minutes.
Here is part of an extended example from Thiele's JSS paper that shows R interacting with the Fire model that ships with NetLogo. Using some very simple logic, Fire models the progress of a forest fire.
Snippet of NetLogo Code that drives the Fire model
to go
  if not any? turtles  ;; either fires or embers
    [ stop ]
  ask fires
    [ ask neighbors4 with [pcolor = green]
        [ ignite ]
      set breed embers ]
  fade-embers
  tick
end
 
;; creates the fire turtles
to ignite  ;; patch procedure
  sprout-fires 1
    [ set color red ]
  set pcolor black
  set burned-trees burned-trees + 1
end
The general idea is that turtles represent the frontier of the fire as it runs through a grid of randomly placed trees. Not shown in the above snippet is the logic showing that the entire model is controlled by a single parameter representing the density of the trees.

This next bit of R code shows how to launch the Fire model from R, set the density parameter, and run the model.
# Launch RNetLogo and control an initial run of the
# NetLogo Fire Model
library(RNetLogo)
nlDir <- "C:/Program Files (x86)/NetLogo 5.0.5"
setwd(nlDir)
 
nl.path <- getwd()
NLStart(nl.path)
 
model.path <- file.path("models", "Sample Models", "Earth Science","Fire.nlogo")
NLLoadModel(file.path(nl.path, model.path))
 
NLCommand("set density 70")    # set density value
NLCommand("setup")             # call the setup routine 
NLCommand("go")                # launch the model from R
Here we see the Fire model running in the NetLogo GUI after it was launched from RStudio.
[Screenshot: the Fire model running in the NetLogo GUI after being launched from R]
This next bit of code tracks the progression of the fire as a function of time (model "ticks"), returns results to R and plots them. The plot shows the non-linear behavior of the system.
# Investigate percentage of forest burned as simulation proceeds and plot
library(ggplot2)
NLCommand("set density 60")
NLCommand("setup")
burned <- NLDoReportWhile("any? turtles", "go",
                c("ticks", "(burned-trees / initial-trees) * 100"),
                as.data.frame = TRUE, df.col.names = c("tick", "percent.burned"))
# Plot with ggplot2
p <- ggplot(burned,aes(x=tick,y=percent.burned))
p + geom_line() + ggtitle("Non-linear forest fire progression with density = 60")
[Plot: non-linear forest fire progression with density = 60]

As with many dynamical systems, the Fire model displays a phase transition. Setting the density lower than 55 will not result in the complete destruction of the forest, while setting density above 75 will very likely result in complete destruction. The following plot shows this behavior.
[Plot: phase transition, percent of forest burned vs. density]
RNetLogo makes it very easy to programmatically run multiple simulations and capture the results for analysis in R. The following two lines of code run the Fire model twenty times for each value of density between 55 and 65, the region surrounding the phase transition.
d <- seq(55, 65, 1)                  # vector of densities to examine
res <- rep.sim(d, 20)                # Run the simulation
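Note that rep.sim is not part of RNetLogo; it is a helper defined in the downloadable code file linked below (and in Thiele's paper). A minimal sketch of what such a helper might look like, assuming the Fire model is already loaded as above:

# Sketch of a rep.sim-style helper: run the Fire model n times for
# each density value and collect the percent of trees burned.
rep.sim <- function(densities, n) {
  sim <- function(d) {
    replicate(n, {
      NLCommand("set density", d)             # set the density slider
      NLCommand("setup")                      # initialize the forest
      NLDoCommandWhile("any? turtles", "go")  # run until the fire dies out
      NLReport("(burned-trees / initial-trees) * 100")
    })
  }
  data.frame(density = rep(densities, each = n),
             percent.burned = unlist(lapply(densities, sim)))
}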
The plot below shows the variability of the percent of trees burned as a function of density in the transition region.
[Plot: boxplots of percent burned by density in the transition region]
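Assuming rep.sim returns its results in a long data frame like the sketch above (one row per run, with density and percent.burned columns), boxplots of this kind can be drawn with ggplot2:

# Boxplots of percent burned at each density in the transition region
p <- ggplot(res, aes(x = factor(density), y = percent.burned))
p + geom_boxplot() + xlab("density") + ylab("percent burned") +
    ggtitle("Variability of percent burned in the transition region")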
My code to generate the plots is available in the file: Download NelLogo_blog, while all of the code from Thiele's JSS paper is available from the journal website.
Finally, here are a few more interesting links related to ABMs.
Middle-East Journal of Scientific Research 9 (3): 431-436, 2011
ISSN 1990-9233
© IDOSI Publications, 2011
Estimation of Markov Chains Transition Probabilities
Using Conjoint Analysis (Expert Preference)
N. Akhondi, G. Yari, E. Pasha and R. Farnoosh
Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran
Abstract: This paper proposes a methodology to estimate transition probabilities on the basis of judgments by experts, which may be useful in situations of data absence. The Fractional Factorial Design (FFD) is used to cope with the curse of dimensionality. By means of the Conjoint Analysis (CA) approach we finally reconstruct the complete Markov Chain transition probabilities. The experimental results show that it is promising to use (CA) in estimating the entropy rate of Markov Chains with a finite state space.
Key words: Markov Chain transition probabilities · Conjoint Analysis · Design of experiments · Efficient design in Conjoint Analysis
INTRODUCTION
The present paper proposes a framework based on expert opinion elicitation, developed to estimate the transition probability matrix of an irreducible, discrete time, homogeneous Markov Chain with a finite state space. In this article we address the question of estimating the transition probability matrix of a Markov Chain in situations of data absence. In general, the full probability distribution for a given stochastic problem is unknown. When data are available, the most objective estimation is the maximum likelihood estimation of the transition probabilities ($P_{ij}$).

The difficulties grow when the aim is providing scenario analyses involving future states perhaps never performed before. In this situation we need information gathered from experts and we cannot resort to past data [1]. Our methodology introduces the new idea of estimating transition probabilities using conjoint (FFD) methods, which is useful in these conditions.
Conjoint analysis has as its roots the need to solve important academic and industry problems [2]. It is a popular marketing research technique. In order to respond to consumers' needs, makers have to research consumers' preferences for products and services and their selection criteria. Conjoint analysis measures the degree of importance which is given to particular aspects of a product or service [3]. The real genius is in making appropriate tradeoffs so that real consumers in real market research settings are answering questions from which useful information can be inferred. In the thirty years since the original conjoint analysis articles, researchers in marketing and other disciplines have explored these tradeoffs [2]. In conjoint experiments, each respondent receives a set of profiles to rate (or rank). Designing these experiments involves determining how many and which profiles each respondent has to rate (or rank) and how many respondents are needed [4]. Experimental design is a fundamental component of (CA).
The complexity of experimental design arises from the exponential growth in the number of attributes, i.e. the curse of dimensionality. Use of a full factorial design (all profiles) will place an excessive burden on respondents for providing evaluations. Therefore, researchers utilize balanced fractional factorial designs (FFD), i.e. orthogonal designs, or a subset of all profiles [5]. The basic conjoint problem is to estimate the partworths that best explain the overall preference judgments made by respondents [2]. (CA) is a technique based on a main-effects analysis-of-variance model that decomposes the judgment data into components, based on qualitative attributes of the products or services [6]. The most commonly used methods to acquire partworths are the Linear Programming Technique for Multidimensional Analysis of Preference (LINMAP), Hierarchical Bayes (HB) methods, Multivariate Analysis of Variance (MANOVA) and Ordinary Least Squares (OLS) Regression [7].
Corresponding Author: N. Akhondi, Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran. E-mail: akhondinasrin@gmail.com.
Methods have been developed to take conjoint data to approach optimal or near-optimal product and system designs in tourism, entertainment, health maintenance, gambling, etc. We introduce a new application of (CA) in this article.

Leone and Fucili [1] used (CA) to estimate Markov Chain transition probabilities. They built Fractional Factorial Designs (FFD) on the starting states, and in their method the experts are asked to identify the presumable destination states and quantify the probability of occurrence of the transitions towards each proposed destination scenario (for each state included in a (FFD)). In their method psychological criticisms may be raised because the respondents are not asked according to the usual procedure of rating or ranking the (FFD) treatments by comparing each one of them. The difficulties may grow if the number of attributes grows. We overcome these difficulties in this paper by building 2 (FFDs) on the starting and destination states, and we ask experts to give ratings on the likelihood of various states occurring in the future. We need two sets of states to get transition probabilities. The Fractional Factorial Design (FFD) can tackle the large number of states in an elegant way. We used the (CA) approach and Logistic Regression to construct the complete Markov Chain transition probabilities (under the assumption of independence of the attributes at an individual level for the respondents). The conjoint methods used for ratings data are now essentially dummy-variable regression methods.
The remainder of this paper is organized as follows: First, we provide a review of (CA) and references to related substantial theoretical and empirical work; then we discuss our methodology and simulation data. Finally we conclude with the most important strengths and weaknesses of the proposed methodology. We used Minitab (ver. 15) for generating ($P_{ij}$), SPSS (ver. 16) for generating the 2 (FFDs) and Multivariate Analysis of Variance (MANOVA) for estimating the parameters.
Conjoint Analysis: The essence of conjoint analysis is to identify and measure a mapping from more detailed descriptors of a product or service onto an overall measure of the customer's evaluation of that product. Full-profile analysis remains the most common form of conjoint analysis and has the advantage that the respondent evaluates each profile holistically and in the context of all other profiles. Its weakness is that the respondent's burden grows dramatically with the number of profiles that must be ranked or rated. The respondent can be asked to rank order all stimuli (profiles) or to provide a metric rating of each stimulus [2]. When appropriate, efficient experimental designs (FFD) are used so that the respondent need consider only a small fraction of all possible product profiles [8]. If the number of attributes is large, respondents can often evaluate partial profiles (PP) in which some of the features are explicit and the other features are assumed constant [2]. (FFD) orthogonal arrays are categorized by their resolution. For example, resolution III designs enable the estimation of all main effects free of each other, but some of them are confounded with two-factor interactions. Higher resolution designs require a larger number of profiles. Resolution III designs are most frequently used in marketing conjoint studies. Orthogonal arrays can be either balanced or unbalanced in terms of levels of attributes. An unbalanced design gives larger standard errors of the parameter estimates for those attributes that are less frequently administered. The minimum standard error is attained when a full factorial design is used [5]. Various measures for discussing the efficiency of an experimental design can be described as follows for the linear model (Kuhfeld, Tobias and Garratt 1994),

$$Y = X\beta + e \qquad (1)$$

where $\beta$ is a $p \times 1$ vector of parameters, $X$ is an $n \times p$ design matrix, and $e$ is random error. With the usual assumptions on errors, the least squares estimate of $\beta$ is given by $(X'X)^{-1}X'Y$. The variance-covariance matrix of the partworth (parameter) estimates of the attributes is proportional to $(X'X)^{-1}$. The efficiency of a design is based on the information matrix. An efficient design will have a smaller variance matrix. Two famous efficiency measures (both based on the eigenvalues of $(X'X)^{-1}$) are:
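The measures themselves did not survive in the source text. For reference, the standard definitions from Kuhfeld, Tobias and Garratt (1994), which this passage appears to follow, are:

$$A\text{-efficiency} = 100 \cdot \frac{1}{n \, \operatorname{trace}\!\left((X'X)^{-1}\right)/p}$$

$$D\text{-efficiency} = 100 \cdot \frac{1}{n \, \left|(X'X)^{-1}\right|^{1/p}}$$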
Orthogonal designs for linear models are generally considered to be efficient because their efficiency measure is close to 1 [5].
Jeng-Jong Lin [7] successfully presented an integrated product design model to be applied in clothing product design. His methodology focused not only on the expertise of designers or the demands of consumers but on both of them. He used a relationship matrix to combine the (CA) data from the two individual groups (i.e., designers and consumers) to design products. Van Houtven et al. [9] applied (CA) to estimate health-related benefit-risk tradeoffs in a non-expected-utility framework. They demonstrate how this method can be used to test for and estimate nonlinear weighting of adverse-event probabilities. Jeremy J. Michalek et al. [10] presented a unified methodology for product line optimization that coordinates positioning and design models to achieve realizable firm-level optima. This method is demonstrated for a line of dial-readout scales, using physical models and conjoint-based consumer choice data. Hiromi Yamada et al. [3] administered a study to estimate the structure of the variables that specify the quality requirements of a new product using conjoint analysis and the entropy model. As a result, it was understood that conjoint analysis and the entropy model are effective methods to estimate the quality requirements. Ikemoto and Yamaoka [11] proposed a method of analysis using (CA) that makes it possible to use a lower number of profile cards than that provided by the orthogonal design of experiments, even when a large number of items is being surveyed. An Internet survey of 1,600 consumers using this method indicated that it generated analytical results identical to those produced when the orthogonal design of experiments was used. Byungun Yoon and Yongtae Park [12] applied a new hybrid approach that enhances the performance of morphology analysis (MA) by combining it with conjoint analysis (CA) and citation analysis of patent information. Alternatives for new technology development from among the emerging technologies are presented by combining the valuable levels of each attribute in a morphology matrix predefined by domain experts. The technological competitiveness of a company can be analyzed by a newly suggested index, "technology share," which is analogous to the concept of market share in traditional CA.
Proposed Methodology: Expert opinion is one of the key research areas in Probabilistic Risk Analysis (PRA) in engineering, public health, environment, program management, regulatory policy, finance, etc. The use of expert judgment is critical, and often inevitable, when there are no empirical data or information available on the variables of interest [13]. We illustrate our motivation for resorting to experts in this section.

Assume a dynamic system with components: a set of states S, a set of actions A, a reward function $S \times A \to \mathbb{R}$ and a transition probability matrix, where the full probability distribution for this stochastic problem is unknown. We restrict attention to time-separable Markovian decision problems for dealing with the curse of dimensionality in dynamic systems [14]. An irreducible discrete time Markov chain with a finite state space has been studied previously by Papangelou [15], who establishes a Large Deviation Principle (LDP) for Markov chains whose order is unknown. Baris Tan and Kamil Yilmaz [16] presented a complete analytical framework for a testing procedure based on the statistical theory of Markov chains. They studied the time dependence and time homogeneity properties of Markov chains.
We assume an irreducible, discrete time, homogeneous Markov Chain with a finite state space in which the starting and the destination states of the system are defined by combinations of n key attributes, each with $L_k$ ($k = 1, \ldots, n$) levels (continuous variables are discretized). For example, if n = 3 the starting state i and the destination state j are given by $l_{i1}, l_{i2}, l_{i3}$ and $l_{j1}, l_{j2}, l_{j3}$, respectively. The total number of states is $I = \prod_{k=1}^{n} L_k$ and the number of transition probabilities $P_{ij} = P(l_{j1}, l_{j2}, l_{j3} \mid l_{i1}, l_{i2}, l_{i3})$ is equal to $I^2$. Under the assumption of independence of the attributes at an individual level for the respondents, $P_{ij}$ is given by

$$P_{ij} = \prod_{k=1}^{n} P(l_{jk} \mid l_{i1}, l_{i2}, \ldots, l_{in}) \qquad (2)$$
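To make eq. (2) concrete, here is a small illustrative R sketch; all numbers are made up and the two-attribute setting is purely for illustration, not from the paper:

# Hypothetical illustration of eq. (2) with n = 2 attributes.
# p1, p2: probabilities of each destination level for attributes 1 and 2,
# conditional on one fixed starting state (made-up numbers).
p1 <- c(.5, .3, .2)    # P(l_j1 = level 1, 2, 3 | starting state)
p2 <- c(.6, .4)        # P(l_j2 = level 1, 2 | starting state)
# Under attribute independence, the transition-matrix row for this
# starting state is the outer product of the two marginals:
P_row <- as.vector(outer(p1, p2))
sum(P_row)             # equals 1, as a probability row must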
If the transition probability matrix is unknown and data are available, we use $\hat{P}_{ij}$, the maximum likelihood estimator of the transition probabilities:

$$\hat{P}_{ij} = N_{ij} / N_i \qquad (3)$$

where $N_i$ is the number of times that the starting state $i$ has occurred when the process has been observed, and $N_{ij}$ is the number of times that the process has been observed to go from the starting state $i$ to the destination state $j$ in one step.
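A small illustrative R sketch of eq. (3); the state names and the stand-in sequence are hypothetical:

# Hypothetical illustration of eq. (3): estimate transition
# probabilities from an observed state sequence by counting.
set.seed(1)
x <- sample(c("s1", "s2", "s3"), 1000, replace = TRUE)  # stand-in sequence
N_ij  <- table(from = head(x, -1), to = tail(x, -1))    # one-step counts N_ij
P_hat <- N_ij / rowSums(N_ij)                           # N_ij / N_i, row-wise
round(P_hat, 2)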
When data are not available we need information
gathered from experts.
Our initial purpose of estimating $P_{ij}$ may be seen as the purpose of estimating $P(l_{jk} \mid l_{i1}, l_{i2}, \ldots, l_{in})$. We build 2 (FFD)s, on the starting states $I = \prod_{k=1}^{n} L_k$ and on the destination states. By using this solution we can start from $I$ instead of $I^2$. In our method the judges are asked to assign transition probabilities to the destination states in the (FFDs) for each state included in the (FFD) of starting states. The experts will be asked about a reduced number of starting states (the (FFD) ones) and, under the assumption of independence of the attributes at an individual level for the respondents, the results will be generalized to the others not included in the (FFD) by means of Logistic Regression (LR). The (LR) model examines the relationships between the independent variables and the log-odds of the outcome variable:

$$\text{odds} = \frac{p}{1-p} \qquad (4)$$

The model on the log-odds (logit) scale is linear:

$$\text{logit} = \log\left(\frac{p}{1-p}\right) = X\beta + e \qquad (5)$$

As mentioned above, we want to estimate $P(l_{jk} \mid l_{i1}, l_{i2}, \ldots, l_{in})$ by means of logistic regression. The estimated parameters are used to reconstruct the probability of arriving in $l_{jk}$ also for starting states not included in the (FFD). Finally, the probabilities of the destination states are reconstructed from these estimates.
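A minimal sketch of the logit-scale step (eq. 5) in R; the data frame, column names and dummy coding below are illustrative assumptions, not taken from the paper:

# Hypothetical sketch of eq. (5): regress logit-transformed expert
# ratings on dummy-coded attribute levels of the starting state.
d <- data.frame(
  L  = runif(27, .1, .9),                        # stand-in L_jk ratings
  a1 = factor(sample(1:3, 27, replace = TRUE)),  # attribute 1 level
  a2 = factor(sample(1:3, 27, replace = TRUE)),  # attribute 2 level
  a3 = factor(sample(1:3, 27, replace = TRUE))   # attribute 3 level
)
# R's factor coding drops one level per factor, matching the paper's
# "one of the levels for each factor ... is eliminated".
fit <- lm(log(L / (1 - L)) ~ a1 + a2 + a3, data = d)
predict(fit, d)   # fitted log-odds; unseen starting states predict the same way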
Application on Simulation Data: We propose an application on simulation data to illustrate our methodology. We suppose n = 4 (the number of attributes), each with 3 levels, and the number of experts is 8, in 2 groups (4 experts in each group). We used SPSS (ver. 16) for generating 2 resolution III designs (FFDs) for the starting and the destination states and Minitab (ver. 15) for generating $P_{ij}$. All experts received the same (FFDs). We simulated a transition probability matrix P for each of the respondents as described below:
For each of the starting states in the (FFD) we generated 4 independent bivariate normally distributed vectors $z_1, z_2, z_3, z_4$ from $N(\mu, \Sigma)$, $\mu = (0, 0)$, $\Sigma = \begin{bmatrix} 1 & -.9 \\ -.9 & 1 \end{bmatrix}$, where:

$$z_j = \left(\log\frac{L_{jk}}{1 - L_{jk}}\right), \quad j = 1, 2, 3, 4, \; k = 1, 2, \qquad (6)$$

$$L_{jk} = P(l_{jk} \mid l_{i1}, l_{i2}, l_{i3}, l_{i4}) \qquad (7)$$

We computed $L_{jk}$ based on $z_j$. Under the assumption of independence of the attributes at an individual level for the respondents we computed $P_{ij} = \prod_{k=1}^{n} P(l_{jk} \mid l_{i1}, l_{i2}, \ldots, l_{in})$, n = 4. We repeated this computation 500 times and finally the mean of them was recorded as $P_{ij} = P(l_{j1}, l_{j2}, l_{j3}, l_{j4} \mid l_{i1}, l_{i2}, l_{i3}, l_{i4})$. Then we updated $L_{jk}$ based on $P_{ij}$. In Table 1 we show part of these computations for $L_{12}$ for the first respondent. Finally we applied eq. (5) to $z_j$, j = 1, 2, 3, 4.
The Multivariate Analysis of Variance (MANOVA) is used for estimating the parameters [17]. One of the levels for each factor, regarded as dummy variables, is eliminated. The estimated parameters are used to reconstruct the probability of arriving in $l_{jk}$ also for starting states not included in the (FFD). The Generalized Least Squares (GLS) Regression estimates the model. The (LR) coefficients (significant at the .05 level) for the first 4 respondents are given in the following:
$$\log(L_{12} / (1 - L_{12})) = .465\, l_{i1} + .518\, l_{i2} + e \quad (.183)\ (.183) \qquad (9)$$

$$\log(L_{22} / (1 - L_{22})) = .497\, l_{i2} + e \quad (.161) \qquad (10)$$

$$\log(L_{11} / (1 - L_{11})) = -1.244 - .409\, l_{i1} + e \quad (.169) \qquad (11)$$

$$\log(L_{42} / (1 - L_{42})) = -.908 - .314\, l_{i1} + e \quad (.139) \qquad (12)$$

$$\log(L_{31} / (1 - L_{31})) = .273\, l_{i2} + e \quad (.132) \qquad (13)$$

where the estimated coefficient standard errors are in parentheses. Also:

$$\log(L_{21} / (1 - L_{21})) = -1.207 \qquad (14)$$

$$\log(L_{41} / (1 - L_{41})) = \ldots \qquad (15)$$

$$\log(L_{32} / (1 - L_{32})) = 4.208 \qquad (16)$$

The (LR) coefficients (significant at the .05 level) for the second 4 respondents are given in the following:
Middle-Basil Sci. Res, 9 (3): 431-436, 2011
Table 1: The FFDs (for starting and destination states) and disclosed transition probabilities ($P_{ij}$) for the first respondent

           Destination
Starting   3231  3312  2132  2321  2213  1333  1111  3123  1222   L12   Odds = L12/(1-L12)   Log(Odds)
2231        .20   .12   .11   .08   .11   .21   .06   .01   .10   .31         .44              -.82
3213        .06   .12   .08   .11   .19   .20   .15   .08   .02   .37         .60              -.51
3132        .11   .18   .03   .22   .18   .19   .04   .01   .05   .43         .75              -.28
1111        .15   .14   .10   .16   .04   .14   .10   .10   .07   .30         .42              -.86
1333        .05   .07   .18   .15   .15   .16   .09   .07   .09   .48         .91              -.10
3321        .11   .15   .12   .15   .09   .18   .06   .10   .03   .36         .57              -.56
2123        .06   .14   .14   .14   .15   .00   .10   .13   .14   .42         .74              -.30
2312        .08   .18   .14   .13   .03   .05   .17   .14   .07   .30         .43              -.84
1222        .13   .17   .16   .11   .16   .06   .09   .01   .10   .43         .76              -.27

Note: 2231 in the FFD for starting states means that i: $l_{12}, l_{22}, l_{33}, l_{41}$, and 3231 in the FFD for destination states means that j: $l_{13}, l_{22}, l_{33}, l_{41}$. $P_{ij} = 0.20$ (obtained from simulation); $L_{12} = P(2132 \mid 2231) + P(2321 \mid 2231) + P(2213 \mid 2231)$, etc.
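As a sanity check of how I read the table, the L12 entry can be recomputed from the first row in R (the small discrepancy against the printed .31 is presumably rounding in the source):

# First row of Table 1: destination-state probabilities from state 2231.
p <- c("3231" = .20, "3312" = .12, "2132" = .11, "2321" = .08,
       "2213" = .11, "1333" = .21, "1111" = .06, "3123" = .01,
       "1222" = .10)
# L12: total probability of destinations whose first attribute is level 2.
L12 <- sum(p[substr(names(p), 1, 1) == "2"])
c(L12 = L12, odds = L12 / (1 - L12), log_odds = log(L12 / (1 - L12)))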
