Methodology
Model validity testing starts when you start the modeling effort, and it is distributed through all phases.
This lecture is about model validity testing.
Conceptual and Philosophical Foundations
Most of these are from IE 550.
Major distinction between statistical models and system dynamics models: system dynamics models are theory-like and transparent. They claim to give a causal description of real processes.
Fundamental difference: a short-term forecasting model is valid if it provides accurate enough short-term forecasts.
Its validity is measured by forecast accuracy alone.
Fig 1.1
After observing real points you update your forecast model. Each time you forecast a few points ahead, then see the error and update the model. This is called ex-post prediction. Ex-post means after having observed new information.
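A minimal sketch of the ex-post loop described above, assuming simple exponential smoothing as the forecast model. The smoothing rule, the function name ex_post_forecasts, the parameter alpha and the demand numbers are illustrative assumptions, not from the lecture.

```python
def ex_post_forecasts(observations, alpha=0.5):
    """One-step-ahead forecasts, updated after each new observation.

    Assumption: simple exponential smoothing stands in for the
    statistical forecast model; any updatable model would do.
    """
    level = observations[0]            # initial forecast = first observed point
    forecasts, errors = [], []
    for actual in observations[1:]:
        forecasts.append(level)        # forecast made before seeing `actual`
        error = actual - level         # error seen after the new observation
        errors.append(error)
        level = level + alpha * error  # update the model with the new information
    return forecasts, errors


# Hypothetical demand figures, for illustration only.
demand = [100.0, 104.0, 101.0, 108.0, 110.0]
forecasts, errors = ex_post_forecasts(demand, alpha=0.5)
print(forecasts)
print(errors)
```

Note that nothing in this loop refers to causes of demand. The model is judged only by the size of the errors.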
These models don't claim any causality.
E.g. demand for automobile tires. It may depend on miles driven, bankruptcies, or some function of time.
In system dynamics there are different aspects of validity.
1. Behavior validity. Similar to statistical validity. But this is only one component.
2. Structural validity. This has greater importance. It is the causal justifiability of the model. Do the relations in the model reasonably approximate the real relations in the problem? A system dynamics model is valid if a) it has an acceptable structure, i.e. an acceptable representation of the real structure, and b) it can reproduce the dynamic behavior patterns of the real world.
Motto is:
The right behaviors for the right reasons.
This means, both behavior and reasons are important for validity.
System dynamics models are in the domain of science.
It is not only a statistical problem. You are trying to convince people of the structure. The model is a simplification of reality, but a good simplification. It is like a good cartoon. Take caricatures of Charlie Chaplin or Alfred Hitchcock: a few strokes, a cigar, a fat man. If you know Hitchcock, you see immediately that it is a good representation of him, although it has only 5 lines. I can draw him with 25 lines but it won't be as good. All great cartoonists have this ability. With only a few strokes they can represent the real person.
Models are like that. They are extreme simplifications of reality. This works in art, but in science we cannot proceed this way very easily. You cannot look at E=mc2 and see that yes, this is correct.
How do you convince people that this simplification is a good representation of the real world? This is a whole philosophical debate.
There is a discussion: can we positively prove that a scientific theory is a valid representation of reality? Some logical positivists argued that this should be possible. Relativists argue that there is no absolute truth. All models are temporarily acceptable. They are all conventions. We can never prove that a model is a valid representation of reality, even for a simple event. You try to establish confidence. Science is an act of building confidence in the validity of models or theories.
System dynamics: validity testing is establishing confidence in the credibility of the model. There is no true or wrong model. Sterman: "all models are wrong". That is true philosophically. The question is, which ones do you still use? Structural validity is a spectrum, not a yes or no. You have models that are great, fairly valid, so-so, bad....
By the way, statistical significance is also a problematic term in philosophy. One reason is the form of the hypothesis test. H0: model is reality. Can we assume this equality? The alternative hypothesis H1 is that they are not equal. H0 assumes a state of the world. H1 rejects it. If you reject H0 it is a strong result, a useful result. If you cannot reject H0, it is a weak result: you cannot say anything about the outcome, because you don't know why it came out that way. This is the foundation of statistical hypothesis testing.
In validity testing, H0 says: model statistically equals real world. If you reject H0, the result is strong but not the practical one. If you don't reject it, you don't have anything strong.
In policy analysis the test is strong. H0: model behavior is real behavior. H1: amplitude is
By rejecting H0 in policy analysis, you obtain a strong result.
Why is structural validity so important in system dynamics? Practical reason: these models are built to understand how problems are generated, and to come up with new policies that improve the behavior of the system. Without structural validity, how can you play with the structure to improve behavior? Structural validity is essential in system dynamics because the problem is not forecasting. The claim is: with the new policy, inventories will be improved. Fig 1.2
Point predictive ability is behavior validity. In system dynamics, point forecasting is nearly impossible and should not be expected from a system dynamics model. Point forecasting means a point-by-point measure of errors. Ex-ante term: system dynamics models provide ex-ante predictions. You don't do curve fitting.
In a statistical model, the equation of the curve is the model. In system dynamics the curve doesn't even exist in the model. It is an outcome of the model. There is a big difference. Everything is done at time 0. The laws are given and you let it go. Fig 1.3
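A minimal sketch of what "the laws are given and you let it go" can look like, assuming a one-stock inventory adjustment model integrated with Euler's method. The model, its parameter names and the numbers are illustrative assumptions, not the model in the figure.

```python
def simulate(initial_inventory, desired_inventory, adjustment_time, dt, steps):
    """Ex-ante simulation: only the initial condition and the flow law are given.

    Assumed law (illustrative): net flow = (desired - inventory) / adjustment_time.
    No data points are used and no curve is fitted.
    """
    inventory = initial_inventory
    trajectory = [inventory]
    for _ in range(steps):
        net_flow = (desired_inventory - inventory) / adjustment_time
        inventory += net_flow * dt   # Euler integration of the stock
        trajectory.append(inventory)
    return trajectory


# Everything is fixed at time 0; the curve is an outcome of the run.
trajectory = simulate(initial_inventory=50.0, desired_inventory=100.0,
                      adjustment_time=4.0, dt=0.25, steps=80)
print(trajectory[0], trajectory[-1])
```

The goal-seeking curve is nowhere written down as an equation of time. It emerges from the stock and flow relations.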
In comparison, system dynamics has very successful output validity. Real data may very well be far more different from the model output than from a statistical model. In fluctuating patterns, real data can be much further away. You have noise in real systems, and it is autocorrelated in real life. Fig 1.4: you can easily get huge errors although you represent the oscillating pattern.
System dynamics models provide pattern forecasts rather than point forecasts. The model says the following: with a given set of relations and a given set of initial conditions, I forecast that the behavior pattern will be damping oscillations, or collapse followed by growth oscillations. This is a forecast about the behavior pattern. This distinction is terribly important. You are predicting, but you will never compete with other models, even verbal models, in the power of point forecasting, because ex-ante models are not suitable for this. This is also true for chemical or other scientific models.
Ex-ante: very weak point predictors but strong for behavior prediction.
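A minimal sketch of the point versus pattern distinction above, assuming a noise-free sine wave for the model output and the same wave arriving a quarter period late for the "real" data. Both series are synthetic, and the zero-crossing period estimator is an assumed, illustrative choice.

```python
import math


def rmse(a, b):
    """Point-by-point error measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mean_period(series, dt=1.0):
    """Pattern measure: average time between upward zero crossings."""
    crossings = [i for i in range(1, len(series))
                 if series[i - 1] < 0 <= series[i]]
    gaps = [(b - a) * dt for a, b in zip(crossings, crossings[1:])]
    return sum(gaps) / len(gaps)


period = 20
times = range(200)
# The 0.5 offset keeps the samples away from the exact zeros of the sine.
model = [math.sin(2 * math.pi * (t + 0.5) / period) for t in times]
# Synthetic "real" data: same oscillation, a quarter period (5 steps) late.
real = [math.sin(2 * math.pi * (t + 0.5 - 5) / period) for t in times]

point_error = rmse(model, real)
print(point_error, mean_period(model), mean_period(real))
```

Here the point-by-point error is as large as the amplitude of the oscillation itself, while the period of the two series is identical. Judged by points the model looks poor. Judged by pattern it is right.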
Overall Nature and Selected Tests of Formal Model Validation
Again, you know this from IE 550.
Outline:
Validity testing:
· first structure validity (big)
· then behavior validity (smaller)
Structure validity
· as you build you make direct tests:
· structure-confirmation
· parameter-confirmation
· direct extreme condition
· dimensional consistency
· Indirect structure tests: (whole model) structure oriented. The whole system in its connections. Does the whole have coherence?
· You do simulation runs and ask: by observing the runs, can you say something about the validity of the model?
· In theory you cannot. In automata theory, a given output can be generated by an infinite number of structures. So by looking at output alone you cannot deduce structure.
· Thus we do special simulations: extreme condition, phase relationship test...
· Most important: extreme condition test. If you run the model under some specifically chosen extreme conditions, then prior to running the model you can logically deduce what the result will be. E.g. population model. Let us run with 0 women in the population. You know the population will decay to zero. You also know the pattern of behavior.
· Question: is the extreme condition test so important, if the model operates well under normal conditions?
· Response: if the model yields an unexpected behavior pattern under extreme conditions, several possibilities exist:
· The model discovered a weak point in your problem.
· The model does not cover certain ranges. You spent your effort in other areas of the model. You consciously know that. That is okay. But you have to face it.
· The model teaches you something. You have such a good model that when you run it, you obtain a pattern you were not expecting. You learn something that you didn't realize. That is the greatest benefit.
· How do you do these indirect tests?
· Sis software: indirect structure testing software. It tries to automate these dynamic pattern tests. You write down the expected outputs. Then the software makes all the runs and categorizes the outputs obtained. It recognizes the patterns. It is an attempt to automate the process.
· The problem with these indirect tests: you end up with 1000s of simulations and the horrendous task of visually checking each output.
· behavior sensitivity:
· Boundary adequacy: two versions: 1. You add a new structure to the model, e.g. for a questionable variable. You run the model with the new structure. If it doesn't add a new pattern, the new structure is not necessary. 2. You remove some structure that looks unnecessary.
· Phase relationship: important. You don't compare sales in the model to sales in real data. You compare e.g. the phase relationship between finished inventory and raw material, in the model and in reality. This will tell you about weaknesses in the delays involved.
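A minimal sketch of the extreme condition test from the list above, using the population example with 0 women. The two-stock model, its parameters (births_per_woman_per_year, average_lifetime) and the numbers are illustrative assumptions. The expected result (no births, decay toward zero) is deduced before the run and then checked against it.

```python
def simulate_population(women, men, years, dt=0.25,
                        births_per_woman_per_year=0.1,
                        average_lifetime=70.0):
    """Hypothetical two-stock population model, Euler integration.

    Assumptions (illustrative): births are proportional to the number
    of women, half of the births are female, and deaths are
    stock / average_lifetime.
    """
    totals = [women + men]
    for _ in range(int(years / dt)):
        births = births_per_woman_per_year * women
        women += (0.5 * births - women / average_lifetime) * dt
        men += (0.5 * births - men / average_lifetime) * dt
        totals.append(women + men)
    return totals


# Extreme condition: 0 women. Deduced beforehand: monotonic decay to zero.
totals = simulate_population(women=0.0, men=1000.0, years=500)
decays_monotonically = all(b < a for a, b in zip(totals, totals[1:]))
print(decays_monotonically, totals[-1])
```

If the run showed growth or oscillation here, the structure would be suspect, however good the fit under normal conditions.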
Behavior validity: more statistical. Remember what this is about: I know I have a valid structure. Can the model reproduce the dynamic behavior patterns of the real system? Are they close enough? The question is whether the patterns are close enough, not the points. So you first have to define the patterns. Patterns might be trends, oscillations, amplitude, period, slope of the damping envelope, curvature of the overshoot and decay, max-min points.
1. Define patterns
2. Measure them
3. Compare patterns of reality and model
· We have software for this: BTS.
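A minimal sketch of the three steps above (define, measure, compare), assuming period and amplitude as the chosen patterns and synthetic noise-free series. This is not the BTS software. The function names measure_patterns and relative_differences and all numbers are hypothetical.

```python
import math


def measure_patterns(series, dt=1.0):
    """Steps 1 and 2: patterns defined here as period and amplitude."""
    # A peak is a sample higher than its left neighbor and not lower
    # than its right neighbor.
    peaks = [i for i in range(1, len(series) - 1)
             if series[i - 1] < series[i] >= series[i + 1]]
    period = (peaks[-1] - peaks[0]) * dt / (len(peaks) - 1)
    amplitude = (max(series) - min(series)) / 2
    return {"period": period, "amplitude": amplitude}


def relative_differences(model_patterns, real_patterns):
    """Step 3: compare each pattern measure of the model with reality."""
    return {name: abs(model_patterns[name] - real_patterns[name])
            / abs(real_patterns[name])
            for name in real_patterns}


times = range(200)
# Synthetic series: same period, different amplitude and phase.
model = [100 + 10 * math.sin(2 * math.pi * (t + 0.25) / 20) for t in times]
real = [100 + 12 * math.sin(2 * math.pi * (t + 0.25 - 3) / 20) for t in times]

model_patterns = measure_patterns(model)
real_patterns = measure_patterns(real)
diffs = relative_differences(model_patterns, real_patterns)
print(model_patterns, real_patterns, diffs)
```

The phase shift between the two series does not enter the comparison at all. Only the pattern measures do, which is the point of behavior validity here. Real data are noisy, so in practice the series would need smoothing before peaks are picked.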