General notes on proDEX:
proDEX is a tool for qualitative multi-attribute modelling in the basic and extended DEX methodology (Bohanec, 2003). It is written in the Python programming language and uses graphical components of the data-mining suite Orange (Demsar et al., 2004). It was initially made for experimental purposes, to test new approaches to semi-automatic construction and new functionalities of the models. Later, the requirements of modelling tasks in environmental domains dictated support for the extended methodology and a graphical user interface. proDEX lacks some features found in other tools of the DEX methodology (such as DEX and DEXi), but it also has some features that are not available in those tools. The most important of these are the use of general hierarchies, numerical inputs and probabilistic utility functions.
The proDEX tool is implemented as a group of Orange widgets and is available for download on this web page.
It can be freely used for non-commercial purposes.
Below are installation instructions and previews of components:
- Installation
- ReadModel component
- Viewer component
- AlternativesEditor component
Installation
proDEX is implemented as a group of Orange widgets. Orange is an open-source data-mining suite, and widgets are its graphical components, based on the Qt GUI library. Installing Orange is therefore a prerequisite for installing proDEX. The process is described on the Orange downloads page, where the appropriate installation packages can be found. Note that several packages are offered, depending on whether Python is already installed on your system.
The second step is downloading and installing proDEX. The proDEX GUI components are available for download (see above) as a zipped directory of files named “proDEX”. Unzip it into the directory of Orange widgets, which is usually located at:
C:\Python23\Lib\site-packages\orange\OrangeWidgets
or
C:\Program Files\Python23\Lib\site-packages\orange\OrangeWidgets
In the third step, we must inform Orange of the new group of widgets. For this purpose, we start Orange (at this point there is no group of proDEX widgets yet) and rebuild the widget registry. The command is found under Options-Rebuild widget registry. A new group of widgets named “proDEX” should then appear. Clicking on its tab shows the icons of the “ReadModel”, “Viewer” and “AlternativesEditor” components, as in the picture below:
To use a component, put it on the canvas by clicking on its icon. In proDEX we usually put all three components onto the canvas and connect them in the same order as they are shown in the toolbar:
ReadModel component
Each use of proDEX starts with the import of a model. This is done in the “ReadModel” component.
This component is used for opening a proDEX model file or importing a DEXi model file. A proDEX model definition is written in Python. A simple, small example model is defined in the file Car.py.
A model definition in Python is usually written manually (open the example file in a text editor to see one). To speed up the model definition, we can use DEXi to construct the model interactively and then import the file with the “ReadModel” component. For this purpose, however, DEXi models must be converted to the PMML format, which involves the use of a transformation program (in Slovenian) and requires the model to be constructed according to a set of rules. If you would like to transform DEXi models into proDEX definitions, please contact us by e-mail for support. An imported model can be saved in the native proDEX format. This is done in the “Viewer” component (described below), which is usually the next one to be put on the canvas and connected to “ReadModel”.
Viewer component
“Viewer” is a component with three tabs. The first tab provides some general information about the model, settings and an optional graphical preview of the model structure:
Currently, there is only one model setting: showing or hiding the value of the optional model parameter named CONFIDENCE. The setting affects all the widgets that operate on a particular model.
The first tab also offers an option for saving the model in the native proDEX format. It is intended for models that were constructed in DEXi and then imported.
The second tab is used for quick evaluation of alternatives (what-if analysis):
It provides combo-boxes for the values of all the basic attributes (inputs) of the model. These values can be changed and the corresponding final evaluations can be compared.
The third tab offers a preview of all the utility functions of the model:
AlternativesEditor component
In “AlternativesEditor”, alternatives can be defined, saved, loaded, generated, evaluated and sorted.
The first tab shows a table with one alternative per row and the values of the basic attributes in columns. The defined alternatives can be saved to a file and loaded later. If there are no numeric inputs, all the possible alternatives can be generated automatically. Use this function with caution: the number of possible alternatives can be huge and may require more memory than the computer has.
To compare two alternatives, we must select them. This is done by clicking on their names while holding the ‘Ctrl’ key. The comparison is shown in the second tab, where the values of all the aggregate attributes are computed and displayed.
More elaborate examples of use are given on the supplementary materials web page of a paper published in Environmental Modelling and Software (Znidarsic et al., 2006).
References
Bohanec, M., 2003. Decision support. In: Mladenic, D., Lavrac, N., Bohanec, M., Moyle, S. (Eds.), Data mining and decision support: Integration and collaboration. Kluwer Academic Publishers, pp. 23–35.
Demsar, J., Zupan, B., Leban, G., 2004. Orange: From experimental machine learning to interactive data mining. White paper (http://www.ailab.si/orange) Faculty of Computer and Information Science, University of Ljubljana.
Znidarsic, M., Bohanec, M., Zupan, B., 2006. proDEX – a DSS tool for environmental decision-making. Environmental Modelling and Software, vol. 21, no. 10, pp. 1514–1516.
proDEX features through an example of use
proDEX is a tool for qualitative multi-attribute modelling in the basic and extended DEX methodology. Its most important new features with respect to other tools for this methodology are the use of general hierarchies, numerical inputs and probabilistic utility functions. As the development of these features and some other specific functionalities was driven by the needs of environmental modelling, the example used for this more elaborate demonstration comes from that domain.
Example model
The example model waqNUM2prob.py is a model for assessing the impact of farm-level cropping systems on water quality. It is a part of an actual model described in the literature, but the utility functions are altered and serve demonstration purposes only. The model is designed to assess the differences between cropping systems that include a particular type of genetically modified (GM) crops and those that do not. The structure of the model’s concepts looks like this:
We can see that the structure is a general hierarchy: it has the form of a directed acyclic graph. The concepts in the lowest level of the hierarchy are inputs to the model. They influence the concepts on the next level of the hierarchy (their direct ancestors), and these further influence their own ancestors. This continues up to the top of the hierarchy. The influences among particular concepts are depicted by connection lines in the structure and are defined with qualitative utility functions. These functions have the form of if-then rules and are usually given in tabular form. A utility function for the concept “runoff water” might be defined like this:
soil state | pesticide use | runoff water |
---|---|---|
compact | high | very low |
compact | medium | low |
compact | low | medium |
compact | none | high |
non compact | high | very low |
non compact | medium | very low |
non compact | low | low |
non compact | none | high |
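
To make the if-then reading of the table concrete, the rules can be thought of as a lookup from input value combinations to an output value. The following is only an illustrative sketch in plain Python; the names `RUNOFF_RULES` and `runoff_water` are hypothetical, and this is not the proDEX model file format.

```python
# Illustrative encoding of the "runoff water" rule table above.
# RUNOFF_RULES and runoff_water are hypothetical names, not part of proDEX.
RUNOFF_RULES = {
    ("compact", "high"): "very low",
    ("compact", "medium"): "low",
    ("compact", "low"): "medium",
    ("compact", "none"): "high",
    ("non compact", "high"): "very low",
    ("non compact", "medium"): "very low",
    ("non compact", "low"): "low",
    ("non compact", "none"): "high",
}


def runoff_water(soil_state, pesticide_use):
    """Return the "runoff water" value for one combination of input values."""
    return RUNOFF_RULES[(soil_state, pesticide_use)]


# Each table row is one if-then rule, e.g.
# "if soil state is compact and pesticide use is low, then runoff water is medium".
print(runoff_water("compact", "low"))
```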
The utility functions in proDEX can also be defined as probabilistic, so that the result of a function is not a single goal value, but a probability distribution over values. Each rule can also be given a parameter named “CONFIDENCE”, which expresses the modeller’s confidence in the correctness of the rule. Using these options, the utility function for the concept “runoff water” can be defined like this:
soil state | pesticide use | runoff water |
---|---|---|
compact | high | { high:0.0 medium:0.0 low:0.1 very low:0.9 CONFIDENCE:0.9 } |
compact | medium | { high:0.0 medium:0.1 low:0.8 very low:0.1 CONFIDENCE:0.9 } |
compact | low | { high:0.1 medium:0.6 low:0.3 very low:0.0 CONFIDENCE:0.9 } |
compact | none | { high:1.0 medium:0.0 low:0.0 very low:0.0 CONFIDENCE:1.0 } |
non compact | high | { high:0.0 medium:0.0 low:0.05 very low:0.95 CONFIDENCE:1.0 } |
non compact | medium | { high:0.0 medium:0.0 low:0.3 very low:0.7 CONFIDENCE:0.9 } |
non compact | low | { high:0.0 medium:0.2 low:0.6 very low:0.2 CONFIDENCE:0.8 } |
non compact | none | { high:1.0 medium:0.0 low:0.0 very low:0.0 CONFIDENCE:1.0 } |
The outcome of assessing an alternative in a model with probabilistic utility functions is a probability distribution over the values of the top concept. If “CONFIDENCE” is defined in the model, it is also calculated and provided for each alternative.
The input variable “fertilization” (intended fertilizer use in kg/ha) is numeric and has to be categorized before it can be used in qualitative rule functions. proDEX allows the categorization to be a part of the model definition. Intrinsically numeric values can thus be entered as numbers (not pre-categorized), which in practice keeps the user’s focus on the alternatives rather than on categorization. Another minor benefit is that numeric inputs are saved as such in the alternatives and preserve all the information, although the model might require only their qualitative representation.
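
One plausible way to read the evaluation of a probabilistic rule table is as a weighted mixture: each rule's distribution is weighted by the probability of its input combination. The sketch below assumes that the inputs are independent and leaves out “CONFIDENCE”, because the text does not say how confidences are combined. The function `evaluate` and the dictionary `RULES` are hypothetical names; this is an illustration of the idea, not the proDEX implementation.

```python
VALUES = ("high", "medium", "low", "very low")

# Probabilistic rules for "runoff water", copied from the table above
# (CONFIDENCE omitted on purpose, see the note in the text).
RULES = {
    ("compact", "high"): {"high": 0.0, "medium": 0.0, "low": 0.1, "very low": 0.9},
    ("compact", "medium"): {"high": 0.0, "medium": 0.1, "low": 0.8, "very low": 0.1},
    ("compact", "low"): {"high": 0.1, "medium": 0.6, "low": 0.3, "very low": 0.0},
    ("compact", "none"): {"high": 1.0, "medium": 0.0, "low": 0.0, "very low": 0.0},
    ("non compact", "high"): {"high": 0.0, "medium": 0.0, "low": 0.05, "very low": 0.95},
    ("non compact", "medium"): {"high": 0.0, "medium": 0.0, "low": 0.3, "very low": 0.7},
    ("non compact", "low"): {"high": 0.0, "medium": 0.2, "low": 0.6, "very low": 0.2},
    ("non compact", "none"): {"high": 1.0, "medium": 0.0, "low": 0.0, "very low": 0.0},
}


def evaluate(soil_dist, pesticide_dist):
    """Mix the rule distributions, weighted by the probability of each input pair.

    Assumption: the two inputs are treated as independent, so the weight of a
    rule is the product of the probabilities of its two input values.
    """
    result = {value: 0.0 for value in VALUES}
    for (soil, pesticide), rule_dist in RULES.items():
        weight = soil_dist.get(soil, 0.0) * pesticide_dist.get(pesticide, 0.0)
        for value, probability in rule_dist.items():
            result[value] += weight * probability
    return result


# Crisp inputs select exactly one rule.
crisp = evaluate({"compact": 1.0}, {"low": 1.0})
# An uncertain soil state mixes two rules.
mixed = evaluate({"compact": 0.5, "non compact": 0.5}, {"low": 1.0})
print(crisp)
print(mixed)
```

With crisp inputs the result is simply the distribution of the matching rule; with an uncertain soil state the result is the average of the two matching rules.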
Since the native proDEX model definition is written in Python, categorizations are defined as very simple programming functions of prescribed form. In our case, the function looks like this:
def func1(fertilizerVal):
    # Map the numeric fertilizer use (kg/ha) to one qualitative value.
    result = None
    if fertilizerVal <= 120:
        result = {"very_low":1.0, "low":0.0, "medium":0.0, "high":0.0, "CONFIDENCE":1.0}
    elif (fertilizerVal > 120) and (fertilizerVal <= 160):
        result = {"very_low":0.0, "low":1.0, "medium":0.0, "high":0.0, "CONFIDENCE":1.0}
    elif (fertilizerVal > 160) and (fertilizerVal <= 200):
        result = {"very_low":0.0, "low":0.0, "medium":1.0, "high":0.0, "CONFIDENCE":1.0}
    elif fertilizerVal > 200:
        result = {"very_low":0.0, "low":0.0, "medium":0.0, "high":1.0, "CONFIDENCE":1.0}
    return result

# Attach the categorization function to the model attribute.
fertilizer_use.generalFunction = func1
Although it is written in Python, the function is not hard to understand because of its simple if-then structure. Such a low-level definition might be unfriendly to a modeller without programming experience, but it allows great flexibility in defining categorizations. We could, for instance, define a smooth categorization that would not transform a numeric value into a single qualitative value (as in the above example), but into a corresponding probability distribution over all the possible values (for instance, by using a Gaussian kernel in the calculation).
Of course, defining categorizations this way is optional. A user who is not comfortable with it can avoid it by defining the model with categorical variables only and taking care of the categorization of numeric variables outside the model.
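A smooth categorization of this kind could be sketched as follows. The category centres, the kernel width and the function name `smooth_fertilizer` are assumptions made for illustration only; they do not come from the example model. The return value follows the same dictionary form as the categorization function above.

```python
import math

# Hypothetical category centres (kg/ha) and kernel width; not taken from the model.
CENTRES = {"very_low": 100.0, "low": 140.0, "medium": 180.0, "high": 220.0}
SIGMA = 20.0


def smooth_fertilizer(fertilizerVal):
    """Turn a numeric value into a probability distribution over categories.

    Each category gets a Gaussian kernel weight based on the distance between
    the value and the category centre; the weights are normalized to sum to 1.
    """
    weights = {
        name: math.exp(-((fertilizerVal - centre) ** 2) / (2.0 * SIGMA ** 2))
        for name, centre in CENTRES.items()
    }
    total = sum(weights.values())
    if total == 0.0:
        # Far outside the scale all kernel weights underflow to zero;
        # fall back to the category with the nearest centre.
        nearest = min(CENTRES, key=lambda name: abs(fertilizerVal - CENTRES[name]))
        result = {name: 1.0 if name == nearest else 0.0 for name in CENTRES}
    else:
        result = {name: weight / total for name, weight in weights.items()}
    result["CONFIDENCE"] = 1.0
    return result


# A value exactly at a centre gives most, but not all, of the mass to that category.
print(smooth_fertilizer(140))
```

In a model file such a function would presumably be attached to the attribute in the same way as `func1`, that is, through `generalFunction`.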
Basic model exploration
After importing the model in proDEX, we can explore its structure, behavior and utility functions in the Viewer component. The first tab of this component consists of:
- Basic information about the model (name, number of attributes, nodes).
- Checkbox for showing or hiding the “CONFIDENCE” parameter in the widgets that use the current model.
- Button for graphical preview of the model structure.
- Button for saving the model in the native (Python) format. This option is used when the model is imported from DEXi.
The graphical preview of the model is shown in a new window. The right side of the window contains a sketch of the model structure, and the control options are on the left. The option “Gap width” increases or decreases the initial gap between the nodes in the sketch, which improves the readability of complex models.
The next option is “Plot mode”, which selects the sketch plotting algorithm. Both sketching algorithms are very simple and have some deficiencies. The “Level-based” mode plots the hierarchy in a bottom-up fashion and is usually the better choice. The “Recursive” mode is experimental: it produces very nice plots of some models, but very bad ones of others (as in the case of our water quality model).
The plotted model structure is just a rough sketch, useful for a quick preview. When working with a model, however, we usually also want to show it in presentations, articles or web pages. For this purpose, the plotting window has a button “Export GML”, which exports the model structure to a file in the GML format. Files in this format can be used by various advanced graph drawing packages (such as the free graph editor yEd).
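For readers unfamiliar with GML, the sketch below writes a small directed graph in that format. It only illustrates what such a file looks like; the function `to_gml` is hypothetical, the exact output of the “Export GML” button may differ, and the three edges are a made-up fragment built from concept names used on this page.

```python
def to_gml(edges):
    """Serialize (child, parent) pairs as a directed graph in GML."""
    # Collect node names in order of first appearance and give them integer ids.
    names = []
    for child, parent in edges:
        for name in (child, parent):
            if name not in names:
                names.append(name)
    ids = {name: index for index, name in enumerate(names)}

    lines = ["graph [", "  directed 1"]
    for name, index in ids.items():
        label = name.replace('"', "&quot;")  # quotes must not appear raw in GML strings
        lines.append(f'  node [ id {index} label "{label}" ]')
    for child, parent in edges:
        lines.append(f"  edge [ source {ids[child]} target {ids[parent]} ]")
    lines.append("]")
    return "\n".join(lines)


# Made-up fragment of a hierarchy: inputs point to the concepts they influence.
example_edges = [
    ("soil state", "runoff water"),
    ("pesticide use", "runoff water"),
    ("runoff water", "water quality"),
]
gml_text = to_gml(example_edges)
print(gml_text)
```

The resulting text can be saved to a file with the `.gml` extension and opened in a graph editor for manual layout.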
Window with a graphical preview of model structure:
The second tab of the Viewer component is used for what-if analysis. This is a simple analysis of model behavior, in which the user changes the model input values and observes the changes in the final outcome. It is used for testing the global response and sensitivity of the model. The tab consists of selectors for the values of each input variable, a button that triggers the evaluation, and the result:
The third tab offers a preview of all the utility functions in the imported model. It has a selector of concepts, and for the selected concept it shows the utility function in a table:
Work with alternatives
Working with alternatives is an essential part of simulations and decision analysis with hierarchical rule-based models. An alternative is an option (situation, scenario) that we wish to assess and compare to other alternatives.
proDEX offers tools for working with alternatives in the “AlternativesEditor” component. Its functionality is demonstrated separately for each group of tasks:
– managing
In the basic view of the “AlternativesEditor”:
we can define, save, load and generate the alternatives. The left side of the window contains the controls for these operations. Alternatives are shown in the table on the right (central) part of the window.
An example alternative is already given when “AlternativesEditor” is opened. Its values can be changed, and it can be given a name or description by selecting it and pressing the “Rename” button. Additional alternatives can be added by pressing the “Add” button. They can be defined and renamed like the first one. An alternative can also be deleted by selecting it and pressing the “Delete” button.
Defining a larger number of alternatives for a particular problem is a time-consuming task. To avoid repeating it every time proDEX is closed, the alternatives can be saved to a file and loaded later using the “Save” and “Load” buttons. This way we can also prepare sets of alternatives for different assessments with the same model.
Alternatives can also be generated automatically. For very small models (like the “Car” model from the Introduction), we can generate all the possible alternatives by pressing the “Generate All” button. However, this is dangerous when we work with bigger models. The number of all the possible alternatives equals the product of the numbers of values of all the input variables, which easily exceeds the number of alternatives that can be stored in the memory of a computer, or even in the memory of all the computers in the world. To avoid this combinatorial explosion, we can use the interactive generator of alternatives. This tool is invoked by pressing the “Generate some” button:
In the interactive generator, we can select only those values of input variables that are relevant in a particular setting. The alternatives generated from the selected values are inserted into the table of alternatives by pressing the “To table” button. At the bottom, there is also a line with information about the current number of alternatives. If this number exceeds 1000, a warning is issued before the alternatives are generated. Besides the problems of storing and presenting a large number of alternatives, it is also very questionable whether such a number can be properly analyzed.
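The arithmetic behind this can be sketched in a few lines. The snippet assumes a hypothetical dictionary `domains` that maps each input variable to the values selected for it; the function names are illustrative and not part of proDEX. Unlike the tool, which issues a warning, this sketch simply refuses to generate more than the given limit.

```python
import itertools
import math


def count_alternatives(domains):
    """Number of alternatives: the product of the numbers of selected values."""
    return math.prod(len(values) for values in domains.values())


def generate_alternatives(domains, limit=1000):
    """Return all combinations of the selected values as a list of dictionaries."""
    total = count_alternatives(domains)
    if total > limit:
        raise ValueError(f"{total} alternatives exceed the limit of {limit}")
    names = list(domains)
    return [dict(zip(names, combination))
            for combination in itertools.product(*domains.values())]


# Hypothetical selection: two input variables with 2 and 4 values give 2 * 4 = 8 alternatives.
selected = {
    "soil state": ["compact", "non compact"],
    "pesticide use": ["high", "medium", "low", "none"],
}
print(count_alternatives(selected))
alternatives = generate_alternatives(selected)

# Twenty input variables with four values each would already give 4 ** 20
# (about 1.1e12) alternatives, which is why the count is checked first.
```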
– assessment
To see the evaluation of an alternative, we must select it and open the “Compare” tab. Several alternatives can be compared simultaneously by selecting them together (press and hold Ctrl). The three imaginary alternatives:
result in this comparison:
In the comparison view, we can observe the evaluations of the goal and of all the other compound concepts. Each alternative is now given in a column. The values of input variables are in the blue cells and the corresponding evaluations of the compound concepts are in the grey cells. In each distribution, the value with the highest probability is printed in bold so that it stands out.
We can see that the alternatives differ with respect to their impact on the overall quality of water. The evaluations of the middle-layer attributes help us understand the results and the differences among the given alternatives.
– ranking
A thorough comparison of alternatives is a demanding and time-consuming task, especially if we have to deal with a large number of them. When we do not have particular alternatives in mind, but want to simulate a large number of situations, the alternatives generator helps us define them efficiently. But as we can thoroughly analyse only a few of them, we also need tools to refine our search, usually to find the best or worst possible options (and their features) in a given setting. proDEX offers some help here with its ranking option.
The alternatives can be ranked (sorted) by pressing the “Sort selected” or “Sort all” button in the “AlternativesEditor” window. We can also select the ranking method: by average distribution value or by probability of the best/worst value. The “Sort selected” button ranks only the selected alternatives, whereas “Sort all” ranks all the defined alternatives. Both show the results in a new window:
The ranking results window shows the values of the alternatives, but the most important are the last two grey columns. One of them holds the goal evaluation of each alternative and the other the value of the ranking criterion (this criterion also depends on the model definition; in this example bigger is worse, in contrast to the definition of “water quality” values). The true value of the ranking tool shows when ranking tens or hundreds of alternatives, where we can easily see the best or worst few. The ranking tool is currently in an experimental phase and will probably be developed further, to facilitate selection of a defined percentage of the best or worst alternatives and to let such a selection control the contents of the table of alternatives in “AlternativesEditor”.
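The two ranking methods can be illustrated on goal distributions as follows. This is a sketch under assumptions: the ordered scale, the alternative names and their distributions are made up, and the "average distribution value" is interpreted here as the expected position on the ordered scale, which may differ from the criterion proDEX actually computes.

```python
# Hypothetical ordered scale of the goal concept, from worst to best.
SCALE = ("low", "medium", "high")

# Made-up goal evaluations of three alternatives (probability distributions).
EVALUATIONS = {
    "A": {"low": 0.3, "medium": 0.0, "high": 0.7},
    "B": {"low": 0.5, "medium": 0.4, "high": 0.1},
    "C": {"low": 0.0, "medium": 0.5, "high": 0.5},
}


def average_value(distribution):
    """Expected position on the ordered scale (0 = worst value)."""
    return sum(SCALE.index(value) * probability
               for value, probability in distribution.items())


def probability_of(distribution, value):
    """Probability that the goal takes the given value."""
    return distribution.get(value, 0.0)


# Rank by average distribution value, best first.
by_average = sorted(EVALUATIONS,
                    key=lambda name: average_value(EVALUATIONS[name]),
                    reverse=True)

# Rank by probability of the best value, best first.
by_best = sorted(EVALUATIONS,
                 key=lambda name: probability_of(EVALUATIONS[name], SCALE[-1]),
                 reverse=True)

print(by_average)
print(by_best)
```

The two criteria need not agree: here alternative C has the highest average, while alternative A has the highest probability of the best value.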