Story Generation Driven by System-Modified Evaluation Validated by Human Judges*

Pablo Gervás1 and Carlos León2

1 Instituto de Tecnología del Conocimiento, pgervas@sip.ucm.es
2 Departamento de Ingeniería del Software e Inteligencia Artificial, cleon@fdi.ucm.es

* Research funded by the MICINN (GALANTE: TIN2006-14433-C02-01), UCM, Comunidad de Madrid (IVERNAO: CCG08-UCM/TIC-4300) and by BSCH-UCM.

Abstract. Building systems which can transform their own generation processes can lead to the creation of novel, high-quality artifacts. In this paper a solution based on evaluation is proposed. The generation process is driven by evaluation rules which can be modified by the system. A panel of human evaluators provides feedback on the quality of the artifacts resulting after each modification. The system keeps track of which rules have been applied in the selection of each artifact, and learns indirectly from the human judges which modifications to retain, based on the relative ratings of the artifacts. Relevant details and difficulties of this approach are discussed.

1 Introduction

Societies of human creators are driven by two basic activities: the creation of new artifacts (as performed by artists) and the evaluation of newly created artifacts (as performed by artists and/or critics). Most past efforts at modelling human creativity in computational terms have focused on the task of creating artifacts. There are two strong arguments in favour of shifting the focus towards evaluation. First, developing models or algorithms for producing artifacts of a given kind tends to produce good, recognisable, typical artifacts of that type, rather than creative new ones. Innovation requires both departure from established procedures and the means for identifying when new results are good. Second, generate-and-test approaches constitute a simple computational way of rephrasing the task of creating artifacts in terms of the task of evaluating them. Very simple enumerative procedures for traversing a search space may yield surprisingly good results if driven by an appropriate evaluation function. If such a shift is taken to an extreme, the enumeration of the valid alternatives would not need to be altered in the search for new artifacts; it would be enough to modify the evaluation function to obtain new candidate elements. Under this approach, the task of modifying creative procedures to obtain new artifacts would take the form of modifying the evaluation function.

In societies of human creators, the development of an evaluation function (usually understood as artistic sensibility or an equivalent ability) is recognised as a fundamental requirement in the learning process of creative individuals. This learning process almost always takes the form of having instances of good artifacts pointed out.

This paper describes a system that outputs new artifacts obtained by exploring a restricted conceptual space under the guidance of a set of evaluation rules. The conceptual space to explore is that of sequences of events that may be understood as stories. The exploration procedure is exhaustive enumeration of the search space. The system starts off from an initial set of evaluation rules for selecting new artifacts as the conceptual space is explored. A method for actively modifying the set of evaluation rules is provided. Modifications of the evaluation rules lead to new artifacts. The system learns which of the modified rules to retain from the responses of a panel of human evaluators who act as audience for its production of new artifacts.
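As a schematic illustration of the shift described above, the following minimal sketch pairs a fixed exhaustive enumerator with a swappable evaluation function; changing only the evaluation function changes which artifacts become output. The toy alphabet, function names and evaluation criteria are invented for illustration and are not taken from the system described in this paper.

```python
# Sketch: the traversal of the search space never changes; only the
# evaluation function decides which candidates become output.
from itertools import product

def enumerate_candidates(alphabet, length):
    """Exhaustively enumerate every sequence of the given length."""
    return product(alphabet, repeat=length)

def generate(alphabet, length, evaluate, threshold=0.0):
    """Generate-and-test: keep the candidates the evaluation function
    rates above the threshold."""
    return [c for c in enumerate_candidates(alphabet, length)
            if evaluate(c) > threshold]

# Two different evaluation functions over the same fixed search space
# carve out two different sets of "new" artifacts.
novelty_a = lambda seq: 1.0 if len(set(seq)) == len(seq) else -1.0
novelty_b = lambda seq: 1.0 if seq[0] == seq[-1] else -1.0

print(generate("abc", 3, novelty_a))  # sequences with all-distinct symbols
print(generate("abc", 3, novelty_b))  # sequences that end where they began
```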
2 Previous Work

Boden [1,2] divides creativity into exploratory creativity (exploring the common possibilities for creating artifacts) and transformational creativity (changing these common rules to find genuinely new and valid objects). Jennings [3] hypothesizes that societies create the evaluation criteria of creativity in the individual's mind, thus lifting the concept of creativity beyond purely inner processes. As such, creativity is learned and taught between individuals, and their relationships and the opinions that each one holds about another have a strong influence on ideas about the quality or novelty of artifacts. Autonomous creativity is the ability to change one's own standards without explicit direction from the outside. According to Jennings, autonomous creativity in humans is achieved through social interaction.

Ritchie [4] identifies the role of humans in Computational Creativity as still being very necessary, given the current state of the art. In his model, the role of humans must be clearly established before putting them in the generation loop. It is also hypothesized that human actions in the system should never be directly related to the generative objective of the system.

Wiggins [5,6] defines a formalization of Computational Creativity processes in terms of their relation to classic Artificial Intelligence and the characteristics that separate pure exploration processes from those typically and exclusively present in Computational Creativity. In his formalization, several sets are identified: U, the universe of concepts, containing the whole set of artifacts, and the conceptual spaces C0, ..., Cn, which are strict subsets of U, among others. Three functions are also important to mention: R, which establishes the constraints that define the conceptual space of valid results; T, the function that traverses this conceptual space and sets an order on the identification of artifacts in the Ci set constrained by R; and E, the function for evaluating artifacts.

3 Story Generation Based on Evaluation

The domain of story generation has been chosen to illustrate the ideas in this paper because it deals with artifacts that are easy to represent symbolically, are linear in nature, and, at a certain level of abstraction, have a complete conceptual space that may be specified by definition in terms of combinations of their constituent elements. Some of these points are sketched briefly below.

In terms of Wiggins' model, the simplest approach for a generation system that explicitly performs evaluation on the stories it generates could be the definition of the E function (Eg at this level) and a basic generative strategy which would generate all possible stories in the conceptual space. The generative strategy, corresponding to Wiggins' T function (Tg at this level), could be carried out by simple backtracking generation in which each step adds a new event to the story (so that several steps along one branch produce a whole story) and then backtracks to test another generative branch, as sketched below. Given a certain set of terminals such as verbs, character names, places and valid time values, events in the form subject–verb–arguments can be easily generated.
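A minimal sketch of this backtracking strategy follows, assuming a toy lexicon of terminals; the vocabulary and function names are illustrative and not fixed by the paper.

```python
# Sketch of the exhaustive backtracking strategy (Tg): depth-first
# enumeration of event sequences over a toy set of terminals.
SUBJECTS = ["Robert", "Laura"]
VERBS = ["went_to", "found"]
ARGUMENTS = ["the_park", "a_key"]

def all_events():
    """Every subject-verb-argument combination over the terminals."""
    return [(s, v, a) for s in SUBJECTS for v in VERBS for a in ARGUMENTS]

def stories(max_events, prefix=()):
    """Extend the current partial story with each possible event; once a
    branch reaches the target length, emit the whole story, then
    backtrack to try the next generative branch."""
    if len(prefix) == max_events:
        yield prefix
        return
    for event in all_events():
        yield from stories(max_events, prefix + (event,))

# 8 possible events -> 8**2 = 64 two-event stories, enumerated exhaustively.
print(sum(1 for _ in stories(2)))
```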
The stories can be considered to be sequences of events of the form {e1, e2, ..., en}, where events are conceptual statements corresponding to sentences like "Robert went to the park". The evaluation function (Eg) would output a real value in the interval [−1, 1], with −1 denoting a "very bad" story and 1 a wonderful one. A value of 0 would represent a plain, normal story, acceptable but not "good". Thus we could obtain a total order over stories in which any threshold in the [−1, 1] range could be used to differentiate interesting stories (those falling above the threshold) from uninteresting ones.

The Eg function could be composed of rules whose structure comprises a set of preconditions, considering the current partial evaluation and the current state of the story, and a set of effects that the application of those rules has on the final evaluation. A very simple evaluator would then process the story events iteratively, checking the preconditions and applying the corresponding effects, in such a way that the state of the evaluator (the partial set of variables that form the evaluation) is progressively updated for each processed event.

3.1 Evaluation-Driven Story Generation

The original definition of the T function given by Wiggins modelled the operation of identifying the next element in the conceptual space to be considered. Under one interpretation, this could be understood to refer to the actual construction process followed by the creative system to obtain its next result. In this case, the range of the T function defines system output. However, under a different interpretation, the T function would be the procedure for constructing the next element to be considered by the evaluation function E. As some of the candidates proposed by the T function will be rejected by the evaluation function E, system output in this case is defined by the interaction between the T and E functions. In this paper we adopt this second interpretation. Modifications of the E function will therefore control system output.

For the purposes of this paper, plain random modification of the rules can be considered; more refined solutions are possible. However, the system should not rely on the quality of any particular method of transformation. At this new level, we also shift the responsibility for obtaining acceptable results to the evaluation process, in this case the evaluation of the effects of the modified rules. For this higher-level evaluation we resort to a panel of judges.

3.2 Social Interaction Between Humans and Computers for Controlling Transformation

The human judges that evaluate stories are asked to produce plain numeric values decided as they read the stories. For every generated story, a single numeric value in the range [−1, 1] could be received from the humans reading it, as long as the variable to be obtained is clearly defined and depends only on human criteria regarding stories.

The proposed method for evaluating stories involves checking the available set of evaluation rules against each story. Only some of these rules will have their preconditions met, and therefore be applied to contribute to the final rating that the system assigns to the story. For every story S that is finally selected as system output, a record is kept of which evaluation rules contributed to establishing its internal rating: the particular subset of the evaluation rules (the FS set) that contributed to its being selected as output. By combining this record with the evaluations obtained from the human judges, each rule in this subset FS can be assigned the rating that humans assigned to the story S. In this way, rules receive several ratings coming indirectly from humans. This could be used, for instance, to keep the rules that produce good stories and discard rules that produce bad stories according to human evaluation, as sketched below.
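The following sketch shows how such a rule-based evaluator and the indirect credit assignment might fit together. The rule representation (a precondition plus an effect over an evolving evaluator state), the clamping to [−1, 1] and the two example rules are assumptions made for illustration; they are not the system's actual rules.

```python
# Sketch: rule-based evaluator (Eg) plus indirect credit assignment of
# human ratings to the rules that fired for each selected story (FS).
from collections import defaultdict

class Rule:
    def __init__(self, name, precondition, effect):
        self.name = name
        self.precondition = precondition  # (state, event) -> bool
        self.effect = effect              # delta applied to the evaluation

def evaluate(story, rules):
    """Process events iteratively; whenever a rule's precondition holds,
    apply its effect and record it as a contributor. Returns the score
    clamped to [-1, 1] and the set of fired rules (the FS record)."""
    state = {"seen": set()}
    score, fired = 0.0, set()
    for event in story:
        for rule in rules:
            if rule.precondition(state, event):
                score += rule.effect
                fired.add(rule.name)
        state["seen"].add(event)
    return max(-1.0, min(1.0, score)), fired

def credit_rules(selected_stories, human_ratings, rules):
    """Assign each story's human rating (in [-1, 1]) to every rule that
    contributed to selecting it, then average per rule; low-scoring
    rules become candidates for discarding."""
    ratings = defaultdict(list)
    for story, rating in zip(selected_stories, human_ratings):
        _, fired = evaluate(story, rules)
        for name in fired:
            ratings[name].append(rating)
    return {name: sum(v) / len(v) for name, v in ratings.items()}

# Example rules: penalise a verbatim repeated event, reward fresh events.
rules = [
    Rule("no_repeats", lambda st, ev: ev in st["seen"], -0.5),
    Rule("fresh_event", lambda st, ev: ev not in st["seen"], 0.1),
]
story = [("Robert", "went_to", "the_park"), ("Robert", "went_to", "the_park")]
print(evaluate(story, rules))               # low score; both rules fired
print(credit_rules([story], [-0.8], rules)) # both rules inherit the -0.8 rating
```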
4 Discussion and Further Development

It is important to consider to what extent the autonomy of such a system would be compromised by the role played by the human judges. Ritchie [7] points out the need to keep humans isolated from the final objective of the system. In the present case, this corresponds to ensuring that the human participants play no direct role in the actual generation of the stories. At a more specific level, since the system is transforming evaluation rules, the human judges must not directly add knowledge concerning the transformation of rules. In the described set-up, human judges do not at any stage come into contact with the set of evaluation rules or the method used for transforming them. This constitutes a certain safeguard of system autonomy.

Another aspect to take into account is whether the role played by the human judges in the proposed system could be seen as modelling real phenomena that occur in human creativity. We believe that it closely emulates the role played by critics and teachers in the formation of the creative capabilities of human creators. Along these lines, improvements to the present proposal could be contemplated. According to Jennings [3], the influence that external individuals have on generators depends on the relation between the generators and the evaluators. Issues like past agreement or mutual admiration may play a significant role in tempering actual feedback. For instance, it might be interesting to consider whether the learning process of the system could be refined by giving priority to the opinions of judges that have awarded good ratings in the past, as sketched at the end of this section.

The proposed solution would be inefficient. Although the system might explore candidate artifacts at a fast rate, and transform evaluation rules at speed, it relies on a stage of feedback from human judges that would take time (for a number of stories to be read and evaluated by the judges). The system would have to undergo a learning process equivalent to that of human storywriters receiving feedback from knowledgeable mentors.

The current proposal restricts system output to a very specific conceptual space, and all system operations, whatever transformations are applied to the evaluation rules and whatever feedback is received from the judges, cannot lead to outputs beyond that conceptual space. In that sense, the system could only aspire to be considered creative in an exploratory manner. Nonetheless, it explicitly transforms its own procedures in search of better-valued artifacts. This aspect of creative professions, the continuous search for improvement through modification of one's procedures, has yet to be addressed in the computational creativity literature. The present proposal constitutes a first step in this direction.
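As a closing illustration, one possible reading of the judge-priority refinement mentioned above weights each judge's rating by a trust score that grows when that judge has rated the system's past output favourably. The update rule, the learning rate and the floor constant below are invented for illustration and are not part of the paper's proposal.

```python
# Sketch: trust-weighted aggregation of per-judge ratings in [-1, 1].
def aggregate(ratings, trust):
    """Trust-weighted mean of the judges' ratings for one story."""
    total = sum(trust[j] for j in ratings)
    return sum(trust[j] * r for j, r in ratings.items()) / total

def update_trust(ratings, trust, lr=0.1):
    """Nudge trust towards judges who awarded good ratings, with a
    small floor so no judge is silenced entirely."""
    for judge, r in ratings.items():
        trust[judge] = max(0.05, trust[judge] + lr * r)

trust = {"judge_a": 1.0, "judge_b": 1.0}
ratings = {"judge_a": 0.9, "judge_b": -0.2}
print(aggregate(ratings, trust))  # plain mean on the first round: 0.35
update_trust(ratings, trust)      # judge_a gains influence next round
print(trust)
```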
References

1. Boden, M.: The Creative Mind: Myths and Mechanisms. Routledge, New York (2003)
2. Boden, M.: Computational models of creativity. In: Handbook of Creativity (1999) 351–373
3. Jennings, K.: Developing Creativity: Artificial Barriers in Artificial Intelligence. In: Proceedings of the International Joint Workshop on Computational Creativity (2008)
4. Ritchie, G.: Some Empirical Criteria for Attributing Creativity to a Computer Program. Minds and Machines 17 (2007) 67–99
5. Wiggins, G.: Searching for Computational Creativity. New Generation Computing, Special Issue: Computational Creativity 24(3) (2006) 209–222
6. Wiggins, G.: A preliminary framework for description, analysis and comparison of creative systems. Knowledge-Based Systems 19(7) (2006)
7. Ritchie, G.: Uninformed Resource Creation for Humour Simulation. In: Proceedings of the 5th International Joint Workshop on Computational Creativity, Madrid (2008) 147–151