Theme-Based Cause-Effect Planning for Multiple-Scene Story Generation
Karen Ang, Sherie Yu and Ethel Ong
Center for Language Technologies
College of Computer Studies, De La Salle University
Manila, 1004 Philippines
karenang0903@yahoo.com, sherie_yu@yahoo.com, ethel.ong@delasalle.ph
Abstract
Early literacy in children begins through picture drawing
and the subsequent sharing of an orally narrated
story out of the drawn picture. This is the basis for the
Picture Books story generation system whose motivation
is to produce a textual counterpart of the input picture
in order for the child to associate words with images.
However, stories comprise sequences of
events that occur in a cause-effect loop, and the single-scene
input picture approach may lead to a story whose
event flow may not match the child’s original intended
story. In this paper, we present Picture Books 2, which
provides an environment for a child to creatively define
a sequence of scenes for his input picture and then uses
a theme-based cause-effect planner to generate a fable
story narrating the flow of events across the scenes for
children age 6-8 years old.
Introduction
Storytelling is an important aspect of human life. People
use stories to share knowledge, experiences and ideas. Research
has shown that in their early years, children
draw pictures and then tell stories about them afterwards.
This helps develop their literacy skills and creativity,
as one way of measuring creative thinking abilities
is to assess the articulateness of children in telling stories
through drawings (Torrance, 1977). Furthermore, it has
been found that children recognize pictures more easily
than words (Fields and Spangler 2003).
 Picture Books (Hong et al 2009) is an existing system
that generates fable-form stories for children age 4-6 years
old based on a given single-scene input. This input picture
contains the basic story elements – background, characters
and objects – that are selected by the child from a predefined
list of stickers in the system’s Picture Editor. The
generated stories embody a moral lesson or theme encapsulated
in a plot structure that flows from negative to positive,
where a child violates a stated lesson, experiences the
consequences of such violation, and learns the required
value at the end of the story. The themes are randomly selected
from a list of pre-defined themes associated with the
specified background, while the plot structure follows the
classic story pattern presented by Machado (2003) that
flows from problem, rising action, resolution, to climax.
 Although Picture Books showed the potential for computers
to exhibit creativity in the form of literary art, there
are a number of factors in storytelling that are currently not
supported by the system (Ong 2009). Stories are sequences
of events or scenes, and the single-scene structure of the
system limits the planner on the events that it may generate,
which may not necessarily match the original intent of
the child whilst defining the scene. This makes the generated
story less interesting as it may not adequately capture
the story that was originally conceptualized by the child.
 Picture Books 2 (PB2) extends the first system (hereafter
referred to as PB1) by allowing children, this
time aged 6-8 years, to define multiple scenes which
serve as the input picture to the story planner. Enabling the
children to input several scenes can lead to stories that are
longer and have more complex plots. Computational storytelling
can then be used to enhance the creative abilities of
children, as they fluently elaborate their stories through
connecting sequences of scenes to form a single storyline.
Fluency and elaboration are two measures of creative
thinking abilities as defined in (Torrance, 1977).
 Following PB1, PB2 also provides a set of backgrounds
that the child can select from, and a library of character and object
images (called stamps) that can be pasted onto the selected
background in order to create the scenes. It uses a
theme-based cause-effect planning algorithm to generate
the content of the stories that still promote moral values,
this time set in more adventurous places like the camp or
the street to allow older children to learn to explore the
world and learn life’s lessons on their own.
 In order to facilitate the flow of the story from one scene
to the next, two types of transitions are identified. The existence
transition refers to the presence of a stamp in a particular
scene. The movement transition refers to the
movement or change in position of a particular stamp between
adjacent scenes.
 Another factor is the addition of traits to the characters.
Child educators believe that endowing the animal characters
commonly used in fables with traits (e.g., a
monkey is mischievous, a panda is loyal, and a rabbit is
hardworking) helps children relate to the story
better. Riedl and Young (2004) noted that character believability
is an essential property of narratives because the
events that occur in the story are motivated by the beliefs,
Proceedings of the Second International Conference on Computational Creativity 48
desires and the goals of the characters. Thus, each of the
characters in PB2 also possesses traits that comprise one of
the factors affecting the flow of the story to be generated.
 The rest of this paper is organized as follows. First, the
knowledge base used by the story planner is presented.
This is followed by a discussion of PB2’s architecture,
with emphasis on the planning process. The paper then
presents the results of qualitative and quantitative analysis
performed by linguists on the generated stories, and ends
with a summary of research findings and further work that
can be done to improve the system.
Storytelling Ontology
Storytelling relies on a large body of knowledge about the
story world, character representation (traits, emotions, behavior),
and a causal chain of actions and events. Actions
are performed directly by characters while events occur as
a result of performing some actions, the occurrence of another
event, or as a natural occurring phenomenon.
 The ontology of PB2 stores storytelling knowledge
comprising world concepts and events common in a
child’s everyday life. It contains a network of binary semantic
relations patterned after ConceptNet (Liu and Singh
2004), a free, machine-usable lexical and commonsense
ontological knowledge representation. The PB2 ontology
was then populated with concepts that are suitable for the
target users and relevant to the identified themes.
Among the 20+ relations in ConceptNet, only the CapableOf,
UsedFor, ReceivesAction, EffectOf, HasSubevent,
HasProperty and IsA relations were relevant to PB2. Additional
semantic relations, namely Feels, CausesConflictOf,
LeadsTo, IsTransition, and HasResolution were defined to
support concepts related to scene transitions, character
traits, and theme-based planning. Table 1 describes these
relations and provides examples defined in PB2.
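 A minimal sketch of how such binary relations could be stored and queried is given below. It is not the actual PB2 implementation: the class and method names (Relation, Ontology, objects, subjects) are hypothetical, and only a handful of the example triples listed in Table 1 are loaded.

```python
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One binary semantic relation: concept1 - relation - concept2."""
    concept1: str
    relation: str
    concept2: str


class Ontology:
    """Hypothetical in-memory store, indexed by relation name."""

    def __init__(self, relations):
        self._by_relation = defaultdict(list)
        for rel in relations:
            self._by_relation[rel.relation].append(rel)

    def objects(self, concept1, relation):
        """Every concept2 such that concept1 - relation - concept2 holds."""
        return [r.concept2 for r in self._by_relation[relation]
                if r.concept1 == concept1]

    def subjects(self, relation, concept2):
        """Every concept1 such that concept1 - relation - concept2 holds."""
        return [r.concept1 for r in self._by_relation[relation]
                if r.concept2 == concept2]


# Example triples taken from Table 1 (lower-cased for uniform lookup).
ONTOLOGY = Ontology([
    Relation("character", "CapableOf", "eat"),
    Relation("marshmallow", "ReceivesAction", "eat"),
    Relation("marshmallow", "IsA", "food"),
    Relation("camp", "HasProperty", "far"),
    Relation("eat", "IsTransition", "disappearance"),
    Relation("bring", "IsTransition", "appearance"),
    Relation("scared", "HasResolution", "search"),
])

print(ONTOLOGY.objects("scared", "HasResolution"))
print(ONTOLOGY.subjects("IsTransition", "disappearance"))
```

 Indexing by relation name keeps each lookup limited to the triples of one relation type, which matches how the planner is described as retrieving candidates relation by relation.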
 In order to facilitate the flow of the story from one scene
to the next, the story planner must be able to identify
changes that have occurred between two adjacent scenes.
These transitions are classified into two types: stamp (character
or object) appearance/disappearance, and movement.
Appearance and disappearance are easily determined by
checking if a stamp that is present in one scene is still present
in the subsequent scene. Concepts related to this type
of transition are then modelled in the ontology using the
semantic relation IsTransition. For example, eat – IsTransition
– disappearance associates the action that can cause
an object, such as marshmallow, to disappear across two
adjacent scenes. Similarly, if the marshmallow that was
absent in the first scene appears in the second scene, the
relation buy – IsTransition – appearance is used to model a
possible action necessary for this.
 To model stamp movements, each background image is
divided into a 6x6 grid of cells, as shown in Figure 1. Each cell is
labelled to track the position of a stamp in the background.
A stamp is considered to have moved between two adjacent
scenes if its position label in the first scene is different
in the next scene. An example concept for this type of transition
is walk – IsTransition – movement.
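 The two transition types can be illustrated with a short sketch. It assumes that each scene is represented as a mapping from stamp name to grid label; the function name detect_transitions and this representation are hypothetical, and the grid label given to the flashlight is invented for the example.

```python
def detect_transitions(prev_scene, next_scene):
    """Classify what happened to each stamp between two adjacent scenes.

    Each scene maps a stamp name to its grid label. The result maps each
    stamp to "appearance", "disappearance", "movement", or None.
    """
    transitions = {}
    for stamp in sorted(set(prev_scene) | set(next_scene)):
        if stamp not in prev_scene:
            transitions[stamp] = "appearance"
        elif stamp not in next_scene:
            transitions[stamp] = "disappearance"
        elif prev_scene[stamp] != next_scene[stamp]:
            transitions[stamp] = "movement"
        else:
            transitions[stamp] = None  # no transition
    return transitions


# Grid labels: the value 30 for the flashlight is an assumed position.
scene0 = {"Danny": 25, "marshmallow": 25}
scene1 = {"Danny": 28, "flashlight": 30}

print(detect_transitions({}, scene0))      # first scene: every stamp appears
print(detect_transitions(scene0, scene1))
```

 Each detected transition could then be matched against IsTransition relations in the ontology to find a candidate action, such as eat for a disappearance.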
PB2 also recognizes six traits – responsible, honest, brave,
helpful, obedient, and persevering. Each character is
assigned three positive and three negative traits.
Relation | Definition | Example
Feels | Denotes the emotional response of the character to an event. | Character – Feels – Sad
CausesConflictOf | Used for selecting a story theme (conflict) based on the character’s negative trait. | Brave – CausesConflictOf – Scared
LeadsTo | Used to associate an object to a theme (or conflict). | Flashlight – LeadsTo – Scared
IsTransition | Used to associate an action to a transition. | Eat – IsTransition – Disappearance; Bring – IsTransition – Appearance
HasResolution | Used to determine the appropriate resolution for a conflict. | Scared – HasResolution – Search
CapableOf | Represents an action that a character can execute in the story. | Character – CapableOf – Eat
UsedFor | Represents an activity that can take place in a specified location. | Camp – UsedFor – Camping
ReceivesAction | Relates an action that can be performed on an object. | Marshmallow – ReceivesAction – Eat
EffectOf | Provides a causal chain relationship between two events. | Tired – EffectOf – Sleep
HasSubevent | Specifies an event that can occur before another event. | Eat – HasSubevent – Cook
HasProperty | Specifies an adjective to describe a noun. | Camp – HasProperty – Far
IsA | A generalized concept of a noun. | Marshmallow – IsA – Food
Table 1. Semantic Relations in the Ontology of PB2
Figure 1. Grids in the Camp Background
System Architecture
Picture Books 2 has four main modules, namely, the Story
Editor, the Story Planner, the Sentence Planner, and the
Story Generator. This is depicted in Figure 2.
Figure 2. System Architecture
 The Story Editor allows the child to choose from a predefined
list of background, character and object stamps.
Currently, there are four backgrounds (camp, street, park
and classroom), four characters (dog, pig, hippo and rabbit),
and 16 objects to choose from. The available objects
vary depending on the selected background. Each story can
have only one background for all its scenes, one character,
and up to four objects. An input picture is required to have
a minimum of three scenes to depict the initial setting, the
problem phase, and the resolution phase of the story.
An abstract representation of the input picture that is
forwarded to the Story Planner is shown in Figure 3. It
shows that the input picture contains three scenes. The first
scene (scene 0) contains a character stamp named Danny
who appeared in this scene and is at grid 25, and an object
stamp named marshmallow which also appeared in this
scene and is also at grid 25. The second scene shows a
movement of the character stamp from grid 25 to 28, and
the appearance of another object, the flashlight. The
marshmallow is assumed to have disappeared as it is not in
the scene anymore. The value null signifies no transition.
Story Planner
The Story Planner produces a story plan comprising
semantic relations retrieved from the ontology. These semantic
relations represent the progression of the story
through a causal chain of character actions and events that
will lead the main character to overcome one of his/her
negative traits. The planner works by considering the traits
of the character, the objects present in the scenes, and the
scene transitions.
A theme depicting the conflict of the story is selected
based on the main character’s non-traits and objects present
in the conflict (or middle) scene. Candidate conflicts
are retrieved from the ontology using the relation
CausesConflictOf. Table 2 presents some character trait
concepts and their associated conflict concepts. Note that
each binary relation means that the absence of the trait
concept in the character (e.g., not brave) leads to a story
with the stated conflict (scared).
Figure 3. Abstract Story Representation
Concept 1 Relation Concept 2
Brave CausesConflictOf Scared
Responsible CausesConflictOf Lose
Obedient CausesConflictOf Disobey
Table 2. CausesConflictOf Relations between Non-Character-Trait
and Conflict concepts
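 As a rough sketch of this theme selection step, the snippet below collects candidate conflicts from the traits a character lacks and prefers those that an object in the conflict scene also leads to. The relation data comes from Tables 1 and 2; the function name candidate_conflicts and the fallback to all trait-based candidates when no object supports any of them are assumptions, since the paper does not state how ties or mismatches are handled.

```python
# Trait -> conflict, read as: lacking the trait leads to the conflict (Table 2).
CAUSES_CONFLICT_OF = {
    "brave": "scared",
    "responsible": "lose",
    "obedient": "disobey",
}

# Object -> conflict (the LeadsTo example from Table 1).
LEADS_TO = {"flashlight": "scared"}


def candidate_conflicts(missing_traits, conflict_scene_objects):
    """Return conflicts implied by missing traits, preferring those that an
    object in the conflict scene also points to (assumed tie-breaking rule)."""
    by_trait = [CAUSES_CONFLICT_OF[t] for t in missing_traits
                if t in CAUSES_CONFLICT_OF]
    by_object = {LEADS_TO[o] for o in conflict_scene_objects if o in LEADS_TO}
    supported = [c for c in by_trait if c in by_object]
    return supported or by_trait


print(candidate_conflicts(["brave", "obedient"], ["flashlight"]))
print(candidate_conflicts(["responsible", "obedient"], ["marshmallow"]))
```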
 The setting of the story is based on the background and
the selected theme. This includes the time when the story
takes place and the adjective to be used in describing the
background. For instance, given the theme of a character
learning to be brave that is set in the camp background, the
most likely story will be that the character is scared of the
dark, and thus, the time should be set to evening.
 The planner generates the chain of events in the story by
taking the background adjective as the root node and finding
a path in the ontology to connect this to the identified
conflict. Table 3 presents some adjective concepts associated
to backgrounds through the HasProperty relation.
Concept 1 Relation Concept 2
Camp HasProperty Far
Camp HasProperty Crowded
Park HasProperty Clean
Class HasProperty Quiet
Table 3. HasProperty Relations for Backgrounds
Event generation also considers the possible events that
may happen given the transitions between scenes. Furthermore,
an event can be considered in the story if the
character is capable of doing the associated action and the
object required for its performance is present in the scenes.
Other events may also require a specific location.
Once these preconditions are met, the planner finds a
causal chain of events from the root node (background
adjective) to the target node (conflict) using EffectOf relations.
Table 4 presents EffectOf relations showing the
causal link between concepts, starting from the background
adjective far, leading to its effect, e.g., tired. Tired, in turn,
may necessitate the character to eat. The chain continues
until the concept node matching the identified conflict,
e.g., scared, has been reached.
Concept 1 Relation Concept 2
Tired EffectOf Far
Eat EffectOf Tired
Sleepy EffectOf Eat
Sleep EffectOf Sleepy
Hear EffectOf Sleep
Scared EffectOf Hear
Table 4. Chain of EffectOf Relations
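The search over EffectOf relations can be sketched as a depth-first enumeration of cause-to-effect paths. The relation pairs are those of Table 4; the function names build_cause_graph and causal_paths are hypothetical, and enumerating every simple path is an illustrative choice rather than the planner's documented behaviour.

```python
# (effect, cause) pairs, read as in Table 4: "Tired EffectOf Far".
EFFECT_OF = [
    ("tired", "far"),
    ("eat", "tired"),
    ("sleepy", "eat"),
    ("sleep", "sleepy"),
    ("hear", "sleep"),
    ("scared", "hear"),
]


def build_cause_graph(effect_of):
    """Index the relations so that each cause maps to its possible effects."""
    graph = {}
    for effect, cause in effect_of:
        graph.setdefault(cause, []).append(effect)
    return graph


def causal_paths(graph, root, target, path=None):
    """Return every cycle-free chain of concepts leading from root to target."""
    path = (path or []) + [root]
    if root == target:
        return [path]
    paths = []
    for effect in graph.get(root, []):
        if effect not in path:  # avoid revisiting a concept
            paths.extend(causal_paths(graph, effect, target, path))
    return paths


graph = build_cause_graph(EFFECT_OF)
print(causal_paths(graph, "far", "scared"))
```

An empty result corresponds to the case where no story can be generated for the chosen background adjective and conflict.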
If a path is found, the planner also checks for candidate
sub-events that can occur in order to increase the length of
the story. Currently, only one sub-event is included in the
story plan.
Figure 4 illustrates a sample causal chain of events based
on the relation EffectOf. The orange-colored nodes denote
concepts found through the HasSubevent relation. For example,
the sleep concept has the sub-events pray, comb,
and brush. The dark colored nodes are the root node and
the target node representing the background adjective and
the theme or conflict of the story, respectively. A possible
story path is: “crowded-bump-dizzy-pray-sleep-hear_sound-scared”.
Figure 4. Sample Causal Chain of Events
In order to achieve variances in the generated stories
based on the same input picture, the planning algorithm
uses a simple random selection approach to identify the
nodes to be included in the chain of events. Future work on
PB2 should consider having a scoring function to guide the
planning process.
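The random selection and the single sub-event can be sketched as follows. The cause graph contains only the edges implied by the sample path quoted from Figure 4, and the sub-events are those named for the sleep concept; the function name random_story_path, the step limit, and giving up at a dead end (rather than backtracking) are assumptions.

```python
import random

# Edges implied by the sample path from Figure 4 (cause -> effects).
CAUSE_GRAPH = {
    "crowded": ["bump"],
    "bump": ["dizzy"],
    "dizzy": ["sleep"],
    "sleep": ["hear_sound"],
    "hear_sound": ["scared"],
}

# HasSubevent: event -> sub-events that can occur before it.
SUB_EVENTS = {"sleep": ["pray", "comb", "brush"]}


def random_story_path(graph, sub_events, root, target, rng, max_steps=20):
    """Walk from root to target choosing each next node at random, then
    insert one randomly chosen sub-event before one event that has any."""
    path = [root]
    while path[-1] != target:
        options = [n for n in graph.get(path[-1], []) if n not in path]
        if not options or len(path) > max_steps:
            return None  # dead end: this sketch gives up instead of backtracking
        path.append(rng.choice(options))
    expandable = [n for n in path if sub_events.get(n)]
    if expandable:
        event = rng.choice(expandable)
        path.insert(path.index(event), rng.choice(sub_events[event]))
    return path


rng = random.Random(0)  # seeded only to make the example repeatable
print("-".join(random_story_path(CAUSE_GRAPH, SUB_EVENTS, "crowded", "scared", rng)))
```

Replacing rng.choice with a selection weighted by a scoring function would be one way to realise the guided planning suggested as future work.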
Event generation is repeated in order to find a path
from conflict to its possible resolutions. Table 5 presents
the relationships between conflict concepts and resolution
concepts. For example, if a character is scared, then the
resolution phase of the story should involve actions requiring
the character to search for the causes that lead to his
being scared, e.g., what is making the sound in the night?
Concept 1 Relation Concept 2
Scared HasResolution Search
Lose HasResolution Admit
Disobey HasResolution Apologize
Table 5. HasResolution Relations between Conflict and
Resolution concepts
The Sentence Planner produces character goals (Uijlings
2006) by aggregating two or more consecutive semantic
relations using discourse markers. A character goal represents
one sentence in the final story. Figure 5 shows a
sample output of character goals for the first scene in the
abstract story representation in Figure 3. For each character
goal entry, agent is the actor, art(n) is the article to be
used, verb is the action to be performed, patient is the receiver
of the action, rst:n specifies the type of discourse
marker to be used, type signifies if the sentence will be
joined with another sentence, and tense is the verb’s tense.
Based on feedback from the linguist, children’s stories are
usually written in past tense.
Figure 5. Sample Output Character Goals
The Sentence Planner also lexicalizes concepts and generates
a set of sentence specifications, which is then forwarded
to the Story Generator to produce the surface text
with the use of an external surface realizer, SimpleNLG
(Venour and Reiter 2008).
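The aggregation of two events into one sentence can be illustrated with a simplified sketch. The CharacterGoal fields are a reduced, hypothetical version of the entries described above (agent, article, verb, patient), verbs are assumed to be already inflected for past tense, and plain string templates stand in for the external surface realizer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CharacterGoal:
    """Simplified character goal; field names are illustrative only."""
    agent: str
    verb: str                      # assumed to be already in past tense
    patient: Optional[str] = None
    article: Optional[str] = None


def clause(goal, pronoun=None):
    """Render one goal as a clause, optionally replacing the agent."""
    words = [pronoun or goal.agent, goal.verb]
    if goal.article:
        words.append(goal.article)
    if goal.patient:
        words.append(goal.patient)
    return " ".join(words)


def aggregate(first, second, marker, pronoun="he"):
    """Join two consecutive goals into one sentence with a discourse marker."""
    return f"{clause(first)}, {marker} {clause(second, pronoun)}."


ate = CharacterGoal(agent="Danny the dog", verb="ate",
                    patient="marshmallow", article="a")
felt = CharacterGoal(agent="Danny the dog", verb="felt", patient="sleepy")

print(aggregate(ate, felt, "and thus"))
```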
Results and Analysis
Ten sets of stories were given to three evaluators:
two linguists and one storywriter. These evaluators were
chosen to provide expert judgement on the quality of the
generated stories in terms of linguistic structure, narrative
content, and appropriateness for the target audience. No
feedback was solicited from the intended audience themselves
at the time of this writing, because as the results
below will show, the system still needs major work in its
planning algorithm and knowledge representation in order
to produce stories that the children may truly appreciate.
 The stories were generated from an ontology that contains
1,002 concepts and 1,442 semantic relations. The
lexicon has been populated with 769 terms. The evaluation
was performed twice; after the first evaluation, PB2 was
revised to address some of the feedback, then ten stories
were regenerated to undergo a second evaluation.
 Four criteria were used: language; coherence and cohesion;
character, objects and background; and content, each
of which has a set of associated questions that are rated by
the following scores: 5-strongly agree, 4-agree, 3-neutral,
2-disagree, 1-strongly disagree. Table 6 shows the results.
Criterion First Second
Language 4.01 4.43
Coherence and Cohesion 3.28 3.66
Character, Objects and Background 4.02 4.02
Content 3.64 3.76
Overall Rating 3.74 3.96
Table 6: Summary of Quantitative Evaluation
 The language criterion deals with the correctness of the
sentence structure and appropriateness of the words used.
Also included are the proper usage of articles, pronouns
and punctuation marks. Here, PB2 received an average
score of 4.43 after the second evaluation since the English
grammar rules and the lexicon used during sentence generation
were defined specifically with the target users in
mind. During revision, rules related to correct usage of
pronouns and articles provided by the linguists were implemented.
However, there are still cases of incorrect usage
as shown in the examples below where there is an incorrect
usage of the article and a missing article.
Output: She spilled a juice.
Correct: She spilled juice.
Output: She played a game in park.
Correct: She played a game in the park.
The coherence and cohesion criterion is concerned with
the transition of events and the flow of the sentences, to
evaluate if the generated story makes sense and is easy to
understand. Coherence between sentences can be enhanced
through the use of discourse markers (Mann and Thompson
1988). Taylor (2009) provided a list of common discourse
markers, and those appropriate for elementary-age
children are presented in Table 7.
Used to signal | Transition Word
Addition | Also, again, and, besides
Time | After, before, during, later, now, then
Cause or Reason | Because, since
Effect | Because, hence, so, thus
Direction | Above, behind, below, between, near
Summary | So, thus
Table 7: Common Discourse Markers for Children
 PB2 received the lowest average score of 3.66 in this
criterion because although the stories contained discourse
markers, these are sometimes used inappropriately with
respect to the context of the sentence, as shown below.
Output: Danny the dog ate a marshmallow, thus he
felt sleepy.
Correct: Danny the dog ate a marshmallow, and thus
he felt sleepy.
 On the other hand, the absence of discourse markers
resulted in the generation of choppy sentences.
Output: He slept in tent. He heard a sound.
Correct: He slept in tent. While he was sleeping, he
heard a sound.
 Because the planner utilizes a random selection approach
and does not perform reasoning over the resulting
path of semantic relations, there are also cases in which the
generated story is not logical.
He brought a blue water jug.
The camp was very far.
He felt tired.
He felt thirsty.
 The characters, objects and background criterion examines
the appropriate interplay between the character, object
and background elements of the story with the story itself.
This includes checking for the incidence of character traits
and moral lesson, and the appropriateness of the objects
with respect to the chosen background. The system received
an average score of 4.02 for both rounds of evaluation.
There are instances wherein the objects placed in the
input scenes are not included in the generated story. There
are also instances where the object, although introduced in
the story, does not play any role in the story. The excerpt
below illustrates this.
[1] It was a fine evening.
[2] Danny the dog was in the camp for a trip.
[3] He buys a packed marshmallow.
[4] The camp is very big.
…
[5] He sees a shadow.
[6] Danny the dog feels scared.
[7] He does not know what to do.
…
[8] Since then, He learns to be brave.
 The given story excerpt was generated from an input
picture comprising three scenes. In the first scene, a
marshmallow object has been included and introduced in
line [3]. However, it plays no part towards the development
of the theme where the main character has to learn to
be brave. In fact, aside from line [3], no other text in the
story mentioned the marshmallow again.
 The overall content of the story includes evaluating the
appropriateness of the story to the target age group, the
adequacy of details provided, as well as the believability of
the events in the story. The evaluators noted that the generated
stories follow the basic structure of a children’s story.
They also found these stories to be quite interesting due to
the interplay of the conflict and resolution to the theme of
the story as well as the chain of events.
Conclusion
Picture Books 2 demonstrated that a coherent story with
the four basic classic story subplots of Machado (2003) can
be generated from a given input picture with at least three
scenes. This is achieved by a theme-based cause-effect
planner that utilizes an ontology of semantic domain and
narrative knowledge, and a sentence planner that utilizes
discourse markers to connect two or more events together.
 The story planner provided a mechanism to control the
sequencing of events to adhere to the basic story plot suitable
for children’s stories, but also allowed for flexibilities
and variances in the generated stories. This is done by
manually populating the semantic ontology with the relevant
binary relations and concepts. However, the population
should be done with caution. Based on tests conducted,
overpopulation can lead to illogical story paths, while underpopulation
can prevent the planner from generating any story.
 Because the ontology representation makes use of binary
relations, this can lead to logical errors in the resulting stories.
An instance of this is the relation “dizzy – EffectOf –
see” and “people – ReceivesAction – see”, which logically
means that if the character sees many people he or she
feels dizzy. However, given that there is also a relation
“marshmallow – ReceivesAction – see”, this resulted in a
story text where a character feels dizzy because he or she
saw a marshmallow, which makes no sense at all.
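The problem can be reproduced in a few lines. The first function joins the two binary relations on the action name alone, which is all that binary relations permit; the second uses a relation qualified by the object, which is only one hypothetical remedy and not something implemented in PB2. All function names are illustrative.

```python
# Binary relations quoted above: (effect, cause) and (object, action).
EFFECT_OF = [("dizzy", "see")]
RECEIVES_ACTION = [("people", "see"), ("marshmallow", "see")]

# Hypothetical ternary form: the effect holds only for a specific object.
QUALIFIED_EFFECT_OF = [("dizzy", "see", "people")]


def naive_effects(obj, action):
    """Join on the action only: the object plays no part in the effect."""
    if (obj, action) not in RECEIVES_ACTION:
        return []
    return [effect for effect, cause in EFFECT_OF if cause == action]


def qualified_effects(obj, action):
    """Require that the effect is recorded for this particular object."""
    return [effect for effect, cause, target in QUALIFIED_EFFECT_OF
            if cause == action and target == obj]


print(naive_effects("marshmallow", "see"))      # the illogical inference
print(qualified_effects("marshmallow", "see"))  # no effect recorded
print(qualified_effects("people", "see"))
```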
Even though both PB1 and PB2 utilize a semantic ontology,
the story planner of PB1 has a set of predefined planning
operators in the form of author goals (Hong et al
2009; Cua et al 2010) similar to Minstrel (Turner 1992)
that represent high-level tasks to guide the construction
process by focusing on the narrative structure of the story.
The author goals in turn are divided into two or more character
goals (Uijlings 2006), again predefined, and these
two types of goals are used to constrain the stories being
generated. The ontology is used only to provide information
needed to fill in the attributes in the character goals in
a theme-driven story plot template (Ong 2010). This generated
stories that are of good quality, coinciding with findings
of Peinado and Gervas (2006) that “ontology-based
stories obtain good results on coherence because the ontology
forces explicit links between events”.
 PB2, on the other hand, did away with predefined author
goals while its character goals are generated dynamically
based on the semantic relations found in the story path
retrieved by the planner. The development of a reasoning
engine that contains rules for checking the logical inconsistencies
and performing inferencing on the set of candidate
story paths should be explored to address the issue on resolving
ambiguities and generating logical stories.
 The development of a more comprehensive model for
representing the current state of the world and the changes
that had already taken place, such as previous actions of
the character, previous events that have taken place, and
changes in the objects that are in the character’s possession
or in the story world, should also be explored to address
these issues and enable the system to consistently generate
story events that are logical and believable.
 Finally, feedback should be solicited from the intended
users (the children) to provide an assessment on the effectiveness
and usability of the system in supporting their
creative expression, the degree of consistency of the actual
story to the target story conceived through the input picture,
and the ability of the generated story to capture and
retain their attention.
References
Cua, J., Manurung, R., Ong, E., and Pease, A., 2010. Representing
Story Plans in SUMO. Proceedings of the
NAACL-HLT 2010 Workshop on Computational Approaches
to Linguistic Creativity, 40-48, June 2010, Los
Angeles, California: ACL.
Fields, M. and Spangler, K., 2000. Let's Begin Reading
Right: A Developmental Approach to Emergent Literacy.
Upper Saddle River, N.J.: Merrill.
Hong, A.J., Solis, C., Siy, J.T., Tabirao, E., and Ong, E.,
2009. Planning Author and Character Goals for Story Generation.
Proceedings of the NAACL Human Language
Technology 2009 Workshop on Computational Approaches
to Linguistic Creativity, 63-70, June 2009, Colorado, ACL.
Liu, H., and Singh, P., 2004. ConceptNet — A Practical
Commonsense Reasoning Tool-Kit. BT Technology Journal,
22(4): 211-226. Netherlands: Springer.
Machado, J., 2003. Storytelling. Early Childhood Experiences
in Language Arts: Emerging Literacy, 304-319.
Clifton Park, New York: Thomson/Delmar Learning.
Mann, W., and Thompson, S., 1988. Rhetorical Structure
Theory: Towards a Functional Theory of Text
Organization. TEXT, 8(3):243-281.
Ong, E., 2009. Prospects in Creative Natural Language
Processing. Proceedings of the 6th National Natural Language
Processing Research Symposium, Center for Language
Technologies, September 2009, Manila, Philippines:
De La Salle University.
Ong, E., 2010. A Commonsense Knowledge Base for Generating
Children's Stories. Proceedings of the 2010 AAAI
Fall Symposium Series on Common Sense Knowledge, 82-
87, November 2010, Virginia, USA: AAAI.
Peinado, F., and Gervas, P., 2006. Evaluation of Automatic
Generation of Basic Stories. New Generation Computing,
24(3):289-302, Ohmsha, Ltd. and Springer.
Riedl, M. O. and Young, R. M., 2004. An Intent-Driven
Planner for Multi-Agent Story Generation. Proceedings of
the Third International Joint Conference on Autonomous
Agents and Multi-Agent Systems, 186-193, Washington,
DC, USA: IEEE Computer Society.
Taylor, 2009. Elementary transition words.
http://www.greenville.k12.sc.us/taylorse/Taylorsy.
Torrance, E.P., 1977. Creativity in the Classroom: What
Research Says to the Teacher. National Education
Association, Washington, D.C.
Turner, S.R., 1992. Minstrel: A Computer Model of Creativity
and Storytelling. University of California, Technical
Report CSD920057.
Uijlings, J.R.R., 2006. Designing a Virtual Environment
for Story Generation. Master’s Thesis, University of Amsterdam,
The Netherlands.
Venour, C. and Reiter, E., 2008. A Tutorial for Simplenlg.
http://www.csd.abdn.ac.uk/~ereiter/simplenlg