Steps Toward the AIR Toolkit: An Approach to Modeling Social Identity Phenomena in Computational Media

D. Fox Harrell, Ph.D., Greg Vargas, Rebecca Perry
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139 USA
{fox.harrell, gvargas, rebperry}@mit.edu

Abstract

The Advanced Identity Representation (AIR) Project is a new interdisciplinary approach to the problem of designing identity technologies to enable imaginative self-representations for users by implementing dynamic social identity models grounded in computing and cognitive science. AIR Project research develops models of social computational identity (e.g., characters, avatars, and social networking profiles) to enable user representations that dynamically change in response to context and use, and to implement an identity modeling toolkit for constructing cross-application self-representations. This paper reports on the developing AIR Toolkit’s support for modeling social identity phenomena in which single users deploy multiple self-representations (avatars, characters, or profiles) for different purposes.

Introduction

Computational media have transformed the creation and representation of human identities. Understanding identity representation as both a creative and a computational act can inform development of technologies to enhance how identities are enacted as social and technical practices, particularly in videogames and social networks. Human-centered computing researchers have tended to focus on issues such as user and task analyses, cooperation, and usability (e.g., Muramatsu and Ackerman 1998; Suchman 1987). In contrast, humanists and social scientists have often investigated identity-inflected issues such as power, class, stigma, racism, sexism, and related themes (Nakamura 2002, 2008; Nelson and Tu 2001; Waggoner 2009), exposing identity as a dynamic, creative feat of self and social construction.
Game studies scholar Zach Waggoner (2009) describes identity creation as an unfolding process of self-representation that takes place in the creative liminal space between the user and the videogame avatar, that is, between the embodied materiality of the player and the imagination. Social scientist Sherry Turkle’s (2004) studies of membership in multiple communities have revealed that users often experience a sense of “cycling through” different selves. Expression of multiple selves is intrinsic to everyday human creativity. Indeed, in his seminal work Erving Goffman (1959) described a negotiation between the socially constructed, public performance of the self and the desired inner self: a complex, creative social and imaginative act. Informed by such perspectives, we take the view here that creation and maintenance of computational identities is, in part, an active creative feat of imaginative cognition. Furthermore, social categories are often aspects of identity that are reified in computational systems. Hence, we focus on a cognitive science perspective on categorization that highlights its imaginative nature and basis in cognitive mechanisms for metaphorical and metonymic mapping.

The Advanced Identity Representation (AIR) Project consists of developing new technologies informed by categorization and classification theories from cognitive science and sociology (Harrell 2009). We are developing a toolkit that can take data-structures for characters in games or profiles in social networks and use them to model social phenomena such as presenting oneself differently to different groups, becoming a member of a group, or passing as a member of another group.
This is accomplished through performing operations such as finding analogically matching profile/character data-structures, adapting them to different social categories, forming new categories based on analogical relationships between individuals, revealing or simulating stereotypical categories at the data-structural level, and more. Hence, we address a computationally reified, reductive form of identity, but do so: (1) as a critical technical practice (Agre 1997) aware of the aspects of identity that are not computational, and (2) recognizing that this reduction has already taken place “in the wild,” as users have built identities already encoded as data-structures.

Theoretical Framework

Technical Components of a Sociodata Ecology

Computational identity systems, e.g., social networking profiles, online accounts, and avatars/characters, are implemented using a limited and often overlapping set of components.

Proceedings of the Second International Conference on Computational Creativity 147

Figure 1: Shared technical underpinnings of computational identity applications

There are two important motivations for describing these components: (1) identifying an appropriate level of abstraction for analyzing the technical side of computational representations comparatively across different types of applications, and (2) identifying components that can be analyzed both in terms of how they appear visually and how they are implemented algorithmically and data-structurally. Figure 1 describes the six components that comprise the majority of widely used computational identity technologies (Harrell 2009). This paper focuses on support for components at levels 4 and 5 (statistical/numerical representation and formal annotation).
These underpinnings exist in a sociodata ecology (Harrell 2010), wherein technical infrastructure, data-structures and algorithms, and code are looked at as they relate to issues such as embodied experiences, subjective interpretations, power relationships, and cultural values.

Cognitive Model of Computational Identity

The AIR Project approach begins with the basic cognitive building blocks of identity upon which social identity categories are built. Cognitive scientists have proposed that human conceptual categories form “idealized cognitive models” (ICMs) upon which categories of objects in the world are built (Lakoff 1987). Social networking sites explicitly group users into categories called “friends,” while games may group users into categories called elves or half-orcs. These categories may also manifest implicitly; for example, Eric Gilbert and Karrie Karahalios’s (2009) metric for “tie strength” determines “friendliness between” users evidenced through use of the system. Yet, most computational user categorizations invoke much less robust models. Technical infrastructures may implement (often incorrect) stigmatizing identity classification models (Bowker and Star 1999; Goguen 1997); indeed, some games feature data-structures instantiated with values where some races/genders are less intelligent than others. Cognitive science theory is presented below to provide models that can help explain how users project their identities onto their computational surrogates (Gee 2003).

Cognitive Categorization

The AIR approach is influenced by the prototype theory of Eleanor Rosch and work in categorization by George Lakoff (1987). Lakoff describes a metonymy/metaphor-based account of how imaginative extensions of “prototype effects” result in several phenomena of social identity categorization that have proven useful for the AIR Project:
• Representatives (prototypes): “best example” members of categories;
• Stereotypes: normal, but often misleading, category expectations;
• Ideals: culturally valued categories even if not typically encountered; and
• Salient Examples: memorable examples used to understand/create categories.
Since the AIR Project technology involves techniques to formalize and implement ICMs as computational data-structures, identity phenomena become amenable to algorithmic manipulation and experimentation.

Conceptual Blending and Multiple Selves

Learning scientist James Gee’s concepts of the real, virtual, and projective identities in games provide a useful starting point for thinking about how embodied identity experiences and values in the real world intersect with the affordances and semiotic values of computational representations (Gee 2003). For Gee, player representations as projected identities manifest the ways that real player values are reconciled with values understood as associated with avatars. The AIR Project approach emphasizes projected identity (Corneliussen and Rettberg 2008). Using cognitive science terminology, this can be seen as metaphorically mapping ICMs (mental spaces) that humans have of themselves onto characters, or, to use terminology from Gilles Fauconnier and Mark Turner’s (2002) conceptual blending theory, as selectively projecting aspects from conceptualizations of both a real identity and a virtual identity into a blended identity.
Examples of blended identities include the venerable notion of double-consciousness, the dual awareness of a person from a marginalized or oppressed group’s self-conception and the social stigma attributed to the social group (Du Bois 1903), and identity torque, the often psychologically painful experience of a person’s self-conception differing from the stigmatized perceptions reinforced by classification infrastructure (Bowker and Star 1999). The notion of blended identities is central here because it informs the idea that a single user can have multiple identities depending on the elements being projected.

Implementation and Findings

We have developed a model of multiple user identity data-structures and ways of displaying the contents of those data-structures via a GUI. For example, a profile on the social networking site Facebook consists of structured data indicating friends, items a user likes, personal information (such as gender or location), etc. This can be represented as a graph in which items and attributes are nodes that are connected to users by relations such as ‘like’ or ‘friend.’ Some of these may also include numerical statistics such as integers for age (see Figure 2).

Figure 2: A subgraph of a Facebook profile

Figure 3: A subgraph of a role-playing game character

In such a profile, the number of friends and pages for a typical user may reach the hundreds or thousands, resulting in interesting graph structures to analyze. Similarly, for a character in a game (especially role-playing games in which character creation is a primary focus) a graph can be used to represent stats (numerical values for gameworld attributes like intelligence or dexterity), skills, race, class, gender, etc. (see Figure 3).
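The graph representation described above can be illustrated with a minimal sketch. It assumes a simple edge-set structure; the class name `IdentityGraph`, its methods, and all sample users, items, and values are hypothetical and are not taken from the AIR Toolkit itself.

```python
from collections import defaultdict


class IdentityGraph:
    """A labeled graph of (subject, relation, object) edges plus numeric stats.

    Hypothetical structure for illustration; not the AIR Toolkit's own class.
    """

    def __init__(self):
        self.edges = defaultdict(set)    # subject -> {(relation, object), ...}
        self.stats = defaultdict(dict)   # subject -> {stat name: number}

    def relate(self, subject, relation, obj):
        """Connect a user or character to an item or attribute node."""
        self.edges[subject].add((relation, obj))

    def set_stat(self, subject, name, value):
        """Attach a numerical statistic such as age or intelligence."""
        self.stats[subject][name] = value

    def features(self, subject):
        """Return the subject's (relation, object) pairs in sorted order."""
        return sorted(self.edges.get(subject, set()))

    def shared(self, a, b):
        """Return the (relation, object) pairs two subjects have in common."""
        return sorted(self.edges.get(a, set()) & self.edges.get(b, set()))


graph = IdentityGraph()

# A social networking profile (all names and values are invented).
graph.relate("user_a", "like", "jazz")
graph.relate("user_a", "friend", "user_b")
graph.relate("user_a", "location", "Cambridge")
graph.set_stat("user_a", "age", 29)

graph.relate("user_b", "like", "jazz")
graph.relate("user_b", "location", "Boston")

# A role-playing game character stored in the same structure.
graph.relate("thalia", "race", "elf")
graph.relate("thalia", "class", "ranger")
graph.set_stat("thalia", "intelligence", 14)

print(graph.features("thalia"))
print(graph.shared("user_a", "user_b"))
```

Storing profiles and characters in one structure is what lets the same comparison operations apply to both kinds of representation.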
Despite their differing structures, the similarities in these representations at the abstract data-structural level have allowed us to consider how multiple representations (or views on representations) can reflect identity phenomena from the real world such as self-presenting differently in different communities, attempting to “pass” as a member of another community, or being a central or marginal member of a community. In games, multiple representations can be used to implement phenomena such as critically modeling stereotyping (by making non-player characters uniformly respond to characters based on some subgraph of elements rather than the full graph), developing emergent profession/class models rather than top-down designations, and decoupling real world racial, ethnic, and gender categories from game mechanics-oriented numerical statistics for combat and exploration of game worlds. Toward this end, our models support implementation of:
• Multiple Identities based upon:
  o adding to, subtracting from, or reorganizing the graphs described above; this can be used to automatically customize a user’s profile/character, or view of a profile/character, based upon who the profile/character is presented to
  o users explicitly creating multiple profiles (or views of a single profile/character) based on privacy settings or membership in different groups
• Identity Categories emerging from finding clusters of users with analogous graphs
• Prototypical Members of categories based upon maximizing analogy with other users
• Critical Attributes: profile/character attributes that are most telling in revealing analogy with other users
It is not clear that only manipulating these data-structures provides the necessary affordances for modeling real world identity experiences adequately. Further development may require augmenting these structures with metadata indicating salience of particular attributes or additional attributes.
It will also require study of how users take up and deploy the data-structures beyond technical affordances of the systems (e.g., chatting in virtual worlds or flat text descriptions of characters in games). However, our model does introduce an extensible set of features to allow system designers to implement the semantics of social identity phenomena rather than hard-coding racism as social critique (as in the game Dragon Age’s portrayal of racism against elves) or simplistic models of group membership such as the opt-in/opt-out model in Facebook. In future AIR Project development, phenomena such as stereotyping, marginalization, naturalizing in communities, and stigmatization will be addressed.

Technology Development

There have been two main thrusts of technology development. These are:
(1) AIR Toolkit Development
(2) Application Development and Deployment (assessing popular software systems to use the AIR Toolkit with and deploying the toolkit in those systems)

Regarding (1), we are currently developing an interface, implemented in Python, capable of comparing and adapting user profiles. This interface is agnostic toward applications (it can be applied to games and social networking applications alike) and is agnostic toward algorithms used for comparing users. Initially, comparison is being done using a system called AnalogySpace developed by the Commonsense Reasoning research group led by Henry Lieberman at the MIT Media Lab (Speer, Havasi, and Lieberman 2008). We also have been considering using the Structure Mapping Engine developed by Ken Forbus, Dedre Gentner, Ron Ferguson, and others at Northwestern University (Ferguson, Forbus, and Gentner 1997; Forbus 2001; Gentner 1983). Finally, we also have considered using a matching algorithm developed in (Chow and Harrell 2009; Harrell 2010).
Aside from potentially varying in effectiveness, these different approaches require differing amounts of background knowledge and may be more or less useful for particular applications.

Regarding (2), we have deployed the toolkit to implement multiple identity representations, categories, and comparisons in Facebook. Before selecting Facebook for our initial deployment, we assessed popular systems used in both social networking and gaming in order to determine which would be optimal for initially testing the system.

Toolkit API

We are designing an API for the basic functionality of the toolkit. The current AIR Toolkit iteration uses Facebook's Graph API to download information about the user and his/her friends including profile information, friends, and likes. The toolkit then creates a large, sparse n × n matrix and performs a truncated Singular Value Decomposition (SVD) using the Divisi library from AnalogySpace. It offers functions for the following purposes (using the term “object” to refer to a profile or character structure):
• Finding Similar Objects: The truncated SVD approximates dot products between each pair of objects. These approximated dot products are used as a similarity metric and the toolkit can return the objects most similar to a given object.
• Predicting Features: The truncated SVD has a “smoothing” effect on the values in the matrix in a way that makes it useful for making inferences. The toolkit can use this to calculate the likelihood of a particular feature belonging to an object, whether or not it was represented in the original graph, as well as return the top predictions.
• Projecting one object onto another: The toolkit can return a view of a particular object filtered by the predictions of another object. We shall discuss more of the potential uses of such a tool later.
• Creating Categories: The toolkit allows for the manual creation of a category by choosing initial seed objects, averaging the objects’ feature vectors, and then suggesting other objects to be included in the category as well as predicting important features for the category.
• Creating and Inserting Objects into the Graph: The toolkit also allows the creation and insertion of new objects into the graph. This could be useful for creating prototypical objects and examining their relation to other objects or experimenting with the graph structure and seeing the changes it causes.

The first use of this API is a web interface for exploring a user’s Facebook graph with the toolkit. We wrote a program that authenticates a Facebook user and downloads metadata from the user’s profile as well as their “likes,” then does the same for each of the user’s friends. The web interface we created downloads this information and converts it to the graph structure that the toolkit can read. The website then provides an interface structured like a read-only social network site focused on exploring the user’s network and examining other profiles. One key feature of this site is that it can allow the user to view other users’ profile data based on their relationship to her/his own. That is, when a user visits a friend’s profile, the user could see only the connections that they share or that the system thinks they should share (see Figure 4).

Figure 4: User 2547 filtered to show only the links predicted to be present in User 6366’s graph

Figure 5: The interface allows the selection of groups of users (objects) to create categories based upon analogy between the users, find key features of those categories, and find other possible members of the category

The interface enables exploration of basic toolkit functions such as comparing profiles, calculating predictions, adding profiles, and creating categories (see Figure 5).
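The toolkit functions listed in the Toolkit API section (finding similar objects, predicting features, projecting one object onto another, and suggesting category members) can be sketched with a truncated SVD. This is a minimal illustration under stated assumptions, not the toolkit's implementation: it uses NumPy's `numpy.linalg.svd` on a small dense matrix rather than the Divisi library, it assumes rows are objects and columns are features (the layout of the toolkit's matrix is not specified in the text), and every function name, threshold, and data value is hypothetical.

```python
import numpy as np


def truncated_svd(matrix, k):
    """Keep only the k largest singular values and their vectors."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def smoothed(matrix, k):
    """Rank-k approximation of the matrix (the 'smoothing' effect)."""
    u, s, vt = truncated_svd(matrix, k)
    return (u * s) @ vt


def most_similar(matrix, k, index, top=3):
    """Rank other objects by approximated dot product with object `index`."""
    u, s, _ = truncated_svd(matrix, k)
    vectors = u * s                      # object coordinates in reduced space
    scores = vectors @ vectors[index]
    scores[index] = -np.inf              # never return the object itself
    return [int(i) for i in np.argsort(-scores)[:top]]


def predict_features(matrix, k, index, top=3):
    """Return the feature columns with the highest smoothed values."""
    approx = smoothed(matrix, k)[index]
    return [int(j) for j in np.argsort(-approx)[:top]]


def project(matrix, k, viewed, viewer, threshold=0.5):
    """Features of `viewed` that are also predicted for `viewer`."""
    predicted = smoothed(matrix, k)[viewer] >= threshold
    present = matrix[viewed] > 0
    return [int(j) for j in np.flatnonzero(present & predicted)]


def suggest_members(matrix, k, seeds, top=3):
    """Average the seed objects' vectors and rank non-seed objects by it."""
    u, s, _ = truncated_svd(matrix, k)
    vectors = u * s
    centroid = vectors[seeds].mean(axis=0)
    scores = vectors @ centroid
    scores[seeds] = -np.inf              # seeds are already in the category
    return [int(i) for i in np.argsort(-scores)[:top]]


# Invented data: 4 objects (rows) by 6 features (columns), two loose groups.
profiles = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

print(most_similar(profiles, k=2, index=0, top=1))
print(predict_features(profiles, k=2, index=1, top=3))
print(project(profiles, k=2, viewed=0, viewer=1))
print(suggest_members(profiles, k=2, seeds=[2], top=1))
```

In this toy data, object 1 lacks feature 2, yet the rank-2 approximation assigns it a clearly positive value because object 0, which shares its other features, has it. This is the kind of inference the "Predicting Features" function describes. A real deployment would use a sparse matrix and a sparse solver, since the profile graph is large.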
Model and Toolkit Development

The AIR Toolkit is still under development and we hope to continue to implement mechanisms that allow those using the toolkit to represent the types of identity phenomena discussed above. Extensions to the models developed will consist of refining and extending techniques to implement a small subset of cognitive and social identity phenomena in software, initially addressing torque, metonymic category models, marginalization, markedness, naturalization, and category gradience. In addition to that work, we will add support for implementing modular graphical user-representations for users. Currently, our toolkit is limited to altering textual and semantic representations. Adding functionality for examining and altering graphical representations is potentially a more difficult problem, but would be helpful in systems that place an emphasis on avatars or other graphical models. With the progress made on the toolkit, it is possible to prototype further applications that take advantage of the models we have discussed. Examples might include:
• a social networking GUI for changing a user’s self-representation for different social groups as opposed to cumbersome alteration of privacy settings,
• integrated networking/gaming applications allowing social networking information to influence play style and vice versa,
• a system modeling the phenomena of “passing” as a member of a different social group to facilitate a learner’s transition from a novice to an expert member of a group,
• a social networking system supporting the ability to swap between multiple identities, perhaps based on the user’s perception of which identities would be empowering, stigmatizing, or challenging in a given context,
and more.
Evaluation

It will be important to assess whether or not users feel that our AIR Project systems are more empowering than current systems and if they can be used to minimize stigma built into identity representation structures. Though this assessment has not been completed yet, sufficient development work has been done so as to warrant reporting. We also have been developing methods to pursue such assessments. In the spring of 2010, Harrell conducted a pilot study for the AIR Project with four female participants and two researchers. The subjects, who were novice computer users, engaged in identity creation via the manipulation of character creation systems in three game systems: The Sims, the Nintendo Mii Channel, and the game Elder Scrolls IV: Oblivion. As the subjects engaged in character creation, semi-structured clinical interviews were conducted regarding the character creation process and the relationship of the characters created to a range of identity issues, after users were first prompted to describe their creations “in their own words.” The dialogue was captured via digital video and the sessions were screen-captured, comprising raw data for analyses to be presented elsewhere. The dialogue captured in these files is being transcribed and will serve as the basis for crafting an empirical instrument for evaluating AIR systems as well as assessing whether users feel that these well-known games are adequately expressive. Transcripts and videos will be analyzed using grounded theory techniques (Glaser 1992; Strauss 1987), a well-known method of qualitative analysis.

Open Questions and Concluding Reflections

While we have made a good start with the preliminary framing and ongoing development of the AIR Toolkit, a number of interesting open questions remain. In particular, given our reliance on cognitive accounts of metaphor and analogy, we have been influenced by the critique of computational approaches to the same by Chalmers, French, and Hofstadter (1992), who ask:

How are these data put into the correct form for the representation? Even if we have determined precisely which data are relevant, and we have determined the desired framework for the representation – a frame-based representation, for instance – we still face the problem of organizing the data into the representational form in a useful way.

The AIR Project takes heed of this concern; however, it asks a reciprocal question. How can one design idealized logical forms, amenable to our algorithmic techniques, that are useful for modeling the social phenomena we are interested in? We see design of such ontologies as a creative problem requiring human judgment and do not intend the ontologies to be models of the real world. Rather, they are users’ own expressive self-representations or subjective ontologies. Identifying methods to reduce identities to abstract data types is both a non-trivial problem and a double-edged sword, potentially both facilitating and hindering analysis of the data. Can these data types be effectively optimized for use with analogical reasoning systems like AnalogySpace or SME? We will need to further develop our rationale for adopting particular analogy systems, and the basis for our belief in their usefulness and validity.

Another open question considers the relationships between OS-level and application-level GUIs. Turkle describes users toggling between online identities, arguing that this comprises a type of conversation between different identities, which enables a fluid, decentered, fragmented self to be deployed across different domains in creative and sometimes unexpected ways (Turkle 1995). The experience she describes is linked to interactions with computer graphical user interfaces (GUIs) rather than specific applications.
The AIR Project model will explore analytic methods and tools to identify and facilitate these changing presentations of self at either level.

Finally, the core motivating observation for the AIR Project is that identity is a feat of imaginative cognition. Social categories, which cognitive science theories have suggested are not objective but are unconscious and based in metaphorical thought, are often reified in software systems. Humans have great power in determining and shifting the meanings of our categories; the AIR Project is a modest step toward doing so in software.

Acknowledgments

We gratefully acknowledge the National Science Foundation’s support provided by CAREER Award #0952896. We also thank Henry Lieberman, Catherine Havasi, Jason Alonso, and others from the MIT Commonsense Reasoning Group.

References

Agre, P. E. 1997. Computation and Human Experience. Cambridge, U.K.: Cambridge University Press.
Bowker, G. C., and Star, S. L. 1999. Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press.
Chalmers, D. J., French, R. M., and Hofstadter, D. R. 1992. High-Level Perception, Representation, and Analogy: A Critique of Artificial Intelligence Methodology. Journal of Experimental and Theoretical Artificial Intelligence, 4(3), 185-211.
Chow, K. K. N., and Harrell, D. F. 2009. Active Animation: An Approach to Interactive and Generative Animation for User-Interface Design and Expression. Paper presented at the 2009 Digital Humanities Conference.
Corneliussen, H., and Rettberg, J. W. (Eds.). 2008. Digital Culture, Play and Identity: A World of Warcraft Reader. Cambridge, MA: MIT Press.
Du Bois, W. E. B. 1903. The Souls of Black Folk. Chicago, IL: A. C. McClurg and Co.
Fauconnier, G., and Turner, M. 2002. The Way We Think: Conceptual Blending and the Mind's Hidden Complexities. New York: Basic Books.
Ferguson, R., Forbus, K. D., and Gentner, D. 1997. On the Proper Treatment of Noun-Noun Metaphor: A Critique of the Sapper Model. Paper presented at the Nineteenth Annual Meeting of the Cognitive Science Society.
Forbus, K. D. 2001. Exploring Analogy in the Large. In The Analogical Mind: Perspectives from Cognitive Science. Cambridge, MA: MIT Press.
Gee, J. P. 2003. What Video Games Have to Teach Us About Learning and Literacy. New York City: Palgrave Macmillan.
Gentner, D. 1983. Structure-Mapping: A Theoretical Framework for Analogy. Cognitive Science, 7(2), 155-170.
Gilbert, E., and Karahalios, K. 2009. Predicting Tie Strength with Social Media. Paper presented at the Proceedings of the 27th International Conference on Human Factors in Computing Systems.
Glaser, B. 1992. Basics of Grounded Theory Analysis. Mill Valley, CA: Sociology Press.
Goffman, E. 1959. The Presentation of Self in Everyday Life. New York: Anchor Books.
Goguen, J. 1997. Towards a Social, Ethical Theory of Information. In Geoffrey Bowker, L. G., Leigh Star, William Turner (Ed.), Social Science Research, Technical Systems and Cooperative Work (pp. 27-56). Mahwah, N.J.: Lawrence Erlbaum Associates.
Harrell, D. F. 2009. Computational and Cognitive Infrastructures of Stigma: Empowering Identity in Social Computing and Gaming. Proceedings of the Association for Computing Machinery (ACM) Cognition and Creativity Conference, Berkeley, CA.
Harrell, D. F. 2010. Toward a Theory of Critical Computing: The Case of Social Identity Representation in Digital Media Applications. CTheory.
Lakoff, G. 1987. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Chicago, IL: University of Chicago Press.
Muramatsu, J., and Ackerman, M. S. 1998. Computing, Social Activity, and Entertainment: A Field Study of a Game Mud. Computer-Supported Cooperative Work, 7, 87-122.
Nakamura, L. 2002. Cybertypes. New York: Routledge.
Nakamura, L. 2008. Digitizing Race: Virtual Cultures of the Internet. Minneapolis, MN: University of Minnesota Press.
Nelson, A., and Tu, T. L. N. (Eds.). 2001. Technicolor: Race, Technology and Everyday Life. New York: New York University Press.
Speer, R., Havasi, C., and Lieberman, H. 2008. AnalogySpace: Reducing the Dimensionality of Common Sense Knowledge. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence.
Strauss, A. 1987. Qualitative Analysis for Social Scientists. Cambridge, U.K.: Cambridge University Press.
Suchman, L. 1987. Plans and Situated Actions: The Problem of Human-Machine Communication. New York: Cambridge University Press.
Turkle, S. 1995. Life on the Screen: Identity in the Age of the Internet. New York: Simon and Schuster.
Turkle, S. 2004. Whither Psychoanalysis in Computer Culture? Psychoanalytic Psychology, 21(1), 16-30.
Waggoner, Z. 2009. My Avatar, My Self: Identity in Video Role-Playing Games. Jefferson, NC: McFarland and Company.