Virtual Presence Technical Note 5                          2008.05.15

Bot Tagging

About this Note

  Number         5
  Version        1
  Date           2008.05.15
  Category       Informational
  Status         Draft
  Short Name     Bot Tagging
  Document       VPTN-5.txt
  Authors        Heiner Wolf, hw, wolf.heiner@gmail.com
  Working Group  -
  Dependencies   VPTN-3.txt
  Supersedes     -
  Superseded By  -

Abstract

Bots in chat rooms of virtual presence locations may look like avatars
of real people and cannot easily be distinguished from them. This
document specifies a way to describe the purpose of a bot. The
description is primarily machine readable, so that client software can
compare it with user preferences and inform users appropriately.

Table of Contents

  1. Introduction
  2. Identity Item
  3. Tag Format
    3.1 Bot Name
    3.2 Avatar Gender
    3.3 Interaction Mode
    3.4 Purpose
    3.5 Description
    3.6 Content Rating
    3.7 Topics
      3.7.1 Tag Line
      3.7.2 Quantifier
      3.7.3 Vocabulary
  4. Examples
  5. Ethical Considerations
  6. Security Considerations
  7. References
  8. Revisions

1. 
Introduction

In addition to real people, there will be bots visible on web pages.
Bots will join chat rooms of virtual presence locations for various
purposes. Some will offer services, some will belong to games, and
some will carry commercial advertisements. Many will appear as human
avatars.

Users should be able to differentiate between real people and bots.
Some bot operators might not like the idea, but users should still be
able to know whether they are talking to a bot or to a real person.

More specifically, users should be able to configure their clients to
show only certain types of bots. Some users might not like gaming
bots. Many may not like simple advertisements.

Users will avoid excessive advertisements anyhow. There will be spam
filters for virtual presence. There will be trust services to identify
real people. There will be all kinds of anti-spam measures. There is
no way to prevent users from opting out. But if operators tag their
bots with keywords, then they have a chance to get the user's
attention.

Actually, bot operators (including advertisers) should make their
content so attractive that it entertains users rather than annoying
them. If users can select what they like to see and if they like what
they see, then bot operators will get a much higher return.

This specification augments the identity [1] of bots with structured
information about the bot's purpose. The information is primarily a
list of tags. It is referred to as the "bot tag".

2. Identity Item

Anyone who meets the bot must be able to evaluate the bot tag.
Therefore, the bot tag is part of the bot's identity data. The bot tag
is an identity item. Like any identity item, it can be inline or
referenced by a URL.

External reference example:

    ...

Note: the tag is currently intended for bot tagging. But it might
describe other entities later. Hence, it is called contenttype="tag"
instead of contenttype="bottag".

3. 
Tag Format

The bot tag uses an RDF format. The RDF data describes the identity
file in which it is embedded. A bot tag uses an XML namespace
specifically tailored to tagging of bots:

    http://www.virtual-presence.org/bottags#

Note: other entities might use other namespaces, e.g. users could
describe themselves using the FOAF vocabulary [2].

Example of an educational quizbot with a rainforest topic:

    Woodelf
    female
    reactive
    entertainment
    Save the Rainforest Quizbot
    PG
    Games:Video Games:Educational
    Society:Issues:Environment

3.1 Bot Name

The "name" is a short human readable identifier. It might be displayed
in addition to the nickname of the "properties" identity item. The
parameter is optional.

3.2 Avatar Gender

The "gender" indicates the gender of the bot's avatar for whatever
display purpose this might be useful. The parameter is optional.

Possible values:
  - "female"
  - "male"
  - "none"

3.3 Interaction Mode

The "interaction" indicates whether the bot has interactive
capabilities, that is, whether it responds to user actions or acts on
its own. The parameter is optional.

Possible values:
  - "none": no interaction
  - "reactive": bot reacts to user input
  - "proactive": bot acts on its own behalf
  - "interactive": unspecified interaction capabilities. May be used
    as a combination of "reactive" and "proactive".

3.4 Purpose

The "purpose" indicates the primary purpose of the bot, whether it
informs, advertises, entertains, etc. The parameter should be
implemented.

Possible values:
  - "entertainment": pure entertainment
  - "information": purely informative like a travel advisory
  - "infotainment": informative with an entertaining presentation
  - "advertisement": pure advertisement, e.g.
    brand promotion
  - "advertainment": advertising with an entertaining presentation
  - "product information": the usual consumer information
  - "sponsored information": general information with reference to
    products
  - "education": purely educative
  - "edutainment": educative with an entertaining presentation

Note: if most users perceive a bot as pure advertisement, then it must
be labeled as such. Only if an ad-bot is really entertaining, e.g. by
offering a mini game or a quiz, or by showing a play, can it be
labeled "advertainment".

Telling users about product details and benefits is not considered
pure "information". It is considered "product information". Only if
the bot offers information of general interest which is related to a
product can it be labeled "sponsored information". If it offers only
product related information, then it must be labeled as such.

3.5 Description

The "description" is a human readable text. The parameter should be
implemented.

3.6 Content Rating

The "contentrating" indicates the target audience with respect to
child protection. The parameter is optional. Client software may
choose to hide the bot if the content rating is not appropriate for
the user.

The "source" attribute of the "contentrating" node defines the value
space and its interpretation.

Possible values of the "source" attribute:
  - "MPAA": content rating as defined by the Motion Picture
    Association of America (MPAA) [3].

Known MPAA ratings (from http://www.mpaa.org/FlmRat_Ratings.asp):
  - "G": General Audiences-All Ages Admitted.
  - "PG": Parental Guidance Suggested. Some Material May Not Be
    Suitable For Children.
  - "PG-13": Parents Strongly Cautioned. Some Material May Be
    Inappropriate For Children Under 13.
  - "R": Restricted, Under 17 Requires Accompanying Parent Or Adult
    Guardian.
  - "NC-17": No One 17 And Under Admitted.
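
As an illustration of how client software might act on the content
rating, the following Python sketch compares a bot's MPAA rating with
a user preference. It is only a sketch under assumptions: the function
name, the max_rating and hide_unrated preferences, and the handling of
unrated bots are hypothetical and not defined by this note. Only the
rating values and the "MPAA" source come from the list above.

```python
from typing import Optional

# MPAA ratings from section 3.6, ordered from least to most restrictive.
MPAA_ORDER = ["G", "PG", "PG-13", "R", "NC-17"]


def should_hide(bot_rating: Optional[str], source: Optional[str],
                max_rating: str, hide_unrated: bool = False) -> bool:
    """Return True if a client would hide the bot.

    bot_rating and source stand for the text and the "source" attribute
    of the "contentrating" node. max_rating and hide_unrated are
    hypothetical user preferences (assumptions of this sketch).
    """
    if bot_rating is None:
        # "contentrating" is optional, so a missing rating is a policy choice.
        return hide_unrated
    if source != "MPAA" or bot_rating not in MPAA_ORDER:
        # Unknown value space or value: treat the bot like an unrated one.
        return hide_unrated
    # Hide the bot if it is rated above what the user accepts.
    return MPAA_ORDER.index(bot_rating) > MPAA_ORDER.index(max_rating)
```

With max_rating set to "PG-13", a bot rated "PG" would be shown and a
bot rated "R" would be hidden. Whether unrated bots are shown is left
to the user in this sketch.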

3.7 Topics

The "tags" parameter contains a list of categories which describe the
topic of the bot in as much detail as possible. The parameter should
be implemented.

The "tags" node contains at least one "tag" child. Each "tag" has a
single topic identifier as node text. The "tag" may also contain a
quantifier attribute ("q") which indicates how strongly the topic
identifier applies. The topic list is a set of such topic identifiers
with optional quantifiers.

Topic list example:

    Games:Video Games:Educational
    Society:Issues:Environment
    Science:Environment:Forests and Rainforests
    Science
    Science:Biology
    Science:Biology:Flora and Fauna:Animalia:Chordata:Mammalia:Primates:Hominidae:Gorilla

This reads as:
  - This is an educational game bot
  - about environmental issues
  - specifically about forests and rainforests
  - a bit about science
  - not so much about biology
  - and definitely not about gorillas.

3.7.1 Tag Line

A tag line is a "tag" element with a topic identifier as node text.

Example:

    Games:Video Games:Educational

Note: the dmoz vocabulary emerges from the dmoz.org hierarchy of
categories. The words of the vocabulary are colon-separated directory
paths. But with respect to this specification, there is no hierarchy
implied. An individual "tag" is a simple identifier. For this
specification, each colon-separated directory path is a simple
character string.

3.7.2 Quantifier

The quantifier may be negative to indicate that a topic does not apply
although other topics might indicate that it does. A negative value
can also be used to include a general topic and exclude a sub-topic as
an exception.

"q" is a float value between -1.0 and 1.0. It defaults to "1.0". Valid
representations include:

    1.0
    1
    0.7
    -0.31415
    0
    0.01
    -1
    -1.0

3.7.3 Vocabulary

A vocabulary must be complete enough to describe many different bot
types. Only one vocabulary can be used for a bot.
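
The topic list and quantifier rules of sections 3.7 to 3.7.2 can be
sketched in Python as follows. This is a hedged illustration, not the
normative format: the element and attribute names "tags", "tag" and
"q" come from this note, but the sample fragment omits the RDF wrapper
and the bot tag namespace, and its q values are invented for the
example. The function name parse_topics is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

DEFAULT_Q = 1.0  # "q" defaults to 1.0 (section 3.7.2)

# Simplified fragment: no RDF wrapper, no namespace, illustrative q values.
SAMPLE = """<tags>
  <tag>Games:Video Games:Educational</tag>
  <tag q="0.3">Science</tag>
  <tag q="-1">Science:Biology</tag>
</tags>"""


def parse_topics(tags_xml: str) -> Dict[str, float]:
    """Map each topic identifier to its quantifier."""
    root = ET.fromstring(tags_xml)
    topics: Dict[str, float] = {}
    for tag in root.findall("tag"):
        # The topic identifier is an opaque string; no hierarchy is implied.
        topic = (tag.text or "").strip()
        if not topic:
            continue
        q = float(tag.get("q", DEFAULT_Q))
        if not -1.0 <= q <= 1.0:
            raise ValueError(f"quantifier out of range for {topic!r}: {q}")
        topics[topic] = q
    if not topics:
        raise ValueError('"tags" needs at least one "tag" child')
    return topics
```

Because each topic identifier is treated as a plain string, a negative
quantifier on "Science:Biology" does not affect "Science" in this
sketch. A client that wants hierarchy-aware matching would have to add
that logic itself.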

The default vocabulary is drawn from the category list of the Open
Directory Project [4]. Each category path is a valid topic identifier.
The path separator is ":". Path segments must be trimmed of
surrounding white space.

Examples:

    Games
    Games:Video Games
    Society:Issues:Environment

Note: The vocabulary is identified by the "source" attribute of the
"tags" node. The "source" attribute defaults to
"http://rdf.dmoz.org/rdf/structure.rdf.u8.gz". Clients should check
the "source" and indicate if the vocabulary is not supported.

Note: finding topic identifiers for the bot tag is very simple. Just
search the directory.

Note: other vocabularies could be used. Alternative vocabularies
include the Yahoo directory (http://dir.yahoo.com/) and Wikipedia page
names (XXX in http://en.wikipedia.org/wiki/XXX).

The "rainforest" topic in various vocabularies:

    dmoz:      "Science:Environment:Forests and Rainforests"
    Yahoo:     "Science:Ecology:Ecosystems:Forests:Rainforests"
    Wikipedia: "Rainforest"

4. Examples

A janitor bot that walks random pages and pretends to clean them up.
It stops working frequently and issues a message of the day:

    Jan Itor
    male
    entertainment
    none
    Home:Homemaking:Cleaning and Stains
    Society:Philosophy:Humor

A bot that impersonates two joggers in order to advertise a major
sports clothing brand. The bot shows a short play where the joggers
meet by chance, talk about jogging, and then part. This is the female:

    Susan
    female
    advertisement
    none
    Live for the run, don't run for your life
    Sports:Running:Women
    Health:Fitness:Personal Training
    Society:Lifestyle Choices

A treasure chest bot where users can loot virtual items:

    Schatzkiste
    entertainment
    reactive
    Games
    Shopping:Gifts
    Recreation:Games
    Games:Online

5. Ethical Considerations

Anyone who runs a bot should provide a bot tag. It is only fair to
tell users that they are interacting with a bot and what it is about.
Not providing the bot tag is considered bad style.

Virtual presence avatars are more engaging than nicknames in a chat
room. This also means that a fake avatar which is driven by a bot can
engage users much more than a text-only chat bot. Not providing the
bot tag means exploiting advantages offered by virtual presence at the
cost of the virtual presence system and its users.

Disguising a bot as a person is plain fraud. It can be regarded as
unfair competition if conducted by businesses.

6. Security Considerations

Providing the bot tag is recommended, but there is no technical way to
enforce it. Even if it is supplied, the information can be faked.

This specification relies on good behavior as much as the robots.txt
quasi-standard and SMTP email do. In the case of email, things clearly
worked out badly. Users have to fight email spam with heuristic
filters. The same might be necessary in virtual presence.

If chat servers are open to anyone, then chat room operators have no
more chance to fight spam than individual email server operators. Open
chat servers are like open email relays. But they do not spam directly
to users. Spam reaches users only if they visit virtual locations. But
spam can render virtual presence on some or all sites unusable.

A web of trust might help to distinguish real people positively from
bots rather than relying on good behavior of bot operators.

7. References

  [1] VPTN-3: User Data, Virtual Presence Technical Note,
      http://www.virtual-presence.org/notes/VPTN-3.txt
  [2] FOAF Vocabulary Specification 0.9, http://xmlns.com/foaf/0.1/
  [3] MPAA rating, http://www.mpaa.org/FlmRat_Ratings.asp
  [4] Open Directory Project, http://www.dmoz.org

8. Revisions

  1  hw  2008.05.12  First release