Virtual Presence Technical Note 5 2008.05.15
Bot Tagging
About this Note
Number 5
Version 1
Date 2008.05.15
Category Informational
Status Draft
Short Name Bot Tagging
Document VPTN-5.txt
Authors Heiner Wolf, hw, wolf.heiner@gmail.com
Working Group -
Dependencies VPTN-3.txt
Supersedes -
Superseded By -
Abstract
Bots in chat rooms of virtual presence locations may appear like
avatars of real people. They cannot be distinguished easily. This
document specifies a way to describe the purpose of a bot. The
description is primarily machine readable, so that client software
can compare user preferences with the description and inform users
appropriately.
Table of Contents
1. Introduction
2. Identity Item
3. Tag Format
3.1 Bot Name
3.2 Avatar Gender
3.3 Interaction Mode
3.4 Purpose
3.5 Description
3.6 Content Rating
3.7 Topics
3.7.1 Tag Line
3.7.2 Quantifier
3.7.3 Vocabulary
4. Examples
5. Ethical Considerations
6. Security Considerations
7. References
8. Revisions
1. Introduction
In addition to real people there will be bots visible on web pages.
Bots will join chat rooms of virtual presence locations for various
purposes. Some will offer services, some will belong to games, and
some will carry commercial advertisements. Many will appear as human
avatars.
Users should be able to differentiate between real people and bots.
Some bot operators might not like the idea, but still users should
be able to know if they are talking to a bot or to a real person.
More specifically, users should be able to configure their clients
to show only certain types of bots.
Some users might not like gaming bots. Many may not like simple
advertisements. Users will avoid excessive advertisements anyhow.
There will be spam filters for virtual presence. There will be trust
services to identify real people. There will be all kinds of
anti-spam measures. There is no way to prevent users from opting
out. But if operators tag their bots with keywords, then they have a
chance to get the user's attention. Actually, bot operators
(including advertisers) should make their content so attractive
that it entertains users rather than annoying them. If users can
select what they like to see and if they like what they see, then
bot operators will get a much higher return.
This specification augments the identity [1] of bots with structured
information about the bot's purpose. The information is primarily a
list of tags. It is referred to as the "bot tag".
2. Identity Item
Anyone who meets the bot must be able to evaluate the bot tag.
Therefore, the bot tag is part of the bot's identity data. The bot
tag is an identity item. Like any identity item, it can be inline or
referenced by a URL.
External reference example:
...
Note: the tag is currently intended for bot tagging. But it might
describe other entities later. Hence, it is called contenttype="tag"
instead of contenttype="bottag".
3. Tag Format
The bot tag uses an RDF format. The RDF data describes the identity
file in which it is embedded.
A bot tag uses an XML namespace specifically tailored to tagging of
bots: http://www.virtual-presence.org/bottags#
Note: other entities might use other namespaces, e.g. users could
describe themselves using the FOAF vocabulary [2].
Example of an educational quizbot with a rainforest topic:
Woodelf
female
reactive
entertainment
Save the Rainforest Quizbot
PG
Games:Video Games:Educational
Society:Issues:Environment
3.1 Bot Name
The "name" is a short human readable identifier. It might be
displayed in addition to the nickname of the "properties" identity
item. The parameter is optional.
3.2 Avatar Gender
The "gender" indicates the gender of the bot's avatar for whatever
display purpose this might be useful. The parameter is optional.
Possible values:
- "female"
- "male"
- "none"
3.3 Interaction Mode
The "interaction" tells whether the bot has interactive capabilities,
i.e. whether it responds to actions or acts on its own. The parameter
is optional.
Possible values:
- "none": no interaction
- "reactive": bot reacts to user input
- "proactive": bot acts on its own behalf
- "interactive": unspecified interaction capabilities. May be used
as a combination of "reactive" and "proactive".
3.4 Purpose
The "purpose" tells the primary purpose of the bot, whether it
informs, advertises, entertains, etc. The parameter should be
implemented.
Possible values:
- "entertainment": pure entertainment
- "information": purely informative like a travel advisory
- "infotainment": informative with an entertaining presentation
- "advertisement": pure advertisement, e.g. brand promotion
- "advertainment": advertising with an entertaining presentation
- "product information": well, the usual consumer information
- "sponsored information": general information with reference to
products
- "education": purely educative
- "edutainment": educative with an entertaining presentation
Note: if most users perceive a bot as pure advertisement, then it
must be labeled as such. Only if an ad-bot is really entertaining,
e.g. by offering a mini game or a quiz, or by showing a play, can it
be labeled "advertainment". Telling users about product details
and benefits is not considered pure "information". It is considered
"product information". Only if the bot offers information of general
interest which is related to a product can it be labeled
"sponsored information". If it offers only product related
information, then it must be labeled as such.
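A client could apply the purpose values above as a simple allow-list
filter. The following Python sketch is illustrative only: the function
name should_show, the "allowed" preference set, and the show_untagged
policy flag are assumptions and not part of this specification.

```python
# Purpose values as listed in section 3.4.
KNOWN_PURPOSES = frozenset({
    "entertainment", "information", "infotainment",
    "advertisement", "advertainment", "product information",
    "sponsored information", "education", "edutainment",
})


def should_show(purpose, allowed, show_untagged=False):
    """Decide whether a client displays a bot, based on its "purpose".

    purpose: text of the "purpose" node, or None if the node is missing.
    allowed: set of purpose values the user wants to see
             (a hypothetical user preference).
    show_untagged: policy for missing or unrecognized values. The note
             does not define one, so this flag is an assumption.
    """
    if purpose is None or purpose not in KNOWN_PURPOSES:
        return show_untagged
    return purpose in allowed
```

For example, a user who allows only "education" and "edutainment"
would see the rainforest quizbot only if it carried one of these two
values, and would not see a bot labeled "advertisement".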
3.5 Description
The "description" is a human readable text. The parameter should be
implemented.
3.6 Content Rating
The "contentrating" indicates the target audience with respect to
child protection. The parameter is optional.
Client software may choose to hide the bot if the content rating is
not appropriate for the user.
The "source" attribute of the "contentrating" node defines the value
space and its interpretation.
Possible values of the "source" attribute:
- "MPAA": content rating as defined by the Motion Picture
Association of America (MPAA) [3].
Known MPAA ratings (from http://www.mpaa.org/FlmRat_Ratings.asp):
- "G": General Audiences-All Ages Admitted.
- "PG": Parental Guidance Suggested. Some Material May Not Be
Suitable For Children.
- "PG-13": Parents Strongly Cautioned. Some Material May Be
Inappropriate For Children Under 13.
- "R": Restricted, Under 17 Requires Accompanying Parent Or Adult
Guardian.
- "NC-17": No One 17 And Under Admitted.
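Because the MPAA ratings form an ordered scale, a client can compare a
bot's rating with a per-user maximum. The sketch below assumes a
hypothetical max_rating user setting and hides bots whose rating or
rating source is unknown. That conservative policy is an assumption;
the note only says that clients may hide inappropriate bots.

```python
# MPAA ratings from section 3.6, ordered from least to most restrictive.
MPAA_ORDER = ("G", "PG", "PG-13", "R", "NC-17")


def rating_allowed(source, rating, max_rating):
    """Return True if a bot with this rating may be shown to the user.

    source: value of the "source" attribute of the "contentrating" node.
    rating: node text of "contentrating".
    max_rating: highest MPAA rating the user accepts (hypothetical
                preference); raises ValueError if it is not an MPAA rating.
    """
    limit = MPAA_ORDER.index(max_rating)
    if source != "MPAA" or rating not in MPAA_ORDER:
        # Unknown value space or unknown rating: hide (assumed policy).
        return False
    return MPAA_ORDER.index(rating) <= limit
```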
3.7 Topics
The "tags" parameter contains a list of categories which describe
the topic of the bot in as much detail as possible. The parameter
should be implemented.
The "tags" node contains at least one "tag" child. Each "tag" has a
single topic identifier as node text. The "tag" may also contain a
quantifier attribute ("q") which indicates how much the topic
identifier applies.
The topic list is a set of such topic identifiers with optional
quantifier.
Topic list example:
Games:Video Games:Educational
Society:Issues:Environment
Science:Environment:Forests and Rainforests
Science
Science:Biology
Science:Biology:Flora and Fauna:Animalia:Chordata:Mammalia:Primates:Hominidae:Gorilla
This reads as:
- This is an educational game bot
- about environmental issues
- specifically about forests and rainforests
- a bit about science
- not so much about biology
- and definitely not about gorillas.
3.7.1 Tag Line
A tag line is a "tag" element with a topic identifier as its node text.
Example:
Games:Video Games:Educational
Note: the dmoz vocabulary emerges from the dmoz.org hierarchy of
categories. The words of the vocabulary are colon-separated
directory paths. But with respect to this specification, there is no
hierarchy implied. An individual "tag" is a simple identifier. For
this specification, each colon-separated directory path is a simple
character string.
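Since each topic identifier is an opaque string, a client can match
topics by exact string comparison. The following sketch shows one
possible way to score a bot against a user's interests by summing the
quantifiers of the matching topics. The scoring rule, the function
name topic_score, and the quantifier values in the usage note are
hypothetical; this note does not prescribe a matching algorithm.

```python
def topic_score(bot_topics, user_interests):
    """Sum the quantifiers of all bot topics the user is interested in.

    bot_topics: dict mapping topic identifier -> quantifier "q".
    user_interests: set of topic identifiers (hypothetical preference).

    Identifiers are compared as whole strings. "Science" does not match
    "Science:Biology", because no hierarchy is implied.
    """
    return sum(q for topic, q in bot_topics.items() if topic in user_interests)
```

With bot_topics = {"Science": 0.5, "Science:Biology": -1.0}, a user
interested only in "Science" gets a score of 0.5, while a user
interested in both identifiers gets -0.5.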
3.7.2 Quantifier
The quantifier may be negative to indicate that a topic does not
apply although other topics might indicate that it does. A negative
value can also be used to include a general topic and exclude a sub-
topic as an exception. "q" is a float value between -1.0 and 1.0. It
defaults to "1.0".
Valid representations include:
1.0
1
0.7
-0.31415
0
0.01
-1
-1.0
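The rules for "q" (float, range -1.0 to 1.0, default 1.0) can be
sketched as a small parsing helper. The function name parse_q and the
choice to reject out-of-range values with an exception are
assumptions; the note does not say how clients should treat invalid
values.

```python
def parse_q(text=None):
    """Parse the "q" attribute of a "tag" node.

    text: attribute value as a string, or None if the attribute is absent.
    Returns a float between -1.0 and 1.0; a missing attribute yields 1.0.
    Raises ValueError for non-numeric or out-of-range values
    (assumed error handling).
    """
    if text is None:
        return 1.0
    value = float(text.strip())  # raises ValueError for non-numeric text
    if not -1.0 <= value <= 1.0:  # also rejects NaN and infinities
        raise ValueError(f"q out of range: {text!r}")
    return value
```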
3.7.3 Vocabulary
A vocabulary must be complete enough to describe many different bot
types. Only one vocabulary can be used for a bot.
The default vocabulary is drawn from the category list of the Open
Directory Project [4]. Each category path is a valid topic
identifier. The path separator is ":". Whitespace must be trimmed
from path segments.
Examples:
Games
Games:Video Games
Society:Issues:Environment
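The trimming rule can be sketched as a normalization step applied
before topic identifiers are compared. The function name
normalize_topic is hypothetical, and rejecting empty path segments is
an assumption that the note does not state.

```python
def normalize_topic(identifier):
    """Trim whitespace around each ":"-separated path segment.

    Whitespace inside a segment (e.g. "Video Games") is preserved.
    Raises ValueError if a segment is empty (assumed to be invalid).
    """
    segments = [segment.strip() for segment in identifier.split(":")]
    if any(not segment for segment in segments):
        raise ValueError(f"empty path segment in {identifier!r}")
    return ":".join(segments)
```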
Note: The vocabulary is identified by the "source" attribute of the
"tags" node. The "source" attribute defaults to
"http://rdf.dmoz.org/rdf/structure.rdf.u8.gz". Clients should check
the "source" and indicate if the vocabulary is not supported.
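A client-side check of the "source" attribute might look like the
following sketch. The default URL is taken from this note; the
function name vocabulary_supported and the "supported" set are
hypothetical, and how a client indicates an unsupported vocabulary to
the user is left open.

```python
# Default value of the "source" attribute of the "tags" node (section 3.7.3).
DEFAULT_VOCABULARY = "http://rdf.dmoz.org/rdf/structure.rdf.u8.gz"


def vocabulary_supported(source, supported=frozenset({DEFAULT_VOCABULARY})):
    """Return True if the client understands the vocabulary of a "tags" node.

    source: value of the "source" attribute, or None if it is absent,
            in which case the default vocabulary applies.
    supported: vocabularies this client implements (an assumption; here
               only the default one).
    """
    effective = DEFAULT_VOCABULARY if source is None else source
    return effective in supported
```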
Note: finding topic identifiers for the bot tag is very simple. Just
search the directory.
Note: other vocabularies could be used. Alternative vocabularies
include the Yahoo directory (http://dir.yahoo.com/) and Wikipedia
page names (XXX in http://en.wikipedia.org/wiki/XXX).
The "rainforest" topic in various vocabularies:
dmoz: "Science:Environment:Forests and Rainforests"
Yahoo: "Science:Ecology:Ecosystems:Forests:Rainforests"
Wikipedia: "Rainforest"
4. Examples
A janitor bot that walks random pages and pretends to clean them up.
It stops working frequently and issues a message of the day:
Jan Itor
male
entertainment
none
Home:Homemaking:Cleaning and Stains
Society:Philosophy:Humor
A bot that impersonates two joggers in order to advertise for a
major sports clothing brand. The bot shows a short play where the
joggers meet by chance, talk about jogging, then part. This is the
tag of the female jogger:
Susan
female
advertisement
none
Live for the run, don't run for your life
Sports:Running:Women
Health:Fitness:Personal Training
Society:Lifestyle Choices
A treasure chest bot where users can loot virtual items:
Schatzkiste
entertainment
reactive
Games
Shopping:Gifts
Recreation:Games
Games:Online
5. Ethical Considerations
Anyone who runs a bot should provide a bot tag. It is only fair to
tell users that they are interacting with a bot and what it is
about. Not providing the bot tag is considered bad style.
Virtual presence avatars are more engaging than nicknames in a chat
room. This also means that a fake avatar which is driven by a bot
can engage users much more than a text-only chat bot. Not providing
the bot tag means exploiting advantages offered by virtual presence
at the cost of the virtual presence system and its users.
Disguising a bot as a person is plain fraud. It can be regarded as
unfair competition if conducted by businesses.
6. Security Considerations
Providing the bot tag is recommended, but there is no technical way
to enforce it. Even if it is supplied, the information can be
faked.
This specification relies on good behavior as much as the robots.txt
quasi-standard and SMTP email do. In the case of email, things clearly
worked out badly. Users have to fight email spam with heuristic
filters. The same might be necessary in virtual presence.
If chat servers are open to anyone, then chat room operators have no
more chance to fight spam than individual email server operators.
Open chat servers are like open email relays. But they do not spam
directly to users. Spam reaches users only if they visit virtual
locations. But spam can render virtual presence on some or all sites
unusable.
A web of trust might help to distinguish real people positively from
bots rather than relying on good behavior of bot operators.
7. References
[1] VPTN-3: User Data, Virtual Presence Technical Note,
http://www.virtual-presence.org/notes/VPTN-3.txt
[2] FOAF Vocabulary Specification 0.9, http://xmlns.com/foaf/0.1/
[3] MPAA rating, http://www.mpaa.org/FlmRat_Ratings.asp
[4] Open Directory Project, http://www.dmoz.org
8. Revisions
1 hw 2008.05.12 First release