Folder Versus Tag Preference in Personal
Information Management
Ofer Bergman, Noa Gradovitch, and Judit Bar-Ilan
Department of Information Science, Bar-Ilan University Ramat-Gan, Max ve-Anna Webb Street, Israel,
5290002. E-mail: oferbergman@gmail.com, noa.grasdovitch@live.biu.ac.il, barilaj@mail.biu.ac.il
Ruth Beyth-Marom
Department of Education and Psychology, The Open University of Israel, 108 Ravutski Street POB 808,
Raanana 43107, Israel. E-mail: ruthbm@openu.ac.il
Users’ preferences for folders versus tags were studied
in 2 working environments where both options were
available to them. In the Gmail study, we informed 75
participants about both folder-labeling and tag-labeling,
observed their storage behavior after 1 month, and
asked them to estimate the proportions of different
retrieval options in their behavior. In the Windows 7
study, we informed 23 participants about tags and asked
them to tag all their files for 2 weeks, followed by a
period of 5 weeks of free choice between the 2 methods.
Their storage and retrieval habits were tested prior to the
learning session and, after 7 weeks, using special clas-
sification recording software and a retrieval-habits ques-
tionnaire. A controlled retrieval task and an in-depth
interview were conducted. Results of both studies show
a strong preference for folders over tags for both
storage and retrieval. In the minority of cases where tags
were used for storage, participants typically used a
single tag per information item. Moreover, when multiple
classification was used for storage, it was only margin-
ally used for retrieval. The controlled retrieval task
showed lower success rates and slower retrieval speeds
for tag use. Possible reasons for participants’ prefer-
ences are discussed.
Personal information management (PIM) is a basic
human–computer behavior in which the user stores his or
her information items (e.g., files, e-mails, and web favorites)
to later retrieve them. Traditionally, PIM systems provided
folders for information storage and retrieval; however, as a
consequence of the popularity of Web 2.0, tags also diffused
into PIM systems.
It is widely claimed that tags have two fundamental
advantages over folders: Tags enable multiple classification
and eliminate the need for hierarchies. (a) Multiple
Classification: In the folders method, an information item
can be stored only in a single folder; however, the user
may have a number of possible classifications related to that
item (Dourish et al., 2000). For example, pictures from a
conference in Copenhagen can be stored under “Pictures,”
“Trips,” “Conferences,” or “Copenhagen.” As time passes,
users may forget the choice they initially made, making
retrieval difficult. In contrast, the tagging method enables
users to apply any number of tags to their information item,
and use any of these tags to retrieve it. (b) No Hierarchical
Location: Folders may hide information items from view
because they do not show items contained in subfolders
(Malone, 1983). The tagging method consciously rejects
hierarchies and locations. Instead, all information items are
stored in a single repository and are retrieved via nonhier-
archical means such as tag search, tag selection, or tag
clouds.
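The structural difference between the two methods can be illustrated with a minimal sketch. The class names FolderStore and TagStore, their methods, and the file name are hypothetical and chosen only for illustration; they do not correspond to Gmail, Windows 7, or any software used in the studies. The sketch assumes a folder store that maps each item to exactly one folder and a tag store that maps each tag to a set of items, and it reuses the Copenhagen conference example above.

```python
from collections import defaultdict


class FolderStore:
    """Hypothetical folder model: each item lives in exactly one folder."""

    def __init__(self):
        self._folder_of = {}  # item -> its single folder

    def store(self, item, folder):
        # Storing an item again moves it; it never sits in two folders.
        self._folder_of[item] = folder

    def retrieve(self, folder):
        return sorted(i for i, f in self._folder_of.items() if f == folder)


class TagStore:
    """Hypothetical tag model: one flat repository, any number of tags per item."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)  # tag -> items carrying it

    def store(self, item, *tags):
        for tag in tags:
            self._items_by_tag[tag].add(item)

    def retrieve(self, tag):
        # .get avoids creating an empty entry for an unknown tag.
        return sorted(self._items_by_tag.get(tag, set()))


photo = "copenhagen_conference.jpg"  # illustrative item name

folders = FolderStore()
folders.store(photo, "Trips")  # the user must commit to one classification

tags = TagStore()
tags.store(photo, "Pictures", "Trips", "Conferences", "Copenhagen")

# Retrieval by a classification other than the one chosen at storage time:
folder_hits = folders.retrieve("Conferences")  # empty: item is under "Trips"
tag_hits = tags.retrieve("Conferences")        # found: every tag is a retrieval path
```

In this sketch, the folder user who later looks under "Conferences" finds nothing because the item was filed under "Trips," whereas any of the four tags leads back to the item. Whether users actually exploit this flexibility is the empirical question examined here.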
Which option is preferable when both options are avail-
able and users are familiar with both? To our knowledge,
this is the first research study to test this question. We con-
ducted two studies: In one, the Gmail study, we first intro-
duced 75 participants to the folder-labeling and tag-labeling
options of the Gmail interface, waited 1 month, and then
observed their mailboxes to analyze their actual behavior. In
the second study, the Windows 7 study, we asked 23 partici-
pants to tag all files that they used for a period of 2 weeks,
then returned after 5 weeks to observe, using special soft-
ware, the amount of tagging performed on new files and
conducted in-depth interviews regarding the users’ behavior.
In the Windows 7 study, we also compared the retrieval
efficiency of the two methods using a controlled test.
Theoretical Background
Folder Hierarchies
Hierarchical storage was first introduced to end-users in
the Multics operating system in the mid-1960s. Users were
Received August 12, 2012; revised December 4, 2012; accepted December 4, 2012
© 2013 ASIS&T • Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.22906
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, ••(••):••–••, 2013