Dynamics of Software Development Crowdsourcing
Alpana Dubey*, Kumar Abhinav†, Sakshi Taneja†, Gurdeep Virdi*, Anurag Dwarakanath*, Alex Kass‡, Mani Suma Kuriakose*
* Accenture Technology Labs, Bangalore, INDIA
† IIIT-Delhi, Delhi, INDIA
‡ Accenture Technology Labs, San Jose, USA
Abstract—The emergence of online labor markets has focused
considerable attention on the prospect of using crowdsourcing
for software development, with its potential to reduce costs,
improve time-to-market, and provide access to high-quality
skills on demand. However, crowdsourcing of software
development is still not widely adopted. A key barrier to
adoption is a lack of confidence that a task will be completed on
time with the required quality standards. While good managers
can develop good, intuitive estimates of task completion when
assigning work to their team members, they might lack similar
intuition for individuals drawn from an online crowd. The
phrase, “Post and Hope” is thus sometimes used when talking
about the crowdsourcing of software-development tasks. The
objective of this paper is to show the value of replacing the
traditional, intuitive assessment of a team’s capability with a
quantitative assessment of the crowd, derived through analysis of
historical performance on similar tasks. This analysis will serve
to transform “Post and Hope” to “Post and Expect.” We
demonstrate this by analyzing data about tasks performed on
two popular crowdsourcing platforms: Topcoder and Upwork.
Analysis of historical data from these platforms indicates that the
platforms indeed demonstrate some level of predictability in task
completion. We have identified certain factors that consistently
contribute to task completion on both the platforms. Our
findings suggest that a data-driven decision processes can play an
important role in successful adoption of crowdsourcing practice
for software development.
Keywords— Crowdsourcing; tracking; forecasting; software
development; workforce analytics.
I. INTRODUCTION
Crowdsourcing is gaining a lot of attention in industry as an
alternative workforce model for software engineering efforts
[8, 10, 11, 12]. Crowdsourcing has already been successfully
applied to micro-tasks, such as image tagging and translation
services [2, 3, 4, 5, 9], which do not require specific deep
skills. In contrast, software development tasks require
sustained effort, complex cognitive skills, and training.
Therefore, there is more uncertainty when posting a software
development task to a crowd: whether someone with the
relevant skills will find the task, take it on, and successfully
complete it in the required timeframe. Making crowdsourced
software engineering a significant, industrial-scale success will
require addressing several challenges [2, 6, 7]. One of the key
challenges is to reliably estimate whether or not a task posted
on a crowdsourcing platform will be completed, and beyond
that, whether it will be completed to the required quality
standard [35]. The situation today is what we refer to as, “Post
and Hope,” reflecting the difficulty of estimating how a crowd
will handle a task. This does not necessarily imply that
crowd workers themselves are less reliable. The crowd may
actually be more reliable in many cases, but task posters lack a
data-driven analysis that could turn Post-and-Hope into
something more predictable, i.e., Post-and-Expect. The high-
level goal of this paper is to transform the Post-and-Hope
model of crowdsourcing to the Post-and-Expect model. To
achieve this goal, we investigate whether patterns can be
detected in what tasks are successfully completed by a given
crowd, and what factors correlate with successful outcomes.
Breaking this focus down a bit further gives rise to the
following research questions:
RQ1: What are the factors that influence task completion?
RQ2: What are the factors that influence quality of results?
RQ3: Can the probability of task completion be predicted
for future tasks based on historical data?
RQ4: Can the quality of a completed task be predicted from
historical data?
To answer the above questions, we analyzed
multiple years’ worth of data from two popular crowdsourcing
platforms on which a significant number of software
development tasks are posted: Topcoder [11] and Upwork [3].
Topcoder uses a competition-based work model: Tasks are
posted as contests and multiple people work on the same task
and submit their work. A set of top submitters are selected as
winners. Upwork uses a model that is closer to traditional
freelancer hiring. People are hired from an online marketplace
to perform a task.
The results of our data analysis are encouraging. Our
experiments with various machine learning algorithms
demonstrate that task completion can be predicted with better
accuracy compared to a baseline algorithm. This study can
serve as a starting point for organizations that are
investigating crowdsourcing as an alternative software
development workforce model.
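To make the prediction setup concrete, the following is a minimal, hypothetical sketch of predicting task completion from historical task features and comparing against a majority-class baseline. The features (reward, duration, registrant count), the synthetic labels, and the simple hand-rolled logistic-regression classifier are all illustrative assumptions, not the paper's actual feature set or algorithms.

```python
import numpy as np

# Synthetic stand-in for historical task data (illustrative only).
rng = np.random.default_rng(0)
n = 2000
reward = rng.uniform(50, 2000, n)          # hypothetical prize/payment
duration = rng.uniform(1, 30, n)           # hypothetical task duration (days)
registrants = rng.poisson(10, n).astype(float)  # hypothetical registrant count
X = np.column_stack([reward, duration, registrants])

# Synthetic completion labels: assume completion odds rise with reward
# and registrants, and fall with duration.
true_logit = 0.003 * reward - 0.1 * duration + 0.2 * registrants - 2.5
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Standardize features and add an intercept column.
Xs = (X - X.mean(0)) / X.std(0)
Xb = np.hstack([np.ones((n, 1)), Xs])

# Simple train/test split.
tr, te = slice(0, 1500), slice(1500, None)

# Logistic regression fitted by gradient descent.
w = np.zeros(Xb.shape[1])
for _ in range(500):
    p = 1 / (1 + np.exp(-Xb[tr] @ w))
    grad = Xb[tr].T @ (p - y[tr]) / 1500
    w -= 0.5 * grad

# Compare against a majority-class ("always predict the common
# outcome") baseline, as any useful predictor must beat it.
pred = (1 / (1 + np.exp(-Xb[te] @ w)) > 0.5).astype(float)
model_acc = (pred == y[te]).mean()
baseline_acc = max(y[te].mean(), 1 - y[te].mean())
print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"model accuracy:    {model_acc:.2f}")
```

On this synthetic data the learned model outperforms the majority-class baseline, which is the kind of lift the analysis in this paper looks for on real platform data.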
The main contributions of the paper are as follows:
1. An exploratory analysis of various characteristics of
two popular crowdsourcing platforms that follow
entirely different engagement models, based on 2.5
years of task data.
2. An assessment of the importance of various features
for predicting task completion and quality of output,
and an approach to predictive analytics based on the
data from the two platforms.
2016 IEEE 11th International Conference on Global Software Engineering
2329-6313/16 $31.00 © 2016 IEEE
DOI 10.1109/ICGSE.2016.13