The full thesis is available here.

Automated data science is a growing field that aims to make the application of data
science techniques more efficient, accurate, and accessible. One of the earliest and most
important tasks in the data science process is the development of relevant hypotheses.
Humans possess the creativity and sensemaking ability needed to come up with hypotheses,
and this ability can be used in a crowdsourcing environment to generate hypotheses about
a dataset. Meanwhile, Qiu et al. (2020c) proposed conversational crowdsourcing, which can
make the crowdsourcing process more intuitive and user-friendly and can help to increase
participation and user engagement. Inspired by this work, this thesis adopts the concept
of conversational crowdsourcing to have a non-expert crowd generate hypotheses.
This thesis investigates the impact of the conversational style used to communicate
with the crowdworker. Two distinct conversational styles, “machine-like” and “mixed”,
were developed. The thesis also examines whether the information elements presented to
the crowdworker, such as text and visualizations, affect the quality of the hypotheses
the crowdworkers generate. Finally, it considers how conversational styles and
information elements affect the cognitive load of the crowd. To this end, an experiment
was conducted in which 40 workers from the Amazon MTurk platform generated 164 hypotheses
in a chat-based environment. Domain experts then rated the quality of the generated
hypotheses.
The analysis shows that there are complex interdependencies across the experimental
conditions. The results indicate that text-based information elements and a mixed
conversational style put less cognitive load on the worker. Furthermore, the results
suggest that specific combinations of conversational style and information elements
influence the quality of the hypotheses. In particular, text-based information elements
combined with a machine-like conversational style yield hypotheses of higher quality than
other combinations. However, the results presented in this thesis are not statistically
significant. Further research is required to strengthen the analysis done in this work.
