zippy/samples/llm-generated/2111.05241_generated.txt


Abstract

Crowdsourcing requesters on Amazon Mechanical Turk (AMT) have raised questions about the reliability of the workers. The AMT workforce is very diverse and it is not possible to make blanket assumptions about them as a group. Some requesters now reject work en masse when they do not get the results they expect. This has the effect of giving each worker (good or bad) a lower Human Intelligence Task (HIT) approval score, which is unfair to the good workers. It also has the effect of giving the requester a bad reputation on the workers' forums. Some of the issues causing the mass rejections stem from the requesters not taking the time to create a well-formed task with complete instructions and/or not paying a fair wage. To explore this assumption, this paper describes a study that looks at the crowdsourcing HITs on AMT that were available over a given span of time and records information about those HITs. This study also records information from a crowdsourcing forum on the worker perspective on both those HITs and on their corresponding requesters. Results reveal issues in worker payment and presentation issues such as missing instructions or HITs that are not doable.

1 Introduction

The rise of artificial intelligence has led to a surge in the need for data. Models are becoming increasingly complex, needing more fine-grained data. In order to quickly collect such data, many researchers have turned to crowdsourcing. The crowdsourcing platform that is most familiar to requesters is Amazon Mechanical Turk (AMT). Anyone in many countries can create an Amazon account and offer work, in the form of HITs, to AMT workers. Desiring rapid results, requesters often post HITs quickly, not taking into account that the way in which a HIT is presented has a direct effect on the quality of the data they obtain. They may also not be aware that their actions as requesters affect the quality of the data they gather.
For example, a HIT could have unclear instructions, low payment, or in some cases might not even function. The first two cases can lead to lower data quality. Lately, this has led requesters who see how to revise their HIT for better results to refuse to pay for their earlier, flawed HIT. Thus they send out mass rejections (rejecting all of the work on a given HIT regardless of its quality). While for the requester this is just restarting a HIT, the effect of this mass rejection is felt on both sides. The workers are not paid for the time they spent, regardless of the quality of their work. Also, since their work has been rejected, their individual rating goes down, making it hard for them to qualify for other HITs in the future. The requester also gets a bad reputation for the quality of their HITs and their payments on workers' forums such as Turkopticon (Irani and Silberman, 2013), TurkerView, Turker Nation, and MTurk Crowd. These sites are regularly visited by many of the workers to determine which HITs are considered to be a reliable source of income. They also use the information to avoid certain requesters. Requesters can monitor their reputation on these sites and use worker feedback to improve their HITs and thus eventually their reputations.

2 Background

Recently, recruiting workers and bringing them into a lab has become much less desirable due to the high cost and the lack of diversity of the workers. In response to this issue, AMT has grown in popularity amongst researchers (Paolacci et al., 2010).

2.1 Worker Payment

A major issue affecting both the interaction between the requester and the worker and the data quality is worker payment. Hara et al. (2018) have shown that the mean wage for a worker on AMT is very low, $3.13 per hour at the time, while requesters generally pay on average $11.58 per hour. This apparent disparity is due to the fact that the most abundant HITs tend to be the lower paid ones.
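The arithmetic behind this kind of disparity can be sketched with a small example. The code below is only an illustration under stated assumptions: the `Requester` class, both function names, and every number are hypothetical and are not taken from Hara et al. (2018). It compares an average in which each requester counts once with an average weighted by how much work each requester posts.

```python
from dataclasses import dataclass


@dataclass
class Requester:
    hourly_rate: float   # effective pay in dollars per hour (hypothetical value)
    hours_posted: float  # hours of work this requester makes available (hypothetical value)


def requester_average(requesters):
    """Unweighted mean: each requester counts once, however little work they post."""
    return sum(r.hourly_rate for r in requesters) / len(requesters)


def volume_weighted_average(requesters):
    """Mean pay per hour of available work: abundant HITs dominate the result."""
    total_hours = sum(r.hours_posted for r in requesters)
    total_pay = sum(r.hourly_rate * r.hours_posted for r in requesters)
    return total_pay / total_hours


# Three well-paying requesters with little work, one low-paying requester with a lot.
requesters = [
    Requester(hourly_rate=20.0, hours_posted=10),
    Requester(hourly_rate=12.0, hours_posted=20),
    Requester(hourly_rate=10.0, hours_posted=20),
    Requester(hourly_rate=2.0, hours_posted=950),
]

print(f"average across requesters: ${requester_average(requesters):.2f}/h")
print(f"average across hours of work: ${volume_weighted_average(requesters):.2f}/h")
```

With these made-up figures, the average across requesters is $11.00 per hour while the average across available hours of work is $2.54 per hour, because almost all of the available work comes from the lowest-paying requester.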
Researchers may believe that AMT workers are willing to work for very little, and that the amount of compensation does not affect data quality (Buhrmester et al., 2016; Mason and Watts, 2009). In practice, a good worker will try to work on the better paying HITs when possible. But if it is the end of the month and rent is due, workers are obliged to take whatever HITs are available at that time even if they don't pay well. This may result in the worker spending less time reading instructions or actually working on a HIT, in order to make the meager payment cover less work time. This usually results in lower quality HIT data.

2.2 Communication

In addition to its regular nature, the interaction between requesters and workers is text-based, and the HITs themselves are communicative.