zippy/samples/llm-generated/2110.10319_generated.txt

Abstract

While large-scale pretrained language models have been shown to learn effective linguistic representations for many NLP tasks, there remain many real-world contextual aspects of language that current approaches do not capture. For instance, consider the cloze test "I enjoyed the ____ game this weekend": the correct answer depends heavily on where the speaker is from, when the utterance occurred, and the speaker's broader social milieu and preferences. Although language depends heavily on the geographical, temporal, and other social contexts of the speaker, these elements have not been incorporated into modern transformer-based language models. We propose a simple but effective approach to incorporate speaker social context into the learned representations of large-scale language models. Our method first learns dense representations of social contexts using graph representation learning algorithms and then primes language model pretraining with these social context representations. We evaluate our approach on geographically sensitive language modeling tasks and show a substantial improvement (more than 100% relative lift on MRR) compared to baselines.

Introduction

Language models are at the very heart of many modern NLP systems and applications (Young et al., 2018). Representations derived from large-scale language models are used widely in many downstream NLP models (Peters et al., 2018; Devlin et al., 2019). However, an implicit assumption made in most modern NLP systems (including language models) is that language is independent of extra-linguistic context such as speaker/author identity and their social setting. While this simplifying assumption has undoubtedly encouraged remarkable progress in modeling language, there is overwhelming evidence in sociolinguistics that language understanding is influenced by the social context in which language is grounded (Nguyen et al., 2016; Hovy, 2018; Mishra et al., 2018; Garten et al., 2019; Flek, 2020; Bender and Koller, 2020). In fact, language use on social media, where every utterance is grounded in a specific social context (such as time, geography, social groups, and communities), reinforces this often ignored aspect of language. When NLP applications ignore this social context, they may perform sub-optimally, underscoring the need for a richer integration of social contexts into NLP models (Pavalanathan et al., 2015; Lynn et al., 2017; Zamani et al., 2018; Lynn et al., 2019; May et al., 2019; Kurita et al., 2019; Welch et al., 2020a; Hovy and Yang, 2021).

Prior attempts to better leverage the social context surrounding language while learning language representations have mostly focused on learning social-context-dependent word embeddings, which have been used primarily to characterize language variation across dimensions such as time, geography, and demographics. These methods learn word embeddings for each specific social context and can capture how word meanings vary across these dimensions (Bamman et al., 2014; Kulkarni et al., 2015; Hamilton et al., 2016; Welch et al., 2020a,b). However, word-embedding-based approaches suffer from two fundamental limitations: (a) word embeddings are not linguistically contextualized, as noted by Peters et al. (2018); and (b) word embedding learning is transductive: such methods can only generate embeddings for words observed during training, and they assume a finite vocabulary and a fixed set of social contexts, all of which must be seen during training. A sketch of this per-context setup and its limitation follows below.
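To make the per-context embedding setup and its transductive limitation concrete, below is a minimal sketch using gensim's Word2Vec on invented toy data; the corpora, context labels, and hyperparameters are illustrative assumptions, not the cited authors' code.

import gensim
from gensim.models import Word2Vec

# Tiny invented corpora keyed by geographic context (toy data only).
corpora = {
    "us": [["enjoyed", "the", "football", "game"], ["great", "touchdown"]],
    "uk": [["enjoyed", "the", "football", "match"], ["brilliant", "goal"]],
}

# Train one separate embedding space per context, in the spirit of
# Bamman et al. (2014): the same word (e.g., "football") can acquire
# different neighbors in each region.
models = {
    ctx: Word2Vec(sentences, vector_size=50, min_count=1, epochs=20)
    for ctx, sentences in corpora.items()
}

# The transductive limitation in action: a context never seen during
# training has no embedding space at all, and unseen words have no vector.
print("au" in models)                 # False: no model for this context
print("cricket" in models["uk"].wv)   # False: word unseen during training

Nothing in this setup can produce sensible vectors for the unseen context or the unseen word, which is exactly the gap that inductive, socially contextualized representations aim to close.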
Recent approaches have addressed the first limitation by learning word representations that are contextualized by their token-specific usage context (Peters et al., 2018; Devlin et al., 2019; Liu et al., 2019; Yang et al., 2019a,b). The second limitation has been addressed by WordPiece tokenization methods (Schuster and Nakajima, 2012; Devlin et al., 2019; Liu et al., 2019). While these approaches have successfully captured linguistic context, they still do not capture social context in language representations. "How can we learn linguistically contextualized and socially contextualized language representations?" is the question we seek to answer in this paper. We propose LMSOC to (a) learn representations of tokens that are both linguistically contextualized and socially sensitive and (b) enable the language model to inductively generate representations for language grounded in social contexts it has never observed during language model pretraining. As an example, our model can enable NLP systems to associate the right entity being referred to based on the broader user/social context in which an utterance like "the city" is grounded.

Conclusion

We proposed a method to learn socially sensitive contextualized representations from large-scale language models. Our method embeds social contexts in a continuous space using graph representation learning algorithms and combines them with a simple but effective socially sensitive pretraining approach. Our approach enables language models to exploit correlations between social contexts and thus generalize better to social contexts not observed during training. More broadly, our method sets the stage for future research on incorporating new types of social context and enabling NLP models to better leverage the social context surrounding an utterance.
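To make the two-stage recipe above concrete, the following is a minimal, hypothetical sketch of the priming step using PyTorch and Hugging Face transformers: a dense social-context vector (standing in for the output of the graph representation learning stage) is projected and prepended to the token embeddings before a masked-LM forward pass. The random stand-in vector and the projection layer are our own illustrative assumptions, not the paper's actual architecture.

import torch
from transformers import BertTokenizer, BertForMaskedLM

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Stand-in for stage 1: a dense social-context vector that would come
# from a graph representation learning step (e.g., over a location graph).
social_vec = torch.randn(64)
proj = torch.nn.Linear(64, model.config.hidden_size)  # illustrative projection

# Stage 2 (priming): prepend the projected social vector to the token
# embeddings so that every prediction is conditioned on the social context.
enc = tok("I enjoyed the [MASK] game this weekend.", return_tensors="pt")
tok_embeds = model.bert.embeddings.word_embeddings(enc["input_ids"])
prefix = proj(social_vec).view(1, 1, -1)
inputs_embeds = torch.cat([prefix, tok_embeds], dim=1)
attn = torch.cat(
    [torch.ones(1, 1, dtype=torch.long), enc["attention_mask"]], dim=1
)

with torch.no_grad():
    logits = model(inputs_embeds=inputs_embeds, attention_mask=attn).logits

# The [MASK] position shifts by one because of the prepended prefix vector.
mask_pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero().item() + 1
print(tok.convert_ids_to_tokens(logits[0, mask_pos].topk(5).indices.tolist()))

In actual pretraining, the prefix would be the learned graph embedding of the utterance's true social context (and the projection would be trained jointly), letting the model share statistical strength across nearby contexts and generalize to contexts unseen during training.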