Abstract

A critical point of multi-document summarization (MDS) is to learn the relations among various documents. In this paper, we propose a novel abstractive MDS model, in which we represent multiple documents as a heterogeneous graph, taking semantic nodes of different granularities into account, and then apply a graph-to-sequence framework to generate summaries. Moreover, we employ a neural topic model to jointly discover latent topics that can act as cross-document semantic units to bridge different documents and provide global information to guide summary generation. Since topic extraction can be viewed as a special type of summarization that "summarizes" texts into a more abstract format, i.e., a topic distribution, we adopt a multi-task learning strategy to jointly train the topic and summarization modules so that they promote each other. Experimental results on the Multi-News dataset demonstrate that our model outperforms previous state-of-the-art MDS models on both ROUGE metrics and human evaluation, while also learning high-quality topics.

Introduction

Multi-document summarization (MDS) is the task of creating a fluent and concise summary for a collection of thematically related documents. Compared to single-document summarization, it requires the ability to incorporate perspectives from multiple sources and is therefore arguably more challenging (Lin and Ng, 2019). Broadly, existing studies can be classified into two categories: extractive and abstractive. Extractive approaches directly select important sentences from the input documents, which is usually framed as a sentence labeling (Nallapati et al., 2016; Zhang et al., 2018; Dong et al., 2018) or sentence ranking task (Narayan et al., 2018). By contrast, abstractive models typically use natural language generation techniques to produce a summary word by word. In general, extractive methods are more efficient and can avoid grammatical errors (Cui et al., 2020), while abstractive methods are more flexible and human-like because they can generate words absent from the input (Lin and Ng, 2019).

Recently, with the development of representation learning for NLP (Vaswani et al., 2017; Devlin et al., 2018) and large-scale datasets (Fabbri et al., 2019), some studies have achieved promising results on abstractive MDS (Liu and Lapata, 2019; Jin et al., 2020). Nevertheless, we find two limitations that previous studies have not addressed. First, some works simply concatenate multiple documents into a flat sequence and then apply single-document summarization approaches (Liu et al., 2018; Fabbri et al., 2019). However, this paradigm fails to consider hierarchical document structure, which plays a key role in MDS (Jin et al., 2020). Also, the concatenation operation inevitably produces a lengthy sequence, and encoding long texts for summarization is a challenge in itself (Cohan et al., 2018). Second, when dealing with multiple documents, a critical point is to learn the cross-document relations. Some studies address this problem by mining co-occurring words or entities (Wang et al., 2020a), which can hardly capture implicit associations given the diversity of language expression. Other studies (Jin et al., 2020; Liu and Lapata, 2019) first generate low-dimensional vectors at the sentence or paragraph level and then model interactions based on these highly compressed representations.
These methods inevitably lose large amounts of fine-grained interaction features and damage model interpretability. Therefore, how to learn cross-document relations effectively remains an open question.

To address these gaps, this paper proposes a novel abstractive MDS model that marries topic modeling with abstractive summary generation. The motivation is that both tasks aim to distill salient information from large amounts of text and could therefore provide complementary features for each other. Concretely, we jointly optimize a neural topic model (NTM) (Miao et al., 2017; Srivastava and Sutton, 2017), which learns the topic distributions of the source documents and corpus-level topic representations, and an abstractive summarizer, which incorporates the latent topics into the summary generation process. In the encoding stage, we represent multiple documents as a heterogeneous graph consisting of word, topic, and document nodes and encode it with a graph neural network to capture the interactions among semantic units of different granularities. In the decoding stage, we devise a topic-aware decoder that leverages the learned topics to guide summary generation. We train the two modules within a multi-task learning framework, where an inconsistency loss penalizes the difference between the topic distribution of the source documents and that of the generated summary. This encourages the model to generate summaries that are thematically consistent with their source documents and also helps the two modules learn from each other. In this manner, better topics yield better summaries and vice versa. We conduct thorough experiments on the recently released Multi-News dataset, which demonstrate the effectiveness of our model.
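To make the multi-task objective concrete, below is a minimal PyTorch sketch of two components named above: a VAE-style neural topic model (in the spirit of Miao et al., 2017) that maps a bag-of-words vector to a topic distribution, and an inconsistency term that penalizes divergence between the source documents' topic mixture and the generated summary's. This is not the authors' released implementation; the layer sizes, the KL form of the inconsistency loss, and the loss weights are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTopicModel(nn.Module):
    """VAE-style NTM: bag-of-words -> document-topic distribution theta."""

    def __init__(self, vocab_size: int, num_topics: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)
        # Topic-word matrix used to reconstruct the bag of words.
        self.decoder = nn.Linear(num_topics, vocab_size, bias=False)

    def forward(self, bow: torch.Tensor):
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick, then softmax to get a topic mixture.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        theta = F.softmax(z, dim=-1)
        recon = F.log_softmax(self.decoder(theta), dim=-1)
        # Negative ELBO: reconstruction NLL + KL to a standard normal prior.
        nll = -(bow * recon).sum(-1).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return theta, nll + kl

def inconsistency_loss(theta_src, theta_sum, eps=1e-10):
    """KL(theta_src || theta_sum) between the source documents' topic
    distribution and the generated summary's (the KL form is an assumption)."""
    p, q = theta_src.clamp_min(eps), theta_sum.clamp_min(eps)
    return (p * (p.log() - q.log())).sum(-1).mean()

def joint_loss(summ_nll, ntm_loss, theta_src, theta_sum,
               lam_topic=1.0, lam_inc=1.0):
    # Multi-task objective: summarizer NLL + NTM loss + inconsistency term.
    return summ_nll + lam_topic * ntm_loss + lam_inc * inconsistency_loss(theta_src, theta_sum)
```

Under this reading, gradients from the inconsistency term flow into both the summarizer (through the summary's topic mixture) and the NTM (through the source mixture), which is one plausible mechanism for the mutual promotion the paper describes.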