Abstract

Personalizing dialogue agents is important for dialogue systems to generate more specific, consistent, and engaging responses. However, most current dialogue personalization approaches rely on explicit persona descriptions during inference, which severely restricts their application. In this paper, we propose a novel approach that learns to predict persona information based on the dialogue history, personalizing the dialogue agent without relying on any explicit persona description during inference. Experimental results on the PersonaChat dataset show that the proposed method can improve the consistency of generated responses when conditioning on the predicted profile of the dialogue agent (i.e. "self persona"), and improve the engagingness of the generated responses when conditioning on the predicted persona of the dialogue partner (i.e. "their persona"). We also find that a trained persona prediction model can be successfully transferred to other datasets and help generate more relevant responses.

Introduction

Recently, end-to-end dialogue response generation models (Sordoni et al., 2015; Serban et al., 2016; Bordes et al., 2017) based on recent advances in neural sequence-to-sequence learning (Sutskever et al., 2014; Vaswani et al., 2017) have gained increasing popularity because they can generate fluent responses. However, as the dialogue agent is trained on datasets containing dialogues from many different speakers, it cannot generate personalized responses for the current speaker, making the generated responses less relevant and engaging (Li et al., 2016b). To address this problem, recent studies attempt to personalize dialogue systems by generating dialogue responses conditioned on given persona descriptions, which has been shown to help dialogue agents perform better (Zhang et al., 2018; Mazaré et al., 2018).
However, a major drawback of current dialogue agent personalization approaches is that they require explicit persona descriptions in both the training and inference stages, which severely limits their application in real-world scenarios, because detailed persona descriptions for current speakers are unavailable in most cases. Another problem is that current dialogue personalization approaches are not interpretable, and the role of the additional persona information is unclear.

In this paper, we propose a novel dialogue agent personalization approach that automatically infers the speaker's persona from the dialogue history, which implicitly contains persona information. Our model generates personalized dialogue responses based on the dialogue history and the inferred speaker persona, removing the need for a persona description during inference.

Specifically, we propose two different approaches to perform persona detection. The first approach learns a "persona approximator", which takes the dialogue history as input and is trained to approximate the output representation of a persona encoder that takes the explicit persona description as input. The second approach instead treats persona detection as a sequence-to-sequence learning problem and learns a "persona generator", which takes the dialogue history as input and generates the persona description of the speaker. This approach provides a stronger supervision signal than the first and is more interpretable, as the encoded persona information can be decoded to reconstruct the detected persona description.

Our proposed approach can be used to incorporate both "self persona", the persona information of the dialogue agent, and "their persona", the persona information of the dialogue partner.
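The "persona approximator" objective can be illustrated with a minimal sketch. All module names, architectures, and dimensions below are illustrative assumptions (the paper does not specify them): a persona encoder maps the explicit persona description to a vector, and the approximator is trained with a regression loss to reproduce that vector from the dialogue history alone.

```python
import torch
import torch.nn as nn

class GRUEncoder(nn.Module):
    """Toy encoder: embeds tokens and returns the final GRU hidden state."""
    def __init__(self, vocab_size=1000, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, tokens):               # tokens: (batch, seq_len)
        _, h = self.gru(self.embed(tokens))  # h: (1, batch, hidden_dim)
        return h.squeeze(0)                  # (batch, hidden_dim)

# The persona encoder reads the explicit persona description (training only);
# the approximator reads the dialogue history and mimics the encoder's output.
persona_encoder = GRUEncoder()
approximator = GRUEncoder()

persona_desc = torch.randint(0, 1000, (8, 20))  # dummy persona tokens
history = torch.randint(0, 1000, (8, 50))       # dummy dialogue-history tokens

with torch.no_grad():                           # target representation is fixed
    target = persona_encoder(persona_desc)
pred = approximator(history)
loss = nn.functional.mse_loss(pred, target)     # regression objective
loss.backward()                                 # updates only the approximator
```

At inference time only `approximator(history)` is needed, so the explicit persona description can be dropped entirely.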
On one hand, generating dialogue responses conditioned on the inferred "self persona" helps the dialogue agent maintain a consistent persona during the conversation, thus enhancing the consistency of generated responses without requiring a pre-defined persona description for every dialogue agent. On the other hand, generating dialogue responses conditioned on the predicted persona of the dialogue partner helps the dialogue model generate more engaging responses that are relevant to its dialogue partner. The ability to automatically infer the persona information of the dialogue partner is particularly attractive because, in many real-world application scenarios, the persona information of the user is hardly available before the dialogue starts. In addition, to facilitate training and tackle the lack of training data, we propose to train the persona detection model with multi-task learning, sharing layers and training jointly with the dialogue context encoder in both approaches.

Our experiments on dialogue datasets with and without persona descriptions demonstrate the effectiveness of the proposed approach and show that a trained persona detection model can be successfully transferred to datasets without persona descriptions.

Related Work

A preliminary study on dialogue personalization (Li et al., 2016b) attempts to use a persona-based neural conversation model to capture individual characteristics such as background information and speaking style. However, it requires the current speaker at inference time to have sufficient dialogue utterances included in the training set, and is thus severely restricted by the cold-start problem. Another problem is that current dialogue personalization approaches are not interpretable, and the role of the additional persona information is unclear.
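The persona-based model of Li et al. (2016b) can be sketched roughly as follows (names and sizes are illustrative, not the authors' implementation): each known speaker gets a learned embedding that is concatenated to the decoder input at every step, conditioning generation on speaker identity, which is exactly why an unseen speaker (the cold-start case) cannot be handled.

```python
import torch
import torch.nn as nn

class SpeakerDecoder(nn.Module):
    """Toy persona-based decoder: a per-speaker embedding is appended to
    every decoder input, so generation is conditioned on speaker identity."""
    def __init__(self, vocab_size=1000, emb_dim=64, spk_dim=32,
                 hidden_dim=128, num_speakers=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.speakers = nn.Embedding(num_speakers, spk_dim)
        self.gru = nn.GRU(emb_dim + spk_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, speaker_id):
        x = self.embed(tokens)                        # (batch, seq, emb_dim)
        s = self.speakers(speaker_id)                 # (batch, spk_dim)
        s = s.unsqueeze(1).expand(-1, x.size(1), -1)  # repeat per time step
        h, _ = self.gru(torch.cat([x, s], dim=-1))
        return self.out(h)                            # (batch, seq, vocab)

decoder = SpeakerDecoder()
logits = decoder(torch.randint(0, 1000, (4, 12)), torch.tensor([0, 1, 2, 3]))
```

Because `self.speakers` only has rows for speakers seen in training, a new user has no embedding to look up, illustrating the cold-start limitation noted above.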
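The "persona generator" combined with the multi-task layer sharing can be sketched as follows. Everything here is an illustrative assumption rather than the paper's actual architecture: a single dialogue-history encoder feeds both a persona decoder (sequence-to-sequence persona generation) and a response decoder, so gradients from both tasks shape the shared representation.

```python
import torch
import torch.nn as nn

V, E, H = 1000, 64, 128  # illustrative vocab / embedding / hidden sizes

class SharedEncoder(nn.Module):
    """Dialogue-history encoder shared by both tasks (multi-task learning)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(V, E)
        self.gru = nn.GRU(E, H, batch_first=True)

    def forward(self, tokens):
        _, h = self.gru(self.embed(tokens))
        return h                                  # (1, batch, H)

class Decoder(nn.Module):
    """Generic GRU decoder initialized from the shared encoder state."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(V, E)
        self.gru = nn.GRU(E, H, batch_first=True)
        self.out = nn.Linear(H, V)

    def forward(self, tokens, h0):
        y, _ = self.gru(self.embed(tokens), h0)
        return self.out(y)                        # (batch, seq, V)

encoder = SharedEncoder()
persona_decoder, response_decoder = Decoder(), Decoder()

history = torch.randint(0, V, (4, 30))
persona_in = torch.randint(0, V, (4, 15))
response_in = torch.randint(0, V, (4, 10))

h = encoder(history)                                # shared representation
persona_logits = persona_decoder(persona_in, h)     # persona-generation head
response_logits = response_decoder(response_in, h)  # response-generation head
# Joint training would sum the two cross-entropy losses so that gradients
# from both tasks flow into the shared encoder.
```

Decoding `persona_logits` back into text is what makes this variant interpretable: the inferred persona can be read off directly rather than existing only as a hidden vector.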