DExperts: On-the-Fly Controlled Text Generation with Experts and Anti-Experts

Tags

NLP

Paper Review

Published

Published May 14, 2021

TL;DR

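A form of controllable generation.

Builds DExperts, an ensemble of language models.

At decoding time, the next-token probability P is replaced by the ensemble's distribution.

Up to three models take part in the ensemble:
Base LM: no additional training, just the pretrained LM.
Expert LM M+ (desired direction): an LM fine-tuned on data with the desired attribute (positive sentiment, non-toxic, etc.).
Anti-expert LM M- (opposite direction): an LM fine-tuned on data with the undesired attribute (toxic, etc.).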
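The LM probability is then taken as $\tilde{P}(X_t \mid x_{<t}) = \mathrm{softmax}\left(z_t + \alpha\,(z_t^{+} - z_t^{-})\right)$, where $z_t$, $z_t^{+}$, and $z_t^{-}$ are the next-token logits of the base LM, M+, and M-.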

Experiments cover two tasks: detoxification and sentiment control.

Fluency and diversity are preserved.

Effectively a follow-up to the GeDi paper; GeDi is used as a comparison baseline.

How DExperts works

DExperts = a decoding-time method for controlled text generation based on a product of experts

DExperts is an ensemble of three (or two) language models:

Off-the-shelf pretrained "base" LM, e.g. GPT-2

Expert LM

Anti-Expert LM
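To make the roles of the three models concrete, here is a minimal toy sketch of a greedy decoding loop. It assumes the combination rule from the DExperts paper (base logits plus `alpha` times the difference between expert and anti-expert logits). The vocabulary, the three `*_lm` functions, and their logit values are hypothetical stand-ins for real language models, chosen only for illustration.

```python
import math

VOCAB = ["great", "fine", "awful", "<eos>"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical stand-ins for real LMs: each maps a token prefix to logits
# over VOCAB. The <eos> logit grows with the prefix length so decoding stops.
def base_lm(prefix):
    return [1.0, 1.0, 1.2, 0.5 + 3 * len(prefix)]

def expert_lm(prefix):        # "fine-tuned" toward the desired attribute
    return [2.0, 1.0, 0.0, 0.5 + 3 * len(prefix)]

def anti_expert_lm(prefix):   # "fine-tuned" toward the undesired attribute
    return [0.0, 1.0, 2.0, 0.5 + 3 * len(prefix)]

def dexperts_step(prefix, alpha):
    """Next-token distribution of the ensemble for one decoding step."""
    z = base_lm(prefix)
    z_pos = expert_lm(prefix)
    z_neg = anti_expert_lm(prefix)
    combined = [b + alpha * (p - n) for b, p, n in zip(z, z_pos, z_neg)]
    return softmax(combined)

def greedy_decode(alpha, max_len=5):
    """Pick the most probable token at each step until <eos> or max_len."""
    prefix = []
    for _ in range(max_len):
        probs = dexperts_step(prefix, alpha)
        token = VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]
        if token == "<eos>":
            break
        prefix.append(token)
    return prefix

print("alpha=0.0:", greedy_decode(0.0))  # base LM only
print("alpha=2.0:", greedy_decode(2.0))  # steered by expert / anti-expert
```

None of the three models is modified during decoding; only the logits are combined at each step, which is why the control happens "on the fly".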

DExperts Formalization

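Given a pretrained LM $M$, an expert LM $M^{+}$ (required), and an anti-expert LM $M^{-}$ (optional), let $z_t$, $z_t^{+}$, and $z_t^{-}$ be their next-token logits at step $t$.

$$\tilde{P}(X_t \mid x_{<t}) = \mathrm{softmax}\left(z_t + \alpha\,(z_t^{+} - z_t^{-})\right)$$

The final ensemble probability is computed as shown above.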
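The "product of experts" reading follows from one line of algebra. The sketch below assumes the ensemble is defined in logit space as in the DExperts paper, with $z_t$, $z_t^{+}$, $z_t^{-}$ the next-token logits of the base LM, expert, and anti-expert, and $P$, $P^{+}$, $P^{-}$ their softmax distributions. The bracket notation $z_t[x_t]$ (the logit of token $x_t$) is introduced here only for the derivation.

```latex
\begin{aligned}
\tilde{P}(x_t \mid x_{<t})
  &= \mathrm{softmax}\bigl(z_t + \alpha\,(z_t^{+} - z_t^{-})\bigr)[x_t] \\
  &\propto \exp\bigl(z_t[x_t]\bigr)\,
     \frac{\exp\bigl(\alpha\, z_t^{+}[x_t]\bigr)}{\exp\bigl(\alpha\, z_t^{-}[x_t]\bigr)} \\
  &\propto P(x_t \mid x_{<t})
     \left( \frac{P^{+}(x_t \mid x_{<t})}{P^{-}(x_t \mid x_{<t})} \right)^{\alpha}
\end{aligned}
```

The last step uses the fact that each model's softmax normalizer does not depend on $x_t$, so it is absorbed into the proportionality constant. A token is therefore boosted when the expert rates it more likely than the anti-expert, and suppressed in the opposite case.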

Here $\alpha$ is a freely tunable hyperparameter; the paper later sets it to 2.0.

The larger $\alpha$ is, the more the base PLM's distribution is modified.
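The effect of alpha can be checked numerically. The sketch below assumes the logit-space combination from the DExperts paper and uses made-up logits over a three-token vocabulary. It measures how far the ensemble moves from the base distribution using KL divergence, which is a metric chosen here for illustration and is not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dexperts_probs(z, z_pos, z_neg, alpha):
    """softmax(z + alpha * (z_pos - z_neg)), computed elementwise."""
    return softmax([b + alpha * (p - n) for b, p, n in zip(z, z_pos, z_neg)])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits (illustrative only).
z = [2.0, 1.0, 0.5]       # base LM
z_pos = [1.0, 2.0, 0.5]   # expert
z_neg = [2.5, 0.5, 0.5]   # anti-expert

base = softmax(z)
for alpha in (0.0, 1.0, 2.0):
    probs = dexperts_probs(z, z_pos, z_neg, alpha)
    shift = kl_divergence(probs, base)
    print(f"alpha={alpha}: probs={[round(p, 3) for p in probs]} KL={shift:.3f}")
```

With alpha set to 0 the ensemble reduces to the base LM, and the divergence from the base distribution grows as alpha increases.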