FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Tags

NLP

TLDR paper review

Published

May 24, 2021

Link

https://arxiv.org/pdf/2103.06413.pdf

TL;DR

Prior work focuses on word-level debiasing.

This paper targets sentence-level debiasing.

Debiasing of encoder models at the sentence level has received little attention.

Proposes "FairFil" (fair filter), a model that debiases pretrained language model (PLM) encoders.

Uses contrastive learning
Lowers the correlation between filtered embeddings and bias words
Preserves semantic information
The PLM itself does not need to be retrained

A post-hoc method

Reduces the degree of bias

Performs well on downstream tasks too

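An existing method called "Sent-Debias" already addresses this problem

Uses PCA to estimate a bias subspace and removes it from the feature vectors
Issue: is bias really confined to a linear subspace?
Depends too heavily on the training data
Hard to generalize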
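
To make the "linear" assumption concrete, here is a minimal sketch of the projection step only. It assumes the bias direction has already been estimated (for example as a top principal component); the function names and toy vectors are hypothetical and are not taken from the Sent-Debias code.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_direction(embedding, bias_direction):
    """Subtract the component of `embedding` that lies along `bias_direction`."""
    norm_sq = dot(bias_direction, bias_direction)
    scale = dot(embedding, bias_direction) / norm_sq
    return [e - scale * b for e, b in zip(embedding, bias_direction)]


# Toy values: the second axis plays the role of the estimated bias direction.
bias_direction = [0.0, 1.0, 0.0]
sentence_embedding = [0.5, 2.0, -1.0]
debiased = remove_direction(sentence_embedding, bias_direction)
```

Anything that lies along the removed direction is discarded, whether it is bias or useful signal, and any bias that is not captured by a linear direction is left untouched.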

This method is also similar to a paper I read earlier:

Debiasing Pre-trained Contextualised Embeddings

FairFil Net takes the existing embedding as input and outputs a debiased embedding.

Borrows the idea of multi-view contrastive learning

Given training data,
data augmentation is applied

so that the potential bias direction points the other way.

In the paper's example table, gender is the keyword: only the gendered words in a sentence are changed.
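
The augmentation described above can be sketched as a simple word swap. The swap list below is a tiny hypothetical one chosen for illustration (a real list would be much longer and would need to handle ambiguous words such as "her"); it is not the list used in the paper.

```python
import re

# Hypothetical, deliberately small list of gendered word pairs.
GENDER_PAIRS = [("he", "she"), ("his", "her"), ("man", "woman"), ("father", "mother")]

# Build a lookup that swaps in both directions.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def augment(sentence):
    """Return the sentence with every listed gendered word replaced by its counterpart."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # not a gendered word: leave unchanged
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"[A-Za-z]+", replace, sentence)


augmented = augment("He is a father and his job is teaching.")
```

The original and augmented sentences share the same meaning apart from the sensitive attribute, which is what makes them usable as two views of one example.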

Maximize the mutual information between the original embedding and the debiased embedding (to preserve performance)

Uses a technique called InfoNCE.
Follows the SimCLR paper.
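
A minimal sketch of an InfoNCE-style loss, assuming the positive pair is the filtered embedding of a sentence and that of its augmented version. The cosine similarity and the temperature value are borrowed from the SimCLR setup as assumptions; the scoring function actually used in FairFil may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean cross-entropy of picking each anchor's own positive among all positives.

    anchors[i] and positives[i] form a positive pair; positives[j] with j != i
    act as negatives for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy 2-D "filtered embeddings" of two sentences and their augmented versions.
filtered_original = [[1.0, 0.0], [0.0, 1.0]]
filtered_augmented = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce_loss(filtered_original, filtered_augmented)
shuffled_loss = info_nce_loss(filtered_original, filtered_augmented[::-1])
```

Minimizing this loss maximizes a lower bound on the mutual information between the two views, which is why correctly matched pairs give a much smaller loss than shuffled ones.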

Debias Regularizer: minimizes the mutual information between the debiased embedding and the sensitive word's embedding
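
Putting the two terms together, the training objective can be written roughly as below. The notation is my own shorthand rather than the paper's: $d$ and $d'$ are the two embeddings whose mutual information is maximized, $w$ is the sensitive word's embedding, $\hat{I}$ denotes a mutual information estimator, and $\beta$ is an assumed weighting hyperparameter.

```latex
\mathcal{L} \;=\; -\,\hat{I}_{\mathrm{NCE}}(d;\, d') \;+\; \beta\, \hat{I}_{\mathrm{reg}}(w;\, d)
```

The first term keeps semantic information, and the second term pushes sensitive-word information out of the filtered embedding.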

Bias Evaluation

SEAT dataset
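
SEAT applies the WEAT association test to sentence embeddings and reports an effect size. Below is a minimal sketch of that effect size on precomputed vectors. The toy 2-D vectors are made up, and the use of the sample standard deviation is an assumption (implementations differ on this detail).

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def association(w, attrs_a, attrs_b):
    """How much closer w is to attribute set A than to attribute set B."""
    mean_a = statistics.mean([cosine(w, a) for a in attrs_a])
    mean_b = statistics.mean([cosine(w, b) for b in attrs_b])
    return mean_a - mean_b


def effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """WEAT/SEAT-style effect size; values near 0 indicate little measured bias."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (statistics.mean(s_x) - statistics.mean(s_y)) / statistics.stdev(s_x + s_y)


# Made-up embeddings: X leans toward attribute A, Y leans toward attribute B.
X = [[1.0, 0.1], [0.9, 0.2]]
Y = [[0.1, 1.0], [0.2, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

biased = effect_size(X, Y, A, B)
```

In practice X, Y, A, and B would be encoder outputs for the SEAT template sentences, and a debiasing method is judged by how close it brings the effect size to zero.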

Downstream performance evaluation

SST-2
CoLA
QNLI

Performs better than the Sent-D (Sent-Debias) method

Sent-D removes bias with PCA
Information is lost the moment it is removed, whereas FairFil, which trains a new neural network, suffers less from this drawback

Word embeddings obtained by averaging over template-generated sentences are compared using t-SNE.

It seems this is, after all, a commonly used way to extract non-contextualized embeddings.
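
The template-averaging trick can be sketched as follows. The templates and the `toy_encode` function are hypothetical stand-ins; in practice `encode` would be the pretrained sentence encoder (optionally followed by the filter), and the resulting vectors would then be passed to t-SNE for plotting.

```python
# Hypothetical templates; the paper's actual templates are not reproduced here.
TEMPLATES = ["This is {}.", "That is {}.", "Here is {}."]


def toy_encode(sentence, dim=4):
    """Stand-in for a real sentence encoder: folds character codes into a small vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(sentence):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def word_embedding(word, encode=toy_encode):
    """Average the sentence embeddings of all templates filled with `word`."""
    sentence_vectors = [encode(t.format(word)) for t in TEMPLATES]
    n = len(sentence_vectors)
    return [sum(column) / n for column in zip(*sentence_vectors)]


doctor_vec = word_embedding("doctor")
```

Averaging over several templates washes out the influence of any single context, which is what makes the result behave like a non-contextualized word vector.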