Reality is also a Pizza

1. Tokenizer: Text is not fed to the model as-is. A Tokenizer first tokenizes the text, then maps each token to its unique id, and those ids are passed to BertEmbeddings. `from transformers import AutoTokenizer` `tokenizer = AutoTokenizer.from_pretrained("klue/bert-base")` The tokenizer returns an object holding `input_ids`, `token_type_ids`, and `attention_mask`, and the value of each field is a list. input_ids: the list of token ids. token_type_ids: BERT can take two sentences as input (Se..

Huggingface BERT Analysis

Boostcamp AI Tech 4th Cohort
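
To make the three fields from the tokenizer excerpt above concrete, here is a minimal sketch in plain Python. It is a toy: the vocabulary, the whitespace splitting, and the `toy_encode` helper are all hypothetical and are not the real `klue/bert-base` WordPiece tokenizer. It only mimics the shape of the output, assuming the usual BERT layout of `[CLS] A [SEP] B [SEP]` followed by padding.

```python
# Toy illustration only: a made-up vocabulary and whitespace splitting,
# NOT the real klue/bert-base tokenizer.
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "reality": 4, "is": 5, "also": 6, "a": 7, "pizza": 8}


def toy_encode(text_a, text_b=None, max_length=12):
    """Return a dict of three equal-length lists, shaped like BERT tokenizer output."""
    def to_ids(text):
        # Unknown words fall back to the [UNK] id.
        return [TOY_VOCAB.get(tok, TOY_VOCAB["[UNK]"]) for tok in text.lower().split()]

    # First segment: [CLS] tokens_a [SEP], all marked as segment 0.
    input_ids = [TOY_VOCAB["[CLS]"]] + to_ids(text_a) + [TOY_VOCAB["[SEP]"]]
    token_type_ids = [0] * len(input_ids)

    # Optional second segment: tokens_b [SEP], all marked as segment 1.
    if text_b is not None:
        ids_b = to_ids(text_b) + [TOY_VOCAB["[SEP]"]]
        input_ids += ids_b
        token_type_ids += [1] * len(ids_b)

    if len(input_ids) > max_length:
        raise ValueError("input longer than max_length (truncation not handled in this toy)")

    # Real tokens get attention 1; padding positions get attention 0.
    attention_mask = [1] * len(input_ids)
    pad = max_length - len(input_ids)
    input_ids += [TOY_VOCAB["[PAD]"]] * pad
    token_type_ids += [0] * pad
    attention_mask += [0] * pad

    return {"input_ids": input_ids,
            "token_type_ids": token_type_ids,
            "attention_mask": attention_mask}


if __name__ == "__main__":
    encoded = toy_encode("Reality is also", "a pizza", max_length=10)
    for name, values in encoded.items():
        print(f"{name:15s} {values}")
```

With the real tokenizer the ids and the subword splits will differ, but the three lists line up position by position in the same way.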

Part 1: Foundations of Contrastive Learning (Contrastive Learning Objectives; Contrastive Data Sampling and Augmentation Strategies; Analysis of Contrastive Learning). 1. What is Contrastive Learning: Recent NLP models rely heavily on representation learning algorithms. Contrastive Learning is a technique that learns an embedding space in which pairs of similar data samples are represented close together and pairs of dissimilar samples are pushed far apart. Contrastive Learning requires two essential ingredients..

[Contrastive Data and Learning for Natural Language Processing] - 1.1 Contrastive Learning Objectives

Natural Language Processing
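
The idea in the contrastive learning excerpt above (pull similar pairs together, push dissimilar pairs apart) can be sketched with one common objective. The excerpt does not name a specific loss, so the InfoNCE-style form, the cosine similarity, the `temperature` value, and the toy 2-D embeddings below are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: -log softmax of the positive's similarity.

    The loss is small when the anchor is closer to the positive than to
    every negative, and large otherwise.
    """
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    scaled = [s / temperature for s in sims]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(scaled)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_denominator - scaled[0]


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    near = [0.9, 0.1]    # hypothetical embedding of a similar sample
    far = [0.0, 1.0]     # hypothetical embedding of a dissimilar sample
    other = [-1.0, 0.0]  # another dissimilar sample

    print("positive is near:", info_nce(anchor, near, [far, other]))
    print("positive is far: ", info_nce(anchor, far, [near, other]))
```

Minimizing this loss over an encoder's parameters moves the anchor toward its positive and away from the negatives, which is the geometric picture the excerpt describes.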

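▮ Drawbacks of MSE. Below is the formula for MSE (mean squared error). $$ e = \frac{1}{2} \parallel y-o \parallel ^{2}_{2} $$ The larger the gap between the target value $ y $ and the model output $ o $, the larger the penalty ($e$) given to the model, which looks well suited to learning. So why is MSE not used in deep learning? As an example, suppose a sample with input 1.5 and target output 0 is fed to the model, and consider the following two models: 1) $ \hat{y} = \sigma (0.4x + 0.5) $ → $ \hat{y} = 0.7503 $ 2) $ \hat{y} = \sigma (1.9x + 3.0) $ → $ \hat{y} = 0.9971 $ Model 2, the original tar..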

Why Does Deep Learning Use Cross Entropy Rather Than MSE as the Objective Function?

Deep Learning
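
The MSE excerpt above is cut off before its conclusion, so what follows is a sketch of the standard argument rather than the author's exact derivation. It uses the excerpt's two models and its sample (input 1.5, target 0) and compares the gradient of each loss with respect to the pre-activation $z = wx + b$. The choice of binary cross entropy and the helper name `grads_wrt_preactivation` are assumptions made for this illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grads_wrt_preactivation(w, b, x, y):
    """Return (y_hat, dMSE/dz, dCE/dz) for y_hat = sigmoid(w*x + b).

    MSE: e = 0.5 * (y - y_hat)^2                          -> de/dz = (y_hat - y) * y_hat * (1 - y_hat)
    CE:  e = -[y*log(y_hat) + (1 - y)*log(1 - y_hat)]     -> de/dz = y_hat - y
    """
    y_hat = sigmoid(w * x + b)
    mse_grad = (y_hat - y) * y_hat * (1.0 - y_hat)
    ce_grad = y_hat - y
    return y_hat, mse_grad, ce_grad


if __name__ == "__main__":
    x, y = 1.5, 0.0  # sample from the excerpt: input 1.5, target 0
    for name, (w, b) in {"model 1": (0.4, 0.5), "model 2": (1.9, 3.0)}.items():
        y_hat, mse_grad, ce_grad = grads_wrt_preactivation(w, b, x, y)
        print(f"{name}: y_hat={y_hat:.4f}  dMSE/dz={mse_grad:.4f}  dCE/dz={ce_grad:.4f}")
```

Model 2 is further from the target, yet under MSE the extra factor $\hat{y}(1-\hat{y})$ from the saturated sigmoid makes its gradient smaller than model 1's. Under cross entropy that factor cancels, so the more wrong model receives the larger gradient.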
