Blog

tech diary

Training language models to follow instructions with human feedback

2023-02-15

category: papers

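What is it?

A method that uses reinforcement learning from human feedback to align a language model with human intent.
The model is first fine-tuned with supervised learning on a prompt set prepared by labelers, and is then tuned further with reinforcement learning from human feedback.
- The resulting model is called InstructGPT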
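
The human-feedback step is commonly built on a reward model trained from pairwise comparisons. A minimal sketch of that pairwise preference loss, assuming scalar reward scores for a preferred and a rejected response are already available; the function name and the example scores are hypothetical and not taken from the post:

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    """Return -log(sigmoid(r_chosen - r_rejected)) for a single comparison."""
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(d)) == log(1 + exp(-d)); branch to avoid overflow in exp().
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# Hypothetical scores: the loss is small when the preferred response
# scores higher, and large when the ranking is inverted.
loss_agree = pairwise_preference_loss(2.0, 0.0)
loss_disagree = pairwise_preference_loss(0.0, 2.0)
```

Minimizing this loss pushes the reward model to score the response labelers preferred above the one they rejected; the policy is then optimized against that learned reward.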

Reconstructing Training Data from Trained Neural Networks

2023-01-31

category: papers

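What is it?

Proposes a method that uses gradients to reconstruct training data from a trained neural network model
- In other words, it recovers the training dataset from the trained model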
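
As far as I understand the paper, the reconstruction objective asks that the trained parameters be expressible as a weighted sum of per-sample output gradients, theta = sum_i lambda_i * y_i * grad_theta f(theta; x_i), and then searches for inputs x_i that make this hold. A minimal sketch under a strong simplifying assumption: the model is linear, f(theta; x) = theta . x, so the gradient with respect to theta is just x. All names and the toy data are hypothetical; a real network would need automatic differentiation.

```python
def stationarity_loss(theta, xs, ys, lambdas):
    """Squared norm of theta - sum_i lambda_i * y_i * grad_theta f(theta; x_i).

    Assumes a linear model f(theta; x) = theta . x, whose gradient is x_i.
    """
    recon = [0.0] * len(theta)
    for x, y, lam in zip(xs, ys, lambdas):
        for j, xj in enumerate(x):
            recon[j] += lam * y * xj
    return sum((t - r) ** 2 for t, r in zip(theta, recon))

# Hypothetical toy setup: theta was built from the "true" training points.
theta = [0.5, 0.5]
ys = [1, -1]
lambdas = [0.5, 0.5]
true_xs = [[1.0, 0.0], [0.0, -1.0]]
wrong_xs = [[0.0, 1.0], [1.0, 0.0]]

loss_true = stationarity_loss(theta, true_xs, ys, lambdas)
loss_wrong = stationarity_loss(theta, wrong_xs, ys, lambdas)
```

The true training points drive the loss to zero while the wrong candidates do not, which is the signal a gradient-based search over candidate inputs would follow.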

Sparse and Hierarchical Masked Modeling for Convolutional Representation Learning

2023-01-20

category: papers

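What is it?

Identifies and overcomes two key obstacles to pre-training CNN-based models with masked image modeling
- Convolution operations cannot handle irregular, randomly masked input images
- The single-scale nature of BERT pre-training conflicts with the hierarchical structure of convnets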
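
The first obstacle can be seen in one dimension: a dense convolution over a zero-masked input leaks visible values into the masked positions, so the mask pattern is lost after a single layer. A minimal sketch, assuming that re-applying the mask after the convolution is an acceptable stand-in for a true sparse convolution; the function names and toy values are hypothetical:

```python
def conv1d_same(signal, kernel):
    """Dense 1D cross-correlation with zero padding; output length == input length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def masked_conv1d(signal, mask, kernel):
    """Emulate a sparse convolution: zero masked inputs, convolve, re-apply the mask."""
    visible = [s * m for s, m in zip(signal, mask)]
    out = conv1d_same(visible, kernel)
    return [o * m for o, m in zip(out, mask)]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mask = [1, 1, 0, 0, 1, 1]          # 0 marks a masked position
kernel = [1.0, 1.0, 1.0]

dense_out = conv1d_same([s * m for s, m in zip(signal, mask)], kernel)
sparse_out = masked_conv1d(signal, mask, kernel)
```

In `dense_out` the two masked positions become non-zero, whereas `sparse_out` keeps them at zero, so the masked region stays masked however many layers are stacked.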

Learning Transferable Visual Models From Natural Language Supervision

2023-01-18

category: papers

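What is it?

Training a model to predict which caption goes with which image, using 400 million image-text pairs collected from the web, yields effective image representations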
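
The "which image goes with which caption" objective can be written as a symmetric cross-entropy over an image-text similarity matrix. A minimal pure-Python sketch, assuming the embeddings are already computed and using a fixed temperature of 0.07 as a simplification (in the paper the temperature is a learned parameter); function names and toy embeddings are hypothetical:

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair k is (image_embs[k], text_embs[k])."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: scaled cosine similarity between image i and text j.
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature
               for txt in txts] for img in imgs]
    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image: same check along the columns.
    cols = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t = -sum(log_softmax(cols[k])[k] for k in range(n)) / n
    return (loss_i + loss_t) / 2

matched = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is near zero when every image is most similar to its own caption and grows when the pairs are shuffled.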

Mastering Diverse Domains through World Models

2023-01-17

category: papers

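What is it?

Proposes DreamerV3, a general and highly scalable algorithm based on world models
With fixed hyperparameters it achieves high performance across a wide range of tasks, and it also succeeds on difficult tasks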
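
One ingredient the DreamerV3 paper uses to make a single hyperparameter setting work across domains with very different reward and value scales is the symlog transform, which compresses large magnitudes while staying close to the identity near zero. A minimal sketch of the transform and its inverse; the summary above does not mention it, so treat this as an illustrative aside rather than a description of the full algorithm:

```python
import math

def symlog(x):
    """sign(x) * ln(|x| + 1): compresses large positive and negative values."""
    return math.copysign(math.log1p(abs(x)), x)

def symexp(x):
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)

# Hypothetical targets spanning several orders of magnitude.
raw_targets = [-1000.0, -1.0, 0.0, 1.0, 1000.0]
squashed = [symlog(t) for t in raw_targets]
restored = [symexp(s) for s in squashed]
```

Predicting in the squashed space and decoding with `symexp` keeps regression targets in a similar numeric range whether an environment's rewards are tiny or huge.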

copyright ©️ 2022 - 2025