2020-02-05

Got stuck installing PyAV

Linux

I was on Ubuntu 16.04, which made this a hassle in several ways. ffmpeg 3.0 or later is required.

blog.programster.org

kazuhira-r.hatenablog.com

github.com

2020-02-04

Notes on a site that implements the Transformer in PyTorch

Machine learning Python3

nlp.seas.harvard.edu

Label smoothing and temperature scaling are both ways to suppress overconfidence in the predictions.
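
As a rough illustration of the temperature scaling side, here is a minimal sketch in plain Python. The softmax helper, the logits, and the temperature value are all made up for this memo; it only shows that dividing the logits by a temperature above 1 flattens the distribution.

```python
import math

def softmax(logits, temperature=1.0):
    # Hypothetical helper: softmax over logits divided by a temperature.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.5]                  # made-up scores for three classes
sharp = softmax(logits)                   # temperature 1: the usual softmax
soft = softmax(logits, temperature=2.0)   # temperature 2: flatter distribution

print(max(sharp), max(soft))
```

The top probability shrinks as the temperature grows, while the ranking of the classes stays the same.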

codecrafthouse.jp

.unsqueeze(1) turns target into a column of shape (n, 1). scatter_ then overwrites one entry per row, in a one-hot-like way.

# true_dist.shape == (n, d)
# target.shape == (n, )

true_dist.scatter_(1, target.data.unsqueeze(1), self.confidence)
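
To check the shapes, here is a small standalone sketch. The sizes, the target values, and the way the remaining mass is spread (smoothing / (d - 1), with no padding class) are assumptions for illustration, not the exact setup of the original post.

```python
import torch

n, d = 3, 5                      # assumed: 3 samples, 5 classes
smoothing = 0.1                  # assumed smoothing amount
confidence = 1.0 - smoothing

target = torch.tensor([1, 4, 2])                 # shape (n,)
true_dist = torch.full((n, d), smoothing / (d - 1))

index = target.unsqueeze(1)                      # shape (n, 1): one column index per row
true_dist.scatter_(1, index, confidence)         # row i, column target[i] <- confidence

print(index.shape)
print(true_dist)
```

Each row ends up with the confidence value at its target column and the small smoothing value everywhere else, so every row still sums to 1.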

[0, x / d, 1 / d, 1 / d, 1 / d] represents an overconfident multiclass prediction. The larger x gets, the more overconfident the prediction becomes.

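import numpy as np
import torch
import matplotlib.pyplot as plt

# LabelSmoothing is the class defined in the nlp.seas.harvard.edu post; it is not redefined here.
crit = LabelSmoothing(5, 0, 0.1)

def loss(x):
    # The mass on class 1 grows with x; the other three classes share the rest.
    d = x + 3 * 1
    predict = torch.FloatTensor([[0, x / d, 1 / d, 1 / d, 1 / d]])
    # Variable is no longer needed (PyTorch 0.4+), and .item() replaces .data[0].
    return crit(predict.log(), torch.LongTensor([1])).item()

plt.plot(np.arange(1, 100), [loss(x) for x in range(1, 100)])
plt.show()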
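
To see the shape of that curve without PyTorch, here is a plain-Python sketch. It assumes the criterion is a summed KL divergence against a smoothed target that puts 0.9 on the true class, 0.1 / 3 on each of the other three non-padding classes, and 0 on the padding class (class 0). The function names are made up for this memo.

```python
import math

SMOOTHING = 0.1
NUM_CLASSES = 5   # class 0 plays the role of the padding index (assumption)

def smoothed_target(true_class):
    # 0 on padding, 1 - smoothing on the true class, the rest shared equally.
    other = SMOOTHING / (NUM_CLASSES - 2)
    dist = [other] * NUM_CLASSES
    dist[0] = 0.0
    dist[true_class] = 1.0 - SMOOTHING
    return dist

def kl_loss(x):
    d = x + 3 * 1
    predict = [0.0, x / d, 1 / d, 1 / d, 1 / d]
    target = smoothed_target(1)
    # Entries with target 0 are skipped, so they contribute nothing to the sum.
    return sum(t * (math.log(t) - math.log(p))
               for t, p in zip(target, predict) if t > 0)

losses = {x: kl_loss(x) for x in range(1, 100)}
best_x = min(losses, key=losses.get)
print(best_x, losses[best_x], losses[99])
```

Under these assumptions the loss bottoms out at x = 27, where x / d is exactly 0.9, and rises again as x keeps growing. That rise is the penalty on overconfidence.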

Beam search implementation

github.com
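
For my own reference, a toy beam search in plain Python. This is not the code from the linked repository: the transition table, the token names, and the beam_search signature are all made up, and finished hypotheses are simply set aside when they reach the end token.

```python
import math

# Made-up next-token probabilities, keyed by the last token only.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}

def next_log_probs(seq):
    return {tok: math.log(p) for tok, p in TABLE[seq[-1]].items()}

def beam_search(start, eos, beam_size=2, max_len=5):
    beams = [([start], 0.0)]          # (sequence, summed log probability)
    finished = []
    for _ in range(max_len):
        candidates = [(seq + [tok], score + lp)
                      for seq, score in beams
                      for tok, lp in next_log_probs(seq).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:   # keep only the best few
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)            # unfinished leftovers if max_len was hit
    return max(finished, key=lambda c: c[1])

greedy_seq, greedy_score = beam_search("<s>", "</s>", beam_size=1)
beam_seq, beam_score = beam_search("<s>", "</s>", beam_size=2)
print(greedy_seq, math.exp(greedy_score))
print(beam_seq, math.exp(beam_score))
```

With a beam of 1 the search commits to "a" because it looks best at the first step. With a beam of 2 it keeps "b" alive and finds the sequence with the higher overall probability.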

2020-02-03

Multiple imputation

Python3

PDF on multiple imputation

https://www.ism.ac.jp/~noma/Noma2017JJAS.pdf

It looks like this can be implemented with statsmodels.

2020-01-31

The Two-Stage Least Squares Estimation (2SLS)

Statistics

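Two-stage least squares is one approach to the instrumental variables method. It replaces the way the instrumental variables estimator is computed, in the hope of improving the accuracy of the estimate.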

↓ PDF

http://www3.grips.ac.jp/~yamanota/Lecture%20Note%208%20to%2010%202SLS%20&%20others.pdf

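In addition, for the stage 1 linear regression, the F statistic is used to test how strong the first stage is, that is, whether the instruments are weak.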
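
To make the two stages and the first-stage F statistic concrete, here is a NumPy sketch on simulated data. The data-generating process (one instrument z, one confounder u, a true effect of 2.0) and the ols helper are assumptions made up for this memo.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients and residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                       # instrument
u = rng.normal(size=n)                       # unobserved confounder
x = z + u + rng.normal(size=n)               # endogenous regressor
y = 2.0 * x + 2.0 * u + rng.normal(size=n)   # true effect of x is 2.0

ones = np.ones(n)
Z = np.column_stack([ones, z])

# Stage 1: regress x on the instrument and keep the fitted values.
gamma, rss_full = ols(Z, x)
x_hat = Z @ gamma

# First-stage F statistic: instrument model against the intercept-only model.
_, rss_null = ols(ones[:, None], x)
q = 1                                        # number of instruments tested
f_stat = ((rss_null - rss_full) / q) / (rss_full / (n - Z.shape[1]))

# Stage 2: regress y on the fitted values from stage 1.
beta_2sls, _ = ols(np.column_stack([ones, x_hat]), y)
beta_ols, _ = ols(np.column_stack([ones, x]), y)   # naive OLS for comparison

print(f_stat, beta_ols[1], beta_2sls[1])
```

Plain OLS is pulled away from 2.0 by the confounder, while the 2SLS slope lands close to it. The standard errors from a hand-rolled second stage are not valid as they stand, so this sketch only reports point estimates.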

facweb.cs.depaul.edu

2019-12-11

Inverse transform of PCA

Machine learning

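PCA's inverse_transform works because components_ behaves like an orthogonal matrix (its rows are orthonormal), so the projection can be undone with a transpose. Writing C = components_.T and taking X as mean-centered:

W = X C  -->  W C^T = X

This recovers X exactly only when every component is kept. With fewer components it gives the closest reconstruction within the retained subspace.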
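
A quick NumPy check of that relation. Instead of calling sklearn, this sketch builds the components from an SVD of the centered data, which is an assumption on my part about how to mimic components_. The data is random.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))   # made-up correlated data

mean = X.mean(axis=0)
Xc = X - mean                                  # PCA works on centered data

# Rows of Vt are orthonormal principal axes (playing the role of components_).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt

W = Xc @ components.T                          # forward transform: W = X C
X_back = W @ components + mean                 # inverse: W C^T, then add the mean back

k = 2                                          # keep only the first two components
X_approx = W[:, :k] @ components[:k] + mean    # lossy reconstruction

print(np.allclose(X, X_back), np.abs(X - X_approx).max())
```

With all four components the round trip is exact up to floating point error. With only two it is an approximation.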

alexhwilliams.info

2019-12-07

About category encoding

The category encoders are built on sklearn's BaseEstimator and TransformerMixin.

binary encoding

Builds a one-hot-like representation out of the bit pattern of each category. The column order can be tuned nicely through the value order.

BaseNEncoding

Uses a base-N representation instead of a bit (base-2) representation.
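
A plain-Python sketch of the idea, to keep the difference between base 2 and base N straight. The function names and the choice to number categories from 1 in sorted order are my own assumptions; this does not claim to reproduce the library's exact output.

```python
def base_n_digits(value, base, width):
    # Digits of value in the given base, most significant first, zero-padded.
    digits = []
    for _ in range(width):
        digits.append(value % base)
        value //= base
    return digits[::-1]

def base_n_encode(categories, base=2):
    # Assumption: categories are numbered 1, 2, 3, ... in sorted order.
    ordinals = {c: i for i, c in enumerate(sorted(set(categories)), start=1)}
    width = 1
    while base ** width <= len(ordinals):   # enough digits for the largest ordinal
        width += 1
    return [base_n_digits(ordinals[c], base, width) for c in categories]

colors = ["a", "b", "c", "d", "e"]
binary = base_n_encode(colors, base=2)    # three columns for five categories
ternary = base_n_encode(colors, base=3)   # two columns for five categories
print(binary)
print(ternary)
```

Five categories need five columns with one-hot encoding, three with base 2, and two with base 3, so a larger base trades fewer columns for more values per column.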

contrib.scikit-learn.org

github.com

2019-11-19

■

Python3

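The +2 in rle_encode comes from two separate +1 offsets:

+1 because pixels are numbered starting from 1.
+1 because the comparison of elements 0 and 1 lands at index 0 of the difference array, so the actual position in the mask is one further along.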

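import numpy as np

def rle_encode(mask):
    """ Ref. https://www.kaggle.com/paulorzp/run-length-encode-and-decode
    """
    # Column-major flatten; flatten() returns a copy, so mask itself is untouched.
    pixels = mask.flatten('F')
    # Force both ends to background so every run has a start and an end.
    pixels[0] = 0
    pixels[-1] = 0
    # Positions where the value changes, shifted to 1-based pixel numbers.
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 2
    # Turn each (start, end) pair into (start, length).
    runs[1::2] = runs[1::2] - runs[:-1:2]
    return ' '.join(str(x) for x in runs)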

rle_encode(np.array([0,1,1,0]))

>>> '2 2'

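Going the other way helps confirm the 1-based numbering. This decoder is my own sketch rather than the one from the referenced notebook, and the rle_decode name and shape argument are assumptions.

```python
import numpy as np

def rle_decode(rle, shape):
    # rle is a string of "start length" pairs with 1-based starts.
    tokens = rle.split()
    starts = np.asarray(tokens[0::2], dtype=int) - 1   # back to 0-based indices
    lengths = np.asarray(tokens[1::2], dtype=int)
    flat = np.zeros(int(np.prod(shape)), dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1
    # Column-major reshape, matching the 'F' flatten used when encoding.
    return flat.reshape(shape, order='F')

decoded_1d = rle_decode('2 2', (4,))
decoded_2d = rle_decode('2 2', (2, 2))
print(decoded_1d)
print(decoded_2d)
```

Decoding '2 2' with a length of 4 gives back the 0, 1, 1, 0 mask from the example above.
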
Every day there is more that I don't understand…

φ(..) memo memo
