[TensorFlow] RNN

May 24, 2021

TensorFlow provides three RNN layers.

SimpleRNN
LSTM (Long Short Term Memory layer)
GRU (Gated Recurrent Unit)

Let's start with SimpleRNN. The first step in understanding an RNN layer is understanding the shape of the data it accepts. Let's create a SimpleRNN layer with units=1.

import numpy as np
import tensorflow as tf

rnn = tf.keras.layers.SimpleRNN(units=1)

Now consider a NumPy array to use as the input of the SimpleRNN layer. Its shape is (4,); that is, it is a one-dimensional array.

x = np.array([1., 2., 3., 4.])
x.shape
rnn(x)

Running this raises the following error.

ValueError: Input 0 of layer simple_rnn is incompatible with the layer: expected ndim=3, found ndim=1. Full shape received: (4,)

The layer expects a three-dimensional array, but it received a one-dimensional array of shape (4,), hence the error. The input x looks like an ordered sequence to us, but TensorFlow does not recognize it as one. A Keras RNN layer expects input of shape (batch size, timesteps, features).

NumPy's reshape can turn the one-dimensional array into a three-dimensional one. Let's reshape it to (1, 4, 1), that is, one sample with four timesteps of one feature each, and apply the rnn layer again.

new_x = x.reshape([1, 4, 1])
rnn(new_x)
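To make the three axes explicit, here is a small NumPy-only sketch. The names batch, timesteps and features are labels chosen for this illustration, and np.newaxis is shown as an equivalent way to add the two missing axes.

```python
import numpy as np

x = np.array([1., 2., 3., 4.])

# Target layout: (batch, timesteps, features)
batch, timesteps, features = 1, 4, 1
new_x = x.reshape(batch, timesteps, features)

# Equivalent: add a leading batch axis and a trailing feature axis.
same_x = x[np.newaxis, :, np.newaxis]

print(new_x.shape)                    # (1, 4, 1)
print(np.array_equal(new_x, same_x))  # True
```

Either form leaves the four values in the same order along the timestep axis.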

No error occurs, and a value such as <tf.Tensor: shape=(1, 1), dtype=float32, numpy=array([[-0.9996187]], dtype=float32)> is returned. Success. (The exact number varies from run to run because the layer's weights are initialized randomly.)

How was this value calculated?

Think of it in terms of music. When we listen to a piece of music, we feel its mood. Asked to describe that mood in words, we try to say what it feels like, but we cannot really explain it.

You can think of an RNN as expressing the mood of the piece as a number. The mood after the first note, and the mood after each additional note, keeps changing, but the RNN keeps expressing it as a number. This number is the hidden state.

Before the music starts, set the mood, the hidden state, to h_0 = 0. When the first note x_1 is heard, compute h_1 with the following equation.
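Assuming the layer's default settings, a tanh activation and a bias term b (both are assumptions here, since the text names only W_x and W_h), this first update can be written as:

```latex
h_1 = \tanh\left(W_x x_1 + W_h h_0 + b\right)
```

Since h_0 = 0, the W_h term vanishes at this first step.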

We don't know yet who supplied W_x and W_h, but for now they are simply given numbers, so h_1 can of course be computed. Now, when the second note x_2 is heard, compute h_2 from the h_1 just obtained.
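As a concrete sketch of these two steps, the snippet below uses made-up scalar values for W_x, W_h and a bias b (hypothetical numbers chosen for illustration, not the weights of any trained layer) and assumes a tanh activation.

```python
import numpy as np

# Hypothetical scalar weights, chosen only for illustration.
W_x, W_h, b = 0.5, -0.3, 0.0

x_1, x_2 = 1.0, 2.0   # the first two notes
h_0 = 0.0             # mood before the music starts

# Assumed update rule: h_t = tanh(W_x * x_t + W_h * h_{t-1} + b)
h_1 = np.tanh(W_x * x_1 + W_h * h_0 + b)
h_2 = np.tanh(W_x * x_2 + W_h * h_1 + b)  # reuses h_1 from the step before

print(h_1, h_2)
```

Because tanh squashes its argument, both values stay strictly between -1 and 1.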

The important point here is that W_x and W_h stay the same at every step.

Continuing this way until the last note x_n has been heard gives h_n. h_n is the mood of the piece.
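The whole procedure can be sketched as a short NumPy loop. The function name simple_rnn_last_state and the weight values are hypothetical, and the tanh activation and bias b are assumed rather than taken from the text; this is an illustration of the recurrence, not TensorFlow's implementation.

```python
import numpy as np

def simple_rnn_last_state(x, W_x, W_h, b):
    """Return h_n for input x of shape (batch, timesteps, features)."""
    batch, timesteps, _ = x.shape
    units = W_h.shape[0]
    h = np.zeros((batch, units))          # h_0 = 0
    for t in range(timesteps):
        # The same W_x, W_h and b are reused at every timestep.
        h = np.tanh(x[:, t, :] @ W_x + h @ W_h + b)
    return h                              # h_n, shape (batch, units)

# Hypothetical weights for units=1 and one input feature.
W_x = np.array([[0.5]])    # shape (features, units)
W_h = np.array([[-0.3]])   # shape (units, units)
b = np.zeros(1)            # shape (units,)

x = np.array([1., 2., 3., 4.]).reshape(1, 4, 1)
h_n = simple_rnn_last_state(x, W_x, W_h, b)
print(h_n.shape)  # (1, 1)
```

The result has shape (1, 1), matching the shape of the tensor returned by the layer earlier. With TensorFlow installed, rnn.get_weights() on a built SimpleRNN layer returns its kernel, recurrent kernel and bias in that order; passing those arrays in place of the made-up values should reproduce the layer's output up to floating-point error.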

Drawn as a diagram, the above looks like this.
