Sentiment classification with LSTM in Python with Keras:

Sequence classification is a predictive modeling problem where you have some sequence of inputs over space or time and the task is to predict a category for the sequence.

Problem Description: The problem that we will use to demonstrate sequence learning in this tutorial is the IMDB movie review sentiment classification problem. Each movie review is a variable-length sequence of words, and the sentiment of each review must be classified as positive or negative.

Keras provides built-in access to the IMDB dataset. The imdb.load_data() function loads the dataset in a format that is ready for use in neural network and deep learning models. That's really cool!

Keras is really simple to use. Just look at the code!

Importing the classes and functions required for this model:

from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Embedding
from keras.preprocessing import sequence

Now we need to load the IMDB dataset.
Here we constrain the dataset to the 5,000 most frequent words, just to make it faster:
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
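
To make the effect of num_words concrete, here is a minimal pure-Python sketch of the capping step. It assumes the default out-of-vocabulary marker of 2 used by imdb.load_data(); the review is made up, and cap_vocabulary is a hypothetical helper, not part of Keras.

```python
# Hypothetical helper illustrating what num_words does to an encoded review.
# Assumption: indices outside the vocabulary are replaced by oov_char (default 2).
def cap_vocabulary(encoded_review, num_words, oov_char=2):
    """Replace every word index >= num_words with the out-of-vocabulary marker."""
    return [idx if idx < num_words else oov_char for idx in encoded_review]

# A made-up encoded review: small indices are frequent words, large ones are rare.
toy_review = [1, 14, 22, 7301, 43, 9999, 530]
capped = cap_vocabulary(toy_review, num_words=5000)
print(capped)  # [1, 14, 22, 2, 43, 2, 530]
```

The rare words are not dropped, only merged into one shared marker, so the review keeps its original length.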

Next, we need to truncate and pad the input sequences so that they are all the same length for modeling:
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
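
As a rough illustration of what this step does to a single review, the sketch below mimics the default behaviour of pad_sequences in plain Python: short sequences get zeros added at the front, and long sequences keep only their last maxlen items. The function name pad_or_truncate and the toy reviews are made up for this example.

```python
# Hypothetical re-implementation of the default pad_sequences behaviour
# (padding at the front, truncation at the front), for illustration only.
def pad_or_truncate(seq, maxlen, value=0):
    """Return a list of exactly maxlen items (maxlen must be at least 1)."""
    trimmed = seq[-maxlen:]  # keep only the last maxlen items
    return [value] * (maxlen - len(trimmed)) + trimmed

short_review = [11, 12, 13]
long_review = [1, 2, 3, 4, 5, 6, 7]

print(pad_or_truncate(short_review, maxlen=5))  # [0, 0, 11, 12, 13]
print(pad_or_truncate(long_review, maxlen=5))   # [3, 4, 5, 6, 7]
```

After this step every review has the same length, so the whole dataset can be stacked into one 2-D array.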

We can now define, compile and fit our LSTM model. That's it! Hurray! Isn't it really simple?

The first layer is the Embedding layer, which uses vectors of length 32 to represent each word.
The next layer is the LSTM layer with 100 memory units.
Finally, because this is a classification problem we use a Dense output layer with a single neuron and a sigmoid activation function to make 0 or 1 predictions for the two classes.
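
To get a feel for the size of these three layers, the following sketch counts their trainable parameters by hand. It assumes the standard LSTM formulation with bias terms (four weight sets: the input, forget and output gates plus the cell candidate); the variable names are chosen just for this example.

```python
# Sizes taken from the tutorial: 5,000-word vocabulary, 32-dim embeddings, 100 LSTM units.
vocab_size = 5000
embedding_dim = 32
lstm_units = 100

# Embedding: one 32-dim vector per word index.
embedding_params = vocab_size * embedding_dim

# LSTM: four weight sets, each with input weights, recurrent weights and a bias.
lstm_params = 4 * ((embedding_dim + lstm_units) * lstm_units + lstm_units)

# Dense: one weight per LSTM unit plus one bias for the single output neuron.
dense_params = lstm_units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 160000 53200 101 213301
```

Most of the parameters sit in the embedding table, which is one reason why limiting the vocabulary keeps the model small.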

embedding_vector_length = 32
model = Sequential()

model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)

Now it is time to evaluate the model:
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy is", scores[1] * 100)
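
The accuracy reported here is simply the share of reviews whose predicted class matches the true label. A minimal sketch of that calculation, using made-up sigmoid outputs and assuming that a probability above 0.5 counts as the positive class:

```python
# Made-up sigmoid outputs and true labels, for illustration only.
probabilities = [0.91, 0.20, 0.55, 0.48, 0.03]
true_labels = [1, 0, 0, 0, 0]

# Assumption: a probability above 0.5 is read as class 1, anything else as class 0.
predicted = [1 if p > 0.5 else 0 for p in probabilities]

# Count the matching predictions and express them as a percentage.
correct = sum(1 for p, y in zip(predicted, true_labels) if p == y)
accuracy_percent = 100 * correct / len(true_labels)

print(predicted)         # [1, 0, 1, 0, 0]
print(accuracy_percent)  # 80.0
```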

Let's discuss if you have any issues.

Code reference: machinelearningmastery. Happy coding, and have fun with deep learning 🙂
