What does normalizing a text do?

We have previously called the .lower() method to turn all of the words lowercase, so that strings like “the” and “The” both become “the” and we don’t double count them.
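As a minimal sketch of why lowercasing matters, the snippet below counts words with and without normalization. The sample sentence and variable names are made up for illustration and are not from the original article.

```python
from collections import Counter

# Hypothetical sample text, chosen only to show the effect.
text = "The cat chased the dog and The dog chased the cat"

# Without normalization, "The" and "the" are counted as different words.
raw_counts = Counter(text.split())

# Lowercasing first merges the two spellings into a single entry.
normalized_counts = Counter(word.lower() for word in text.split())

print(raw_counts["the"], raw_counts["The"])
print(normalized_counts["the"])
```

After normalization the count for “the” is the sum of the two separate counts, which is exactly the double counting we wanted to avoid.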

What if we wanna do even more?

Stemming

For example, we can strip the affixes from words in a process called stemming. …
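To make the idea concrete, here is a deliberately naive suffix stripper. It is only a sketch: the suffix list and the `naive_stem` helper are assumptions made for illustration, and real stemmers (such as NLTK's PorterStemmer) apply far more careful rules.

```python
# Hypothetical suffix list, checked in order; real stemmers use much richer rules.
SUFFIXES = ("ing", "ly", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["walking", "walked", "walks", "quickly", "cat"]
stems = [naive_stem(w.lower()) for w in words]
print(stems)
```

The three forms of “walk” collapse to the same stem, so they would be counted as one word after this step.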

NLTK ships with preprocessed texts, but we can also import and process our own.

Importing

# Python 3 already uses true division, so no __future__ import is needed
import nltk, re, pprint

To Import a Book as a Txt

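The urlopen function is part of Python 3's standard library (urllib.request), so there is nothing to install with pip. Import it and fetch the text: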

import urllib.request

url = "https://www.gutenberg.org/files/11/11-0.txt"
# read() returns bytes in Python 3, so decode it to get a str
raw = urllib.request.urlopen(url).read().decode("utf-8")
type(raw)
# <class 'str'>
len(raw)
# 1176831
raw[:75]
# 'The Project Gutenberg EBook of Crime…

Work in Natural Language Processing typically uses large bodies of linguistic data. In this article, we explore some lexical resources that help us ingest and analyze corpora. These resources are part of Python or the NLTK library.

Getting NLTK Corpora

We can access pre-imported corpora in NLTK in one of two ways:

emma…

What's a Bidirectional RNN?

A bidirectional RNN is an RNN variant that can sometimes improve performance. It is especially useful for natural language processing tasks.

The BD-RNN uses two regular RNNs: one processes the sequence in its original order, the other processes it in reverse, and their representations are then merged.
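The forward/backward idea can be sketched with NumPy. This is an illustration under stated assumptions, not a trained model: the weights are random, the `run_rnn` helper is hypothetical, and concatenation is used as the merge step (one common choice among several).

```python
import numpy as np

def run_rnn(inputs, w_in, w_rec, bias):
    """Run a simple tanh RNN over inputs of shape (timesteps, features); return the final state."""
    state = np.zeros(w_rec.shape[0])
    for x_t in inputs:
        state = np.tanh(w_in @ x_t + w_rec @ state + bias)
    return state

rng = np.random.default_rng(0)
timesteps, features, hidden = 5, 3, 4
sequence = rng.normal(size=(timesteps, features))

# Each direction has its own independent (here random, untrained) weights.
fwd = (rng.normal(size=(hidden, features)), rng.normal(size=(hidden, hidden)), np.zeros(hidden))
bwd = (rng.normal(size=(hidden, features)), rng.normal(size=(hidden, hidden)), np.zeros(hidden))

forward_state = run_rnn(sequence, *fwd)         # reads the sequence first to last
backward_state = run_rnn(sequence[::-1], *bwd)  # reads the sequence last to first

# Merge the two representations by concatenation.
merged = np.concatenate([forward_state, backward_state])
print(merged.shape)
```

The merged vector is twice the size of a single RNN state, because it carries one summary from each reading direction.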

This method doesn’t…

Remember these two useful properties of Convolutional Models.

Translation Invariance

A convolutional model that learns a certain pattern in, say, the lower-right corner of an image can then detect it anywhere in the image.
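A small NumPy sketch can illustrate this property. It assumes a single hand-written 2x2 filter instead of a learned one, and the helper names (`cross_correlate`, `image_with_pattern`) are hypothetical. The same filter responds most strongly wherever the pattern sits.

```python
import numpy as np

def cross_correlate(image, kernel):
    """Slide the kernel over the image ('valid' positions only) and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A diagonal 2x2 pattern, used both as the image content and as the filter.
pattern = np.array([[1.0, 0.0], [0.0, 1.0]])

def image_with_pattern(row, col, size=6):
    """Return an all-zero image with the pattern pasted at (row, col)."""
    img = np.zeros((size, size))
    img[row:row + 2, col:col + 2] = pattern
    return img

lower_right = cross_correlate(image_with_pattern(4, 4), pattern)
upper_left = cross_correlate(image_with_pattern(0, 0), pattern)

# The strongest response follows the pattern's location.
print(np.unravel_index(lower_right.argmax(), lower_right.shape))
print(np.unravel_index(upper_left.argmax(), upper_left.shape))
```

Because the filter weights are shared across all positions, nothing has to be relearned when the pattern moves.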

Spatial Hierarchy

A convolutional model can learn patterns in a hierarchical fashion, much like we do. The…

What does a CNN model look like in code?

from keras import layers
from keras import models

seq_model = models.Sequential()
# First conv block: 32 filters of size 3x3 on 28x28 single-channel inputs
seq_model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
seq_model.add(layers.MaxPooling2D((2, 2)))
# Second conv block: 64 filters, followed by 2x2 max pooling
seq_model.add(layers.Conv2D(64, (3, 3), activation='relu'))
seq_model.add(layers.MaxPooling2D((2, 2)))
# Final conv layer: 128 filters
seq_model.add(layers.Conv2D(128, (3, 3), activation='relu'))
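To see how the feature maps shrink through this stack, the arithmetic can be sketched in plain Python. It assumes the Keras defaults used above ('valid' padding and stride 1 for the convolutions, pooling stride equal to the pool size); the helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output side length of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output side length of max pooling with stride equal to the pool size."""
    return size // pool

size = 28
shapes = []
# Layer types and channel counts mirror the model above.
for layer, channels in [("conv", 32), ("pool", 32), ("conv", 64), ("pool", 64), ("conv", 128)]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    shapes.append((size, size, channels))

for shape in shapes:
    print(shape)
```

The spatial size drops from 28 to 3 while the channel count grows from 32 to 128, which is the usual trade of resolution for feature depth.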

It all starts with a model object:

from keras import models
seq_model = models.Sequential()

Models can be sequential or non-sequential. Sequential covers plain stacks of layers, while Model is used for everything else.

from keras.models import Sequential, Model

Jake Batsuuri

I write about software && math. Occasionally I design && code. Find my stuff batsuuri.ca
