The Linguistic Teaching resources Hub
Image © Paul Watson, Licence CC BY-NC-SA 2.0

Normalizing Textual Data with Python

* William J. Turkel
* Adam Crymble

Keywords: normalisation, python, text processing, regular expressions

http://programminghistorian.org/lessons/normalizing-data

This short tutorial explains in detail some basic normalisation tasks: converting all words to lowercase and removing punctuation.
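As a rough illustration of the two tasks named above, the following minimal sketch lowercases a string and strips punctuation by splitting on non-word characters. It is not taken from the tutorial itself; the function name `normalise` and the sample sentence are hypothetical, and the approach assumes that a simple regular-expression split is an acceptable way to drop punctuation.

```python
import re


def normalise(text):
    """Return a list of lowercase words with punctuation removed.

    Hypothetical helper for illustration only.
    """
    # Task 1: convert every character to lowercase.
    lowered = text.lower()
    # Task 2: split on runs of non-word characters (anything that is
    # not a letter, digit, or underscore), which discards punctuation.
    words = re.split(r"\W+", lowered)
    # re.split leaves empty strings when the text starts or ends with
    # punctuation, so filter them out.
    return [word for word in words if word]


print(normalise("Hello, World! Hello again."))
# prints ['hello', 'world', 'hello', 'again']
```

Note that this split also breaks words at apostrophes and hyphens (for example, "don't" becomes "don" and "t"), which may or may not be what a given word list needs.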


Resource details

Institution: The Programming Historian
Year of publication: 2012
Language: English
Type: Tutorial
Audience: Historians, Linguists
Level: basic
Prerequisites:

Being able to create a word list

Media: text/html
Objective:

Cleaning a word list with Python

Licence: CC-BY
Access: open
Creation date: Monday, 20 April 2015 15:00:24
Last modified: Sunday, 21 April 2024 22:25:55
BibTeX type: @misc
BibTeX entry:
@misc(TeLeMaCo:335,
author = "Turkel, William J. and Crymble, Adam",
title = "{N}ormalizing {T}extual {D}ata with {P}ython",
year = "2012",
url = "http://programminghistorian.org/lessons/normalizing-data"
)
  
