
Plays and Poems: Data Preview (cat, cut, head and csvlook)

We'll cover the following...

Learning objectives
Data download
Data Preview

In this project, we use a text corpus of plays and poems from the Shakespearean era (16th and 17th centuries) to find the words most frequently used by some of the known authors of that time (e.g., Shakespeare).

You may be surprised to learn that, so far, there is no comprehensive collection of electronic texts of these works in the public domain. Instead, a portion of the plays and poems is held in a machine-readable archive at the Centre for Literary and Linguistic Computing at The University of Newcastle.

This has been assembled over many years by editing versions ...

