How to Prepare News Articles for Text Summarization

Last Updated on August 7, 2019

Text summarization is the task of creating a short, accurate, and fluent summary of an article.

A popular and free dataset for use in text summarization experiments with deep learning methods is the CNN News story dataset.

In this tutorial, you will discover how to prepare the CNN News Dataset for text summarization.

After completing this tutorial, you will know:

  • About the CNN News dataset and how to download the story data to your workstation.
  • How to load the dataset and split each article into story text and highlights.
  • How to clean the dataset so it is ready for modeling and save the cleaned data to a file for later use.
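
As a rough preview of the split and clean steps listed above, here is a minimal sketch rather than the tutorial's own code. It assumes each story file holds the article text followed by summary points, each introduced by an "@highlight" marker line. The function names split_story and clean_lines, the sample text, and the specific cleaning rules (lowercasing, stripping punctuation, keeping only alphabetic tokens) are illustrative assumptions.

```python
import string


def split_story(doc, marker="@highlight"):
    """Split one raw article into (story_text, list_of_highlights).

    Assumption: highlights follow the story, each preceded by `marker`.
    """
    index = doc.find(marker)
    if index == -1:
        # No marker found: treat the whole document as story text.
        return doc.strip(), []
    story = doc[:index].strip()
    highlights = [h.strip() for h in doc[index:].split(marker) if h.strip()]
    return story, highlights


def clean_lines(lines):
    """Lowercase, strip punctuation, and drop non-alphabetic tokens."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = []
    for line in lines:
        tokens = line.lower().translate(table).split()
        tokens = [t for t in tokens if t.isalpha()]
        if tokens:  # skip lines that end up empty
            cleaned.append(" ".join(tokens))
    return cleaned


# Hypothetical sample in the assumed format (not a real article).
sample = (
    "(CNN) Example story text.\n\n"
    "Second sentence here.\n\n"
    "@highlight\n\nFirst point\n\n"
    "@highlight\n\nSecond point"
)
story, highlights = split_story(sample)
clean_story = clean_lines(story.split("\n"))
clean_highlights = clean_lines(highlights)
```

The cleaned lists could then be written to disk, for example with Python's pickle module, so that later modeling steps do not need to repeat the preparation.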

Kick-start your project with my new book Deep Learning for Natural Language Processing, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Tutorial Overview

This tutorial is divided into 5 parts; they are:

  1. CNN News