Machine Learning Abstract Text Summarization
Project Group 5: William Edmisten, Li Huang, Jordan Palmer, Carter Sullivan


Background

Text summarization is the task of shortening a long piece of text into a coherent, fluent summary that contains only the document's main points. One notable example is the Reddit AutoTLDR Bot, which summarizes news articles. There are two fundamental approaches to text summarization: extractive and abstractive. The former extracts words and phrases from the original text to build a summary. The latter paraphrases the original text, generating new phrases and sentences that convey the most important information from the source. Abstractive text summarization in particular relies on deep learning techniques to produce human-like sentences.
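
To make the distinction concrete, the toy extractive baseline below copies the highest-scoring sentence verbatim, scoring sentences by the document-wide frequency of their words; an abstractive system would instead generate new wording. (This sketch is our own illustration and is not part of the paper.)

from collections import Counter

def extractive_summary(text, n_sentences=1):
    """Toy extractive summarizer: copy the sentence(s) whose words are
    most frequent in the document, without rephrasing anything."""
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    freqs = Counter(w.lower() for s in sentences for w in s.split())
    scored = sorted(sentences,
                    key=lambda s: sum(freqs[w.lower()] for w in s.split()),
                    reverse=True)
    return '. '.join(scored[:n_sentences]) + '.'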



Original Paper Summary

We decided to base our project on the paper "Get To The Point: Summarization with Pointer-Generator Networks" by Abigail See, Peter J. Liu, and Christopher D. Manning, which is listed in the References section below.


In their paper, the authors describe new methods for combating the two biggest problems that arise when neural sequence-to-sequence models are used to summarize lengthy text. The first is the tendency to reproduce factual details incorrectly. To reduce this risk, they implemented a "hybrid pointer-generator network" that can copy important words and phrases from the source text directly into the summary, increasing the accuracy of the output. The second is that such models tend to repeat themselves, adding superfluous information and lengthening the summaries unnecessarily. To overcome this, they added a coverage mechanism that keeps track of what has already been covered by the summary, reducing repetition.
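
In equation form, the paper's final distribution is P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attention on source copies of w), and the coverage loss at each decoder step is sum_i min(a_i^t, c_i^t). The NumPy sketch below is our own illustration of these two ideas (variable names are ours, not the paper's code):

import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, extended_size):
    """Pointer-generator mixture over the extended vocabulary:
    generate from the vocabulary with probability p_gen, otherwise
    copy a source token in proportion to its attention weight."""
    final = np.zeros(extended_size)
    final[:len(vocab_dist)] = p_gen * vocab_dist
    for a, idx in zip(attn_dist, src_ids):  # scatter copy probability
        final[idx] += (1.0 - p_gen) * a
    return final

def coverage_loss(attn_dists):
    """Penalize re-attending: sum_t sum_i min(a_i^t, c_i^t), where
    c^t accumulates the attention of all previous decoder steps."""
    coverage = np.zeros_like(attn_dists[0])
    loss = 0.0
    for a in attn_dists:
        loss += np.minimum(a, coverage).sum()
        coverage = coverage + a
    return loss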



Procedure

The Model:
To train and evaluate the model, we used the following official implementation provided by the authors: Official Implementation
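
Training in this repository is driven by run_summarization.py. Based on its README, the invocation looks roughly like the following (all paths are placeholders):

python run_summarization.py --mode=train --data_path=/path/to/chunked/train_* --vocab_path=/path/to/vocab --log_root=/path/to/log --exp_name=our_experiment

Coverage is enabled for the final phase of training with --coverage=1, after converting the checkpoint with --convert_to_coverage_model=1 (again per the README).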

To run the code with sufficient computing power, we used VT ARC, specifically the Cascades computing system. Unfortunately, the official codebase only supports Python 2 and TensorFlow 1.2, so we had to switch to the following implementation: Modified Implementation
This codebase ports the official implementation to Python 3 and TensorFlow 1.12. Unfortunately, it disables ROUGE evaluation due to outdated pyrouge dependencies, so we were unable to evaluate our results with those scores.

The System:
To train the model, we requested a V100 GPU node on Cascades with 8 cores and 18 hours of runtime. The system comes with the following modules:
- Anaconda/5.1.0
- cuda/9.1.85
- cudnn/7.1
TensorFlow 1.13 was installed manually via pip.
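
As a quick sanity check that the pip-installed TensorFlow can actually see the V100 (this snippet is our own, using the TF 1.x API):

import tensorflow as tf

print(tf.__version__)              # expect a 1.13.x build
print(tf.test.is_gpu_available())  # True once the cuda/cudnn modules are loaded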

The Dataset:
For the dataset, we used the CNN and Daily Mail data, processed for the model with the following codebase: Dataset: CNN and Daily Mail Data
Due to limited time and computational power, we used only a fraction of the dataset for training: 27 chunks instead of the 287 chunks mentioned in the original paper, and 100 examples per chunk instead of 1,000, about 1% of the full training data overall.
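
The preprocessing codebase writes the training set as chunked binary files (train_000.bin, train_001.bin, ...) of length-prefixed serialized tf.Example records. The sketch below shows one way to build our reduced training set, keeping the first 100 examples from each of the first 27 chunks; it assumes the 8-byte length prefix written by the preprocessing scripts, and the directory names are illustrative:

import glob
import os
import struct

def take_first(in_path, out_path, n=100):
    """Copy the first n records from a chunk; each record is an 8-byte
    length header followed by a serialized tf.Example of that length."""
    with open(in_path, 'rb') as src, open(out_path, 'wb') as dst:
        for _ in range(n):
            header = src.read(8)
            if len(header) < 8:
                break  # chunk holds fewer than n records
            (length,) = struct.unpack('q', header)
            dst.write(header)
            dst.write(src.read(length))

os.makedirs('chunked_small', exist_ok=True)
for path in sorted(glob.glob('chunked/train_*.bin'))[:27]:
    take_first(path, os.path.join('chunked_small', os.path.basename(path)))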



Evaluation

To evaluate and confirm the original findings, we intended to compute ROUGE F1 and METEOR scores for our summaries, to assess whether our reproduction matches the original results. With the pointer-generator + coverage model, the original paper reports the following scores:
ROUGE-1: 39.53, ROUGE-2: 17.28, ROUGE-L: 36.38
METEOR (exact match): 17.32, METEOR (+ stem/syn/para): 18.72

Our model was trained on this reduced dataset for 18 hours, finishing with the following training losses:
- tensorflow:loss: 2.868588
- tensorflow:coverage_loss: 0.206599

Unfortunately, as discussed earlier, ROUGE evaluation is disabled in the codebase we used, so we simply ran the model on the evaluation dataset to generate a set of summaries. To avoid vocabulary-not-found errors, the model replaces any unknown token with '.'.
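
Although pyrouge is unavailable, a rough unigram-overlap score can still be computed over the decoded summaries. The sketch below is our own minimal ROUGE-1 F1 calculation, not the official scorer, so its numbers would not be directly comparable to the paper's:

from collections import Counter

def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)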

Decoded Summaries


References

Get To The Point: Summarization with Pointer-Generator Networks. Abigail See, Peter J. Liu, and Christopher D. Manning. ACL 2017.
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting. Yen-Chun Chen and Mohit Bansal. ACL 2018.
Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond. Ramesh Nallapati, Bowen Zhou, Cicero dos Santos, Caglar Gulcehre, and Bing Xiang. CoNLL 2016.