Master’s Thesis Defense: Mr. Mark Carthon III

August 3, 2020 @ 2:00 pm - 3:00 pm

Due to the COVID-19 pandemic, the University has been instructed to cancel all in-person events through the Summer semester, in keeping with city and state orders limiting public gatherings. Events that are still running must now take place online; each listed event will include a link to its online space.

To view Mr. Carthon’s defense, enter his online chatroom via Collaborative Ultra. The room will open one hour before the event, which begins at 2:00 pm on Monday, Aug 3.

Dictionary-Based Data Generation for Fine-Tuning BERT for Adverbial Paraphrasing Tasks

Mr. Mark Carthon III
University of Wisconsin-Milwaukee
MS Graduate Student – Teaching Assistant

Recent advances in natural language processing have led to the emergence of large, deep pre-trained neural networks. These networks are used primarily for transfer learning: retraining, or fine-tuning, a pre-trained network to achieve state-of-the-art performance on a variety of challenging natural language processing and understanding (NLP/NLU) tasks. In this thesis, we focus on identifying paraphrases at the sentence level using Bidirectional Encoder Representations from Transformers (BERT). It is well understood that in deep learning the volume and quality of the training data are determining factors of performance. The objective of this thesis is to develop a methodology for the algorithmic generation of high-quality training data for the paraphrasing task, an important NLU task, and to evaluate the resulting data by fine-tuning BERT to identify paraphrases. We focus on elementary adverbial paraphrases, but the methodology extends to the general case. Training data for adverbial paraphrasing was generated using an Oxford synonym dictionary, and we used the generated data to fine-tune BERT for the paraphrasing task with strong results, achieving a validation accuracy of 96.875%.
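For readers unfamiliar with dictionary-based data generation, the following is a minimal Python sketch of one way such sentence pairs could be produced. It is not taken from the thesis: the toy dictionary `ADVERB_SYNONYMS`, the function `generate_pairs`, the whitespace tokenization, and the way negative examples are built are all illustrative assumptions.

```python
import random

# Hypothetical toy synonym dictionary. The thesis used an Oxford synonym
# dictionary; its contents and format are not reproduced here.
ADVERB_SYNONYMS = {
    "quickly": ["rapidly", "swiftly"],
    "quietly": ["silently", "softly"],
    "happily": ["cheerfully", "gladly"],
}


def generate_pairs(sentence, synonyms, rng):
    """Return (sentence_a, sentence_b, label) triples for one sentence.

    Label 1 marks a paraphrase (adverb replaced by a listed synonym).
    Label 0 marks a non-paraphrase (adverb replaced by an unrelated adverb).
    Both labeling rules are assumptions made for this sketch.
    """
    pairs = []
    tokens = sentence.split()  # naive tokenization; punctuation is not handled
    for i, token in enumerate(tokens):
        if token not in synonyms:
            continue
        # Positive examples: one pair per listed synonym of the adverb.
        for syn in synonyms[token]:
            swapped = tokens[:i] + [syn] + tokens[i + 1:]
            pairs.append((sentence, " ".join(swapped), 1))
        # Negative example: a headword from a different dictionary entry.
        candidates = sorted(
            k for k in synonyms if k != token and k not in synonyms[token]
        )
        if candidates:
            swapped = tokens[:i] + [rng.choice(candidates)] + tokens[i + 1:]
            pairs.append((sentence, " ".join(swapped), 0))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled negatives are repeatable
    for a, b, label in generate_pairs("She quickly closed the door", ADVERB_SYNONYMS, rng):
        print(label, "|", a, "|", b)
```

Labeled pairs of this form are the usual input for fine-tuning a sentence-pair classifier such as BERT; the fine-tuning step itself is omitted here.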


Details

Date:
August 3, 2020
Time:
2:00 pm - 3:00 pm