PAZAKOU THEODORA authored

Natural Language Processing Portfolio

Here you can find a brief overview of my latest projects, including graded coursework for my master's degree, personal projects, and my master's thesis project. The code, datasets and dependencies for each project are available in the corresponding repository.

Table of contents

  • Project 1: A riddle game using Python
  • Project 2: Sentiment analysis on Amazon book reviews using NLTK's VADER and RoBERTa
  • Project 3: A hybrid approach to simplifying medical documents in French using OpenNMT-py (in progress)

Project 1: A riddle game using Python

Data language: French

This game challenges the user to guess a randomly selected 5-letter French word, drawn from a provided lexicon, within 6 attempts. After each guess, correctly placed letters are shown in uppercase, while letters that occur in the word but in a different position are shown in lowercase. To make the game easier to play, accented letters are normalized internally to their unaccented forms.
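The feedback rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the function names `strip_accents` and `feedback` are hypothetical, and showing absent letters as `_` is an assumption, since the description does not say how they are displayed.

```python
import unicodedata
from collections import Counter


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented form (é -> e)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def feedback(guess: str, secret: str) -> str:
    """Return the hint string for one guess.

    Uppercase: right letter, right position.
    Lowercase: letter occurs in the word, but elsewhere.
    "_": letter absent (placeholder chosen for this sketch).
    """
    guess = strip_accents(guess).lower()
    secret = strip_accents(secret).lower()
    if len(guess) != len(secret):
        raise ValueError("guess and secret must have the same length")

    hints = [None] * len(secret)
    unmatched = Counter()  # secret letters not yet accounted for

    # First pass: mark exact matches and count the remaining secret letters.
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            hints[i] = g.upper()
        else:
            unmatched[s] += 1

    # Second pass: mark misplaced letters, consuming each secret letter once.
    for i, g in enumerate(guess):
        if hints[i] is None and unmatched[g] > 0:
            hints[i] = g
            unmatched[g] -= 1

    return "".join(h if h is not None else "_" for h in hints)
```

For example, `feedback("école", "élève")` compares `ecole` with `eleve` after accent stripping. The two passes ensure that a repeated letter in the guess is only marked as misplaced as many times as it remains unmatched in the secret word.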

Project 2: Sentiment analysis on Amazon book reviews using NLTK's VADER and RoBERTa

Data language: English

In this personal project, I used NLTK's VADER model to perform sentiment analysis on a dataset of Amazon book reviews. I then compared the results of this bag-of-words approach with those of a RoBERTa model pretrained on tweets for sentiment analysis, applied to the same dataset.
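Comparing the two models requires putting their outputs in the same label space. The sketch below shows one possible way to do that, under stated assumptions: VADER's compound score (a value between -1 and 1) is mapped to a label using the commonly cited ±0.05 cut-off, and the helper names `vader_label` and `agreement_rate` are hypothetical. The project itself may use a different mapping or metric.

```python
from typing import Sequence


def vader_label(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score to a sentiment label.

    The 0.05 threshold is a common convention, assumed here.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def agreement_rate(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Fraction of reviews on which the two models give the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same reviews")
    if not labels_a:
        raise ValueError("cannot compare empty label lists")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)
```

Given per-review labels from both models, `agreement_rate` gives a quick first look at how often they agree before comparing each one against the star ratings or any other reference.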

Project 3: A hybrid approach to simplifying medical documents in French using OpenNMT-py (in progress)

Data language: French

This repository will be updated during this semester with the code of my thesis project. The simplification will be carried out in two steps:

  • adapting the Multilingual Syntactic Simplification Tool (MUSST) to sentence simplification in French, as a means of enlarging the existing CLEAR corpus
  • treating simplification as a monolingual machine translation task, using the resulting parallel corpus and OpenNMT-py