SET11121_1: Data Wrangling


Task Title: Data Wrangling (Part A)

Subject Code: SET11121 / SET11521

Objective: This assignment has two objectives. The first is to perform a literature review of recent approaches to abusive language detection. The second is to load the given abusive language dataset (JSON files) in Python and split it into training and test sets. The Python code must also perform simple text filtering and frequency analysis.


Overview: In the literature review, the student must use three contemporary research papers (published after 2016) that deal with abusive language detection in textual content. The document should be around 1200 words.

In the programming task, the student must use the three JSON files given in the resources segment of the task. These files represent three separate classes of Tweets (neither, racism, sexist). The student must load this data in Python code and split it into training and test segments. Finally, the student must perform a word-based frequency analysis on the dataset.
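The loading and splitting steps can be sketched as follows. This is a minimal sketch, not the assignment's reference solution: it assumes each file holds one JSON object per line with a `text` field (the brief does not state the file layout), and the function names `load_tweets` and `split_train_test` are hypothetical.

```python
import json
import random
from pathlib import Path


def load_tweets(path, label):
    """Read one JSON object per line (assumed layout) and attach a class label."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            tweet = json.loads(line)
            # "text" is an assumed field name; adjust it to the real files.
            records.append({"text": tweet.get("text", ""), "label": label})
    return records


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it at the train fraction."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `load_tweets` once per class file and concatenating the results before `split_train_test` would give a single mixed dataset. Splitting each class separately is an alternative that keeps the class proportions equal in both segments.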


University: Edinburgh Napier

Tool requirement:

  • Python 3.7: The Python programming language is required to handle the JSON dataset.
  • PyCharm: The Educational version of PyCharm is used to manage the resources of the Python program.

Implementation Details:

  • A Python class is designed to represent the dataset and its processing functions.
  • The class must have a constructor that initializes the paths of the JSON files and the train-test split percentage.
  • A text filtering function is also implemented in the class.
  • NLTK libraries are used extensively for this task.
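The class design described above could look roughly like the sketch below. The class name `TweetDataset`, its parameters, and the file names in the usage note are hypothetical. To stay self-contained, the sketch swaps NLTK's tokenizer and stopword corpus for a whitespace split and a small hand-written stopword set; it illustrates the structure only and is not the original implementation.

```python
import re
import string

# Small illustrative stopword set; the task itself relies on NLTK's English list.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in", "rt"}


class TweetDataset:
    """Hypothetical dataset class holding file paths and the split percentage."""

    def __init__(self, paths, train_percent=80):
        if not 0 < train_percent < 100:
            raise ValueError("train_percent must be between 0 and 100")
        self.paths = dict(paths)  # maps class label -> JSON file path
        self.train_percent = train_percent

    def filter_text(self, text):
        """Lowercase, drop URLs, mentions, punctuation and stopwords."""
        text = text.lower()
        text = re.sub(r"https?://\S+|@\w+", " ", text)
        text = text.translate(str.maketrans("", "", string.punctuation))
        return [word for word in text.split() if word not in STOPWORDS]
```

A dataset object might then be created as `TweetDataset({"neither": "neither.json", "racism": "racism.json", "sexist": "sexist.json"}, train_percent=80)`, where the three file names are placeholders for the files supplied with the task.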



Sample Output  

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\
Five Most common:[('sexist', 761), ('kat', 717), ('like', 713), ('women', 693), ('islam', 536)]
Least common:('maiming', 1)


Process finished with exit code 0
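Frequency figures of the kind shown in the sample output can be produced with `collections.Counter` from the standard library. The sketch below is one possible approach, not necessarily the one used to generate that output; the helper name `frequency_summary` and the token list are illustrative.

```python
from collections import Counter


def frequency_summary(tokens, top_n=5):
    """Return the top_n most common tokens and the single least common one."""
    counts = Counter(tokens)
    if not counts:
        return [], None
    most_common = counts.most_common(top_n)
    # most_common() with no argument sorts descending, so the last entry is rarest.
    least_common = counts.most_common()[-1]
    return most_common, least_common


# Illustrative tokens only; real input would be the filtered tweet words.
tokens = ["like", "women", "like", "islam", "like", "women"]
most, least = frequency_summary(tokens, top_n=2)
print("Most common:", most)
print("Least common:", least)
```

When several words share the lowest count, this picks only one of them, which matches the single `(word, count)` pair printed in the sample output.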

