SET11121_1: Data Wrangling


Task Title: Data Wrangling (Part A)

Subject Code: SET11121 / SET11521

Objective: This assignment has two objectives. The first is to conduct a literature review of recent approaches to abusive language detection. The second is to load the given abusive language dataset (provided in JSON format) in Python and split it into training and test sets. The Python code must also perform simple filtering and frequency analysis.


Overview: For the literature review, the student must use three contemporary research papers (published after 2016) that deal with abusive language detection in textual content. The document should be around 1200 words.

In the programming task, the student must use the three JSON files given in the resources segment of the task. These files represent three separate classes of Tweets (neither, racism, sexist). The student must load this data in Python and split it into training and test segments. Finally, the student must perform a word-based frequency analysis on the dataset.
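The loading and splitting steps described above could be sketched roughly as follows. This is a minimal illustration, not the assignment's reference solution: the function names `load_tweets` and `train_test_split` are hypothetical, and the code assumes each file stores one JSON object per line with the Tweet body under a `text` key, which the brief does not confirm.

```python
import json
import random
from pathlib import Path


def load_tweets(path, label):
    """Read one JSON object per line (assumed layout) and tag each record with its class."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            tweet = json.loads(line)
            # "text" is an assumed field name for the Tweet body
            records.append({"text": tweet.get("text", ""), "label": label})
    return records


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and cut it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Calling `load_tweets` once per class file and concatenating the results before `train_test_split` would give a single shuffled dataset covering all three labels.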


University: Edinburgh Napier University

Tool requirement:

  • Python 3.7: The Python programming language is required to handle the JSON dataset.
  • PyCharm: The educational version of PyCharm is used to manage the resources of the Python program.

Implementation Details:

  • First, a Python class is designed to represent the dataset and its processing functions.
  • The class must have a constructor that initializes the paths to the JSON files and the train-test split percentage.
  • A text filtering function is also implemented in the class.
  • NLTK libraries are used extensively for this task.
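The class design listed above might look something like the sketch below. The class name `TweetDataset`, its method names, and the `text` field are all hypothetical, and the files are assumed to hold one JSON object per line. To keep the example free of downloads, a regular expression and a tiny hand-written stopword set stand in for the NLTK tokenizer and stopword corpus that the actual solution uses.

```python
import json
import random
import re
from collections import Counter

# Tiny stand-in stopword set; NLTK's English stopword corpus is far larger.
STOPWORDS = {"a", "an", "and", "are", "is", "in", "of", "the", "to", "you"}


class TweetDataset:
    """Hypothetical wrapper: file paths and split percentage are set in the constructor."""

    def __init__(self, paths, test_percent=20, seed=0):
        self.paths = dict(paths)          # class label -> JSON file path
        self.test_percent = test_percent  # share of records held out for testing
        self.seed = seed
        self.train, self.test = [], []

    @staticmethod
    def filter_text(text):
        """Lowercase, drop URLs and @mentions, keep alphabetic non-stopword tokens."""
        text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        return [tok for tok in re.findall(r"[a-z]+", text) if tok not in STOPWORDS]

    def load_and_split(self):
        """Read every file, filter each Tweet, then shuffle and split the records."""
        records = []
        for label, path in self.paths.items():
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    if line.strip():
                        tokens = self.filter_text(json.loads(line)["text"])
                        records.append((tokens, label))
        random.Random(self.seed).shuffle(records)
        n_test = len(records) * self.test_percent // 100
        self.test, self.train = records[:n_test], records[n_test:]

    def word_frequencies(self):
        """Count filtered tokens over the whole dataset (train and test)."""
        counts = Counter()
        for tokens, _label in self.train + self.test:
            counts.update(tokens)
        return counts
```

With such a class, `word_frequencies().most_common(5)` would report the five most frequent words, and the last element of `most_common()` would be one of the least frequent.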



Sample Output  

[nltk_data] Downloading package stopwords to

[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...

[nltk_data]   Unzipping corpora\

[nltk_data] Downloading package punkt to

[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...

[nltk_data]   Unzipping tokenizers\

Five Most common:[('sexist', 761), ('kat', 717), ('like', 713), ('women', 693), ('islam', 536)]

Least common:('maiming', 1)


Process finished with exit code 0

