SET11121_1: Data Wrangling


Task Title: Data Wrangling (Part A)

Subject Code: SET11121 / SET11521

Objective: This assignment has two objectives. The first is to perform a literature review of recent approaches to abusive language detection. The second is to load the given abusive language dataset (JSON files) in Python and split it into training and test sets. The Python code must also perform simple filtering and frequency analysis.


Overview: In the literature review, the student must use three contemporary research papers (published after 2016) that deal with abusive language detection in textual content. The document should be around 1200 words.

In the programming task, the student must use the three JSON files given in the resources segment of the task. These three files represent three separate classes of Tweets (neither, racism, sexist). The student must load this data in the Python code and split it into test and train segments. Finally, the student must perform a word-based frequency analysis on the dataset.
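The loading and splitting steps described above can be sketched as follows. This is a minimal sketch, not the assignment's reference solution. It assumes each file holds one JSON object per line and that the tweet body sits under a "text" key; neither detail is stated in the brief. The function names, the 80/20 default, and the fixed seed are hypothetical choices made for illustration.

```python
import json
import random
from pathlib import Path


def load_tweets(path, label):
    """Read one JSON object per line and pair each tweet text with its class label."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            tweet = json.loads(line)
            # ASSUMPTION: the tweet body is stored under the "text" key.
            records.append((tweet["text"], label))
    return records


def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it at the requested fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `load_tweets` once per class file and concatenating the three lists before `train_test_split` would give a single labelled dataset. Splitting each class separately instead would keep the class proportions equal in both segments.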


University: Edinburgh Napier

Tool requirement:

  • Python 3.7: The Python programming language is required to handle the JSON dataset.
  • PyCharm: The educational edition of PyCharm is used to manage the resources of the Python program.

Implementation Details:

  • Initially, a Python class is designed to represent the dataset and its processing functions.
  • The class must have a constructor that initializes the paths to the JSON files and the train-test split percentage.
  • A text filtering function is also implemented in the class.
  • For this task, NLTK libraries are used extensively.
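A class along the lines of the list above might look like the sketch below. The class name, method name, and parameter names are hypothetical, since the brief does not give them. To keep the example runnable without downloading NLTK corpora, it swaps in a regular expression, plain whitespace splitting, and a tiny hand-written stop-word set where the described solution uses NLTK; in a real submission those parts would presumably be replaced by NLTK's tokenizer and English stop-word list.

```python
import re
import string

# ASSUMPTION: a small illustrative stop-word set standing in for NLTK's list.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in", "rt"}


class TweetDataset:
    """Hypothetical dataset wrapper; names are illustrative only."""

    def __init__(self, paths, train_fraction=0.8):
        if not 0.0 < train_fraction < 1.0:
            raise ValueError("train_fraction must be between 0 and 1")
        self.paths = dict(paths)  # maps class label -> JSON file path
        self.train_fraction = train_fraction

    @staticmethod
    def filter_text(text):
        """Lowercase, drop URLs, mentions and punctuation, then remove stop words."""
        text = text.lower()
        text = re.sub(r"https?://\S+|@\w+", " ", text)
        text = text.translate(str.maketrans("", "", string.punctuation))
        return [token for token in text.split() if token not in STOP_WORDS]
```

Keeping the filter as a static method means it can be tried on a single string without constructing the dataset or touching any files.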



Sample Output  

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Krazzy\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\

Five Most common:[('sexist', 761), ('kat', 717), ('like', 713), ('women', 693), ('islam', 536)]

Least common:('maiming', 1)


Process finished with exit code 0
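The two frequency lines in the sample output have the shape of (word, count) pairs, which is what a counter-based tally produces. The sketch below shows one way such lines could be generated with `collections.Counter` from the standard library; whether the original solution used `Counter`, NLTK's `FreqDist` (a `Counter` subclass), or something else is not stated. The function name and the toy token list are hypothetical.

```python
from collections import Counter


def frequency_summary(tokens, top_n=5):
    """Return the top_n most frequent tokens and one least frequent token."""
    counts = Counter(tokens)
    if not counts:
        return [], None
    most_common = counts.most_common(top_n)
    # most_common() with no argument sorts by descending count,
    # so the final entry is a token with the lowest count.
    least_common = counts.most_common()[-1]
    return most_common, least_common


# Toy tokens for illustration only, not the assignment data.
tokens = ["women", "like", "women", "islam", "like", "women"]
most, least = frequency_summary(tokens)
print(f"Five Most common:{most}")
print(f"Least common:{least}")
```

When several tokens share the lowest count, this picks only one of them, which matches the single pair shown in the sample output.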

