SET11121_2: Data Wrangling

Solution

Task Title: Data Wrangling (Part B)

Subject Code: SET11121 / SET11521

Objective: The main objective of this task is to design a classifier based on supervised learning. The classifier must assign each Tweet to one of three categories: racism, sexism, or neither. Training and testing must use the same dataset as Part A; the training/testing split already exists in the code submitted for Part A of the assignment.

Overview: The classifier must be built with a suitable supervised machine learning algorithm. Students train the classifier on the training dataset and then calculate its classification accuracy. The accuracy must be reported on the basis of the confusion matrix.
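As a rough illustration of how accuracy follows from a confusion matrix, the sketch below builds a three-class matrix by hand and divides the diagonal (correct predictions) by the total count. The label strings, the helper names and the six example label pairs are made up for illustration and are not taken from the assignment dataset.

```python
# Assumed label strings; the Part A dataset may name its classes differently.
LABELS = ["racism", "sexism", "neither"]


def build_confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Hypothetical true and predicted labels for six Tweets.
y_true = ["racism", "sexism", "neither", "neither", "sexism", "racism"]
y_pred = ["racism", "neither", "neither", "neither", "sexism", "sexism"]

matrix = build_confusion_matrix(y_true, y_pred, LABELS)
accuracy = accuracy_from_matrix(matrix)
print(matrix)
print(round(accuracy, 3))
```

Four of the six hypothetical predictions lie on the diagonal, so the accuracy here is 4/6.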

 

University: Edinburgh Napier University

Tool requirement:

  • Python 3.7: The Python programming language is required to handle the JSON dataset.
  • PyCharm: The educational edition of PyCharm is used to manage the resources of the Python program.
  • Sklearn: The Python sklearn (scikit-learn) library is required to build the model.
  • NLTK: The NLTK library is required for natural language processing.
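A minimal sketch of reading labelled Tweets from JSON with the Python standard library is given below. The structure of the Part A dataset is not described here, so the field names "text" and "label", the helper name and the two placeholder records are purely hypothetical.

```python
import json

# Hypothetical JSON layout: a list of records with "text" and "label" fields.
SAMPLE_JSON = """
[
  {"text": "first placeholder tweet", "label": "neither"},
  {"text": "second placeholder tweet", "label": "neither"}
]
"""


def load_labelled_tweets(raw_json):
    """Split parsed JSON records into parallel lists of texts and labels."""
    records = json.loads(raw_json)
    texts = [record["text"] for record in records]
    labels = [record["label"] for record in records]
    return texts, labels


texts, labels = load_labelled_tweets(SAMPLE_JSON)
print(texts)
print(labels)
```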

Implementation Details:

  • First, the training data must be labelled with the class name.
  • TF-IDF features must be extracted from each Tweet string using a unigram model.
  • The TF-IDF features must be used to train the classifier.
  • Next, the classifier is created using the Python sklearn library.
  • The classifier's fit method is then used to train it. (The example in this portfolio uses a RandomForest classifier.)
  • The classifier is tested using its predict method.
  • The confusion_matrix, classification_report and accuracy_score functions can be used to evaluate the accuracy of the target classifier.
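The steps listed above can be sketched roughly as follows with scikit-learn. This is not the submitted solution: the placeholder Tweets, the label strings, the variable names and the hyperparameters (n_estimators, random_state) are assumptions chosen only to keep the example small and repeatable. In the real task the texts and labels would come from the Part A training/testing split.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Assumed label strings; the Part A dataset may use different names.
LABELS = ["racism", "sexism", "neither"]

# Hypothetical placeholder data standing in for the Part A split.
train_texts = [
    "token_r1 token_r2 sample",
    "token_r1 example text",
    "token_s1 token_s2 sample",
    "token_s1 example text",
    "token_n1 token_n2 sample",
    "token_n1 example text",
]
train_labels = ["racism", "racism", "sexism", "sexism", "neither", "neither"]
test_texts = ["token_r1 unseen", "token_s1 unseen", "token_n1 unseen"]
test_labels = ["racism", "sexism", "neither"]

# Unigram TF-IDF features: fit on the training Tweets only,
# then reuse the same vocabulary for the test Tweets.
vectorizer = TfidfVectorizer(ngram_range=(1, 1))
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Create and train the classifier, then predict on the test features.
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
classifier.fit(X_train, train_labels)
predictions = classifier.predict(X_test)

# Evaluate: rows of the matrix are true classes, columns are predictions.
cm = confusion_matrix(test_labels, predictions, labels=LABELS)
report = classification_report(
    test_labels, predictions, labels=LABELS, zero_division=0
)
accuracy = accuracy_score(test_labels, predictions)

print(cm)
print(report)
print(f"Accuracy: {accuracy:.2f}")
```

Fitting the vectorizer on the training Tweets alone keeps test-set vocabulary and document frequencies out of the learned features, so the reported accuracy is not inflated by leakage.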

Sample Output  

Figure 1: Sample Output
