SET11121_2: Data Wrangling

Solution

Task Title: Data Wrangling (Part B)

Subject Code: SET11121 / SET11521

Objective: The main objective of this task is to design a classifier based on supervised learning. The classifier must assign each Tweet to one of three categories: racism, sexist, or neither. Training and testing must use the dataset from Part A; the training/testing split already exists in the code submitted for Part A of the assignment.

Overview: The classifier must be built with a suitable supervised machine learning algorithm. Students train the classifier on the training dataset and, once training is complete, calculate its classification accuracy. The accuracy must be reported on the basis of the confusion matrix.

 

University: Edinburgh Napier

Tool requirement:

  • Python 3.7: The Python programming language is required to handle the JSON dataset.
  • PyCharm: The educational version of PyCharm is used to manage the resources of the Python program.
  • Sklearn: The Python sklearn (scikit-learn) library is required to build the model.
  • NLTK: The NLTK library is required for natural language processing.

Implementation Details:

  • First, each item of training data must be labelled with the name of its class.
  • TF-IDF features must be extracted from each Tweet string using the unigram (mono-gram) model.
  • These TF-IDF features must be used to train the classifier.
  • The classifier is then created using the Python sklearn library.
  • The classifier is trained by calling the Classifier.fit method on the training features and labels. (The example used for this portfolio uses a RandomForest classifier.)
  • The classifier is tested using the classifier.predict function.
  • The confusion_matrix, classification_report and accuracy_score functions can be used to evaluate the accuracy of the trained classifier.
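
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the submitted solution: the tiny train/test lists are hypothetical placeholders standing in for the Part A split, and the RandomForest settings (n_estimators, random_state) are arbitrary choices made only to keep the run repeatable. The last lines show how the accuracy can be read off the confusion matrix as the sum of its diagonal divided by the total count.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Class names taken from the task description.
LABELS = ["racism", "sexist", "neither"]

# Hypothetical placeholder data; the real task uses the Part A train/test split.
train_texts = [
    "alpha beta gamma", "alpha beta delta",
    "epsilon zeta eta", "epsilon zeta theta",
    "iota kappa lambda", "iota kappa mu",
]
train_labels = ["racism", "racism", "sexist", "sexist", "neither", "neither"]
test_texts = ["alpha beta", "epsilon zeta", "iota kappa"]
test_labels = ["racism", "sexist", "neither"]

# Unigram (mono-gram) TF-IDF features: fit on training text only,
# then reuse the same vocabulary for the test text.
vectorizer = TfidfVectorizer(ngram_range=(1, 1))
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Create and train the classifier (hyperparameters are assumptions).
classifier = RandomForestClassifier(n_estimators=100, random_state=0)
classifier.fit(X_train, train_labels)

# Test the classifier on the held-out Tweets.
predicted = classifier.predict(X_test)

# Evaluate: rows of the confusion matrix are true classes, columns are predictions.
cm = confusion_matrix(test_labels, predicted, labels=LABELS)
accuracy = accuracy_score(test_labels, predicted)

# Accuracy derived directly from the confusion matrix:
# correctly classified items (the diagonal) over all items.
accuracy_from_cm = cm.trace() / cm.sum()

print(cm)
print(classification_report(test_labels, predicted, labels=LABELS, zero_division=0))
print(f"Accuracy: {accuracy:.3f} (from confusion matrix: {accuracy_from_cm:.3f})")
```

Fitting the vectorizer on the training text alone and only transforming the test text keeps test vocabulary out of the learned features, so the reported accuracy is not inflated by information from the test set.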

Sample Output  

Figure 1: Sample Output
