How to Address Class Imbalance in Machine Learning

Objectives:

  • Students will explain how categories of race were socially constructed as a method of controlling slaves and perpetuating the institution of slavery.
  • Students will begin to trace the evolution of racial hierarchy after emancipation.

Resources:

Activity Steps

  1. To start this lesson, ask the students: does anyone know the definition of AI? SLIDE 1
  2. This lesson will discuss the importance of creating fair machine learning models. SLIDE 2
  3. AI can be found in many areas of everyday life, such as social media, smart cars, video games, and chatbots. SLIDE 3
  4. AI is a set of algorithms that can make decisions in unforeseen circumstances. SLIDE 4
  5. What is an algorithm? An algorithm is a set of rules used to obtain the expected output from a given input. SLIDE 5
  6. Machine learning refers to algorithms that enable systems to identify patterns, make decisions, and improve through experience. SLIDE 6
  7. AI refers to the general ability of computers to emulate human thought and perform tasks in real-world environments, such as recognizing patterns and solving problems. SLIDE 8
  8. Data science is the process of analyzing data and extracting relevant information from it. SLIDE 9
  9. AI was created to mimic how humans make decisions. It can make decisions based on patterns in data. For example, even if you’ve never seen a Tibetan Mastiff, you would know it’s a dog based on patterns of features similar to all dogs. SLIDE 10
  10. Training a model means teaching an AI system to recognize these patterns by feeding it a large amount of data and allowing it to learn from that data. The model adjusts itself to improve accuracy in making predictions or decisions based on the input it receives. SLIDE 11
  11. In the next few slides, we will review the machine learning process with the students.
  12. The first step is to get data. SLIDE 12
  13. The second step is to clean, prepare, and manipulate the data. Real-world data often has unorganized, missing, or noisy elements. Having a clean data set helps with your model’s accuracy. SLIDE 13
  14. The third step is to train a model. An algorithm uses math to learn patterns in the data and develop predictions. Classification is used to assign an input to one of several categories. SLIDE 14
  15. The fourth step is to test the model. Check if your model’s predictions were correct. If the results are not satisfactory, you need to improve and retrain your ML model, which is step 5. SLIDE 15
  16. Dr. Buolamwini showed that the datasets used to train some facial recognition software consisted mostly of Caucasian males. As a result, some algorithms were much less accurate at recognizing darker-skinned women. SLIDE 16
  17. A class is the category you want the computer to learn. Class imbalance occurs when the training data is not evenly distributed between classes, leading to biased machine learning models. SLIDE 17
  18. Students will now make their own machine learning model using Teachable Machine. Here is a quick tutorial video about Teachable Machine. SLIDE 18
  19. Follow the instructions to create your own ML model on Teachable Machine and ensure that your model is fair. SLIDE 19
  20. Next, watch the video entitled “Are we automating racism?” and reflect. SLIDE 22
  21. This video discusses how AI can amplify racism. For instance, biased training data and a lack of diversity in creating AI can result in AI systems that are incapable of recognizing African-American women. This lack of recognition exacerbates the issue, as African-American women already face police brutality. SLIDE 22
  22. Due to the disproportionate representation of African-Americans in the prison system, machine learning systems designed to predict recidivism are biased against them. We will talk more about this in the next lesson. SLIDE 23
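
Optional extension: for teachers who want to show class imbalance (step 17) and model testing (step 15) in code, the following is a minimal Python sketch. It is not part of Teachable Machine or the slides. The class names, the 80/20 split, and the "always guess the bigger class" model are all hypothetical values chosen for illustration, and the weighting formula is just one common inverse-frequency heuristic, not the only way to handle imbalance.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_class_weights(labels):
    """Give rarer classes larger weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correct predictions, computed separately for each class."""
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {c: correct[c] / totals[c] for c in totals}


# Hypothetical training labels: 80 examples of one class, 20 of the other.
labels = ["class_a"] * 80 + ["class_b"] * 20

# A hypothetical "lazy" model that always predicts the larger class.
predictions = ["class_a"] * len(labels)

ratio = imbalance_ratio(labels)
weights = balanced_class_weights(labels)
overall = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
by_class = per_class_accuracy(labels, predictions)

print("Imbalance ratio:", ratio)
print("Class weights:", weights)
print("Overall accuracy:", overall)
print("Accuracy per class:", by_class)
```

A discussion point for students: the lazy model looks good when only overall accuracy is checked, yet it never gets the smaller class right. Checking accuracy for each class separately is one way to notice that a model is unfair to an under-represented group.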