Assignment 1: Linear Regression and Logistic Regression
1.1 Introduction to Machine Learning
Machine learning is a subfield of artificial intelligence focused on enabling computers to learn from data without being explicitly programmed for each task. Instead of relying on hard-coded rules, machine learning algorithms identify patterns in data, use those patterns to make predictions, and improve as they are exposed to more data. This lets a system adapt to new situations, personalize experiences, and automate tasks that are too complex to specify by hand.
1.2 Types of Machine Learning
Machine learning algorithms can be broadly categorized into three main types:
2.1 Supervised Learning
In supervised learning, the algorithm learns from labeled data, which means each data point is paired with a corresponding output or target value. The goal is to learn a mapping function that can predict the output for new, unseen inputs.
Example: Predicting house prices based on features like size, location, and number of bedrooms. The data would include historical house prices (labels) along with their corresponding features.
Common Algorithms: Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees, Random Forests. Shukla (2023) discusses regression and classification, the two main kinds of supervised learning task.
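The supervised workflow described above (fit on labeled examples, then predict for unseen inputs) can be sketched in a few lines of Python. This is a minimal illustration, not a real pricing model: it uses a single feature (size), the data are randomly generated, and the names fit_line and predict as well as the coefficients used to generate the toy prices are assumptions made for the example.

```python
import random

random.seed(0)

# Hypothetical labeled data: (size in square meters, price). All numbers are made up.
sizes = [random.uniform(50, 200) for _ in range(100)]
prices = [1500.0 * s + 20000.0 + random.gauss(0, 5000) for s in sizes]

# Hold out the last 20 examples to stand in for "new, unseen inputs".
train_x, test_x = sizes[:80], sizes[80:]
train_y, test_y = prices[:80], prices[80:]

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x

def predict(model, x):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Learn from the labeled training set only.
model = fit_line(train_x, train_y)

# Evaluate on houses the model has never seen.
errors = [predict(model, x) - y for x, y in zip(test_x, test_y)]
test_rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print("slope, intercept:", model)
print("RMSE on held-out houses:", test_rmse)
```

The held-out error, rather than the training error, is what indicates how well the learned mapping generalizes to new inputs.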
2.2 Unsupervised Learning
Unsupervised learning deals with unlabeled data, where the algorithm aims to discover underlying patterns, structures, or relationships in the data without any guidance from target values.
Example: Clustering customers into different groups based on their purchasing behavior. The algorithm would identify similarities and differences in customer data without knowing predefined customer segments.
Common Algorithms: K-means Clustering, Hierarchical Clustering, Principal Component Analysis.
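The customer-clustering example can be sketched with a plain k-means loop. This is a minimal sketch under stated assumptions: the two features (visits per month, average basket value) and the generated customers are hypothetical, and kmeans is an illustrative helper rather than a library function. Note that no segment labels are passed to the algorithm.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of (x, y) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket value).
data_rng = random.Random(1)
occasional = [(data_rng.gauss(2, 0.5), data_rng.gauss(20, 3)) for _ in range(30)]
frequent = [(data_rng.gauss(10, 0.5), data_rng.gauss(80, 3)) for _ in range(30)]
customers = occasional + frequent

centroids, clusters = kmeans(customers, k=2)
for center, members in zip(centroids, clusters):
    print("centroid:", center, "customers:", len(members))
```

On data this well separated, the two recovered centroids should sit close to the two groups that generated the points, even though the algorithm was never told which customer belonged to which group.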
2.3 Reinforcement Learning
Reinforcement learning involves an agent that learns to interact with an environment by taking actions and receiving rewards or penalties. The goal is to learn a policy that maximizes cumulative rewards over time.
Example: Training a robot to navigate a maze. The robot receives rewards for reaching checkpoints and penalties for hitting walls. Through trial and error, it learns an optimal navigation strategy.
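The trial-and-error loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm (named here as an illustration; the text above does not prescribe a specific method). The environment is a deliberately tiny, hypothetical "maze": a corridor of five cells with a wall on the left and the exit on the right. The reward values and the constants ALPHA, GAMMA, and EPSILON are assumptions chosen for the example.

```python
import random

N_STATES = 5          # cells 0..4 in a hypothetical corridor; cell 4 is the exit
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # assumed learning rate, discount, exploration rate

def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    target = state + action
    if target < 0:
        return state, -1.0, False      # bumped into the left wall: penalty
    if target == N_STATES - 1:
        return target, 10.0, True      # reached the exit: reward, episode ends
    return target, 0.0, False

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move toward reward plus discounted best future value.
            best_next = 0.0 if done else max(q[next_state])
            q[state][a] += ALPHA * (reward + GAMMA * best_next - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for every non-terminal cell.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

After training, the greedy policy should settle on moving right in every cell, which is the shortest route to the exit and avoids the wall penalty.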
What is Regression?
Regression analysis aims to model the relationship between a dependent variable (the target we want to predict) and one or more independent variables (predictors or features). The goal is to find a function that best describes this relationship, allowing us to estimate the value of the dependent variable given new values of the independent variables.
Types of Regression
Several types of regression cater to different data characteristics and relationships:
- Linear Regression: Assumes a linear relationship between the dependent and independent variables. It aims to find the best-fitting straight line (or hyperplane in multiple dimensions) that minimizes the sum of squared errors between predicted and actual values.
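- Logistic Regression: Despite its name, logistic regression is used for classification rather than for predicting continuous values. It passes a linear combination of the independent variables through the logistic (sigmoid) function to estimate the probability of a class, most commonly for a binary outcome, and the predicted class is obtained by thresholding that probability. It is therefore not suitable for continuous target variables.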
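To make the contrast between the two items above concrete, here is a minimal sketch of logistic regression fitted by batch gradient descent. Unlike a fitted line, its output is a probability between 0 and 1, which is then thresholded into a class. The data (hours studied versus pass/fail), the helper names sigmoid and fit_logistic, the learning rate, the number of epochs, and the 0.5 threshold are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
for h in (1.0, 2.75, 5.0):
    p = sigmoid(w * h + b)
    # Threshold the probability at 0.5 to obtain a class label.
    print(f"hours={h}: P(pass)={p:.2f} -> class {int(p >= 0.5)}")
```

The model still learns a linear function of the input (w * x + b); the sigmoid only converts that score into a probability, which is why the method carries "regression" in its name while being used for classification.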