┌─────────────────────────────────────────────────────────┐
│                 MACHINE LEARNING TYPES                  │
├─────────────────┬─────────────────┬─────────────────────┤
│   SUPERVISED    │  UNSUPERVISED   │    REINFORCEMENT    │
│       🎯        │       🔍        │         🎮          │
├─────────────────┼─────────────────┼─────────────────────┤
│ Known output    │ No known output │ Learn from          │
│ Make predictions│ Find patterns   │ decisions           │
│                 │                 │                     │
│ Examples:       │ Examples:       │ Examples:           │
│ • Predict price │ • Group         │ • Game playing      │
│ • Classify spam │   customers     │ • Robot control     │
│ • Diagnose      │ • Find anomalies│ • Dynamic pricing   │
│   disease       │ • Reduce        │                     │
│                 │   dimensions    │                     │
└─────────────────┴─────────────────┴─────────────────────┘
Definition: Supervised learning predicts a known output feature from a set of input features.
Visual Flow:
Input Features → [Model] → Known Output
      ↓                         ↓
Distance: 2 mi             Price: $15
Time: 6 pm
Destination: Fenway
Examples: predicting a price, classifying spam, diagnosing disease.
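The supervised flow above (input features in, known output out) can be sketched with a very small predictor. This is only an illustration: the trips and prices are made-up values, `predict_price` is a hypothetical helper, and the technique shown is 1-nearest-neighbor on unscaled features, chosen for brevity rather than accuracy.

```python
# Hypothetical labeled trips: (distance in miles, hour of day) -> price in dollars.
# All values are invented for illustration.
training_trips = [
    ((1.0, 9), 8.0),
    ((2.0, 18), 15.0),
    ((3.5, 13), 19.0),
    ((5.0, 23), 27.5),
]


def predict_price(distance, hour):
    """Return the price of the most similar known trip (1-nearest neighbor)."""
    def squared_distance(features):
        known_distance, known_hour = features
        return (known_distance - distance) ** 2 + (known_hour - hour) ** 2

    # Pick the labeled trip whose input features are closest to the new trip.
    _, nearest_price = min(training_trips, key=lambda trip: squared_distance(trip[0]))
    return nearest_price


predicted = predict_price(2.2, 18)
print(f"Predicted price: ${predicted:.2f}")
```

The key point is that every training row carries a known price, which is what makes the task supervised.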
Definition: Unsupervised learning describes patterns in a dataset that has no known output feature.
Visual Flow:
Input Features → [Algorithm] → Discovered Patterns
      ↓                              ↓
Distance: 2 mi                  Groups:
Time: 6 pm                      • Commuters
Destination: Fenway             • Sports fans
                                • Tourists
Examples: grouping customers, finding anomalies, reducing dimensions.
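To make the idea of discovering groups without labels concrete, here is a minimal sketch of k-means clustering on a single feature. The pickup hours, the starting centers, and the helper name `k_means_1d` are all assumptions made for this illustration; a real analysis would normally use several features and a library implementation.

```python
# Hypothetical pickup hours (24-hour clock). No output labels are provided.
pickup_hours = [7, 8, 8, 9, 12, 13, 13, 18, 19, 19, 20]


def k_means_1d(values, centers, iterations=10):
    """Cluster 1-D values with a basic k-means loop."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Update step: move each center to the mean of its group.
        centers = [
            sum(group) / len(group) if group else center
            for group, center in zip(groups, centers)
        ]
    return centers, groups


centers, groups = k_means_1d(pickup_hours, centers=[6, 13, 21])
for center, group in zip(centers, groups):
    print(f"center {center:.1f}: {group}")
```

The algorithm only returns groups of similar trips. Deciding what each group means (commuters, tourists, and so on) is left to the analyst.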
Definition: Reinforcement learning algorithms make decisions and update their behavior based on the results of previous decisions.
Visual Flow:
State → [Agent] → Action → Reward → Update
  ↓                 ↓                   ↓
Price: $15       Accept?       Yes → Keep price
                               No  → Lower to $14
Examples: game playing, robot control, dynamic pricing.
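The state, action, reward, update loop in the flow above can be sketched as a simple feedback rule. This is not a full reinforcement learning algorithm such as Q-learning. It only mirrors the keep-or-lower rule from the diagram. The function names, the $1 step, the price floor, and the simulated rider are all assumptions for illustration.

```python
def update_price(price, accepted, step=1.0, floor=5.0):
    """Keep the price after an accepted offer; lower it after a rejection."""
    if accepted:
        return price
    return max(floor, price - step)


def run_pricing(start_price, max_willingness_to_pay, rounds=5):
    """Offer a price repeatedly and adjust it from the rider's responses."""
    price = start_price
    history = []
    for _ in range(rounds):
        # Simulated environment: the rider accepts any price at or below a limit.
        accepted = price <= max_willingness_to_pay
        # Reward: revenue earned this round (zero if the offer was rejected).
        reward = price if accepted else 0.0
        history.append((price, accepted, reward))
        # Update: feed the result of this decision into the next offer.
        price = update_price(price, accepted)
    return history


history = run_pricing(start_price=15.0, max_willingness_to_pay=13.0)
for price, accepted, reward in history:
    print(f"offer ${price:.2f}  accepted={accepted}  reward={reward:.2f}")
```

Starting at $15 with a rider who pays at most $13, the offers drop by one dollar per rejection and then hold steady once an offer is accepted.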
import pandas as pd
import numpy as np
# 📊 Sample nurse retention dataset
nurse_data = {
    'EmployeeID': [1415194, 1620383, 1533398, 1479961, 1570909],
    'Age': [34, 30, 25, 35, 38],
    'Quit': ['No', 'No', 'Yes', 'No', 'No'],
    'Department': ['Maternity', 'Maternity', 'Cardiology', 'Maternity', 'Cardiology'],
    'DailyRate': [404, 1312, 383, 982, 508]
}
nurses = pd.DataFrame(nurse_data)
print("🏥 Nurse Retention Dataset:")
print(nurses)
print("\n" + "="*60)
# 🔍 Identify Features and Instances
print("\n📊 Dataset Structure:")
print(f"Number of instances (nurses): {len(nurses)}")
print(f"Number of features: {len(nurses.columns)}")
print(f"Feature names: {nurses.columns.tolist()}")
# 🎯 Define ML Task: Predict if nurse will quit
print("\n🎯 Machine Learning Task:")
print("Type: SUPERVISED LEARNING (Classification)")
print("Goal: Predict whether a nurse will quit")
print("\nInput Features (X): Age, Department, DailyRate")
print("Output Feature (y): Quit")
# 📈 Feature Analysis
print("\n📈 Feature Analysis:")
print(f"Age range: {nurses['Age'].min()} - {nurses['Age'].max()} years")
print(f"Daily rate range: ${nurses['DailyRate'].min()} - ${nurses['DailyRate'].max()}")
print(f"Departments: {nurses['Department'].unique().tolist()}")
print(f"Quit distribution:\n{nurses['Quit'].value_counts()}")
Output:
🏥 Nurse Retention Dataset:
   EmployeeID  Age Quit  Department  DailyRate
0     1415194   34   No   Maternity        404
1     1620383   30   No   Maternity       1312
2     1533398   25  Yes  Cardiology        383
3     1479961   35   No   Maternity        982
4     1570909   38   No  Cardiology        508

============================================================

📊 Dataset Structure:
Number of instances (nurses): 5
Number of features: 5
Feature names: ['EmployeeID', 'Age', 'Quit', 'Department', 'DailyRate']

🎯 Machine Learning Task:
Type: SUPERVISED LEARNING (Classification)
Goal: Predict whether a nurse will quit

Input Features (X): Age, Department, DailyRate
Output Feature (y): Quit

📈 Feature Analysis:
Age range: 25 - 38 years
Daily rate range: $383 - $1312
Departments: ['Maternity', 'Cardiology']
Quit distribution:
No     4
Yes    1
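The task definition above names Age, Department, and DailyRate as input features and Quit as the output feature. One possible way to turn that into model-ready data is sketched below. The variable names `X_encoded` and `y`, the one-hot encoding of Department, and the 0/1 mapping of Quit are choices made for this illustration, not steps taken from the original example.

```python
import pandas as pd

# Same sample data as in the nurse retention example.
nurses = pd.DataFrame({
    'EmployeeID': [1415194, 1620383, 1533398, 1479961, 1570909],
    'Age': [34, 30, 25, 35, 38],
    'Quit': ['No', 'No', 'Yes', 'No', 'No'],
    'Department': ['Maternity', 'Maternity', 'Cardiology', 'Maternity', 'Cardiology'],
    'DailyRate': [404, 1312, 383, 982, 508],
})

# Input features: EmployeeID is an identifier, not a predictor, so it is left out.
X = nurses[['Age', 'Department', 'DailyRate']]

# Department is categorical, so one-hot encode it into numeric indicator columns.
X_encoded = pd.get_dummies(X, columns=['Department'])

# Output feature: map the Yes/No labels to 1/0 for a classifier.
y = nurses['Quit'].map({'No': 0, 'Yes': 1})

print(X_encoded.columns.tolist())
print(y.tolist())
```

Note that the printed feature count of 5 in the output above includes EmployeeID and Quit, while only three columns serve as inputs here.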
┌────────────────────────────────────────────────────────────┐
│                 RIDESHARE PRICE PREDICTION                 │
│                                                            │
│  STEP 1: COLLECT DATA                                      │
│  ┌──────────────────────────────────────────────┐          │
│  │ Distance | Time | Location | Vehicle | Price │          │
│  │ 1.3      | 1pm  | Theatre  | Uber    | $17.5 │          │
│  │ 1.35     | 12pm | South St | Lyft    | $7.0  │          │
│  └──────────────────────────────────────────────┘          │
│                         ↓                                  │
│  STEP 2: IDENTIFY FEATURES                                 │
│  ┌─────────────────────┐      ┌──────────────┐             │
│  │ INPUT FEATURES      │      │ OUTPUT       │             │
│  │ • Distance          │  →   │ • Price      │             │
│  │ • Time              │      │              │             │
│  │ • Location          │      │              │             │
│  │ • Vehicle Type      │      │              │             │
│  └─────────────────────┘      └──────────────┘             │
│                         ↓                                  │
│  STEP 3: TRAIN MODEL                                       │
│  ┌──────────────────────────────────────┐                  │
│  │ 🤖 MACHINE LEARNING MODEL            │                  │
│  │ (Learns patterns from data)          │                  │
│  └──────────────────────────────────────┘                  │
│                         ↓                                  │
│  STEP 4: MAKE PREDICTIONS                                  │
│  ┌──────────────────────────────────────┐                  │
│  │ New Trip: 2 miles, 6pm, Fenway       │                  │
│  │ Predicted Price: $15.00              │                  │
│  └──────────────────────────────────────┘                  │
└────────────────────────────────────────────────────────────┘
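The four steps in the diagram above can be sketched end to end in a few lines. To keep the example short, it uses only one input feature (distance) and fits a straight line by ordinary least squares. The trip values are hypothetical, and `fit_line` is an illustrative helper rather than part of any library. A realistic model would also use time, location, and vehicle type.

```python
# Step 1: collect data (hypothetical trips, for illustration only).
trips = [
    {"distance": 1.0, "price": 9.0},
    {"distance": 2.0, "price": 14.0},
    {"distance": 3.0, "price": 19.0},
    {"distance": 4.0, "price": 24.0},
]

# Step 2: identify the input feature and the output feature.
distances = [trip["distance"] for trip in trips]
prices = [trip["price"] for trip in trips]


# Step 3: train a model, here price = slope * distance + intercept.
def fit_line(xs, ys):
    """Fit a line to (x, y) pairs with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(distances, prices)

# Step 4: make a prediction for a new, unseen trip.
new_distance = 2.5
predicted_price = slope * new_distance + intercept
print(f"Predicted price for {new_distance} miles: ${predicted_price:.2f}")
```

Because the sample prices were chosen to lie exactly on a line, the fit is perfect here. Real trip data would be noisy and the predictions approximate.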
"Machine learning is about teaching computers to learn from data, just like humans learn from experience!"
# Load data
df = pd.read_csv('file.csv')
# View data
df.head() # First 5 rows
df.shape # (rows, columns)
df.columns # Column names
# Select features
X = df[['feature1', 'feature2']] # Multiple columns
y = df[['target']] # Single column, kept as a DataFrame (use df['target'] for a Series)
# Access elements
df.iloc[0, 1] # Row 0, Column 1
df.iloc[:5, 1:3] # First 5 rows, columns 1-2