2.1 NumPy Arrays and Operations
The backbone of every ML model: learn to store, manipulate, and operate on data at high speed with NumPy arrays.

The Analogy: Toolkit Upgrade

Imagine you need to dig 100 holes in your garden. You could use a spoon (Python lists): it works, but it's painfully slow. Or you could use a power drill (NumPy): same holes, up to 50x faster!

Python lists are general-purpose containers that can hold anything (numbers, strings, even cats). But when you need to do math on thousands of numbers, which is what Machine Learning does all the time, they're like using a spoon to dig holes. NumPy is purpose-built for fast number crunching.

NumPy (Numerical Python) is a Python library that provides fast, efficient multi-dimensional array objects and mathematical functions to operate on them. It's the foundation of nearly every ML and data science library.

- NumPy can be up to 50x faster than Python lists for numerical operations (the exact speedup depends on the operation and the array size)
- Most ML datasets end up stored as NumPy arrays or as array types modeled on them
- Neural network weights, image pixels, audio waves: all are naturally represented as arrays
- Libraries like Pandas, Scikit-learn, and TensorFlow build on or interoperate with NumPy
- Uses less memory than Python lists

Python lists store each item as a separate object scattered in memory. NumPy stores numbers in a contiguous block of memory (like seats in a row), so compiled code can sweep through them without a Python-level loop. Writing an operation on the whole array at once, instead of looping over elements, is called vectorization.

True or False (Activity 1): NumPy arrays are slower than Python lists for mathematical operations.

Before we dive in, here are the terms you'll hear constantly. Don't memorize them; just skim now and refer back as needed!

- Array: a container that holds numbers in a structured grid
- Shape: the dimensions of an array (rows × columns)
- dtype: the data type of elements in the array
- ndim: the number of dimensions (1D, 2D, 3D...)
- Vector: a 1D array, a single row or column of numbers
- Matrix: a 2D array with rows and columns (like a spreadsheet)
- Tensor: a 3D+ array, like a stack of matrices
- Reshape: changing the shape of an array without changing its data
- Vectorization: applying an operation to all elements at once (no loops)
- Broadcasting: NumPy's way of handling arrays with different shapes in operations

Quick Check (Activity 2): A 2D array (rows and columns) is commonly called a: ___

Think of creating a NumPy array like organizing items on a shelf. You can arrange them by hand (np.array()), fill a shelf with zeros (np.zeros()), or create a numbered sequence (np.arange()).

1. From a Python List: np.array()

The most common way. Hand your list to NumPy and it creates an efficient array.

    import numpy as np

    scores = np.array([85, 92, 78, 95, 88])

    # Array of zeros (5 zeros)
    zeros = np.zeros(5)

    # Array of ones (3 ones)
    ones = np.ones(3)

    # Range of numbers (0 to 9)
    sequence = np.arange(0, 10)

    # Range with step (0, 2, 4, 6, 8)
    evens = np.arange(0, 10, 2)

np.zeros(n) gives n zeros, np.ones(n) gives n ones, and np.arange(start, stop, step) gives a sequence. np.arange() works like Python's range() (the stop value is excluded), but it returns an array!

2. 2D Arrays (Matrices): List of Lists

Pass a list of lists to create a grid. Think of a spreadsheet where each inner list is a row.

    # 2D array (3 students × 2 subjects)
    grades = np.array([
        [85, 92],  # Student 1: Math, Science
        [78, 88],  # Student 2: Math, Science
        [95, 91]   # Student 3: Math, Science
    ])

    # 2D zeros: 3 rows × 4 columns
    grid = np.zeros((3, 4))

    # Identity matrix (1s on diagonal); the 3 × 3 size is chosen here as an example
    identity = np.eye(3)

Quick Check (Activity 3): Which function creates an array filled with all zeros?

Fill in the Blank (Activity 4): np.arange(0, 10, 2) produces [0, 2, 4, 6, ___]. What's the missing number?

Array Attributes: Inspecting Your Arrays

Just like checking the specs on a phone (screen size, storage, weight), every NumPy array has attributes that tell you about its structure.

- .shape: dimensions as a tuple, (rows, cols)
- .ndim: number of dimensions (1, 2, 3...)
- .dtype: data type of elements (int64, float64)

    arr = np.array([[1, 2, 3], [4, 5, 6]])
    print(arr.shape)  # (2, 3)

.shape returns a tuple.
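To make the attributes concrete, here is a minimal sketch that inspects a small 2 × 3 array. The values are illustrative, and the dtype is set explicitly as an assumption of this example, because the default integer type can differ between platforms.

```python
import numpy as np

# Illustrative 2 x 3 array; dtype is fixed explicitly so the result
# does not depend on the platform's default integer type.
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(arr.shape)  # (2, 3)  -> 2 rows, 3 columns
print(arr.ndim)   # 2       -> two dimensions
print(arr.size)   # 6       -> total number of elements
print(arr.dtype)  # int64   -> data type of every element

# Arrays created with np.zeros() hold floats unless told otherwise.
grid = np.zeros((3, 4))
print(grid.shape)  # (3, 4)
print(grid.dtype)  # float64
```

Note that .size is always the product of the numbers in .shape, so checking one against the other is a quick sanity test when an array does not look the way you expect.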
For (2, 3), the first number is the row count (2) and the second is the column count (3). Total elements = 2 × 3 = 6 (which matches .size).

Quick Check (Activity 5): For np.array([[1,2,3],[4,5,6]]), what does .ndim return?

For a NumPy array with shape (4, 5), what is .size (the total number of elements)?

Array vs List: The Speed Battle

Let's see exactly why NumPy beats Python lists. It's not just speed; the behavior is different too!

Python list:
- Each element is stored as a separate Python object
- Operations use Python loops (slow)
- [1, 2] + [3, 4] = [1, 2, 3, 4] (concatenation!)
- Can mix types: [1, "hello", 3.14]

NumPy array:
- All elements are stored in one contiguous block
- Operations use compiled C code (fast)
- np.array([1, 2]) + np.array([3, 4]) = [4, 6] (element-wise!)
- All elements share one type: [1, 2, 3] are all integers

With Python lists, + means concatenation (joining). With NumPy, + means element-wise addition. This catches beginners every time!

List vs Array: Speed Comparison (animation). Python lists process one element at a time, while NumPy processes the whole array in a single operation.

True or False (Activity 7): In Python, [1, 2, 3] + [4, 5, 6] gives [5, 7, 9] (element-wise addition).

1D, 2D, 3D Arrays: Vectors, Matrices, Tensors
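As a quick orientation, the three terms in this heading correspond to arrays with one, two, and three dimensions. The sketch below builds one of each and prints its ndim and shape; the variable names and values are made up for illustration.

```python
import numpy as np

# Illustrative values only; the names vector, matrix, tensor are hypothetical.
vector = np.array([1, 2, 3])               # 1D: a single row of numbers
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2D: rows and columns
tensor = np.zeros((2, 3, 4))               # 3D: a stack of 2 matrices, each 3 x 4

for name, a in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim =", a.ndim, "shape =", a.shape)
```

Each extra dimension adds one more number to the shape tuple, which is why the 1D shape prints as (3,) with a trailing comma: it is a tuple with a single entry.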
Subject: Python for Machine Learning | Chapter: Chapter 2: Python for Data Science