AWS Certified AI Practitioner - 20% of exam

Fundamentals of AI and ML

What you will learn

In this domain, you learn the relationship between artificial intelligence (AI), machine learning (ML), and deep learning (DL), the differences between supervised, unsupervised, and reinforcement learning, and the general lifecycle of running an ML project. It is a foundational domain that accounts for 20% of the AIF-C01 exam and forms the basis for everything that follows.

Key points

The containment relationship AI ⊃ ML ⊃ DL - AI is the broadest concept, ML is one approach within AI, and DL is an ML approach using neural networks
Supervised learning - learns from pairs of inputs and correct answers (labels). It is divided into classification (predicting a category) and regression (predicting a numeric value)
Unsupervised learning - discovers the structure of data without correct labels. Representative examples are clustering, dimensionality reduction, and anomaly detection
Reinforcement learning - an agent interacts with an environment and learns actions that maximize reward
The ML lifecycle - the repeating cycle of data collection, preprocessing, training, evaluation, deployment, and monitoring
Overfitting - the phenomenon in which a model fits the training data closely but performs worse on test data. The goal is a model that generalizes to unseen data
Inference - the process of giving a trained model new input to obtain predictions. The two main modes are real-time inference and batch inference
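
The lifecycle stages listed above can be sketched as a chain of small functions. This is only an illustrative toy, not an AWS workflow: the data values, the threshold "model", and every function name here are made up for the example.

```python
def collect():
    # Data collection: (hours_studied, passed) pairs; values are invented for this sketch.
    return [(1, 0), (2, 0), (None, 1), (6, 1), (7, 1), (8, 1)]

def preprocess(rows):
    # Preprocessing: drop rows whose input value is missing.
    return [(x, y) for x, y in rows if x is not None]

def evaluate(threshold, rows):
    # Evaluation: fraction of rows where "x >= threshold" matches the label.
    return sum((x >= threshold) == bool(y) for x, y in rows) / len(rows)

def train(rows):
    # Training: choose the threshold with the best accuracy on the training rows.
    candidates = sorted(x for x, _ in rows)
    return max(candidates, key=lambda t: evaluate(t, rows))

def deploy(threshold):
    # Deployment: expose the trained model as a callable prediction function.
    return lambda x: int(x >= threshold)

def monitor(model, recent_rows, minimum_accuracy=0.8):
    # Monitoring: return False when live accuracy falls below the minimum,
    # which is the signal to go back to data collection and retrain.
    accuracy = sum(model(x) == y for x, y in recent_rows) / len(recent_rows)
    return accuracy >= minimum_accuracy

rows = preprocess(collect())
threshold = train(rows)
holdout = [(3, 0), (9, 1)]  # held-out data that was not used for training
model = deploy(threshold)

print("threshold:", threshold)
print("holdout accuracy:", evaluate(threshold, holdout))
print("healthy on recent data:", monitor(model, [(4, 1), (5, 1), (9, 1)]))
```

In the last line the recent data no longer matches what the model learned, so monitoring reports a problem. That is why the lifecycle is described as a repeating cycle rather than a one-way pipeline.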

Terms and concepts

The relationship between AI, machine learning, and deep learning

AI is a broad concept that refers to technologies that mimic intelligent human behavior in general. ML is an AI approach that "automatically learns rules from data," and DL is an ML approach that "uses multilayer neural networks." Most generative AI is an application of DL.

→ Generative AI Platform

Supervised learning and unsupervised learning

Supervised learning learns from labeled data and is used for tasks such as image classification and sales forecasting. Unsupervised learning finds structure in unlabeled data and is used for tasks such as customer segmentation and anomaly detection. Semi-supervised learning sits between the two: it trains on a small amount of labeled data together with a larger amount of unlabeled data.
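
As a rough illustration of the difference, the sketch below runs two toy algorithms on the same kind of points. The nearest-neighbor classifier and the two-cluster k-means are stand-ins chosen for brevity, and the data and function names are assumptions made for this example.

```python
# Supervised learning: every training example comes with a correct label.
labeled = [((1.0, 1.2), "small"), ((0.8, 1.0), "small"),
           ((5.0, 5.5), "large"), ((5.2, 4.9), "large")]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_1nn(point):
    """Classify a point by copying the label of the closest training example."""
    _, label = min(labeled, key=lambda pair: sq_dist(pair[0], point))
    return label

# Unsupervised learning: the same kind of points, but with no labels at all.
unlabeled = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.5), (5.2, 4.9)]

def kmeans_2(points, iterations=10):
    """Tiny k-means with k=2; returns a cluster index (0 or 1) for each point."""
    centers = [points[0], points[-1]]  # naive initialization, good enough for a sketch
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [min((0, 1), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its assigned points.
        for c in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment

print(predict_1nn((4.8, 5.1)))  # a label learned from the labeled examples
print(kmeans_2(unlabeled))      # group indices discovered without any labels
```

The supervised model answers with one of the labels it was given, while the unsupervised one can only say which points belong together. Naming the groups is left to a human.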

→ Machine Learning Operations in Practice

Reinforcement learning

Reinforcement learning is a technique in which an agent (the learner) takes actions in an environment and learns a strategy (policy) that maximizes the cumulative reward it receives as a result. It is used in game AI, robot control, recommendation optimization, and RLHF (reinforcement learning from human feedback) for generative AI.
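
The agent, environment, and reward loop can be sketched with tabular Q-learning on a tiny corridor. Q-learning is just one reinforcement learning algorithm, picked here because it is short. The corridor environment, the hyperparameter values, and all names are assumptions made for this example.

```python
import random

N_STATES = 5        # positions 0..4; reaching position 4 ends the episode
ACTIONS = (-1, +1)  # move left or move right

def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def greedy_action(q, state, rng):
    """Pick the action with the highest estimated value, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise exploit the best known action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate toward reward + discounted future value.
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
# The learned policy: the preferred action in each non-terminal position.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No one tells the agent that moving right is correct. It only sees a reward on reaching the goal, and the preference for moving right emerges from repeated trial and error.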

→ Designing a Recommendation Engine

Overfitting and generalization

Overfitting is the phenomenon in which a model fits the training data too closely and performs poorly on unseen data. It is mitigated by increasing the amount of training data, regularization, and dropout, while cross-validation helps detect it by measuring performance on data held out from training. Conversely, underfitting is a state in which the model is too simple to learn even the training data.
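
One way to see overfitting is to compare training error with error on held-out data. The sketch below does this for a simple line fit and for a model that memorizes every training point. The underlying rule y = 2x, the noise level, and all names are assumptions made for this example.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Noisy samples of y = 2x; the true rule is an assumption of this sketch."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + rng.gauss(0, 1)) for x in xs]

train_set, test_set = make_data(30), make_data(30)

def fit_line(data):
    """Least-squares line: a simple model with only two parameters."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    return lambda x: mean_y + slope * (x - mean_x)

def fit_memorizer(data):
    """Returns the stored y of the nearest training x, so it memorizes the noise too."""
    return lambda x: min(data, key=lambda pair: abs(pair[0] - x))[1]

def mse(model, data):
    """Mean squared error of the model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, model in [("line", fit_line(train_set)), ("memorizer", fit_memorizer(train_set))]:
    print(name, "train:", round(mse(model, train_set), 3),
          "test:", round(mse(model, test_set), 3))
```

The memorizer reaches zero error on the training set because it reproduces every training answer exactly, yet it still makes errors on the test set. A gap like that between training and test error is the usual sign of overfitting.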

→ ML Development with SageMaker

Methods of inference

Real-time inference returns a prediction immediately for each API request (e.g., a chatbot). Batch inference processes a large set of records in a single job (e.g., predicting purchases for all customers overnight). Real-time offers low latency, while batch offers cost efficiency for large volumes of data.
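
The two modes differ mainly in how the same trained model is called. The sketch below uses a hypothetical stand-in model and hypothetical function names. It does not use the SageMaker API and only illustrates the calling pattern.

```python
def predict(features):
    """Stand-in for a trained model (hypothetical): a fixed weighted sum of the inputs."""
    weights = (0.5, 0.25)
    return sum(w * x for w, x in zip(weights, features))

def handle_request(features):
    """Real-time inference: one input arrives, one prediction is returned right away."""
    return {"prediction": predict(features)}

def run_batch_job(records, chunk_size=2):
    """Batch inference: score a whole dataset in chunks and return all results together."""
    results = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        results.extend(predict(r) for r in chunk)
    return results

# Real-time: called once per incoming request.
print(handle_request((1.0, 0.0)))

# Batch: called once for the whole dataset, for example on a nightly schedule.
print(run_batch_job([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]))
```

In both cases the model itself is unchanged. The choice is about when predictions are needed and how much data is scored per call.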

→ Real-time / Batch Inference with SageMaker
