
A to Z of ML, DL, and NLP Concepts

4 min read · Jan 4, 2025

Hello all! This post collects important terminology, from A to Z, related to Machine Learning, Deep Learning, and NLP. I will keep updating it with more terms.

A
  • Algorithm (ML): A step-by-step procedure (e.g., Linear Regression, SVM) for solving problems or making predictions from data.
  • Artificial Neural Networks (DL): Computational models inspired by the human brain, consisting of layers of neurons.

B

  • Bag of Words (NLP): A simple representation of text as a collection of word frequencies, disregarding order (a minimal sketch follows this list).
  • Backpropagation (DL): A method to compute and propagate gradients to optimize neural network weights.
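
To make the Bag of Words idea concrete, here is a minimal sketch in plain Python. The two toy documents and the whitespace tokenization are assumptions made purely for illustration; libraries such as scikit-learn's CountVectorizer build the same representation for you.

```python
from collections import Counter

# Toy corpus; the documents are made up for illustration.
docs = ["the cat sat on the mat", "the dog sat on the log"]

# Build a shared vocabulary (word order is discarded, as in Bag of Words).
vocab = sorted({word for doc in docs for word in doc.split()})

# Represent each document as a vector of word counts over the vocabulary.
for doc in docs:
    counts = Counter(doc.split())
    vector = [counts[word] for word in vocab]
    print(vector)
```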

C

  • Classification (ML): A supervised learning task of predicting discrete labels (e.g., email spam/not spam).
  • Convolutional Neural Networks (DL): Specialized models for handling spatial data like images.

D

  • Dropout (DL): A regularization technique where neurons are randomly deactivated during training to prevent overfitting (see the sketch after this list).
  • Dimensionality Reduction (ML): Techniques (e.g., PCA) to reduce the number of features while retaining key information.
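
A minimal sketch of (inverted) dropout with NumPy, assuming a dense activation matrix and a drop probability chosen only for illustration; frameworks such as PyTorch and Keras provide dropout as a built-in layer.

```python
import numpy as np

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: randomly zero a fraction p of activations and
    rescale the rest so the expected activation stays the same at test time."""
    if not training or p == 0.0:
        return activations
    mask = (np.random.rand(*activations.shape) >= p).astype(activations.dtype)
    return activations * mask / (1.0 - p)

x = np.ones((2, 4))      # toy activations
print(dropout(x, p=0.5))
```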

E

  • Embeddings (NLP): Dense vector representations of words or sentences that capture semantic meaning (e.g., Word2Vec).
  • Ensemble Methods (ML): Combining multiple models (e.g., Random Forests, Boosting) for improved accuracy.

F

  • Feature Selection (ML): Selecting the most relevant features for improving model performance.
  • Feedforward Networks (DL): Basic neural network architectures with forward-only connections.

G

  • Gradient Descent (ML/DL): An optimization algorithm that minimizes error by updating weights in iterative steps (a worked example follows this list).
  • Generative Models (ML): Models like GANs or VAEs for generating new data (e.g., images, text).
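
As a worked example of gradient descent, the sketch below fits a straight line y = 2x + 1 to synthetic data using plain NumPy; the data, learning rate, and iteration count are all illustrative assumptions.

```python
import numpy as np

# Synthetic data for y = 2x + 1 plus noise (made up for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=100)

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate (a hyperparameter)

for _ in range(1000):
    y_pred = w * x + b
    error = y_pred - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Update step: move against the gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # should end up close to 2.0 and 1.0
```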

H

  • Hyperparameter Tuning (ML/DL): The process of optimizing non-learnable settings such as the learning rate or batch size.
  • Hierarchical Clustering (ML): A technique for grouping data points into a hierarchy of clusters.

I

  • Instance-Based Learning (ML): Methods (e.g., k-NN) that memorize examples rather than forming a general model.
  • Intent Recognition (NLP): Detecting a user’s purpose or goal in conversational AI.

J

  • Jacobian Matrix (DL): A matrix of all first-order partial derivatives of a vector-valued function, used when analyzing gradients in neural networks.
  • Jaccard Similarity (ML/NLP): A metric that compares two sets, such as sets of words, by the ratio of their intersection to their union (see the sketch below).
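
A small sketch of Jaccard similarity over word sets; the example sentences are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two short sentences share 2 of 4 distinct words -> similarity 0.5.
print(jaccard("the cat sat".split(), "the cat ran".split()))
```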

K

  • k-Means (ML): A clustering algorithm that partitions data into k groups based on proximity (see the example after this list).
  • Kernel Trick (ML): A technique for transforming data into a higher dimension in SVMs for better separation.
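
A quick k-Means example, assuming scikit-learn is available; the six 2-D points are fabricated so the two clusters are obvious.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two clearly separated blobs of 2-D points (values made up for illustration).
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # the two learned centroids
```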

L

  • Loss Function (ML/DL): A measure of how far predictions are from actual outcomes, e.g., Mean Squared Error (sketched below).
  • LSTMs (DL/NLP): Long Short-Term Memory networks, which handle sequence data while mitigating the vanishing gradient problem.
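
A minimal sketch of one common loss function, Mean Squared Error, with NumPy; the toy predictions and targets are chosen only for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average squared difference between predictions and actual values."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

# Toy targets and predictions.
print(mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # ~1.417
```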

M

  • Model Evaluation (ML): Techniques (e.g., accuracy, precision, recall) for assessing a model's performance (see the example after this list).
  • Multi-Head Attention (NLP/DL): A core component of Transformer models that lets the model attend to multiple parts of the input simultaneously.
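
A short example of computing accuracy, precision, and recall, assuming scikit-learn is available; the labels and predictions below are hypothetical (think of a binary spam/not-spam task).

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical ground-truth labels and model predictions (1 = spam, 0 = not spam).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy: ", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
```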

N

  • Natural Language Generation (NLP): Creating human-like text from structured or unstructured data.
  • Normalization (ML): Scaling data features to a standard range to improve training stability (a sketch follows this list).
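
A minimal sketch of min-max normalization, one common way to scale a feature to a standard range; the feature values are made up for illustration.

```python
import numpy as np

def min_max_scale(x):
    """Rescale a feature to the [0, 1] range."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# A feature with a large raw range (values made up for illustration).
heights_cm = np.array([150, 160, 170, 180, 190])
print(min_max_scale(heights_cm))  # [0.  0.25 0.5 0.75 1. ]
```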

O

  • Optimization (ML/DL): Finding the best parameters for a model by minimizing/maximizing the loss function.
  • Overfitting (ML/DL): When a model learns noise in the training data and therefore performs poorly on unseen data.

P

  • Preprocessing (ML/NLP): Data cleaning, tokenization, and feature transformations before model training.
  • Pooling Layers (DL): In CNNs, layers for down-sampling data to reduce dimensions and computations.

Q

  • Quantization (DL): Reducing the precision of model parameters to accelerate inference on hardware.
  • Query Understanding (NLP): Interpreting the user query in search or conversational systems.

R

  • Recurrent Neural Networks (DL/NLP): Models with loops to process sequential data like time series or text.
  • Regularization (ML/DL): Techniques (e.g., L1, L2) to prevent overfitting by penalizing complex models (see the sketch after this list).
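
A small sketch showing how an L2 (ridge-style) penalty can be added to a Mean Squared Error loss; the data, weights, and regularization strength lam are illustrative assumptions.

```python
import numpy as np

def ridge_loss(y_true, y_pred, weights, lam=0.1):
    """Mean squared error plus an L2 penalty on the weights."""
    mse = np.mean((y_true - y_pred) ** 2)
    l2_penalty = lam * np.sum(weights ** 2)
    return mse + l2_penalty

# Toy values; lam controls how strongly large weights are penalized.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
weights = np.array([0.5, -2.0])
print(ridge_loss(y_true, y_pred, weights, lam=0.1))
```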

S

  • Support Vector Machines (ML): A classification algorithm that maximizes margin between data classes.
  • Sequence-to-Sequence Models (DL/NLP): Encoder-decoder architectures that map an input sequence to an output sequence, used for tasks like machine translation.

T

  • Transformers (DL/NLP): State-of-the-art models (e.g., BERT, GPT) for NLP tasks using attention mechanisms.
  • Tokenization (NLP): Splitting text into tokens, such as words or subwords, for processing (a simple example follows this list).
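
A very simple word-level tokenizer as a sketch; the regular expression here is an assumption for illustration, and production systems typically use subword tokenizers such as BPE or WordPiece instead.

```python
import re

def tokenize(text):
    """Naive word-level tokenizer: lowercase, then keep runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

print(tokenize("Transformers (e.g., BERT) changed NLP!"))
# ['transformers', 'e', 'g', 'bert', 'changed', 'nlp']
```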

U

  • Unsupervised Learning (ML): Techniques for finding hidden patterns in unlabeled data (e.g., Clustering).
  • Universal Sentence Encoder (NLP): Pre-trained embeddings for encoding sentence-level semantics.

V

  • Vanishing Gradients (DL): A problem where gradients become too small, hindering learning in deep networks.
  • Vector Space Models (NLP): Representing text as vectors in high-dimensional space (e.g., TF-IDF, word embeddings).

W

  • Word Embeddings (NLP): Dense representations of words capturing semantic meanings (e.g., Word2Vec, GloVe).
  • Weight Initialization (DL): Setting initial weight values so that training can converge reliably.

X

  • Explainability (ML/DL): Understanding and interpreting model predictions to build trust (e.g., SHAP, LIME).
  • XML Parsing (NLP): Extracting structured data from XML documents for text analysis.

Y

  • y-Label (ML): The dependent variable or ground truth in supervised learning tasks.
  • YAML for Configurations (DL/ML): A human-readable configuration format often used in model pipelines.

Z

  • Zero-Shot Learning (ML/NLP): Generalizing to unseen classes or tasks without specific training data.
  • Z-Scores (ML): Standardizing data to have zero mean and unit variance for comparison (sketched below).
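
A minimal sketch of z-score standardization with NumPy; the sample scores are made up for illustration.

```python
import numpy as np

def z_scores(x):
    """Standardize values to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Exam scores used purely as an illustration.
scores = np.array([55, 60, 65, 70, 75])
print(z_scores(scores))         # standardized values
print(z_scores(scores).mean())  # ~0.0 after standardization
```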
