Machine Learning – Principal Component Analysis

As machine learning engineers, we often work with multi-feature (multidimensional) datasets. As the number of features or dimensions in a dataset increases, the amount of data required to obtain a statistically reliable result grows exponentially.

High dimensionality causes several problems for machine learning models, such as overfitting, long computation times, and reduced accuracy. This set of problems, which arises when working with high-dimensional data, is known as the Curse of Dimensionality.

As the number of dimensions increases, the number of possible combinations of feature values grows exponentially. This makes it difficult to obtain a representative sample of the data, and it makes tasks such as clustering or classification computationally expensive.
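As a rough illustration of this growth, the sketch below counts how many cells are needed to cover the feature space if every feature is split into the same number of bins. The function name `cells_to_cover` and the choice of 10 bins per feature are assumptions made only for this example.

```python
def cells_to_cover(bins_per_feature: int, n_features: int) -> int:
    """Number of grid cells when each feature is split into equal bins.

    Each added feature multiplies the number of cells by bins_per_feature,
    so the count grows exponentially with the number of features.
    """
    return bins_per_feature ** n_features


# Illustrative assumption: 10 bins per feature.
for n_features in (1, 2, 3, 10):
    print(n_features, cells_to_cover(10, n_features))
```

With 10 features, a dataset would need on the order of ten billion samples just to place one sample in every cell, which is why a representative sample quickly becomes impractical.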

In some cases, machine learning algorithms are sensitive to the number of dimensions and require more data to achieve the same level of accuracy that they reach on lower-dimensional data.

To address the Curse of Dimensionality, techniques for feature selection and feature extraction have been introduced. Dimensionality reduction is a type of feature extraction technique that aims to reduce the number of input features while retaining as much of the original information as possible.

In this article, we will discuss one of the most popular dimensionality reduction techniques: Principal Component Analysis (PCA).

What is PCA?

Principal Component Analysis is a technique that finds the most informative dimensions of a dataset, that is, the directions along which the data varies the most, and so makes analysis of the results simpler. Not every dimension in a dataset is critical: some carry most of the information, whereas others contribute little. PCA gives a systematic, calculation-based way of discarding the less important dimensions while keeping as much of the information as possible. Strictly speaking, PCA builds new variables (the principal components) as linear combinations of the original ones and drops the components that carry the least variance.

Principal Component Analysis is also called a dimensionality reduction method. With fewer dimensions, the data is easier to explore and visualize, and algorithms take less time to run on it.

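Therefore, PCA statistically analyzes all dimensions and reduces them as far as possible while preserving as much of the original information (variance) as it can. Some information is generally lost when components are dropped.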

PCA Key Pointer

Terminologies of PCA

Procedure to Calculate Principal Components

The following figure shows the sequence of steps we need to follow to calculate the principal components in machine learning.

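  1. Arrange the dataset and find the number of dimensions and the number of samples.
  2. Calculate the mean of each dimension (feature) and subtract it from the data.
  3. Calculate the covariance matrix.
  4. Calculate the eigenvalues and eigenvectors of the covariance matrix.
  5. Project the data onto the leading eigenvectors to obtain the principal components, and plot them.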
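
The five steps above can be sketched with NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name `pca`, the synthetic three-feature dataset, and the random seed are all assumptions made for the example, and the plotting part of step 5 is left out to keep the block free of extra dependencies.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Step 1: arrange the data; rows are samples, columns are dimensions.
    X = np.asarray(X, dtype=float)

    # Step 2: mean of each dimension, then center the data.
    mean = X.mean(axis=0)
    X_centered = X - mean

    # Step 3: covariance matrix (n_features x n_features).
    cov = np.cov(X_centered, rowvar=False)

    # Step 4: eigenvalues and eigenvectors. eigh is meant for symmetric
    # matrices and returns eigenvalues in ascending order, so re-sort them
    # from largest to smallest.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Step 5: keep the leading eigenvectors and project the centered data.
    components = eigenvectors[:, :n_components]
    projected = X_centered @ components

    # Share of the total variance carried by each kept component.
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return projected, components, explained_ratio


# Synthetic data (assumption): the second feature is almost a multiple of
# the first, and the third is low-variance noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([
    x,
    2 * x + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.05, size=200),
])

projected, components, explained_ratio = pca(X, n_components=2)
print(projected.shape)
print(explained_ratio)
```

Because two of the three features here are strongly correlated, the first principal component should capture nearly all of the variance. The columns of `projected` can be passed to any plotting library to visualize the reduced data.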

Applications of PCA Analysis

Advantages of Principal Component Analysis

Example to Calculate Principal Component
