My Personal Blog

Machine Learning and beyond

Unsupervised Machine Learning: Image compression using K-Means Cluster Algorithm

Posted on June 8, 2020 (updated August 14, 2025) by pluto gasanova

Image compression using K-Means Cluster Algorithm

Most of us are used to working with structured data that fits neatly into fixed rows and columns, for example in relational databases and spreadsheets.

However, more than 90% of the data generated today is considered unstructured, and that share will continue to rise with the growth of the Internet of Things. Examples of unstructured data include social media content, satellite imagery, surveillance imagery, web pages, blogs, video files, audio files, text files, and call center transcripts and recordings.

Unsupervised learning looks for previously undetected patterns and insights, with minimal human supervision, in a dataset that has no pre-existing labels, as is the case for unstructured data.

Given the sheer volume of unstructured data in our lives, there is therefore wide scope for applying unsupervised learning to it.

Clustering is one of the main unsupervised learning techniques, alongside association.

One interesting application of clustering is in color compression within images.

An image is stored as a three-dimensional array of shape (height, width, 3), where the last axis holds the red, green, and blue (RGB) values of each pixel as integers from 0 to 255.
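To make that layout concrete, here is a minimal sketch with NumPy. The original photo is not available here, so a tiny synthetic image stands in for it; the variable names and the 4 x 6 size are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical stand-in for a loaded photo: a tiny synthetic RGB image.
rng = np.random.default_rng(0)
height, width = 4, 6
image = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)

print(image.shape)  # (4, 6, 3): height, width, color channels
print(image.dtype)  # uint8: each channel is an integer from 0 to 255
print(image[0, 0])  # the [R, G, B] triple of the top-left pixel
```

A real photo loaded through an image library would typically have the same layout, only with a much larger height and width.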

One way we can view this set of pixels is as a cloud of points in a three-dimensional color space. We will reshape the data to [n_samples x n_features], with one row per pixel and one column per color channel, and rescale the colors so that they lie between 0 and 1.
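The reshape and rescale step could look roughly like this. Again, the synthetic image and the name `pixels` are assumptions made for the sketch, not the code used for the original post.

```python
import numpy as np

# Hypothetical synthetic image standing in for the real photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# One row per pixel, one column per color channel, scaled to [0, 1].
pixels = image.reshape(-1, 3).astype(np.float64) / 255.0

print(pixels.shape)  # (24, 3): n_samples = 4 * 6 pixels, n_features = 3
```

Each row of `pixels` is now a point in the unit cube of RGB space, which is the form a clustering algorithm expects.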

Now, let’s reduce the roughly 16 million possible colors (256 x 256 x 256) to just 30 colors, using k-means clustering across the pixel space.
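As a rough illustration of the idea, the sketch below implements a bare-bones k-means (Lloyd's algorithm) in NumPy and recolors every pixel with the center of its cluster. It is not the code used for the original post: the function name `kmeans_quantize`, the iteration count, and the synthetic 32 x 32 image are all assumptions.

```python
import numpy as np

def kmeans_quantize(pixels, k, n_iter=20, seed=0):
    """Minimal Lloyd's k-means on an (n, 3) array; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k randomly chosen pixels as the initial cluster centers.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every pixel to every center: shape (n, k).
        dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)

# Hypothetical synthetic image standing in for the real photo.
rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
pixels = image.reshape(-1, 3) / 255.0

centers, labels = kmeans_quantize(pixels, k=30)

# Replace each pixel by its cluster center and restore the image shape.
recolored = centers[labels].reshape(image.shape)
n_colors = len(np.unique(recolored.reshape(-1, 3), axis=0))
print(n_colors)  # at most 30 distinct colors remain
```

The sketch builds a full pixel-by-center distance array, which is fine for a small image but memory-hungry for a full-size photo. In practice a library implementation such as scikit-learn's `KMeans` or `MiniBatchKMeans` would be the more usual choice.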

Some detail is certainly lost in the image on the right side, but the overall image is still easily recognizable. While this is an interesting application of k-means, there are certainly better ways to compress the information in images. Still, the example shows the power of thinking outside the box with unsupervised methods like k-means.
