TDM 40100: Project 9 — 2023
Motivation: Images are everywhere, and images are data! We will take some time to dig more into working with images as data in this project.
Context: In the previous project, we were able to isolate and display the Y, Cb and Cr channels of our ballpit.jpg
image, and we applied an image histogram equalization technique to Y and then merged 3 components, to an equalized image. We understood the structure of an image and how the image’s luminance (Y) and chrominance (Cb and Cr) contributed to the whole image. The human eye is more sensitive to the Y Channel than color channels Cb & Cr. In this project, we will continue to work with 'YCbCr` images as we delve into some image compression techniques, we will implement a variation of jpeg image compression!
Scope: Python, images, openCV, YCbCr, downsampling, discrete cosine transform, quantization
Dataset(s)
The following questions will use the following dataset(s):
-
/anvil/projects/tdm/data/images/ballpit.jpg
Questions
Some helpful links that are really useful.
JPEG is a lossy compression format and an example of transform based compression. Lossy compression means that you can’t retrieve the information that was lost during the compression process. In a nutshell, these methods use statistics to identify and discard redundant data. |
Since the human eye is more sensitive to the Y Channel than color channels, we can reduce the resolution of the color components to achieve image compression. we will first need to import some libraries
To read the image, we will use openCV
Then convert the image from default rgb format to YCrCb format
|
Question 1 (2 pts)
First we will use a downsample technique, to compress an image by reducing the resolution of the color channels. It will return a YCrCb image with lower resolution.
The following statement downsamples the image Cr channel to half (0.5) by using cv2.resize
ballpit_reduce = cv2.resize(ballpit_ycrcb[:,:,1],(0,0),fx=0.5,fy=0.5)
Then we will need to use cv2.resize()
to upsample the resolution reduced image to the original size by using the original image size’s tuple
cv2.resize(ballpit_reduce,(ballpit_ycrcb.shape[1],ballpit_ycrcb.shape[0]))
-
Please write a function named compress_downsample, it will take 3 arguments, a
jpg
file, a float number (fx) for the width downsampling factor; a float number (fy) for the Height downsampling factor. The returns will be a compressed ( downsampled ) image -
Visualize the compressed image aligned with original image
-
Calculate the compression ratio
You may use cv2.imwrite to save the compressed image to a file, get the size of it and divide by size of original image file
|
Question 2 (1 pt)
Second let’s look into the discrete cosine transform technique
Per MathWorks, the discrete cosine transform has the property that visually significant information about an image is concentrated in just a few coefficients of the resulting signal data. Meaning, if we are able to capture the majority of the visually-important data from just a few coefficients, there is a lot of opportunity to reduce the amount of data we need to keep. So DCT is a technique allow the important parts of an image separated from the unimportant ones. |
E.g.
We will need to split the previous created ballpit_ycrcb
into 3 Channels
y_c, cr_c,cb_c = cv2.split(ballpit_ycrcb)
Next, apply 2D DCT to each channel by cv2.dct
y_c_dct = cv2.dct(y_c.astype(np.float32))
cr_c_dct = cv2.dct(cr_c.astype(np.float32))
cb_c_dct = cv2.dct(cb_c.astype(np.float32))
-
Please find the dimension for the output DCT blocks
-
Please print a 8*8 DCT blocks for each channel separately
|
Question 3 (2 pts)
Now let us try to visualize the output of DCT compression. One common way to do it will be to set value zero to some of the DCT coefficients, such as high-frequency ones at right or downward in the DCT output matrix, for example if we only want to keep top-left of 50*50 block of coefficients. We can set the value to zero to all other areas. For example, for the Y channel,
cut_v = 50
y_c_dct[cut_v:,:]=0
y_c_dct[:,cut_v:]=0
After updating the DCT coefficients, we can do inverse DCT on each channel to change back to its pixel intensities from its frequency representation, for example for Y channel
y_rec = cv2.idct(y_c_dct.astype(np.float32))
-
Please create a function named
compress_DCT
to implement image compression with DCT. The arguments are a jpg image, and a number for the coefficient area you would like to keep (we only need to consider same size for horizontal and vertical directions) -
Visualize the DCT compressed image for ballpit.jpg align with the original one
-
Calculate the compression ratio
Question 4 (2 pts)
Next, let us try a quantization technique. Quantization reduces the precision of the DCT coefficients based on human perceptual characteristics. This introduces data loss, but reduces image size greatly. You can read more about quantization here. Apparently, the human brain is not very good at distinguishing changes in high frequency parts of our data, but good at distinguishing low frequency changes.
We can use a quantization matrix to filter out the higher frequency data and maintain the lower frequency data. One of the more common quantization matrix is the following.
q1 = np.array([[16,11,10,16,24,40,51,61],
[12,12,14,19,26,28,60,55],
[14,13,16,24,40,57,69,56],
[14,17,22,29,51,87,80,62],
[18,22,37,56,68,109,103,77],
[24,35,55,64,81,104,113,92],
[49,64,78,87,103,121,120,101],
[72,92,95,98,112,100,103,99]])
We can quantize the DCT coefficients by dividing the value from quantization matrix and rounding to integer. For example for Y channel
np.round(y_c_dct/q1)
-
Please create a function called
compress_quant
that will use the function from question 3, select a 8*8 block and quantize the DCT coefficients before we do DCT inversion -
Run the function with image ballpit.jpg, visualize the output compressed image align with original one
-
Calculate the compression ratio
Question 5 (1 pt)
-
Choose one of your favorite image as input and put all the steps together
-
Visualize the output aligned with the original image, and describe your findings briefly
Project 09 Assignment Checklist
-
Jupyter Lab notebook with your codes, comments and outputs for the assignment
-
firstname-lastname-project09.ipynb
.
-
-
Submit files through Gradescope
Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. In addition, please review our submission guidelines before submitting your project. |