Recurse SP2'23 #3: Towards a cel-shader
I worked late Monday and Tuesday, and today the rest of life just sorta caught up with me. So, the post is mostly just some research notes.
On Monday, a fellow Recurser, Efron, suggested writing a (2D) cel-shader. I want to pause my work on Vidrato to do some math, so this seems like a great next step.
This seems to have two main components:
- Edge detection and manipulation
- Color dithering (quantizing to a much smaller palette)
I don’t really know anything about any of this! For now, I’m going to focus on researching the edge detection angle.
OpenCV makes this too easy
I threw a Canny edge detection webcam filter together last weekend, but OpenCV is doing 100% of the legwork. (Funny enough, I got into OpenCV after reading a great intro by RC alum Sher Minn Chong! This article helped a lot with my initial explorations.) I still don’t understand the parameterization since I don’t understand the underlying algorithms yet, but I added some trackbars for today’s post so that the thresholds can be edited interactively.
Here’s a sample video, in which I’m sweeping the low threshold from 0 to 255, sweeping the high threshold, setting the high threshold, and finally tuning the low threshold.
(Can’t see the video? Here’s an MP4 version for iOS users.)
The code:
#!/usr/bin/env python3
import cv2 as cv
import numpy as np

cap = cv.VideoCapture(0)
fps = cap.get(cv.CAP_PROP_FPS)
# Grab one frame up front to learn the frame size
_, frame = cap.read()
height, width, _ = frame.shape

def noop(x):
    pass

cv.namedWindow('Edgy')
cv.createTrackbar('Low Threshold', 'Edgy', 0, 255, noop)
cv.createTrackbar('High Threshold', 'Edgy', 0, 255, noop)

# VideoWriter takes the frame size as (width, height)
output = cv.VideoWriter('edgy.mp4',
                        cv.VideoWriter_fourcc(*'mp4v'), fps, (width, height))

while True:
    try:
        ret, frame = cap.read()
        if not ret:
            # Camera stopped delivering frames
            break
        blurred = cv.GaussianBlur(frame, (7, 7), 0)
        blurred_gray = cv.cvtColor(blurred, cv.COLOR_BGR2GRAY)
        t1 = cv.getTrackbarPos('Low Threshold', 'Edgy')
        t2 = cv.getTrackbarPos('High Threshold', 'Edgy')
        edges_2d = cv.Canny(blurred_gray, threshold1=t1, threshold2=t2)
        # Black on white instead of white on black
        edges_2d = cv.bitwise_not(edges_2d)
        # Convert back to 3 channels (BGR, OpenCV's default order)
        edges = cv.cvtColor(edges_2d, cv.COLOR_GRAY2BGR)
        # mirror for monitor, not file
        cv.imshow('Edgy', np.flip(edges, axis=1))
        output.write(edges)
        if cv.waitKey(1) & 0xFF == ord('q'):
            break
    except KeyboardInterrupt:
        break

cap.release()
output.release()
cv.destroyAllWindows()
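For reference, the two trackbars feed the hysteresis step at the end of Canny: values above the high threshold count as edges outright, values below the low threshold are dropped, and values in between survive only if they connect to a strong edge. Below is a minimal 1D sketch of that idea in plain Python. The function name `hysteresis_1d`, the sample strengths, and the exact `>=` comparisons are illustrative assumptions, and this is not how OpenCV implements it.

```python
def hysteresis_1d(strengths, low, high):
    """Keep samples >= high, plus samples >= low that touch a kept sample."""
    keep = [s >= high for s in strengths]          # strong edges
    weak = [low <= s < high for s in strengths]    # candidates only
    changed = True
    while changed:
        changed = False
        for i, is_weak in enumerate(weak):
            if is_weak and not keep[i]:
                left = i > 0 and keep[i - 1]
                right = i + 1 < len(keep) and keep[i + 1]
                if left or right:
                    # A weak sample next to a kept one is promoted
                    keep[i] = True
                    changed = True
    return keep

# Hypothetical gradient magnitudes along one row of pixels
strengths = [10, 60, 120, 70, 20, 65, 15]
print(hysteresis_1d(strengths, low=50, high=100))
```

In this sketch the 60 and 70 are kept because they sit next to the 120, while the isolated 65 is discarded even though it is above the low threshold.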
Some notes
So, the above code is pretty representative of what I know about edge detection, which is to say all I really know so far is this:
- Basically, we’re interested in identifying discontinuities in a data set.
- This task seems to boil down to “choice of smoothing function” and “choice of edge strength computation”, along with parameterization of each.
- The smoothing function is meant to reduce the ambiguity around potential edges. An example that helped make this more concrete compared two simple data sets: [5, 7, 6, 4, 152, 148, 149] and [5, 7, 6, 41, 133, 148, 149]. They both have a clear edge after the fourth element, but the second data set could also be argued to have edges after the third and fifth elements.
- Computing edge strength seems to be a matter of computing a first or second order gradient, and considering the gradient’s magnitude at each point.
- Canny edge detection sounds like the most approachable method, and it seems to rely specifically on Gaussian smoothing.
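To make the notes above concrete, here is a small plain-Python sketch that runs a first-order difference over the two example data sets, with an optional smoothing pass. The helper names `smooth` and `edge_strength`, the 3-tap kernel, and the choice to pad by repeating the end values are all assumptions for illustration, not anything taken from OpenCV.

```python
def smooth(data, kernel):
    """Weighted moving average; the ends are padded by repeating the end values."""
    half = len(kernel) // 2
    padded = [data[0]] * half + list(data) + [data[-1]] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(data))
    ]

def edge_strength(data):
    """Magnitude of the first-order difference between neighbouring samples."""
    return [abs(b - a) for a, b in zip(data, data[1:])]

clean = [5, 7, 6, 4, 152, 148, 149]
ambiguous = [5, 7, 6, 41, 133, 148, 149]
kernel = [0.25, 0.5, 0.25]  # a tiny Gaussian-like kernel, chosen arbitrarily

for name, data in (("clean", clean), ("ambiguous", ambiguous)):
    print(name, "raw:     ", edge_strength(data))
    print(name, "smoothed:", edge_strength(smooth(data, kernel)))
```

The raw differences show one dominant jump in the first data set and three noticeable jumps in the second. Smoothing flattens the small wiggles, but it also spreads a sharp step across more samples, which is one reason the kernel size and thresholds end up being tuned together.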
My goals:
- Understand and implement Gaussian smoothing, which seems like the most common pre-processing step.
- This in turn requires convolution.
- In particular, we’re interested in convolution matrices, which are called kernels in image processing. (These share a name with kernel methods in machine learning, but as far as I can tell that is a different concept, and I haven’t gotten around to tackling those yet.)
- Understand and implement the rest of Canny edge detection.
- See the process outlined on Wikipedia. There’s still a lot more to unpack about that.
In the context of the above code, this means I want to implement my own versions of cv::GaussianBlur and cv::Canny, as well as fast convolution.
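As a starting point for the first goal, here is a naive sketch of a Gaussian kernel and a direct 2D convolution in plain Python, operating on a grayscale image stored as a list of rows. The names `gaussian_kernel` and `convolve2d` are hypothetical, the kernel size is assumed to be odd, and clamping to the nearest edge pixel at the borders is my own choice rather than necessarily what `cv.GaussianBlur` does by default. It is quadratic in the kernel size, so it is nowhere near "fast convolution" yet.

```python
import math

def gaussian_kernel(size, sigma):
    """Build a size x size Gaussian kernel (size odd), normalised to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def convolve2d(image, kernel):
    """Direct 2D convolution; out-of-range pixels use the nearest edge pixel."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for kr, row in enumerate(kernel):
                for kc, weight in enumerate(row):
                    # Subtracting the offset flips the kernel (true convolution)
                    rr = min(max(r - (kr - half), 0), h - 1)
                    cc = min(max(c - (kc - half), 0), w - 1)
                    acc += weight * image[rr][cc]
            out[r][c] = acc
    return out

# A tiny image with a vertical step edge down the middle
image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
blurred = convolve2d(image, gaussian_kernel(3, 1.0))
for row in blurred:
    print([round(v, 1) for v in row])
```

Because the Gaussian kernel is symmetric, flipping it makes no difference here, but the flip matters for asymmetric kernels such as the gradient kernels used later in Canny.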
(Around here I switched off to working in a Jupyter notebook for a while, which I’m hoping to have more to share from soon! In the meantime, I’ll have a lot of math to sort through.)