[Python] Computer Vision

# Load the image
image = cv2.imread('화면 캡처 2025-10-09 114154.png')

Problem #1 Image Transformations

import cv2
import numpy as np

# 1-1 Translation
def translate(image, x, y):
    h, w = image.shape[:2]
    translation_matrix = np.float32([[1, 0, x], [0, 1, y]])
    transformed_image = cv2.warpAffine(image, translation_matrix, (w, h))
    return transformed_image

# 1-2 Rotation
def rotate(image, angle):
    h, w = image.shape[:2]
    center = (w // 2, h // 2)
    transform_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
    transformed_image = cv2.warpAffine(image, transform_matrix, (w, h))
    return transformed_image

# 1-3 Affine Transformation
def AffineTransformation(image, source_points, target_points):
    transform_matrix = cv2.getAffineTransform(np.float32(source_points),
                                              np.float32(target_points))
    h, w = image.shape[:2]
    transformed_image = cv2.warpAffine(image, transform_matrix, (w, h))
    return transformed_image

# 1-4 Perspective Transformation
def PerspectiveTransformation(image, source_points, target_points):
    transform_matrix = cv2.getPerspectiveTransform(np.float32(source_points),
                                                   np.float32(target_points))
    h, w = image.shape[:2]
    transformed_image = cv2.warpPerspective(image, transform_matrix, (w, h))
    return transformed_image

warpAffine → 2x3 matrix (affine transform: straight lines stay straight and parallel lines stay parallel)
warpPerspective → 3x3 matrix (perspective transform: can also express perspective, so a rectangle may come out as a trapezoid)

translate: builds a translation matrix and shifts the image with cv2.warpAffine().
rotate: uses cv2.getRotationMatrix2D() + cv2.warpAffine().
AffineTransformation: cv2.getAffineTransform() from 3 point pairs → cv2.warpAffine().
PerspectiveTransformation: cv2.getPerspectiveTransform() from 4 point pairs → cv2.warpPerspective().
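
To make the 2x3 vs 3x3 difference concrete, here is a small NumPy-only sketch that applies each kind of matrix to the corners of a unit square. The helper names apply_affine and apply_perspective and the two example matrices are made up for illustration; they are not part of the assignment code.

```python
import numpy as np

def apply_affine(matrix_2x3, points):
    """Map (N, 2) points with a 2x3 affine matrix (hypothetical helper)."""
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return homogeneous @ np.asarray(matrix_2x3, dtype=np.float64).T

def apply_perspective(matrix_3x3, points):
    """Map (N, 2) points with a 3x3 matrix, then divide by the third coordinate."""
    points = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ np.asarray(matrix_3x3, dtype=np.float64).T
    return mapped[:, :2] / mapped[:, 2:3]

# Corners of a unit square, in order around the square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Example affine matrix: shear in x plus a shift of (2, 3).
shear = np.array([[1.0, 0.5, 2.0],
                  [0.0, 1.0, 3.0]])
affine_pts = apply_affine(shear, square)

# Example perspective matrix: the last row makes the divisor grow with y.
perspective = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.5, 1.0]])
persp_pts = apply_perspective(perspective, square)

print(affine_pts)  # opposite sides are still parallel (a parallelogram)
print(persp_pts)   # left and right sides are no longer parallel (a trapezoid)
```

The only structural difference is the division by the third coordinate; with a last row of [0, 0, 1] the perspective mapping reduces to an affine one.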

Problem #2 Linear Filters

# 2-1 Gaussian Filter
def Gaussian_filter(image):
    result = cv2.GaussianBlur(image, (5, 5), 1.0)
    return result

# 2-1 Sobel
def Sobel(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    grad_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    grad_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    result = cv2.magnitude(grad_x, grad_y)
    return result

# 2-1 Laplacian
def Laplacian(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    result = cv2.Laplacian(gray, cv2.CV_64F, ksize=3)
    return result

Gaussian: reduces noise (used for preprocessing)
Sobel: finds edges from the horizontal (x) and vertical (y) gradients
Laplacian: finds edges in all directions at once (but is sensitive to noise)

Gaussian_filter: cv2.GaussianBlur(image, (kernel, kernel), sigma)
Sobel: cv2.Sobel(image, cv2.CV_64F, dx, dy, ksize=...)
Laplacian: cv2.Laplacian(image, cv2.CV_64F)
The Salt & Pepper noise function is already implemented → add_salt_pepper_noise.

Problem #2-3 Median Blur

def median_blur(image):
    # Apply several kernel sizes (e.g. 3, 5, 7)
    result_3 = cv2.medianBlur(image, 3)
    result_5 = cv2.medianBlur(image, 5)
    result_7 = cv2.medianBlur(image, 7)
    return result_3, result_5, result_7

Uses cv2.medianBlur(image, ksize).
Applies several kernel sizes (3, 5, 7, etc.).

Problem #3 Image Pyramids

# 3-1 Resize (Down / Up Sampling)
def resize_examples(image):
    down_result_01 = cv2.resize(image, (image.shape[1]//2, image.shape[0]//2), interpolation=cv2.INTER_NEAREST)
    down_result_02 = cv2.resize(image, (image.shape[1]//2, image.shape[0]//2), interpolation=cv2.INTER_LINEAR)
    down_result_03 = cv2.resize(image, (image.shape[1]//2, image.shape[0]//2), interpolation=cv2.INTER_CUBIC)

    up_result_01 = cv2.resize(image, (image.shape[1]*2, image.shape[0]*2), interpolation=cv2.INTER_NEAREST)
    up_result_02 = cv2.resize(image, (image.shape[1]*2, image.shape[0]*2), interpolation=cv2.INTER_LINEAR)
    up_result_03 = cv2.resize(image, (image.shape[1]*2, image.shape[0]*2), interpolation=cv2.INTER_CUBIC)

    return down_result_01, down_result_02, down_result_03, up_result_01, up_result_02, up_result_03

# 3-2 Gaussian Pyramid
def gaussian_pyramid(image):
    G_down1 = cv2.pyrDown(image)
    G_down2 = cv2.pyrDown(G_down1)
    G_up1 = cv2.pyrUp(G_down2)
    G_up2 = cv2.pyrUp(G_up1)
    return G_down1, G_down2, G_up1, G_up2

# 3-3 Laplacian Pyramid
def laplacian_pyramid(image):
    G_down = cv2.pyrDown(image)
    G_up = cv2.pyrUp(G_down)
    L = cv2.subtract(image, G_up)   # original - upsampled
    return L

Down-sampling: cv2.resize(image, None, fx=0.5, fy=0.5, interpolation=...) or cv2.pyrDown(image)
Up-sampling: cv2.resize(image, None, fx=2, fy=2, interpolation=...) or cv2.pyrUp(image)
Repeat this over several levels to build a Gaussian Pyramid / Laplacian Pyramid.
The Laplacian Pyramid is computed as original - Gaussian_up (the up-sampled image is subtracted from the original, as in the code above).

Problem #4 Median Blur

def my_median_blur(image, size):
    import numpy as np
    h, w = image.shape[:2]
    pad = size // 2
    padded = cv2.copyMakeBorder(image, pad, pad, pad, pad, cv2.BORDER_REFLECT)
    result = np.zeros_like(image)

    for i in range(h):
        for j in range(w):
            window = padded[i:i+size, j:j+size].flatten()
            median_val = np.median(window)
            result[i, j] = median_val
    return result

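Implemented by hand instead of using cv2.medianBlur:

Collect the pixels in the window (kernel) region.
Sort them with numpy.sort() (the code above calls np.median for this step).
Assign the median as the pixel value.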
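
The three steps above, written out for a single 3x3 grayscale window. The window values are made up for illustration.

```python
import numpy as np

# One 3x3 window with a salt pixel (255) and a pepper pixel (0).
window = np.array([[12, 255, 10],
                   [11,  13,  9],
                   [ 0,  14, 12]], dtype=np.uint8)

values = np.sort(window.flatten())     # 1) collect and 2) sort
median_val = values[len(values) // 2]  # 3) the middle element is the median
print(median_val)
```

For an odd number of values the middle element of the sorted array is exactly what np.median returns, so either form can be used inside the loop.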

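[The whole flow summarized in one shot]

Resize: only changes the size (the choice of interpolation is the key). AREA is recommended for Down, CUBIC/LANCZOS for Up.
Gaussian Pyramid: a hierarchy that lowers the resolution step by step so that only low-frequency information remains.
Laplacian Pyramid: a hierarchy that collects, at each level, original - up-sampled low-resolution image = high-frequency residual.
Median Blur: removes noise with the window median (especially strong against Salt & Pepper). For color, take the median per channel.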

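3-1 resize_examples(image): resizing and interpolation
What does it do?

It's an example that shrinks the image to half size (Down) and, separately, enlarges it to 2x (Up).
In cv2.resize(src, dsize, interpolation=…):
- dsize=(new width, new height) is an absolute size in pixels.
- interpolation is the rule that decides how the new pixel values get filled in.

Differences between the interpolation methods

INTER_NEAREST
- Copies the single nearest source pixel as-is.
- Fast, but heavy staircase/block artifacts (especially when up-sampling).
INTER_LINEAR
- Linearly weighted average of the surrounding 2×2 pixels.
- The usual default. Fast, and the results are decent.
INTER_CUBIC
- Cubic interpolation over the surrounding 4×4 pixels.
- Smoother but slower. Detail looks relatively natural when up-sampling.

Practical tips

For down-sampling, INTER_AREA is the better choice (area-based averaging → suppresses aliasing).
To push up-sampling quality, INTER_CUBIC or INTER_LANCZOS4 is usually better.
If you only want to specify a scale ratio, it's handy to use fx, fy instead of dsize.
Applying a light Gaussian blur before Down (e.g. σ=0.5~1.0) reduces aliasing.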

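3-2 gaussian_pyramid(image): Gaussian pyramid
What does it do?

pyrDown applies a Gaussian blur (5×5 kernel) and then shrinks the image to 1/2.
pyrUp doubles the size by inserting zero rows and columns, then smooths with the same Gaussian kernel to fill in the missing pixels.
Here: image → G_down1 → G_down2 (two reductions) → G_up1 → G_up2 (two enlargements back).

Why use it?

Lowering the resolution leaves only the low frequencies (large structures) and reduces the high frequencies (detail/noise).
→ This lets you look at and handle a problem at multiple resolutions.

Things to watch out for

If the input width/height is odd, the sizes can end up off by 1 pixel after pyrDown and pyrUp.
→ Usually you either slice the pyrUp result down to the original size, or pass dstsize explicitly so the sizes line up.
If you need several levels, keep the per-level images in a list.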

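3-3 laplacian_pyramid(image): Laplacian pyramid (residual image)
What does it do?

G_down = pyrDown(image)   # blur + shrink
G_up = pyrUp(G_down)      # enlarge again (the reconstruction)
L = image - G_up          # original - reconstruction = the lost detail (high frequency)

L keeps only high-frequency components such as edges and contours.
This is one level of the Laplacian; repeating it over several levels gives the Laplacian pyramid.

Practical points

Using cv2.subtract on uint8 clips negative values to 0, so detail can be lost.
→ It's safer to keep the computation in float32. For example:
f = image.astype(np.float32)
L = f - cv2.pyrUp(cv2.pyrDown(f))
Reconstruction:
G0 ≈ expand(G1) + L0, i.e. expand the coarser level and add the residual to get back close to the original.
The Laplacian filter result from the cv2.Laplacian function (second derivative) and the Laplacian of the pyramid are different concepts.
- The former is a derivative filter,
- the latter is a multi-resolution residual in the Difference-of-Gaussians (DoG) family.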

Implementing it by hand instead of using cv2.medianBlur:

Collect the pixels in the window (kernel) region.
Sort them with numpy.sort().
Assign the median as the pixel value.

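< How it works >

Take the size×size window around each pixel and assign its median to that pixel.
Especially strong at removing point noise such as Salt & Pepper (it uses the median instead of the mean, so edges are preserved).

< Reading the code / pros and cons >

Border handling uses REFLECT (mirror reflection), which looks natural.
Time complexity is roughly O(H×W×K² log(K²)) (K=size), so it is very slow for large kernels.
An important issue (color images)
- For a color (BGR) image, window has shape (K, K, 3), and .flatten() mixes the channels so only a single median is computed → all three channels end up filled with that one value.
- In other words, the colors can come out washed-out/gray.
- For color, the median has to be computed per channel to be correct.
- Correct per-channel median (take the window without .flatten()):
  window = padded[i:i+size, j:j+size]
  median_val = np.median(window, axis=(0, 1))   # shape (3,)
  result[i, j] = median_val
size only makes sense as a positive odd number (3, 5, 7, …); if it is too large, even the edges get wiped out.

Optimization / alternatives

In practice cv2.medianBlur(image, ksize) is much faster and well optimized.
To vectorize without Python loops, you can build the windows with numpy.lib.stride_tricks.sliding_window_view and compute the median along the window axes in one go, but this uses a lot of memory.
If you only handle grayscale, the current code is fine as it is, but for color, switching to a per-channel median is a must.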
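
A loop-free, per-channel variant along those lines might look like the sketch below. median_blur_per_channel is a name I made up; it uses np.pad with mode="symmetric" as the NumPy counterpart of cv2.BORDER_REFLECT, and the small test image is invented to show the channel-mixing problem.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_blur_per_channel(image, size):
    """Median filter computed independently per channel (hypothetical helper)."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    pad = size // 2
    is_gray = image.ndim == 2
    img = image[..., None] if is_gray else image  # treat grayscale as 1 channel
    # "symmetric" mirrors including the edge pixel, like cv2.BORDER_REFLECT.
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="symmetric")
    # windows has shape (H, W, C, size, size): one size x size patch per pixel and channel.
    windows = sliding_window_view(padded, (size, size), axis=(0, 1))
    result = np.median(windows, axis=(-2, -1)).astype(image.dtype)
    return result[..., 0] if is_gray else result

# Flat BGR image with one white "salt" pixel in the middle.
image = np.empty((6, 6, 3), dtype=np.uint8)
image[:] = (10, 120, 240)
image[3, 3] = (255, 255, 255)

result = median_blur_per_channel(image, 3)

# What the flatten-everything approach computes for the same window: one value for all channels.
mixed = np.median(image[2:5, 2:5].flatten())

print(result[3, 3], mixed)
```

The windows array is a view, but np.median has to copy it to sort, so peak memory grows with H×W×C×size²; for large images or kernels cv2.medianBlur remains the practical choice.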


Learning_EunBi
