
LMIM Modeling Code

by 응_비, 2026-03-15

Ground-truth labels were extracted from IIIT5K's .mat annotations and converted to LMDB format to build an STR finetuning pipeline.
An LMIM pretrained checkpoint was then attached to a downstream recognition model,

but an encoder dimension mismatch (768 -> 384) limited the transfer-learning benefit,

and a 1-epoch smoke test produced Acc 0.003% and Rec F-measure 0.0588, confirming only that the pipeline runs end to end.
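When the pretrained encoder width (768) does not match the downstream recognizer (384), a strict `load_state_dict` fails outright. A common workaround is to load only the tensors whose names and shapes match, and report the rest. This is a minimal sketch, not the LMIM repo's own loading code; the `key="model"` default mirrors the `model|module` key convention in the pretraining args but is an assumption here.

```python
import torch

def load_matching_weights(model, ckpt_path, key="model"):
    """Copy only checkpoint tensors whose names AND shapes match `model`.

    Mismatched entries (e.g. 768-dim encoder weights loaded against a
    384-dim recognizer) are skipped and returned for inspection.
    """
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Checkpoint may be {"model": state_dict} or a bare state_dict.
    state = ckpt.get(key, ckpt)
    own = model.state_dict()
    kept, skipped = {}, []
    for name, tensor in state.items():
        if name in own and own[name].shape == tensor.shape:
            kept[name] = tensor
        else:
            skipped.append(name)
    # strict=False tolerates the model parameters we could not fill.
    model.load_state_dict(kept, strict=False)
    return kept, skipped
```

If `skipped` ends up containing most of the encoder weights, the "transfer" is effectively a cold start, which is consistent with the limited gains observed in the smoke test above.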

 

I am currently training a Scene Text Recognition model on the IIIT5K dataset.
The plan is to first learn text-image representations with LMIM pretraining,
then finetune on the recognition task and measure OCR/STR performance, i.e. reading words from images.
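The GT extraction mentioned above can be sketched as follows. The record layout (a `traindata` struct array with `ImgName` and `GroundTruth` fields) follows the official IIIT5K release, and the `image-%09d`/`label-%09d` key scheme is the MJSynth-style LMDB convention most STR training code expects; treat both as assumptions to verify against your own copy of the data.

```python
def lmdb_key(kind: str, index: int) -> bytes:
    # STR LMDB convention: 1-based index, zero-padded to 9 digits
    return f"{kind}-{index:09d}".encode()

def iter_gt(mat_dict, split="traindata"):
    """Yield (relative image path, word label) from a scipy.io.loadmat dict.

    loadmat returns a (1, N) struct array; each record holds .ImgName and
    .GroundTruth as 1-element string arrays (hence the [0] indexing).
    """
    for rec in mat_dict[split][0]:
        yield str(rec["ImgName"][0]), str(rec["GroundTruth"][0])

def write_lmdb(samples, out_dir, map_size=1 << 30):
    import lmdb  # imported lazily so the parsing helpers stay dependency-free
    env = lmdb.open(out_dir, map_size=map_size)
    with env.begin(write=True) as txn:
        n = 0
        for img_path, label in samples:
            n += 1
            with open(img_path, "rb") as f:
                txn.put(lmdb_key("image", n), f.read())
            txn.put(lmdb_key("label", n), label.encode())
        txn.put(b"num-samples", str(n).encode())
    env.close()
```

Usage would be `write_lmdb(iter_gt(scipy.io.loadmat("traindata.mat")), "train_lmdb")`, with image paths resolved relative to the dataset root.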


  • GPU environment OK
  • IIIT5K loading OK
  • LMIM model OK
  • Training loop OK
  • GPU memory usage OK
  • Throughput back to normal
# =========================================================
# STEP 1. Environment installation cell
# After running this cell, you MUST "Restart runtime"
# =========================================================
%cd /content

# Remove previously installed conflicting packages
!pip uninstall -y numpy opencv-python opencv-python-headless opencv-contrib-python imgaug lmdb timm || true

# Install the version combination compatible with the older LMIM code
!pip install -q --no-cache-dir \
  "numpy==1.26.4" \
  "timm==0.4.12" \
  "opencv-python==4.8.1.78" \
  "imgaug==0.4.0" \
  "lmdb" \
  "easydict" \
  "tensorboardX" \
  "pyyaml" \
  "scipy" \
  "tqdm" \
  "nltk"

print("\n[IMPORTANT] Now restart via the Colab menu: 'Runtime > Restart runtime'.")

 

 

# Remove the conflicting numpy
!pip uninstall -y numpy

# Install numpy 1.x
!pip install numpy==1.26.4

# Restart the runtime (hard kill of the current process)
import os
os.kill(os.getpid(), 9)
# =========================================================
# STEP 2. 환경 확인
# =========================================================
import sys
import numpy as np

print("python:", sys.executable)
print("numpy version:", np.__version__)
print("numpy path:", np.__file__)

assert np.__version__ == "1.26.4", f"NumPy version mismatch: {np.__version__}"
print("[OK] NumPy 1.26.4 confirmed")

 

%cd /content

!pip uninstall -y numpy imgaug opencv-python opencv-python-headless opencv-contrib-python lmdb timm

!pip install --no-cache-dir \
  numpy==1.26.4 \
  imgaug==0.4.0 \
  opencv-python==4.8.1.78 \
  lmdb \
  timm==0.4.12 \
  easydict \
  tensorboardX \
  pyyaml \
  scipy \
  tqdm \
  nltk

!python -m pip show numpy

 

# Verify that the reinstalled packages import cleanly
import imgaug
import lmdb
import cv2
import torch

print("imgaug:", imgaug.__version__)
print("cv2:", cv2.__version__)
print("torch:", torch.__version__)
print("[OK] imports successful")
# =========================================================
# STEP 3. LMIM clone
# =========================================================
%cd /content
!rm -rf /content/LMIM
!git clone https://github.com/zhangyifei01/LMIM.git /content/LMIM

!ls /content/LMIM
!ls /content/LMIM/lmim_pretrain
# =========================================================
# STEP 4. Google Drive mount
# =========================================================
from google.colab import drive
drive.mount('/content/drive')
# =========================================================
# STEP 5. Link the IIIT5K dataset
# =========================================================
import os, shutil
from pathlib import Path

DATA_DIR = "/content/drive/MyDrive/datasets/IIIT5K"   # adjust if needed
LINK_PATH = "/content/LMIM/data/IIIT5K"

assert Path(DATA_DIR).exists(), f"Dataset path not found: {DATA_DIR}"

os.makedirs("/content/LMIM/data", exist_ok=True)

if os.path.islink(LINK_PATH):
    os.unlink(LINK_PATH)
elif os.path.exists(LINK_PATH):
    shutil.rmtree(LINK_PATH)

os.symlink(DATA_DIR, LINK_PATH)

print("[OK] linked:", LINK_PATH)
print("[OK] DATA_DIR exists:", os.path.exists(DATA_DIR))
print("[OK] LINK_PATH exists:", os.path.exists(LINK_PATH))
print("[OK] sample files:", os.listdir(LINK_PATH)[:20])
# =========================================================
# STEP 6. util/misc.py patch
# torch._six -> math.inf
# =========================================================
from pathlib import Path
import shutil

MISC_PATH = Path("/content/LMIM/lmim_pretrain/util/misc.py")
MISC_BAK = Path("/content/LMIM/lmim_pretrain/util/misc.py.bak_torchsix")

assert MISC_PATH.exists(), f"misc.py not found: {MISC_PATH}"

if not MISC_BAK.exists():
    shutil.copy(MISC_PATH, MISC_BAK)
    print(f"[OK] backup created: {MISC_BAK}")
else:
    print(f"[OK] backup already exists: {MISC_BAK}")

text = MISC_PATH.read_text()

old = "from torch._six import inf"
new = "from math import inf"

if old in text:
    text = text.replace(old, new, 1)
    MISC_PATH.write_text(text)
    print("[OK] torch._six patched")
else:
    print("[INFO] torch._six patch target not found (already patched or different code)")

print("\n[CHECK]")
updated = MISC_PATH.read_text()
idx = updated.find("inf")
print(updated[max(0, idx-100):idx+140])
# =========================================================
# STEP 7. np.float patch
# =========================================================
import os

root = "/content/LMIM"
patched_files = []

for current_root, _, files in os.walk(root):
    for fn in files:
        if not fn.endswith(".py"):
            continue

        path = os.path.join(current_root, fn)
        with open(path, "r", encoding="utf-8", errors="ignore") as f:
            text = f.read()

        new_text = text
        new_text = new_text.replace("dtype=np.float)", "dtype=float)")
        new_text = new_text.replace("dtype=np.float,", "dtype=float,")
        new_text = new_text.replace("dtype=np.float\n", "dtype=float\n")

        if new_text != text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(new_text)
            patched_files.append(path)

print("[OK] np.float patch done")
for p in patched_files:
    print("-", p)
# =========================================================
# STEP 8. engine_pretrain.py patch
# Handle ImageFolder returning (img, label)
# =========================================================
from pathlib import Path
import shutil, re

ENGINE_PATH = Path("/content/LMIM/lmim_pretrain/engine_pretrain.py")
ENGINE_BAK = Path("/content/LMIM/lmim_pretrain/engine_pretrain.py.bak_onetime")

assert ENGINE_PATH.exists(), f"engine_pretrain.py not found: {ENGINE_PATH}"

if not ENGINE_BAK.exists():
    shutil.copy(ENGINE_PATH, ENGINE_BAK)
    print(f"[OK] backup created: {ENGINE_BAK}")
else:
    print(f"[OK] backup already exists: {ENGINE_BAK}")

text = ENGINE_PATH.read_text()

original_exact = """        samples = udata[0].to(device, non_blocking=True)
        samples_aug = udata[1].to(device, non_blocking=True)
"""

patched_block = """        samples = udata[0].to(device, non_blocking=True)

        # HOTFIX:
        # ImageFolder returns (image, label), but LMIM expects (image, augmented_image).
        samples_aug = samples.clone()

        print("[DEBUG] samples.shape =", samples.shape)
        print("[DEBUG] samples_aug.shape =", samples_aug.shape)
"""

patched = False

if original_exact in text:
    text = text.replace(original_exact, patched_block, 1)
    patched = True
    print("[OK] exact patch applied")
else:
    pattern = r"""
(\s*)samples\s*=\s*udata\[0\]\.to\(device,\s*non_blocking=True\)\s*
\1samples_aug\s*=\s*udata\[1\]\.to\(device,\s*non_blocking=True\)
"""
    replacement = r"""\1samples = udata[0].to(device, non_blocking=True)

\1# HOTFIX:
\1# ImageFolder returns (image, label), but LMIM expects (image, augmented_image).
\1samples_aug = samples.clone()

\1print("[DEBUG] samples.shape =", samples.shape)
\1print("[DEBUG] samples_aug.shape =", samples_aug.shape)"""
    new_text, count = re.subn(pattern, replacement, text, count=1, flags=re.VERBOSE)
    if count == 1:
        text = new_text
        patched = True
        print("[OK] regex patch applied")

if patched:
    ENGINE_PATH.write_text(text)
    print(f"[OK] saved: {ENGINE_PATH}")
else:
    print("[INFO] patch target not found. It may already be patched.")

updated = ENGINE_PATH.read_text()
idx = updated.find("samples = udata[0]")
print("\n[CHECK]")
print(updated[idx:idx+450])
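The `samples_aug = samples.clone()` hotfix keeps the pipeline alive but feeds the momentum branch an identical view, so the model never sees a genuinely augmented pair. A less lossy alternative is a dataset wrapper that produces two transformed views per image. This is a sketch under the assumption that `engine_pretrain.py` unpacks `udata[0]`/`udata[1]` as (view, augmented view); `TwoViewDataset` and its transform arguments are illustrative names, not from the LMIM repo.

```python
from torch.utils.data import Dataset

class TwoViewDataset(Dataset):
    """Wrap an ImageFolder-style dataset so __getitem__ returns
    (view, augmented_view) instead of (image, label), matching the
    udata[0]/udata[1] unpacking in engine_pretrain.py.

    Assumes `base_dataset` was built with transform=None so each item's
    image is still untransformed; both transforms are applied here.
    """
    def __init__(self, base_dataset, base_transform, aug_transform):
        self.base = base_dataset
        self.t0 = base_transform
        self.t1 = aug_transform

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        img, _ = self.base[idx]  # discard the ImageFolder class label
        return self.t0(img), self.t1(img)
```

With this wrapper in `main_pretrain.py`, the engine patch above could be reverted, since `udata[1]` would again carry a real augmented view.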
# =========================================================
# STEP 9. main_pretrain.py save patch
# Ensure the final epoch also gets saved
# =========================================================
from pathlib import Path
import shutil

MAIN_PATH = Path("/content/LMIM/lmim_pretrain/main_pretrain.py")
MAIN_BAK = Path("/content/LMIM/lmim_pretrain/main_pretrain.py.bak_savefix")

assert MAIN_PATH.exists(), f"main_pretrain.py not found: {MAIN_PATH}"

if not MAIN_BAK.exists():
    shutil.copy(MAIN_PATH, MAIN_BAK)
    print(f"[OK] backup created: {MAIN_BAK}")
else:
    print(f"[OK] backup already exists: {MAIN_BAK}")

text = MAIN_PATH.read_text()

old = """        if args.output_dir and (epoch + 1) % 5 == 0:
            misc.save_model(
                args=args,
                model=model,
                model_without_ddp=model_without_ddp,
                optimizer=optimizer,
                loss_scaler=loss_scaler,
                epoch=epoch)
"""

new = """        if args.output_dir and (((epoch + 1) % 5 == 0) or ((epoch + 1) == args.epochs)):
            misc.save_model(
                args=args,
                model=model,
                model_without_ddp=model_without_ddp,
                optimizer=optimizer,
                loss_scaler=loss_scaler,
                epoch=epoch)
"""

if old in text:
    text = text.replace(old, new, 1)
    MAIN_PATH.write_text(text)
    print("[OK] save condition patched")
else:
    print("[INFO] exact block not found. It may already be patched.")

updated = MAIN_PATH.read_text()
idx = updated.find("if args.output_dir")
print("\n[CHECK]")
print(updated[idx:idx+350])
# =========================================================
# STEP 10. main_pretrain.py data_path patch
# =========================================================
file_path = "/content/LMIM/lmim_pretrain/main_pretrain.py"

with open(file_path, "r", encoding="utf-8") as f:
    text = f.read()

marker = "def main(args):"
inject = """def main(args):
    if isinstance(args.data_path, list) and len(args.data_path) == 1:
        args.data_path = args.data_path[0]
"""

if marker in text and "if isinstance(args.data_path, list) and len(args.data_path) == 1:" not in text:
    text = text.replace(marker, inject, 1)
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(text)
    print("[OK] data_path patch inserted")
else:
    print("[INFO] data_path patch already exists or marker not found")
# =========================================================
# STEP 11. output/log dir
# =========================================================
import os

os.makedirs("/content/LMIM/outputs/pretrain_test_1ep_save", exist_ok=True)
os.makedirs("/content/LMIM/logs/pretrain_test_1ep_save", exist_ok=True)

print("[OK] output/log dirs ready")
# =========================================================
# STEP 12. run pretrain
# =========================================================
%cd /content/LMIM/lmim_pretrain

!python -u main_pretrain.py \
  --data_path /content/LMIM/data/IIIT5K \
  --output_dir /content/LMIM/outputs/pretrain_test_1ep_save \
  --log_dir /content/LMIM/logs/pretrain_test_1ep_save \
  --model mae_vit_base_patch4 \
  --batch_size 8 \
  --epochs 1 \
  --num_workers 0 \
  --device cuda
# =========================================================
# STEP 13. check outputs
# =========================================================
!echo -e "\n[OUTPUT DIR]"
!ls -R /content/LMIM/outputs/pretrain_test_1ep_save || true

!echo -e "\n[LOG DIR]"
!ls -R /content/LMIM/logs/pretrain_test_1ep_save || true
from pathlib import Path
import shutil

UTILS_PATH = Path("/content/LMIM/lmim_pretrain/utils.py")
UTILS_BAK = Path("/content/LMIM/lmim_pretrain/utils.py.bak_torchsix")

assert UTILS_PATH.exists(), f"utils.py not found: {UTILS_PATH}"

if not UTILS_BAK.exists():
    shutil.copy(UTILS_PATH, UTILS_BAK)
    print(f"[OK] backup created: {UTILS_BAK}")
else:
    print(f"[OK] backup already exists: {UTILS_BAK}")

text = UTILS_PATH.read_text()

old = "from torch._six import inf"
new = "from math import inf"

if old in text:
    text = text.replace(old, new, 1)
    UTILS_PATH.write_text(text)
    print("[OK] utils.py torch._six patch applied.")
else:
    print("[INFO] target line not found. Check manually.")

print("\n[CHECK]")
updated = UTILS_PATH.read_text()
idx = updated.find("inf")
print(updated[max(0, idx-120):idx+160])
%cd /content/LMIM/lmim_pretrain

!python -u main_pretrain.py \
  --data_path /content/LMIM/data/IIIT5K \
  --output_dir /content/LMIM/outputs/pretrain_test_1ep_save \
  --log_dir /content/LMIM/logs/pretrain_test_1ep_save \
  --model mae_vit_base_patch4 \
  --batch_size 8 \
  --epochs 1 \
  --num_workers 0 \
  --device cpu
# =========================================================
# OPTIONAL PATCH. Apply only if the qk_scale error actually occurs
# =========================================================
import os, re

target_files = []
for current_root, _, files in os.walk("/content/LMIM"):
    for fn in files:
        if fn.endswith(".py"):
            target_files.append(os.path.join(current_root, fn))

patched = []
for path in target_files:
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        text = f.read()

    new_text = text
    new_text = re.sub(r",\s*qk_scale\s*=\s*[^,\)\n]+", "", new_text)
    new_text = re.sub(r"qk_scale\s*=\s*[^,\)\n]+,\s*", "", new_text)

    if new_text != text:
        with open(path, "w", encoding="utf-8") as f:
            f.write(new_text)
        patched.append(path)

print("[OK] qk_scale patch applied to:")
for p in patched:
    print("-", p)
import os

hits = []
for root, _, files in os.walk("/content/LMIM"):
    for fn in files:
        if fn.endswith(".py"):
            path = os.path.join(root, fn)
            with open(path, "r", encoding="utf-8", errors="ignore") as f:
                txt = f.read()
            if "torch._six" in txt:
                hits.append(path)

print(hits)
import torch
print("cuda available:", torch.cuda.is_available())
print("device count:", torch.cuda.device_count())
print("device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "NO GPU")
!nvidia-smi
from pathlib import Path
import shutil

ENGINE_PATH = Path("/content/LMIM/lmim_pretrain/engine_pretrain.py")
ENGINE_BAK2 = Path("/content/LMIM/lmim_pretrain/engine_pretrain.py.bak_cudafix")

assert ENGINE_PATH.exists(), f"engine_pretrain.py not found: {ENGINE_PATH}"

if not ENGINE_BAK2.exists():
    shutil.copy(ENGINE_PATH, ENGINE_BAK2)
    print(f"[OK] backup created: {ENGINE_BAK2}")
else:
    print(f"[OK] backup already exists: {ENGINE_BAK2}")

text = ENGINE_PATH.read_text()

old = "        torch.cuda.synchronize()"
new = """        if torch.cuda.is_available():
            torch.cuda.synchronize()"""

if old in text:
    text = text.replace(old, new, 1)
    ENGINE_PATH.write_text(text)
    print("[OK] cuda synchronize patch applied.")
else:
    print("[INFO] target line not found. Check manually.")

print("\n[CHECK]")
updated = ENGINE_PATH.read_text()
idx = updated.find("synchronize")
print(updated[max(0, idx-150):idx+180])
import torch
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU")

 

 

%cd /content/LMIM/lmim_pretrain

!python -u main_pretrain.py \
  --data_path /content/LMIM/data/IIIT5K \
  --output_dir /content/LMIM/outputs/pretrain_test_1ep_save \
  --log_dir /content/LMIM/logs/pretrain_test_1ep_save \
  --model mae_vit_base_patch4 \
  --batch_size 16 \
  --epochs 1 \
  --num_workers 2 \
  --device cuda

/content/LMIM/lmim_pretrain
(log excerpt; TensorFlow/XLA factory-registration warnings, FutureWarnings, and repeated [DEBUG] shape lines trimmed)

Not using distributed mode
[11:54:47] job dir: /content/LMIM/lmim_pretrain
[11:54:47] Namespace(batch_size=16, epochs=1, accum_iter=1, log_freq=200, model='mae_vit_base_patch4', mask_ratio=0.75, norm_pix_loss=False, weight_decay=0.05, lr=None, blr=0.001, min_lr=0.0, warmup_epochs=2, data_path='/content/LMIM/data/IIIT5K', output_dir='/content/LMIM/outputs/pretrain_test_1ep_save', log_dir='/content/LMIM/logs/pretrain_test_1ep_save', device='cuda', seed=0, resume='', start_epoch=0, num_workers=2, pin_mem=True, world_size=1, distributed=False)
[11:54:48] Dataset ImageFolder: 5000 datapoints, root /content/LMIM/data/IIIT5K
           Transform: Resize((32, 128), bicubic) -> ToTensor -> Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
[11:54:52] Model = LMIMViT: MAEEncoder (PatchEmbed Conv2d 3->768, 4x4 patches; 12 x Block, dim 768) + momentum MAEEncoder (same) + decoder_embed/decoder_embed2 768->512 + 8 x LMIMDecoder (dim 512, 2048 FFN) + decoder_pred 512->768
[11:54:52] base lr: 1.00e-03, actual lr: 6.25e-05, accumulate grad iterations: 1, effective batch size: 16
[11:54:52] Start training for 1 epochs
[11:54:55] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]), samples_aug.shape = torch.Size([16, 3, 32, 128])
Epoch: [0] [  0/312] eta: 0:30:29 lr: 0.000000 loss: 1.9835 (1.9835) time: 5.8653 data: 3.5970 max mem: 4851
Epoch: [0] [ 20/312] eta: 0:16:23 lr: 0.000002 loss: 1.7568 (1.7625) time: 3.2432 data: 2.6953 max mem: 5815
Epoch: [0] [ 40/312] eta: 0:14:57 lr: 0.000004 loss: 1.0848 (1.4492) time: 3.2292 data: 2.6786 max mem: 5815
Epoch: [0] [ 60/312] eta: 0:13:48 lr: 0.000006 loss: 0.7756 (1.2309) time: 3.2664 data: 2.7029 max mem: 5815
Epoch: [0] [ 80/312] eta: 0:12:43 lr: 0.000008 loss: 0.5888 (1.0729) time: 3.2997 data: 2.7299 max mem: 5815
[11:59:58.681300] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [11:59:58.681546] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:01.857780] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:01.858021] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:05.442328] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:05.442551] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:08.232939] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:08.233176] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:11.848974] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:11.849175] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:14.718670] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:14.718900] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:18.472854] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:18.473213] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:21.408093] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:21.408304] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:25.564883] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:25.565137] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:26.133473] Epoch: [0] [100/312] eta: 0:11:40 lr: 0.000010 loss: 0.4633 (0.9534) time: 3.3643 data: 2.7932 max mem: 5815 [12:00:27.971688] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:27.971889] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:32.388525] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:32.388714] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:34.658251] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:34.658482] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:39.179232] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:39.179931] [DEBUG] samples_aug.shape = 
torch.Size([16, 3, 32, 128]) [12:00:41.758741] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:41.758936] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:45.908075] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:45.908270] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:48.390160] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:48.390378] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:52.452412] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:52.452731] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:55.278984] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:55.279188] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:00:59.090266] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:00:59.090500] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:02.079703] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:02.079841] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:06.208583] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:06.209324] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:09.043246] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:09.043455] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:12.761923] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:12.762129] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:16.169266] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:16.169484] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:19.812287] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:19.812512] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:22.708774] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:22.709008] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:26.162595] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) 
[12:01:26.162811] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:29.493902] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:29.494360] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:32.375184] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:32.375387] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:32.943107] Epoch: [0] [120/312] eta: 0:10:35 lr: 0.000012 loss: 0.3757 (0.8585) time: 3.3404 data: 2.7713 max mem: 5815 [12:01:35.939405] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:35.940027] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:38.710531] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:38.710739] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:42.678062] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:42.679714] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:45.548149] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:45.548356] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:49.063340] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:49.063585] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:52.127121] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:52.127315] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:55.549314] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:55.549642] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:01:58.763990] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:01:58.764200] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:01.777866] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:01.778111] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:05.583087] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:05.583282] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:08.630823] [DEBUG] samples.shape = 
torch.Size([16, 3, 32, 128]) [12:02:08.631054] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:12.807326] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:12.807806] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:15.467287] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:15.467533] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:19.419787] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:19.419997] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:22.091334] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:22.091561] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:25.802188] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:25.802409] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:28.990831] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:28.991218] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:32.386610] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:32.386829] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:35.466567] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:35.466777] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:39.140459] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:39.140675] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:39.718874] Epoch: [0] [140/312] eta: 0:09:30 lr: 0.000014 loss: 0.3032 (0.7800) time: 3.3387 data: 2.7642 max mem: 5815 [12:02:42.111279] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:42.111625] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:45.844194] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:45.844399] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:48.970959] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:48.971164] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) 
[12:02:52.647755] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:52.648231] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:55.635284] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:55.635503] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:02:59.176451] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:02:59.176633] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:02.160023] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:02.160228] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:06.229248] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:06.229550] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:09.023944] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:09.024241] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:12.853553] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:12.853771] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:15.807720] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:15.807916] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:19.294685] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:19.294894] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:22.809014] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:22.809216] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:25.777133] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:25.777337] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:29.595623] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:29.595836] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:32.016910] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:32.017123] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:36.285437] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:36.285784] [DEBUG] 
samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:38.858611] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:38.858812] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:42.969744] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:42.969980] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:45.208228] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:45.208495] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:45.725191] Epoch: [0] [160/312] eta: 0:08:23 lr: 0.000016 loss: 0.2297 (0.7124) time: 3.3003 data: 2.7352 max mem: 5815 [12:03:49.453325] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:49.454018] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:51.847722] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:51.847970] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:56.072937] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:56.073143] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:03:58.822683] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:03:58.822886] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:02.771118] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:02.771947] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:05.680688] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:05.680895] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:09.373289] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:09.373648] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:12.362099] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:12.362293] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:16.059822] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:16.060051] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:19.304141] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) 
[12:04:19.304340] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:23.201853] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:23.202053] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:25.780716] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:25.780895] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:29.981547] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:29.981769] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:32.488375] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:32.488598] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:36.519006] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:36.519476] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:39.569197] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:39.570384] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:43.169674] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:43.169878] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:46.120370] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:46.120595] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:49.943541] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:49.943751] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:52.540868] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:52.541387] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:53.137883] Epoch: [0] [180/312] eta: 0:07:18 lr: 0.000018 loss: 0.1780 (0.6538) time: 3.3706 data: 2.7978 max mem: 5815 [12:04:56.637581] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:56.637790] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:04:59.134802] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:04:59.135035] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:03.037245] [DEBUG] samples.shape = 
torch.Size([16, 3, 32, 128]) [12:05:03.037469] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:05.501858] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:05.502056] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:09.433626] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:09.433889] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:12.105870] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:12.106331] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:15.728085] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:15.728346] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:18.843250] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:18.843517] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:22.673812] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:22.674027] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:25.306729] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:25.306943] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:29.300858] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:29.302502] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:32.027149] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:32.027381] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:36.154805] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:36.155016] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:38.818524] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:38.818737] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:43.121062] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:43.121270] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:45.665350] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:45.665583] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 
128]) [12:05:50.304801] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:50.305015] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:52.230294] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:52.230536] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:57.740618] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:57.740837] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:59.203695] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:05:59.203831] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:05:59.720553] Epoch: [0] [200/312] eta: 0:06:11 lr: 0.000020 loss: 0.1397 (0.6026) time: 3.3291 data: 2.7620 max mem: 5815 [12:06:04.668200] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:04.668453] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:05.963825] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:05.963961] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:11.485664] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:11.485867] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:12.726339] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:12.726716] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:18.031036] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:18.031245] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:19.273486] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:19.273711] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:24.272950] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:24.273157] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:26.083569] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:26.083779] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:30.899892] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:30.900090] [DEBUG] samples_aug.shape 
= torch.Size([16, 3, 32, 128]) [12:06:33.139656] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:33.139868] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:36.985208] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:36.985435] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:40.051754] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:40.051954] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:44.186754] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:44.186989] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:46.921898] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:46.922097] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:50.770209] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:50.770792] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:53.293014] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:53.293247] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:57.558261] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:57.558760] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:06:59.913086] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:06:59.913289] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:04.279947] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:04.280148] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:06.713978] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:06.714182] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:07.301105] Epoch: [0] [220/312] eta: 0:05:06 lr: 0.000022 loss: 0.1088 (0.5581) time: 3.3790 data: 2.8218 max mem: 5815 [12:07:11.376932] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:11.377649] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:13.005817] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:13.006059] 
[DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:18.142652] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:18.142849] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:19.341804] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:19.342016] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:24.833319] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:24.833542] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:26.095806] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:26.096015] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:31.896176] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:31.896669] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:32.564940] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:32.565171] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:38.765517] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:38.765930] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:39.325596] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:39.325832] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:45.193055] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:45.193262] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:45.749521] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:45.749727] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:51.638720] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:51.638951] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:52.769417] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:52.769941] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:58.303450] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:07:58.303658] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:07:59.252826] [DEBUG] samples.shape = 
torch.Size([16, 3, 32, 128]) [12:07:59.253033] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:04.855914] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:04.856110] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:06.184599] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:06.184820] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:11.713834] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:11.714536] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:13.036107] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:13.037520] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:13.556585] Epoch: [0] [240/312] eta: 0:03:59 lr: 0.000024 loss: 0.0944 (0.5196) time: 3.3127 data: 2.7697 max mem: 5815 [12:08:18.255641] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:18.255863] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:19.830170] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:19.830370] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:24.735383] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:24.735604] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:26.372218] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:26.372497] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:31.388364] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:31.388586] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:33.241619] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:33.241859] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:38.094025] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:38.094229] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:40.252671] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:40.252887] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) 
[12:08:44.802619] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:44.802828] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:47.100077] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:47.100282] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:51.782468] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:51.783095] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:54.152151] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:54.152372] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:08:58.580190] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:08:58.580391] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:00.774101] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:00.774324] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:05.237662] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:05.237873] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:07.778936] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:07.779154] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:12.054719] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:12.054911] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:14.251979] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:14.252182] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:18.887122] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:18.887324] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:20.454933] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:20.455135] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128]) [12:09:20.971766] Epoch: [0] [260/312] eta: 0:02:53 lr: 0.000026 loss: 0.0821 (0.4861) time: 3.3707 data: 2.8189 max mem: 5815 [12:09:25.517460] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128]) [12:09:25.517660] [DEBUG] samples_aug.shape = 
[12:09:27.620482] [DEBUG] samples.shape = torch.Size([16, 3, 32, 128])
[12:09:27.620840] [DEBUG] samples_aug.shape = torch.Size([16, 3, 32, 128])
(repeated per-batch DEBUG shape lines omitted)
[12:10:29.423972] Epoch: [0] [280/312] eta: 0:01:46 lr: 0.000028 loss: 0.0722 (0.4567) time: 3.4226 data: 2.8528 max mem: 5815
[12:11:36.082395] Epoch: [0] [300/312] eta: 0:00:40 lr: 0.000030 loss: 0.0660 (0.4307) time: 3.3329 data: 2.7636 max mem: 5815
[12:12:13.946609] Epoch: [0] [311/312] eta: 0:00:03 lr: 0.000031 loss: 0.0632 (0.4177) time: 3.3330 data: 2.7697 max mem: 5815
[12:12:13.992688] Epoch: [0] Total time: 0:17:21 (3.3391 s / it)
[12:12:13.992857] Averaged stats: lr: 0.000031 loss: 0.0632 (0.4177)
[12:12:25.340682] Training time 0:17:33

  • Full training run completed
  • Training time: 0:17:33
  • Final averaged loss: 0.4177
  • About 3.34 s/it per step
  • Peak GPU memory: about 5815 MB
  • LMIM pretraining pipeline confirmed to run end-to-end

More importantly, the loss came down quite clearly:

  • Start: 1.9835
  • Around step 20: 1.7625
  • Around step 40: 1.4492
  • Around step 100: 0.9534
  • Around step 200: 0.6026
  • Final average: 0.4177

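The trajectory above can be quantified directly (loss values copied from the log; the step indices are approximate, as noted):

```python
# Loss checkpoints from the 1-epoch run above (step -> loss).
losses = {
    0: 1.9835,
    20: 1.7625,
    40: 1.4492,
    100: 0.9534,
    200: 0.6026,
    312: 0.4177,  # final epoch average
}

steps = sorted(losses)
# The loss decreases monotonically across every logged checkpoint.
assert all(losses[a] > losses[b] for a, b in zip(steps, steps[1:]))

# Overall relative reduction from the first to the last checkpoint.
reduction = 1 - losses[steps[-1]] / losses[steps[0]]
print(f"loss reduced by {reduction:.1%} over 1 epoch")  # 78.9%
```

A strictly decreasing curve with a ~79% drop in one epoch is a reasonable sanity check that the reconstruction objective is actually being optimized, not just that the loop runs.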
This is something we can state confidently in the presentation:
"We built an LMIM pretraining environment on IIIT5K and confirmed that the reconstruction-style training loss decreases stably within 1 epoch."
