Does anyone know how sample_size works in SD's VAE and UNet? All I know is that SD v1.5 was trained at 512*512, so it generates 512*512 most reliably. But when I set the pipeline to something like 384*384 or even 768*768, it seems to generate those as well (just less correctly).
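For reference, this is roughly what I mean (a minimal sketch with diffusers' StableDiffusionPipeline; the checkpoint and prompt are just examples):

```python
import torch
from diffusers import StableDiffusionPipeline

# Example checkpoint/prompt, only to illustrate the question
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
prompt = "a photo of an astronaut riding a horse"

image_512 = pipe(prompt).images[0]                         # default size, matches training resolution
image_384 = pipe(prompt, height=384, width=384).images[0]  # also runs, but quality drops
image_768 = pipe(prompt, height=768, width=768).images[0]  # also runs, often with artifacts/duplication
```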
I have searched all around the official GitHub repo, and it looks like the sample_size setting in the UNet and VAE doesn't really matter, since the models don't use it directly (https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_condition.py#L163C9-L163C20).
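From what I can tell, the value only ends up determining the pipeline's default height/width, e.g. (a small check, again assuming the runwayml/stable-diffusion-v1-5 checkpoint):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# The UNet/VAE forward passes are fully convolutional and never read sample_size;
# as far as I can tell the pipeline only uses it to compute the default resolution.
print(pipe.unet.config.sample_size)                          # 64 (latent-space size)
print(pipe.vae_scale_factor)                                 # 8
print(pipe.unet.config.sample_size * pipe.vae_scale_factor)  # 512, the default height/width
```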
I'm wondering whether SD (or LDMs in general) can generalize to different sample sizes, so that inference is possible at any width and height? If so, how does that work in training and inference?