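With AI-generated content (AIGC) now widely used across many fields, the Stable Diffusion model and its ecosystem are evolving rapidly. As a hub for compute and a meeting point for diverse requirements, PAI not only explores AIGC's foundational capabilities and pretrained models but also tackles content-generation challenges in vertical industries. Using the apparel sector as an example, this article shows how to build an end-to-end virtual try-on solution on top of PAI's basic capabilities.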
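Background information
To try virtual try-on quickly, see Quick start: virtual try-on. To run the full workflow, see End-to-end virtual try-on.
The PAI end-to-end virtual try-on solution currently offers the following two approaches: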
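LoRA (Low-Rank Adaptation) is a fine-tuning technique widely used in image generation. It adds a small number of trainable parameters and needs only a small dataset, so a model can be fine-tuned quickly while leaving a wide generation space for the person, pose, and background. However, this training approach cannot fully guarantee that garment details match the original image exactly.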
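To make the "small number of trainable parameters" concrete, the following is a minimal NumPy sketch of a low-rank update on one linear layer. It is an illustration only, not PAI's training code: the layer size (768×768) is an assumption chosen for the example, the rank of 32 mirrors the lora_attn_rank value used in the training step later in this document, and the scale argument plays the role of the weight in a prompt tag such as <lora:purple_shirt:0.7>.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 768, 768, 32  # illustrative sizes; rank mirrors lora_attn_rank=32

W = rng.standard_normal((d_out, d_in))         # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, rank))                    # trainable up-projection, zero-initialised


def lora_forward(x, scale=0.7):
    """Base output plus the scaled low-rank update, i.e. x @ (W + scale * B @ A).T."""
    return x @ W.T + scale * (x @ A.T) @ B.T


x = rng.standard_normal((4, d_in))
y = lora_forward(x)

full_params = W.size
lora_params = A.size + B.size
print(f"full: {full_params}, lora: {lora_params}, ratio: {lora_params / full_params:.3f}")
```

Because B starts at zero, the adapted layer initially behaves exactly like the pretrained one; only A and B would be updated during fine-tuning.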
Sample results of the LoRA approach are shown in the following figure:
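In SD WebUI, you can apply multiple ControlNets at the same time to edit part of an image during generation. This means you can fully preserve the garment details of the original image and creatively regenerate everything else, for example by repainting the person and the background.
Given the original image and a mask of the garment to keep, PAI combines the Canny and OpenPose ControlNets to repaint the background style while fully preserving garment details, producing results like those in the sample figure.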
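Quick start: virtual try-on
You can use the ControlNet models, the chilloutmix model, and a trained clothing LoRA model to try virtual try-on quickly. Follow these steps:
Step 1: Deploy the service
Open the Deploy Service page. For details, see Upload and deploy in the console.
In the corresponding configuration editing area, click JSON Deployment, and enter the following content in the editor.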
{ "cloud": { "computing": { "instance_type": "ecs.gn6v-c8g1.2xlarge" } }, "containers": [ { "image": "eas-registry-vpc.${region}.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.1", "port": 8000, "script": "./webui.sh --listen --port=8000 --api" } ], "features": { "eas.aliyun.com/extra-ephemeral-storage": "100Gi" }, "metadata": { "cpu": 8, "enable_webservice": true, "gpu": 1, "instance": 1, "memory": 32000, "name": "tryon_sdwebui" }, "storage": [ { "mount_path": "/code/stable-diffusion-webui/models/ControlNet/", "oss": { "path": "oss://pai-quickstart-${region}/aigclib/models/controlnet/official/", "readOnly": true }, "properties": { "resource_type": "model" } }, { "mount_path": "/code/stable-diffusion-webui/models/annotator/openpose/", "oss": { "path": "oss://pai-quickstart-${region}/aigclib/models/controlnet/openpose/", "readOnly": true }, "properties": { "resource_type": "model" } }, { "mount_path": "/code/stable-diffusion-webui/models/Stable-diffusion/", "oss": { "path": "oss://pai-quickstart-${region}/aigclib/models/custom_civitai_models/chilloutmix/", "readOnly": true }, "properties": { "resource_type": "model" } }, { "mount_path": "/code/stable-diffusion-webui/models/Lora/", "oss": { "path": "oss://pai-quickstart-${region}/aigclib/models/lora_models/tryon/", "readOnly": true }, "properties": { "resource_type": "model" } } ] }
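The configuration above mounts the ControlNet models, the chilloutmix model, and the trained clothing LoRA model. As with the SAM deployment later in this document, replace ${region} with the current region ID, for example cn-shanghai for China (Shanghai).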
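Click Deploy. When the service status changes to Running, the service is deployed.
Note: If the Deploy button is dimmed, check whether the copied JSON text is well-formed.
After the service is deployed, click the service name to open the service details page, then click View Endpoint Information to obtain the endpoint and token of the SD WebUI service, and save them locally.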
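Because a malformed JSON body disables the Deploy button, it can help to fill in the ${region} placeholder and validate the text locally before pasting it into the console. The following sketch uses only the Python standard library; the function name render_service_config and the shortened template are hypothetical and exist only for illustration.

```python
import json
from string import Template


def render_service_config(template_text: str, region: str) -> dict:
    """Substitute ${region} and parse the result, raising ValueError if it is not valid JSON."""
    rendered = Template(template_text).safe_substitute(region=region)
    return json.loads(rendered)  # json.JSONDecodeError is a subclass of ValueError


# Shortened, hypothetical excerpt of the deployment JSON shown above.
template = (
    '{"containers": [{"image": '
    '"eas-registry-vpc.${region}.cr.aliyuncs.com/pai-eas/stable-diffusion-webui:3.1", '
    '"port": 8000}]}'
)

config = render_service_config(template, "cn-shanghai")
print(config["containers"][0]["image"])
```

If the pasted text has a missing brace or a trailing comma, json.loads reports the line and column of the problem, which is usually faster than hunting for it in the console editor.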
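Step 2: Call the service
Option 1: Virtual try-on with a LoRA model
The service can be called in either of the following two ways:
Web application
After the service is deployed, click View Web App in the Service Type column to open the WebUI and start debugging the service.
On the txt2img tab, configure the following parameters and click Generate.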
Parameters and example values:
Prompt:
<lora:purple_shirt:0.7>, my_shirt, my_pants, an European supermodel, full body, solo, grassland, sunny, house, sunflowers, blue sky, best quality, ultra high res, (photorealistic:1.4)
Note: my_shirt and my_pants are garment tags and must be included in the prompt. A LoRA weight of 0.7 to 0.8 is recommended.
Negative prompt:
nsfw, no head, no leg, paintings, sketches, (worst quality:2), (low quality:2),(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes,skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worstquality,jpegartifacts, signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face,mutation,deformed,worst quality,jpeg artifacts,extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers,too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, gross proportions,text,error,missing arms,missing legs,extra digit, duplicate,more than one head, more than one face, more than one body, (extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2),Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)
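Restore faces: select Restore faces, and set width × height to 768×1024 or 640×768.
Sampling steps (Steps): 30 is recommended, or keep the default.
Configure the parameters as shown in the figure below to generate a result like the sample. Your actual results may differ.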
API call
Use the following Python script to call the service through the API:
import os
import io
import base64

import requests
from PIL import Image


def get_payload(prompt, negative_prompt, steps, width=512, height=512, batch_size=1, seed=-1):
    """Build the request body for the txt2img API."""
    print(f'width: {width}, height: {height}')
    res = {
        'prompt': prompt,
        'negative_prompt': negative_prompt,
        'seed': seed,
        'batch_size': batch_size,
        'n_iter': 1,
        'steps': steps,
        'cfg_scale': 7.0,
        'image_cfg_scale': 1.5,
        'width': width,
        'height': height,
        'restore_faces': True,
        'override_settings_restore_afterwards': True
    }
    return res


if __name__ == '__main__':
    sdwebui_url = "<service_URL>"
    sdwebui_token = "<service_Token>"
    save_dir = 'lora_outputs'

    prompt = ('<lora:purple_shirt:0.75>, my_shirt, my_pants, an European supermodel, solo, '
              'grassland, sunny, house, sunflowers, blue sky, best quality, ultra high res, '
              '(photorealistic:1.4)')
    negative_prompt = (
        'nsfw, no head, no leg, no feet, paintings, sketches, (worst quality:2), (low quality:2), '
        '(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes, '
        'skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, '
        'extra digit, fewer digits, cropped, worstquality,jpegartifacts, '
        'signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face, '
        'mutation,deformed,worst quality,jpeg artifacts, '
        'extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers, '
        'too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, '
        'gross proportions,text,error,missing arms,missing legs,extra digit, duplicate, '
        'more than one head, more than one face, more than one body, '
        '(extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2), '
        'Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)'
    )
    steps = 30
    batch_size = 4

    headers = {"Authorization": sdwebui_token}
    payload = get_payload(prompt, negative_prompt, steps=steps,
                          width=768, height=1024, batch_size=batch_size)
    response = requests.post(url=f'{sdwebui_url}/sdapi/v1/txt2img', headers=headers, json=payload)
    if response.status_code != 200:
        raise RuntimeError(response.status_code, response.text)

    # The response carries the generated images as base64 strings.
    r = response.json()
    os.makedirs(save_dir, exist_ok=True)
    images = [Image.open(io.BytesIO(base64.b64decode(i))) for i in r['images']]
    for i, img in enumerate(images):
        img.save(os.path.join(save_dir, f'image_{i}.jpg'))
Where: replace <service_URL> with the service endpoint obtained in Step 1, and <service_Token> with the service token obtained in Step 1.
After the call succeeds, the generated images are saved to the lora_outputs directory. Your actual results may differ.
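The script above decodes each returned image with PIL and re-saves it as JPEG. If you only need the files, a lighter alternative is to write the decoded bytes straight to disk. The sketch below uses only the standard library; save_images is a hypothetical helper, and the .png extension assumes the service returns PNG-encoded data, which you should verify against your deployment. A fake response is used so the example runs without a live service.

```python
import base64
import os
import tempfile


def save_images(response_json: dict, save_dir: str, prefix: str = "image") -> list:
    """Write each base64 entry of response_json['images'] to save_dir and return the paths."""
    os.makedirs(save_dir, exist_ok=True)
    paths = []
    for i, b64 in enumerate(response_json.get("images", [])):
        # Tolerate an optional "data:image/png;base64," header by keeping the part after the comma.
        payload = b64.split(",", 1)[-1]
        path = os.path.join(save_dir, f"{prefix}_{i}.png")  # extension is an assumption
        with open(path, "wb") as f:
            f.write(base64.b64decode(payload))
        paths.append(path)
    return paths


# Fake response standing in for response.json(); the bytes are placeholders, not real images.
fake_response = {
    "images": [
        base64.b64encode(b"first").decode("utf-8"),
        "data:image/png;base64," + base64.b64encode(b"second").decode("utf-8"),
    ]
}
out_dir = tempfile.mkdtemp()
saved = save_images(fake_response, out_dir)
print(saved)
```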
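Option 2: Mannequin repainting based on ControlNet
Web application
This example uses a garment mask that has already been cut out. Click View Web App in the Service Type column to launch the WebUI, and configure the following parameters on the img2img tab.
Basic configuration:
Prompt: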
grassland, sunny, house, sunflowers, blue sky, an European supermodel, full body, best quality, ultra high res, (photorealistic:1.4)
Negative prompt:
nsfw, no head, no leg, paintings, sketches, (worst quality:2), (low quality:2),(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes,skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worstquality,jpegartifacts, signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face,mutation,deformed,worst quality,jpeg artifacts,extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers,too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, gross proportions,text,error,missing arms,missing legs,extra digit, duplicate,more than one head, more than one face, more than one body, (extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2),Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)
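Mask mode: select Inpaint not masked.
Masked content: select fill.
Sampling steps (Steps): 30 is recommended, or keep the default.
Select Restore faces, set Width to 768 and Height to 1024, and set Denoising strength to 1.
Click ControlNet Unit 0, upload the garment cutout on the Single Image tab, and configure the parameters as shown in the figure below.
Click ControlNet Unit 1, upload the original image on the Single Image tab, and configure the parameters as shown in the figure below.
Click Generate to produce a result like the following: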
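The effect of choosing Inpaint not masked with a garment mask can be pictured as a per-pixel selection: garment pixels come from the original image, and everything else comes from the newly generated image. The NumPy sketch below is a conceptual illustration under that assumption only. SD WebUI itself works in latent space and applies mask blur, so this is not its actual implementation, and composite_keep_masked is a hypothetical name.

```python
import numpy as np


def composite_keep_masked(original, generated, mask):
    """Keep original pixels where mask == 1 (the garment); use generated pixels elsewhere."""
    keep = mask.astype(bool)[..., None]  # HxW -> HxWx1 so it broadcasts over RGB channels
    return np.where(keep, original, generated)


original = np.full((2, 2, 3), 10, dtype=np.uint8)    # stands in for the source photo
generated = np.full((2, 2, 3), 200, dtype=np.uint8)  # stands in for the repainted image
mask = np.array([[1, 0],
                 [0, 1]], dtype=np.uint8)            # 1 marks garment pixels

result = composite_keep_masked(original, generated, mask)
print(result[..., 0])
```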
API call
In the directory where you downloaded the demo data, use the following Python script to call the service through the API:
import os
import io
import base64

import requests
from PIL import Image


def sdwebui_b64_img(image: Image.Image):
    """Encode a PIL image as the base64 data URI expected by the SD WebUI API."""
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")
    img_base64 = 'data:image/png;base64,' + str(base64.b64encode(buffered.getvalue()), 'utf-8')
    return img_base64


def get_payload(human_pil, cloth_mask, cloth_pil, prompt, negative_prompt, steps, batch_size=1, seed=-1):
    """Build the img2img request body with two ControlNet units (Canny and OpenPose)."""
    input_image = sdwebui_b64_img(human_pil)
    mask_image = sdwebui_b64_img(cloth_mask)
    width = human_pil.size[0]
    height = human_pil.size[1]
    print(f'width: {width}, height: {height}')

    res = {
        'init_images': [input_image],
        'mask': mask_image,
        'resize_mode': 0,
        'denoising_strength': 1.0,
        'mask_blur': 4,
        'inpainting_fill': 0,
        'inpaint_full_res': False,
        'inpaint_full_res_padding': 0,
        'inpainting_mask_invert': 1,  # repaint everything outside the garment mask
        'initial_noise_multiplier': 1,
        'prompt': prompt,
        'negative_prompt': negative_prompt,
        'seed': seed,
        'batch_size': batch_size,
        'n_iter': 1,
        'steps': steps,
        'cfg_scale': 7.0,
        'image_cfg_scale': 1.5,
        'width': width,
        'height': height,
        'restore_faces': True,
        'tiling': False,
        'override_settings_restore_afterwards': True,
        'sampler_name': 'Euler a',
        'sampler_index': 'Euler a',
        "save_images": False,
        'alwayson_scripts': {
            'ControlNet': {
                'args': [
                    {
                        # Unit 0: edges of the garment cutout
                        'input_image': sdwebui_b64_img(cloth_pil),
                        'module': 'canny',
                        'model': 'control_sd15_canny [fef5e48e]',
                        'weight': 1.0,
                        'resize_mode': 'Scale to Fit (Inner Fit)',
                        'guidance': 1.0
                    },
                    {
                        # Unit 1: pose of the original image
                        'input_image': input_image,
                        'module': 'openpose_full',
                        'model': 'control_sd15_openpose [fef5e48e]',
                        'weight': 1.0,
                        'resize_mode': 'Scale to Fit (Inner Fit)',
                        'guidance': 1.0
                    }
                ]
            }
        }
    }
    return res


if __name__ == '__main__':
    sdwebui_url = "<service_URL>"
    sdwebui_token = "<service_Token>"

    raw_image_path = '1.png'            # original image
    cloth_path = 'cloth_pil.jpg'        # garment cutout
    cloth_mask_path = 'cloth_mask.png'  # garment mask

    steps = 30
    batch_size = 4
    prompt = ('grassland, sunny, house, sunflowers, blue sky, an European supermodel, full body, '
              'best quality, ultra high res, (photorealistic:1.4)')
    negative_prompt = (
        'nsfw, no head, no leg, no feet, paintings, sketches, (worst quality:2), (low quality:2), '
        '(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes, '
        'skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, '
        'extra digit, fewer digits, cropped, worstquality,jpegartifacts, '
        'signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face, '
        'mutation,deformed,worst quality,jpeg artifacts, '
        'extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers, '
        'too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, '
        'gross proportions,text,error,missing arms,missing legs,extra digit, duplicate, '
        'more than one head, more than one face, more than one body, '
        '(extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2), '
        'Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)'
    )

    save_dir = 'repaint_outputs'
    os.makedirs(save_dir, exist_ok=True)

    headers = {"Authorization": sdwebui_token}
    human_pil = Image.open(raw_image_path)
    cloth_mask_pil = Image.open(cloth_mask_path)
    cloth_pil = Image.open(cloth_path)

    payload = get_payload(human_pil, cloth_mask_pil, cloth_pil, prompt, negative_prompt,
                          steps=steps, batch_size=batch_size)
    response = requests.post(url=f'{sdwebui_url}/sdapi/v1/img2img', headers=headers, json=payload)
    if response.status_code != 200:
        raise RuntimeError(response.status_code, response.text)

    r = response.json()
    images = [Image.open(io.BytesIO(base64.b64decode(i))) for i in r['images']]
    for i, img in enumerate(images):
        img.save(os.path.join(save_dir, f'image_{i}.jpg'))
Where: replace <service_URL> with the service endpoint obtained in Step 1, and <service_Token> with the service token obtained in Step 1.
After the call succeeds, the generated images are saved to the repaint_outputs directory under the current directory. Your actual results may differ.
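End-to-end virtual try-on
Method 1: Garment training and controllable generation with LoRA
The end-to-end steps for training your own clothing LoRA model are as follows:
Download the dataset.
Prepare the garment data needed for training (about 10 to 20 images). You can prepare your own training data or download the demo data. For the data format, refer to the demo data, which is used in the rest of this section.
In JupyterLab, run the following commands to download and extract the dataset: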
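# Install the unzip tool (-y answers the confirmation prompt, which a notebook cell cannot do interactively)
!sudo apt-get update && sudo apt-get install -y unzip
# Download and extract the dataset
!wget http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/projects/tryon/data/data.zip && unzip data.zip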
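Deploy and call the SAM service to cut out images.
Segment Anything Model (SAM) is a large segmentation model proposed by Meta that can generate masks for all objects in an image. Here SAM is used to obtain the human-region mask and remove the background, so that the model focuses on learning garment features, which improves LoRA training.
Open the Deploy Service page. For details, see Upload and deploy in the console.
In the corresponding configuration editing area, click JSON Deployment, and enter the following content in the editor.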
{ "cloud": { "computing": { "instance_type": "ecs.gn6v-c8g1.2xlarge" } }, "containers": [ { "command": "python api.py --port 8000", "image": "eas-registry-vpc.${region}.cr.aliyuncs.com/pai-eas/pai-quickstart:pytorch1.13.1_cuda11.6_grounded_sam_0.1", "port": 8000 } ], "features": { "eas.aliyun.com/extra-ephemeral-storage": "100Gi" }, "metadata": { "cpu": 8, "gpu": 1, "instance": 1, "memory": 32000, "name": "grounded_sam_api", "rpc": { "keepalive": 500000 } }, "name": "grounded_sam_api" }
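Where:
name: a custom service name, unique within the region.
containers.image: replace ${region} with the current region ID, for example cn-shanghai for China (Shanghai). For other region IDs, see Regions and zones.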
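Click Deploy. When the service status changes to Running, the service is deployed.
Note: If the Deploy button is dimmed, check whether the copied JSON text is well-formed.
Click Invocation Information in the Service Type column to view the service endpoint and token, and save them locally.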
In the directory where you downloaded the demo dataset, use the following code to call the service. Using the demo data in the downloaded data directory as an example, the code saves the background-removed images to the crop_data directory and the mask images to the masks directory.
import os
import io
import copy
import glob
import base64

import requests
import numpy as np
from PIL import Image

url = "<service_url>"
token = "<service_token>"
image_dir = 'data/'
save_dir = 'crop_data/'
mask_save_dir = "masks"


def extract_masked_part(mask, image, bg_value=255):
    """Extract the masked part from the original image; the background is filled with bg_value."""
    h, w = mask.shape[-2:]
    mask_image = (1 - mask.reshape(h, w, 1)) * bg_value + mask.reshape(h, w, 1) * image
    mask_image = mask_image.astype(np.uint8)
    return mask_image


def encode_file_to_base64(f):
    """Read a file and return its content as a base64 string."""
    with open(f, "rb") as file:
        encoded_string = base64.b64encode(file.read())
        base64_str = str(encoded_string, "utf-8")
        return base64_str


def post(image_path, url, token):
    """Send one image to the SAM service and return the mask image bytes."""
    base64_string = encode_file_to_base64(image_path)
    request_body = {
        "img_path": base64_string,
        "text_prompt": "people",
        "box_threshold": 0.75,
        "text_threshold": 0.7
    }
    headers = {"Authorization": token}
    resp = requests.post(url=url + "/grounded_sam", headers=headers, json=request_body)
    print("sam status code:", resp.status_code)
    if resp.status_code != 200:
        raise RuntimeError(f'sam status code: {resp.status_code}', resp.text,
                           "Please try to lower the `box_threshold` or `text_threshold`, "
                           "maybe it's too high and detect nothing.")
    return base64.b64decode(resp.text)


os.makedirs(save_dir, exist_ok=True)
os.makedirs(mask_save_dir, exist_ok=True)

imgs_list = glob.glob(os.path.join(image_dir, '*.jpg'))
for image_path in imgs_list:
    mask_result = post(image_path, url, token)
    mask = np.array(Image.open(io.BytesIO(mask_result)))

    # Save a viewable 0/255 copy of the 0/1 mask.
    mask_0_255 = copy.deepcopy(mask)
    mask_0_255[mask_0_255 == 1] = 255
    mask_save_path = os.path.join(mask_save_dir, os.path.basename(image_path))
    Image.fromarray(mask_0_255).save(mask_save_path)

    # Save the person region on a white background.
    img_rgb = np.array(Image.open(image_path))
    human_pil = Image.fromarray(extract_masked_part(mask, img_rgb))
    save_path = os.path.join(save_dir, os.path.basename(image_path))
    human_pil.save(save_path)
    print(f'successfully save image: {save_path}')
Where: replace <service_url> with the service endpoint you obtained, and <service_token> with the service token you obtained.
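Label the data.
Use the deepdanbooru_image-caption model in QuickStart to label the images in the crop_data directory, and write the label files to the same crop_data directory. The steps are as follows:
On the QuickStart home page, search for the deepdanbooru_image-caption model and click it to open the model details page. For details, see Find a model suited to your business.
On the model details page, click Deploy, and click OK in the billing reminder dialog box.
The page automatically switches to the service details page. When the service status changes to Running, the service is deployed.
In the Resource Information area of the service details page, click View Endpoint Information to view the service endpoint and token, and save them locally.
In the directory where you downloaded the demo dataset, use the following code to call the service and label the background-removed images in crop_data.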
import os
import json
import glob
import base64

import requests

url = "<service_url>"
token = "<service_token>"
# image_dir = 'data/'
image_dir = 'crop_data/'


def encode_file_to_base64(f):
    """Read a file and return its content as a base64 string."""
    with open(f, "rb") as file:
        encoded_string = base64.b64encode(file.read())
        base64_str = str(encoded_string, "utf-8")
        return base64_str


def post(image_path, url, token):
    """Send one image to the caption service and return the parsed result."""
    base64_string = encode_file_to_base64(image_path)
    request_body = {
        "image": base64_string,
        'score_threshold': 0.6
    }
    headers = {"Authorization": token}
    resp = requests.post(url=url, headers=headers, json=request_body)
    print("image caption status code:", resp.status_code)
    # Fail early instead of trying to parse an error page as JSON.
    if resp.status_code != 200:
        raise RuntimeError(f'image caption status code: {resp.status_code}', resp.text)
    results = json.loads(resp.content.decode('utf-8'))
    return results


imgs_list = glob.glob(os.path.join(image_dir, '*.jpg'))
for image_path in imgs_list:
    results = post(image_path, url, token)
    print(f'text results: {results}')
    # Write the tags next to the image as <image name>.txt
    img_name = os.path.basename(image_path)
    txt_path = os.path.join(image_dir, os.path.splitext(img_name)[0] + '.txt')
    with open(txt_path, 'w') as f:
        f.write(', '.join(results))
Where: replace <service_url> with the service endpoint you obtained, and <service_token> with the service token you obtained.
The generated data is organized as follows:
│───crop_data
│   │───img_0.jpg
│   │───img_0.txt
│   │───img_1.jpg
│   │───img_1.txt
│   │───...
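Note: You can manually refine the label files by merging the several descriptive words for a single garment into one tag. This lets the model concentrate on learning one well-defined tag and can improve its results.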
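The manual tag merge described in the note can also be scripted. The following is a minimal sketch, assuming each caption is a comma-separated tag list like the files written above. The function merge_garment_tags, the choice of which tags describe the garment, and the merged tag name my_pants are all illustrative assumptions that you would adapt to your own data.

```python
def merge_garment_tags(caption: str, garment_tags: set, merged_tag: str) -> str:
    """Replace every tag in garment_tags with a single merged_tag, keeping the other tags in order."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    merged = []
    inserted = False
    for tag in tags:
        if tag in garment_tags:
            # Emit the merged tag once, at the position of the first garment tag.
            if not inserted:
                merged.append(merged_tag)
                inserted = True
        else:
            merged.append(tag)
    return ", ".join(merged)


caption = "belt, denim, jeans, pants, shirt, standing"
new_caption = merge_garment_tags(caption, {"denim", "jeans", "pants"}, "my_pants")
print(new_caption)
```

To apply this to the dataset, you could read each .txt file in crop_data, pass its content through the function, and write the result back.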
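Train and deploy the LoRA model.
Upload the labeled data (the contents of the crop_data directory) to an OSS path, for example oss://{your bucket}/tryon/data/. For how to upload data, see Upload files in the console.
On the QuickStart home page, search for the custom_civitai_models model and click it to open the model details page. For details, see Find a model suited to your business.
In the model training area, configure the following key parameters, keep the defaults for the rest, and click Train.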
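Parameters and descriptions:
Training dataset: select OSS file or directory from the drop-down list, then select the OSS path to which the labeled data was uploaded in the previous step.
Custom model: select OSS file or directory from the drop-down list, then enter the template path oss://pai-quickstart-${region}.oss-${region}-internal.aliyuncs.com/aigclib/models/custom_civitai_models/chilloutmix/ where ${region} must be replaced with the current region ID, for example cn-shanghai for China (Shanghai). For other region IDs, see Regions and zones.
Hyperparameters: you can adjust the following parameters:
max_epochs: 200
height: 768
width: 640
lora_attn_rank: 32
Model output path: select an OSS bucket path for storing the model files produced by training.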
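The page automatically switches to the job details page. Training takes about 1 to 2 hours.
After LoRA training completes, click Deploy in the model deployment area, and click OK in the billing reminder dialog box to deploy the model. When the service status changes to Running, the service is deployed.
In the Resource Information area of the service details page, click View Endpoint Information to view the service endpoint and token, and save them locally.
Debug the model.
After the LoRA model service is deployed, click View Web App on the right side of the service details page to open the WebUI and start debugging the service.
On the txt2img tab, configure the following parameters and click Generate.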
Parameters and example values:
Prompt:
<lora:pytorch_lora_weights:0.7>,belt, denim, jeans, pants, shirt, an international supermodel, full body, best quality, ultra high res, (photorealistic:1.4), grassland, flowers, sunshine, wooden house
Note: Copy the garment-related words from the txt files generated by the deepdanbooru_image-caption service into the prompt, and add your own words for the background, the person, and so on.
Negative prompt:
nsfw, no head, no leg, paintings, sketches, (worst quality:2), (low quality:2),(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes,skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worstquality,jpegartifacts, signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face,mutation,deformed,worst quality,jpeg artifacts,extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers,too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, gross proportions,text,error,missing arms,missing legs,extra digit, duplicate,more than one head, more than one face, more than one body, (extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2),Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)
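Sampling steps (Steps): 30
Restore faces: select Restore faces, set Width to 640 and Height to 768, and set Batch size to 4.
The output resembles the following sample. Your actual results may differ.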
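Assembling the prompt from a caption file, as the note on the prompt describes, can be sketched as follows. The helper build_prompt is hypothetical; it simply joins the LoRA tag, the garment tags taken from a caption, and the scene words you supply, using the same <lora:name:weight> syntax as the example prompt.

```python
def build_prompt(lora_name: str, weight: float, garment_tags: list, scene_tags: list) -> str:
    """Join the LoRA tag, garment tags, and scene tags into one comma-separated prompt."""
    parts = [f"<lora:{lora_name}:{weight}>"]
    parts.extend(garment_tags)
    parts.extend(scene_tags)
    return ", ".join(parts)


# Garment tags as they might appear in a caption .txt file (example values from the prompt above).
caption_text = "belt, denim, jeans, pants, shirt"
garment_tags = [t.strip() for t in caption_text.split(",") if t.strip()]

prompt = build_prompt(
    "pytorch_lora_weights",
    0.7,
    garment_tags,
    ["an international supermodel", "full body", "best quality"],
)
print(prompt)
```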
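Method 2: Mannequin repainting based on SAM and ControlNet
Deploy the services.
Deploy the SD WebUI service and obtain its endpoint and token. For details, see Step 1: Deploy the service.
Deploy the SAM service and obtain its endpoint and token. For details, see Deploy and call the SAM service to cut out images.
Call the services through the API to perform cutout and repainting end to end.
Download the test data to your local machine: test.png.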
In the directory where you downloaded the test data, run the following code to call the services through the API. The result images are written to the outputs directory.
import os
import io
import glob
import base64
import copy

import requests
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt


def extract_masked_part(mask, image, bg_value=255):
    """Extract the masked part from the original image. mask holds 0/1 values."""
    h, w = mask.shape[-2:]
    mask_image = (1 - mask.reshape(h, w, 1)) * bg_value + mask.reshape(h, w, 1) * image
    mask_image = mask_image.astype(np.uint8)
    return mask_image


def sdwebui_b64_img(image: Image.Image):
    """Encode a PIL image as the base64 data URI expected by the SD WebUI API."""
    buffered = io.BytesIO()
    image.save(buffered, format="PNG")
    img_base64 = 'data:image/png;base64,' + str(base64.b64encode(buffered.getvalue()), 'utf-8')
    return img_base64


def sam_b64_img(img_path):
    """Read an image file and return its content as a plain base64 string for the SAM service."""
    with open(img_path, "rb") as file:
        encoded_string = base64.b64encode(file.read())
        base64_str = str(encoded_string, "utf-8")
        return base64_str


def post_sam(url, token, image_path, text_prompt, box_threshold=0.6, text_threshold=0.3):
    """Call the SAM service with a text prompt and return the mask image bytes."""
    base64_string = sam_b64_img(image_path)
    request_body = {
        "img_path": base64_string,
        "text_prompt": text_prompt,
        "box_threshold": box_threshold,
        "text_threshold": text_threshold
    }
    headers = {"Authorization": token}
    resp = requests.post(url=url + "/grounded_sam", headers=headers, json=request_body)
    print("sam status code:", resp.status_code)
    if resp.status_code != 200:
        raise RuntimeError(f'sam status code: {resp.status_code}', resp.text,
                           "Please try to lower the `box_threshold` or `text_threshold`, "
                           "maybe it's too high and detect nothing.")
    return base64.b64decode(resp.text)


def get_sam_results(url, token, image_path, save_dir='./outputs'):
    # Get the human mask and remove the background.
    mask_result = post_sam(url, token, image_path, text_prompt='human',
                           box_threshold=0.6, text_threshold=0.3)
    # Convert to RGB so that images with an alpha channel can still be saved as JPEG.
    img_rgb = np.array(Image.open(image_path).convert('RGB'))
    mask = np.array(Image.open(io.BytesIO(mask_result)))
    mask_0_255 = copy.deepcopy(mask)
    mask_0_255[mask_0_255 == 1] = 255
    mask_save_path = os.path.join(save_dir, 'human_mask.png')
    Image.fromarray(mask_0_255).save(mask_save_path)
    human_pil = Image.fromarray(extract_masked_part(mask, img_rgb))
    human_pil_path = os.path.join(save_dir, 'human_pil.jpg')
    human_pil.save(human_pil_path)

    # Get the garment mask. This variant suits outfits with separate top and bottom.
    cloth_mask_shirt = post_sam(url, token, human_pil_path, text_prompt='shirt',
                                box_threshold=0.3, text_threshold=0.2)
    cloth_mask_pants = post_sam(url, token, human_pil_path, text_prompt='pants',
                                box_threshold=0.3, text_threshold=0.2)
    # Take the union with an element-wise maximum so overlapping pixels stay at 1 instead of summing to 2.
    cloth_mask = np.maximum(np.array(Image.open(io.BytesIO(cloth_mask_shirt))),
                            np.array(Image.open(io.BytesIO(cloth_mask_pants))))

    # Single garment:
    # cloth_mask_results = post_sam(url, token, human_pil_path, text_prompt='cloth', box_threshold=0.3, text_threshold=0.2)
    # cloth_mask = np.array(Image.open(io.BytesIO(cloth_mask_results)))

    cloth_mask_save_path = os.path.join(save_dir, 'cloth_mask.png')
    cloth_mask_0_255 = copy.deepcopy(cloth_mask)
    cloth_mask_0_255[cloth_mask_0_255 == 1] = 255
    Image.fromarray(cloth_mask_0_255).save(cloth_mask_save_path)
    cloth_pil = Image.fromarray(extract_masked_part(cloth_mask, img_rgb))
    cloth_pil.save(os.path.join(save_dir, 'cloth_pil.jpg'))

    return human_pil, Image.fromarray(cloth_mask_0_255), cloth_pil


def get_payload(human_pil, cloth_mask, cloth_pil, prompt, negative_prompt, steps, batch_size=1, seed=-1):
    """Build the img2img request body with two ControlNet units (Canny and OpenPose)."""
    input_image = sdwebui_b64_img(human_pil)
    mask_image = sdwebui_b64_img(cloth_mask)
    width = human_pil.size[0]
    height = human_pil.size[1]
    print(f'width: {width}, height: {height}')

    res = {
        'init_images': [input_image],
        'mask': mask_image,
        'resize_mode': 0,
        'denoising_strength': 1.0,
        'mask_blur': 4,
        'inpainting_fill': 0,
        'inpaint_full_res': False,
        'inpaint_full_res_padding': 0,
        'inpainting_mask_invert': 1,  # repaint everything outside the garment mask
        'initial_noise_multiplier': 1,
        'prompt': prompt,
        'negative_prompt': negative_prompt,
        'seed': seed,
        'batch_size': batch_size,
        'n_iter': 1,
        'steps': steps,
        'cfg_scale': 7.0,
        'image_cfg_scale': 1.5,
        'width': width,
        'height': height,
        'restore_faces': True,
        'tiling': False,
        'override_settings_restore_afterwards': True,
        'sampler_name': 'Euler a',
        'sampler_index': 'Euler a',
        "save_images": False,
        'alwayson_scripts': {
            'ControlNet': {
                'args': [
                    {
                        'input_image': sdwebui_b64_img(cloth_pil),
                        'module': 'canny',
                        'model': 'control_sd15_canny [fef5e48e]',
                        'weight': 1.0,
                        'resize_mode': 'Scale to Fit (Inner Fit)',
                        'guidance': 1.0
                    },
                    {
                        'input_image': input_image,
                        'module': 'openpose_full',
                        'model': 'control_sd15_openpose [fef5e48e]',
                        'weight': 1.0,
                        'resize_mode': 'Scale to Fit (Inner Fit)',
                        'guidance': 1.0
                    }
                ]
            }
        }
    }
    return res


if __name__ == '__main__':
    sam_url = "<sam_service_URL>"
    sam_token = "<sam_service_Token>"
    sdwebui_url = "<sdwebui_service_URL>"
    sdwebui_token = "<sdwebui_service_Token>"

    save_dir = './outputs'
    image_path = 'test.png'
    os.makedirs(save_dir, exist_ok=True)

    prompt = ('grassland, sunny, house, sunflowers, blue sky, an European supermodel, full body, '
              'best quality, ultra high res, (photorealistic:1.4)')
    negative_prompt = (
        'nsfw, no head, no leg, no feet, paintings, sketches, (worst quality:2), (low quality:2), '
        '(normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes, '
        'skin blemishes, age spot, glans,bad anatomy,bad hands, text, error, missing fingers, '
        'extra digit, fewer digits, cropped, worstquality,jpegartifacts, '
        'signature, watermark, username,blurry,bad feet,poorly drawn hands,poorly drawn face, '
        'mutation,deformed,worst quality,jpeg artifacts, '
        'extra fingers,extra limbs,extra arms,extra legs,malformed limbs,fused fingers, '
        'too many fingers,long neck,cross-eyed,mutated hands,polar lowres,bad body,bad proportions, '
        'gross proportions,text,error,missing arms,missing legs,extra digit, duplicate, '
        'more than one head, more than one face, more than one body, '
        '(extra digit and hands and fingers and legs and arms:1.4),(interlocked fingers:1.2), '
        'Ugly Fingers, (deformed fingers:1.2), (long fingers:1.2)'
    )
    steps = 30
    batch_size = 2

    headers = {"Authorization": sdwebui_token}
    human_pil, cloth_mask_pil, cloth_pil = get_sam_results(sam_url, sam_token, image_path, save_dir)
    payload = get_payload(human_pil, cloth_mask_pil, cloth_pil, prompt, negative_prompt,
                          steps=steps, batch_size=batch_size)
    response = requests.post(url=f'{sdwebui_url}/sdapi/v1/img2img', headers=headers, json=payload)
    if response.status_code != 200:
        raise RuntimeError(response.status_code, response.text)

    r = response.json()
    images = [Image.open(io.BytesIO(base64.b64decode(i))) for i in r['images']]
    for i, img in enumerate(images):
        img.save(os.path.join(save_dir, f'image_{i}.jpg'))

    # Display the result images side by side.
    imgs_list = glob.glob(os.path.join(save_dir, "*.jpg"))
    N = 1
    M = len(imgs_list)
    for i, img_path in enumerate(imgs_list):
        img = plt.imread(img_path)
        plt.subplot(N, M, i + 1)
        plt.imshow(img)
        plt.xticks([])
        plt.yticks([])
    plt.show()
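Where: replace <sam_service_URL> with the endpoint of the SAM service, <sam_service_Token> with the token of the SAM service, <sdwebui_service_URL> with the endpoint of the SD WebUI service, and <sdwebui_service_Token> with the token of the SD WebUI service.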
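One detail worth checking when the shirt and pants masks are combined: if two 0/1 masks overlap, summing them gives 2 at the overlapping pixels, and those pixels then no longer behave as a 0/1 mask in later steps. The sketch below shows the effect on toy arrays and a union that avoids it. The helper union_masks is hypothetical, and the toy masks are assumptions; it presumes the SAM service returns masks as 0/1 arrays, as the script's comments indicate.

```python
import numpy as np


def union_masks(*masks):
    """Combine 0/1 masks so that a pixel set in at least one mask is 1 in the result."""
    combined = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        combined |= m.astype(bool)
    return combined.astype(np.uint8)


shirt = np.array([[1, 1, 0],
                  [0, 0, 0]], dtype=np.uint8)
pants = np.array([[0, 1, 0],
                  [0, 1, 1]], dtype=np.uint8)

summed = shirt + pants
print(summed.max())                # the overlapping pixel is no longer 0 or 1
print(union_masks(shirt, pants))
```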