mtanti/face-describer

mtanti committed on Apr 24, 2025

Commit 748a979 (verified) · 1 parent: 396d978

Update README.md

Files changed (1)

README.md +37 -3

@@ -1,3 +1,37 @@
----
-license: mit
----

+---
+license: mit
+language:
+- en
+base_model:
+- microsoft/git-base-coco
+---
+Given a photo of a face, this model generates a text description of it.
+Be careful: the generated descriptions can be unflattering.
+Based on the GIT-Base-COCO image-to-text model, fine-tuned on [Face2Text](https://zenodo.org/records/10973388).
+How to use:
+```
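+from transformers import AutoProcessor, AutoModelForCausalLM, AutoTokenizer
+import cv2
+DEVICE = 'cpu'  # 'cpu' or 'cuda'
+IMG_PATH = 'face.png'
+# Processor and tokeniser come from the base model; the fine-tuned weights come from this repo.
+processor = AutoProcessor.from_pretrained('microsoft/git-base-coco')
+model = AutoModelForCausalLM.from_pretrained('mtanti/face-describer')
+tokeniser = AutoTokenizer.from_pretrained('microsoft/git-base-coco')
+model.eval()
+model.to(DEVICE)
+# cv2.imread returns None (rather than raising) when the file cannot be read.
+img = cv2.imread(IMG_PATH)
+if img is None:
+    raise FileNotFoundError(f'Could not read image: {IMG_PATH}')
+# OpenCV loads images as BGR; reverse the channel axis to get RGB.
+tensor_img = processor(
+    images=[img[:, :, ::-1]],
+    return_tensors='pt',
+)['pixel_values'].to(DEVICE)
+# do_sample=True makes the output vary between runs.
+desc = tokeniser.decode(
+    model.generate(pixel_values=tensor_img, max_length=100, repetition_penalty=1.05, do_sample=True)[0, :],
+    skip_special_tokens=True,
+)
+print(desc)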
+```
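
The `img[:, :, ::-1]` slice in the usage snippet reverses the channel axis, turning OpenCV's BGR order into RGB. The following is a minimal sketch of that step in isolation, run on a tiny synthetic array rather than a real photo. The helper name `bgr_to_rgb` is hypothetical and not part of the repository, and the `np.ascontiguousarray` copy is an optional precaution (an assumption, not something the README does), since some consumers such as `torch.from_numpy` reject arrays with negative strides.

```python
import numpy as np

def bgr_to_rgb(img):
    """Reverse the last (channel) axis of an H x W x 3 array."""
    # img[:, :, ::-1] is a view with a negative stride along the channel
    # axis; ascontiguousarray turns it into an ordinary contiguous copy.
    return np.ascontiguousarray(img[:, :, ::-1])

# Synthetic 1x2 "image" in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
print(rgb.tolist())  # [[[0, 0, 255], [255, 0, 0]]]
```

Applying the helper twice returns the original array, which is a quick way to sanity-check the conversion.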