CoSMo: A Multimodal Transformer for Page Stream Segmentation in Comic Books
Paper • 2507.10053
Multimodal AI, Document Understanding, Reading Systems.
ComicsPAP: understanding comic strips by picking the correct panel
One missing piece in Vision and Language: A Survey on Comics Understanding