Video-Reason/VBVR-LTX2.3-diffsynth


wruisi committed 6 days ago

Commit 05d6c61 (verified) · 1 Parent(s): fdbd5d3

Update README.md

Files changed (1)

README.md +4 -2


@@ -4,6 +4,8 @@ base_model:
 library_name: diffusers
 license: apache-2.0
 pipeline_tag: image-to-video
+datasets:
+- Video-Reason/VBVR-Dataset
 ---
 # VBVR: A Very Big Video Reasoning Suite
@@ -51,11 +53,11 @@ The model was presented in the paper [A Very Big Video Reasoning Suite](https://
 | Model | Base Architecture | Other Remarks |
 |-------|-------------------|---------------|
-| [**VBVR-Wan2.1**](https://huggingface.co/Video-Reason/VBVR-Wan2.1) | Wan2.1-I2V-14B-720P | Diffusers format |
+| [VBVR-Wan2.1](https://huggingface.co/Video-Reason/VBVR-Wan2.1) | Wan2.1-I2V-14B-720P | Diffusers format |
 | [VBVR-Wan2.2](https://huggingface.co/Video-Reason/VBVR-Wan2.2) | Wan2.2-I2V-A14B | Diffusers format |
 | [VBVR-Wan2.1-diffsynth](https://huggingface.co/Video-Reason/VBVR-Wan2.1-diffsynth) | Wan2.1-I2V-14B-720P | DiffSynth LoRA format |
 | [VBVR-Wan2.2-diffsynth](https://huggingface.co/Video-Reason/VBVR-Wan2.2-diffsynth) | Wan2.2-I2V-A14B | DiffSynth LoRA format |
-| [VBVR-LTX2.3-diffsynth](https://huggingface.co/Video-Reason/VBVR-LTX2.3-diffsynth) | LTX-2.3 | DiffSynth LoRA format |
+| [**VBVR-LTX2.3-diffsynth**](https://huggingface.co/Video-Reason/VBVR-LTX2.3-diffsynth) | LTX-2.3 | DiffSynth LoRA format |
 ## Release Information
 VBVR-LTX2.3 is trained from LTX-2.3 without architectural modifications, as the goal of VBVR is to *investigate data scaling behavior* and provide *strong baseline models* for the video reasoning research community. Leveraging the VBVR-Dataset, which constitutes one of the largest video reasoning datasets to date, the VBVR model family achieved highest scores on VBVR-Bench.