khopilot/km-tokenizer-khmer

Feature Extraction, Khmer, sentencepiece, tokenizer, khmer, graph-regularization, low-resource, southeast-asian, cambodia, Eval Results (legacy)

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Gated model: you can list files but not access them.

Preview of files found in this repository
  • .gitattributes (1.52 kB)
    Last commit: "initial commit", 8 months ago
  • README.md (5.86 kB)
    Last commit: "Add caveats: UDS circularity, ALT in-domain, MRR significance", 29 days ago
  • config.json (93 Bytes)
    Last commit: "Upload folder using huggingface_hub", 8 months ago
  • special_tokens_map.json (1.01 kB)
    Last commit: "Upload folder using huggingface_hub", 8 months ago
  • spiece.model (191 kB, xet)
    Last commit: "Update spiece.model to V3f", 30 days ago
  • tokenizer.model (191 kB, xet)
    Last commit: "Update to V3f tokenizer (93.3% Sanskrit, TPC 0.293)", 30 days ago
  • tokenizer.vocab (169 kB)
    Last commit: "Upload folder using huggingface_hub", 8 months ago
  • tokenizer_config.json (196 Bytes)
    Last commit: "Upload tokenizer_config.json with huggingface_hub", 8 months ago