Jyutping Improvement #4

hockyy · 2024-07-19T19:54:34Z

I don't know how you source your Jyutping data, but in case you haven't used these words.hk lists yet, I think they are worth a try:

https://words.hk/faiman/analysis/wordslist.json
https://words.hk/faiman/analysis/charlist.json

I'm too lazy to write a new library, so I will keep using your to-jyutping.

If you want to update the dictionary, you can parse all the words from those lists. For the tokenizer, we can use jieba:

https://github.com/hockyy/jieba-cantonese

I've made a script that auto-generates a jieba user dictionary for tokenization, so Jyutping can be looked up per token, which should give better results. If a token has no entry, fall back to per-character Jyutping.
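A minimal sketch of the per-token lookup with per-character fallback, to make the idea concrete. Everything here is an assumption for illustration: `WORD_JYUTPING` and `CHAR_JYUTPING` are tiny hypothetical tables (not the actual contents or format of the words.hk files), and `longest_match_tokenize` is a simple greedy stand-in for jieba so the example runs without extra dependencies.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical sample tables; real ones would be built from a full word list
# and a full character list.
WORD_JYUTPING: Dict[str, str] = {"銀行": "ngan4 hong4", "香港": "hoeng1 gong2"}
CHAR_JYUTPING: Dict[str, str] = {
    "我": "ngo5",
    "去": "heoi3",
    "行": "haang4",
    "銀": "ngan4",
}


def longest_match_tokenize(text: str) -> List[str]:
    """Greedy longest-match segmentation over WORD_JYUTPING (stand-in for jieba)."""
    max_len = max((len(word) for word in WORD_JYUTPING), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; a single character always matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in WORD_JYUTPING:
                tokens.append(piece)
                i += size
                break
    return tokens


def annotate(
    text: str,
    tokenize: Callable[[str], List[str]] = longest_match_tokenize,
) -> List[Tuple[str, Optional[str]]]:
    """Return (token, jyutping) pairs, using word-level readings when available."""
    result: List[Tuple[str, Optional[str]]] = []
    for token in tokenize(text):
        if token in WORD_JYUTPING:
            # Word-level entry: resolves polyphones such as 行 in 銀行.
            result.append((token, WORD_JYUTPING[token]))
        else:
            # Fallback: look up each character; None marks an unknown one.
            for char in token:
                result.append((char, CHAR_JYUTPING.get(char)))
    return result


print(annotate("我去銀行"))
```

With jieba installed, passing `tokenize=jieba.lcut` should slot in the real tokenizer, since `annotate` only needs a function from a string to a list of tokens.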

hockyy · 2024-07-19T19:56:19Z

let me know if you need any help.

I'm currently developing this project https://github.com/hockyy/miteiru

laubonghaudoi · 2024-07-19T20:18:35Z

@graphemecluster As far as I know, the words.hk (粵典) data has been in use from early on, hasn't it? And the current update mainly uses Jon's font data?

graphemecluster · 2024-07-19T20:22:53Z

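Right now we only use Jon's data, but it is still definitely more accurate than jieba segmentation.
@chaaklau Do you think your words.hk word list would be useful for Jyutping annotation?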

graphemecluster · 2024-07-19T20:26:26Z

@hockyy The accuracy should reach more than 99% since our latest updates (JS/TS version 2.0.0 / Python version 0.3.0) a few days ago.

hockyy · 2024-07-19T20:31:48Z

Ack, OK. Thank you for the info.

hockyy · 2024-07-19T20:32:32Z

btw, this can't be imported

hockyy · 2024-07-19T20:33:35Z

I'll debug it tomorrow, I'm really sleepy 😪

graphemecluster linked a pull request Jul 21, 2024 that will close this issue

Add multi-format bundle output and update package.json #5

Open
