GlotScript Resource

Current Version = V0.1

Check history for other versions.

How to load

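# ! wget https://raw.githubusercontent.com/cisnlp/GlotScript/main/metadata/GlotScript.tsv
import pandas as pd
# na_filter=False reads empty cells (and strings such as 'nan') as text instead of NaN
df = pd.read_csv('GlotScript.tsv', na_filter=False, sep='\t')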
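The `na_filter` argument in the loading line matters more than it may look. By default pandas converts empty cells and strings such as `nan` to missing values, and `nan` is also a valid ISO 639-3 code (Min Nan Chinese). The sketch below shows the difference on a two-row in-memory table. The column names `code` and `scripts` and the rows themselves are made up for illustration and are not taken from GlotScript.tsv.

```python
import io

import pandas as pd

# Toy TSV with hypothetical column names; the first row has the code "nan"
# and an empty scripts cell.
toy_tsv = "code\tscripts\nnan\t\neng\tLatn\n"

# Default parsing: "nan" and the empty cell both become missing values (NaN).
default = pd.read_csv(io.StringIO(toy_tsv), sep="\t")

# With na_filter=False every cell stays a plain string.
kept = pd.read_csv(io.StringIO(toy_tsv), sep="\t", na_filter=False)

print(default["code"].isna().tolist())
print(kept["code"].tolist())
print(kept["scripts"].tolist())
```

With `na_filter=False`, empty cells come back as empty strings, so a check such as `df[col] == ''` is the way to test for "no script listed" rather than `isna()`.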

Format

  • MAIN or CORE: Given a language l identified by an ISO 639 code, a script for l is categorized as MAIN if at least two of the three metadata sources list it for l.

  • AUXILIARY (aux): If only one metadata source lists a script and the other two do not, the script is placed in the auxiliary category specific to that source. Wiki-aux, LREC2800-aux, and SIL-aux are used for Wikipedia, LREC_2800, and SIL, respectively. SIL2-aux is used exclusively for discrepancies between ScriptSource and LangTag.
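The MAIN and auxiliary rules above can be sketched as a small voting function. This is a minimal illustration of the rule as described here, not the authors' pipeline: the function name `categorize`, the `AUX_LABEL` mapping, and the toy input are assumptions, and the SIL2-aux case (ScriptSource versus LangTag) is not modelled.

```python
from collections import Counter

# Hypothetical mapping from source name to the aux label named in the text.
AUX_LABEL = {
    "Wikipedia": "Wiki-aux",
    "LREC_2800": "LREC2800-aux",
    "SIL": "SIL-aux",
}


def categorize(scripts_by_source):
    """Split scripts into MAIN (listed by >= 2 sources) and per-source aux sets."""
    # Count how many sources list each script.
    counts = Counter(
        script for scripts in scripts_by_source.values() for script in set(scripts)
    )
    main = {script for script, n in counts.items() if n >= 2}
    aux = {AUX_LABEL[source]: set() for source in scripts_by_source}
    for source, scripts in scripts_by_source.items():
        for script in scripts:
            if counts[script] == 1:  # only this source lists the script
                aux[AUX_LABEL[source]].add(script)
    return main, aux


# Toy input for one language; the script assignments are invented.
toy = {
    "Wikipedia": {"Latn", "Cyrl"},
    "LREC_2800": {"Latn"},
    "SIL": {"Latn", "Arab"},
}
main, aux = categorize(toy)
print(main)
print(aux)
```

In this toy case `Latn` is listed by all three sources and lands in MAIN, while `Cyrl` and `Arab` are each listed by a single source and go to that source's aux set.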

License

This dataset is available under the CC BY-SA 4.0 license, which permits modification and redistribution provided that attribution is given and derivatives are shared under the same license.

Sources