Command Line Interface
Export UNIHAN to Python, Data Package, CSV, JSON and YAML
usage: unihan-etl [-h] [-v] [-s SOURCE] [-z ZIP_PATH] [-d DESTINATION]
[-w WORK_DIR] [-F {json,csv}] [--no-expand] [--no-prune]
[-f [FIELDS [FIELDS ...]]]
[-i [INPUT_FILES [INPUT_FILES ...]]]
[-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
Named Arguments
-v, --version | show program’s version number and exit |
-s, --source | URL or path of zipfile. Default: http://www.unicode.org/Public/UNIDATA/Unihan.zip |
-z, --zip-path | Path the zipfile is downloaded to. Default: /home/docs/.cache/unihan_etl/downloads/Unihan.zip |
-d, --destination | Path of the output file. Default: /home/docs/.local/share/unihan_etl/unihan.{json,csv,yaml} |
-w, --work-dir | Default: /home/docs/.cache/unihan_etl/downloads |
-F, --format | Possible choices: json, csv. Default: csv |
--no-expand | Don’t expand values to lists in multi-value UNIHAN fields. Doesn’t apply to CSVs. Default: True |
--no-prune | Don’t prune fields with empty keys. Doesn’t apply to CSVs. Default: True |
-f, --fields | Fields to use in export. Separated by spaces. All fields used by default. Fields: kAccountingNumeric, kBigFive, kCCCII, kCNS1986, kCNS1992, kCangjie, kCantonese, kCheungBauer, kCheungBauerIndex, kCihaiT, kCompatibilityVariant, kCowles, kDaeJaweon, kDefinition, kEACC, kFenn, kFennIndex, kFourCornerCode, kFrequency, kGB0, kGB1, kGB3, kGB5, kGB7, kGB8, kGSR, kGradeLevel, kHDZRadBreak, kHKGlyph, kHKSCS, kHanYu, kHangul, kHanyuPinlu, kHanyuPinyin, kIBMJapan, kIICore, kIRGDaeJaweon, kIRGDaiKanwaZiten, kIRGHanyuDaZidian, kIRGKangXi, kIRG_GSource, kIRG_HSource, kIRG_JSource, kIRG_KPSource, kIRG_KSource, kIRG_MSource, kIRG_TSource, kIRG_USource, kIRG_VSource, kJIS0213, kJa, kJapaneseKun, kJapaneseOn, kJinmeiyoKanji, kJis0, kJis1, kJoyoKanji, kKPS0, kKPS1, kKSC0, kKSC1, kKangXi, kKarlgren, kKorean, kKoreanEducationHanja, kKoreanName, kLau, kMainlandTelegraph, kMandarin, kMatthews, kMeyerWempe, kMorohashi, kNelson, kOtherNumeric, kPhonetic, kPrimaryNumeric, kPseudoGB1, kRSAdobe_Japan1_6, kRSJapanese, kRSKanWa, kRSKangXi, kRSKorean, kRSUnicode, kSBGY, kSemanticVariant, kSimplifiedVariant, kSpecializedSemanticVariant, kTGH, kTaiwanTelegraph, kTang, kTotalStrokes, kTraditionalVariant, kVietnamese, kXHC1983, kXerox, kZVariant |
-i, --input-files | Files inside zip to pull data from. Separated by spaces. All files used by default. Files: Unihan_DictionaryIndices.txt, Unihan_DictionaryLikeData.txt, Unihan_IRGSources.txt, Unihan_NumericValues.txt, Unihan_OtherMappings.txt, Unihan_RadicalStrokeCounts.txt, Unihan_Readings.txt, Unihan_Variants.txt |
-l, --log_level | Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL |
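
The options above can be combined into a single invocation. The following is a minimal sketch of assembling such a command line from Python, using only the flags listed in the table. The helper name build_unihan_etl_command and the output path unihan.json are hypothetical and not part of unihan-etl. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List, Optional, Sequence

# Choices copied from the option table above.
FORMATS = ("json", "csv")
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def build_unihan_etl_command(
    destination: Optional[str] = None,
    fmt: str = "csv",
    fields: Sequence[str] = (),
    input_files: Sequence[str] = (),
    expand: bool = True,
    prune: bool = True,
    log_level: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: build an argv list for the unihan-etl CLI."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}, got {fmt!r}")
    if log_level is not None and log_level not in LOG_LEVELS:
        raise ValueError(f"log level must be one of {LOG_LEVELS}, got {log_level!r}")

    cmd = ["unihan-etl", "-F", fmt]
    if destination:
        cmd += ["-d", destination]
    if not expand:
        cmd.append("--no-expand")
    if not prune:
        cmd.append("--no-prune")
    if log_level:
        cmd += ["-l", log_level]
    # -f and -i accept several space-separated values, so each value
    # is passed as its own argv entry after the flag.
    if fields:
        cmd += ["-f", *fields]
    if input_files:
        cmd += ["-i", *input_files]
    return cmd


if __name__ == "__main__":
    cmd = build_unihan_etl_command(
        destination="unihan.json",  # hypothetical output path
        fmt="json",
        fields=["kDefinition", "kMandarin"],
    )
    # Prints: unihan-etl -F json -d unihan.json -f kDefinition kMandarin
    print(shlex.join(cmd))
```

To actually run the export, the resulting list could be passed to subprocess.run(cmd, check=True), assuming unihan-etl is installed and on PATH. Unless -s points to a local zipfile, this fetches the archive from the default source URL.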