transliterate package
Subpackages
- transliterate.backports package
- transliterate.contrib package
- Subpackages
- transliterate.contrib.apps package
- transliterate.contrib.languages package
- Subpackages
- transliterate.contrib.languages.bg package
- transliterate.contrib.languages.el package
- transliterate.contrib.languages.he package
- transliterate.contrib.languages.hi package
- transliterate.contrib.languages.hy package
- transliterate.contrib.languages.ka package
- transliterate.contrib.languages.ru package
- transliterate.contrib.languages.uk package
- Module contents
- Subpackages
- Module contents
- Subpackages
- transliterate.tests package
Submodules
transliterate.base module
- class transliterate.base.TranslitLanguagePack
  Bases: object
  Base language pack. The attributes below shall be defined in every language pack.
  - language_code: Language code (obligatory). Example values: 'hy', 'ru'.
  - language_name: Language name (obligatory). Example values: 'Armenian', 'Russian'.
  - character_ranges: Character ranges that are specific to the language. When making a pack, check this page for the ranges.
  - mapping: Mapping (obligatory). A tuple consisting of two strings (source and target). Example value: (u'abc', u'աբց').
  - reversed_specific_mapping: Specific mapping (one direction only) used when transliterating from target script to source script (reversed transliteration).
  - pre_processor_mapping: Pre-processor mapping (optional). A dictionary mapping for letters that can't be represented by a single Latin letter.
  - reversed_specific_pre_processor_mapping: Pre-processor mapping (optional). A dictionary mapping for letters that can't be represented by a single Latin letter (reversed transliteration).
  Example:
  >>> from transliterate.base import TranslitLanguagePack
  >>> class ArmenianLanguagePack(TranslitLanguagePack):
  ...     language_code = "hy"
  ...     language_name = "Armenian"
  ...     character_ranges = ((0x0530, 0x058F), (0xFB10, 0xFB1F))
  ...     mapping = (
  ...         u"abgdezilxkhmjnpsvtrcq&ofABGDEZILXKHMJNPSVTRCQOF",  # Source script
  ...         u"աբգդեզիլխկհմյնպսվտրցքևօֆԱԲԳԴԵԶԻԼԽԿՀՄՅՆՊՍՎՏՐՑՔՕՖ",  # Target script
  ...     )
  ...     reversed_specific_mapping = (
  ...         u"ռՌ",
  ...         u"rR"
  ...     )
  ...     pre_processor_mapping = {
  ...         # lowercase
  ...         u"e'": u"է",
  ...         u"y": u"ը",
  ...         u"th": u"թ",
  ...         u"jh": u"ժ",
  ...         u"ts": u"ծ",
  ...         u"dz": u"ձ",
  ...         u"gh": u"ղ",
  ...         u"tch": u"ճ",
  ...         u"sh": u"շ",
  ...         u"vo": u"ո",
  ...         u"ch": u"չ",
  ...         u"dj": u"ջ",
  ...         u"ph": u"փ",
  ...         u"u": u"ու",
  ...
  ...         # uppercase
  ...         u"E'": u"Է",
  ...         u"Y": u"Ը",
  ...         u"Th": u"Թ",
  ...         u"Jh": u"Ժ",
  ...         u"Ts": u"Ծ",
  ...         u"Dz": u"Ձ",
  ...         u"Gh": u"Ղ",
  ...         u"Tch": u"Ճ",
  ...         u"Sh": u"Շ",
  ...         u"Vo": u"Ո",
  ...         u"Ch": u"Չ",
  ...         u"Dj": u"Ջ",
  ...         u"Ph": u"Փ",
  ...         u"U": u"Ու"
  ...     }
  ...     reversed_specific_pre_processor_mapping = {
  ...         u"ու": u"u",
  ...         u"Ու": u"U"
  ...     }
  Note that in Python 3 the u prefix before the strings is not needed.
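To make the roles of mapping and pre_processor_mapping concrete, here is a minimal sketch of how the two could be combined. It is not the library's actual implementation: the helper name apply_pack and the longest-key-first ordering are assumptions made for illustration, and only a small subset of the Armenian pack is used.

```python
# Illustrative sketch only; not the library's actual implementation.

# A small subset of the Armenian pack shown above.
MAPPING = ("abgdez", "աբգդեզ")
PRE_PROCESSOR_MAPPING = {"sh": "շ", "th": "թ", "dz": "ձ"}


def apply_pack(text, mapping, pre_processor_mapping):
    """Hypothetical helper: apply multi-letter rules first, then single letters."""
    # Replace longer keys first so that, for example, "tch" would win over "ch".
    for key in sorted(pre_processor_mapping, key=len, reverse=True):
        text = text.replace(key, pre_processor_mapping[key])
    # Build a one-to-one translation table from the two mapping strings.
    source, target = mapping
    table = {ord(s): t for s, t in zip(source, target)}
    return text.translate(table)


result = apply_pack("badz", MAPPING, PRE_PROCESSOR_MAPPING)
print(result)
```

Running the pre-processor step first matters here: if the single-letter table were applied first, "d" and "z" would be converted separately and the digraph rule for "dz" would never match.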
- character_ranges = None
- characters = None
- classmethod contains(character)
  Checks if the given character belongs to the language pack.
  Return type: bool
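One plausible way to picture such a membership check, assuming it is decided by character_ranges, is a code point comparison against each inclusive range. The function name in_ranges is hypothetical, and the ranges are the Armenian ones from the example pack above.

```python
# Illustrative sketch; assumes membership is decided by character_ranges.
CHARACTER_RANGES = ((0x0530, 0x058F), (0xFB10, 0xFB1F))


def in_ranges(character, ranges=CHARACTER_RANGES):
    """Hypothetical check: does the code point fall inside any inclusive range?"""
    code_point = ord(character)
    return any(low <= code_point <= high for low, high in ranges)


print(in_ranges("\u0561"))  # Armenian small letter ayb (U+0561)
print(in_ranges("a"))       # Latin small letter a (U+0061)
```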
- detect(text, num_words=None)
  Heavy language detection, which is activated for languages that are harder to detect (like Russian Cyrillic and Ukrainian Cyrillic).
  Parameters:
  - text (unicode) – Input string.
  - num_words (int) – Number of words to base the decision on.
  Return type: bool. True if detected and False otherwise.
- detectable = False
- language_code = None
- language_name = None
- make_strict(value, reversed=False)
  Strips out unnecessary characters from the string.
  Parameters:
  - value (string)
  - reversed (bool)
  Return type: string
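The idea of stripping unnecessary characters can be sketched as a filter against an allowed character set. This is an assumption about the general technique rather than a description of how make_strict is implemented; the helper strip_unknown and the choice of allowed characters are hypothetical.

```python
# Illustrative sketch; the allowed set here is chosen by hand.
TARGET = "աբգդեզ"


def strip_unknown(value, allowed_characters):
    """Hypothetical helper: drop every character not listed as allowed."""
    allowed = set(allowed_characters)
    return "".join(ch for ch in value if ch in allowed)


# Keep the target letters and spaces; punctuation and other letters are dropped.
cleaned = strip_unknown("բա!?ձ գ", TARGET + " ")
print(cleaned)
```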
- mapping = None
- pre_processor_mapping = None
- pre_processor_mapping_keys = []
- reversed_characters = None
- reversed_pre_processor_mapping_keys = []
- reversed_specific_mapping = None
- reversed_specific_pre_processor_mapping = None
- reversed_specific_pre_processor_mapping_keys = []
transliterate.conf module
transliterate.decorators module
- transliterate.decorators.transliterate_function
  alias of TransliterateFunction
- transliterate.decorators.transliterate_method
  alias of TransliterateMethod
transliterate.defaults module
transliterate.discover module
transliterate.exceptions module
- exception transliterate.exceptions.LanguageCodeError
  Bases: exceptions.Exception
  Exception raised when the language code is left empty or has an incorrect value.
- exception transliterate.exceptions.ImproperlyConfigured
  Bases: exceptions.Exception
  Exception raised when the developer has not configured the code properly.
- exception transliterate.exceptions.LanguagePackNotFound
  Bases: exceptions.Exception
  Exception raised when no language pack is found for the given language code.
transliterate.utils module
- transliterate.utils.translit(value, language_code=None, reversed=False, strict=False)
  Transliterates the text for the given language. The language code is optional for reversed transliteration (from some script to Latin).
  Parameters:
  - value (str)
  - language_code (str)
  - reversed (bool) – If set to True, reversed transliteration is performed.
  - strict (bool) – If set to True, all characters that are not found in the transliteration pack are stripped out.
  Return type: str
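Reversed transliteration can be pictured as inverting the pack's mapping and then adding the one-directional entries from reversed_specific_mapping. The sketch below illustrates that idea under stated assumptions; it is not the library's code, the helper reverse_transliterate is hypothetical, and the mappings are a small subset of the Armenian example pack.

```python
# Illustrative sketch only; a subset of the Armenian example pack.
MAPPING = ("abr", "աբր")                # (source, target)
REVERSED_SPECIFIC_MAPPING = ("ռ", "r")  # target-to-source only


def reverse_transliterate(text, mapping, reversed_specific_mapping):
    """Hypothetical helper: map target-script letters back to the source script."""
    source, target = mapping
    # Invert the regular mapping: target letter -> source letter.
    table = {ord(t): s for s, t in zip(source, target)}
    # Add letters that only exist in the reversed direction.
    extra_from, extra_to = reversed_specific_mapping
    table.update({ord(f): t for f, t in zip(extra_from, extra_to)})
    return text.translate(table)


latin = reverse_transliterate("բառ", MAPPING, REVERSED_SPECIFIC_MAPPING)
print(latin)
```

The one-directional entries are needed because two letters of the target script can collapse to the same Latin letter, so they cannot both appear in the two-way mapping.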
- transliterate.utils.suggest(value, language_code=None, reversed=False, limit=None)
  Suggests possible variants.
  Parameters:
  - value (str)
  - language_code (str)
  - reversed (bool) – If set to True, reversed transliteration is performed.
  - limit (int) – Limit on the number of suggested variants.
  Return type: list
- transliterate.utils.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)
  Detects the language of the given text based on the ranges defined in active language packs.
  Parameters:
  - text (unicode) – Input string.
  - num_words (int) – Number of words to base the decision on.
  - fail_silently (bool)
  - heavy_check (bool) – If set to True, heavy checks are applied when simple checks don't give any results. Heavy checks are language specific and do not follow a common logic. Heavy language detection is defined in the detect method of each language pack.
  Return type: str. Language code.
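Range-based detection of this kind can be sketched by counting how many characters of the first num_words words fall into each language's ranges. Everything below is an assumption for illustration: the RANGES table is a hand-made stand-in for the registered packs (the Greek entry uses the Unicode "Greek and Coptic" block), the helper detect_by_ranges is hypothetical, and returning None when nothing matches is a choice of this sketch, not a statement about fail_silently.

```python
from collections import Counter

# Hypothetical stand-in for the ranges of registered language packs.
RANGES = {
    "hy": ((0x0530, 0x058F), (0xFB10, 0xFB1F)),  # from the example pack
    "el": ((0x0370, 0x03FF),),                   # assumed: Greek and Coptic block
}


def detect_by_ranges(text, num_words=None, ranges=RANGES):
    """Hypothetical helper: pick the language whose ranges match most characters."""
    words = text.split()
    if num_words is not None:
        words = words[:num_words]
    counts = Counter()
    for word in words:
        for character in word:
            code_point = ord(character)
            for code, language_ranges in ranges.items():
                if any(low <= code_point <= high for low, high in language_ranges):
                    counts[code] += 1
    if not counts:
        return None  # nothing matched any known range
    return counts.most_common(1)[0][0]


print(detect_by_ranges("\u0561\u0562 \u0563\u0564"))
```

Languages that share a script, such as the Russian and Ukrainian Cyrillic cases mentioned for detect, cannot be separated by ranges alone, which is where a language-specific heavy check would come in.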
Module contents
- transliterate.translit(value, language_code=None, reversed=False, strict=False)
  Transliterates the text for the given language. The language code is optional for reversed transliteration (from some script to Latin).
  Parameters:
  - value (str)
  - language_code (str)
  - reversed (bool) – If set to True, reversed transliteration is performed.
  - strict (bool) – If set to True, all characters that are not found in the transliteration pack are stripped out.
  Return type: str
- transliterate.get_available_language_codes()
  Gets the list of language codes for registered language packs.
  Return type: list
- transliterate.detect_language(text, num_words=None, fail_silently=True, heavy_check=False)
  Detects the language of the given text based on the ranges defined in active language packs.
  Parameters:
  - text (unicode) – Input string.
  - num_words (int) – Number of words to base the decision on.
  - fail_silently (bool)
  - heavy_check (bool) – If set to True, heavy checks are applied when simple checks don't give any results. Heavy checks are language specific and do not follow a common logic. Heavy language detection is defined in the detect method of each language pack.
  Return type: str. Language code.