Server
Models
class onmt.translate.translation_server.ServerModel(opt, model_id, tokenizer_opt=None, load=False, timeout=-1, on_timeout='to_cpu', model_root='./')[source]
Bases: object

Wrap a model with server functionality.
Parameters
- opt (dict) – Options for the Translator
- model_id (int) – Model ID
- tokenizer_opt (dict) – Options for the tokenizer or None
- load (bool) – whether to load the model during __init__()
- timeout (int) – Seconds before running do_timeout(). Negative values mean no timeout.
- on_timeout (str) – Options are [“to_cpu”, “unload”]. Set what to do on timeout (see do_timeout()).
- model_root (str) – Path to the model directory; it must contain the model and tokenizer files.
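A minimal construction sketch, assuming the opt dict accepts the same keys as the command-line translator; the checkpoint name and model_root path are placeholders, not part of the documented API.

    from onmt.translate.translation_server import ServerModel

    # Hypothetical translator options; the checkpoint path is a placeholder.
    opt = {"models": ["averaged-10-epoch.pt"], "gpu": -1, "beam_size": 5}

    model = ServerModel(
        opt,
        model_id=0,
        tokenizer_opt=None,      # no tokenizer configured in this sketch
        load=True,               # load the checkpoint during __init__()
        timeout=600,             # seconds of inactivity before do_timeout()
        on_timeout="to_cpu",     # move weights to CPU rather than unloading
        model_root="./available_models",
    )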
detokenize(sequence)[source]
Detokenize a single sequence.
Same args/returns as tokenize()
do_timeout()[source]
Timeout function that frees GPU memory.
Moves the model to CPU or unloads it, depending on the value of self.on_timeout.
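A hedged illustration of the timeout behaviour, reusing the model object from the sketch above; the timeout option governs when this runs automatically, so calling it by hand is only for demonstration.

    # With on_timeout="to_cpu" the weights are moved to CPU and can be
    # restored later; with on_timeout="unload" the model would instead be
    # dropped from memory.
    model.do_timeout()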
maybe_detokenize(sequence)[source]
De-tokenize the sequence (or not)
Same args/returns as tokenize()
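A short usage sketch, assuming the model was built with a tokenizer_opt (unlike the construction example above); the input is a made-up tokenized sequence.

    # Hypothetical tokenized output from the underlying Translator.
    tokenized = "▁Hello ▁world ."

    # detokenize() always applies the tokenizer; maybe_detokenize() is
    # assumed to return the sequence unchanged when no tokenizer is set.
    print(model.maybe_detokenize(tokenized))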
Core Server
class onmt.translate.translation_server.TranslationServer[source]
Bases: object
clone_model(model_id, opt, timeout=-1)[source]
Clone the model model_id.
Different options may be passed. If opt is None, it will use the same set of options.
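A hedged cloning sketch, assuming model 0 has already been registered on the server (e.g. with preload_model(), shown below); the option override is hypothetical.

    from onmt.translate.translation_server import TranslationServer

    server = TranslationServer()
    # ... model 0 is assumed to be loaded on `server` at this point ...
    # Reuse model 0's checkpoint but translate with a wider beam, and let
    # the clone free GPU memory after ten minutes of inactivity.
    new_id = server.clone_model(0, opt={"beam_size": 10}, timeout=600)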
preload_model(opt, model_id=None, **model_kwargs)[source]
Preload the model: update the internal data structure.
It will effectively load the model if load is set.
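A minimal preloading sketch; the option keys and paths are placeholders, and the extra keyword arguments are assumed to be forwarded to the ServerModel constructor.

    from onmt.translate.translation_server import TranslationServer

    server = TranslationServer()
    server.preload_model(
        {"models": ["averaged-10-epoch.pt"], "gpu": -1},  # placeholder options
        model_id=100,
        load=True,                     # actually load the checkpoint now
        model_root="./available_models",
    )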