Translate¶
translate.py
usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
[--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
[--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
[--report_bleu] [--report_rouge] [--report_time]
[--dynamic_dict] [--share_vocab]
[--random_sampling_topk RANDOM_SAMPLING_TOPK]
[--random_sampling_temp RANDOM_SAMPLING_TEMP]
[--seed SEED] [--beam_size BEAM_SIZE]
[--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
[--max_sent_length] [--stepwise_penalty]
[--length_penalty {none,wu,avg}] [--ratio RATIO]
[--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
[--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
[--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
[--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
[--log_file LOG_FILE]
[--log_file_level {INFO,CRITICAL,WARNING,ERROR,NOTSET,DEBUG,20,50,30,40,0,10}]
[--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
[--batch_size BATCH_SIZE] [--gpu GPU]
[--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
[--window_stride WINDOW_STRIDE] [--window WINDOW]
[--image_channel_size {3,1}]
Named Arguments¶
- -config, --config
config file path
- -save_config, --save_config
config file save path
Model¶
- --model, -model
Path to model .pt file(s). Multiple models can be specified, for ensemble decoding.
Default: []
- --fp32, -fp32
Force the model to run in FP32, because FP16 is very slow on the GTX 1080 (Ti).
Default: False
- --avg_raw_probs, -avg_raw_probs
If this is set, during ensembling, scores from different models will be combined by averaging their raw probabilities and then taking the log. Otherwise, the log probabilities will be averaged directly. This is necessary for models whose output layers can assign zero probability.
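A small numeric illustration of why averaging raw probabilities matters (this is a sketch of the arithmetic only, not OpenNMT-py's internals):

```python
import math

# Probabilities two ensemble members assign to the same token.
p1, p2 = 0.8, 0.0  # the second model assigns exactly zero probability

# With --avg_raw_probs: average the raw probabilities, then take the log.
avg_raw = math.log((p1 + p2) / 2)  # finite: log(0.4)

# Without it: average the log probabilities directly.
# log(0.0) is -inf, so the whole combined score becomes -inf.
logs = [math.log(p) if p > 0 else float("-inf") for p in (p1, p2)]
avg_log = sum(logs) / 2
```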
Default: False
Data¶
- --data_type, -data_type
Type of the source input. Options: [text|img].
Default: “text”
- --src, -src
Source sequence to decode (one line per sequence)
- --src_dir, -src_dir
Source directory for image or audio files
Default: “”
- --tgt, -tgt
True target sequence (optional)
- --shard_size, -shard_size
Divide src and tgt (if applicable) into multiple smaller src and tgt files, then build shards; each shard will have opt.shard_size samples, except the last. shard_size=0 means no segmentation; shard_size>0 segments the dataset into multiple shards of shard_size samples each.
Default: 10000
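The sharding rule above can be sketched as follows (an illustrative helper, not OpenNMT-py's implementation):

```python
def shard(samples, shard_size):
    """Split samples into consecutive shards of shard_size items.
    shard_size=0 disables sharding; the last shard keeps the remainder."""
    if shard_size <= 0:
        return [samples]
    return [samples[i:i + shard_size]
            for i in range(0, len(samples), shard_size)]
```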
- --output, -output
Path to output the predictions (each line will be the decoded sequence)
Default: “pred.txt”
- --report_bleu, -report_bleu
Report BLEU score after translation by calling tools/multi-bleu.perl on the command line
Default: False
- --report_rouge, -report_rouge
Report ROUGE 1/2/3/L/SU4 scores after translation by calling tools/test_rouge.py on the command line
Default: False
- --report_time, -report_time
Report some translation time metrics
Default: False
- --dynamic_dict, -dynamic_dict
Create dynamic dictionaries
Default: False
- --share_vocab, -share_vocab
Share source and target vocabulary
Default: False
Random Sampling¶
- --random_sampling_topk, -random_sampling_topk
Set this to -1 to do random sampling from the full distribution. Set it to a value k>1 to restrict random sampling to the k most likely next tokens. Set it to 1 to use argmax, or when doing beam search.
Default: 1
- --random_sampling_temp, -random_sampling_temp
If doing random sampling, divide the logits by this before computing softmax during decoding.
Default: 1.0
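The interaction of top-k and temperature can be sketched as below. This is an illustrative decoder step over raw logits, assuming the semantics documented above; it is not OpenNMT-py's code.

```python
import math
import random

def sample_next(logits, topk=1, temp=1.0, rng=random):
    """Pick the next token id from raw logits.
    topk=1  -> argmax (greedy / beam-search-compatible)
    topk=-1 -> sample from the full distribution
    topk=k>1 -> sample among the k most likely tokens
    Logits are divided by temp before the softmax."""
    scaled = [l / temp for l in logits]
    ids = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if topk == 1:
        return ids[0]
    pool = ids if topk == -1 else ids[:topk]
    exps = [math.exp(scaled[i]) for i in pool]  # unnormalized softmax weights
    r = rng.random() * sum(exps)
    for i, e in zip(pool, exps):
        r -= e
        if r <= 0:
            return i
    return pool[-1]
```

Lower temperatures sharpen the distribution toward the argmax token; higher temperatures flatten it.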
- --seed, -seed
Random seed
Default: 829
Beam¶
- --beam_size, -beam_size
Beam size
Default: 5
- --min_length, -min_length
Minimum prediction length
Default: 0
- --max_length, -max_length
Maximum prediction length.
Default: 100
- --max_sent_length, -max_sent_length
Deprecated, use -max_length instead
- --stepwise_penalty, -stepwise_penalty
Apply the penalty at every decoding step. Helpful for summarization.
Default: False
- --length_penalty, -length_penalty
Possible choices: none, wu, avg
Length Penalty to use.
Default: “none”
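For reference, the "wu" choice follows the GNMT length penalty of Wu et al. (2016), controlled by the alpha option below. A minimal sketch of that formula (assuming the standard GNMT form; not copied from OpenNMT-py):

```python
def wu_length_penalty(length, alpha):
    # GNMT length penalty (Wu et al., 2016): ((5 + |Y|) / 6) ** alpha
    # Beam scores are divided by this value; alpha = 0 leaves them unchanged,
    # larger alpha favours longer outputs.
    return ((5 + length) / 6.0) ** alpha
```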
- --ratio, -ratio
Ratio-based beam stop condition
Default: -0.0
- --coverage_penalty, -coverage_penalty
Possible choices: none, wu, summary
Coverage Penalty to use.
Default: “none”
- --alpha, -alpha
Google NMT length penalty parameter (higher = longer generation)
Default: 0.0
- --beta, -beta
Coverage penalty parameter
Default: -0.0
- --block_ngram_repeat, -block_ngram_repeat
Block repetition of ngrams during decoding.
Default: 0
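The blocking check can be pictured as follows: a hypothesis whose last n tokens already occurred earlier in the sequence is penalized. This is an illustrative check, not the decoder's actual code.

```python
def would_repeat_ngram(tokens, n):
    """True if the last n tokens already appear earlier in the sequence."""
    if n <= 0 or len(tokens) < n:
        return False
    tail = tuple(tokens[-n:])
    grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n)}
    return tail in grams
```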
- --ignore_when_blocking, -ignore_when_blocking
Ignore these strings when blocking repeated n-grams. You will typically want to list sentence delimiters here.
Default: []
- --replace_unk, -replace_unk
Replace the generated UNK tokens with the source token that had highest attention weight. If phrase_table is provided, it will look up the identified source token and give the corresponding target token. If it is not provided (or the identified source token does not exist in the table), then it will copy the source token.
Default: False
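The documented behaviour can be sketched like this, assuming attn[i][j] holds the attention of target position i on source position j (an illustrative helper, not the actual implementation):

```python
def replace_unk(pred_tokens, src_tokens, attn, phrase_table=None, unk="<unk>"):
    """Replace each <unk> with the source token of highest attention,
    mapped through the phrase table when an entry exists."""
    phrase_table = phrase_table or {}
    out = []
    for i, tok in enumerate(pred_tokens):
        if tok == unk:
            j = max(range(len(src_tokens)), key=lambda k: attn[i][k])
            src = src_tokens[j]
            tok = phrase_table.get(src, src)  # fall back to copying the source token
        out.append(tok)
    return out
```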
- --phrase_table, -phrase_table
If phrase_table is provided (with replace_unk), it will look up the identified source token and give the corresponding target token. If it is not provided (or the identified source token does not exist in the table), then it will copy the source token.
Default: “”
Logging¶
- --verbose, -verbose
Print scores and predictions for each sentence
Default: False
- --log_file, -log_file
Output logs to a file under this path.
Default: “”
- --log_file_level, -log_file_level
Possible choices: INFO, CRITICAL, WARNING, ERROR, NOTSET, DEBUG, 20, 50, 30, 40, 0, 10
Default: “0”
- --attn_debug, -attn_debug
Print the best attention for each word
Default: False
- --dump_beam, -dump_beam
File to dump beam information to.
Default: “”
- --n_best, -n_best
If verbose is set, will output the n_best decoded sentences
Default: 1
Efficiency¶
- --batch_size, -batch_size
Batch size
Default: 30
- --gpu, -gpu
Device to run on
Default: -1
Speech¶
- --sample_rate, -sample_rate
Sample rate.
Default: 16000
- --window_size, -window_size
Window size for spectrogram in seconds
Default: 0.02
- --window_stride, -window_stride
Window stride for spectrogram in seconds
Default: 0.01
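With the defaults above (16 kHz audio, 20 ms windows, 10 ms stride), the number of spectrogram frames for an utterance follows from converting seconds to samples. A rough sketch of that arithmetic (illustrative only):

```python
def num_frames(duration_s, sample_rate=16000, window_size=0.02, window_stride=0.01):
    """Approximate number of STFT frames for an utterance."""
    win = int(sample_rate * window_size)    # e.g. 320 samples per window
    hop = int(sample_rate * window_stride)  # e.g. 160 samples between windows
    total = int(sample_rate * duration_s)
    if total < win:
        return 0
    return 1 + (total - win) // hop
```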
- --window, -window
Window type for spectrogram generation
Default: “hamming”
- --image_channel_size, -image_channel_size
Possible choices: 3, 1
Using grayscale images can make the model smaller and training faster
Default: 3