ltsm.data_provider.tokenizer package
Submodules
ltsm.data_provider.tokenizer.standard_scaler module
- class ltsm.data_provider.tokenizer.standard_scaler.StandardScaler[source]
 Bases: BaseProcessor
 Represents a standard scaler object that uses scikit-learn's StandardScaler for data processing.
- module_id
 The identifier for base processor objects.
- Type:
 str
- inverse_process(data)[source]
 Scales the data back to its original representation.
- Parameters:
 data (np.ndarray) – The data to scale back.
- Returns:
 The data restored to its original scale.
- Return type:
 np.ndarray
- load(save_dir)[source]
 Loads the scaler saved at the save_dir directory.
- Parameters:
 save_dir (str) – The directory where the scaler was saved.
- module_id = 'standard_scaler'
 
- process(raw_data, train_data, val_data, test_data, fit_train_only=False, do_anomaly=False)[source]
 Standardizes the training, validation, and test sets by removing the mean and scaling to unit variance.
- Parameters:
 raw_data (np.ndarray) – The raw data.
train_data (List[np.ndarray]) – The list of training sequences.
val_data (List[np.ndarray]) – The list of validation sequences.
test_data (List[np.ndarray]) – The list of test sequences.
fit_train_only (bool) – Indicates whether the scaler should be fit on the training data only.
do_anomaly (bool) – Indicates whether the data is processed for an anomaly-detection task.
- Returns:
 A tuple of three lists containing the processed training, validation, and test data.
- Return type:
 Tuple[List[np.ndarray], List[np.ndarray], List[np.ndarray]]
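A minimal usage sketch, assuming only the signatures documented above; the toy arrays and the save directory are illustrative:

```python
import numpy as np

from ltsm.data_provider.tokenizer.standard_scaler import StandardScaler

# Toy sequences; shapes and values are illustrative only.
raw_data = np.random.randn(100)
train_data = [np.random.randn(60), np.random.randn(60)]
val_data = [np.random.randn(20)]
test_data = [np.random.randn(20)]

scaler = StandardScaler()
train_s, val_s, test_s = scaler.process(
    raw_data, train_data, val_data, test_data, fit_train_only=True
)

# Map a processed sequence (e.g. a model output) back to the original units.
restored = scaler.inverse_process(test_s[0])

# scaler.load("path/to/scaler_dir")  # restore a previously saved scaler
```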
ltsm.data_provider.tokenizer.tokenizer_processor module
- class ltsm.data_provider.tokenizer.tokenizer_processor.ChronosTokenizer[source]
 Bases: object
 A ChronosTokenizer defines how time series are mapped into token IDs and back. For details, see the input_transform and output_transform methods, which concrete classes must implement.
- input_transform(context)[source]
 Turns a batch of time series into token IDs, an attention mask, and a scale.
- Parameters:
 context (Tensor) – A tensor shaped (batch_size, time_length), containing the time series to forecast. Use left-padding with torch.nan to align time series of different lengths.
- Returns:
 token_ids – A tensor of integers, shaped (batch_size, time_length + 1) if config.use_eos_token and (batch_size, time_length) otherwise, containing token IDs for the input series.
attention_mask – A boolean tensor, same shape as token_ids, indicating which input observations are not torch.nan (i.e. neither missing nor padding).
tokenizer_state – An object that will be passed to output_transform. Contains the relevant context to decode output samples into real values, such as location and scale parameters.
- Return type:
 Tuple[Tensor, Tensor, Any]
- output_transform(samples, tokenizer_state)[source]
 Turns a batch of sample token IDs into real values.
- Parameters:
 samples (Tensor) – A tensor of integers, shaped (batch_size, num_samples, time_length), containing token IDs of sample trajectories.
tokenizer_state (Any) – An object returned by input_transform containing the relevant context to decode samples, such as location and scale. Its nature depends on the specific tokenizer.
- Returns:
 forecasts – A real tensor, shaped (batch_size, num_samples, time_length), containing forecasted sample paths.
- Return type:
 Tensor
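Because ChronosTokenizer only fixes this interface, every concrete tokenizer supplies the two methods itself. The sketch below uses a hypothetical subclass (IdentityTokenizer is not part of the package) that rounds values to integer IDs, purely to illustrate the contract:

```python
from typing import Any, Tuple

import torch
from torch import Tensor

from ltsm.data_provider.tokenizer.tokenizer_processor import ChronosTokenizer


class IdentityTokenizer(ChronosTokenizer):
    """Hypothetical tokenizer: token IDs are the rounded values themselves."""

    def input_transform(self, context: Tensor) -> Tuple[Tensor, Tensor, Any]:
        attention_mask = ~torch.isnan(context)  # False at missing/padded steps
        token_ids = torch.nan_to_num(context).round().long()
        # No location/scale to remember, so the tokenizer state is None.
        return token_ids, attention_mask, None

    def output_transform(self, samples: Tensor, tokenizer_state: Any) -> Tensor:
        # Decoding is the identity here: the IDs are the real values.
        return samples.float()


tok = IdentityTokenizer()
ids, mask, state = tok.input_transform(torch.tensor([[1.2, float("nan"), 3.7]]))
```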
- class ltsm.data_provider.tokenizer.tokenizer_processor.MeanScaleUniformBins(low_limit, high_limit, config)[source]
 Bases: ChronosTokenizer
- input_transform(context)[source]
 Turns a batch of time series into token IDs, an attention mask, and a scale.
- Parameters:
 context (Tensor) – A tensor shaped (batch_size, time_length), containing the time series to forecast. Use left-padding with torch.nan to align time series of different lengths.
- Returns:
 token_ids – A tensor of integers, shaped (batch_size, time_length + 1) if config.use_eos_token and (batch_size, time_length) otherwise, containing token IDs for the input series.
attention_mask – A boolean tensor, same shape as token_ids, indicating which input observations are not torch.nan (i.e. neither missing nor padding).
tokenizer_state – An object that will be passed to output_transform. Contains the relevant context to decode output samples into real values, such as location and scale parameters.
- Return type:
 Tuple[Tensor, Tensor, Tensor]
- output_transform(samples, scale)[source]
 Turns a batch of sample token IDs into real values.
- Parameters:
 samples (Tensor) – A tensor of integers, shaped (batch_size, num_samples, time_length), containing token IDs of sample trajectories.
scale (Tensor) – The scale tensor returned by input_transform, containing the context needed to decode samples back into real values.
- Returns:
 forecasts – A real tensor, shaped (batch_size, num_samples, time_length), containing forecasted sample paths.
- Return type:
 Tensor
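A round-trip sketch, assuming TokenizerConfig (documented below) accepts its fields as keyword arguments; all configuration values are illustrative, and use_eos_token=False keeps token_ids the same length as the input so the encoded series can be decoded directly:

```python
import torch

from ltsm.data_provider.tokenizer.tokenizer_processor import (
    MeanScaleUniformBins,
    TokenizerConfig,
)

# Illustrative values; only the vocabulary/special-token fields matter here.
config = TokenizerConfig(
    tokenizer_class="MeanScaleUniformBins",
    tokenizer_kwargs={"low_limit": -15.0, "high_limit": 15.0},
    n_tokens=4096,
    n_special_tokens=2,
    pad_token_id=0,
    eos_token_id=1,
    use_eos_token=False,
    model_type="seq2seq",
    context_length=512,
    prediction_length=64,
    num_samples=20,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
)

tokenizer = MeanScaleUniformBins(low_limit=-15.0, high_limit=15.0, config=config)

context = torch.arange(1.0, 9.0).unsqueeze(0)   # (batch_size=1, time_length=8)
token_ids, attention_mask, scale = tokenizer.input_transform(context)

# Treat the encoded series as one sample trajectory and decode it; the result
# matches the input up to quantization error.
samples = token_ids.unsqueeze(1)                # (1, num_samples=1, 8)
reconstructed = tokenizer.output_transform(samples, scale)
```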
- class ltsm.data_provider.tokenizer.tokenizer_processor.TokenizerConfig(tokenizer_class, tokenizer_kwargs, n_tokens, n_special_tokens, pad_token_id, eos_token_id, use_eos_token, model_type, context_length, prediction_length, num_samples, temperature, top_k, top_p)[source]
 Bases: object
 This class holds all the configuration parameters to be used by ChronosTokenizer and ChronosModel.
- context_length: int
- eos_token_id: int
- model_type: Literal['causal', 'seq2seq']
- n_special_tokens: int
- n_tokens: int
- num_samples: int
- pad_token_id: int
- prediction_length: int
- temperature: float
- tokenizer_class: str
- tokenizer_kwargs: Dict[str, Any]
- top_k: int
- top_p: float
- use_eos_token: bool
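The tokenizer_class and tokenizer_kwargs fields make the config self-describing. A caller might resolve them as in the hypothetical build_tokenizer helper below (not an API of this package; the field values are the same illustrative ones used earlier):

```python
import ltsm.data_provider.tokenizer.tokenizer_processor as tp


def build_tokenizer(cfg: tp.TokenizerConfig) -> tp.ChronosTokenizer:
    # Resolve the class named in the config and pass its constructor
    # arguments from tokenizer_kwargs, plus the config itself.
    cls = getattr(tp, cfg.tokenizer_class)
    return cls(**cfg.tokenizer_kwargs, config=cfg)


config = tp.TokenizerConfig(
    tokenizer_class="MeanScaleUniformBins",
    tokenizer_kwargs={"low_limit": -15.0, "high_limit": 15.0},
    n_tokens=4096,
    n_special_tokens=2,
    pad_token_id=0,
    eos_token_id=1,
    use_eos_token=True,
    model_type="seq2seq",
    context_length=512,
    prediction_length=64,
    num_samples=20,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
)

tokenizer = build_tokenizer(config)  # -> MeanScaleUniformBins instance
```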