Models
twgeo.models.geomodel
class twgeo.models.geomodel.Model(use_tensorboard=True, batch_size=64)
Bases: object
Geolocation prediction model. Consists of four layers (Embedding, LSTM, LSTM, and Dense).
Parameters:
- use_tensorboard – Track training progress using TensorBoard. Default: True
- batch_size – Default: 64
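The methods documented for this class can be chained into a single workflow. The following is a minimal sketch, assuming `build_model` is called before `train` (the source does not state a required order). `train_and_evaluate` and its argument names are hypothetical, not part of twgeo; the function accepts any object that exposes the documented methods.

```python
def train_and_evaluate(model, x_train, y_train, x_dev, y_dev,
                       x_test, y_test, num_outputs, filename_prefix):
    """Drive a model object through build, train, evaluate, and save.

    Hypothetical helper: `model` is assumed to expose the methods
    documented for twgeo.models.geomodel.Model.
    """
    # Build the network; the other build_model arguments keep their
    # documented defaults (time_steps=500, vocab_size=20000,
    # hidden_layer_size=128).
    model.build_model(num_outputs=num_outputs)

    # epochs=7 is the documented default, passed explicitly for clarity.
    model.train(x_train, y_train, x_dev, y_dev, epochs=7)

    # evaluate is documented to return a dictionary of metric/value pairs.
    metrics = model.evaluate(x_test, y_test)

    # The argument is documented as a prefix for the model filenames.
    model.save_model(filename_prefix)
    return metrics
```

With twgeo installed, the `model` argument would presumably be an instance created as `Model(use_tensorboard=True, batch_size=64)`, matching the constructor signature.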
build_model(num_outputs, time_steps=500, vocab_size=20000, hidden_layer_size=128)
Build the model.
Parameters:
- num_outputs – Number of output classes. For example, with Census regions the number of classes is 4.
- time_steps – Default: 500
- vocab_size – Use the top N most frequent words. Default: 20,000
- hidden_layer_size – Number of neurons in the hidden layers. Default: 128
Returns:
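The `vocab_size` and `time_steps` parameters suggest that input text is reduced to fixed-length sequences of word ids. The sketch below illustrates that general idea in plain Python. It is not twgeo's own preprocessing: reading `time_steps` as the sequence length is an assumption (the source gives no description), and `build_vocab` and `encode` are hypothetical names.

```python
from collections import Counter


def build_vocab(texts, vocab_size):
    """Map the vocab_size most frequent words to ids 1..vocab_size."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Id 0 is reserved for padding, so word ids start at 1.
    return {word: idx
            for idx, (word, _) in enumerate(counts.most_common(vocab_size), start=1)}


def encode(text, vocab, time_steps):
    """Convert text to exactly time_steps ids."""
    # Words outside the vocabulary are dropped.
    ids = [vocab[word] for word in text.lower().split() if word in vocab]
    # Truncate long input, then right-pad short input with zeros.
    ids = ids[:time_steps]
    return ids + [0] * (time_steps - len(ids))
```

Reserving id 0 for padding and dropping out-of-vocabulary words are choices made for this sketch; other schemes map unknown words to a dedicated id instead.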
evaluate(x_test, y_test)
Get the loss, accuracy, and top-5 accuracy of the model.
Parameters:
- x_test – Evaluation samples.
- y_test – Evaluation labels.
Returns: A dictionary of (metric, value) pairs.
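A dictionary of metric and value pairs can be assembled by zipping metric names with their scores. The sketch below assumes the underlying network is a Keras model, whose `evaluate` returns a list of scalars labelled by `metrics_names`; the source does not confirm this. The helper name, the metric keys, and the numbers are all hypothetical.

```python
def to_metric_dict(metric_names, metric_values):
    """Pair each metric name with its value."""
    if len(metric_names) != len(metric_values):
        raise ValueError("expected exactly one value per metric name")
    return dict(zip(metric_names, metric_values))


# Hypothetical names and scores, for illustration only.
report = to_metric_dict(["loss", "acc", "top_5_acc"], [1.25, 0.5, 0.75])
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```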
load_saved_model(filename)
Load a previously trained model from disk.
Parameters: filename – The H5 model file.
Returns:
predict(x)
Predict the location of the given samples.
Parameters: x – A vector of tweets. Each row corresponds to a single user.
Returns: The prediction results.
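The source does not specify the format of the prediction results. If they arrive as one row of per-class scores for each user (an assumption), they could be mapped back to location names as sketched below. `decode_predictions` and the ordering of `CENSUS_REGIONS` are hypothetical.

```python
# Assumed label order; the real mapping depends on how labels were encoded.
CENSUS_REGIONS = ["Northeast", "Midwest", "South", "West"]


def decode_predictions(scores, class_names):
    """Return the name of the highest-scoring class for each row."""
    decoded = []
    for row in scores:
        # Index of the largest score in this row (first one wins on ties).
        best = max(range(len(row)), key=row.__getitem__)
        decoded.append(class_names[best])
    return decoded


example_scores = [[0.1, 0.2, 0.6, 0.1],
                  [0.7, 0.1, 0.1, 0.1]]
print(decode_predictions(example_scores, CENSUS_REGIONS))
```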
save_model(filename)
Save the current model and trained weights for later use.
Parameters: filename – Prefix for the model filenames.
train(x_train, y_train, x_dev, y_dev, epochs=7, reset_model=False)
Fit the model to the training data.
Parameters:
- x_train – Training samples.
- y_train – Training labels. Must be a vector of integer values.
- x_dev – Validation samples.
- y_dev – Validation labels. Must be a vector of integer values.
- epochs – Number of passes over the whole training set. Default: 7
- reset_model – If True, discard any previously trained model and start from scratch. Default: False
Returns:
Raises: ValueError – If the number of training samples and the number of labels do not match.
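Because `y_train` and `y_dev` must be integer vectors, string labels such as region names need to be encoded before training. The sketch below shows one way to do that, plus a length check that mirrors the documented `ValueError` condition. Both helpers are hypothetical and are not taken from twgeo.

```python
def encode_labels(labels):
    """Map each distinct label to an integer id, in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    # Return the encoded vector and the class list needed to decode it.
    return [index[name] for name in labels], classes


def check_lengths(samples, labels):
    """Raise ValueError if samples and labels differ in count."""
    if len(samples) != len(labels):
        raise ValueError(
            f"got {len(samples)} samples but {len(labels)} labels"
        )


y_train, classes = encode_labels(["West", "South", "West", "Midwest"])
check_lengths(["tweets a", "tweets b", "tweets c", "tweets d"], y_train)
print(y_train, classes)
```

Running the check before calling `train` surfaces a mismatch early, with a message that names both counts.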
twgeo.models.geomodel.top_5_acc(y_true, y_pred)
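The source gives no description for `top_5_acc`, but the name suggests top-5 accuracy: the fraction of samples whose true class is among the five highest-scoring predictions. The plain-Python sketch below illustrates that metric under the assumption of integer labels and per-class score rows. `top_k_accuracy` is a hypothetical name, and this is not the library's implementation.

```python
def top_k_accuracy(y_true, y_scores, k=5):
    """Fraction of samples whose true label is among the k best scores."""
    if len(y_true) != len(y_scores):
        raise ValueError("y_true and y_scores must have the same length")
    if not y_true:
        return 0.0
    hits = 0
    for label, row in zip(y_true, y_scores):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(y_true)
```

With four Census regions, top-5 accuracy is always 1.0 because every class falls within the top five, so the metric is only informative when there are more than five output classes.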