Models

twgeo.models.geomodel

class twgeo.models.geomodel.Model(use_tensorboard=True, batch_size=64)

Bases: object

Geolocation prediction model. Consists of four layers (Embedding, LSTM, LSTM, and Dense).

Parameters:
  • use_tensorboard – Track training progress using TensorBoard. Default: True
  • batch_size – Number of samples per training batch. Default: 64
build_model(num_outputs, time_steps=500, vocab_size=20000, hidden_layer_size=128)

Build the model.

Parameters:
  • num_outputs – Number of output classes. For example, there are 4 classes when predicting U.S. Census regions.
  • time_steps – Default: 500
  • vocab_size – Use the top N most frequent words. Default: 20,000
  • hidden_layer_size – Number of neurons in the hidden layers. Default: 128
Returns:

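The vocab_size parameter keeps only the top N most frequent words. The sketch below illustrates that idea in plain Python. It is not the library's tokenizer: the helper names build_vocab and encode, the whitespace tokenization, and the use of index 0 for out-of-vocabulary words are all assumptions made for illustration.

```python
from collections import Counter


def build_vocab(texts, vocab_size):
    """Map the vocab_size most frequent words to indices 1..vocab_size.

    Hypothetical helper: lowercases and splits on whitespace, which may
    differ from the preprocessing the library actually performs.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    top_words = counts.most_common(vocab_size)
    return {word: i for i, (word, _) in enumerate(top_words, start=1)}


def encode(text, vocab):
    """Replace each word with its index; 0 marks out-of-vocabulary words."""
    return [vocab.get(word, 0) for word in text.lower().split()]


texts = ["the coffee in seattle", "the bbq in austin", "the rain in seattle"]
vocab = build_vocab(texts, vocab_size=3)
encoded = encode("the rain in seattle", vocab)
```

With vocab_size=3, only "the", "in", and "seattle" receive their own indices, and every other word collapses to 0.
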
evaluate(x_test, y_test)

Get the loss, accuracy, and top-5 accuracy of the model.

Parameters:
  • x_test – Evaluation samples.
  • y_test – Evaluation labels.
Returns:

A dictionary mapping each metric name to its value.

load_saved_model(filename)

Load a previously trained model from disk.

Parameters: filename – Path to the saved H5 model file.
Returns:
predict(x)

Predict the location of the given samples.

Parameters: x – A vector of tweets. Each row corresponds to a single user.
Returns: The prediction results.
save_model(filename)

Save the current model and trained weights for later use.

Parameters: filename – Prefix for the model filenames.
train(x_train, y_train, x_dev, y_dev, epochs=7, reset_model=False)

Fit the model to the training data.

Parameters:
  • x_train – Training samples.
  • y_train – Training labels. Must be a vector of integer values.
  • x_dev – Validation samples.
  • y_dev – Validation labels. Must be a vector of integer values.
  • epochs – Number of passes over the whole training set. Default: 7
  • reset_model – If True, discard any previously trained model and start from scratch. Default: False
Returns:

Raises:

ValueError: If the number of training samples and the number of labels do not match.
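
The two requirements above (integer label vectors, and matching sample and label counts) can be checked before calling train. The sketch below is a minimal illustration under stated assumptions: encode_labels and check_lengths are hypothetical helpers, not part of twgeo, and the error message is invented for the example.

```python
def encode_labels(labels):
    """Map each distinct label to an integer in 0..num_classes-1.

    Hypothetical helper: classes are numbered in sorted order.
    """
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def check_lengths(samples, labels):
    """Raise ValueError when the sample and label counts differ."""
    if len(samples) != len(labels):
        raise ValueError(
            f"Got {len(samples)} samples but {len(labels)} labels."
        )


x_train = ["tweets of user a", "tweets of user b", "tweets of user c"]
y_names = ["West", "South", "West"]

y_train, classes = encode_labels(y_names)
check_lengths(x_train, y_train)  # passes silently: 3 samples, 3 labels
```

Keeping the classes list makes it possible to translate predicted integers back into region names later.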

twgeo.models.geomodel.top_5_acc(y_true, y_pred)
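
The source gives no description of top_5_acc. Its name, together with the "top 5 accuracy" reported by evaluate, suggests the usual top-k metric: a prediction counts as correct when the true class is among the k highest-scoring classes. The sketch below shows that metric in plain Python. It assumes integer labels and one list of per-class scores per sample; top_k_accuracy is a hypothetical name and this is not the library's implementation.

```python
def top_k_accuracy(y_true, y_scores, k=5):
    """Fraction of samples whose true class is among the k highest scores.

    Assumes y_true holds integer class indices and each row of y_scores
    holds one score per class.
    """
    if len(y_true) != len(y_scores):
        raise ValueError("y_true and y_scores must have the same length.")
    hits = 0
    for label, scores in zip(y_true, y_scores):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(y_true)


y_true = [0, 3]
y_scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.10],  # class 0 has the lowest score
    [0.30, 0.25, 0.20, 0.15, 0.05, 0.05],  # class 3 has the fourth-highest score
]
top5 = top_k_accuracy(y_true, y_scores, k=5)
top1 = top_k_accuracy(y_true, y_scores, k=1)
```

Top-k accuracy is a common companion to plain accuracy when there are many classes, since it credits predictions that come close to the right answer.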

Module contents