Text Models


Twitter is perhaps the social media platform most amenable to research. Obtaining information requires only a few steps, and plenty of libraries can help in this regard. Nonetheless, determining whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This library aims to make it easier for interested researchers to mine events on Twitter by opening a collection of processed information gathered from Twitter since December 2015. The events could be related to natural disasters, health issues, and people’s mobility, among other topics that can be studied with the library. In summary, the Python library retrieves daily frequencies of words and word bi-grams for the Arabic, English, Spanish, and Russian languages (see Vocabulary), as well as mobility information on the number of trips between locations for more than 200 countries or territories (see Mobility).
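
As a rough illustration of the kind of aggregated data described above, the following sketch builds per-day counts of words and word bi-grams, and per-day counts of trips between locations, from a few made-up records. It uses only the Python standard library and does not call text_models; the function names (daily_token_counts, daily_trip_counts), the "~" bi-gram separator, and the sample records are assumptions made for this example, not the library's API or data.

```python
from collections import Counter
from datetime import date


def daily_token_counts(tweets):
    """Map each day to a Counter of words and word bi-grams.

    `tweets` is an iterable of (day, text) pairs (hypothetical input format).
    """
    by_day = {}
    for day, text in tweets:
        words = text.lower().split()
        # Join consecutive words with "~" to form bi-gram tokens (assumed separator).
        bigrams = [f"{a}~{b}" for a, b in zip(words, words[1:])]
        by_day.setdefault(day, Counter()).update(words + bigrams)
    return by_day


def daily_trip_counts(trips):
    """Map each day to a Counter of (origin, destination) pairs.

    `trips` is an iterable of (day, origin, destination) triples (hypothetical).
    """
    by_day = {}
    for day, origin, destination in trips:
        by_day.setdefault(day, Counter())[(origin, destination)] += 1
    return by_day


# Made-up sample records, for illustration only.
tweets = [
    (date(2020, 2, 14), "happy valentines day"),
    (date(2020, 2, 14), "happy day"),
    (date(2020, 2, 15), "rainy day"),
]
trips = [
    (date(2020, 2, 14), "MX", "US"),
    (date(2020, 2, 14), "MX", "US"),
    (date(2020, 2, 14), "US", "CA"),
]

token_counts = daily_token_counts(tweets)
trip_counts = daily_trip_counts(trips)

print(token_counts[date(2020, 2, 14)].most_common(2))
print(trip_counts[date(2020, 2, 14)])
```

Keying both structures by day mirrors the description above: an event would show up as a change in a token's frequency, or in the number of trips between two locations, from one day to the next.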

The library is described in: A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin–destination information. Mario Graff, Daniela Moctezuma, Sabino Miranda-Jiménez, Eric S. Tellez. Computers & Geosciences, Volume 159, February 2022.

Quickstart Guide

We provide a live quickstart guide, in the form of a notebook, that covers the installation of text_models and its use to retrieve mobility and text information. The notebook can be found in the docs directory on GitHub.


If you find text_models useful for any academic/scientific purpose, we would appreciate citations to the following reference:

title = {A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin–destination information},
journal = {Computers & Geosciences},
volume = {159},
pages = {105012},
year = {2022},
issn = {0098-3004},
doi = {https://doi.org/10.1016/j.cageo.2021.105012},
url = {https://www.sciencedirect.com/science/article/pii/S0098300421002946},
author = {Mario Graff and Daniela Moctezuma and Sabino Miranda-Jiménez and Eric S. Tellez},
keywords = {Twitter exploratory analysis, Mobility patterns, Open-source Python library},

Table of Contents