.. _place: Mobility ================================== .. image:: https://github.com/INGEOTEC/text_models/actions/workflows/test.yaml/badge.svg :target: https://github.com/INGEOTEC/text_models/actions/workflows/test.yaml .. image:: https://badge.fury.io/py/text-models.svg :target: https://badge.fury.io/py/text-models .. image:: https://coveralls.io/repos/github/INGEOTEC/text_models/badge.svg?branch=develop :target: https://coveralls.io/github/INGEOTEC/text_models?branch=develop .. image:: https://dev.azure.com/conda-forge/feedstock-builds/_apis/build/status/text_models-feedstock?branchName=main :target: https://dev.azure.com/conda-forge/feedstock-builds/_build/latest?definitionId=16894&branchName=main .. image:: https://img.shields.io/conda/vn/conda-forge/text_models.svg :target: https://anaconda.org/conda-forge/text_models .. image:: https://img.shields.io/conda/pn/conda-forge/text_models.svg :target: https://anaconda.org/conda-forge/text_models .. image:: https://readthedocs.org/projects/text-models/badge/?version=latest :target: https://text-models.readthedocs.io/en/latest/?badge=latest :alt: Documentation Status .. image:: https://colab.research.google.com/assets/colab-badge.svg :target: https://colab.research.google.com/github/INGEOTEC/text_models/blob/develop/docs/Mobility.ipynb This module deals with the process of measuring human mobility through Twitter's data. It processes the information provided by Twitter and provides the displacement in different ways, such as the number of travels in an origin-destination matrix, the overall mobility, and the outward mobility. To illustrate the library's use, let us produce mobility plots on the period contemplating from February 15, 2020, to July 12, 2020. The following code retrieved the mobility information on the specified period. >>> from text_models import Mobility >>> start = dict(year=2020, month=7, day=12) >>> end = dict(year=2020, month=2, day=15) >>> mob = Mobility(start, end=end) Let us start presenting mobility as the number of travels in Mexico, Canada, and Saudi Arabia. The following code computes the mobility in all the countries. The first line counts the trips that occurred within the country as well as the inward and outward movement. The information is arranged in a DataFrame or a dictionary, depending on whether the pandas' flag is activated. The second line generates the plot for the countries of interest, i.e., Mexico (MX), Canada (CA), Saudi Arabia (SA). >>> data = mob.overall(pandas=True) >>> data[["MX", "CA", "SA"]].plot() .. image:: mx-ca-sa.png An approach to transforming the mobility information from a number of trips into a percentage is by using a baseline period. The baseline statistics can be computed using different procedures; one uses the weekday, and the other uses a clustering algorithm, particularly k-means. The text_models library has two classes; one computes the percentage using weekday information, namely :py:class:`text_models.place.MobilityWeekday`, and the other using a clustering algorithm, i.e., :py:class:`text_models.place.MobilityCluster`. The following code computes the percentage using the weekday information; the code is similar to the one used to produce the previous figure being the only difference the class used. >>> from text_models import MobilityWeekday >>> mob = MobilityWeekday(start, end=end) >>> data = mob.overall(pandas=True) >>> data[["MX", "CA", "SA"]].plot() .. image:: mx-ca-sa-perc.png So far, we have seen a procedure to compute the mobility using the number of travels and the percentage in different countries. In order to complement the approach, let us compute the mobility between Mexico’s states. The code is similar to the one used in the previous examples, being the difference that one needs to provide a function (argument :py:attr:`level`) that transforms the landmark identifier into a state identifier and filters the landmarks that do not correspond to Mexico. >>> data = mob.overall(level=lambda x: mob.state(x) if x[:3] == "MX:" else None, pandas=True) Let us create a heat map to represent the mobility of all states into one figure. The first step is to resample the information to present average mobility in a week. Then, the mobility information is transposed to represent the states as rows and the weeks as columns. >>> dd = data.resample("W").mean() >>> unico = dd.T >>> index = unico.index.to_numpy() >>> columns = unico.columns >>> unico = unico.to_numpy() In order to write the state name instead of the identifier, we use the following class. >>> from text_models.place import States >>> states = States() It is time to create the heatmap; the following code creates the heatmap. >>> from matplotlib import pylab as plt >>> fig = plt.figure(figsize=(8, 6), dpi=300) >>> ax = fig.subplots() >>> _ = ax.imshow(unico, cmap="viridis") >>> cbar = fig.colorbar(_, ax=ax, orientation="vertical", shrink=1) >>> cbar.set_label('Percentage', rotation=90) >>> _ = plt.yticks(range(index.shape[0]), [states.name(x) for x in index]) >>> _ = plt.xticks(range(len(columns)), [x.weekofyear for x in columns]) >>> plt.xlabel("Week of the year") >>> plt.tight_layout() >>> plt.savefig("heatmap.png") .. image:: heatmap.png To complement the overview of the information that can be obtained from this module, we refer the reader to the `notebook `_. Using your Tweets ------------------------------------- The previous steps assumed the use of the mobility data collected and transform by ingeotec. However, sometimes one would like to use the algorithms developed in data collected with different characteristics. Let us assume the tweets are in a file called "tweets.json.gz"; the format is one JSON per line. In the tests, it is available some collected tweets to make this example self-contained. These tweets are on the following path. >>> from text_models.tests import test_place >>> from os.path import join >>> DIR = test_place.DIR >>> fname = join(DIR, "tweets.json.gz") Then to create the origin-destination matrix used by :py:class:`text_models.place.Mobility`, the following code can be used. >>> from text_models.place import OriginDestination >>> ori_dest = OriginDestination(fname) >>> ori_dest.compute("210604.travel") It is also possible to use a list of files instead of just one, so it is acceptable that the parameter :attr:`fname` would be a list. Furthermore, it might be the case that the file has a different format, so it is also possible to give a function (:attr:`reader`) that returns an iterable object where each element is a dictionary with the same format used by Twitter. The last part is to use the origin-destination matrix (i.e., "210604.travel") in the :py:class:`text_models.place.Mobility`. To do so, it is needed to replace the method used to find the mobility information, this is provided by the parameter :attr:`data`. The following code illustrates this process. >>> from text_models.place import Mobility >>> data = lambda x: join(".", x) >>> mob = Mobility(day=dict(year=2021, month=6, day=4), window=1, data=data) >>> dd = mob.overall(pandas=True) :mod:`text_models.place` ------------------------------------- .. automodule:: text_models.place :members: