Digital Landscape Traces: Inferring User Origin (Local vs. Tourist) ¶
Alexander Dunkel, Leibniz Institute of Ecological Urban and Regional Development,
Transformative Capacities & Research Data Centre (IÖR-FDZ)
This notebook documents the data preparation process for differentiating between "locals" and "tourists" based on their social media activity. The primary goal is to determine a probable home location for each user who has posted from within Germany. This allows us to classify them and subsequently analyze how these different groups interact with and perceive German landscapes.
The workflow involves several key stages:
- Identify Active Users: We begin by isolating the 6.3 million users who have posted at least once from Germany out of a global dataset of 127.6 million users.
- Gather Global Activity: For this cohort of users, we retrieve the locations of all their posts worldwide. A user's complete spatial footprint is necessary to reliably infer a home location.
- Home Location Inference: We then process this location history. By clustering the spatiotemporal data (specifically, "user-days," which represent a unique day a user was active in a location), we can identify the most probable home location for each individual.
- Classification: Based on whether a user's inferred home is inside or outside Germany, we classify them as a "local" or a "tourist." Users with insufficient data or ambiguous patterns are classified as "undetermined."
- Privacy-Preserving Export: Finally, the classified user data is aggregated and exported. To protect user privacy while enabling reproducibility, we use privacy-enhancing techniques, such as HyperLogLog (HLL), to create a dataset that supports aggregated mapping and analysis without exposing individual-level data.
Prepare environment¶
To run this notebook, you have two options:
1. Create an environment with the packages and versions shown in the following cell.
As a starting point, you may use the latest conda environment_default.yml from our Carto-Lab Docker container.
2. If Docker is available to you, we suggest using the Carto-Lab Docker Container.
Clone the repository and edit your .env to point to the repository where this notebook can be found, e.g.:
git clone https://github.com/ioer-dresden/carto-lab-docker.git
cd carto-lab-docker
cp .env.example .env
nano .env
## Enter:
# JUPYTER_NOTEBOOKS=~/notebooks/ephemeral_events
# TAG=v0.12.3
docker network create lbsn-network
docker compose pull && docker compose up -d
Load dependencies:
import os, sys
from pathlib import Path
import psycopg2
import geopandas as gp
import pandas as pd
import matplotlib.pyplot as plt
from typing import List, Tuple, Dict, Optional
from IPython.display import clear_output, display, HTML
To reduce the code shown in this notebook, some helper methods are made available in a separate file.
Load helper module from ../py/modules/base/tools.py.
module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
sys.path.append(module_path)
from modules.base import tools
Activate autoreload of changed python files:
%load_ext autoreload
%autoreload 2
Parameters¶
Define initial parameters that affect processing
A 5-character geohash corresponds to a grid cell of roughly 5 km × 5 km. That is precise enough to distinguish neighborhoods but not individual buildings. Reduce the precision to increase privacy; increase it to improve granularity.
# Initial accuracy to reduce granularity
# of spatial data on hll convert
GEOHASH_PRECISION = 5
WORK_DIR = Path.cwd().parents[0] / "tmp" # Working directory
OUTPUT = Path.cwd().parents[0] / "out" # Define path to output directory (figures etc.)
for folder in [WORK_DIR, OUTPUT]:
folder.mkdir(exist_ok=True)
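As a quick sanity check of the chosen precision, the approximate cell dimensions can be derived from the geohash bit layout (5 bits per character, alternating between longitude and latitude). The snippet below is a minimal sketch; the kilometre conversion assumes roughly 111 km per degree, which holds at the equator and shrinks with latitude for the east-west extent.
import math

def geohash_cell_size(precision: int) -> tuple:
    """Approximate geohash cell size (width, height) in degrees."""
    bits = precision * 5
    lon_bits = math.ceil(bits / 2)  # longitude gets the extra bit for odd totals
    lat_bits = bits // 2
    return 360 / 2**lon_bits, 180 / 2**lat_bits

width_deg, height_deg = geohash_cell_size(GEOHASH_PRECISION)
print(f"Geohash-{GEOHASH_PRECISION} cell: "
      f"~{width_deg * 111:.1f} km × {height_deg * 111:.1f} km (at the equator)")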
Load environment variables from the .env dotfile:
from dotenv import load_dotenv
load_dotenv(
Path.cwd().parents[0] / '.env', override=True)
DB_NAME_RAWDB = os.getenv("DB_NAME_RAWDB") # lbsn-rawdb name
DB_HOST_RAWDB = os.getenv("DB_HOST_RAWDB") # lbsn-rawdb host
Initial Query for DE posts¶
First, create a table with the Geometry for Germany
CREATE TABLE spatial.germany_boundary (name VARCHAR(255), geom GEOMETRY(Polygon, 4326));
INSERT INTO spatial.germany_boundary (name, geom) VALUES ('Germany', ST_SetSRID(ST_GeomFromText(
'SRID=4326; Polygon ((9.1433304674271767 54.87302263558621007, 9.29480994002676653 54.80168366606841346, 9.59188580908289623 54.8869844880437654, 9.89410309238382979 54.84183198146081395, 10.59215134630761668 54.51989772869728768, 11.14133007712143808 54.57629296235040783, 11.64437791630786023 54.33073223410536912, 12.16520515899094335 54.38463668923185423, 12.32216612986303517 54.57714431725651139, 13.06899846071041793 54.84973594845530442, 13.42317610499327429 54.90912276250554669, 13.6613590011170345 54.86449207854661836, 14.05716348310829744 54.61181819135651949, 14.06966560638267083 54.27763627849151362, 14.16803558587361067 54.23880270088397992, 14.24221454696379396 53.98771964604486584, 14.18527042109309377 53.91197215758904804, 14.45052538304003065 53.26225137859830738, 14.34852597989009837 53.05471971874010251, 14.1436577746865737 52.96136543546651865, 14.12288246055715035 52.83765843329013023, 14.63909584120017371 52.57299936662752771, 14.53435722180836365 52.39500777578825819, 14.57599486577049674 52.28835922389913549, 14.71568838750914665 52.23588560993300689, 14.68153710045686466 52.11665381861888591, 14.75899936762675679 52.06476543577628036, 14.59014437256479368 51.82100998772079947, 14.75750455139399264 51.66150559111019902, 14.72911862642354208 51.53143962124158861, 14.97434511713350958 51.44213090221938955, 15.04181552066978611 51.27421588035866762, 14.9793136600973753 51.07703822223624002, 14.76638305793181871 50.81933252297761783, 14.61880015620226914 50.85780455873344863, 14.65014207539650215 50.9315402808593376, 14.56403797087568819 50.91849758409625792, 14.59917754360878916 50.98711099987281159, 14.50821195321913137 51.04315459410929634, 14.30164070725288639 51.05495050765983933, 14.25854303463341921 50.98751826955253819, 14.3876350063505356 50.89908797413073671, 13.90038356464981462 50.79324026222838029, 13.85526158645308215 50.72711999918868742, 13.55215096069832725 50.71383131086633966, 13.46562195618838587 50.60232153031444113, 13.37098295180180685 50.65054589437755794, 13.19525624507969042 50.50326319983685153, 13.03213105020785179 50.50996067954179125, 12.948144730395029 50.404311645865846, 12.81939501546764859 50.4597950326146929, 12.51204793536396664 50.39725816862960528, 12.33108448895802667 50.24271404931397456, 12.33320130239820855 50.17155191349007737, 12.18445310862409769 50.32232471681226116, 12.10019899948429156 50.31832794437917755, 12.2014080343533351 50.10849193455516115, 12.54782478838330917 49.92026953740929684, 12.40074313458802635 49.75553183832374771, 12.52185046292856896 49.6865330432653991, 12.65587915330036139 49.43453810554916572, 13.02921064395610529 49.30451982223206642, 13.40289555288751444 48.9873427339800287, 13.62827657000775616 48.94930714460906529, 13.83973263295467859 48.77148568211856627, 13.82569624828329324 48.61841526565206095, 13.73062545330208195 48.51446092253360121, 13.50879208898379602 48.59022953606427109, 13.32976817266307989 48.32359206612102298, 12.75814504086099532 48.12619261719555652, 13.00253663695627893 47.85328766219589625, 12.90516623683242869 47.72360115407008152, 13.08022144406442067 47.68711627559571298, 13.04750486166005885 47.49208302571111062, 12.80384823830559071 47.54988690075992963, 12.78106185032419262 47.67414247229731927, 12.49888118252061986 47.62494554180335626, 12.44012646416501866 47.69517254637594306, 12.2551562061939876 47.67929967054340068, 12.25707210839112804 47.74283907176578623, 12.16257153215053677 47.70112119928398897, 12.20320083824810808 47.60655540780589945, 11.63551183390404731 
47.5944925556869407, 11.2728918533763931 47.39785256984026773, 10.9703287861445915 47.40002058258386342, 10.89015587599294577 47.53719340445212538, 10.45445953557913299 47.55573834235065078, 10.43647213957763142 47.38064780345342797, 10.17419188314892153 47.27020119333047887, 10.23594858525905238 47.38181776567739689, 10.09978048238338033 47.35473600246461956, 10.09136801981833287 47.45875007585539151, 9.97084547833622992 47.54568017799789459, 9.77728950610588754 47.59490289205490399, 9.56184471246353951 47.50455693005835656, 9.25583223976610725 47.65909750244782117, 8.89535179764737904 47.64909162917541607, 8.80644066260651925 47.73826225010282087, 8.79566886012986515 47.675586000193789, 8.72795487162005657 47.69268014007644751, 8.65697507693829493 47.80034710313634605, 8.40439955129829741 47.69805376287092713, 8.62888483466741718 47.65174054269192538, 8.43714312685386858 47.56708343205480105, 8.20607236264947915 47.62099989582856097, 7.67923753718264379 47.53277577987836366, 7.63407099577989356 47.5611040421179041, 7.69355335740158353 47.60053205810805821, 7.60448410715588352 47.57779110297707348, 7.52107927548820498 47.66384788966843189, 7.6221212397035174 47.97279107394894027, 7.5769816401905814 48.11905638794200968, 7.74340888840953312 48.32546477927621709, 7.80214138631595233 48.58884316637909251, 8.23263282034844224 48.96657144521606142, 7.93564809218577238 49.05765511479177121, 7.63120134636511693 49.0548857056109, 7.44564667835157934 49.18415102884694079, 7.29317690469042645 49.11490732861039987, 7.05244591624608574 49.11275784369956909, 6.93810712304531307 49.22243763608810241, 6.73839761345914212 49.16359779690088772, 6.55081584387613702 49.42537952404586576, 6.36703072508794321 49.4694830820786251, 6.35758927802282869 49.57347554655927979, 6.52992601637259895 49.80704757898263324, 6.32226166784175803 49.8392608441127436, 6.11309021923830187 50.06076301060448941, 6.17568740345097922 50.23541597716564411, 6.40502825514988672 50.32330869294190734, 6.34221872841163758 50.38000552503501694, 6.37487303006633965 50.45124892989423415, 6.1984988852154288 50.52836615777204088, 6.26559804771176232 50.64270602206534022, 5.97480474174545861 50.7979707384588437, 6.09389718580655426 50.92131540468221829, 5.89752511249400868 50.97501346989577087, 5.86687398774438407 51.04838894699236107, 6.17541526680116704 51.15847977272341041, 6.08215105872443473 51.17179911623628641, 6.07213427667556971 51.24245499220415923, 6.22612991836109586 51.36051764507328699, 6.21230003242061457 51.51361357530629448, 6.09123534633420149 51.60590930886758088, 6.11820963876385804 51.65610310637333669, 5.95510381438958802 51.73828468199252484, 5.94515322244330946 51.82451729513712735, 6.1660793929496549 51.8409024535854428, 6.10334125709340469 51.89261697229551373, 6.15511712665414734 51.90541782067595022, 6.40506434674227876 51.82740354243958336, 6.3907900554168009 51.8742190043036544, 6.72166407024070622 51.89617477214807195, 6.82851309630240166 51.96406670771244762, 6.69507834878680796 52.07026212539682319, 7.0609954894503062 52.23478240438453213, 7.07243800091305275 52.37311639220780535, 6.98785513290812332 52.46916016640858516, 6.69772000387513344 52.48634521246162876, 6.6814024699480683 52.55359869772178172, 6.76649011302833969 52.56230612083083997, 6.70973231049777041 52.62782351794106717, 6.75214101519571841 52.6482893402808827, 7.05100158975159275 52.63994356514348283, 7.08700963868392364 52.85051212403533327, 7.21693779014697157 53.0067232392861456, 7.19205451590323719 53.3143070866028097, 6.9316102596058613 
53.33411061581233525, 6.88866268113787328 53.44247621270710624, 6.68362781717007692 53.49664088127173045, 6.34585998714874222 53.72450141470858398, 6.72671131827274849 53.86254353925301075, 7.72794138558651866 53.99333519546297566, 7.53412317824211186 54.13912000258608259, 7.58474754686181996 54.31008616784242804, 7.84582089511826553 54.39292727477396738, 8.16128570027490241 54.33250950670844759, 7.93236566924031195 54.73415971779678557, 8.05011711081607473 55.09845306415763844, 8.47261129387618439 55.05409929404726199, 8.55470193184874006 54.92094757397143212, 9.1433304674271767 54.87302263558621007))'
), 4326));
To speed up, create a spatial index:
CREATE INDEX idx_germany_boundary_geom ON spatial.germany_boundary USING GIST (geom);
ANALYZE spatial.germany_boundary;
The original big query to select all posts for Germany is shown below. This must be executed directly via postgres terminal, ideally with a terminal multiplexer.
CREATE MATERIALIZED VIEW mviews.all_posts_de AS
SELECT
    p.origin_id,
    p.user_guid,
    p.post_latlng,
    ST_Y(p.post_latlng) AS latitude,
    ST_X(p.post_latlng) AS longitude,
    p.post_create_date,
    p.post_publish_date,
    p.post_body,
    p.hashtags,
    p.emoji
FROM
topical.post AS p
JOIN
spatial.germany_boundary AS de
ON
ST_Intersects(p.post_latlng, de.geom)
To also show posts outside of Germany as surrounding context (without classification and with reduced opacity), the following query was used:
COPY (
WITH de_geometries AS (
SELECT
geom::geometry AS germany_geom
FROM
spatial.germany_boundary
LIMIT 1
),
expanded_bbox AS (
SELECT
ST_Buffer(
(ST_Envelope(germany_geom))::geography,
100000
)::geometry AS buffered_geom,
germany_geom
FROM
de_geometries
)
SELECT
ST_Y(p.post_latlng) AS latitude,
ST_X(p.post_latlng) AS longitude
FROM
topical.post AS p,
expanded_bbox
WHERE
ST_Intersects(p.post_latlng, expanded_bbox.buffered_geom)
AND NOT ST_Intersects(p.post_latlng, expanded_bbox.germany_geom)
)
TO '/pg_exp/2025-10-18_DE_outside_buffer_coords.csv'
WITH (FORMAT CSV, HEADER, ENCODING 'UTF-8');
Connecting to the Databases¶
Our setup uses two PostgreSQL databases. The first (lbsn-rawdb) contains the original, sensitive social media data. The second (hlldb) is our local working database, where we will perform calculations and store aggregated, privacy-preserving results using the HyperLogLog (HLL) extension. This separation ensures that the raw data remains isolated and secure.
db_user = "postgres"
db_pass = os.getenv('POSTGRES_PASSWORD')
db_host = "hlldb"
db_port = "5432"
db_name = "hlldb"
db_connection_hll = psycopg2.connect(
host=db_host,
port=db_port,
dbname=db_name,
user=db_user,
password=db_pass
)
db_conn_hll = tools.DbConn(db_connection_hll)
cur_hll = db_connection_hll.cursor()
cur_hll.execute("SELECT 1;")
print(cur_hll.statusmessage)
Simplify query access:
db_conn = tools.DbConn(db_connection_hll)
db_conn.query("SELECT 1;")
Create Query Schema¶
Create a new schema called mviews and update Postgres search_path parameter, to include new schema:
sql_query = """
CREATE SCHEMA IF NOT EXISTS mviews;
ALTER DATABASE hlldb
SET search_path = "$user",
social,
spatial,
temporal,
topical,
interlinkage,
extensions,
mviews;"""
Since the above query will not return any result, we'll directly use the psycopg2 cursor object:
cur = db_connection_hll.cursor()
cur.execute(sql_query)
print(cur.statusmessage)
To access the raw post data without copying it into our working database, we use a PostgreSQL feature called a Foreign Data Wrapper (FDW). This creates a secure, read-only link from hlldb to the tables in lbsn-rawdb. This approach is highly efficient for large datasets and maintains a clear separation between raw and processed data.
On hlldb, install postgres_fdw extension:
sql_query = """
CREATE EXTENSION IF NOT EXISTS postgres_fdw SCHEMA extensions;
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
MVIEWS_REF = "all_posts_de"
Check if foreign table has been imported already:
sql_query = f"""
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_schema = 'mviews'
AND table_name = '{MVIEWS_REF}'
);
"""
result = db_conn_hll.query(sql_query)
result["exists"][0]
Conditionally load the password - this only needs to be done once, if the server hasn't been added before.
Enter the password for pg user lbsn_reader
USER_KEY = None
if not result["exists"][0]:
import getpass
USER_KEY = getpass.getpass()
Create Foreign Server connection to rawdb, on hlldb:
sql_query = f"""
CREATE SERVER IF NOT EXISTS lbsnraw
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
host '{DB_HOST_RAWDB}',
dbname '{DB_NAME_RAWDB}',
port '5432',
keepalives '1',
keepalives_idle '30',
keepalives_interval '10',
keepalives_count '5',
fetch_size '500000');
CREATE USER MAPPING IF NOT EXISTS for postgres
SERVER lbsnraw
OPTIONS (user 'lbsn_reader', password '{USER_KEY}');
"""
if not result["exists"][0]:
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
Optional cleanup step (deactivated cell):
If not result["exists"][0]: Import the foreign table definition on the hlldb.
sql_query = f"""
IMPORT FOREIGN SCHEMA mviews
LIMIT TO (
{MVIEWS_REF})
FROM SERVER lbsnraw
INTO mviews;
"""
# only import table
# if it hasn't been imported already
if not result["exists"][0]:
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
Test access to the imported foreign table:
db_conn.query(f"SELECT * FROM mviews.{MVIEWS_REF} LIMIT 10;")
If you see a permission denied for materialized view ..., grant the rights first:
GRANT SELECT ON ALL TABLES IN SCHEMA mviews TO lbsn_reader;
Commit changes to hlldb (deactivated, enable manually after check):
Preview samples¶
Origin IDs: 1 = Instagram, 2 = Flickr, 3 = Twitter, 4 = Facebook, 5 = Foursquare, 8 = Reddit, 23 = iNaturalist
cols = "origin_id, post_body, hashtags, emoji"
samples = db_conn.query(
f"""
(SELECT {cols} FROM mviews.{MVIEWS_REF} WHERE origin_id = 1 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF} WHERE origin_id = 2 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF} WHERE origin_id = 3 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF} WHERE origin_id = 23 LIMIT 10);
""")
from pandas import option_context
with option_context('display.max_colwidth', 400):
display(samples.head(40))
Extract all user posts globally for those users who have been active in Germany¶
Working with user data to extract home locations is challenging due to privacy conflicts. What we want is an optimal tradeoff between acceptable granularity of results (estimated home location) and the coarsest possible granularity of location information, to prevent compromising user privacy. The output only needs to be a three-fold classifier for each user who has been active in Germany:
1. tourist (home location outside of Germany)
2. local (home location inside Germany)
3. both (either not enough data available or ambiguous results)
We can store this output in an HLL shard or a raw list of user IDs and classifications. To reach this output, follow these steps:
1. Extract a raw list of all unique user IDs active in Germany.
2. Extract a raw list of all post or userday locations per user (use a Geohash of e.g. 6 to reduce spatial granularity).
   2a. Exclude all users with fewer than e.g. 50 posts and classify these IDs as "both".
3. For all other users, identify the most probable home location based on HDBSCAN clustering; store this as a lat/lng coordinate (a sketch follows below).
4. Classify each user as either local or tourist, based on a lat/lng intersection with Germany.
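To make step 3 more concrete, below is a minimal sketch of what such a per-user classification could look like. This is not the implementation used downstream: it assumes the optional hdbscan and shapely packages, a germany_polygon shapely geometry of the boundary created above, and illustrative parameter choices (min_cluster_size, the userday threshold).
import numpy as np
import hdbscan
from shapely.geometry import Point

def classify_user(userday_coords: np.ndarray, germany_polygon,
                  min_userdays: int = 30) -> str:
    """Classify one user as 'local', 'tourist' or 'both' from an
    array of (lat, lon) userday coordinates (illustrative sketch)."""
    if len(userday_coords) < min_userdays:
        return "both"                                  # too little data to decide
    labels = hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(userday_coords)
    if (labels >= 0).sum() == 0:
        return "both"                                  # only noise points, ambiguous
    top = np.bincount(labels[labels >= 0]).argmax()    # cluster with most userdays
    lat, lon = userday_coords[labels == top].mean(axis=0)
    home = Point(lon, lat)                             # shapely expects (x=lon, y=lat)
    return "local" if germany_polygon.contains(home) else "tourist"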
We can make use of HyperLogLog to a small degree. For instance, we could use HLL to aggregate userdays per Geohash for each user, giving us an immediate aggregation step that somewhat reduces the bulk of post data.
HyperLogLog parameters
The HyperLogLog extension for Postgres from Citus that we're using here contains several tweaks to optimize performance, which can affect the sensitivity of the data.
From a privacy perspective, for example, it is recommended to disable explicit mode.
Explicit mode? When explicit mode is active, full IDs will be stored for small sets. In our case, any coordinates frequented by few users (outliers) would store full user and post IDs.
To disable explicit mode:
db_conn_hll.query("SELECT hll_set_defaults(11, 5, 0, 1);")
Data Pseudonymization and Generalization¶
Before processing, we establish two crucial functions to protect user privacy, in line with ethical research standards.
1. Cryptographic Hashing for Pseudonymization The HyperLogLog algorithm uses MurmurHash, a non-cryptographic hashing function. While fast, it can be vulnerable to certain attacks (see Desfontaines et al. 2018). To mitigate this and prevent the re-identification of users, we add a robust layer of security. We use the pgcrypto extension to apply a seeded cryptographic hash (SHA-256 with HMAC) to all user identifiers.
This process converts a user's original ID into a secure, non-reversible pseudonym. The use of a secret key (or "salt") ensures that the same user ID will always produce the same pseudonym within our dataset, but this pseudonym cannot be reverse-engineered or matched against other datasets without the key.
2. Spatial Generalization with Geohash To reduce the spatial precision of location data, we snap GPS coordinates to a coarser grid using the Geohash algorithm. This function rounds coordinates to a specified precision, effectively aggregating points within a defined area. This protects privacy by obfuscating exact locations while retaining sufficient detail for neighborhood-level or regional analysis.
References:
Desfontaines, D., Lochbihler, A., & Basin, D. (2018). Cardinality Estimators do not Preserve Privacy. 1–21.
Create the pgcrypto extension:
sql_query = "CREATE EXTENSION IF NOT EXISTS pgcrypto SCHEMA extensions;"
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
Prepare the cryptographic hash function. The following function takes an id and a key (the seed value) and produces a new, unique hash that is returned in base64 encoding.
sql_query = """
/* Produce pseudonymized hash of input id with skey
* - using skey as seed value
* - sha256 cryptographic hash function
* - encode in base64 to reduce length of hash
* - remove trailing '=' from base64 string
* - return as text
*/
CREATE OR REPLACE FUNCTION
extensions.crypt_hash (id text, skey text)
RETURNS text
AS $$
SELECT
RTRIM(
ENCODE(
HMAC(
id::bytea,
skey::bytea,
'sha256'),
'base64'),
'=')
$$
LANGUAGE SQL
STRICT;
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
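For reference, the same pseudonymization can be reproduced in Python, which is handy for verifying the SQL function or for hashing single IDs outside the database. This is a sketch only; the IDs and keys shown are placeholders, not the values used in processing.
import hmac
import hashlib
import base64

def crypt_hash(id_: str, skey: str) -> str:
    """Python equivalent of extensions.crypt_hash:
    HMAC-SHA256 of id_ with key skey, base64-encoded, trailing '=' removed."""
    digest = hmac.new(skey.encode(), id_.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode().rstrip("=")

# The same ID yields a different pseudonym under a different key:
print(crypt_hash("12345678@N00", "example-key-1"))
print(crypt_hash("12345678@N00", "example-key-2"))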
Note: Cryptographic hashing alone will only produce pseudonymized data, since any id still relates to a single user (or post, etc.). Therefore, pseudonymization is considered a weak measure, which can be easily reversed, e.g. through rainbow tables or context lookup.
It is used here as an additional means to protect HLL sets from intersection attacks, e.g. as is discussed by Desfontaines et al. (2018).
What is a seed value? The seed value (skey) is a secret that is used, together with the encryption function (e.g. sha256) to produce a unique, collision-free output value (the hashed id). The same ID will be converted to a different hash if the seed value is changed. Therefore, in our case, the seed value must remain the same during the entire processing of data. In this case, the seed is called a key. This key can be destroyed afterwards, if no subsequent updates are necessary.
sql_query = """
/* Reduce spatial granularity of coordinates by GeoHash
- will keep Null Island unmodified
*/
CREATE OR REPLACE FUNCTION
extensions.geohash_reduce (IN coord geometry, geohash integer DEFAULT 5)
RETURNS geometry
AS $$
SELECT
CASE WHEN ST_Y(coord) = 0 AND ST_X(coord) = 0
THEN
coord
ELSE
ST_PointFromGeoHash(ST_GeoHash(coord, geohash), geohash)
END as "coord"
$$
LANGUAGE SQL
STRICT;
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
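The effect of extensions.geohash_reduce can be illustrated in plain Python: the coordinate is bisected bit by bit (longitude first, following the geohash specification) and snapped to the centre of the resulting cell. The sketch below is for illustration only; the actual reduction is done in PostGIS via ST_GeoHash and ST_PointFromGeoHash.
def geohash_reduce(lat: float, lon: float, precision: int = 5) -> tuple:
    """Snap a coordinate to the centre of its geohash cell
    (Null Island is kept unmodified, mirroring the SQL function)."""
    if lat == 0.0 and lon == 0.0:
        return lat, lon
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    even = True  # geohash bits alternate, starting with longitude
    for _ in range(precision * 5):
        if even:
            mid = (lon_lo + lon_hi) / 2
            lon_lo, lon_hi = (mid, lon_hi) if lon >= mid else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            lat_lo, lat_hi = (mid, lat_hi) if lat >= mid else (lat_lo, mid)
        even = not even
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2

print(geohash_reduce(51.0504, 13.7373))  # Dresden, snapped to a ~5 km cell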
Step 1: Extract Unique Users Active in Germany¶
The first step in our analysis is to create a definitive list of all users who have posted at least one time from within Germany. We create a materialized view for this list, mviews.unique_user_de, which improves performance in subsequent queries.
Optional cleanup step:
%%time
sql_query = f"""
CREATE MATERIALIZED VIEW IF NOT EXISTS mviews.unique_user_de AS
SELECT DISTINCT t1.origin_id, t1.user_guid
FROM mviews.all_posts_de t1;
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
Consistency check and validation: The exact count of 6,233,753 users confirms our earlier (abstracted) HLL cardinality estimate. We can also query the count, to check again:
db_conn_hll.query(f"SELECT count(user_guid) FROM mviews.unique_user_de;")
db_connection_hll.commit()
Step 2: Fetch Global Post History for German Users¶
With our list of Germany-active users, we now need to retrieve their entire global posting history. This is the most computationally intensive step, as it involves joining our list of ~6 million users against the main posts table containing hundreds of millions of entries.
The following query is executed directly on the lbsn-rawdb server for performance reasons. It performs several actions simultaneously:
- It joins the unique_user_de list with the main topical.post table.
- It applies the cryptographic hash function (extensions.crypt_hash) to the user_guid to pseudonymize the data at the source.
- It reduces the precision of the coordinates using our geohash_reduce function.
- It creates a userday identifier by concatenating the hashed user ID with the date (see the sketch below). This unique marker is central to our home location inference, as it represents a day of activity for a specific user.
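To illustrate the userday format (also visible in the sample output further down), here is a small self-contained sketch; the user ID and key are hypothetical.
import hmac, hashlib, base64
from datetime import date

uid, key = "12345678@N00", "example-key"  # hypothetical values
hashed_user = base64.b64encode(
    hmac.new(key.encode(), uid.encode(), hashlib.sha256).digest()).decode().rstrip("=")
userday = hashed_user + date(2017, 10, 19).strftime("%Y-%m-%d")
print(userday)  # '<43-character hash>2017-10-19'
# The date suffix is always 10 characters, so the hashed user_guid can later be
# recovered with left(userday, length(userday) - 10) in SQL (or slicing in Python):
assert userday[:-10] == hashed_user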
First, to speed up the calculation, we will add an index to the materialized view, since indexes are not inherited.
%%time
sql_query = f"""
CREATE INDEX idx_uniqueuser_userguid
ON mviews.unique_user_de (user_guid);
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
However, the join itself cannot be computed from hlldb: topical.post is a foreign table, and comparing ~800 million rows through the FDW wrapper is impractical.
byobu
docker exec -it lbsn-rawdb /bin/bash
psql -h localhost -p 5432 -U postgres rawdb
-- Enable timing:
\timing
-- test
select * from mviews.all_posts_de limit 1;
Therefore, execute both of the queries below directly on the foreign database, connected to the Postgres backend, as they may take a while to compute.
CREATE MATERIALIZED VIEW IF NOT EXISTS mviews.unique_user_de AS
SELECT DISTINCT t1.origin_id, t1.user_guid
FROM mviews.all_posts_de t1;
CREATE INDEX idx_uniqueuser_userguid
ON mviews.unique_user_de (user_guid);
Below is the "big" query:
CREATE MATERIALIZED VIEW IF NOT EXISTS mviews.posts_from_de_users AS
SELECT p.origin_id,
extensions.crypt_hash(p.user_guid, '[redacted]') AS user_guid,
ST_Y(extensions.geohash_reduce(p.post_latlng, {GEOHASH_PRECISION})) AS latitude,
ST_X(extensions.geohash_reduce(p.post_latlng, {GEOHASH_PRECISION})) AS longitude,
(extensions.crypt_hash(p.user_guid, '[redacted]') || to_char(COALESCE(NULLIF(p.post_create_date, ''), p.post_publish_date), 'yyyy-MM-dd')) AS userday
FROM topical.post p
JOIN mviews.unique_user_de u
ON p.user_guid = u.user_guid;
> SELECT 291557025
> Time: 7977669.774 ms (02:12:57.670)
The query took a little over two hours and selected 291,557,025 posts (~292 million) made worldwide by the Germany-active users (our future locals and tourists).
Step 3: Filter Data for Reliable Home Location Inference¶
Before we can infer home locations, we must filter the 292 million collected posts to improve the quality and reliability of our input data. There are two main filtering criteria:
- User Activity: Reliably inferring a home location requires a sufficient amount of data per user. Following best-practice parameter selection from previous study results (e.g., Huang et al., 2014), we set a minimum threshold of user activity. In this case, we exclude users with 30 or fewer unique userdays (days on which they posted). This removes users whose data is too sparse to yield a meaningful result.
- Invalid Coordinates: The dataset may contain erroneous geotags, with a common error being posts located at (0.0, 0.0), often referred to as "Null Island". Since these points provide no valid geographical information, they are excluded from the analysis.
These filtering steps are performed on the lbsn-rawdb server to create a new materialized view, posts_from_de_users_filtered. This view contains the cleaned and pseudonymized data that will serve as the basis for our home location analysis.
Import worldwide users posts via FDW¶
Check if already imported:
MVIEWS_REF_USER = 'posts_from_de_users'
sql_query = f"""
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_schema = 'mviews'
AND table_name = '{MVIEWS_REF_USER}'
);
"""
result = db_conn_hll.query(sql_query)
sql_query = f"""
IMPORT FOREIGN SCHEMA mviews
LIMIT TO (
{MVIEWS_REF_USER})
FROM SERVER lbsnraw
INTO mviews;
"""
# only import table
# if it hasn't been imported already
if not result["exists"][0]:
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
If you see a permission denied for materialized view ..., grant the rights first:
GRANT SELECT ON ALL TABLES IN SCHEMA mviews TO lbsn_reader;
Make a sample selection (similar to the one we used at the beginning of the notebook).
cols = "origin_id, user_guid, latitude, longitude, userday"
samples = db_conn_hll.query(
f"""
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 1 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 2 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 3 LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 23 LIMIT 10);
""")
with option_context('display.max_colwidth', 400):
display(samples.head(40))
If the sample was extracted without errors, commit changes to the local working DB:
db_connection_hll.commit()
Filter data¶
Before classification, let's filter the data for the categories of both or ambiguous. In line with the general parameter recommendations set out in the paper by Huang et al. (2014, 'From Where Do Tweets Originate?'), we can set a lower limit of 50 posts per user, below which it is no longer possible to unambiguously identify the user's home location. Rather than using post count, we use userdays as a global measure of the total number of days on which the user has posted at least once. We reduce the threshold to 30 userdays, to account for the slightly increased aggregation granularity.
We can also see what we can additionally filter in the query output above. The Null Island is prominently present in the first few elements selected from the database. These are erroneous posts with incorrect geotags. As they don't provide reliable geo-information for inferring home location, we can exclude them too.
Let's first test to exclude the Null Island
cols = "origin_id, user_guid, latitude, longitude, userday"
samples = db_conn_hll.query(
f"""
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 1 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 2 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 3 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER} WHERE origin_id = 23 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10);
""")
with option_context('display.max_colwidth', 400):
display(samples.head(20))
This looks good.
We are now only missing the filter step to exclude all users with 30 or fewer total userdays. We will do this together with the Null Island exclusion, and we will again run it on our foreign raw server, to avoid sending 292 million elements through the Foreign Data Wrapper.
First, create indexes on the original materialized view, so the query speed is increased:
-- For user-level lookups
CREATE INDEX idx_posts_from_de_users_userguid
ON mviews.posts_from_de_users (user_guid);
-- For userday-based queries
CREATE INDEX idx_posts_from_de_users_userday
ON mviews.posts_from_de_users (userday);
-- Optional for spatial queries
CREATE INDEX idx_posts_from_de_users_lat_lon
ON mviews.posts_from_de_users (latitude, longitude);
Then, create a filtered mview containing only the valid post data required to infer user home locations. We split this into two distinct queries, to speed up compute and reduce complexity of the sql query.
First, group by distinct userdays and select only userday in the output selection. We can later extract the hashed user_guid substring from the result.
CREATE MATERIALIZED VIEW IF NOT EXISTS mviews.distinct_user_days_de_users AS
SELECT userday
FROM mviews.posts_from_de_users
GROUP BY userday;
> SELECT 106540196
> Time: 96678.938 ms (01:36.679)
This step reduced the dataset from ~292 million original posts to ~106 million distinct userdays, a reduction of roughly two thirds.
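A quick arithmetic check of the reported reduction:
posts, userdays = 291_557_025, 106_540_196
print(f"{userdays / posts:.1%} of the rows remain "
      f"({1 - userdays / posts:.1%} reduction)")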
How many users will be excluded, if we filter by > 30 userdays?
WITH excluded_users AS (
SELECT left(userday, length(userday) - 10) AS user_guid
FROM mviews.distinct_user_days_de_users
GROUP BY left(userday, length(userday) - 10)
HAVING COUNT(*) <= 30
)
SELECT COUNT(*) AS number_of_excluded_users
FROM excluded_users;
> number_of_excluded_users
> --------------------------
> 5646069
> (1 row)
>
> Time: 122854.215 ms (02:02.854)
5646069 / 6233753 * 100
This will exclude about 90% of the users in our dataset. These users do not have enough activity to derive a home location. These numbers tell a very important story about the nature of social media data (see the long-tail or power-law distribution).
Lastly, run the query below to create a new mview by selecting active users with > 30 distinct userdays and by filtering the Null Island.
CREATE MATERIALIZED VIEW IF NOT EXISTS mviews.posts_from_de_users_filtered AS
WITH active_users AS (
SELECT left(userday, length(userday) - 10) AS user_guid
FROM mviews.distinct_user_days_de_users
GROUP BY left(userday, length(userday) - 10)
HAVING COUNT(*) > 30
)
SELECT p.*
FROM mviews.posts_from_de_users p
JOIN active_users a
ON p.user_guid = a.user_guid
WHERE NOT (p.latitude = 0.0 AND p.longitude = 0.0);
SELECT 248294113
Time: 704626.323 ms (11:44.626)
This step filtered the dataset to ~248 million valid posts (excluding Null Island and low-activity users), down from the original ~292 million. Despite the large scale, the query completed in under 12 minutes.
Import to our local working database.
MVIEWS_REF_USER_FILTERED = 'posts_from_de_users_filtered'
sql_query = f"""
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_schema = 'mviews'
AND table_name = '{MVIEWS_REF_USER_FILTERED}'
);
"""
result = db_conn_hll.query(sql_query)
sql_query = f"""
IMPORT FOREIGN SCHEMA mviews
LIMIT TO (
{MVIEWS_REF_USER_FILTERED})
FROM SERVER lbsnraw
INTO mviews;
"""
# only import table
# if it hasn't been imported already
if not result["exists"][0]:
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
If you see a permission denied for materialized view ..., grant the rights first:
GRANT SELECT ON ALL TABLES IN SCHEMA mviews TO lbsn_reader;
cols = "origin_id, user_guid, latitude, longitude, userday"
samples = db_conn_hll.query(
f"""
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER_FILTERED} WHERE origin_id = 1 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER_FILTERED} WHERE origin_id = 2 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER_FILTERED} WHERE origin_id = 3 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10)
UNION
(SELECT {cols} FROM mviews.{MVIEWS_REF_USER_FILTERED} WHERE origin_id = 23 AND NOT (latitude = 0.0 AND longitude = 0.0) LIMIT 10);
""")
with option_context('display.max_colwidth', 400):
display(samples.head(40))
Step 4: Aggregate Data for Clustering Analysis¶
This is the final data preparation step before the home location clustering can be performed (in a subsequent process). The goal here is to aggregate the data by user and unique location, counting the number of distinct user-days for each point. This creates a summary of how frequently each user visits specific locations.
To do this efficiently, we insert the filtered data into a new table in our local hlldb, named spatial.latlng_de_hll. The query performs the following aggregation:
- It groups the data by user_guid and location (latitude, longitude).
- For each group, it uses hll_add_agg to create a HyperLogLog structure containing all the unique userday hashes. The cardinality of this HLL set gives us a highly accurate and memory-efficient estimate of the number of days a user was active at that specific location.
The resulting table provides a compact summary for each user, listing all the locations they have visited and the number of days they spent at each, ready for a density-based clustering algorithm like HDBSCAN to identify their most probable home location.
sql_query = """
CREATE TABLE IF NOT EXISTS spatial.latlng_de_hll (
origin_id int,
user_guid text,
userday_hll hll,
latlng_geom geometry(Point, 4326)
);
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
Store changes to DB.
db_connection_hll.commit()
Aggregate HLL by user + lat/lng:
- connect to the backend of the local hll db
- execute directly in the psql shell
byobu
docker exec -it lbsn-hlldb /bin/bash
psql -h localhost -p 5432 -U postgres hlldb
-- Enable timing:
\timing
-- test
select * from mviews.posts_from_de_users_filtered limit 1;
> origin_id | user_guid | latitude | longitude | userday
> -----------+---------------------------------------------+-----------------+-----------------+-------------------------------------------------------
> 1 | DxC+3nuAOXEF5fQN8ngX3AFDLuAF/algMxhp0dpbrU4 | -37.81494140625 | 144.99755859375 | DxC+3nuAOXEF5fQN8ngX3AFDLuAF/algMxhp0dpbrU42017-10-19
> (1 row)
On the raw/remote table: Add indexes
CREATE INDEX idx_posts_grouping
ON posts_from_de_users_filtered (origin_id, latitude, longitude, user_guid);
On the local/hll db, set HLL aggregation parameters (see above):
SELECT hll_set_defaults(11, 5, 0, 1);
- this disables explicit mode, which is important for privacy preservation
Then run the aggregation query on the local/hlldb.
INSERT INTO spatial.latlng_de_hll(
origin_id,
user_guid,
userday_hll,
latlng_geom)
SELECT origin_id,
user_guid,
hll_add_agg(hll_hash_text(userday)) AS userday_hll,
ST_SetSRID(ST_MakePoint(longitude, latitude), 4326) AS latlng_geom
FROM mviews.posts_from_de_users_filtered
GROUP BY origin_id, latitude, longitude, user_guid;
> INSERT 0 27,409,403
> Time: 1479633.236 ms (24:39.633)
In this final data preparation step, the raw post data is transformed into a compact summary of each user's geographic footprint. The initial filtered table contained approximately 248 million rows, with each row representing a single post from a user. The aggregation query then collapses this data, creating just one row for each unique combination of a user and a specific location (geohash cell). Consequently, if an individual posted hundreds of times from the same location across different days, all those entries are now represented by a single row, stored in the userday_hll column.
The detailed temporal information is not lost; rather, the number of distinct days the user was active at that spot is efficiently stored within the userday_hll data structure and can be estimated with an HLL cardinality calculation. This crucial transformation resulted in a significant and expected reduction in dataset size, from 248 million rows to roughly 27 million rows, producing a clean, efficient table where each row signifies a unique place a person has visited. This sets up the data for the clustering step that determines each user's home location.
Have a look at the resulting data structure.
db_conn_hll.query(f"SELECT * FROM spatial.latlng_de_hll LIMIT 10;")
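To inspect the aggregated userday counts per location, the cardinality of each HLL set can be estimated directly in SQL. The following is a short sketch using the notebook's query helper; hll_cardinality() is provided by the postgresql-hll extension.
# Estimate distinct userdays per (user, location) row for a few sample rows
db_conn_hll.query("""
    SELECT user_guid,
           ST_Y(latlng_geom) AS latitude,
           ST_X(latlng_geom) AS longitude,
           hll_cardinality(userday_hll) AS approx_userdays
    FROM spatial.latlng_de_hll
    LIMIT 10;
    """)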
Add an index on column user_guid, so we can export the CSV sorted/grouped by user_guid (= all posts of a single user together).
sql_query = """
CREATE INDEX idx_latlng_de_hll_user_guid
ON spatial.latlng_de_hll (user_guid);
"""
cur_hll.execute(sql_query)
print(cur_hll.statusmessage)
db_connection_hll.commit()
Export data as CSV¶
Save hll data to CSV. The following records are available from table spatial.latlng_de_hll:
- distinct origin_id plus user_guid and latitude and longitude coordinates (clear text / hashed), this is the "base" we're working with
- userday_hll - approximate user days stored as an hll set
def get_latlng_fromdb(
table_ref: str, chunk_size: int = 5000000) -> List[pd.DataFrame]:
"""Returns spatial.latlng data from db, excluding Null Island
Important: The data will be grouped/sorted by hashed user_guid.
"""
sql = f"""
SELECT origin_id,
user_guid,
ST_Y(p1.latlng_geom) As "latitude",
ST_X(p1.latlng_geom) As "longitude",
userday_hll
FROM {table_ref} p1
ORDER BY user_guid
"""
# execute query, enable chunked return
return pd.read_sql(sql, con=db_connection_hll, chunksize=chunk_size)
def write_chunkeddf_tocsv(
filename: str, usecols: List[str], chunked_df: List[pd.DataFrame],
chunk_size: int = 5000000):
"""Write chunked dataframe to CSV"""
for ix, chunk_df in enumerate(chunked_df):
mode = 'a'
header = False
if ix == 0:
mode = 'w'
header = True
chunk_df.to_csv(
filename,
mode=mode, columns=usecols,
index=False, header=header)
clear_output(wait=True)
display(
f"Stored {(ix*chunk_size)+len(chunk_df)} "
f"post-locations to CSV..")
Execute Query:
%%time
root = Path.cwd().parents[0] / "00_data"
usecols = ["origin_id", "user_guid", "latitude", "longitude", "userday_hll"]
filename = root / "2025-09-19_userdays_DE_HLL.csv"
if Path(filename).exists():
print(f"CSV already exists, skipping load from db.. (to reload, delete file)")
else:
table_ref = f"spatial.latlng_de_hll"
write_chunkeddf_tocsv(
chunked_df=get_latlng_fromdb(
table_ref=table_ref),
filename=filename,
usecols=usecols)
%%time
data_files = {
"LatLng-userday (DE)":filename,
}
tools.display_file_stats(data_files)
Export lat/lng data¶
We will also need to export the full-resolution lat/lng data for all posts in DE. This needs to be run in the SQL backend:
COPY (
SELECT
user_guid,
ST_Y(post_latlng) AS latitude,
ST_X(post_latlng) AS longitude
FROM
mviews.all_posts_de
)
TO '/pg_exp/2025-09-30_DE_coords_user.csv'
WITH (FORMAT CSV, HEADER, ENCODING 'UTF-8');
> COPY 66632117
> Time: 131900.793 ms (02:11.901)
Create notebook HTML¶
!jupyter nbconvert --to html_toc \
--output-dir=../resources/html/ ./00_user_origin_conversion.ipynb \
--template=../nbconvert.tpl \
--ExtractOutputPreprocessor.enabled=False >&- 2>&-