Quick dive in geospatial data and the technologiies around

In the open source, I looked at 3 differents technologies:

  • mysql
  • postgresql
  • mongodb

I spent nearly a day looking at those things and mongodb was just working fine with easy to understand default that makes data manipulation simple.

NoSql world

Documentation for mongo is self explanatory, easy to setup and use. It might not have all the bell and whistle of the existing solution in the relational world but it benefits in ease of use and get a better developer experience than their counterparts.

Relational world

Mysql and postgres both support spatial extension (postgis). However it tends to be trickier than mongo as despite being powerful postgis gives you the power to do whatever you want and with great power comes large responsibilities.

So far, the most comprehensive documentation to get started I have seen is that one

Once you get started, it was simpler to use postgis I got around those points:

  • postgis by default don’t make any assumption of your data and on the coordinate system you are using. It doesn’t default to lat and long as you need to explicitly tells him everything which decrease the readability of your query. lat and lon are defined in this spec and its corresponding SRID (aka spacial reference system id) is 4326
  • again don’t forget to specify the coordinate system!
    -- don't
    ST_GeomFromText('POINT(-106.8179680 39.1904180)');
    -- do
    SELECT ST_GeomFromText('POINT(-106.8179680 39.1904180)', 4326);
  • Mysql implementation is light compared to postgres one and many cool stuff in postgis require postgres. At first I was trying to use mysql as I thought my requirements were simple but I finally gave up as most of the things it would requires workaround or making query that are harder to maintain and expand.
  • the function I use the most: ST_X (get longitude), ST_Y (get latitude), ST_Make_Point (or GeomFromText if using mysql), ST_Distance_Sphere (calculate distance in m between 2 points), ST_Contains, ST_Transform
  • Postgis functions have a ST prefix in postgres, not in mysql. Code snippet:
      -- those 2 query do the same things
      SELECT ST_SetSRID( ST_MakePoint(-106.8179680, 39.1904180), 4326);
      SELECT ST_GeomFromText('POINT(-106.8179680 39.1904180 )', 4326);
    

A working code using postgres

\list
CREATE DATABASE test;
\connect test;
CREATE EXTENSION postgis;


DROP TABLE test;
CREATE TABLE test (
id VARCHAR(16) UNIQUE NOT NULL,
location GEOMETRY NOT NULL
);
CREATE INDEX location ON test USING GIST(location);

INSERT INTO test (id, location) VALUES
(1, ST_GeomFromText('POINT(-103.8003460 40.2488450 )', 4326)),
(2, ST_GeomFromText('POINT(-107.8726700 38.4849180)', 4326)),
(3, ST_GeomFromText('POINT(-105.3584800 39.5040250)', 4326)),
(4, ST_GeomFromText('POINT(-106.8179680 39.1904180)', 4327));


-- find poi withing 100km around a point
SELECT id, ST_X(location), ST_Y(location) FROM test
WHERE ST_Distance_Sphere(
location,
ST_GeomFromText('POINT(-106 39 )', 4326)
) < 100000;

A working code for mysql:

CREATE DATABASE test;
use test;
CREATE TABLE test (
 id VARCHAR(16) UNIQUE NOT NULL,
 location GEOMETRY NOT NULL,
 SPATIAL INDEX(location)
) ENGINE=MyISAM;
INSERT INTO `test` (`id`, `location`) VALUES
(1, GeomFromText('POINT(40.2488450 -103.8003460)')),
(2, GeomFromText('POINT(38.4849180 -107.8726700)')),
(3, GeomFromText('POINT(39.5040250 -105.3584800)')),
(4, GeomFromText('POINT(39.1904180 -106.8179680)'));

SELECT id, X(location), Y(location) from test;