PostGIS Support in ZomboDB
As of version 10-1.0.5 of ZomboDB, the postgis
datatypes of geometry
and geography
are supported.
PostGIS Requirement
PostGIS must already be installed prior to enabling PostGIS support in ZomboDB, but can also be installed after ZomboDB as noted below.
Installation of PostGIS prior to ZomboDB
If the postgis
extension is already enabled when you install the ZomboDB plugin using CREATE EXTENSION zombodb
, then
PostGIS support will automatically be enabled in ZomboDB.
Installation of PostGIS after ZomboDB
If the PostGIS plugin is installed after ZomboDB, you will need to run SELECT zdb.enable_postgis_support()
to enable
support for postgis
in ZomboDB. If ZomboDB was able to detect the PostGIS extension, the above will return true
.
Otherwise it'll return false
.
Supported Coordinate Reference Systems
While PostGIS supports a plethora of coordinate systems, the current release of ElasticSearch(6.6.1) only supports
WGS84. To bridge the CRS gap between the two products, ZomboDB creates casts from postgis
's
geography
and geometry
types to json
using ST_AsGeoJSON()
and uses ST_Transform()
to convert coordinates from
their source CRS to WGS84 for storage in the ElasticSearch index.
Examples and Sample Data
There are two sample sets.
Sample dataset 1 is loaded in the NAD83 / Texas South Central (ftUS) CRS.
Sample dataset 2 is loaded in the WGS84 - World Geodetic System 1984 CRS.
Both are a PG Dump from PostgreSQL 10.6 and PostGIS 2.4.
With this dataset loaded and PostGIS installed, make sure ZomboDB is installed with PostGIS support on as indicated above and create a ZomboDB index on the table by running:
CREATE INDEX sample_data_2278_zombodb
ON sample_data_2278
USING zombodb ((sample_data_2278.*))
WITH (alias=sample_data_2278);
and
CREATE INDEX sample_data_4326_zombodb
ON sample_data_4326
USING zombodb ((sample_data_4326.*))
WITH (alias=sample_data_4326);
Querying the Sample Data
The most common ways of searching across spatialized data would be searching through points using polygons and bounding boxes whether they be drawn by the user or calculated from the extent of a map on the screen. To do this we will use the Geo Polygon and Bounding Box queries as shown below.
Geo Polygon Query
The function used for this type of query is dsl.geo_polygon
. It accepts arguments of field
as a text value such as
point_to_query
and a VARIADIC of type point
. A point
is a string containing a comma separated 'lon, lat'
values.
The query below would return all records whose geo_point field of PostGIS type POINT
fell within the bounds of the
polygon coordinates enumerated after it. As this is variadic and a polygon, it must contain at least three points and
its ending latitude and longitude must be the same as its starting latitude and longitude.
SELECT *
FROM sample_data_4326
WHERE sample_data_4326 ==>
dsl.geo_polygon('geo_point',
'-95.3757924220804,29.7530206054157',
'-95.3761162225586,29.753216394294',
'-95.3763406015772,29.7529338505327',
'-95.3766643966309,29.7531296379236',
'-95.3762156463589,29.7536947317361',
'-95.3758918431387,29.7534989430962',
'-95.3755680421945,29.7533031536912',
'-95.3757026673686,29.7531336250561',
'-95.3757924220804,29.7530206054157');
Bounding Box Query
The function used for this type of query is dsl.geo_bounding_box
. It accepts arguments of field
as a text value such
as point_to_query
and a string box
. The box
string is comprised of 4 comma separated values representing
'min lon, min lat, max lon, max lat'
. The query below would return all records whose geo_point field of PostGIS type
POINT
fell within the bounds of the box defined by the four corrdinates.
SELECT *
FROM sample_data_4326
WHERE sample_data_4326 ==>
dsl.geo_bounding_box('geo_point',
'-95.3757924220804,29.7530206054157,-94.3757924220804,30.7530206054157');
GeoShape Queries
Searching for points as noted above is a fairly straight-forward endeavor as you are merely searching for points inside a shape. To search for shapes such as polygons, linestrings in relation to shapes given by queries, ElasticSearch uses its GeoShape query. GeoShape queries support 4 spatial relation operators:
- INTERSECTS - (default) Return all documents whose geo_shape field intersects the query geometry.
- DISJOINT - Return all documents whose geo_shape field has nothing in common with the query geometry.
- WITHIN - Return all documents whose geo_shape field is within the query geometry.
- CONTAINS - Return all documents whose geo_shape field contains the query geometry.
In addition to the spatial relation operator, you will also supply a shape.
The two queries below show an envelope which is essentially a bounding box. However, our query will search for the column geom which is a POLYGON inside of our indexed table.
The first query will find all geom polygons that intersect with the envelope.
SELECT *
FROM sample_data_4326
WHERE sample_data_4326 ==>
dsl.geo_shape('geom', '{"type":"envelope","coordinates":[[-95.3757924220804,29.7530206054157],[-95.3761162225586,29.753216394294]]}','INTERSECTS');
The second query will find all geom polygons that have no relation to the envelope int hat they are not intersecting, contained or within the envelope defined.
SELECT *
FROM sample_data_4326
WHERE sample_data_4326 ==>
dsl.geo_shape('geom', '{"type":"envelope","coordinates":[[-95.3757924220804,29.7530206054157],[-95.3761162225586,29.753216394294]]}','DISJOINT');
GeoShape with ST_AsGeoJSON()
You can combine ZomboDB query params with PostGIS functions. For example, from the sample_data_4326
I can take the
following GeoJSON value:
{"type":"MultiPolygon","coordinates":[[[[-95.3757924220804,29.7530206054157],[-95.3761162225586,29.753216394294],[-95.3763406015772,29.7529338505327],[-95.3766643966309,29.7531296379236],[-95.3762156463589,29.7536947317361],[-95.3758918431387,29.7534989430962],[-95.3755680421945,29.7533031536912],[-95.3757026673686,29.7531336250561],[-95.3757924220804,29.7530206054157]]]]}
With this value, I can create a query like the two in the GeoShape Queries section looking for geom points that intersect with this shape. However, this value was derived from the following query:
SELECT st_asgeojson((SELECT geom FROM postgis.sample_data_4326 WHERE "HCAD_NUM" = '1292500000054'))::json;
I can run the same query using ST_AsGeoJSON() and shorten the query considerably like so:
SELECT postgis.hcad_real_acct.*
FROM sample_data_4326
LEFT JOIN hcad_real_acct ON sample_data_4326."HCAD_NUM" = realescout.hcad_real_acct.account
WHERE sample_data_4326 ==>
dsl.geo_shape('geom', st_asgeojson((SELECT geom FROM realescout.sample_data_4326 WHERE "HCAD_NUM" = '1292500000054'))::json,'INTERSECTS');
Above, we select the geom
field encompassing it in the ST_AsGeoJSON()
function and cast it as JSON to pass to the
dsl.geo_shape
query. This is nice for when you have predefined shapes in the database. For example, if I had an
additional table called zip_codes
with the geometry for all of the zip codes in the dataset stored there, I could do
aggregation or selections of items in that zip code based on the shape.
Notes
- All queries using ZDB's spatialized index data need to be in CRS
WGS84 - EPSG:4326
- During indexing, ZomboDB automatically converts
geography
andgeometry
tojson
(usingST_AsGeoJSON
) and automatically usesST_Transform()
to transform them to CRS4326
- Queries using ZDB's
dsl.geo_shape()
function need to be in CRS4326
- The
CONTAINS
shape relationship has been removed from Elasticsearch 6.6 - Postgres'
point
type is automatically mapped to the Elasticsearchgeo_point
type and can be queried withdsl.geo_bounding_box()
anddsl.geo_polygon()
queries - Columns defined as
geometry(Point, x)
orgeography(Point, x)
are automatically mapped to the Elasticsearchgeo_point
type and can be queried withdsl.geo_bounding_box()
anddsl.geo_polygon()
queries