
This page documents the different plugin function libraries available for use with Terrazzo. Plugins extend the capabilities of Terrazzo with functions that can be called as part of the transformation expressions in a build definition, such as in a compute or filter step.

Terrazzo also has a rich set of “built-in” functions that can be used without loading any plugins. These are provided by the DataFusion library that Terrazzo is based on. You can see the list of built-in functions and how to invoke them here.

Geometry

The Terrazzo geometry plugin provides routines for geospatial calculations, such as intersections and areas. For the most part, the functions in this plugin are implemented in terms of the corresponding algorithms in the GEOS geometry library.

Make the geometry plugin available to your build by adding this to the top of your build definition:

    "plugins": ["terrazzo_geometry"],

As of this writing, all of the functions in this plugin expect to receive geospatial arguments encoded in WKB (Well-Known Binary) format. When applying these functions over column values, the columns must be of the Binary data type.
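
If your source data holds coordinates rather than WKB, the values have to be encoded before these functions can use them. The following is a minimal Python sketch of the standard WKB layout for a single-ring polygon, using only the standard library. The helper name `polygon_to_wkb` is hypothetical and is not part of Terrazzo; in practice a geometry library would normally do this encoding for you.

```python
import struct

def polygon_to_wkb(ring):
    """Encode a single-ring polygon as little-endian WKB bytes."""
    # WKB expects a closed ring: the last point repeats the first.
    if ring[0] != ring[-1]:
        ring = list(ring) + [ring[0]]
    # Header: byte order (1 = little-endian), geometry type (3 = Polygon),
    # number of rings (1, the exterior ring only).
    wkb = struct.pack("<BII", 1, 3, 1)
    # Ring: point count, then each point as two 64-bit floats (x, y).
    wkb += struct.pack("<I", len(ring))
    for x, y in ring:
        wkb += struct.pack("<dd", x, y)
    return wkb

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
wkb = polygon_to_wkb(unit_square)
print(len(wkb), wkb[:9].hex())  # 93 010300000001000000
```

The resulting bytes are the kind of value a Binary column would need to hold for the functions below.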

geom_area

geom_area - Returns the area of a polygonal geometry.

    geom_area(GEOMETRY: Binary) -> Float64
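
GEOS works in planar coordinates, so, assuming the plugin passes coordinates through unchanged, the result is in squared coordinate units (square degrees for longitude/latitude data, for example). The Python sketch below illustrates what a planar polygon area means using the shoelace formula. It is an illustration only, not the plugin's or GEOS's implementation, and `planar_area` is a hypothetical name.

```python
def planar_area(ring):
    """Planar area of a simple polygon given its exterior ring (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(planar_area(rectangle))  # 12.0
```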

geom_contains

geom_contains - Tests whether every point of B lies in A and the interiors of A and B have at least one point in common.

    geom_contains(GEOMETRY_A: Binary, GEOMETRY_B: Binary) -> Boolean
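
The interior condition matters at boundaries: under this definition, a point that lies exactly on the edge of a polygon is not contained by it. The Python sketch below shows the rule for the simplest case, an axis-aligned box and a single point. It is a simplified illustration of the definition above, not the plugin's implementation, and `box_contains_point` is a hypothetical name.

```python
def box_contains_point(box, point):
    """True if the point lies strictly inside the box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    # Strict inequalities: a point on the boundary shares no point
    # with the box's interior, so it is not contained.
    return xmin < x < xmax and ymin < y < ymax

box = (0.0, 0.0, 2.0, 2.0)
print(box_contains_point(box, (1.0, 1.0)))  # True
print(box_contains_point(box, (0.0, 1.0)))  # False
```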

geom_difference

geom_difference - Computes a geometry representing the part of geometry A that does not intersect geometry B.

    geom_difference(GEOMETRY_A: Binary, GEOMETRY_B: Binary) -> Binary

geom_equals

geom_equals - Tests if two geometries include the same set of points.

    geom_equals(GEOMETRY_A: Binary, GEOMETRY_B: Binary) -> Boolean

geom_intersection

geom_intersection - Computes a geometry representing the shared portion of geometries A and B.

    geom_intersection(GEOMETRY_A: Binary, GEOMETRY_B: Binary) -> Binary

geom_intersects

geom_intersects - Tests if two geometries intersect (they have at least one point in common).

    geom_intersects(GEOMETRY_A: Binary, GEOMETRY_B: Binary) -> Boolean
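
geom_intersects and geom_intersection are easy to confuse: the first returns a Boolean, the second returns the shared geometry itself. The Python sketch below shows the distinction for axis-aligned boxes only. The `Box` type and both function names are hypothetical, and the real functions handle arbitrary geometries through GEOS.

```python
from typing import NamedTuple, Optional

class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def intersection(a: Box, b: Box) -> Optional[Box]:
    """Shared portion of two boxes, or None if they have no point in common."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin > xmax or ymin > ymax:
        return None
    # May be degenerate (zero width or height) when the boxes only touch.
    return Box(xmin, ymin, xmax, ymax)

def intersects(a: Box, b: Box) -> bool:
    """True if the boxes share at least one point, including touching edges."""
    return intersection(a, b) is not None

a = Box(0, 0, 2, 2)
print(intersection(a, Box(1, 1, 3, 3)))  # Box(xmin=1, ymin=1, xmax=2, ymax=2)
print(intersects(a, Box(2, 0, 4, 2)))    # True (the boxes share an edge)
print(intersects(a, Box(5, 5, 6, 6)))    # False
```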

NYC address

The Terrazzo NYC address plugin provides routines for translating addresses to and from various formats used in New York City-related datasets, such as voter files.

Make the NYC address plugin available to your build by adding this to the top of your build definition:

    "plugins": ["terrazzo_nyc_address"],

normalize_address_nyc_boe_to_pluto

normalize_address_nyc_boe_to_pluto - Attempts to convert an address that follows the conventions of the NYC county boards of elections voter files into the form that appears in NYC Planning’s PLUTO database.

    normalize_address_nyc_boe_to_pluto(ADDRESS: Utf8, CITY: Utf8) -> Utf8

ADDRESS is the house number and street name of the target address, and CITY is the target NYC borough (or, in the case of Queens, the neighborhood name). The function returns an address string (house number and street name) representing a best-effort attempt to normalize the input address.

Normalization transformations include whitespace trimming, conversion of ordinal street names (“FIRST” vs. “1”), and corrections of common, borough-specific street name misspellings.
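
To give a feel for the first two kinds of transformation, here is a rough Python sketch. Everything in it is an assumption made for illustration: the function name, the uppercasing, the short word-to-digit table, and the direction of the ordinal conversion are hypothetical and do not reproduce the plugin's actual rules. The borough-specific misspelling corrections are omitted entirely.

```python
# Hypothetical, partial mapping for illustration only.
ORDINAL_WORDS = {
    "FIRST": "1",
    "SECOND": "2",
    "THIRD": "3",
    "FOURTH": "4",
    "FIFTH": "5",
}

def normalize_sketch(address: str) -> str:
    """Trim and collapse whitespace, uppercase, and replace ordinal words."""
    # str.split() with no arguments drops leading/trailing whitespace
    # and collapses internal runs of whitespace.
    tokens = address.upper().split()
    return " ".join(ORDINAL_WORDS.get(token, token) for token in tokens)

print(normalize_sketch("  123  East First   Street "))  # 123 EAST 1 STREET
```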