This document is a reference for the V2 Segmentation API. The V2 Segmentation API allows you to create and manage segments using a subset of SQL that includes statements and operations necessary to define any segment. In addition, this version of the Segmentation API exposes the contact_data
and event_data
tables to enable SQL-compatible queries.
1SELECT contact_id, updated_at2FROM <table_reference> [ AS <table_alias> ]3[ [ INNER ] JOIN <table_reference> ON contact_id ]4[ WHERE <predicate> ]
In the previous example, the contact_id
and updated_at
columns are selected in the query. These two columns must be selected, and aggregate function MAX and its alias can be selected part of Last Event queries. Other than those columns, no additional columns can be selected. When joining two tables, they must be joined on the contact_id
column only because it is the foreign key used to connect the contact_data
and event_data
tables.
SELECT
FROM
JOIN
(defaults to INNER JOIN
)INNER JOIN
ON
WHERE
[NOT] IN
IS [NOT]
AND
OR
NOT
=
>
>=
<
<=
!=
+
-
*
/
%
Column Name | Data Type |
---|---|
CONTACT_ID | VARCHAR(36) |
VARCHAR(254) | |
PHONE_NUMBER_ID | VARCHAR |
EXTERNAL_ID | VARCHAR(254) |
ANONYMOUS_ID | VARCHAR(254) |
FIRST_NAME | VARCHAR(50) |
LAST_NAME | VARCHAR(50) |
ALTERNATE_EMAILS | VARCHAR ARRAY |
ADDRESS_LINE_1 | VARCHAR(100) |
ADDRESS_LINE_2 | VARCHAR(100) |
CITY | VARCHAR(60) |
STATE_PROVINCE_REGION | VARCHAR(50) |
POSTAL_CODE | VARCHAR(60) |
COUNTRY | VARCHAR(55) |
LIST_IDS | VARCHAR ARRAY |
CREATED_AT | TIMESTAMP |
UPDATED_AT | TIMESTAMP |
EMAIL_DOMAINS | VARCHAR ARRAY |
PHONE_NUMBER | VARCHAR(24) |
VARCHAR(254) | |
LINE | VARCHAR(254) |
VARCHAR(254) | |
UNIQUE_NAME | VARCHAR(254) |
CUSTOM_FIELDS * | Each custom field will appear as a column on the CONTACT table based on its name. Valid data types are: VARCHAR, NUMBER, TIMESTAMP |
Event data that can be optionally be used to enhanced segments
Column name | Data type |
---|---|
CONTACT_ID | VARCHAR(36) |
EVENT_SOURCE | VARCHAR(256) |
EVENT_TYPE | VARCHAR(256) |
TIMESTAMP | TIMESTAMP |
DATA | JSON BLOB. Numeric and String type data |
Combining more than one singlesend_id and mc_auto_id/mc_auto_step_id in a single statement using an AND clause is not allowed. In order to segment contacts that have events corresponding to more than one single sends and automations use the JOIN clause. See Multi-Engagement Events for examples.
1{2"payload": {3"unique_args": {4"mc_auto_id": "02bc41ca-07f9-11eb-9c1f-422f9e0f60d1",5"mc_auto_step_id": "0478e207-07f9-11eb-9c1f-422f9e0f60d1",6"singlesend_id": "b1ff0108-0994-11eb-915a-7ab0feb0ce95",7},8},9}
Example query
current_timestamp
The following query will return all contacts that were created before this run of the segment:
1SELECT contact_id, updated_at2FROM contact_data3WHERE created_at < CURRENT_TIMESTAMP
timestampadd
allows you to specify a time interval, a number of those intervals as a count, and a timestamp that you can use to compare Date fields against.
Parameters
Parameter | Supported values |
---|---|
interval | year , quarter , month , day , hour , minute , second |
count | any integer (negative or positive) |
timestamp | A supported date/time format (RFC3339), an existing date field, or a custom field of type Date |
Example Query
timestampadd(interval, count, timestamp)
The following query will return all contacts that were created before one year prior to 2020-10-15 6PM GMT:
1SELECT contact_id, updated_at2FROM contact_data3WHERE created_at > timestampadd(day, 1, '2020-10-15T18:00:12Z')
array_contains
checks for at least one of the values provided in an array. It does not hard check for the presence of all the values or an exact match of the values given. It also doesn't support pattern-matches like '%gmail.com%
' as array values. array_contains
will instead check for the exact string, including pattern match characters. For example, when searching '%gmail.com%
' the %
characters are considered part of the string literal.
Parameters
Parameter | Supported values |
---|---|
field_name | "alternate_emails" , "list_ids" , "email_domains" |
array | alternate emails, list IDs, and email domains that correspond to the specified field_name as strings |
Example Query
array_contains(field_name, array)
The following query will return all contacts with email domains equal to gmail.com or yahoo.com:
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(email_domains, ['gmail.com', 'yahoo.com'])
contains_word
is a custom function provided by the Marketing Campaigns V2 Segments API. contains_word
accepts two arguments: a word_value
that will be searched for in the field_name
values stored for each contact.
The field_name
you specify must be of type Text, and the value stored in field_name
must be a comma-separated string. For example, if the function evaluates the word_value
alice
against a first_name
field where the value is bob,alice,sue
, it will find alice
and return the contact. However, if the first_name
value is stored as bobalicesue
, bob.alice.sue
, or any other non comma separated string, alice
will not be found, and the contact will not be returned. This is important when thinking about how to enter the string values associated with your contacts. The check is case sensitive.
bob,(alice),sue
and a match value of (alice)
will return a contactbo'b,alice
will require a match value of bo''b
.Parameters
Parameter | Supported values |
---|---|
word_value | A string |
field_name | Any String type fields in your contact data or custom fields. Stored strings must be comma separated. |
Example Query
The following query will return all contacts where first_name
contains the word bob
:
1SELECT contact_id, updated_at2FROM contact_data3WHERE contains_word('bob', first_name)
Segmentation on a list is supported by providing a list ID value to the optional parent_list_ids
field in the v2 API request. This parameter currently accepts only one list ID. Segmentation of contacts from multiple list ids is supported through the use of array_contains
function. See example below.
All contacts:
1SELECT contact_id, updated_at2FROM contact_data
All contacts with first_name
'Dave':
1SELECT contact_id, updated_at2FROM contact_data3WHERE first_name = 'Dave'
All contacts where state_province_region
is 'Colorado':
1SELECT contact_id, updated_at2FROM contact_data3WHERE state_province_region = 'CO'
All contacts with primary email with the substring 'gmail.com':
1SELECT contact_id, updated_at2FROM contact_data3WHERE email like '%gmail.com%'
All contacts with a text type custom field my_text_custom_field
value 'abc':
1SELECT contact_id, updated_at2FROM contact_data3WHERE my_text_custom_field = 'abc'
All contacts with primary email with 'gmail.com' as domain name, and a text type custom field my_custom_field
value 'abc':
1SELECT contact_id, updated_at2FROM contact_data3WHERE email like '%gmail.com' and my_custom_field = 'abc'
All contacts with a number type custom field my_number_custom_field
value 12:
1SELECT contact_id, updated_at2FROM contact_data3WHERE my_number_custom_field = 12
All contacts with a date type custom field my_date_custom_field
value `2021-01-01T12:46:24Z':
1SELECT contact_id, updated_at2FROM contact_data3WHERE my_date_custom_field = '2021-01-01T12:46:24Z'
All contacts where alternate email is equal to 'alternate@gmail.com':
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(alternate_emails, ['alternate@gmail.com'])
All contacts where alternate email is equal to 'alternate@gmail.com' or 'alternate2@gmail.com':
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(alternate_emails, ['alternate@gmail.com','alternate2@gmail.com'])
All contacts present in a specific list "list_id":
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(list_ids, ['02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'])
All contacts present in either of the list_ids:
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(list_ids, ['02bc41ca-07f9-11eb-9c1f-422f9e0f60d1', '02bc41ca-07f9-11eb-9c1f-422f9e0f62e4', '042a8e48-6e31-11eb-a8bc-7656c249c550'])
All contacts with specific email domain(s) gmail.com
:
1SELECT contact_id, updated_at2FROM contact_data3WHERE array_contains(email_domains, ['gmail.com'])
All contacts where created_at
is after 2021-01-01 12 PM GMT:
1SELECT contact_id, updated_at2FROM contact_data3WHERE created_at > '2021-01-01T12:00:00Z'
All contacts where created_at
is equal to 2021-01-01 12 PM GMT:
1SELECT contact_id, updated_at2FROM contact_data3WHERE created_at = '2021-01-01T12:00:12Z'
All contacts with external id
that starts with '123':
1SELECT contact_id, updated_at2FROM contact_data3WHERE external_id like '123%'
When a contact's reserved fields are not set during creation, they default to a NULL value. Not setting a value for custom fields when creating a contact will save the contact without those custom fields. A segment query using these fields does not return contacts having NULL as a value. Additional conditions may be used to specify the field is null
so that contacts having a null value will be included in the segment.
All contacts where first_name
does not equal Dave
, also including contacts without a value for first_name
:
1SELECT contact_id, updated_at2FROM contact_data3WHERE first_name != ‘Dave' OR first_name is null
All contacts where my_custom_field
does not contain abc
, also including contacts without a value for my_custom_field
:
1SELECT contact_id, updated_at2FROM contact_data3WHERE my_custom_field like ‘%abc%' OR my_custom_field is null
Only JOIN
and INNER JOIN
are allowed and INNER JOIN
will be internally converted to JOIN
. LEFT JOIN
and RIGHT JOIN
are restricted for performance, and the same functionality can be achieved using JOIN
. The examples below show how LEFT JOIN
/ RIGHT JOIN
can be replaced with JOIN
and UNION
.
The following queries return all contacts that have an event "delivered" for a particular Single Send or whose state is Colorado. While both queries return the same set of contacts, using JOIN
and UNION
is more performant than using LEFT JOIN
.
LEFT JOIN (not supported)
1SELECT c.contact_id, c.updated_at2FROM contact_data c3LEFT JOIN event_data e4ON c.contact_id = e.contact_id5WHERE e.event_source = 'mail'6AND e.event_type = 'delivered'7AND e.DATA:payload.unique_args.singlesend_id = '042a8e48-6e31-11eb-a8bc-7656c249c550'8OR c.state_province_region = 'CO'
JOIN and UNION
UNION
can be used instead of "OR" to return contacts based on distinct criteria that requires use of a JOIN
.
For example, a query for contacts that were 'delivered' a Single Send OR the contacts whose state_province_region
is in "CO" can be written as follows:
1SELECT c.contact_id, c.updated_at2FROM contact_data c3JOIN event_data e4ON c.contact_id = e.contact_id5WHERE e.event_source = 'mail'6AND e.event_type = 'delivered'7AND e.DATA:payload.unique_args.singlesend_id = '042a8e48-6e31-11eb-a8bc-7656c249c550'8UNION9SELECT contact_id, updated_at10FROM contact_data11WHERE c.state_province_region = 'CO'
All contacts that have at least one event ( i.e. an attempt was made to send them an email):
1SELECT c.contact_id, c.updated_at2FROM contact_data c3JOIN event_data e ON c.contact_id = e.contact_id
All contacts that have opened emails from two different single sends:
1SELECT e2.contact_id, c2.updated_at2FROM event_data e23INNER JOIN (4SELECT c1.contact_id, c1.updated_at5FROM contact_data AS c16INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id7WHERE e1.event_source = 'mail'8AND e1.event_type = 'open'9AND e1.DATA:payload.unique_args.singlesend_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'10) AS c2 ON c2.contact_id = e2.contact_id11WHERE e2.event_source = 'mail'12AND e2.event_type = 'open'13AND e2.DATA:payload.unique_args.singlesend_id = '00163f67-7211-4363-ab4e-12dd6f313b3a'
All contacts that have opened ANY single sends within 3 days:
1SELECT c.contact_id, c.updated_at2FROM contact_data as c3JOIN event_data as e on c.contact_id = e.contact_id4WHERE e.event_source = 'mail'5AND e.event_type = 'open'6AND e.DATA:payload.unique_args.singlesend_id is not null7AND e.timestamp >= timestampadd(day, -3, current_timestamp())
All contacts that have both of the two events - click and open:
1SELECT e2.contact_id, c2.updated_at2FROM event_data e23INNER JOIN (4SELECT c1.contact_id, c1.updated_at5FROM contact_data AS c16INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id7WHERE e1.event_source = 'mail' AND e1.event_type = 'open'8) AS c2 ON c2.contact_id = e2.contact_id9WHERE e2.event_source = 'mail' AND e2.event_type = 'click'
All contacts that have both of the two events - bounce and deferred for a particular automation:
1SELECT e2.contact_id, c2.updated_at2FROM event_data e23INNER JOIN (4SELECT c1.contact_id, c1.updated_at5FROM contact_data AS c16INNER JOIN event_data e1 ON c1.contact_id = e1.contact_id7WHERE e1.event_source = 'mail'8AND e1.event_type = 'bounce'9AND e1.DATA:payload.unique_args.mc_auto_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'10) AS c2 ON c2.contact_id = e2.contact_id11WHERE e2.event_source = 'mail'12AND e2.event_type = 'deferred'13AND e2.DATA:payload.unique_args.mc_auto_id = '02bc41ca-07f9-11eb-9c1f-422f9e0f60d1'
All contacts that have no event data (i.e. all contacts with where there has been no attempt to send an email to):
1SELECT c.contact_id, c.updated_at2FROM contact_data c3WHERE c.contact_id NOT IN (4SELECT e.contact_id5FROM event_data e6)
All contacts that have not opened any mail in the last two months worth of seconds from the time when the segment is run:
1SELECT c.contact_id, c.updated_at2FROM contact_data c3WHERE c.contact_id NOT IN (4SELECT e.contact_id5FROM event_data e6WHERE e.event_source = 'mail'7AND e.event_type = 'open'8AND e.timestamp < timestampadd(MONTH, -2, CURRENT_TIMESTAMP)9)10
All contacts that have been 'delivered' the ANY single sends but have not 'open' ANY single sends within 1 month:
1SELECT c.contact_id, c.updated_at2FROM contact_data c3JOIN event_data AS e1 ON c.contact_id = e1.contact_id4WHERE e1.event_source = 'mail'5AND e1.event_type = 'delivered'6AND e1.DATA:payload.unique_args.singlesend_id is not null7AND c.contact_id NOT IN (8SELECT e2.contact_id9FROM event_data AS e210WHERE e2.event_source = 'mail'11AND e2.event_type = 'open'12AND e2.DATA:payload.unique_args.singlesend_id is not null13AND e2.timestamp >= timestampadd(month, -1, current_timestamp()))14)
All contacts that have been 'delivered' the second email from automation but have not 'open' the email (mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5' mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'):
1SELECT c.contact_id, c.updated_at2FROM contact_data c3JOIN event_data AS e1 ON c.contact_id = e1.contact_id4WHERE e1.event_source = 'mail'5AND e1.event_type = 'delivered'6AND e1.DATA:payload.unique_args.mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5'7AND e1.DATA:payload.unique_args.mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'8AND c.contact_id NOT IN (9SELECT e2.contact_id10FROM event_data AS e211WHERE e2.event_source = 'mail'12AND e2.event_type = 'open'13AND e2.DATA:payload.unique_args.mc_auto_id = 'fa3e0f6e-d397-11eb-87ce-22ffc6ed50f5'14AND e2.DATA:payload.unique_args.mc_auto_step_id = 'fc9ead9f-d397-11eb-92fe-02aa17a6534e'15)
There are three email activities that support last events: "Last Clicked", "Last Opened", and "Last Emailed". The last event queries are helpful to see the contacts whose last mail activity is in a certain period. The following queries are used to segment contacts based on their last mail activity.
All contacts that have not opened contacts since January of 2021 (last opened activity is before 2021):
1SELECT c1.contact_id, c1.updated_at2FROM contact_data as c13JOIN (4SELECT e1.contact_id, e1.max_timestamp5FROM (6SELECT e.contact_id, MAX(e.timestamp) max_timestamp7FROM event_data e8WHERE e.event_source = 'mail' and e.event_type = 'open'9GROUP BY e.contact_id10) as e111WHERE e1.max_timestamp < '2021-01-01T00:00:00.000Z'12) as e2 on c1.contact_id = e2.contact_id
All contacts that last clicked an email within 7 days (last clicked activity is within 7 days):
1SELECT c1.contact_id, c1.updated_at2FROM contact_data as c13JOIN (4SELECT e1.contact_id, e1.max_timestamp5FROM (6SELECT e.contact_id, MAX(e.timestamp) max_timestamp7FROM event_data e8WHERE e.event_source = 'mail' and e.event_type = 'click'9GROUP BY e.contact_id10) as e111WHERE e1.max_timestamp >= timestampadd(day, -7, current_timestamp())12) as e2 on c1.contact_id = e2.contact_id
All contacts that were last emailed in the past three months and last opened in the previous month (last emailed is within 3 months and last opened is within 1 month):
1SELECT c1.contact_id, c1.updated_at2FROM (3SELECT c.contact_id, c.updated_at4FROM contact_data as c5JOIN (6SELECT e1.contact_id, e1.max_timestamp7FROM (8SELECT e.contact_id, MAX(e.timestamp) as max_timestamp9FROM event_data e10WHERE e.event_source = 'mail'11AND e.event_type = 'delivered' group by e.contact_id12) as e113WHERE e1.max_timestamp >= timestampadd(month, -3, current_timestamp())14) as e2 on c.contact_id = e2.contact_id15) as c116JOIN (17SELECT c.contact_id, c.updated_at18FROM contact_data as c19JOIN (20SELECT e1.contact_id, e1.max_timestamp21FROM (22SELECT e.contact_id, MAX(e.timestamp) as max_timestamp23FROM event_data e24WHERE e.event_source = 'mail'25AND e.event_type = 'open' group by e.contact_id26) as e127WHERE e1.max_timestamp >= timestampadd(month, -1, current_timestamp())28) as e2 on c.contact_id = e2.contact_id29) as c11 on c1.contact_id = c11.contact_id30
NOTE: Even if the aggregate function MAX is supported along with GROUP BY, the usage is restricted to last event queries only. MAX function accepts only the "timestamp" field from event_data and GROUP BY clause accepts only the "contact_id" field. An alias name can be used only for the aggregate function MAX, but not for "contact_id" or "updated_at".
Multiple data sets are allowed to be joined together if done in a specific manner even though SQL supports both JOINs expressed linearly and JOINs expressed using subqueries. Each data set's criteria must be listed alongside the corresponding table reference as it makes for better readability by having the queries nested explicitly by using subqueries. Following is an example of a segment query for both types. Here, the query defines a segment of all contacts that have the event of type 'processed' from the list of contacts which have the event of type 'delivered'.
1SELECT c1.contact_id, c1.updated_at FROM event_data AS e22JOIN contact_data AS c1 ON e2.contact_id = c1.contact_id3JOIN event_data as e1 ON e1.contact_id = e2.contact_id4WHERE e1.event_source = 'mail' AND e1.event_type = 'delivered' AND e2.event_source = 'mail' AND e2.event_type = 'processed'
You can represent this logic in a more readable way using JOIN
s with subqueries.
1SELECT e2.contact_id, c2.updated_at2FROM event_data AS e2 JOIN (3SELECT c1.contact_id, c1.updated_at4FROM contact_data AS c1 JOIN event_data AS e15ON c1.contact_id = e1.contact_id6WHERE e1.event_source = 'mail' AND e1.event_type = 'delivered') AS c27ON c2.contact_id = e2.contact_id8WHERE e2.event_source = 'mail' AND e2.event_type = 'processed'
NOTE: There is no reduction in functionality as both generate the exact same result.
When creating or changing a segment query using curl command, escape any single quotes present in any parameter:
1SELECT contact_id, updated_at2FROM contact_data3WHERE first_name = '\''Dave'\''