Functions
String Functions
CHARACTER_LENGTH
The CHARACTER_LENGTH function returns the number of unicode characters in a string.
SELECT CHARACTER_LENGTH('josé');
-- 4
SELECT CHARACTER_LENGTH('❤️🥫');
-- 3
OCTET_LENGTH
The OCTET_LENGTH function returns the length of a string, in bytes (octets).
SELECT OCTET_LENGTH('josé');
-- 5
SELECT OCTET_LENGTH('❤️🥫');
-- 10
TRIM, LTRIM, RTRIM
The TRIM, LTRIM, and RTRIM functions trim surrounding whitespace,
whitespace to the left, and whitespace to the right of a string, respectively.
SELECT TRIM(' hello ');
-- 'hello'
SELECT LTRIM(' hello ');
-- 'hello '
SELECT RTRIM(' hello ');
-- ' hello'
LOWER, UPPER
The LOWER and UPPER functions downcase and upcase a string, respectively.
SELECT LOWER('Relatable Algebra');
-- 'relatable algebra'
SELECT UPPER('Shouting Calculus');
-- 'SHOUTING CALCULUS'
REPLACE
The REPLACE function returns the string in the first parameter,
with the second parameter (if found) replaced by the third.
SELECT REPLACE('Relatable Algebra', 'Rela', 'Infla');
INSTR
The INSTR function returns the first character of a substring match on the second parameter,
if found, and 0 if it is not found.
SELECT INSTR('Coffee', 'ee');
SUBSTRING
The SUBSTRING function returns the substring starting from the index provided as the second parameter.
If the (optional) third parameter is provided, the substring will be of that length (or less, if the end of the source string is reached).
SUBSTR is a synonym for SUBSTRING.
SELECT SUBSTRING('Hello Edgar', 4);
SELECT SUBSTR('Hello Edgar', 4, 2);
POSITION
The POSITION pseudo-function returns the first position of the first character of the first matched substring
in another string.
If the substring is not detected, POSITION returns 0.
SELECT POSITION("h" IN "Hawaii");
SELECT POSITION("i" IN "Hawaii");
SELECT POSITION("Ha" IN "Hawaii");
NOTE: POSITION is a "pseudo-function" because internally it uses custom syntax of the form
<substring> IN <string>.
UNICODE
The UNICODE function returns an integer unicode value for the first character
of the parameter given.
SELECT UNICODE('Adam');
CHAR
The CHAR function returns a string corresponding to the supplied integer
character codes.
SELECT CHAR(65, 66, 67);
CONCAT
CONCAT is equivalent to the Concatenation Operator (||)
except that CONCAT is limited to 2-arity applications and || can be chained.
LIKE
The LIKE function serves the same purpose as the
LIKE operator.
However, the argument order is (effectively) reversed for the LIKE function,
to match the signature used in SQLite.
For the function version, the pattern is the first argument.
Optionally, an alternative escape character can be provided as a third argument.
SELECT * FROM users WHERE LIKE('Stev%', name);
SELECT * FROM users WHERE LIKE('EdgarX%', name, 'X');
Collection Functions
LENGTH
The LENGTH function counts the number of entries in a collection.
When supplied with a string, it is a synonym for CHARACTER_LENGTH.
SELECT LENGTH([3, 2]);
SELECT LENGTH({name: 'Peas', price: 8.99, product_no: 77});
SELECT LENGTH('josé');
NOTE: CARDINALITY is an synonym for LENGTH.
OBJECT_KEYS
An object's keys can be selected using OBJECT_KEYS.
SELECT OBJECT_KEYS({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40});
OBJECT_VALUES
An object's values can be selected using OBJECT_VALUES.
SELECT OBJECT_VALUES({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40});
OBJECT_ENTRIES
Returns an array of key-value pairs representing the given object.
SELECT OBJECT_ENTRIES({a: 1, b: 2, c: 3});
-- [['a': 1], ['b': 2], ['c': 3]]
OBJECT_FROM_ENTRIES
Constructs an object from an array of key-value pairs.
SELECT OBJECT_FROM_ENTRIES([['a', 1], ['b', 2], ['c', 3]]);
-- {a: 1, b: 2, c: 3}
PATCH
The PATCH function takes two documents.
The document returned is the first document "patched" with any fields found in the second document.
If the second document does not specify a field, that field is left untouched.
If the second document specifies any fields with values of NULL, those fields are removed.
SELECT PATCH(
{name: 'Salt', nutrition: {sodium: 100, ingredients: 'Kosher Salt'}},
{name: 'Sea Salt', nutrition: {ingredients: NULL}}
);
The PATCH function has an equivalent operator for data manipulation:
UPDATE PATCH
Table Value Functions
Table Value Functions are only valid within the FROM clause.
UNNEST
The UNNEST function can be thought of as the inverse of
ARRAY_AGG,
although it offers more power than just unlinking elements.
It takes an array or object and pulls its elements into separate rows.
SELECT * FROM UNNEST([1.99, 2.99, 3.99]) AS products(price);
It is possible to unnest multiple arrays.
If the arrays do not have the same number of elements, the shorter array(s) will
have those values filled with NULL:
SELECT names.* FROM (VALUES (['Leslie', 'Edgar', 'fiver2'], ['Lamport', 'Codd'])) AS x(first, last), UNNEST(x.first, x.last) AS names(first, last);
When unnesting an object, keys-value pairs will be returned as per object_entries. This behaviour is useful for manipulating collections:
SELECT * FROM UNNEST({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40}) AS prices(price);
-- [{'price': ['sale_price', 1.5]},
-- {'price': ['coupon_price', 1.4]},
-- {'price': ['original_price', 1.99]}]
Unnesting nested data from a queried table is done with the form
FROM <table>, UNNEST(<table>.<column>) AS foo(new_column).
For example:
INSERT INTO msgs
{text: "Here is some classic material",
user: "George",
workday: 2024-02-25,
media: [{type: "image", src: "dsm.png"},
{type: "video", src: "vldb.mp4"}]};
WITH m AS (SELECT * FROM msgs, UNNEST(msgs.media) AS m(media))
SELECT media FROM m WHERE media..type MATCH 'video';
WITH ORDINALITY
UNNEST can be suffixed with WITH ORDINALITY
to append an ordinal column to the results.
SELECT * FROM UNNEST([1.99, 2.99, 3.99]) WITH ORDINALITY AS products(price, n);
-- [{'n': 0, 'price': 1.99}, {'n': 1, 'price': 2.99}, {'n': 2, 'price': 3.99}]
NOTE: Endb ordinals are zero-indexed.
GENERATE_SERIES
The GENERATE_SERIES function generates an array of numbers within a given interval.
The first and second parameters are the start and end of the interval.
The optional third parameter is a step value by which to increment each number.
The result is returned as a single anonymous column (with the default name, column1)
containing the array.
SELECT * FROM GENERATE_SERIES(0, 21) AS t(s);
SELECT * FROM GENERATE_SERIES(0, 21, 3) AS t(s);
It is possible to use the result of GENERATE_SERIES in other SQL expressions, like IN:
SELECT * FROM products WHERE product_no IN (SELECT column1 FROM generate_series(1000, 20000) AS foo);
Numeric Functions
RANDOM
The RANDOM function returns a random integer.
SELECT RANDOM();
Math
Endb provides standard SQL math functions based on SQLite's collection of math functions:
- ROUND
- SIN
- COS
- TAN
- SINH
- COSH
- TANH
- ASIN
- ACOS
- ATAN
- ASINH
- ACOSH
- ATANH
- ATAN2
- FLOOR
- CEILING, CEIL
- SIGN
- SQRT
- EXP
- POWER, POW
- LOG, LOG10
- LOG2
- LN
- DEGREES
- RADIANS
- PI
- ABS
NOTE: Endb follows the choice of most SQL databases and aliases LOG to LOG10
rather than LN (natural log), as specified by the SQL standard.
NOTE: Mathematical operators are documented under Operators.
Date/Time Functions
STRFTIME
The STRFTIME function formats a date or time value as a string.
SELECT strftime('%Y/%m/%d', date('2001-01-01'));
SELECT strftime('%Y %m %d at %H %M %S', datetime('2001-01-01 03:04:05'));
UNIXEPOCH
The UNIXEPOCH function returns the number of seconds since the UNIX epoch.
Accepts a DATE, TIMESTAMP, or STRING.
SELECT UNIXEPOCH('2023-01-01');
SELECT UNIXEPOCH(1970-01-01T00:00:00Z);
JULIANDAY
The JULIANDAY function returns the Julian Day, which is the number of days since
noon in UTC on November 24, 4714 B.C.
Accepts a DATE, TIMESTAMP, or STRING.
SELECT JULIANDAY(1970-01-01);
EXTRACT
The EXTRACT pseudo-function provides a way to access one named, numerical portion of a
date, time, or timestamp.
Portions of dates can only be extracted from dates or timestamps.
Portions of times can only be extracted from timestamps or times.
SELECT EXTRACT(YEAR FROM CURRENT_DATE);
SELECT EXTRACT(MONTH FROM CURRENT_DATE);
SELECT EXTRACT(DAY FROM CURRENT_TIMESTAMP);
SELECT EXTRACT(HOUR FROM CURRENT_TIMESTAMP);
SELECT EXTRACT(MINUTE FROM CURRENT_TIME);
SELECT EXTRACT(SECOND FROM CURRENT_TIME);
NOTE: EXTRACT is a "pseudo-function" because internally it uses custom syntax of the form
<named-portion> FROM <date>.
PERIOD
The PERIOD function creates a new Period.
It is equivalent to using Period literals.
PERIOD(2001-04-01T00:00:00Z, 2001-05-01)
Aggregate Functions
MIN, MAX
The MIN and MAX functions return the minimum and maximum values for an expression,
respectively.
SELECT MIN(price) FROM products;
SELECT MAX(price) FROM products;
NOTE: MIN and MAX also have non-aggregate equivalents, which are 2-arity.
When used that way, they each return the minimum or maximum value of the two values provided.
SUM
The SUM function returns the sum of all non-null values under the column given as a parameter.
SELECT SUM(price) FROM products;
If all values for the given column are NULL, SUM returns NULL.
TOTAL
The TOTAL function is equivalent to SUM except that it returns 0.0 in the case where
all input values are NULL.
AVG
The AVG function takes a numerical-type-agnostic average of all values under the
column given as a parameter.
SELECT AVG(price) FROM products;
COUNT
The COUNT function returns the count of non-null, non-empty values for the specified column.
SELECT COUNT(price) FROM sales;
NOTE: Because null/empty values are ignored, the behaviour of COUNT will differ from other
SQL dialects. Whether or not COUNT(price) and COUNT(1) are equivalent is dependent on
whether the price attribute exists with a non-null value on each document.
ARRAY_AGG
The ARRAY_AGG function concatenates the results of an expression into an array.
The parameter may be ordered
within ARRAY_AGG.
SELECT ARRAY_AGG(price) FROM products;
SELECT ARRAY_AGG(name ORDER BY price DESC) FROM products;
Note that when operating on arrays, the arrays themselves will be concatenated, not the contents of the arrays. The result will be an array of one higher dimension:
SELECT ARRAY_AGG(x.column1) FROM (VALUES ([1,2]), ([3,4])) AS x;
-- [{'column1': [[1, 2], [3, 4]]}]
GROUP_CONCAT
The GROUP_CONCAT function returns a string with concatenated
non-null values from a column or group.
Given a second parameter,
It defaults to a comma-delimited list, but the second (optional) parameter
can override the delimiter.
SELECT GROUP_CONCAT(name) FROM products;
SELECT GROUP_CONCAT(name, ':') FROM products;
FILTER
All aggregate functions can have a filter applied before aggregation.
SELECT SUM(price) FILTER(WHERE price > 20) FROM products;
Data Type Functions
CAST
The CAST function forces a value into a particular data type.
Note that not all types are cast-compatible with each other.
SELECT CAST(price AS INTEGER) FROM products;
TYPEOF
The TYPEOF function returns the type of the provided value.
SELECT TYPEOF('hi2u');
SELECT TYPEOF(1.12345678901234);
SELECT TYPEOF(2018-01-01T00:00:00);
Conditional Functions
IIF
The IIF function is a conditional shorthand.
It returns the second parameter if the condition is true and the third parameter if the condition is false.
SELECT IIF(price > 5.99, 'Expensive!', 'Cheap') FROM products;
NULLIF
The NULLIF function returns TRUE if the two supplied expressions are equal.
SELECT NULLIF(1, 1);
SELECT NULLIF(1, 'zig');
COALESCE
The COALESCE function returns its first non-null argument.
The following example returns 'zig':
SELECT COALESCE(NULL, NULL, 'zig', 'zag');
Encoding Functions
BASE64
The BASE64 function takes a hexadecimal-encoded BLOB and returns a base64-encoded string, or vice-versa.
BASE64 roundtrips its own data.
There is therefore no BLOBFROMBASE64 function.
SELECT BASE64(x'010203');
SELECT BASE64('AQID');
UUID
The UUID function returns a universally-unique identifier, as a string.
The UUID_BLOB function takes a string UUID and returns a BLOB.
The UUID_STR function takes a BLOB UUID and returns a string.
When given a parameter of their return type, UUID_BLOB and UUID_STR will format the UUID provided.
SELECT UUID();
SELECT UUID_BLOB('d2ce21c9-d268-409a-b1e0-49e1200bfa47');
SELECT UUID_STR(x'd2ce21c9d268409ab1e049e1200bfa47');
-- formatting:
SELECT UUID_BLOB(x'd2ce21c9d268409ab1e049e1200bfa47');
SELECT UUID_STR('d2ce21c9d268409ab1e049e1200bfa47');
SHA1
The SHA1 function takes either a hexadecimal-encoded BLOB, a string, or a number.
It returns the SHA-1 encoding of that value.
SELECT SHA1('2');
RANDOMBLOB, ZEROBLOB
The RANDOMBLOB function returns a random binary large object of the size given, in bytes.
The ZEROBLOB function returns a zeroed-out binary large object of the size given, in bytes.
SELECT RANDOMBLOB(32);
SELECT ZEROBLOB(32);
HEX, UNHEX
The HEX function takes a BLOB (or coerces its argument into a UTF-8 string,
which in turn is interpreted as a BLOB)
and turns the BLOB into an upper-case hexadecimal string.
The UNHEX function takes a hexadecimal string and turns it into a BLOB.
The hexadecimal string provided must contain character pairs.
UNHEX takes an optional second parameter: a string containing non-hexadecimal
characters to be ignored in the first parameter.
If non-hexadecimal characters are found in the first parameter but not ignored
in the second parameter, UNHEX returns NULL.
SELECT HEX(15);
-- '3135'
SELECT UNHEX('3135');
-- b'15'
SELECT UNHEX('3135ZZ', 'Z');
-- b'15'
SELECT UNHEX('3135ZZ', 'M');
-- NULL
Vector Functions
Indexing
At the moment, Endb performs exact nearest neighbor search, which provides perfect recall. Endb may support approximate vector indexes in the future, which trade accuracy for speed.
L2_DISTANCE
The L2_DISTANCE function returns the
Euclidean distance
between two vectors.
It is symmetrical to the <-> operator.
SELECT L2_DISTANCE([-0.7888,-0.7361,-0.6208,-0.5134,-0.4044], [0.8108,0.6671,0.5565,0.5449,0.4466]);
-- [{'column1': 2.7853052938591847}]
COSINE_DISTANCE
The COSINE_DISTANCE function returns the complement of
Cosine Similarity
for two vectors.
It is symmetrical to the <=> operator.
SELECT COSINE_DISTANCE([-0.7888,-0.7361,-0.6208,-0.5134,-0.4044], [0.8108,0.6671,0.5565,0.5449,0.4466]);
-- [{'column1': 1.9970250541178656}]
INNER_PRODUCT
The INNER_PRODUCT function returns the
Inner Product
of two vectors.
It is inverse of the <#> operator.
SELECT INNER_PRODUCT([1,2], [3,4]);
-- [{'column1': 11.0}]