Functions
String Functions
CHARACTER_LENGTH
The CHARACTER_LENGTH
function returns the number of unicode characters in a string.
SELECT CHARACTER_LENGTH('josé');
-- 4
SELECT CHARACTER_LENGTH('❤️🥫');
-- 3
OCTET_LENGTH
The OCTET_LENGTH
function returns the length of a string, in bytes (octets).
SELECT OCTET_LENGTH('josé');
-- 5
SELECT OCTET_LENGTH('❤️🥫');
-- 10
TRIM, LTRIM, RTRIM
The TRIM
, LTRIM
, and RTRIM
functions trim surrounding whitespace,
whitespace to the left, and whitespace to the right of a string, respectively.
SELECT TRIM(' hello ');
-- 'hello'
SELECT LTRIM(' hello ');
-- 'hello '
SELECT RTRIM(' hello ');
-- ' hello'
LOWER, UPPER
The LOWER
and UPPER
functions downcase and upcase a string, respectively.
SELECT LOWER('Relatable Algebra');
-- 'relatable algebra'
SELECT UPPER('Shouting Calculus');
-- 'SHOUTING CALCULUS'
REPLACE
The REPLACE
function returns the string in the first parameter,
with the second parameter (if found) replaced by the third.
SELECT REPLACE('Relatable Algebra', 'Rela', 'Infla');
INSTR
The INSTR
function returns the first character of a substring match on the second parameter,
if found, and 0
if it is not found.
SELECT INSTR('Coffee', 'ee');
SUBSTRING
The SUBSTRING
function returns the substring starting from the index provided as the second parameter.
If the (optional) third parameter is provided, the substring will be of that length (or less, if the end of the source string is reached).
SUBSTR
is a synonym for SUBSTRING
.
SELECT SUBSTRING('Hello Edgar', 4);
SELECT SUBSTR('Hello Edgar', 4, 2);
POSITION
The POSITION
pseudo-function returns the first position of the first character of the first matched substring
in another string.
If the substring is not detected, POSITION
returns 0
.
SELECT POSITION("h" IN "Hawaii");
SELECT POSITION("i" IN "Hawaii");
SELECT POSITION("Ha" IN "Hawaii");
NOTE: POSITION
is a "pseudo-function" because internally it uses custom syntax of the form
<substring> IN <string>
.
UNICODE
The UNICODE
function returns an integer unicode value for the first character
of the parameter given.
SELECT UNICODE('Adam');
CHAR
The CHAR
function returns a string corresponding to the supplied integer
character codes.
SELECT CHAR(65, 66, 67);
CONCAT
CONCAT
is equivalent to the Concatenation Operator (||)
except that CONCAT
is limited to 2-arity applications and ||
can be chained.
LIKE
The LIKE
function serves the same purpose as the
LIKE
operator.
However, the argument order is (effectively) reversed for the LIKE
function,
to match the signature used in SQLite.
For the function version, the pattern is the first argument.
Optionally, an alternative escape character can be provided as a third argument.
SELECT * FROM users WHERE LIKE('Stev%', name);
SELECT * FROM users WHERE LIKE('EdgarX%', name, 'X');
Collection Functions
LENGTH
The LENGTH
function counts the number of entries in a collection.
When supplied with a string, it is a synonym for CHARACTER_LENGTH
.
SELECT LENGTH([3, 2]);
SELECT LENGTH({name: 'Peas', price: 8.99, product_no: 77});
SELECT LENGTH('josé');
NOTE: CARDINALITY
is an synonym for LENGTH
.
OBJECT_KEYS
An object's keys can be selected using OBJECT_KEYS
.
SELECT OBJECT_KEYS({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40});
OBJECT_VALUES
An object's values can be selected using OBJECT_VALUES
.
SELECT OBJECT_VALUES({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40});
OBJECT_ENTRIES
Returns an array of key-value pairs representing the given object.
SELECT OBJECT_ENTRIES({a: 1, b: 2, c: 3});
-- [['a': 1], ['b': 2], ['c': 3]]
OBJECT_FROM_ENTRIES
Constructs an object from an array of key-value pairs.
SELECT OBJECT_FROM_ENTRIES([['a', 1], ['b', 2], ['c', 3]]);
-- {a: 1, b: 2, c: 3}
PATCH
The PATCH
function takes two documents.
The document returned is the first document "patched" with any fields found in the second document.
If the second document does not specify a field, that field is left untouched.
If the second document specifies any fields with values of NULL
, those fields are removed.
SELECT PATCH(
{name: 'Salt', nutrition: {sodium: 100, ingredients: 'Kosher Salt'}},
{name: 'Sea Salt', nutrition: {ingredients: NULL}}
);
The PATCH
function has an equivalent operator for data manipulation:
UPDATE PATCH
Table Value Functions
Table Value Functions are only valid within the FROM
clause.
UNNEST
The UNNEST
function can be thought of as the inverse of
ARRAY_AGG
,
although it offers more power than just unlinking elements.
It takes an array or object and pulls its elements into separate rows.
SELECT * FROM UNNEST([1.99, 2.99, 3.99]) AS products(price);
It is possible to unnest multiple arrays.
If the arrays do not have the same number of elements, the shorter array(s) will
have those values filled with NULL
:
SELECT names.* FROM (VALUES (['Leslie', 'Edgar', 'fiver2'], ['Lamport', 'Codd'])) AS x(first, last), UNNEST(x.first, x.last) AS names(first, last);
When unnesting an object, keys-value pairs will be returned as per object_entries. This behaviour is useful for manipulating collections:
SELECT * FROM UNNEST({original_price: 1.99, sale_price: 1.50, coupon_price: 1.40}) AS prices(price);
-- [{'price': ['sale_price', 1.5]},
-- {'price': ['coupon_price', 1.4]},
-- {'price': ['original_price', 1.99]}]
Unnesting nested data from a queried table is done with the form
FROM <table>, UNNEST(<table>.<column>) AS foo(new_column)
.
For example:
INSERT INTO msgs
{text: "Here is some classic material",
user: "George",
workday: 2024-02-25,
media: [{type: "image", src: "dsm.png"},
{type: "video", src: "vldb.mp4"}]};
WITH m AS (SELECT * FROM msgs, UNNEST(msgs.media) AS m(media))
SELECT media FROM m WHERE media..type MATCH 'video';
WITH ORDINALITY
UNNEST
can be suffixed with WITH ORDINALITY
to append an ordinal column to the results.
SELECT * FROM UNNEST([1.99, 2.99, 3.99]) WITH ORDINALITY AS products(price, n);
-- [{'n': 0, 'price': 1.99}, {'n': 1, 'price': 2.99}, {'n': 2, 'price': 3.99}]
NOTE: Endb ordinals are zero-indexed.
GENERATE_SERIES
The GENERATE_SERIES
function generates an array of numbers within a given interval.
The first and second parameters are the start and end of the interval.
The optional third parameter is a step value by which to increment each number.
The result is returned as a single anonymous column (with the default name, column1
)
containing the array.
SELECT * FROM GENERATE_SERIES(0, 21) AS t(s);
SELECT * FROM GENERATE_SERIES(0, 21, 3) AS t(s);
It is possible to use the result of GENERATE_SERIES
in other SQL expressions, like IN
:
SELECT * FROM products WHERE product_no IN (SELECT column1 FROM generate_series(1000, 20000) AS foo);
Numeric Functions
RANDOM
The RANDOM
function returns a random integer.
SELECT RANDOM();
Math
Endb provides standard SQL math functions based on SQLite's collection of math functions:
- ROUND
- SIN
- COS
- TAN
- SINH
- COSH
- TANH
- ASIN
- ACOS
- ATAN
- ASINH
- ACOSH
- ATANH
- ATAN2
- FLOOR
- CEILING, CEIL
- SIGN
- SQRT
- EXP
- POWER, POW
- LOG, LOG10
- LOG2
- LN
- DEGREES
- RADIANS
- PI
- ABS
NOTE: Endb follows the choice of most SQL databases and aliases LOG
to LOG10
rather than LN
(natural log), as specified by the SQL standard.
NOTE: Mathematical operators are documented under Operators.
Date/Time Functions
STRFTIME
The STRFTIME
function formats a date or time value as a string.
SELECT strftime('%Y/%m/%d', date('2001-01-01'));
SELECT strftime('%Y %m %d at %H %M %S', datetime('2001-01-01 03:04:05'));
UNIXEPOCH
The UNIXEPOCH
function returns the number of seconds since the UNIX epoch.
Accepts a DATE
, TIMESTAMP
, or STRING
.
SELECT UNIXEPOCH('2023-01-01');
SELECT UNIXEPOCH(1970-01-01T00:00:00Z);
JULIANDAY
The JULIANDAY
function returns the Julian Day, which is the number of days since
noon in UTC on November 24, 4714 B.C.
Accepts a DATE
, TIMESTAMP
, or STRING
.
SELECT JULIANDAY(1970-01-01);
EXTRACT
The EXTRACT
pseudo-function provides a way to access one named, numerical portion of a
date, time, or timestamp.
Portions of dates can only be extracted from dates or timestamps.
Portions of times can only be extracted from timestamps or times.
SELECT EXTRACT(YEAR FROM CURRENT_DATE);
SELECT EXTRACT(MONTH FROM CURRENT_DATE);
SELECT EXTRACT(DAY FROM CURRENT_TIMESTAMP);
SELECT EXTRACT(HOUR FROM CURRENT_TIMESTAMP);
SELECT EXTRACT(MINUTE FROM CURRENT_TIME);
SELECT EXTRACT(SECOND FROM CURRENT_TIME);
NOTE: EXTRACT
is a "pseudo-function" because internally it uses custom syntax of the form
<named-portion> FROM <date>
.
PERIOD
The PERIOD
function creates a new Period.
It is equivalent to using Period literals.
PERIOD(2001-04-01T00:00:00Z, 2001-05-01)
Aggregate Functions
MIN, MAX
The MIN
and MAX
functions return the minimum and maximum values for an expression,
respectively.
SELECT MIN(price) FROM products;
SELECT MAX(price) FROM products;
NOTE: MIN
and MAX
also have non-aggregate equivalents, which are 2-arity.
When used that way, they each return the minimum or maximum value of the two values provided.
SUM
The SUM
function returns the sum of all non-null values under the column given as a parameter.
SELECT SUM(price) FROM products;
If all values for the given column are NULL
, SUM
returns NULL
.
TOTAL
The TOTAL
function is equivalent to SUM
except that it returns 0.0
in the case where
all input values are NULL
.
AVG
The AVG
function takes a numerical-type-agnostic average of all values under the
column given as a parameter.
SELECT AVG(price) FROM products;
COUNT
The COUNT
function returns the count of non-null, non-empty values for the specified column.
SELECT COUNT(price) FROM sales;
NOTE: Because null/empty values are ignored, the behaviour of COUNT
will differ from other
SQL dialects. Whether or not COUNT(price)
and COUNT(1)
are equivalent is dependent on
whether the price
attribute exists with a non-null value on each document.
ARRAY_AGG
The ARRAY_AGG
function concatenates the results of an expression into an array.
The parameter may be ordered
within ARRAY_AGG
.
SELECT ARRAY_AGG(price) FROM products;
SELECT ARRAY_AGG(name ORDER BY price DESC) FROM products;
Note that when operating on arrays, the arrays themselves will be concatenated, not the contents of the arrays. The result will be an array of one higher dimension:
SELECT ARRAY_AGG(x.column1) FROM (VALUES ([1,2]), ([3,4])) AS x;
-- [{'column1': [[1, 2], [3, 4]]}]
GROUP_CONCAT
The GROUP_CONCAT
function returns a string with concatenated
non-null values from a column or group.
Given a second parameter,
It defaults to a comma-delimited list, but the second (optional) parameter
can override the delimiter.
SELECT GROUP_CONCAT(name) FROM products;
SELECT GROUP_CONCAT(name, ':') FROM products;
FILTER
All aggregate functions can have a filter applied before aggregation.
SELECT SUM(price) FILTER(WHERE price > 20) FROM products;
Data Type Functions
CAST
The CAST
function forces a value into a particular data type.
Note that not all types are cast-compatible with each other.
SELECT CAST(price AS INTEGER) FROM products;
TYPEOF
The TYPEOF
function returns the type of the provided value.
SELECT TYPEOF('hi2u');
SELECT TYPEOF(1.12345678901234);
SELECT TYPEOF(2018-01-01T00:00:00);
Conditional Functions
IIF
The IIF
function is a conditional shorthand.
It returns the second parameter if the condition is true and the third parameter if the condition is false.
SELECT IIF(price > 5.99, 'Expensive!', 'Cheap') FROM products;
NULLIF
The NULLIF
function returns TRUE
if the two supplied expressions are equal.
SELECT NULLIF(1, 1);
SELECT NULLIF(1, 'zig');
COALESCE
The COALESCE
function returns its first non-null argument.
The following example returns 'zig'
:
SELECT COALESCE(NULL, NULL, 'zig', 'zag');
Encoding Functions
BASE64
The BASE64
function takes a hexadecimal-encoded BLOB and returns a base64-encoded string, or vice-versa.
BASE64
roundtrips its own data.
There is therefore no BLOBFROMBASE64
function.
SELECT BASE64(x'010203');
SELECT BASE64('AQID');
UUID
The UUID
function returns a universally-unique identifier, as a string.
The UUID_BLOB
function takes a string UUID and returns a BLOB.
The UUID_STR
function takes a BLOB UUID and returns a string.
When given a parameter of their return type, UUID_BLOB
and UUID_STR
will format the UUID provided.
SELECT UUID();
SELECT UUID_BLOB('d2ce21c9-d268-409a-b1e0-49e1200bfa47');
SELECT UUID_STR(x'd2ce21c9d268409ab1e049e1200bfa47');
-- formatting:
SELECT UUID_BLOB(x'd2ce21c9d268409ab1e049e1200bfa47');
SELECT UUID_STR('d2ce21c9d268409ab1e049e1200bfa47');
SHA1
The SHA1
function takes either a hexadecimal-encoded BLOB, a string, or a number.
It returns the SHA-1 encoding of that value.
SELECT SHA1('2');
RANDOMBLOB, ZEROBLOB
The RANDOMBLOB
function returns a random binary large object of the size given, in bytes.
The ZEROBLOB
function returns a zeroed-out binary large object of the size given, in bytes.
SELECT RANDOMBLOB(32);
SELECT ZEROBLOB(32);
HEX, UNHEX
The HEX
function takes a BLOB (or coerces its argument into a UTF-8 string,
which in turn is interpreted as a BLOB)
and turns the BLOB into an upper-case hexadecimal string.
The UNHEX
function takes a hexadecimal string and turns it into a BLOB.
The hexadecimal string provided must contain character pairs.
UNHEX
takes an optional second parameter: a string containing non-hexadecimal
characters to be ignored in the first parameter.
If non-hexadecimal characters are found in the first parameter but not ignored
in the second parameter, UNHEX
returns NULL
.
SELECT HEX(15);
-- '3135'
SELECT UNHEX('3135');
-- b'15'
SELECT UNHEX('3135ZZ', 'Z');
-- b'15'
SELECT UNHEX('3135ZZ', 'M');
-- NULL
Vector Functions
Indexing
At the moment, Endb performs exact nearest neighbor search, which provides perfect recall. Endb may support approximate vector indexes in the future, which trade accuracy for speed.
L2_DISTANCE
The L2_DISTANCE
function returns the
Euclidean distance
between two vectors.
It is symmetrical to the <->
operator.
SELECT L2_DISTANCE([-0.7888,-0.7361,-0.6208,-0.5134,-0.4044], [0.8108,0.6671,0.5565,0.5449,0.4466]);
-- [{'column1': 2.7853052938591847}]
COSINE_DISTANCE
The COSINE_DISTANCE
function returns the complement of
Cosine Similarity
for two vectors.
It is symmetrical to the <=>
operator.
SELECT COSINE_DISTANCE([-0.7888,-0.7361,-0.6208,-0.5134,-0.4044], [0.8108,0.6671,0.5565,0.5449,0.4466]);
-- [{'column1': 1.9970250541178656}]
INNER_PRODUCT
The INNER_PRODUCT
function returns the
Inner Product
of two vectors.
It is inverse of the <#>
operator.
SELECT INNER_PRODUCT([1,2], [3,4]);
-- [{'column1': 11.0}]