The most surprising thing about RDS Trusted Language Extensions is that they let you build and install your own PostgreSQL extensions on a managed database, where you have no access to the server’s file system, as long as the code is written in a trusted language such as PL/pgSQL, JavaScript (PL/v8), Perl, or Tcl.
Let’s see it in action. Imagine you’re working with a PostgreSQL database on RDS and you need to perform some complex string manipulation that’s cumbersome in SQL alone. You’ve written a Python function to do this:
import re

def extract_email_addresses(text):
    """Extracts all email addresses from a given text."""
    # [A-Za-z] rather than [A-Z|a-z]: inside a character class, | is a literal pipe.
    email_regex = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    return re.findall(email_regex, text)
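Before moving anything into the database, it is worth checking the function on its own. The following is a minimal sketch that repeats the function with the top-level-domain class written as `[A-Za-z]` and runs it on a sample sentence; the sample text is illustrative only.

```python
import re

# Compiled once at import time; the pattern is the one from the function above,
# with the character class written as [A-Za-z].
EMAIL_REGEX = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

def extract_email_addresses(text):
    """Return every email-like substring found in text, in order of appearance."""
    return EMAIL_REGEX.findall(text)

sample = "Contact us at support@example.com or info@anothersite.org for assistance."
found = extract_email_addresses(sample)
print(found)
```

The pattern is a pragmatic filter, not a full address validator, so it is best treated as a way to find candidates.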
Normally, you’d have to pull data out of RDS, process it in your application layer, and then push it back, which adds network round trips and extra code to maintain. With Trusted Language Extensions, you can move this kind of logic into the database itself and run it next to the data.
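First, you need to enable Trusted Language Extensions on the instance. There is no dedicated AWS CLI command for this. TLE ships as the pg_tle PostgreSQL extension, which you turn on by adding pg_tle to shared_preload_libraries in a custom DB parameter group, rebooting the instance, and then creating the extension in your database. The parameter group and instance names below are placeholders, and if the parameter already lists other libraries, keep them in the comma-separated value.
aws rds modify-db-parameter-group \
    --db-parameter-group-name my-custom-pg-params \
    --parameters "ParameterName=shared_preload_libraries,ParameterValue=pg_tle,ApplyMethod=pending-reboot"
aws rds reboot-db-instance \
    --db-instance-identifier my-rds-instance
After the reboot, connect as a user with the rds_superuser role and run CREATE EXTENSION pg_tle; in the target database.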
Once the extension is enabled on your RDS instance, you can then create a SQL function that wraps your Python code. This is where the magic happens.
CREATE OR REPLACE FUNCTION extract_emails(input_text TEXT)
RETURNS SETOF TEXT
AS $$
import re

# [A-Za-z] rather than [A-Z|a-z]: inside a character class, | is a literal pipe.
email_regex = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
# PL/Python turns each element of the returned list into one result row.
return re.findall(email_regex, input_text)
$$ LANGUAGE plpython3u;
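Notice the LANGUAGE plpython3u. The u suffix marks the language as untrusted: PostgreSQL places no restrictions on what the code can do, so it can reach the file system and network with the privileges of the server process, and only superusers can create functions in it. That is the opposite of what Trusted Language Extensions are built on. The "Trusted" in the name refers to trusted languages, which confine code to a sandbox inside the database, and pg_tle supports only those, for example PL/pgSQL, JavaScript (PL/v8), Perl, and Tcl. PL/Python exists only in untrusted form and is not available on RDS, so the function above runs as written on a self-managed PostgreSQL server with plpython3u installed, while on RDS the same logic has to be ported to one of the trusted languages and registered through pg_tle.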
Now, you can call this function directly from your SQL queries:
SELECT extract_emails('Contact us at support@example.com or info@anothersite.org for assistance.');
This query would return:
extract_emails
-------------------------
support@example.com
info@anothersite.org
(2 rows)
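Each address arrives as its own row because of how PL/Python handles set-returning functions: the function body returns an iterable, or yields values one at a time, and every element becomes one row. A minimal plain-Python sketch of the generator form, runnable outside the database (the name `extract_emails_body` is made up for illustration):

```python
import re

EMAIL_REGEX = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

def extract_emails_body(input_text):
    # Stand-in for the body of a SETOF TEXT function:
    # each yielded value corresponds to one result row.
    for match in EMAIL_REGEX.finditer(input_text):
        yield match.group(0)

rows = list(extract_emails_body(
    "Contact us at support@example.com or info@anothersite.org for assistance."
))
for row in rows:
    print(row)
```

Yielding keeps memory use flat for large inputs, since matches are produced one at a time and no full list is built first.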
The problem this solves is a real one: it bridges the gap between the procedural power of general-purpose programming languages and the declarative, data-centric world of SQL. You can package complex data transformations and validation logic as extensions and run them next to the data. Machine learning inference with native libraries such as scikit-learn or TensorFlow, and calls to external services, are out of scope, because trusted languages block the file system and network access those would need.
Internally, there are two pieces. The language runtime comes from PostgreSQL’s procedural language machinery: when a function written in a procedural language is called, PostgreSQL invokes that language’s call handler, passes the function arguments, executes the code, and receives the results back. The PL/Python handler, for instance, interfaces with an embedded Python interpreter. pg_tle adds the packaging layer: it stores an extension’s SQL scripts and metadata inside the database rather than as files on the server, so CREATE EXTENSION works without file system access. The "trusted" aspect is a property of the language, which blocks operations such as file and network access, and not a review of your code by AWS. It doesn’t grant arbitrary OS-level access but rather a controlled execution environment for your code.
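The call path (look up the stored body, pass the arguments, execute, hand back the result) can be imitated in a few lines of plain Python. This is a toy model for intuition only, not how PostgreSQL or pg_tle is implemented; `make_handler` and its parameters are invented for this sketch.

```python
import textwrap

def make_handler(func_name, arg_names, body_source):
    """Wrap a stored function body in a def, compile it, and return the callable."""
    indented_body = textwrap.indent(
        textwrap.dedent(body_source).strip("\n"), "    "
    )
    source = "def {}({}):\n{}".format(func_name, ", ".join(arg_names), indented_body)
    namespace = {}
    exec(compile(source, "<stored function>", "exec"), namespace)
    return namespace[func_name]

# The body is kept as text, the way a database keeps a function's source.
body = r"""
import re
email_regex = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
return re.findall(email_regex, input_text)
"""

extract = make_handler("extract_emails", ["input_text"], body)
result = extract("Contact us at support@example.com")
print(result)
```

Note that `exec` here applies no restrictions at all, which is exactly what an untrusted language looks like. A trusted language differs in that its runtime refuses operations such as opening files or sockets.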
The levers you control are the SQL function definition and how the extension is registered. You define the function’s signature, the language it’s written in, and the code block, and you control which users have EXECUTE privileges on the function. With pg_tle, installing an extension requires the pgtle_admin role, each version is registered with pgtle.install_extension, and dependencies on other extensions can be declared through its optional requires argument.
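To make the registration step concrete, here is a hedged sketch that only builds the SQL text and does not connect to a database. It assumes pg_tle’s `pgtle.install_extension(name, version, description, ext)` function; the extension name `email_tools`, its version, its description, and the SQL-language port of the extractor are all hypothetical. The port drops `\b`, because PostgreSQL regular expressions do not treat `\b` as a word boundary.

```python
# Placeholder metadata for a hypothetical extension.
EXTENSION_NAME = "email_tools"
EXTENSION_VERSION = "1.0"
EXTENSION_DESCRIPTION = "Email extraction helpers"

# A SQL-language port of the extractor (an assumption of this sketch).
# regexp_matches with the 'g' flag returns one text[] per match.
FUNCTION_SQL = r"""
CREATE FUNCTION extract_emails(input_text TEXT)
RETURNS SETOF TEXT
AS $$
    SELECT m[1]
    FROM regexp_matches(
        input_text,
        '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}',
        'g'
    ) AS m;
$$ LANGUAGE sql;
"""

def build_install_statement(name, version, description, body):
    """Build the install call as text. No quote escaping: inputs must be trusted."""
    # $_pgtle_$ is a dollar-quote tag distinct from the $$ used inside the body.
    return (
        "SELECT pgtle.install_extension(\n"
        f"    '{name}',\n"
        f"    '{version}',\n"
        f"    '{description}',\n"
        f"$_pgtle_$\n{body.strip()}\n$_pgtle_$\n"
        ");"
    )

statement = build_install_statement(
    EXTENSION_NAME, EXTENSION_VERSION, EXTENSION_DESCRIPTION, FUNCTION_SQL
)
print(statement)
print(f"CREATE EXTENSION {EXTENSION_NAME};")
```

Running the printed statements in psql as a user with the pgtle_admin role would register the extension and then create it in the current database.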
A common misconception is that "trusted" means you can run anything without review. The trust is in the language sandbox and the extension mechanism, and nobody vets the functions you write. They are still your responsibility to write securely and correctly. A bug in your code, like an infinite loop or excessive memory allocation, can still degrade your database’s performance or stability, even though the sandbox makes it unlikely to compromise the underlying OS.
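One way to limit the damage from oversized inputs is to bound the work inside the function itself. The sketch below shows the idea in Python; the limits `MAX_INPUT_CHARS` and `MAX_MATCHES` are arbitrary values chosen for illustration, not recommendations.

```python
import re

MAX_INPUT_CHARS = 100_000  # hypothetical cap on input size
MAX_MATCHES = 1_000        # hypothetical cap on rows returned

EMAIL_REGEX = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

def extract_email_addresses_bounded(text, max_chars=MAX_INPUT_CHARS,
                                    max_matches=MAX_MATCHES):
    """Extract addresses, refusing oversized input and capping the result size."""
    if text is None:
        return []  # treat a NULL-like input as "no addresses"
    if len(text) > max_chars:
        raise ValueError(f"input longer than {max_chars} characters")
    results = []
    for match in EMAIL_REGEX.finditer(text):
        results.append(match.group(0))
        if len(results) >= max_matches:
            break  # stop early so the result stays bounded
    return results

print(extract_email_addresses_bounded("a@example.com b@example.org", max_matches=1))
```

Guards like these complement, and do not replace, database-side limits such as a statement timeout.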
After successfully running custom code within your database, the next challenge you’ll likely face is managing the dependencies and versions of these extensions across different environments.