DuckDB

Introduction to DuckDB

DuckDB is an in-process SQL database management system designed for analytical workloads. It is optimized for Online Analytical Processing (OLAP) and provides fast query execution directly within your application. DuckDB is lightweight, easy to use, and does not require a server, making it an excellent choice for data analysis tasks on local or embedded systems.

DuckDB supports standard SQL syntax and provides client APIs for several programming languages, including Python, R, and C++. It is particularly well suited to running complex analytical queries over large datasets. For more details, visit the official DuckDB documentation.

Using DuckDB with Modules

To use DuckDB on the terrabyte HPC system, load the DuckDB module with the following command:

# Consider adding the "module use" line to your ~/.bashrc so that the terrabyte modules are available in every session
module use /dss/dsstbyfs01/pn56su/pn56su-dss-0020/usr/share/modules/files/
module load duckdb

Usage Examples

Once the module is loaded, you can use DuckDB through its interactive shell or by executing SQL queries from scripts. For example, to start the DuckDB shell:

duckdb
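
Running SQL from a script, as mentioned above, could look roughly like the following. This is a minimal sketch rather than a site-specific recipe: the file name example_query.sql, the table name, and the sample values are made up for illustration, and the check for the duckdb executable assumes the module has been loaded in the current session.

```shell
# Write a small SQL script (example_query.sql is a hypothetical file name).
cat > example_query.sql <<'SQL'
-- Build a tiny table from inline sample values and aggregate it.
CREATE TABLE measurements AS
    SELECT * FROM (VALUES ('a', 1.5), ('a', 2.5), ('b', 4.0)) AS t(station, reading);
SELECT station, avg(reading) AS mean_reading
FROM measurements
GROUP BY station
ORDER BY station;
SQL

# Run the script only if the duckdb executable is on the PATH,
# i.e. after "module load duckdb".
if command -v duckdb >/dev/null 2>&1; then
    # With no database file argument, DuckDB works on a transient in-memory database.
    duckdb < example_query.sql
    # A single statement can also be passed directly with -c.
    duckdb -c "SELECT 42 AS answer;"
else
    echo "duckdb not found; load the module first" >&2
fi
```

To keep tables between runs, pass a database file as the first argument (for example, duckdb my_analysis.duckdb < example_query.sql, where my_analysis.duckdb is again a hypothetical name).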

For additional usage instructions and configuration details, refer to the DuckDB documentation.