DuckDB
Introduction to DuckDB
DuckDB is an in-process SQL database management system optimized for Online Analytical Processing (OLAP) workloads. It executes queries directly within your application, so no separate server process is required. This makes it a lightweight, easy-to-use choice for data analysis tasks on local or embedded systems.
DuckDB supports standard SQL syntax and provides client APIs for several programming languages, including Python, R, and C++. It is well suited to running complex analytical queries over large datasets. For more details, see the official DuckDB documentation.
Using DuckDB with Modules
To use DuckDB on the terrabyte HPC system, load the DuckDB module with the following command:
# consider adding the module use line to your ~/.bashrc to always make terrabyte modules available
module use /dss/dsstbyfs01/pn56su/pn56su-dss-0020/usr/share/modules/files/
module load duckdb
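To confirm that the module took effect, a quick check along the following lines may help. This is a minimal sketch that assumes only that the module places a `duckdb` executable on your PATH; the variable name `duckdb_status` and the message text are illustrative.

```shell
# Check whether the duckdb executable is on PATH after `module load duckdb`.
if command -v duckdb >/dev/null 2>&1; then
    duckdb_status="available"
    # Print the version of the DuckDB CLI that the module provides.
    duckdb --version
else
    duckdb_status="missing"
    echo "duckdb not found on PATH; check that the module loaded" >&2
fi
```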
Usage Examples
Once the module is loaded, you can run DuckDB's interactive shell or execute SQL queries from scripts. For example, to start the DuckDB shell:
duckdb
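Running SQL from a script, rather than from the interactive shell, could look roughly like the sketch below. The file name `example_query.sql` and the query are hypothetical placeholders. With no database file given, DuckDB uses a transient in-memory database.

```shell
# Write a small SQL script (hypothetical file name and query).
cat > example_query.sql <<'SQL'
SELECT 42 AS answer;
SQL

if command -v duckdb >/dev/null 2>&1; then
    # Feed the script to the DuckDB CLI on standard input.
    duckdb < example_query.sql
    # Alternatively, pass a single statement with -c and exit afterwards.
    duckdb -c "SELECT 42 AS answer;"
else
    echo "duckdb not found on PATH; load the module first" >&2
fi
```

To keep results between runs, pass a database file as the first argument, for example `duckdb my_analysis.duckdb < example_query.sql` (the database file name is again a placeholder).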
For additional usage instructions and configuration details, refer to the DuckDB documentation.