# PostgreSQL Node

## Overview
The PostgreSQL Node is a database integration component that automates data synchronization from PostgreSQL databases. This node enables scheduled fetching of rows from specified tables and materialized views, supporting vectorization and indexing for Retrieval-Augmented Generation (RAG) workflows.
## Node Type Information
Type | Description | Status |
---|---|---|
Batch Trigger | Starts the flow on a schedule or batch event. Ideal for periodic data processing. | ✅ True |
Event Trigger | Starts the flow based on external events (e.g., webhook, user interaction). | ❌ False |
Action | Executes a task or logic as part of the flow (e.g., API call, transformation). | ✅ True |
This node is a Batch Trigger that synchronizes data from PostgreSQL databases on a schedule, enabling RAG workflows on top of your database content. It can also be used as an Action to run queries within a flow (see Action Configuration below).
This integration connects to your PostgreSQL database to sync data for processing in Lamatic Flow. Ensure your database user has appropriate permissions to access the selected schemas and tables. If the connection fails, whitelist the IP ranges listed at https://www.cloudflare.com/ips/.
## Features

### Key Functionalities
- Data Synchronization: Automatically syncs data from PostgreSQL tables and materialized views with configurable scheduling.
- Scheduled Processing: Supports automated sync schedules with cron expressions for regular data updates.
- RAG Integration: Enables vectorization and indexing for AI-powered data retrieval and analysis.
- Flexible Sync Modes: Supports both incremental (new/modified data only) and full-refresh (all data) synchronization modes.
- Schema Support: Processes data from multiple database schemas and tables for comprehensive data access.
- Materialized View Support: Handles materialized views for complex queries and aggregated data.
### Benefits
- Automated Database Sync: Automates database synchronization processes, reducing manual effort and ensuring data consistency.
- RAG Workflow Enablement: Enables RAG workflows with PostgreSQL data for intelligent data retrieval and analysis.
- Granular Control: Provides granular control over table selection and processing through schema and table configuration.
- Performance Optimization: Supports incremental sync modes for better performance with large datasets.
- Flexible Scheduling: Offers flexible scheduling options to minimize database impact during peak hours.
- Comprehensive Data Access: Supports both tables and materialized views for diverse data processing needs.
## Prerequisites

Before using the PostgreSQL Node, ensure the following:
- PostgreSQL Database: Access to a PostgreSQL database with appropriate credentials.
- Database Permissions: Read permissions on target schemas, tables, and materialized views.
- Connection Details: Database host, port, database name, username, and password.
- Cron Expression Knowledge: Understanding of cron expressions for scheduling flows.
- Network Access: Proper network connectivity to the PostgreSQL database server.
## Setup

### Step 1: Set Up PostgreSQL Credentials
Please refer to the Postgres Integration documentation to complete the setup and obtain the necessary credentials.
Ensure your database user has appropriate permissions to access the selected schemas and tables. If the connection fails, whitelist the IP ranges listed at https://www.cloudflare.com/ips/.
### Step 2: Configure PostgreSQL Credentials
Use the following format to set up your credentials:
Key Name | Description | Example Value |
---|---|---|
Credential Name | Name to identify this set of credentials | my-postgres-creds |
Host | PostgreSQL database host address | localhost or db.example.com |
Port | PostgreSQL database port number | 5432 |
Database | Name of the PostgreSQL database | myapp_production |
Username | PostgreSQL database username | db_user |
Password | PostgreSQL database password | secure_password |
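For reference, the same fields written out in configuration form might look like this (an illustrative sketch; credentials are actually entered through the Lamatic credential form, and the key casing here is assumed):

```yaml
# Illustrative credential values — enter these through the Lamatic credential form
credentialName: my-postgres-creds
host: db.example.com
port: 5432
database: myapp_production
username: db_user
password: secure_password # use a secrets manager; never commit passwords to flow files
```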
### Step 3: Set Up Lamatic Flow
- Add PostgreSQL Node: Drag the PostgreSQL node to your flow
- Enter Credentials: Provide your database connection details
- Configure Schema: Enter the database schema to process
- Set Table/View: Specify the table or materialized view name
## Configuration Reference

### Batch Trigger Configuration
Parameter | Description | Required | Default | Example |
---|---|---|---|---|
Credentials | PostgreSQL database connection details | ✅ | - | PostgreSQL Connection |
Schema | Database schema to be processed | ✅ | - | public |
Table/View | Source table or materialized view for data processing | ✅ | - | users |
Sync Mode | Controls how data is re-indexed: incremental or full-refresh | ✅ | incremental | incremental |
Sync Schedule | Schedule for automated syncs using cron expressions | ✅ | - | 0 0 * * * |
### Sync Configuration Options

#### Sync Modes
```yaml
# Incremental Sync (recommended) — only syncs new or modified data
sync_mode: "incremental"

# Full Refresh — re-indexes all data on every run
sync_mode: "full-refresh"
```
#### Schedule Examples
```yaml
# Daily at midnight
sync_schedule: "0 0 * * *"

# Every 6 hours
sync_schedule: "0 */6 * * *"

# Weekdays only at 9 AM
sync_schedule: "0 9 * * 1-5"

# Every 30 minutes
sync_schedule: "*/30 * * * *"
```
### Action Configuration
Configure the Postgres node by selecting the desired action and providing the necessary parameters.
Parameter | Description | Required | Default | Example |
---|---|---|---|---|
Credentials | Connection settings for your PostgreSQL database | ✅ | — | PostgreSQL Connection |
Action | Action to perform on the Postgres node (e.g., execute a query) | ✅ | — | Run Query |
Query | SQL statement to be executed | ✅ | — | SELECT * FROM users; |
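As a sketch, an action configured this way might appear in low-code form as follows (the `postgresNode` type is taken from the trigger example below; the top-level key, node ID, and the `action`/`query` value keys are assumptions, not confirmed API):

```yaml
# Hypothetical low-code action node — value keys are illustrative
actionNode:
  nodeId: postgresNode_1
  nodeType: postgresNode
  nodeName: PostgreSQL Query
  values:
    credentials: PostgreSQL Connection
    action: Run Query
    query: "SELECT * FROM users;"
```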
## Low-Code Example
```yaml
triggerNode:
  nodeId: triggerNode_1
  nodeType: postgresNode
  nodeName: PostgreSQL
  values:
    credentials: PostgreSQL Connection
    schema: public
    table: users
    sync_mode: incremental
    sync_schedule: "0 0 * * *"
```
## Output

The PostgreSQL node outputs database data in the following format:

### Example Output
```json
{
  "table_name": "users",
  "schema_name": "public",
  "data": [
    {
      "id": 1,
      "name": "John Doe",
      "email": "john.doe@example.com",
      "created_at": "2024-01-01T00:00:00Z"
    },
    {
      "id": 2,
      "name": "Jane Smith",
      "email": "jane.smith@example.com",
      "created_at": "2024-01-02T00:00:00Z"
    }
  ],
  "metadata": {
    "total_rows": 2,
    "sync_mode": "incremental",
    "last_sync": "2024-01-01T12:00:00Z"
  }
}
```
### Output Schema

- `table_name`: String name of the processed table or view
- `schema_name`: String name of the database schema
- `data`: Array of objects containing row data, with column names as keys
- `metadata`: Additional information about the sync operation, containing:
  - `total_rows`: Integer count of rows processed
  - `sync_mode`: Sync mode used (`incremental` or `full-refresh`)
  - `last_sync`: ISO timestamp of the sync operation
- `columns`: Array of column names in the table
- `row_count`: Integer total number of rows in the table
## Troubleshooting

### Common Issues
Problem | Solution |
---|---|
Connection Failed | Verify PostgreSQL credentials and network connectivity |
Table Not Found | Check schema and table names, ensure they exist in the database |
Permission Denied | Ensure your database user has appropriate read permissions |
Sync Not Scheduled | Check cron expression format and ensure sync schedule is properly configured |
Schema Access Error | Verify you have access to the specified database schema |
Large Table Performance | Use incremental sync mode and consider using materialized views for large datasets |
Network Connectivity | Check firewall settings and whitelist Cloudflare IPs if connection fails |
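For the permission and schema-access failures above, read access is typically granted along these lines (a sketch using the example `db_user` role and `public` schema; adjust names to your environment):

```sql
-- Allow the sync user to see the schema and read its existing tables
GRANT USAGE ON SCHEMA public TO db_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO db_user;

-- Ensure tables created later (by the current role) are readable too
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO db_user;
```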
### Debugging
- Check Lamatic Flow logs for detailed error messages
- Verify PostgreSQL credentials and database connectivity
- Test database connection using a PostgreSQL client
- Validate schema and table names exist and are accessible
- Confirm database user has read permissions on target objects
- Monitor sync logs for specific table processing errors
- Test with a small table before processing large datasets
- If the connection fails, whitelist the IP ranges listed at https://www.cloudflare.com/ips/
## Best Practices
- Use `incremental` sync mode for better performance with large tables
- Schedule syncs during off-peak hours to minimize database impact
- Use materialized views for complex queries that need regular updates
- Implement appropriate database indexes for better sync performance
- Regularly monitor sync logs for any issues
- Consider using read replicas for sync operations to avoid impacting production
- Test sync configurations with small datasets before full deployment
- Implement proper error handling for database connection failures
## Example Use Cases

### Data Intelligence Workflows
- User Analytics: Sync user behavior data for AI-powered insights
- Business Metrics: Process sales, inventory, and performance data
- Customer Data: Index customer profiles and interactions for personalized experiences
- Operational Data: Automate access to logs, events, and system metrics
### RAG Applications
- Semantic Search: Enable natural language search across database records
- Question Answering: Build AI assistants that can answer questions about business data
- Data Summarization: Automatically summarize large datasets and reports
- Content Discovery: Help users find relevant information across database tables