PostgreSQL Integration
The PostgreSQL integration in Lamatic automates data synchronization from PostgreSQL databases. It enables scheduled fetching of rows from specified tables and supports vectorization and indexing for Retrieval-Augmented Generation (RAG) workflows.
This integration connects to your PostgreSQL database to sync data for processing in Lamatic Flow.
Features
✅ Key Functionalities
- Data Synchronization: Automatically syncs data from PostgreSQL tables and materialized views
- Scheduled Processing: Supports automated sync schedules with cron expressions
- RAG Integration: Enables vectorization and indexing for AI-powered data retrieval
- Flexible Sync Modes: Supports both incremental and full-refresh synchronization
✅ Benefits
- Automates database synchronization processes
- Enables RAG workflows with PostgreSQL data
- Provides granular control over table selection and processing
- Supports both incremental and full-refresh synchronization modes
Available Functionality
Event Triggers
✅ Scheduled data syncing from PostgreSQL tables
✅ Support for tables and materialized views
✅ Schema-specific monitoring and filtering
✅ Incremental and full-refresh sync modes
Actions
✅ Parse and extract data from PostgreSQL rows
✅ Vectorize content for RAG workflows
✅ Filter tables using schema and table names
✅ Schedule automated sync operations
Prerequisites
Before setting up the PostgreSQL integration, ensure you have:
- PostgreSQL database credentials (host, port, database, username, password)
- Access to target database, schemas, and tables
- Understanding of cron expressions for scheduling
- Appropriate database permissions for read operations
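If you need a dedicated read-only user for the sync, the sketch below shows one way to grant the minimum privileges, using Python with the psycopg2 driver. The role name lamatic_reader, the schema public, and all connection details are placeholders, not names Lamatic requires:
import psycopg2
# Run as a superuser or the schema owner; every name below is a placeholder.
conn = psycopg2.connect(host="db.example.com", port=5432,
                        dbname="app_db", user="postgres", password="secret")
conn.autocommit = True
cur = conn.cursor()
cur.execute("CREATE ROLE lamatic_reader LOGIN PASSWORD 'choose-a-strong-password'")
cur.execute("GRANT USAGE ON SCHEMA public TO lamatic_reader")
cur.execute("GRANT SELECT ON ALL TABLES IN SCHEMA public TO lamatic_reader")
conn.close()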
Setup
Step 1: Set Up PostgreSQL Credentials
- Database Access: Ensure you have access to the PostgreSQL database
- User Permissions: Verify your database user has appropriate read permissions
- Connection Details: Gather your database connection information:
- Host address
- Port number (default: 5432)
- Database name
- Username and password
Ensure your database user has appropriate permissions to access the selected schemas and tables.
If the connection fails, whitelist the Cloudflare IP ranges listed at https://www.cloudflare.com/ips/ in your database firewall.
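Before moving on, you can confirm the credentials work from any machine that can reach the database. A minimal connectivity check with psycopg2, using the same placeholder values as above:
import psycopg2
# Placeholder connection details; substitute your own.
conn = psycopg2.connect(host="db.example.com", port=5432,
                        dbname="app_db", user="lamatic_reader", password="secret")
cur = conn.cursor()
cur.execute("SELECT version()")
print(cur.fetchone()[0])  # prints the server version if the connection works
conn.close()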
Step 2: Configure PostgreSQL Node
- Add PostgreSQL Node: Drag the PostgreSQL node to your flow
- Enter Credentials: Provide your database connection details
- Configure Schema: Enter the database schema to process
- Set Table/View: Specify the table or materialized view name
Step 3: Test and Deploy
- Test Connection: Verify the node can connect to your PostgreSQL database
- Configure Sync Settings: Set up sync mode, schedule, and table filters
- Deploy Flow: Activate the flow to start syncing data
Configuration Reference
PostgreSQL Node Parameters
Parameter | Description | Required | Default | Example |
---|---|---|---|---|
Credentials | PostgreSQL database connection details | ✅ | - | PostgreSQL Connection |
Schema | Database schema to be processed | ✅ | - | public |
Table/View | Source table or materialized view for data processing | ✅ | - | users |
Sync Mode | Controls how data is re-indexed: incremental or full-refresh | ✅ | incremental | incremental |
Sync Schedule | Schedule for automated syncs using cron expressions | ✅ | - | 0 0 * * * |
Sync Configuration Options
Sync Modes
# Incremental Sync (recommended)
sync_mode: "incremental" # Only sync new/modified data
# Full Refresh
sync_mode: "full-refresh" # Re-index all data
Schedule Examples
# Daily at midnight
sync_schedule: "0 0 * * *"
# Every 6 hours
sync_schedule: "0 */6 * * *"
# Weekdays only at 9 AM
sync_schedule: "0 9 * * 1-5"
# Every 30 minutes
sync_schedule: "*/30 * * * *"
Database Configuration
Schema and Table Selection
# Single schema and table
schema: "public"
table: "users"
# Multiple schemas (comma-separated)
schemas: "public,analytics,reports"
# Multiple tables (comma-separated)
tables: "users,orders,products"
Usage Examples
Basic PostgreSQL Sync
# Basic configuration for syncing user data
credentials: "PostgreSQL Connection"
schema: "public"
table: "users"
sync_mode: "incremental"
sync_schedule: "0 0 * * *" # Daily at midnight
Advanced Configuration
# Advanced setup with multiple schemas and scheduling
credentials: "PostgreSQL Connection"
schema: "analytics"
table: "user_metrics"
sync_mode: "incremental"
sync_schedule: "0 2 * * *" # Daily at 2 AM
Materialized View Sync
# Sync data from a materialized view
credentials: "PostgreSQL Connection"
schema: "reports"
table: "daily_summary_mv" # Materialized view
sync_mode: "full-refresh" # Full refresh for materialized views
sync_schedule: "0 6 * * *" # Daily at 6 AM
Troubleshooting
Common Issues
Problem | Solution |
---|---|
Connection Failed | Verify PostgreSQL credentials and network connectivity |
Table Not Found | Check schema and table names, ensure they exist in the database |
Permission Denied | Ensure your database user has appropriate read permissions |
Sync Not Scheduled | Check cron expression format and ensure sync schedule is properly configured |
Schema Access Error | Verify you have access to the specified database schema |
Debugging Steps
- Verify Credentials: Test your PostgreSQL connection details
- Check Database Access: Ensure you can connect to the database using a client
- Validate Schema/Table: Confirm the schema and table exist and are accessible
- Test Permissions: Verify your user has read permissions on the target objects (see the sketch after this list)
- Check Sync Logs: Review Lamatic Flow logs for detailed error information
- If the connection fails, whitelist the Cloudflare IP ranges listed at https://www.cloudflare.com/ips/ in your database firewall
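For the Test Permissions step, PostgreSQL's built-in privilege functions make the check explicit. A sketch with psycopg2, using the same placeholder schema and table as the earlier examples:
import psycopg2

conn = psycopg2.connect(host="db.example.com", dbname="app_db",
                        user="lamatic_reader", password="secret")
cur = conn.cursor()
cur.execute("SELECT has_schema_privilege(current_user, 'public', 'USAGE')")
print("Schema access:", cur.fetchone()[0])
cur.execute("SELECT has_table_privilege(current_user, 'public.users', 'SELECT')")
print("Table read access:", cur.fetchone()[0])
conn.close()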
Best Practices
- Use incremental sync mode for better performance with large tables
- Schedule syncs during off-peak hours to minimize database impact
- Use materialized views for complex queries that need regular updates
- Implement appropriate database indexes for better sync performance (see the sketch below)
- Regularly monitor sync logs for any issues
- Consider using read replicas for sync operations to avoid impacting production
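For the indexing recommendation above, the useful index is on whichever column drives your incremental syncs. Purely as an illustration, assuming a cursor column named updated_at (hypothetical; match it to your own schema):
import psycopg2

conn = psycopg2.connect(host="db.example.com", dbname="app_db",
                        user="admin_user", password="secret")
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run inside a transaction block
cur = conn.cursor()
# Speeds up incremental syncs that filter on the cursor column;
# CONCURRENTLY avoids blocking writes while the index builds.
cur.execute("""CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_updated_at
               ON public.users (updated_at)""")
conn.close()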
Example Use Cases
Data Intelligence Workflows
- User Analytics: Sync user behavior data for AI-powered insights
- Business Metrics: Process sales, inventory, and performance data
- Customer Data: Index customer profiles and interactions for personalized experiences
- Operational Data: Automate access to logs, events, and system metrics
RAG Applications
- Semantic Search: Enable natural language search across database records
- Question Answering: Build AI assistants that can answer questions about business data
- Data Summarization: Automatically summarize large datasets and reports
- Content Discovery: Help users find relevant information across database tables