OneDrive Integration
Overview
The OneDrive integration in Lamatic connects to your Microsoft OneDrive Business account and syncs documents into Lamatic Flow for automated document intelligence and RAG workflows. It handles PDFs, Word documents, Excel files, and other compatible formats.
Features
✅ Key Functionalities
- Document Syncing: Automatically syncs documents from OneDrive drives and folders
- File Type Support: Handles PDFs, Word documents, Excel files, and other compatible formats
- Scheduled Processing: Supports automated sync schedules with cron expressions
- Selective Filtering: Use glob patterns to filter specific file types and paths
✅ Benefits
- Automates document collection from OneDrive repositories
- Enables RAG workflows with organizational knowledge
- Provides granular control over file selection and processing
- Supports both incremental and full-refresh synchronization modes
Available Functionality
Event Triggers
✅ Scheduled document syncing from OneDrive drives
✅ Support for multiple file types (PDF, DOCX, XLSX, etc.)
✅ Folder-specific monitoring and filtering
✅ Incremental and full-refresh sync modes
Actions
✅ Parse and extract text from documents
✅ Vectorize content for RAG workflows
✅ Filter files using glob patterns
✅ Schedule automated sync operations
Prerequisites
Before setting up the OneDrive integration, ensure you have:
- A Microsoft 365 account with OneDrive Business access
- Appropriate permissions to access OneDrive files and drives
- Your organization's Tenant ID from Microsoft Entra Admin Center
- Understanding of OneDrive folder structure and file organization
Setup
Step 1: Set Up Microsoft 365 Credentials
- Get Tenant ID: Navigate to the Microsoft Entra Admin Center
- Access Azure Active Directory: Go to the Azure Active Directory section (named Microsoft Entra ID in current versions of the admin center)
- Copy Tenant ID: Under Tenant Information, copy the Tenant ID (also called Directory ID)
Ensure you have appropriate OneDrive API permissions to access the selected files and drives.
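A Tenant ID is a GUID, so a quick format check before pasting it into the node can catch copy errors such as a domain name or a truncated value. The following is a minimal sketch using only the Python standard library; the helper name is_valid_tenant_id is hypothetical and not part of Lamatic, and the check confirms only the format, not that the tenant exists.

```python
import uuid


def is_valid_tenant_id(value: str) -> bool:
    """Return True if value looks like a hyphenated GUID (format check only)."""
    candidate = value.strip().lower()
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts braces and un-hyphenated hex; require the
    # canonical 8-4-4-4-12 form that the admin center displays.
    return str(parsed) == candidate


if __name__ == "__main__":
    # Placeholder GUID for illustration, not a real tenant.
    print(is_valid_tenant_id("11111111-2222-3333-4444-555555555555"))
    print(is_valid_tenant_id("contoso.onmicrosoft.com"))
```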
Step 2: Configure OneDrive Node
- Add OneDrive Node: Drag the OneDrive node to your flow
- Enter Credentials: Provide your Microsoft 365 Tenant ID
- Configure Drive Name: Enter the name of your OneDrive drive (usually "OneDrive")
- Set Folder Path: Specify the folder path within the drive (use "." for all folders)
Step 3: Test and Deploy
- Test Connection: Verify the node can access your OneDrive account
- Configure Sync Settings: Set up sync mode, schedule, and file filters
- Deploy Flow: Activate the flow to start syncing documents
Configuration Reference
OneDrive Node Parameters
Parameter | Description | Required | Default | Example |
---|---|---|---|---|
Credentials | Microsoft 365 credentials with access to OneDrive files | ✅ | - | Microsoft 365 |
Drive Name | Name of the connected OneDrive drive | ✅ | - | OneDrive |
Folder Path | Path within the drive to target. Use "." to sync all folders | ✅ | - | . |
Globs (Path Patterns) | Glob pattern for matching files | ❌ | ** | **/*.pdf, **/*.docx |
Sync Mode | Controls how files are re-indexed: full_refresh or incremental | ✅ | incremental | incremental |
Sync Schedule | Schedule for automated syncs | ❌ | - | Every 24 hours |
Search Scope | Scope of files to include: ACCESSIBLE_DRIVES, SHARED_ITEMS, or ALL | ✅ | ALL | ALL |
Parsing Strategy | Strategy for extracting content: fast, ocr_only, or hi_res | ✅ | fast | hi_res |
Days To Sync If History Is Full | Limit sync to files modified in the last N days if sync state is full | ❌ | 30 | 30 |
Start Date | Ignore files modified before this UTC datetime (ISO format) | ❌ | - | 2024-01-01T00:00:00.000000Z |
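The enumerated values in the table above can be checked before a flow is deployed. The sketch below is illustrative only: OneDriveNodeConfig and validate are hypothetical names, the snake_case field names follow the usage examples on this page, and the allowed values and defaults are copied from the table. It does not reflect how Lamatic validates the node internally.

```python
from dataclasses import dataclass

# Allowed values, as listed in the parameter table.
SYNC_MODES = {"incremental", "full_refresh"}
SEARCH_SCOPES = {"ACCESSIBLE_DRIVES", "SHARED_ITEMS", "ALL"}
PARSING_STRATEGIES = {"fast", "ocr_only", "hi_res"}


@dataclass
class OneDriveNodeConfig:
    """Hypothetical container mirroring the documented node parameters."""
    credentials: str
    drive_name: str
    folder_path: str
    globs: tuple = ("**",)
    sync_mode: str = "incremental"
    search_scope: str = "ALL"
    parsing_strategy: str = "fast"


def validate(config: OneDriveNodeConfig) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for name in ("credentials", "drive_name", "folder_path"):
        if not getattr(config, name).strip():
            problems.append(f"{name} is required")
    if config.sync_mode not in SYNC_MODES:
        problems.append(f"sync_mode must be one of {sorted(SYNC_MODES)}")
    if config.search_scope not in SEARCH_SCOPES:
        problems.append(f"search_scope must be one of {sorted(SEARCH_SCOPES)}")
    if config.parsing_strategy not in PARSING_STRATEGIES:
        problems.append(f"parsing_strategy must be one of {sorted(PARSING_STRATEGIES)}")
    return problems


if __name__ == "__main__":
    config = OneDriveNodeConfig("Microsoft 365", "OneDrive", ".", sync_mode="full")
    for problem in validate(config):
        print(problem)
```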
Sync Configuration Options
Sync Modes
# Incremental Sync (recommended)
sync_mode: "incremental" # Only sync new/modified files
# Full Refresh
sync_mode: "full_refresh" # Re-index all files
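To make the difference between the two modes concrete, the sketch below models them over a dictionary of file paths and modification times. It is an assumption-laden illustration, not Lamatic's implementation: the function name select_files_to_index and the shape of the stored sync state (path mapped to the modification time seen at the last sync) are hypothetical.

```python
from datetime import datetime


def select_files_to_index(files: dict, last_synced: dict, sync_mode: str) -> list:
    """Pick which files to (re-)index.

    files:       path -> current modification time
    last_synced: path -> modification time recorded at the previous sync (assumed state shape)
    """
    if sync_mode == "full_refresh":
        # Re-index everything, regardless of previous state.
        return sorted(files)
    if sync_mode == "incremental":
        # Only new files, or files modified since the last recorded sync.
        return sorted(
            path
            for path, modified in files.items()
            if path not in last_synced or modified > last_synced[path]
        )
    raise ValueError(f"unknown sync_mode: {sync_mode!r}")


if __name__ == "__main__":
    files = {
        "HR/policy.pdf": datetime(2024, 1, 10),
        "HR/handbook.docx": datetime(2024, 2, 1),
        "Legal/nda.pdf": datetime(2024, 2, 5),
    }
    last_synced = {
        "HR/policy.pdf": datetime(2024, 1, 10),     # unchanged
        "HR/handbook.docx": datetime(2024, 1, 15),  # modified since last sync
    }
    print(select_files_to_index(files, last_synced, "incremental"))
    print(select_files_to_index(files, last_synced, "full_refresh"))
```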
Schedule Examples
# Daily at midnight
sync_schedule: "0 0 * * *"
# Every 6 hours
sync_schedule: "0 */6 * * *"
# Weekdays only at 9 AM
sync_schedule: "0 9 * * 1-5"
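The schedule examples above use the standard five-field cron layout (minute, hour, day of month, month, day of week). A small standard-library sketch like the one below can help sanity-check an expression before saving it. It assumes the scheduler follows conventional cron semantics with 0 meaning Sunday, supports only numbers, ranges, lists, and steps, and uses hypothetical helper names (expand_field, parse_cron, fires_at); Lamatic's own scheduler may accept a different dialect.

```python
from datetime import datetime

# (min, max) bounds for minute, hour, day of month, month, day of week.
FIELD_BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, lo: int, hi: int) -> set:
    """Expand one cron field such as '*/6', '1-5', or '0,30' into matching values."""
    values = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            end = hi if step_text else start
        if not (lo <= start <= end <= hi) or step < 1:
            raise ValueError(f"invalid cron field {field!r}")
        values.update(range(start, end + 1, step))
    return values


def parse_cron(expr: str) -> list:
    """Split a five-field expression and expand each field."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    return [expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_BOUNDS)]


def fires_at(expr: str, when: datetime) -> bool:
    """Return True if the expression would trigger at the given minute."""
    fields = expr.split()
    minute, hour, dom, month, dow = parse_cron(expr)
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    if fields[2] != "*" and fields[4] != "*":
        # Conventional cron: when both day fields are restricted, either may match.
        day_ok = when.day in dom or cron_dow in dow
    else:
        day_ok = when.day in dom and cron_dow in dow
    return when.minute in minute and when.hour in hour and when.month in month and day_ok


if __name__ == "__main__":
    monday_9am = datetime(2024, 1, 1, 9, 0)  # 1 January 2024 was a Monday
    print(fires_at("0 9 * * 1-5", monday_9am))
    print(sorted(expand_field("*/6", 0, 23)))
```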
File Filtering Patterns
Common Glob Patterns
# All PDF files
globs: "**/*.pdf"
# All Word and Excel files
globs: "**/*.docx", "**/*.xlsx"
# Files in specific folders
globs: "HR/**/*", "Legal/**/*"
# Exclude draft folders
globs: "**/*", "!**/draft/**"
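Glob patterns are easy to get subtly wrong, so it can help to try them against a few sample paths before a sync runs. The sketch below is one possible interpretation, not the connector's actual matcher: it assumes ** spans any number of folders, * stays within a single path segment, and a leading ! marks an exclusion that overrides any include. The function names glob_to_regex and is_selected are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex ('**' crosses folders, '*' and '?' do not)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading folders
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")        # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")      # one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_selected(path: str, globs: list) -> bool:
    """True if path matches at least one include pattern and no '!' exclusion."""
    includes = [g for g in globs if not g.startswith("!")]
    excludes = [g[1:] for g in globs if g.startswith("!")]

    def matches(pattern: str) -> bool:
        return glob_to_regex(pattern).fullmatch(path) is not None

    return any(matches(g) for g in includes) and not any(matches(g) for g in excludes)


if __name__ == "__main__":
    patterns = ["**/*", "!**/draft/**"]
    for sample in ["HR/policies/leave.pdf", "HR/draft/leave-v2.pdf", "report.docx"]:
        print(sample, is_selected(sample, patterns))
```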
Search Scope Options
- ACCESSIBLE_DRIVES: Only files in drives you have direct access to
- SHARED_ITEMS: Files shared with you by others
- ALL: All accessible files (recommended)
Usage Examples
Basic OneDrive Sync
# Basic configuration for syncing all documents
credentials: "Microsoft 365"
drive_name: "OneDrive"
folder_path: "."
globs: "**/*.pdf", "**/*.docx"
sync_mode: "incremental"
search_scope: "ALL"
parsing_strategy: "fast"
Advanced Configuration
# Advanced setup with scheduling and filtering
credentials: "Microsoft 365"
drive_name: "OneDrive"
folder_path: "./Documents"
globs: "**/*.pdf", "**/*.docx", "!**/draft/**"
sync_mode: "incremental"
sync_schedule: "0 2 * * *" # Daily at 2 AM
search_scope: "ALL"
parsing_strategy: "hi_res"
days_to_sync_if_history_full: 30
start_date: "2024-01-01T00:00:00.000000Z"
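The start_date value above is a UTC timestamp with microseconds and a trailing Z. As a rough illustration of how such a cutoff could be applied, the sketch below parses that format and compares it with a file's modification time. The helper names are hypothetical, and treating a file modified exactly at the cutoff as included is an assumption based on the "ignore files modified before" wording in the parameter table.

```python
from datetime import datetime, timezone

# Format of the documented example: 2024-01-01T00:00:00.000000Z
START_DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_start_date(text: str) -> datetime:
    """Parse the documented start_date format into a timezone-aware UTC datetime."""
    return datetime.strptime(text, START_DATE_FORMAT).replace(tzinfo=timezone.utc)


def passes_start_date(modified_at: datetime, start_date: str) -> bool:
    """True if the file was modified at or after start_date.

    modified_at must be timezone-aware; comparing a naive datetime with an
    aware one raises TypeError.
    """
    return modified_at >= parse_start_date(start_date)


if __name__ == "__main__":
    cutoff = "2024-01-01T00:00:00.000000Z"
    old_file = datetime(2023, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
    new_file = datetime(2024, 3, 1, 8, 30, tzinfo=timezone.utc)
    print(passes_start_date(old_file, cutoff))
    print(passes_start_date(new_file, cutoff))
```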
Selective Document Sync
# Sync only specific document types from Work folder
credentials: "Microsoft 365"
drive_name: "OneDrive"
folder_path: "./Work"
globs: "**/*.pdf", "**/*.docx"
sync_mode: "incremental"
search_scope: "ACCESSIBLE_DRIVES"
parsing_strategy: "ocr_only" # Better for scanned documents
Troubleshooting
Common Issues
Problem | Solution |
---|---|
Authentication Failed | Verify Tenant ID is correct and you have OneDrive access permissions |
Drive Not Found | Check the Drive Name and ensure you have access to the specified OneDrive |
Files Not Syncing | Verify folder path exists and glob patterns are correctly formatted |
Permission Denied | Ensure your Microsoft 365 account has appropriate OneDrive API permissions |
Sync Not Scheduled | Check cron expression format and ensure sync schedule is properly configured |
Debugging Steps
- Verify Credentials: Test your Microsoft 365 credentials and Tenant ID
- Check Drive Access: Ensure you can access the OneDrive drive in your browser
- Validate Folder Path: Confirm the folder path exists and is accessible
- Test Glob Patterns: Verify file filtering patterns match your documents
- Check Sync Logs: Review Lamatic Flow logs for detailed error information
Best Practices
- Use incremental sync mode for better performance
- Implement specific glob patterns to avoid syncing unnecessary files
- Schedule syncs during off-peak hours to minimize impact
- Use hi_res parsing for scanned documents and images
- Regularly monitor sync logs for any issues
- Set appropriate days_to_sync_if_history_full to limit historical data
Example Use Cases
Document Intelligence Workflows
- Business Documents: Sync reports, contracts, and spreadsheets for automated processing
- Internal Wikis: Index knowledge bases and team documentation
- Legal Documents: Process compliance and audit-related content
- Team Collaboration: Automate access to shared folders and project files
RAG Applications
- Semantic Search: Enable natural language search across OneDrive documents
- Question Answering: Build AI assistants that can answer questions about business documents
- Document Summarization: Automatically summarize lengthy reports and documents
- Content Discovery: Help users find relevant information across OneDrive repositories