SharePoint Business Node

Loading node sections...

Overview

The SharePoint Business Node is a document synchronization component that automates document syncing and processing from Microsoft SharePoint sites. This node supports various file types including PDFs, Word documents, Excel files, and other compatible formats, providing secure integration with Lamatic Flow for automated document intelligence and RAG workflows.

Node Type Information

Type	Description	Status
Batch Trigger	Starts the flow on a schedule or batch event. Ideal for periodic data processing.	✅ True
Event Trigger	Starts the flow based on external events (e.g., webhook, user interaction).	❌ False
Action	Executes a task or logic as part of the flow (e.g., API call, transformation).	❌ False

This node is a Batch Trigger node that automates document syncing and processing from SharePoint Business sites on a scheduled basis.

This node is a Batch Trigger node that automates document collection from SharePoint repositories and enables RAG workflows with organizational knowledge.

This integration connects to your organization's SharePoint instance to sync documents for processing in Lamatic Flow.

Features

Key Functionalities

Document Syncing: Automatically syncs documents from SharePoint sites and folders with configurable scheduling.
File Type Support: Handles PDFs, Word documents, Excel files, and other compatible formats for comprehensive document processing.
Scheduled Processing: Supports automated sync schedules with cron expressions for regular document updates.
Selective Filtering: Use glob patterns to filter specific file types and paths for targeted document processing.
Multiple Sync Modes: Supports both incremental (new/modified files only) and full-refresh (all files) synchronization modes.
Search Scope Control: Configurable search scope including accessible drives, shared items, or all files.

Benefits

Automated Document Collection: Automates document collection from SharePoint repositories, reducing manual effort and ensuring consistency.
RAG Workflow Enablement: Enables RAG workflows with organizational knowledge for intelligent document processing.
Granular Control: Provides granular control over file selection and processing through glob patterns and search scopes.
Secure Integration: Offers secure integration with Microsoft 365 accounts and SharePoint Business environments.
Flexible Processing: Supports various parsing strategies for different document types and quality requirements.

Prerequisites

Before using SharePoint Business Node, ensure the following:

Microsoft 365 Account: A Microsoft 365 account with SharePoint access.
SharePoint Permissions: Appropriate permissions to access SharePoint sites and folders.
Tenant ID: Your organization's Tenant ID from Microsoft Entra Admin Center.
Site Structure Knowledge: Understanding of SharePoint site structure and file organization.
API Permissions: Proper SharePoint API permissions for accessing selected sites and folders.

Installation

Step 1: Set Up Microsoft 365 Credentials

Please refer to the SharePoint Integration documentation to complete the setup and obtain the necessary credentials.

⚠️

Ensure you have appropriate SharePoint access permissions for the sites you want to sync.

Step 2: Set Up Lamatic Flow

Add SharePoint Node: Drag the SharePoint node to your flow
Enter Credentials: Provide your Microsoft 365 Tenant ID
Configure Site URL: Enter the SharePoint site URL you want to sync from
Set Folder Path: Specify the folder path within the site (use "." for all folders)

Configuration Reference

Batch Trigger Configuration

Parameter	Description	Required	Default	Example
Site URL	URL of the SharePoint site to search for files	✅	-	`https://lamatic.sharepoint.com/sites/test`
Folder Path	Path to a specific folder within the drive. Use `"."` to search all folders	✅	-	`./TeamDocs/HR`
Globs (Path Patterns)	Pattern to filter specific files based on extension or name	❌	`**`	`*/.pdf`, `*/.docx`
Sync Mode	Sync behavior: `full_refresh` or `incremental`	✅	`incremental`	`incremental`
Sync Schedule	Cron expression for scheduled syncs	❌	-	`0 0 * * *` (daily at midnight)
Search Scope	Location to search: `ACCESSIBLE_DRIVES`, `SHARED_ITEMS`, or `ALL`	✅	`ALL`	`ALL`
Parsing Strategy	Document parsing method: `fast`, `ocr_only`, or `hi_res`	✅	`fast`	`hi_res`
Days To Sync If History Is Full	Limit sync to files modified in the last N days if sync state is full	❌	`30`	`30`
Start Date	Ignore files modified before this UTC datetime (ISO format)	❌	-	`2017-01-25T00:00:00.000000Z`

Sync Configuration Options

Sync Modes

# Incremental Sync (recommended)
sync_mode: "incremental"  # Only sync new/modified files
 
# Full Refresh
sync_mode: "full_refresh"  # Re-index all files

Schedule Examples

# Daily at midnight
sync_schedule: "0 0 * * *"
 
# Every 6 hours
sync_schedule: "0 */6 * * *"
 
# Weekdays only at 9 AM
sync_schedule: "0 9 * * 1-5"

File Filtering Patterns

Common Glob Patterns

# All PDF files
globs: "**/*.pdf"
 
# All Word and Excel files
globs: "**/*.docx", "**/*.xlsx"
 
# Files in specific folders
globs: "HR/**/*", "Legal/**/*"
 
# Exclude draft folders
globs: "**/*", "!**/draft/**"

Search Scope Options

ACCESSIBLE_DRIVES: Only files in drives you have direct access to
SHARED_ITEMS: Files shared with you by others
ALL: All accessible files (recommended)

Low-Code Example

triggerNode:
  nodeId: triggerNode_1
  nodeType: sharepointBusinessNode
  nodeName: SharePoint Business
  values:
    credentials: Microsoft 365
    site_url: https://company.sharepoint.com/sites/documents
    folder_path: .
    globs: ["**/*.pdf", "**/*.docx"]
    sync_mode: incremental
    sync_schedule: "0 2 * * *"
    search_scope: ALL
    parsing_strategy: fast
    days_to_sync_if_history_full: 30
    start_date: "2024-01-01T00:00:00.000000Z"

Event Trigger Output

The SharePoint Business node outputs document data in the following format:

Example Output

{
    "document_key": "document_name.pdf",
    "content": "Extracted text content from the SharePoint document",
    "metadata": {
        "file_type": "pdf",
        "file_size": 1024000,
        "last_modified": "2024-01-01T00:00:00.000Z",
        "site_url": "https://company.sharepoint.com/sites/documents",
        "folder_path": "./TeamDocs",
        "parsing_strategy": "fast"
    }
}

Output Schema

document_key: String identifier for the document (filename)
content: String containing the extracted text content from the document
metadata: Additional information about the document
- file_type: Type of the processed file
- file_size: Size of the file in bytes
- last_modified: Timestamp of last modification
- site_url: URL of the SharePoint site
- folder_path: Path within the site
- parsing_strategy: Strategy used for content extraction

Output Schema

Batch Trigger Output

document_key: String identifier for the document (filename)
content: String containing the extracted text content
metadata: Additional document information
- file_type: String indicating file type (pdf, docx, xlsx, etc.)
- file_size: Integer size in bytes
- last_modified: ISO timestamp of last modification
- site_url: String URL of the SharePoint site
- folder_path: String path within the site
- parsing_strategy: String indicating parsing method used
- sync_mode: String indicating sync mode used
- search_scope: String indicating search scope used

Usage Examples

Basic SharePoint Sync

# Basic configuration for syncing all documents
site_url: "https://company.sharepoint.com/sites/documents"
folder_path: "."
globs: "**/*.pdf", "**/*.docx"
sync_mode: "incremental"
search_scope: "ALL"
parsing_strategy: "fast"

Advanced Configuration

# Advanced setup with scheduling and filtering
site_url: "https://company.sharepoint.com/sites/knowledge"
folder_path: "./TeamDocs"
globs: "**/*.pdf", "**/*.docx", "!**/draft/**"
sync_mode: "incremental"
sync_schedule: "0 2 * * *"  # Daily at 2 AM
search_scope: "ALL"
parsing_strategy: "hi_res"
days_to_sync_if_history_full: 30
start_date: "2024-01-01T00:00:00.000000Z"

Selective Document Sync

# Sync only specific document types from HR folder
site_url: "https://company.sharepoint.com/sites/hr"
folder_path: "./Policies"
globs: "**/*.pdf", "**/*.docx"
sync_mode: "incremental"
search_scope: "ACCESSIBLE_DRIVES"
parsing_strategy: "ocr_only"  # Better for scanned documents

Troubleshooting

Common Issues

Problem	Solution
Authentication Failed	Verify Tenant ID is correct and you have SharePoint access permissions
Site Not Found	Check the Site URL format and ensure you have access to the specified site
Files Not Syncing	Verify folder path exists and glob patterns are correctly formatted
Permission Denied	Ensure your Microsoft 365 account has appropriate SharePoint permissions
Sync Not Scheduled	Check cron expression format and ensure sync schedule is properly configured
Parsing Errors	Verify parsing strategy is appropriate for the document types being processed
Large File Issues	Check file size limits and consider using incremental sync for large repositories

Debugging

Check Lamatic Flow logs for detailed error messages
Verify Microsoft 365 credentials and Tenant ID
Test SharePoint site access in your browser
Validate folder path exists and is accessible
Test glob patterns to ensure they match your documents
Monitor sync logs for specific file processing errors
Verify network connectivity and API rate limits
Test with a small folder before processing large repositories

Best Practices

Use incremental sync mode for better performance
Implement specific glob patterns to avoid syncing unnecessary files
Schedule syncs during off-peak hours to minimize impact
Use hi_res parsing for scanned documents and images
Regularly monitor sync logs for any issues
Set appropriate days_to_sync_if_history_full to limit historical data
Test parsing strategies with sample documents before full deployment

Example Use Cases

Document Intelligence Workflows

Company Policies: Sync HR policies and procedures for automated Q&A
Knowledge Bases: Index departmental wikis and team documentation
Compliance Documents: Process legal and audit-related content
Project Documentation: Automate access to project files and reports

RAG Applications

Semantic Search: Enable natural language search across organizational documents
Question Answering: Build AI assistants that can answer questions about company policies
Document Summarization: Automatically summarize lengthy reports and documents
Content Discovery: Help users find relevant information across SharePoint sites

SharePoint Business Node

Overview

Node Type Information

Features

Prerequisites

Installation

Step 1: Set Up Microsoft 365 Credentials

Step 2: Set Up Lamatic Flow

Configuration Reference

Batch Trigger Configuration

Sync Configuration Options

Sync Modes

Schedule Examples

File Filtering Patterns

Common Glob Patterns

Search Scope Options

Low-Code Example

Event Trigger Output

Example Output

Output Schema

Output Schema

Batch Trigger Output

Usage Examples

Basic SharePoint Sync

Advanced Configuration

Selective Document Sync

Troubleshooting

Common Issues

Debugging

Best Practices

Example Use Cases

Document Intelligence Workflows

RAG Applications

Additional Resources

Was this page useful?

Questions? We're here to help

Subscribe to updates