Announcing the Box Source Integration
Our new Box source integration lets you sync files, folders, and all their permission metadata to any CloudQuery destination. Whether you're running compliance audits, optimizing storage costs, or building AI applications that need enterprise content, you can now access your Box data where your analytics actually happen.
With the Box integration, you can:
- Audit file access and permissions across your entire organization
- Track storage usage and costs to optimize your Box spending
- Monitor collaboration patterns to understand how teams work together
- Build AI applications that respect access controls while providing context
- Generate compliance reports for security and governance requirements
- Analyze content usage to identify popular documents and collaboration hotspots
Why We Built the Box Integration
Companies store critical business data in Box, but getting that information into an analytics stack has always been challenging. You need file metadata for compliance reports, storage analysis for cost optimization, and permission audits for security reviews. Until now, this meant building custom scripts or working around limited export options.
CloudQuery's Box plugin changes that. Sync your Box data to any destination where you can actually work with it. Run SQL queries against your file structure, build dashboards showing storage trends, or feed clean datasets into your business intelligence tools.
We've heard from you that the permission data is particularly valuable. You can audit who has access to what, track collaboration patterns, and ensure compliance with data governance policies. Plus, if you're building AI applications, having both content and access controls in your data warehouse makes it simple to provide contextually appropriate information to different users.
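To make the AI use case concrete, here is a minimal Python sketch of permission-aware filtering: retrieved content is checked against a per-user allow-list before it reaches a model. The row shape `(file_id, login, status)`, the sample values, and the helpers `build_allow_list` and `filter_chunks` are all hypothetical, chosen for illustration rather than taken from the plugin.

```python
from collections import defaultdict

# Made-up rows shaped like (file_id, user_login, status), as they might be
# read from synced collaboration data. The shape is an assumption.
collaborations = [
    ("f1", "alice@example.com", "accepted"),
    ("f2", "alice@example.com", "pending"),
    ("f2", "bob@example.com", "accepted"),
]

def build_allow_list(rows):
    """Map each login to the set of file IDs it has accepted access to."""
    allowed = defaultdict(set)
    for file_id, login, status in rows:
        if status == "accepted":
            allowed[login].add(file_id)
    return allowed

def filter_chunks(chunks, allowed_ids):
    """Keep only retrieved chunks whose source file the user may read."""
    return [chunk for chunk in chunks if chunk["file_id"] in allowed_ids]

# Hypothetical retrieval results, each tagged with its source file.
chunks = [
    {"file_id": "f1", "text": "Q3 roadmap"},
    {"file_id": "f2", "text": "Salary bands"},
]

allow_list = build_allow_list(collaborations)
for login in ("alice@example.com", "bob@example.com"):
    visible = filter_chunks(chunks, allow_list[login])
    print(login, [chunk["text"] for chunk in visible])
```

In this toy data, a pending invitation does not grant access, so each user only sees content from files they have accepted.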
Box stores your business documents, but your analytics happen elsewhere. This plugin bridges that gap.
Getting Started
The setup is straightforward:
- Install CloudQuery - Download and install the CloudQuery CLI using Homebrew (brew install cloudquery/tap/cloudquery), Docker (docker pull ghcr.io/cloudquery/cloudquery), or download the latest release
- Configure Box credentials - Set up your Box API access
- Start syncing - Run your first sync to see your data
We've documented the full process at our installation quickstart page.
Practical Use Cases and SQL Examples
Let's look at what you can do with data from the Box Source Integration. The queries below address problems CloudQuery users have told us about.
Find All Files a Specific User Can Access
Need to audit what files a user has access to? This query shows all files accessible by a specific user:
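-- modified_at is selected because PostgreSQL requires ORDER BY columns to appear in a SELECT DISTINCT list
SELECT DISTINCT f.name, f.type, f.size, f.created_at, f.modified_at
FROM box_files f
JOIN box_file_collaborations fc ON f.id = fc.item_id
JOIN box_users u ON fc.accessible_by_id = u.id
WHERE u.login = 'user@example.com' -- placeholder: replace with the user's Box login
  AND fc.status = 'accepted'
ORDER BY f.modified_at DESC;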
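If you run this audit from a script for many users, it is safer to bind the login as a query parameter than to paste it into the SQL string. The sketch below shows the idea with Python's built-in sqlite3 module. The in-memory SQLite database only stands in for a real destination, all rows are made up, the query is trimmed to two columns, and `files_for_user` is a hypothetical helper.

```python
import sqlite3

# Same shape as the query above, with the login bound as a parameter.
ACCESS_QUERY = """
SELECT DISTINCT f.name, f.size
FROM box_files f
JOIN box_file_collaborations fc ON f.id = fc.item_id
JOIN box_users u ON fc.accessible_by_id = u.id
WHERE u.login = ?
  AND fc.status = 'accepted'
ORDER BY f.name
"""

def files_for_user(conn, login):
    """Return (name, size) rows for files the user has accepted access to."""
    return conn.execute(ACCESS_QUERY, (login,)).fetchall()

# In-memory SQLite stands in for the real destination; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE box_files (id TEXT, name TEXT, size INTEGER);
CREATE TABLE box_users (id TEXT, login TEXT);
CREATE TABLE box_file_collaborations (item_id TEXT, accessible_by_id TEXT, status TEXT);
INSERT INTO box_files VALUES ('f1', 'budget.xlsx', 2048), ('f2', 'roadmap.pdf', 4096);
INSERT INTO box_users VALUES ('u1', 'alice@example.com'), ('u2', 'bob@example.com');
INSERT INTO box_file_collaborations VALUES
  ('f1', 'u1', 'accepted'), ('f2', 'u1', 'accepted'), ('f2', 'u2', 'pending');
""")

for login in ("alice@example.com", "bob@example.com"):
    print(login, files_for_user(conn, login))
```

Other database drivers use different placeholder syntax (for example `%s` instead of `?`), so adjust the query to match the driver for your destination.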
Identify Large Files Taking Up Storage Space
Storage costs add up fast. This query finds your biggest files so you can decide what stays and what goes:
SELECT name, size, type, owner_login, created_at
FROM box_files
WHERE size > 104857600 -- Files larger than 100MB
ORDER BY size DESC
LIMIT 20;
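The threshold 104857600 is 100 × 1024 × 1024 bytes, that is, 100 MiB. If you prefer decimal megabytes, use 100000000 instead. A small Python sketch of the arithmetic, plus a formatter for presenting sizes in a report; both helper names are hypothetical:

```python
def mib_to_bytes(mib):
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024

def human_size(num_bytes):
    """Format a byte count with binary units for a report."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that fits, or at the largest unit we know.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024

print(mib_to_bytes(100))        # 104857600
print(human_size(104857600))    # 100.0 MiB
print(human_size(5368709120))   # 5.0 GiB
```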
Audit Folder Permissions Across Your Organization
Security audits become simple. This query shows exactly who has edit access to which folders:
SELECT
  f.name AS folder_name,
  fc.role,
  u.name AS user_name,
  u.login,
  fc.status
FROM box_folders f
JOIN box_folder_collaborations fc ON f.id = fc.item_id
JOIN box_users u ON fc.accessible_by_id = u.id
WHERE fc.role IN ('editor', 'co-owner', 'owner')
ORDER BY f.name;
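For a review, it often helps to collapse these rows into one entry per folder. The following Python sketch does that over made-up rows shaped like the query output. `editors_by_folder` is a hypothetical helper, and dropping pending invitations is a choice made here for illustration, not something the query itself does.

```python
from collections import defaultdict

# Made-up rows shaped like the query output:
# (folder_name, role, user_name, login, status).
audit_rows = [
    ("Finance", "co-owner", "Alice", "alice@example.com", "accepted"),
    ("Finance", "editor", "Bob", "bob@example.com", "accepted"),
    ("Legal", "editor", "Alice", "alice@example.com", "accepted"),
    ("Legal", "editor", "Carol", "carol@example.com", "pending"),
]

def editors_by_folder(rows):
    """Group accepted edit-capable collaborators by folder name."""
    summary = defaultdict(list)
    for folder_name, role, _user_name, login, status in rows:
        if status == "accepted":
            summary[folder_name].append((login, role))
    return dict(summary)

for folder, members in sorted(editors_by_folder(audit_rows).items()):
    print(folder, members)
```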
Track File Activity and Collaboration Patterns
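Understanding usage patterns helps with storage planning and collaboration insights. This query shows monthly trends. It aggregates files and collaborators separately, because joining the two tables first would repeat each file once per collaborator and inflate the file count and average size:
WITH monthly_files AS (
  SELECT DATE_TRUNC('month', created_at) AS month,
         COUNT(*) AS files_created, AVG(size) AS avg_file_size
  FROM box_files
  GROUP BY 1
), monthly_collaborators AS (
  SELECT DATE_TRUNC('month', f.created_at) AS month,
         COUNT(DISTINCT fc.accessible_by_id) AS unique_collaborators
  FROM box_files f
  JOIN box_file_collaborations fc ON f.id = fc.item_id
  GROUP BY 1
)
SELECT mf.month, mf.files_created,
       COALESCE(mc.unique_collaborators, 0) AS unique_collaborators,
       mf.avg_file_size
FROM monthly_files mf
LEFT JOIN monthly_collaborators mc ON mc.month = mf.month
ORDER BY mf.month DESC;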
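One thing to watch when joining files to collaborations: each file row repeats once per collaborator, so counts and averages taken over the joined rows are weighted by collaborator count. The sketch below shows the effect with Python's built-in sqlite3 module. SQLite only stands in for a real destination here, and the two files and three collaborators are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE box_files (id TEXT, size INTEGER);
CREATE TABLE box_file_collaborations (item_id TEXT, accessible_by_id TEXT);
-- Two files: f1 (100 bytes, three collaborators) and f2 (900 bytes, none).
INSERT INTO box_files VALUES ('f1', 100), ('f2', 900);
INSERT INTO box_file_collaborations VALUES ('f1', 'u1'), ('f1', 'u2'), ('f1', 'u3');
""")

# Aggregating directly over the joined rows: f1 appears three times.
joined = conn.execute("""
SELECT COUNT(*), COUNT(DISTINCT f.id), AVG(f.size)
FROM box_files f
LEFT JOIN box_file_collaborations fc ON f.id = fc.item_id
""").fetchone()

# Aggregating over box_files alone: each file is counted once.
files_only = conn.execute("SELECT COUNT(*), AVG(size) FROM box_files").fetchone()

print(joined)      # (4, 2, 300.0)
print(files_only)  # (2, 500.0)
```

The joined rows report four "files" with an average size of 300 bytes, while the table itself holds two files averaging 500 bytes. `COUNT(DISTINCT f.id)` repairs the count, but the average needs the files aggregated on their own.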
Available Tables You Can Sync
The integration covers everything you need for comprehensive Box data analysis:
File Management:
- box_files - File metadata, sizes, and properties
- box_file_contents - Extracted text content from documents
Organization Structure:
- box_folders - Folder hierarchy and metadata
- box_folder_items - Items within folders
Access Control:
- box_file_collaborations - File-level permissions and sharing
- box_folder_collaborations - Folder-level permissions and sharing
- box_group_collaborations - Team-based access control
Identity Management:
- box_users - User accounts and profiles
- box_groups - Group definitions and memberships
Content Organization:
- box_collections - Curated content sets and bookmarks
Start Syncing Your Box Data Now
Ready to get your Box data where you can actually use it? Install CloudQuery and start syncing your files, folders, and permissions today. The plugin handles authentication, rate limiting, and incremental updates automatically.
Next Steps
- Explore integrations: Browse all available source integrations and destination integrations in our hub
- Get help: Join our community forum where CloudQuery users share configurations and solve problems together
- Learn more: Check out the complete Box integration documentation for advanced configuration options
Your stack is waiting for your Box data. Let's get it there.