Official
Gremlin
This plugin is in preview.
This destination plugin lets you sync data from any CloudQuery source to a Gremlin compatible graph database such as AWS Neptune
Publisher: cloudquery
Repository: github.com
Latest version: v2.4.3
Type: Destination
Date Published: Mar 19, 2024
Price: Free
Overview
Gremlin Destination Plugin
This destination plugin lets you sync data from any CloudQuery source to a Gremlin compatible graph database such as AWS Neptune.
Supported (tested) database versions (the plugin uses the official Go driver):
- Gremlin Server >= 3.6.2
- AWS Neptune >= 1.2
As a side note, graph databases can be quite useful for networking use cases, visualization, red teams, blue teams, and more.
Configuration
Example
This example configures a Gremlin destination, located at `ws://localhost:8182`. The username and password are stored in environment variables.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.4.3"
  spec:
    endpoint: "ws://localhost:8182"
    # Optional parameters
    # auth_mode: none
    # username: ""
    # password: ""
    # aws_region: ""
    # aws_neptune_host: ""
    # max_retries: 5
    # max_concurrent_connections: 5 # default: number of CPUs
    # batch_size: 200
    # batch_size_bytes: 4194304 # 4 MiB
```
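
The example above leaves `username` and `password` commented out. As a minimal sketch of supplying them from the environment, the variant below switches `auth_mode` to `basic` and relies on CloudQuery's `${VAR}` environment variable substitution in configuration files. The variable names `GREMLIN_USERNAME` and `GREMLIN_PASSWORD` are hypothetical; use whatever names your environment defines.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.4.3"
  spec:
    endpoint: "ws://localhost:8182"
    # Static credentials are used when auth_mode is basic
    auth_mode: basic
    # Hypothetical variable names, expanded when the config is loaded
    username: "${GREMLIN_USERNAME}"
    password: "${GREMLIN_PASSWORD}"
```

Keeping the credentials in environment variables means the configuration file itself can be committed or shared without exposing secrets.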
The (top level) spec section is described in the Destination Spec Reference.
The Gremlin destination utilizes batching, and supports `batch_size` and `batch_size_bytes`.

Connecting to AWS Neptune
For AWS Neptune, you don't need to specify any credentials if IAM authentication is not enabled. Keep `auth_mode` at `none`.
If IAM authentication is enabled, you need to set `auth_mode` to `aws` and `aws_region` to the region of the database. The plugin will use the default AWS credentials chain to authenticate.

Plugin Spec
This is the (nested) spec used by the Gremlin destination Plugin.
- `endpoint` (`string`) (required)

  Endpoint for the database. Supported schemes are `wss://` and `ws://`, the default port is `8182`.

  - `"localhost"` (defaults to `wss://localhost:8182`)
  - `"ws://localhost:8182"`
  - `"wss://your-endpoint.cluster-id.your-region.neptune.amazonaws.com"`

- `insecure` (`boolean`) (optional)

  Whether to skip TLS verification. Defaults to `false`. This should be set on a macOS environment when connecting to an AWS Neptune endpoint.

- `auth_mode` (`string`) (optional) (default: `none`)

  Authentication mode to use. `basic` uses static credentials, `aws` uses AWS IAM authentication. Supported values are `none`, `basic` or `aws`.

- `username` (`string`) (optional)

  Username to connect to the database.

- `password` (`string`) (optional)

  Password to connect to the database.

- `aws_region` (`string`) (required when `auth_mode` is `aws`)

  AWS region to use for AWS IAM authentication. Example: `us-east-1`.

- `aws_neptune_host` (`string`) (optional, used when `auth_mode` is `aws`)

  AWS Neptune host header to use with AWS IAM authentication. Use if you're not accessing Neptune directly, when `auth_mode` is `aws`. Example: `my-neptune.cluster.us-east-1.neptune.amazonaws.com`

- `max_retries` (`integer`) (optional) (default: `5`)

  Number of retries on `ConcurrentModificationException` before giving up for each batch. Retries are exponentially backed off.

- `max_concurrent_connections` (`integer`) (optional) (default: number of CPUs)

  Maximum number of concurrent connections to the database.

- `complete_types` (`boolean`) (optional) (default: `false`)

  Whether to use all Gremlin-supported types or just a basic set. Should remain `false` for Amazon Neptune compatibility.

- `batch_size` (`integer`) (optional) (default: `200`)

  Number of records to batch together before sending to the database.

- `batch_size_bytes` (`integer`) (optional) (default: `4194304` (4 MiB))

  Number of bytes (as Arrow buffer size) to batch together before sending to the database.
Types
Gremlin Types
The Gremlin destination (`v2.0.0` and later) supports most Apache Arrow types. The following table shows the supported types and how they are mapped to Gremlin data types.

Arrow Column Type | Supported? | Gremlin Type |
---|---|---|
Binary | ✅ Yes | Bytes |
Boolean | ✅ Yes | Boolean |
Date32 | ✅ Yes | String |
Date64 | ✅ Yes | String |
Decimal | ✅ Yes | String |
Dense Union | ✅ Yes | String |
Dictionary | ✅ Yes | String |
Duration[ms] | ✅ Yes | String |
Duration[ns] | ✅ Yes | String |
Duration[s] | ✅ Yes | String |
Duration[us] | ✅ Yes | String |
Fixed Size List | ✅ Yes | String |
Float16 | ✅ Yes | String |
Float32 | ✅ Yes | Float |
Float64 | ✅ Yes | Float |
Inet | ✅ Yes | String |
Int8 | ✅ Yes | Integer |
Int16 | ✅ Yes | Integer |
Int32 | ✅ Yes | Integer |
Int64 | ✅ Yes | Integer |
Interval[DayTime] | ✅ Yes | String |
Interval[MonthDayNano] | ✅ Yes | String |
Interval[Month] | ✅ Yes | String |
JSON | ✅ Yes | String |
Large Binary | ✅ Yes | Bytes |
Large List | ✅ Yes | String |
Large String | ✅ Yes | String |
List | ✅ Yes | String or List † |
MAC | ✅ Yes | String |
Map | ✅ Yes | String |
String | ✅ Yes | String |
Struct | ✅ Yes | String |
Timestamp[ms] | ✅ Yes | String * |
Timestamp[ns] | ✅ Yes | String |
Timestamp[s] | ✅ Yes | String |
Timestamp[us] | ✅ Yes | String |
UUID | ✅ Yes | String |
Uint8 | ✅ Yes | String |
Uint16 | ✅ Yes | Integer |
Uint32 | ✅ Yes | Integer |
Uint64 | ✅ Yes | Integer |
Union | ✅ Yes | String |
String-persisted data types are encoded according to the Arrow String Representation specification.
Notes
* Timestamps are converted to strings in the format `yyyy-MM-dd HH:mm:ss.SSSSSSSSS` (UTC timezone) (e.g. `2021-01-01 00:00:00.000000000`). `_cq_sync_time` column is persisted in native `Timestamp` type.

† List types are persisted as-is only if `complete_types` option is enabled. Otherwise, they are converted to strings.

NUL bytes are stripped from strings.
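
As an illustration of the List note, the sketch below enables `complete_types` so that List columns are persisted as-is rather than as strings. It assumes a self-managed Gremlin Server at the local endpoint used in the example configuration, where Amazon Neptune compatibility is not a concern.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.4.3"
  spec:
    # Assumed local Gremlin Server, not AWS Neptune
    endpoint: "ws://localhost:8182"
    # Use all Gremlin-supported types; List columns are stored as lists
    complete_types: true
```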