GPT settings

Investigator uses GPT-4o to provide additional context and details for detections. Because an AI model generates this content, it might contain errors or omissions. We recommend using your best judgment with AI-generated content. (GPT-generated content is identified in the interface by an icon.)

By default, GPT integration is enabled. Admin users can turn off GPT content in System Settings.

Note

To configure this integration, you need to have admin access. (Analyst users can view the integration but cannot make changes.)

To disable or enable GPT integration

  1. From System Settings in the left navigation, choose Integrations.

  2. In the Integrations tab, click the GPT card.

  3. Toggle the GPT integration value to Enabled or Disabled and click Save.

Large Language Model FAQs

Does Investigator use an AI Large Language Model (LLM)?

Yes. Investigator uses GPT through the API.

With Investigator, you get LLM-driven AI insights when reviewing Detections. The AI can be configured to use only generic data, or to analyze your network data for deeper insights.

Note also that the on-prem Corelight Sensor has local rule-based detection and will soon add both ML and anomaly detection. This means you can get advanced detection engines even with a sensor-only deployment streaming data directly to your SIEM.

What type of AI model is used?

Investigator currently uses Azure OpenAI and does not yet support proprietary models from customers.

In what instance is the model hosted?

The model is hosted on OpenAI infrastructure. In the coming year, we expect to add the capability to integrate with your own, internally hosted model.

Where does the model reside?

We use the OpenAI data center closest to your region.

Is this an on-prem solution?

Corelight provides a sensor-only deployment that streams our network visibility and detection logs to any SIEM or data lake. The sensor can be in a data center, office, or in your private or public cloud.

Corelight also provides Investigator as a cloud-based solution for log storage and analyst triage and investigation. This platform is often cheaper than SIEMs for data storage, adds more detection capabilities, and offers a guided triage experience with AI and an easy-to-use interface.

What does AI technology add to Investigator?

With non-private data sharing enabled, the LLM will:

  • describe the rule logic for alerts in more detail

  • explain why a Detection might be important

  • guide analysts to typical next steps in such a scenario

With private data sharing enabled, the LLM will:

  • analyze Suricata payloads to help your analysts spot concerning values

  • analyze the network traffic logs surrounding a detection to summarize what the hosts involved were actually doing at the time of the detection

What data is used to train the AI?

Corelight benefits from our open source foundational technologies (Zeek and Suricata), and enough data about them is available on the public internet that common foundational models like OpenAI ChatGPT work extremely well without customization. Therefore, we don't need to train on customer data. We use our own internal data to test the model extensively, and only proceed with an AI feature when we see a very high reliability rate in the quality of the responses.

Additionally, data from customers who opt in to sharing private data for better insights is not fed into any training models.

Does Corelight use customer data?

No. If enabled, customer data will only be shared with the model when an analyst is investigating a threat. Corelight will not use customer data to train language models, nor will our partners.

How does the AI technology achieve these results?

A research team at Corelight tests many different LLM scenarios, prompts, and data sets to uncover valuable use cases for LLMs. The model we use is a standard model with no special retrieval augmented generation (RAG) or other augmentation at this time.