GPT settings

Investigator uses GPT-4o to provide additional context and details for detections. Because an AI model generates this content, it might contain errors or omissions. We recommend using your best judgment with AI-generated content. (GPT-generated content is identified in the interface by an icon.)

By default, GPT integration is enabled. Admin users can turn off GPT content in System Settings.

Note

To configure this integration, you need to have admin access. (Analyst users can view the integration but cannot make changes.)

To disable or enable GPT integration

  1. From System Settings in the left navigation, choose Integrations.

  2. In the Integrations tab, click the GPT card.

  3. Toggle the GPT integration value to Enabled or Disabled and click Save.

Large Language Model FAQs

Does Investigator use an AI Large Language Model (LLM)?

Yes. Investigator uses GPT through the API.

With Investigator, you get LLM-driven AI insights when reviewing Detections. The AI can be configured to use only generic data, or to analyze your network data for deeper insights.

Note also that the on-prem Corelight Sensor has local rule-based detection and will soon add both ML and anomaly detection. This means you can get advanced detection engines even with a sensor-only deployment streaming data directly to your SIEM.

What type of AI model is used?

Investigator currently uses Azure OpenAI and does not yet support proprietary models from customers.

In what instance is the model hosted?

The model is hosted on OpenAI infrastructure. In the coming year, we expect to add the capability to integrate with your own, internally hosted model.

Where does the model reside?

We use the OpenAI data center closest to your region.

Is this an on-prem solution?

Corelight provides a sensor-only deployment that streams our network visibility and detection logs to any SIEM or data lake. The sensor can be in a data center, office, or in your private or public cloud.

Corelight also provides Investigator as a cloud-based solution for log storage and analyst triage and investigation. This platform is often cheaper than SIEMs for data storage, adds more detection capabilities, and offers a guided triage experience with AI and an easy-to-use interface.

What does AI technology add to Investigator?

With non-private data sharing enabled, the LLM will:

  • describe the rule logic for alerts in more detail

  • explain why a Detection might be important

  • guide analysts to typical next steps in such a scenario

With private data sharing enabled, the LLM will:

  • analyze Suricata payloads to help your analysts spot concerning values

  • analyze the network traffic logs surrounding a detection to summarize what the hosts involved were actually doing at the time of the detection

What data is used to train the AI?

Corelight benefits from our open source foundational technologies (Zeek and Suricata), and enough data about them is available on the public internet that common foundational models like OpenAI ChatGPT work extremely well without customization. Therefore, we don't need to train on customer data. We use our own internal data to test the model extensively, and only proceed with an AI feature when we see a very high reliability rate in the quality of the responses.

Additionally, data from customers who opt in to sharing private data for better insights is not fed into any training models.

Does Corelight use customer data?

No. If enabled, customer data will only be shared with the model when an analyst is investigating a threat. Corelight will not use customer data to train language models, nor will our partners.

How does the AI technology achieve these results?

A research team at Corelight tests many different LLM scenarios, prompts, and data sets to uncover valuable use cases for LLMs. The model we use is a standard model with no special retrieval augmented generation (RAG) or other augmentation at this time.