What are data provenance graphs?

Data provenance graphs are used in cybersecurity for auditing and intrusion detection. They describe the full history of a system's execution and help us determine where a piece of data originated, what state it is in now, and who acted upon it.

Explanation

In data provenance analytics, we parse system logs, or audit logs, to create data provenance graphs. These graphs link related data together, help produce repeatable results, and present the data's full story in graphical form.

To study data provenance graphs, we first need to understand their components. Each vertex of the graph represents a system entity, such as a file, socket, or process. The edges of the graph define the causal relationships between the vertices.
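The vertex and edge structure described above can be sketched as a small data structure. This is a minimal illustration, not a standard API: the class names `Vertex`, `Edge`, and `ProvenanceGraph`, the `kind` and `relation` fields, and the sample entities are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Vertex:
    kind: str  # assumed categories: "process", "file", or "socket"
    name: str


@dataclass(frozen=True)
class Edge:
    source: Vertex  # the entity that acts (the cause)
    target: Vertex  # the entity acted upon (the effect)
    relation: str   # assumed labels such as "read", "write", "exec"


@dataclass
class ProvenanceGraph:
    vertices: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, source: Vertex, target: Vertex, relation: str) -> None:
        """Record a causal relationship and register both endpoints."""
        self.vertices.update((source, target))
        self.edges.append(Edge(source, target, relation))


# Hypothetical entities, used only to show the shape of the data.
graph = ProvenanceGraph()
editor = Vertex("process", "editor")
notes = Vertex("file", "notes.txt")
graph.add_edge(editor, notes, "write")
```

Making the vertices immutable (`frozen=True`) lets them be stored in a set, so the same entity is never registered twice even when it appears in many edges.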

Provenance graphs facilitate causal analysis by providing the following features:

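  • Backward tracing: Starting from a suspicious event, this follows the causal edges back toward their sources and helps analysts identify the root cause.

  • Forward tracing: Starting from a known cause, this follows the causal edges onward, allowing analysts to find the ramifications of the attack and prevent further attacks of the same type.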
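Both kinds of tracing can be viewed as graph traversals over the causal edges. The sketch below assumes each edge is a `(cause, effect)` pair and uses a breadth-first search; the function name `trace` and the sample vertex names are hypothetical, and real provenance tools typically add filtering and time constraints that are omitted here.

```python
from collections import deque


def trace(edges, start, backward=False):
    """Return every vertex reachable from `start`.

    Forward tracing follows cause -> effect edges (ramifications).
    Backward tracing follows them in reverse (possible root causes).
    """
    neighbors = {}
    for cause, effect in edges:
        src, dst = (effect, cause) if backward else (cause, effect)
        neighbors.setdefault(src, []).append(dst)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    seen.discard(start)  # report only the other vertices
    return seen


# Hypothetical causal edges, each written as (cause, effect).
edges = [
    ("shell", "browser"),
    ("browser", "download.bin"),
    ("download.bin", "payload"),
    ("shell", "editor"),
]

causes = trace(edges, "payload", backward=True)
effects = trace(edges, "browser")
```

Here `causes` holds everything that could have led to `payload`, while `effects` holds everything that `browser` went on to influence. The unrelated `editor` vertex appears in neither set.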

Example

The following illustration is an example of the execution of a data provenance graph:

Execution of the data provenance graph

Let's explain the execution process:

  • Userspace and Kernel: Here, the browser runs in userspace and interacts with the operating system through the kernel. The kernel is the component of the operating system where all system calls are handled. System calls are the requests made by programs that need operating system access to execute.

  • Audit log: It saves the list of events that occur when a certain action is performed. An ID is first generated for the browser, and the read and write operations are then recorded as they are performed. These entries are analyzed to trace which function initiated a certain action.

  • Provenance graph: Bash is the shell where commands are executed. Bash executes Firefox, and Firefox downloads a Mal.exe file. Later, a Mal process is spawned from the Mal.exe file. We can check all the related functions and events, and use them to trace the attack.
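The chain described above (Bash, Firefox, Mal.exe, Mal) can be reconstructed from log-like records with a short sketch. The `(subject, action, object)` record format, the action labels, and the helper names `build_parent_map` and `causal_chain` are assumptions made for illustration. They do not reflect the format of any real audit log.

```python
# Hypothetical, simplified audit records: (subject, action, object).
records = [
    ("Bash", "exec", "Firefox"),
    ("Firefox", "write", "Mal.exe"),
    ("Mal.exe", "exec", "Mal"),
]


def build_parent_map(records):
    """Map each object to the (subject, action) pairs that caused it."""
    parents = {}
    for subject, action, obj in records:
        parents.setdefault(obj, []).append((subject, action))
    return parents


def causal_chain(parents, vertex):
    """Walk backward from `vertex` to the earliest known cause.

    For simplicity, only the first recorded cause is followed at each step.
    """
    chain = [vertex]
    seen = {vertex}
    while vertex in parents:
        vertex = parents[vertex][0][0]
        if vertex in seen:  # guard against cycles in the records
            break
        seen.add(vertex)
        chain.append(vertex)
    chain.reverse()
    return chain


chain = causal_chain(build_parent_map(records), "Mal")
print(" -> ".join(chain))  # prints "Bash -> Firefox -> Mal.exe -> Mal"
```

Starting from the Mal process and walking backward recovers the full path of the attack, which is the backward tracing step applied to this example.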

Note: We can also use provenance graphs to investigate security alerts fired by other monitoring products. The graphs let us analyze all the modules connected to or associated with the malware.

Copyright ©2024 Educative, Inc. All rights reserved