How AI Summarizes Witness Interviews & Why It Misses Key Points: Bringing Transparency to AI

When a workplace investigator types "summarize this interview for me" into ChatGPT, they are interacting with a Large Language Model (LLM). In this article I give a simple explanation of what happens behind the scenes.

First, let's understand the basic structure of an LLM like ChatGPT. These models are built using a type of artificial neural network called a transformer. Transformers are designed to handle sequences of data, such as sentences and paragraphs, by paying attention to the context of each word within a given sequence. The model has been trained on a massive amount of text data from diverse sources, allowing it to learn the patterns and structures of human language.

When the investigator inputs their request, the text is converted into numerical form using a process called tokenization. Each word or sub-word in the input is mapped to a numerical identifier (a token ID) that the model can process. These tokens are then fed into the transformer network, which consists of many stacked layers, each combining an attention step with a small neural network. Each layer performs a series of calculations, transforming the input data as it moves through the network.
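To make tokenization concrete, here is a minimal sketch in Python. It is a toy, not how ChatGPT actually does it: real tokenizers split text into sub-word pieces using a vocabulary learned from data, whereas the hypothetical `build_vocab` and `tokenize` helpers below simply give each whole word a number.

```python
# Toy word-level tokenizer. Real LLM tokenizers work on sub-word pieces
# with a learned vocabulary; these helper names are illustrative only.

def build_vocab(text):
    """Give each distinct lowercase word an integer ID, in order of first appearance."""
    vocab = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def tokenize(text, vocab):
    """Convert text to a list of token IDs; words not in the vocabulary map to -1."""
    return [vocab.get(word, -1) for word in text.lower().split()]


request = "summarize this interview for me"
vocab = build_vocab(request)
token_ids = tokenize(request, vocab)
print(token_ids)  # [0, 1, 2, 3, 4]
```

The model never sees the words themselves, only the list of numbers.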

One of the key components of the transformer is the attention mechanism. This mechanism helps the model determine which words in the input are most relevant to the task at hand. For instance, when summarizing an interview, the model needs to focus on the main points and key details rather than every single word. The attention mechanism assigns different weights to different tokens based on their importance, allowing the model to prioritize critical information.
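The weighting idea can be sketched in a few lines of Python. This is a simplified illustration of scaled dot-product attention for a single query; the two-number vectors and the token list are made up for the example, and a real model learns much larger vectors during training.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score each key against the query (scaled dot product), then normalize."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Hypothetical vectors, chosen by hand purely for illustration.
tokens = ["the", "manager", "shouted"]
keys = [[0.1, 0.0], [1.0, 0.5], [0.9, 1.2]]
query = [1.0, 1.0]

weights = attention_weights(query, keys)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.2f}")
```

With these numbers, "shouted" receives the largest weight and "the" the smallest. Note that the weights come from arithmetic on learned vectors, not from any judgment about what matters to an investigation.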

As the tokens pass through the layers of the transformer, the model generates a contextualized representation of the input text. This means that the meaning of each word is understood in the context of the surrounding words. This contextual understanding is crucial for tasks like summarization, where the model needs to grasp the overall meaning and structure of the input text.

Once the model has processed the input through its layers, it generates an output sequence that represents the summary of the interview, producing it one token at a time. This output sequence is initially in numerical form (tokens) and is then converted back into human-readable text through a process called detokenization. The final result is usually a coherent and concise summary of the interview, generated based on the patterns and knowledge the model has learned during its training.

It's important to note that while LLMs like ChatGPT are powerful, they have limitations. They don't truly understand the content in the way humans do; instead, they generate responses based on patterns in the data they have seen. Additionally, the quality of the output can vary depending on the complexity of the input and the specific request made by the user.

So why does AI sometimes miss the mark and not produce exactly what we want?

When a lawyer or investigator uses ChatGPT to summarize an interview, they may occasionally find that certain key points or nuances are missing from the summary. This happens for several reasons rooted in how LLMs like ChatGPT operate.

Firstly, LLMs generate responses based on patterns in the data they have been trained on. While they are highly adept at mimicking language, they do not possess true comprehension or awareness. When summarizing text, the model prioritizes information based on the statistical patterns it learned in training rather than on contextual importance from a human perspective. As a result, some details that a lawyer or investigator might consider critical might not be highlighted if similar details were rarely treated as important in the training data.

Secondly, the model's training data plays a crucial role. If certain types of information or specific contexts were underrepresented in the data, the model might not recognize their importance in the same way a human would. For example, legal nuances or specific investigative details that are critical in professional settings might not be as emphasized in the model’s training data, leading to less focus on these points in the generated summary.

Thirdly, the prompt given to the model affects the output. When an investigator asks for a summary, the model interprets the request in a general sense. The model aims to provide a broad overview rather than a detailed account unless explicitly instructed to focus on specific aspects. If the request isn't precise, the summary might miss elements that the investigator considers important. Specific instructions or detailed prompts can help guide the model to include more relevant points.
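As a rough illustration of what a more precise request looks like, the Python sketch below assembles a prompt from a transcript and a list of points the investigator cares about. The `build_summary_prompt` helper and the wording of the prompt are my own assumptions, not an official template, and the resulting text would still need to be pasted or sent to the model separately.

```python
def build_summary_prompt(transcript, focus_points):
    """Assemble a summary request that names the points the summary must cover."""
    lines = ["Summarize the witness interview below."]
    if focus_points:
        lines.append("Make sure the summary covers each of these points:")
        lines.extend(f"- {point}" for point in focus_points)
    lines.append("Interview transcript:")
    lines.append(transcript)
    return "\n".join(lines)


# A generic request versus a specific one (the transcript text is a placeholder).
generic_prompt = build_summary_prompt("(transcript text)", [])
specific_prompt = build_summary_prompt(
    "(transcript text)",
    ["dates and times the witness mentions", "who else was present"],
)
print(specific_prompt)
```

The generic version leaves the model to decide what matters; the specific version tells it.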

Moreover, LLMs do not have the ability to fully understand the context or the significance of certain details in the way a human specialist does. A lawyer or investigator has specialized knowledge and experience that allows them to discern which details are most pertinent to a case or investigation. The model lacks this domain-specific intuition and cannot prioritize information based on its potential legal or investigative impact.

Lastly, inherent limitations in language models contribute to this issue. LLMs like ChatGPT generate text sequentially and predict the next word based on previous context, but they can struggle with long-term coherence and relevance. This can result in summaries that miss out on maintaining the thematic or factual continuity that a human would naturally preserve.
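Predicting the next word from previous context can be shown with a deliberately tiny Python sketch. It counts which word follows which in a short made-up text and always picks the most frequent follower. Real LLMs are vastly more capable and look at far more than one previous word, so treat this only as an illustration of the principle.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, max_words):
    """Build a sentence by repeatedly appending the most frequent next word."""
    output = [start]
    for _ in range(max_words - 1):
        followers = counts.get(output[-1])
        if not followers:
            break  # no known continuation for this word
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)


# Made-up training text: "left early" appears twice, "denied it" only once.
corpus = "the manager left early the manager left early the manager denied it"
counts = train_bigrams(corpus)
result = generate(counts, "the", 4)
print(result)  # the manager left early
```

The less frequent continuation, "denied it", never makes it into the output, even though it may be the detail an investigator cares about most.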

One solution is to use tools that are specifically trained on workplace investigation and law. At Kolabrya | Investigate Differently this is exactly what we have done. We're not a generic AI; we're purpose-built to serve the workplace investigation sector. Book a demo to learn how Kolabrya can improve your investigation process.
