Lessons and Useful Tips From 3 Years of LLM Fine-Tuning and Optimization
Written by: Gulsum Budakoglu & Gokcen Tapkan
Today, Third-Party Risk Management (TPRM) is more critical than ever for organizations striving to maintain security and compliance. As external partnerships multiply, so do the complexities and risks of managing them. Large Language Models (LLMs) bring advanced natural language processing capabilities that can revolutionize tasks like information extraction, report analysis, contract evaluation, and compliance monitoring. To truly harness the power of LLMs in TPRM, it’s essential to fine-tune and adjust hyperparameters such as:
- Temperature
- Top-p
- Token length
- Max tokens
- Stop tokens
As well as deciding on the context strategy (caching) and the output format.
LLM Parameters and Configurations in Action
LLMs are governed by an array of parameters that dictate the model’s behavior and output. Appropriately tuned, they can boost productivity and accuracy in TPRM processes. Let’s see how adjusting certain parameters can improve the performance of LLMs in TPRM.
1. Temperature: Controlling Output Randomness
Temperature is a hyperparameter that controls the randomness of the model’s output. In Third-Party Risk Management (TPRM), you often need deterministic and reliable responses—such as when detecting compliance risks or analyzing contracts. Setting a lower temperature, between 0.2 and 0.5, yields conservative and predictable results, making it ideal for factual tasks like verifying if a requirement is met based on provided evidence. On the other hand, a higher temperature, such as 0.8 to 1.0, can be helpful for creative or scenario-based risk assessments, where more variability and imaginative responses are valuable.
Lesson 1: Set the temperature to align model output with your specific business requirements.
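As a concrete illustration, here is a minimal sketch of a low-temperature compliance check. It assumes an OpenAI-style chat completions client; the model name, prompt, and evidence placeholder are illustrative only.

```python
# Minimal sketch: low temperature for a deterministic compliance verdict.
# Assumes the OpenAI Python SDK (v1+); model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever model your stack provides
    temperature=0.2,      # low temperature -> conservative, repeatable answers
    messages=[
        {"role": "system", "content": "You are a TPRM compliance analyst."},
        {"role": "user", "content": (
            "Based on the evidence below, is the encryption-at-rest requirement met? "
            "Answer 'Met' or 'Not met' with a one-sentence justification.\n\n"
            "EVIDENCE: ..."  # paste the relevant excerpt here
        )},
    ],
)
print(response.choices[0].message.content)
```

For scenario-based brainstorming, the same call with a temperature closer to 0.9 would trade that determinism for variety.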
2. Top-p (Nucleus Sampling): Enhancing Result Diversity
Top-p, also known as nucleus sampling, is a hyperparameter that determines how the model selects words based on their probability distribution. By setting a Top-p value—for example, 0.9—you instruct the model to consider only the most probable words whose cumulative probability adds up to 90%. This means the model focuses on a subset of the vocabulary that is most relevant to the context, ensuring the output remains on track while introducing a healthy variety.
For instance, when analyzing the risk profiles of third parties, using top-p sampling allows the model to suggest plausible risks by filtering out less likely outcomes. This is particularly valuable in assessments involving complex vendor relationships with many factors to consider. By concentrating on the most probable words, the model provides insights that are both diverse and pertinent, enhancing the quality of risk evaluations.
Lesson 2: Use Top-p to balance relevance and diversity in model outputs.
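As a sketch (same assumed OpenAI-style client; values are illustrative starting points), a Top-p setting for risk brainstorming might look like this:

```python
# Minimal sketch: nucleus sampling for plausible-but-varied risk suggestions.
# Assumes the OpenAI Python SDK (v1+); model name and values are illustrative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder
    top_p=0.9,            # sample only from the top 90% of cumulative probability mass
    temperature=0.7,      # moderate randomness; tune top_p and temperature together
    messages=[
        {"role": "user", "content": (
            "List plausible operational and compliance risks for a cloud hosting "
            "vendor that processes customer PII."
        )},
    ],
)
print(response.choices[0].message.content)
```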
3. Token Length: Balancing Context and Efficiency
Token length is the number of tokens (roughly, word pieces) in a sequence that the model processes. Within the context of TPRM, both input and output lengths matter. For the input, you may consider augmenting the LLM with compliance evidence, certifications, test reports, and so on. While a short input may not contain enough context for meaningful risk predictions, an overly long input can overwhelm the model and yield irrelevant results. It’s all about finding the right balance.
For tasks such as complex contract reviews or due diligence checks, this ensures the input provides enough context without overloading the model. This is where adjusting the token length comes into play when building efficient prompts that keep the LLM focused on relevant information.
Lesson 3: Find the token length sweet spot to balance rich context with efficient processing.
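One practical way to find that sweet spot is to measure input length before prompting. The sketch below assumes the tiktoken tokenizer; the 3,000-token budget and the evidence file name are hypothetical.

```python
# Sketch: count and trim input tokens so the prompt stays within a chosen budget.
# Assumes the tiktoken library; the budget and file name are illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer compatible with many recent models

def trim_to_budget(text: str, budget: int = 3000) -> str:
    """Keep at most `budget` tokens of evidence so the prompt stays focused."""
    tokens = enc.encode(text)
    if len(tokens) <= budget:
        return text
    return enc.decode(tokens[:budget])

evidence = open("vendor_soc2_excerpt.txt").read()  # hypothetical evidence file
print(f"Original length: {len(enc.encode(evidence))} tokens")
prompt_context = trim_to_budget(evidence)
```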
4. Max Tokens: Managing Complexity
Max tokens is the maximum number of tokens the model will generate. In TPRM, this takes on particular significance in more complex analyses that require coherent and well-structured output. Setting a higher max-token limit allows for more in-depth analysis, for example, when the model is evaluating the compliance track record of a particular vendor. However, for quick, high-level summaries or initial risk flags, a lower limit may be advisable, since it balances speed with resource use.
Managing this setting efficiently also saves computational cost, letting the model deliver insightful, actionable output without getting bogged down in unnecessary detail.
Lesson 4: Use max tokens to control complexity—letting your model dive deep into details or keep it concise when brevity is key.
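To make this concrete, the sketch below caps output length differently for a quick risk flag versus an in-depth review. It assumes the same OpenAI-style client; the limits and prompts are illustrative.

```python
# Sketch: different max-token caps for a quick flag vs. a detailed review.
# Assumes the OpenAI Python SDK (v1+); model name and limits are illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, max_tokens: int) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",    # placeholder
        max_tokens=max_tokens,  # hard cap on the number of generated tokens
        temperature=0.3,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Quick, high-level risk flag: keep it short and cheap.
flag = ask("In one sentence, flag the main risk in this clause: ...", max_tokens=60)

# In-depth compliance review: allow room for structured detail.
review = ask("Review this vendor's compliance history in detail: ...", max_tokens=1200)
```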
5. Stop Tokens: Fine-tuning Output Length
The stop token defines where the model stops generating, and it can be adjusted depending on how long or short you want the response to be. In TPRM, setting appropriate stop tokens means the LLM gives responses that are concise and actionable, avoiding verbosity.
Setting stop tokens to end after a single sentence, for example, may be helpful when you need a quick risk verdict, while allowing full-paragraph output may be needed for in-depth contract analyses.
Lesson 5: Master stop tokens to control your model’s voice—choosing when to be succinct or when to explore topics in depth.
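Here is a hedged sketch of a one-line verdict, again assuming an OpenAI-style client: a newline stop sequence ends generation after the first line.

```python
# Sketch: a stop sequence that cuts generation off after a single verdict line.
# Assumes the OpenAI Python SDK (v1+); model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder
    temperature=0.2,
    stop=["\n"],          # stop at the first newline -> one-line answer
    messages=[
        {"role": "user", "content": (
            "Give a one-line risk verdict (High / Medium / Low) for a vendor "
            "with two unresolved critical findings in its last audit."
        )},
    ],
)
print(response.choices[0].message.content)
```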
6. Context Window: Expanding Possibilities with Larger Memory
LLMs now come with context windows ranging from 8K tokens to, as of this writing, as many as 2 million. This expanded capacity allows the models to process and “remember” larger amounts of text within a single interaction. In the realm of TPRM, this means you can feed extensive documents—like compliance evidence, certifications, and detailed test reports—directly into the model for analysis. With advanced context caching, uploading large documents for information extraction becomes feasible, enabling the LLM to consider a multitude of factors simultaneously. This is particularly beneficial when dealing with complex vendor relationships that require comprehensive due diligence.
Lesson 6: Harness expansive context windows to empower your model with a richer memory for deeper insights.
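Before relying on a large context window, it helps to check that a document actually fits. The sketch below assumes tiktoken for counting; the window size, headroom, and file name are illustrative, so check your model’s documented limit.

```python
# Sketch: check whether a due-diligence pack fits the model's context window.
# Assumes the tiktoken library; window size, headroom, and file name are illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

CONTEXT_WINDOW = 128_000   # illustrative; use your model's documented limit
RESPONSE_HEADROOM = 4_000  # reserve room for the model's answer

def fits_in_context(document: str) -> bool:
    return len(enc.encode(document)) <= CONTEXT_WINDOW - RESPONSE_HEADROOM

pack = open("vendor_due_diligence_pack.txt").read()  # hypothetical combined evidence file
if fits_in_context(pack):
    print("Send the full pack in one request.")
else:
    print("Split the pack into sections and analyze them separately.")
```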
7. Frequency Penalty: Keeping Language Fresh and Human
Frequency penalty, as the name suggests, is a parameter that penalizes the model for repeating the same words in generated text. By setting a higher frequency penalty, you reduce the likelihood of the model overusing certain words or phrases. When the generated text repeats the same words over and over, it can come across as robotic and dull, causing readers to lose interest and potentially miss important information. Applying an appropriate frequency penalty helps the model produce more varied and engaging language, making the content feel more human and less like AI-generated text.
Lesson 7: Apply frequency penalties to ensure your model speaks like a human—not a robot.
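As a sketch (same assumed OpenAI-style client; the 0.5 value is only a starting point), applying a frequency penalty to an executive summary might look like this:

```python
# Sketch: discouraging repetitive phrasing in an executive-facing summary.
# Assumes the OpenAI Python SDK (v1+); model name and values are illustrative.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",    # placeholder
    frequency_penalty=0.5,  # >0 discourages reusing the same tokens (range -2.0 to 2.0)
    temperature=0.5,
    messages=[
        {"role": "user", "content": (
            "Summarize the key risk themes from this year's vendor assessments "
            "in one short paragraph for an executive audience: ..."
        )},
    ],
)
print(response.choices[0].message.content)
```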
Practical LLM Tips in TPRM
These different parameters help tune LLMs for streamlined Third-Party Risk Management (TPRM) tasks, which include but are not limited to the following:
Vendor Risk Assessments through Evidence: This scenario focuses on extracting evidence from compliance documents such as questionnaires, surveys, compliance reports, audits, and information security policies. Given the volume of documents involved, tuning parameters like temperature and top-p allows LLMs to make comprehensive assessments of third-party vendors, considering a variety of factors that could pose risks—including compliance history, financial stability, and more.
Contract Analysis: This critical process involves a thorough examination of vendor agreements to identify terms and clauses that might pose risks or lead to non-compliance with legal and regulatory standards. LLMs can analyze vast amounts of textual data, highlighting critical clauses and flagging potential risks that human reviewers might overlook. By optimizing token length, you ensure that the model captures the necessary context within each segment of the contract, which is crucial for understanding complex clauses that span multiple sentences or paragraphs. An appropriate max-token setting lets the LLM generate comprehensive analyses without cutting off important information or producing excessively long outputs that are hard to parse (see the sketch after this list).
Compliance Monitoring: Fine-tuned LLMs enable organizations to continuously scan for regulatory changes and security threats. This ensures that third-party partnerships operate within legal guidelines and adhere to ethical standards. A lower temperature reduces randomness, ensuring that the model provides consistent and reliable summaries of regulatory changes. Implementing suitable stop tokens ensures the model’s responses are concise and end appropriately. This prevents the generation of redundant or off-topic information.
Supply Chain Threat Intelligence: LLMs can provide timely and organized information about vendor-related security incidents or other intelligence, helping organizations respond swiftly and appropriately. Intelligence feeds can be sourced from social media or other online platforms. It’s crucial to choose the right model for this task; since accuracy is paramount, keeping the temperature setting low is advisable to ensure precise and reliable outputs.
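Putting several of these settings together, here is a hedged sketch of the clause-by-clause contract review mentioned above. It assumes an OpenAI-style client and a naive paragraph splitter; the file name, chunking rule, and limits are all illustrative.

```python
# Sketch: clause-by-clause contract review with bounded, deterministic output.
# Assumes the OpenAI Python SDK (v1+); file name, splitter, and limits are illustrative.
from openai import OpenAI

client = OpenAI()

def review_clause(clause: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        temperature=0.2,      # deterministic, audit-friendly output
        max_tokens=300,       # enough for a focused finding, not an essay
        messages=[
            {"role": "system", "content": "You are a contract risk reviewer."},
            {"role": "user", "content": f"Flag risky or non-compliant terms in this clause:\n\n{clause}"},
        ],
    )
    return response.choices[0].message.content

contract_text = open("vendor_agreement.txt").read()              # hypothetical file
clauses = [c for c in contract_text.split("\n\n") if c.strip()]  # naive clause split
findings = [review_clause(c) for c in clauses]
```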
Unlocking New Possibilities in TPRM with Large Language Models
Integrating Large Language Models (LLMs) into Third-Party Risk Management (TPRM) processes offers substantial benefits—especially when the models are fine-tuned to suit specific tasks. By carefully adjusting hyperparameters like temperature, top-p, token length, max tokens, and stop tokens, organizations can leverage LLMs to enhance third-party risk assessments, contract analysis, compliance monitoring, and more.
In a world where third-party risks are continually evolving, efficiently utilizing LLMs can make all the difference in staying one step ahead. By harnessing the power of these advanced tools, organizations can proactively manage risks, ensure compliance, and maintain a competitive edge in an ever-changing landscape.
Ready to dive deeper into how AI can transform your TPRM strategy? Download our latest whitepaper, Artificial Intelligence in TPRM: The NLP Engineer’s Guide to Building a Domain-Aware AI, to discover cutting-edge insights and practical applications of LLMs in risk management.
Learn how your organization can stay ahead of third-party risks with AI-powered solutions.