
Evaluating Large Language Models – Evaluation Metrics

Figure: Current major applications of LLMs (source: https://arxiv.org/pdf/2308.05374.pdf)

Metrics for Evaluating Large Language Models

For example, the prompt below asks the evaluator to rate a summary on a single metric, coherence:

Evaluate Coherence in the Summarization Task 
You will be given one summary written for a news article.
Your task is to rate the summary on one metric. Please make sure you read and understand these instructions carefully. Please keep this document open while reviewing, and refer to it as needed.


Evaluation Criteria:
Coherence (1-5) - the collective quality of all sentences. We align this dimension with the DUC quality question of structure and coherence whereby "the summary should be well-structured and well-organized. The summary should not just be a heap of related information, but should build from sentence to sentence to a coherent body of information about a topic."


Evaluation Steps:
1. Read the news article carefully and identify the main topic and key points.
2. Read the summary and compare it to the news article. Check if the summary covers the main topic and key points of the news article, and if it presents them in a clear and logical order.
3. Assign a score for coherence on a scale of 1 to 5, where 1 is the lowest and 5 is the highest based on the Evaluation Criteria.


Example:
Source Text: {{Document}}
Summary: {{Summary}}
Evaluation Form (scores ONLY):
Coherence:
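
To make the use of this prompt concrete, here is a minimal Python sketch of how the {{Document}} and {{Summary}} placeholders might be filled in and how a 1-5 score might be read back from the evaluator's reply. Everything in it is an assumption for illustration: the names fill_prompt, parse_score and rate_coherence are hypothetical, the template is abbreviated to the last lines of the prompt above, and ask_model stands in for whatever function sends a prompt to your LLM and returns its text reply. It is not tied to any particular library or API.

```python
import re
from typing import Callable, Optional

# Abbreviated stand-in for the full prompt shown above (assumption: in
# practice the instructions, criteria and steps would precede these lines).
PROMPT_TEMPLATE = """Example:
Source Text: {{Document}}
Summary: {{Summary}}
Evaluation Form (scores ONLY):
Coherence:"""


def fill_prompt(template: str, document: str, summary: str) -> str:
    """Substitute both placeholders in a single pass over the template."""
    values = {"Document": document, "Summary": summary}
    return re.sub(
        r"\{\{(Document|Summary)\}\}",
        lambda m: values[m.group(1)],  # inserted literally, no escape handling
        template,
    )


def parse_score(reply: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 5 in the reply, else None."""
    match = re.search(r"\b([1-5])\b", reply)
    return int(match.group(1)) if match else None


def rate_coherence(
    document: str,
    summary: str,
    ask_model: Callable[[str], str],  # hypothetical: prompt in, reply text out
) -> Optional[int]:
    """Fill the prompt, query the model, and parse the coherence score."""
    prompt = fill_prompt(PROMPT_TEMPLATE, document, summary)
    return parse_score(ask_model(prompt))
```

Passing the model as a plain callable keeps the sketch independent of any vendor SDK. Returning None when no valid score is found lets the caller decide whether to retry or discard that sample, rather than silently recording a wrong number.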


Author

Preeth P

Machine Learning Engineer
