Google uses AI technology to translate content into your preferred language. AI translations can contain errors.

Gemma की मदद से, एजाइल सेफ़्टी क्लासिफ़ायर दिखाएं

1. खास जानकारी

इस कोडलैब में, पैरामीटर एफ़िशिएंट ट्यूनिंग (पीईटी) का इस्तेमाल करके, पसंद के मुताबिक टेक्स्ट क्लासिफ़ायर बनाने का तरीका बताया गया है. पूरे मॉडल को फ़ाइन-ट्यून करने के बजाय, पीईटी के तरीके सिर्फ़ कुछ पैरामीटर अपडेट करते हैं. इससे मॉडल को ट्रेन करना आसान और तेज़ हो जाता है. इससे मॉडल को कम ट्रेनिंग डेटा के साथ नई चीज़ें सीखने में भी आसानी होती है. इस तरीके के बारे में Towards Agile Text Classifiers for Everyone में पूरी जानकारी दी गई है. इसमें बताया गया है कि इन तकनीकों को सुरक्षा से जुड़े अलग-अलग टास्क के लिए कैसे इस्तेमाल किया जा सकता है. साथ ही, ट्रेनिंग के लिए सिर्फ़ कुछ सौ उदाहरणों का इस्तेमाल करके, सबसे अच्छी परफ़ॉर्मेंस कैसे हासिल की जा सकती है.

इस कोडलैब में, LoRA पीईटी तरीके और छोटे Gemma मॉडल (gemma_instruct_2b_en) का इस्तेमाल किया गया है. ऐसा इसलिए, क्योंकि इसे तेज़ी से और बेहतर तरीके से चलाया जा सकता है. इस Colab में, डेटा को शामिल करने, एलएलएम के लिए फ़ॉर्मैट करने, LoRA के वेट को ट्रेन करने, और फिर नतीजों का आकलन करने के चरणों के बारे में बताया गया है. इस कोडलैब में, ETHOS डेटासेट का इस्तेमाल किया जाता है. यह डेटासेट, नफ़रत फैलाने वाले भाषण का पता लगाने के लिए सार्वजनिक तौर पर उपलब्ध है. इसे YouTube और Reddit पर की गई टिप्पणियों से बनाया गया है. सिर्फ़ 200 उदाहरणों (डेटासेट का 1/4) पर ट्रेनिंग देने पर, यह F1: 0.80 और ROC-AUC: 0.78 स्कोर करता है. यह स्कोर, लीडरबोर्ड पर फ़िलहाल रिपोर्ट किए गए एसओटीए से थोड़ा ज़्यादा है. (लिखते समय, 15 फ़रवरी, 2024). जब इसे 800 उदाहरणों के पूरे डेटासेट पर ट्रेन किया जाता है, तो यह 83.74 का F1 स्कोर और 88.17 का ROC-AUC स्कोर हासिल करता है. gemma_instruct_7b_en जैसे बड़े मॉडल आम तौर पर बेहतर परफ़ॉर्म करते हैं. हालांकि, इन्हें ट्रेन करने और लागू करने की लागत भी ज़्यादा होती है.

चेतावनी: इस कोडलैब में, नफ़रत फैलाने वाले भाषण का पता लगाने के लिए सुरक्षा क्लासिफ़ायर बनाया जाता है. इसलिए, उदाहरणों और नतीजों के आकलन में कुछ आपत्तिजनक भाषा का इस्तेमाल किया गया है.

2. इंस्टॉल और सेट अप करना

इस कोडलैब के लिए, आपको keras (3), keras-nlp (0.8.0) के नए वर्शन की ज़रूरत होगी. साथ ही, बेस मॉडल डाउनलोड करने के लिए, Kaggle खाते की ज़रूरत होगी.

!pip install -q -U keras-nlp
!pip install -q -U keras

Kaggle में लॉगिन करने के लिए, kaggle.json क्रेडेंशियल फ़ाइल को ~/.kaggle/kaggle.json पर सेव करें या Colab एनवायरमेंट में यह कोड चलाएं:

import kagglehub

kagglehub.login()

इस कोडलैब को Keras बैकएंड के तौर पर Tensorflow का इस्तेमाल करके टेस्ट किया गया था. हालांकि, Tensorflow, Pytorch या JAX का इस्तेमाल किया जा सकता है:

import os

os.environ["KERAS_BACKEND"] = "tensorflow"

3. ETHOS डेटासेट लोड करें

इस सेक्शन में, आपको उस डेटासेट को लोड करना होगा जिस पर हमारे क्लासिफ़ायर को ट्रेनिंग देनी है. साथ ही, उसे ट्रेनिंग और टेस्ट सेट में प्रीप्रोसेस करना होगा. आपको लोकप्रिय रिसर्च डेटासेट ETHOS का इस्तेमाल करना होगा. इसे सोशल मीडिया पर नफ़रत फैलाने वाले भाषण का पता लगाने के लिए इकट्ठा किया गया था. डेटासेट को कैसे इकट्ठा किया गया, इस बारे में ज़्यादा जानकारी के लिए ETHOS: an Online Hate Speech Detection Dataset पेपर पढ़ें.

import pandas as pd

gh_root = 'https://raw.githubusercontent.com'
gh_repo = 'intelligence-csd-auth-gr/Ethos-Hate-Speech-Dataset'
gh_path = 'master/ethos/ethos_data/Ethos_Dataset_Binary.csv'
data_url = f'{gh_root}/{gh_repo}/{gh_path}'

df = pd.read_csv(data_url, delimiter=';')
df['hateful'] = (df['isHate'] >= df['isHate'].median()).astype(int)

# Shuffle the dataset.
df = df.sample(frac=1, random_state=32)

# Split into train and test.
df_train, df_test = df[:800],  df[800:]

# Display a sample of the data.
df.head(5)[['hateful', 'comment']]

आपको कुछ ऐसा दिखेगा:

	लेबल	टिप्पणी
0	`0`	`You said he but still not convinced this is a ...`
1	`0`	`well, looks like its time to have another child.`
2	`0`	`What if we send every men to mars to start a n...`
3	`1`	`It doesn't matter if you're black or white, ...`
4	`0`	`Who ever disliked this video should be ashamed...`

4. मॉडल डाउनलोड और इंस्टैंटिएट करना

दस्तावेज़ में बताए गए तरीके से, Gemma मॉडल का इस्तेमाल कई तरीकों से आसानी से किया जा सकता है. Keras का इस्तेमाल करके, आपको यह करना होगा:

import keras
import keras_nlp

# For reproducibility purposes.
keras.utils.set_random_seed(1234)

# Download the model from Kaggle using Keras.
model = keras_nlp.models.GemmaCausalLM.from_preset('gemma_instruct_2b_en')

# Set the sequence length to a small enough value to fit in memory in Colab.
model.preprocessor.sequence_length = 128

कुछ टेक्स्ट जनरेट करके, यह जांच की जा सकती है कि मॉडल काम कर रहा है या नहीं:

model.generate('Question: what is the capital of France? ', max_length=32)

5. टेक्स्ट प्रीप्रोसेसिंग और सेपरेटर टोकन

मॉडल को हमारे इंटेंट को बेहतर तरीके से समझने में मदद करने के लिए, टेक्स्ट को पहले से प्रोसेस किया जा सकता है. साथ ही, सेपरेटर टोकन का इस्तेमाल किया जा सकता है. इससे मॉडल के, अनुमानित फ़ॉर्मैट के हिसाब से टेक्स्ट जनरेट न करने की संभावना कम हो जाती है. उदाहरण के लिए, मॉडल से भावना के आधार पर क्लासिफ़िकेशन का अनुरोध करने के लिए, इस तरह का प्रॉम्प्ट लिखा जा सकता है:

Classify the following text into one of the following classes:[Positive,Negative]

Text: you look very nice today
Classification:

ऐसे में, मॉडल आपको वह जवाब दे सकता है जो आपको चाहिए या ऐसा भी हो सकता है कि वह जवाब न दे. उदाहरण के लिए, अगर टेक्स्ट में नई लाइन वाले वर्ण शामिल हैं, तो इससे मॉडल की परफ़ॉर्मेंस पर बुरा असर पड़ सकता है. ज़्यादा बेहतर तरीका यह है कि सेपरेटर टोकन का इस्तेमाल किया जाए. इसके बाद, प्रॉम्प्ट यह बन जाता है:

Classify the following text into one of the following classes:[Positive,Negative]
<separator>
Text: you look very nice today
<separator>
Prediction:

टेक्स्ट को पहले से प्रोसेस करने वाले फ़ंक्शन का इस्तेमाल करके, इसे ऐब्स्ट्रैक्ट किया जा सकता है:

def preprocess_text(
    text: str,
    labels: list[str],
    instructions: str,
    separator: str,
) -> str:
  prompt = f'{instructions}:[{",".join(labels)}]'
  return separator.join([prompt, f'Text:{text}', 'Prediction:'])

अब, अगर पहले की तरह ही प्रॉम्प्ट और टेक्स्ट का इस्तेमाल करके फ़ंक्शन चलाया जाता है, तो आपको वही आउटपुट मिलेगा:

text = 'you look very nice today'

prompt = preprocess_text(
    text=text,
    labels=['Positive', 'Negative'],
    instructions='Classify the following text into one of the following classes',
    separator='\n<separator>\n',
)

print(prompt)

इससे यह आउटपुट मिलना चाहिए:

Classify the following text into one of the following classes:[Positive,Negative]
<separator>
Text:well, looks like its time to have another child
<separator>
Prediction:

6. आउटपुट पोस्टप्रोसेसिंग

मॉडल के आउटपुट, अलग-अलग संभावनाओं वाले टोकन होते हैं. आम तौर पर, टेक्स्ट जनरेट करने के लिए, सबसे ज़्यादा संभावना वाले कुछ टोकन चुने जाते हैं. इसके बाद, वाक्य, पैराग्राफ़ या पूरे दस्तावेज़ बनाए जाते हैं. हालाँकि, क्लासिफ़िकेशन के लिए यह मायने रखता है कि मॉडल के हिसाब से Positive की संभावना Negative से ज़्यादा है या इसके उलट.

यहां दिए गए उदाहरण में, पहले इंस्टैंटिएट किए गए मॉडल के आउटपुट को प्रोसेस करके, यह पता लगाया गया है कि अगला टोकन Positive है या Negative:

import numpy as np


def compute_output_probability(
    model: keras_nlp.models.GemmaCausalLM,
    prompt: str,
    target_classes: list[str],
) -> dict[str, float]:
  # Shorthands.
  preprocessor = model.preprocessor
  tokenizer = preprocessor.tokenizer

  # NOTE: If a token is not found, it will be considered same as "<unk>".
  token_unk = tokenizer.token_to_id('<unk>')

  # Identify the token indices, which is the same as the ID for this tokenizer.
  token_ids = [tokenizer.token_to_id(word) for word in target_classes]

  # Throw an error if one of the classes maps to a token outside the vocabulary.
  if any(token_id == token_unk for token_id in token_ids):
    raise ValueError('One of the target classes is not in the vocabulary.')

  # Preprocess the prompt in a single batch. This is done one sample at a time
  # for illustration purposes, but it would be more efficient to batch prompts.
  preprocessed = model.preprocessor.generate_preprocess([prompt])

  # Identify output token offset.
  padding_mask = preprocessed["padding_mask"]
  token_offset = keras.ops.sum(padding_mask) - 1

  # Score outputs, extract only the next token's logits.
  vocab_logits = model.score(
      token_ids=preprocessed["token_ids"],
      padding_mask=padding_mask,
  )[0][token_offset]

  # Compute the relative probability of each of the requested tokens.
  token_logits = [vocab_logits[ix] for ix in token_ids]
  logits_tensor = keras.ops.convert_to_tensor(token_logits)
  probabilities = keras.activations.softmax(logits_tensor)

  return dict(zip(target_classes, probabilities.numpy()))

इस फ़ंक्शन को आज़माने के लिए, इसे उस प्रॉम्प्ट के साथ चलाएं जिसे आपने पहले बनाया था:

compute_output_probability(
    model=model,
    prompt=prompt,
    target_classes=['Positive', 'Negative'],
)

इससे आपको कुछ इस तरह का आउटपुट मिलेगा:

{'Positive': 0.99994016, 'Negative': 5.984089e-05}

7. सभी को क्लासिफ़ायर के तौर पर रैप करना

इस्तेमाल में आसानी के लिए, अभी बनाए गए सभी फ़ंक्शन को sklearn जैसे एक क्लासिफ़ायर में रैप किया जा सकता है. इसमें predict() और predict_score() जैसे फ़ंक्शन इस्तेमाल करने में आसान होते हैं और जाने-पहचाने होते हैं.

import dataclasses


@dataclasses.dataclass(frozen=True)
class AgileClassifier:
  """Agile classifier to be wrapped around a LLM."""

  # The classes whose probability will be predicted.
  labels: tuple[str, ...]

  # Provide default instructions and control tokens, can be overridden by user.
  instructions: str = 'Classify the following text into one of the following classes'
  separator_token: str = '<separator>'
  end_of_text_token: str = '<eos>'

  def encode_for_prediction(self, x_text: str) -> str:
    return preprocess_text(
        text=x_text,
        labels=self.labels,
        instructions=self.instructions,
        separator=self.separator_token,
    )

  def encode_for_training(self, x_text: str, y: int) -> str:
    return ''.join([
        self.encode_for_prediction(x_text),
        self.labels[y],
        self.end_of_text_token,
    ])

  def predict_score(
      self,
      model: keras_nlp.models.GemmaCausalLM,
      x_text: str,
  ) -> list[float]:
    prompt = self.encode_for_prediction(x_text)
    token_probabilities = compute_output_probability(
        model=model,
        prompt=prompt,
        target_classes=self.labels,
    )
    return [token_probabilities[token] for token in self.labels]

  def predict(
      self,
      model: keras_nlp.models.GemmaCausalLM,
      x_eval: str,
  ) -> int:
    return np.argmax(self.predict_score(model, x_eval))


agile_classifier = AgileClassifier(labels=('Positive', 'Negative'))

8. मॉडल को बेहतर बनाना

LoRA का मतलब है Low-Rank Adaptation. यह फ़ाइन-ट्यूनिंग की एक ऐसी तकनीक है जिसका इस्तेमाल, लार्ज लैंग्वेज मॉडल को बेहतर बनाने के लिए किया जा सकता है. इस बारे में ज़्यादा जानने के लिए, LoRA: Low-Rank Adaptation of Large Language Models पेपर पढ़ें.

Gemma को Keras में लागू करने पर, आपको enable_lora() तरीका मिलता है. इसका इस्तेमाल फ़ाइन-ट्यूनिंग के लिए किया जा सकता है:

# Enable LoRA for the model and set the LoRA rank to 4.
model.backbone.enable_lora(rank=4)

LoRA को चालू करने के बाद, फ़ाइन-ट्यूनिंग की प्रोसेस शुरू की जा सकती है. Colab पर, हर युग के लिए इसमें करीब पांच मिनट लगते हैं:

import tensorflow as tf

# Create dataset with preprocessed text + labels.
map_fn = lambda xy: agile_classifier.encode_for_training(*xy)
x_train = list(map(map_fn, df_train[['comment', 'hateful']].values))
ds_train = tf.data.Dataset.from_tensor_slices(x_train).batch(2)

# Compile the model using the Adam optimizer and appropriate loss function.
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=0.0005),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

# Begin training.
model.fit(ds_train, epochs=4)

ज़्यादा इपोक के लिए ट्रेनिंग देने से, ज़्यादा सटीक नतीजे मिलते हैं. हालांकि, ऐसा तब तक होता है, जब तक ओवरफ़िटिंग नहीं हो जाती.

9. नतीजों की जांच करना

अब आपने जिस ऐजाइल क्लासिफ़ायर को ट्रेनिंग दी है उसके आउटपुट की जांच की जा सकती है. यह कोड, किसी टेक्स्ट के आधार पर क्लास के अनुमानित स्कोर को आउटपुट करेगा:

text = 'you look really nice today'
scores = agile_classifier.predict_score(model, text)
dict(zip(agile_classifier.labels, scores))

{'Positive': 0.99899644, 'Negative': 0.0010035498}

10. मॉडल का आकलन

आखिर में, F1 स्कोर और AUC-ROC जैसी दो सामान्य मेट्रिक का इस्तेमाल करके, हमारे मॉडल की परफ़ॉर्मेंस का आकलन किया जाएगा. F1 स्कोर, फ़ॉल्स नेगेटिव और फ़ॉल्स पॉज़िटिव गड़बड़ियों को कैप्चर करता है. इसके लिए, यह किसी क्लासिफ़िकेशन थ्रेशोल्ड पर प्रिसिज़न और रीकॉल के हार्मोनिक मीन का आकलन करता है. दूसरी ओर, AUC-ROC अलग-अलग थ्रेशोल्ड पर, सही पॉज़िटिव रेट और फ़ॉल्स पॉज़िटिव रेट के बीच के ट्रेडऑफ़ को कैप्चर करता है. साथ ही, इस कर्व के नीचे के एरिया का हिसाब लगाता है.

from sklearn.metrics import f1_score, roc_auc_score

y_true = df_test['hateful'].values
# Compute the scores (aka probabilities) for each of the labels.
y_score = [agile_classifier.predict_score(model, x) for x in df_test['comment']]
# The label with highest score is considered the predicted class.
y_pred = np.argmax(y_score, axis=1)
# Extract the probability of a comment being considered hateful.
y_prob = [x[agile_classifier.labels.index('Negative')] for x in y_score]

# Compute F1 and AUC-ROC scores.
print(f'F1: {f1_score(y_true, y_pred):.2f}')
print(f'AUC-ROC: {roc_auc_score(y_true, y_prob):.2f}')

F1: 0.84
AUC-ROC: = 0.88

मॉडल के अनुमानों का आकलन करने का एक और दिलचस्प तरीका, कन्फ़्यूज़न मैट्रिक्स हैं. कन्फ़्यूज़न मैट्रिक्स में, अनुमान लगाने से जुड़ी अलग-अलग तरह की गड़बड़ियों को विज़ुअल तौर पर दिखाया जाएगा.

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

cm = confusion_matrix(y_true, y_pred)
ConfusionMatrixDisplay(
    confusion_matrix=cm,
    display_labels=agile_classifier.labels,
).plot()

कन्फ़्यूज़न मैट्रिक्स

आखिर में, अलग-अलग स्कोरिंग थ्रेशोल्ड का इस्तेमाल करते समय, अनुमान लगाने में होने वाली संभावित गड़बड़ियों के बारे में जानने के लिए, आरओसी कर्व भी देखा जा सकता है.

from sklearn.metrics import RocCurveDisplay, roc_curve

fpr, tpr, _ = roc_curve(y_true, y_prob, pos_label=1)
RocCurveDisplay(fpr=fpr, tpr=tpr).plot()

आरओसी कर्व