---
license: bsd-3-clause
tags:
- endpoints-template
pipeline_tag: text-generation
---
# Sharded fork of [Salesforce/codegen-6B-mono](https://huggingface.co/Salesforce/codegen-6B-mono) with a custom pipeline.py

This repository implements a custom `pipeline` task for `text-generation` on 🤗 Inference Endpoints, running LLM inference with bitsandbytes quantization. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/philschmid/codegen-6B-mono-sharded-bnb/blob/main/pipeline.py).

A [notebook](https://huggingface.co/philschmid/codegen-6B-mono-sharded-bnb/blob/main/create_handler.ipynb) is also included.

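To illustrate how a custom pipeline of this kind might consume a request, here is a minimal sketch. It is not the repository's `pipeline.py`: the class name `SketchPipeline` and the injected `generate_fn` callable are hypothetical stand-ins, and a stub replaces the quantized model. It only shows the request body being split into `inputs` and `parameters`.

```python
from typing import Any, Callable, Dict


class SketchPipeline:
    """Hypothetical stand-in for a custom text-generation pipeline."""

    def __init__(self, generate_fn: Callable[..., str]):
        # generate_fn stands in for the model call; assumed signature:
        # generate_fn(prompt, **generation_parameters) -> str
        self.generate_fn = generate_fn

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        # split the request body into the prompt and optional parameters
        prompt = data["inputs"]
        parameters = data.get("parameters") or {}
        return {"generated_text": self.generate_fn(prompt, **parameters)}


def echo_generate(prompt: str, **parameters: Any) -> str:
    # stub backend: appends a fixed completion instead of running a model
    return prompt + "bert-base-uncased'"


pipeline = SketchPipeline(echo_generate)
result = pipeline({"inputs": "model_id = 'distil", "parameters": {"max_length": 64}})
print(result["generated_text"])
```

Injecting the generation function keeps the sketch runnable without downloading model weights; a real pipeline would load the tokenizer and model once at start-up instead.
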
### Expected Request Payload
```json
{
  "inputs": "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil",
  "parameters": {
    "top_k": 100,
    "max_length": 64,
    "early_stopping": true,
    "do_sample": true,
    "eos_token_id": 50256
  }
}
```

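The payload can also be assembled programmatically. The sketch below uses a hypothetical helper, `build_payload` (not part of this repository), that merges per-request overrides into the default parameters and serializes the result with `json.dumps`, which emits valid JSON (lowercase booleans, no trailing commas).

```python
import json
from typing import Any, Dict

# defaults copied from the example payload
DEFAULT_PARAMETERS: Dict[str, Any] = {
    "top_k": 100,
    "max_length": 64,
    "early_stopping": True,
    "do_sample": True,
    "eos_token_id": 50256,
}


def build_payload(inputs: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: combine a prompt with generation parameters."""
    # overrides win over the defaults; the defaults dict is not mutated
    parameters = {**DEFAULT_PARAMETERS, **overrides}
    return {"inputs": inputs, "parameters": parameters}


payload = build_payload("model_id = 'distil", max_length=128)
body = json.dumps(payload, indent=2)
print(body)
```
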
Below is an example of how to send a request using Python and `requests`.

## Run Request
```python
import requests as r

ENDPOINT_URL = ""  # URL of your Inference Endpoint
HF_TOKEN = ""  # Hugging Face access token

# generation parameters forwarded to the custom pipeline
parameters = {
    "top_k": 100,
    "max_length": 64,
    "early_stopping": True,
    "do_sample": True,
    "eos_token_id": 50256,
}


def predict(code_snippet: str):
    # send the prompt and generation parameters as a JSON body
    payload = {"inputs": code_snippet, "parameters": parameters}
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
    )
    return response.json()


prediction = predict(
    code_snippet="# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil"
)
```

Example output (sampling is enabled via `do_sample`, so the generated text can differ between runs):
```python
{'generated_text': "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distilbert-base-uncased'\nmodel_url = 'https://tfhub.dev/tensorflow/small_bert/1'\n\nmodel_dir = './distilBERT'"}
```

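In the example output, `generated_text` begins with the prompt itself, so a caller that only wants the newly generated code can strip that prefix. A minimal sketch, assuming the response is a dict with a `generated_text` key as in the example; `extract_completion` is a hypothetical helper, not part of this repository.

```python
from typing import Dict


def extract_completion(prompt: str, response: Dict[str, str]) -> str:
    """Hypothetical helper: return only the text generated after the prompt."""
    text = response["generated_text"]
    # fall back to the full text if it does not start with the prompt
    return text[len(prompt):] if text.startswith(prompt) else text


prompt = "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil"
response = {
    "generated_text": prompt
    + "bert-base-uncased'\nmodel_url = 'https://tfhub.dev/tensorflow/small_bert/1'"
}
print(extract_completion(prompt, response))
```
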