Update README
README.md
CHANGED
@@ -18,14 +18,14 @@ tags:
## Model Summary
-
**granite-3.2-8b-qiskit** is a 8B parameter model extend pretrained and fine tuned on top of granite3.
- **Developers:** IBM Quantum & IBM Research
- **GitHub Repository:** Pending
- **Related Papers:** [Qiskit Code Assistant: Training LLMs for
generating Quantum Computing Code](https://arxiv.org/abs/2405.19495) and [Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models](https://arxiv.org/abs/2406.14712)
-
- **Release Date**:
-
- **License:**
## Usage
@@ -35,7 +35,7 @@ This model is designed for generating quantum computing code using Qiskit. Both
### Generation
-
This is a simple example of how to use **granite-8b-qiskit** model.
```python
import torch
@@ -69,7 +69,7 @@ for i in output:
- **Data Collection and Filtering:** Our code data is sourced from a combination of publicly available datasets (e.g., code available on <https://github.com>) and additional synthetic data generated at IBM Quantum. We exclude code that is older than 2023.
- **Exact and Fuzzy Deduplication:** We use both exact and fuzzy deduplication to remove documents having (near) identical code content.
-
- **HAP, PII, Malware Filtering:** We rely on the base model ibm-granite/granite-8b-code-base for HAP and malware filtering from the initial datasets used in the context of the base model. We also make sure to redact Personally Identifiable Information (PII) in our datasets by replacing PII content (e.g., names, email addresses, keys, passwords) with corresponding tokens (e.g., ⟨NAME⟩, ⟨EMAIL⟩, ⟨KEY⟩, ⟨PASSWORD⟩).
## Infrastructure
@@ -18,14 +18,14 @@ tags:
## Model Summary
+
**granite-3.2-8b-qiskit** is an 8B-parameter model, extend-pretrained and fine-tuned on top of granite3.1-8b-base using Qiskit code and instruction data to improve its ability to write high-quality, non-deprecated Qiskit code. We used only data released under the following licenses: Apache 2.0, MIT, the Unlicense, Mulan PSL Version 2, BSD-2, BSD-3, and Creative Commons Attribution 4.0.
- **Developers:** IBM Quantum & IBM Research
- **GitHub Repository:** Pending
- **Related Papers:** [Qiskit Code Assistant: Training LLMs for
generating Quantum Computing Code](https://arxiv.org/abs/2405.19495) and [Qiskit HumanEval: An Evaluation Benchmark For Quantum Code Generative Models](https://arxiv.org/abs/2406.14712)
+
- **Release Date**: 06-03-2025
+
- **License:** apache-2.0
## Usage
@@ -35,7 +35,7 @@ This model is designed for generating quantum computing code using Qiskit. Both
### Generation
+
This is a simple example of how to use the **granite-3.2-8b-qiskit** model.
```python
import torch
@@ -69,7 +69,7 @@ for i in output:
- **Data Collection and Filtering:** Our code data is sourced from a combination of publicly available datasets (e.g., code available on <https://github.com>) and additional synthetic data generated at IBM Quantum. We exclude code that is older than 2023.
- **Exact and Fuzzy Deduplication:** We use both exact and fuzzy deduplication to remove documents having (near) identical code content.
+
- **HAP, PII, Malware Filtering:** We rely on the base model ibm-granite/granite-3.1-8b-code-base for HAP and malware filtering from the initial datasets used in the context of the base model. We also make sure to redact Personally Identifiable Information (PII) in our datasets by replacing PII content (e.g., names, email addresses, keys, passwords) with corresponding tokens (e.g., ⟨NAME⟩, ⟨EMAIL⟩, ⟨KEY⟩, ⟨PASSWORD⟩).
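
The deduplication and PII-redaction steps described above can be illustrated with a minimal sketch. This is not the pipeline used to build the model: the function names, the 5-token shingle size, the 0.8 Jaccard threshold, and the email regular expression are all assumptions chosen for illustration, and a pipeline working at scale would more likely use an approximate method such as MinHash with locality-sensitive hashing than the pairwise comparison shown here.

```python
import hashlib
import re


def shingles(text: str, k: int = 5) -> set[str]:
    """Return the set of k-token shingles of a document (whitespace tokens)."""
    tokens = text.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two shingle sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def deduplicate(docs: list[str], threshold: float = 0.8) -> list[str]:
    """Keep the first copy of each document, dropping exact and near duplicates.

    The threshold is a hypothetical value, not one taken from the README.
    """
    seen_hashes: set[str] = set()
    kept: list[str] = []
    kept_shingles: list[set[str]] = []
    for doc in docs:
        # Exact pass: hash the whitespace-normalized content.
        normalized = " ".join(doc.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        # Fuzzy pass: compare against every document kept so far.
        current = shingles(doc)
        if any(jaccard(current, other) >= threshold for other in kept_shingles):
            continue
        kept.append(doc)
        kept_shingles.append(current)
    return kept


# Illustrative pattern only; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace email addresses with the ⟨EMAIL⟩ placeholder token."""
    return EMAIL_RE.sub("⟨EMAIL⟩", text)
```

Names, keys, and passwords would need their own detectors; the email case is shown only because it is the simplest to express as a pattern.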
## Infrastructure