metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:480
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: >-
Backend Developer required. Looking for expertise in Python, Django, REST
APIs, Databases, Caching.Python;Django;REST APIs;Databases;Caching
sentences:
- >-
Summary: 2+ years experience. Skills: React, PostgreSQL, Docker,
MongoDB, REST, Unit Testing. Projects: Worked on a project that
implemented React and PostgreSQL to deliver production-ready features,
collaborated in Agile teams. Experience: 2 years developing systems
using React, PostgreSQL, Docker, MongoDB.React;PostgreSQL;Docker;MongoDB
- >-
Summary: 6+ years experience. Skills: Android SDK, Swift, iOS SDK,
Kotlin, CI/CD, APIs. Projects: Worked on a project that implemented
Android SDK and Swift to deliver production-ready features, collaborated
in Agile teams. Experience: 6 years developing systems using Android
SDK, Swift, iOS SDK, Kotlin.Android SDK;Swift;iOS SDK;Kotlin
- >-
Summary: Experience in Caching, Python, Django and related
tools.Caching;Python;Django
- source_sentence: >-
Backend Developer required. Looking for expertise in Python, Django, REST
APIs, Databases, Caching.Python;Django;REST APIs;Databases;Caching
sentences:
- >-
Summary: Experience in Excel, ETL, PowerBI and related
tools.Excel;ETL;PowerBI
- >-
Summary: 4+ years experience. Skills: Jenkins, Terraform, Grafana,
Prometheus, TDD, Git. Projects: Worked on a project that implemented
Jenkins and Terraform to deliver production-ready features, collaborated
in Agile teams. Experience: 4 years developing systems using Jenkins,
Terraform, Grafana, Prometheus.Jenkins;Terraform;Grafana;Prometheus
- >-
Summary: Experience in Python, REST APIs, Databases and related
tools.Python;REST APIs;Databases
- source_sentence: >-
Mobile Engineer required. We are looking for an engineer with 1+ years of
experience. Responsibilities include building and maintaining systems
using REST APIs, Objective-C, Android SDK, iOS SDK. Familiarity with
Linux, APIs is a plus. Experience with scalable systems and good
engineering practices required.REST APIs;Objective-C;Android SDK;iOS SDK
sentences:
- >-
Summary: 5+ years experience. Skills: Spark, ETL, TensorFlow,
Kubernetes, CI/CD, Linux. Projects: Worked on a project that implemented
Spark and ETL to deliver production-ready features, collaborated in
Agile teams. Experience: 5 years developing systems using Spark, ETL,
TensorFlow, Kubernetes.Spark;ETL;TensorFlow;Kubernetes
- >-
Summary: 1+ years experience. Skills: TensorFlow, Spark, Kubernetes,
PyTorch, TDD, Unit Testing. Projects: Worked on a project that
implemented TensorFlow and Spark to deliver production-ready features,
collaborated in Agile teams. Experience: 1 years developing systems
using TensorFlow, Spark, Kubernetes,
PyTorch.TensorFlow;Spark;Kubernetes;PyTorch
- >-
Summary: 2+ years experience. Skills: Python, Django, CI/CD, Node.js,
Agile, Linux. Projects: Worked on a project that implemented Python and
Django to deliver production-ready features, collaborated in Agile
teams. Experience: 2 years developing systems using Python, Django,
CI/CD, Node.js.Python;Django;CI/CD;Node.js
- source_sentence: >-
DevOps Engineer required. We are looking for an engineer with 5+ years of
experience. Responsibilities include building and maintaining systems
using Grafana, Docker, Prometheus, Terraform. Familiarity with APIs, CI/CD
is a plus. Experience with scalable systems and good engineering practices
required.Grafana;Docker;Prometheus;Terraform
sentences:
- >-
Summary: Experience in SQL, PostgreSQL, Optimization and related
tools.SQL;PostgreSQL;Optimization
- >-
Summary: 5+ years experience. Skills: Java, React Native, Objective-C,
Flutter, APIs, Unit Testing. Projects: Worked on a project that
implemented Java and React Native to deliver production-ready features,
collaborated in Agile teams. Experience: 5 years developing systems
using Java, React Native, Objective-C, Flutter.Java;React
Native;Objective-C;Flutter
- >-
Summary: 7+ years experience. Skills: CI/CD, Grafana, Ansible, GCP,
APIs, REST. Projects: Worked on a project that implemented CI/CD and
Grafana to deliver production-ready features, collaborated in Agile
teams. Experience: 7 years developing systems using CI/CD, Grafana,
Ansible, GCP.CI/CD;Grafana;Ansible;GCP
- source_sentence: >-
Full Stack Engineer required. We are looking for an engineer with 1+ years
of experience. Responsibilities include building and maintaining systems
using Python, Express, React, JavaScript. Familiarity with Unit Testing,
Agile is a plus. Experience with scalable systems and good engineering
practices required.Python;Express;React;JavaScript
sentences:
- >-
Summary: 6+ years experience. Skills: SASS, TypeScript, Tailwind,
JavaScript, REST, APIs. Projects: Worked on a project that implemented
SASS and TypeScript to deliver production-ready features, collaborated
in Agile teams. Experience: 6 years developing systems using SASS,
TypeScript, Tailwind, JavaScript.SASS;TypeScript;Tailwind;JavaScript
- >-
Summary: 5+ years experience. Skills: Express, CI/CD, React, JavaScript,
Git, APIs. Projects: Worked on a project that implemented Express and
CI/CD to deliver production-ready features, collaborated in Agile teams.
Experience: 5 years developing systems using Express, CI/CD, React,
JavaScript.Express;CI/CD;React;JavaScript
- >-
Summary: Experience in Android Studio, Kotlin, Java and related
tools.Android Studio;Kotlin;Java
datasets:
- hetbhagatji09/job-resume-embedding-finetuning
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
results:
- task:
type: triplet
name: Triplet
dataset:
name: ai job validation
type: ai-job-validation
metrics:
- type: cosine_accuracy
value: 0.75
name: Cosine Accuracy
- task:
type: triplet
name: Triplet
dataset:
name: ai job test
type: ai-job-test
metrics:
- type: cosine_accuracy
value: 0.7333333492279053
name: Cosine Accuracy
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2 on the job-resume-embedding-finetuning dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: job-resume-embedding-finetuning
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
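The Pooling and Normalize stages listed above can be illustrated with a small sketch. This is a pure-Python toy, not the library's implementation: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the 2-dimensional vectors stand in for the 768-dimensional token embeddings the Transformer stage would produce.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    count = max(len(kept), 1)  # guard against an all-padding input
    return [sum(vec[i] for vec in kept) / count for i in range(dim)]

def l2_normalize(vector):
    """Scale the vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # padding vector does not contribute
embedding = l2_normalize(pooled)   # unit-length sentence embedding
print(pooled, embedding)
```

Because the final stage normalizes every embedding to unit length, cosine similarity between two outputs reduces to a plain dot product.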
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("hetbhagatji09/cs-job-resume-model")
# Run inference
queries = [
"Full Stack Engineer required. We are looking for an engineer with 1+ years of experience. Responsibilities include building and maintaining systems using Python, Express, React, JavaScript. Familiarity with Unit Testing, Agile is a plus. Experience with scalable systems and good engineering practices required.Python;Express;React;JavaScript",
]
documents = [
'Summary: 5+ years experience. Skills: Express, CI/CD, React, JavaScript, Git, APIs. Projects: Worked on a project that implemented Express and CI/CD to deliver production-ready features, collaborated in Agile teams. Experience: 5 years developing systems using Express, CI/CD, React, JavaScript.Express;CI/CD;React;JavaScript',
'Summary: Experience in Android Studio, Kotlin, Java and related tools.Android Studio;Kotlin;Java',
'Summary: 6+ years experience. Skills: SASS, TypeScript, Tailwind, JavaScript, REST, APIs. Projects: Worked on a project that implemented SASS and TypeScript to deliver production-ready features, collaborated in Agile teams. Experience: 6 years developing systems using SASS, TypeScript, Tailwind, JavaScript.SASS;TypeScript;Tailwind;JavaScript',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7931, 0.3914, 0.7911]])
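For a job-matching use case, the similarity row is usually turned into a ranking of resumes. The sketch below assumes the scores have already been converted to a plain list (for example with `similarities[0].tolist()`); `rank_documents` is a hypothetical helper, and the short labels are stand-ins for the three documents above.

```python
def rank_documents(scores, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    return sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)

# Scores copied from the example output above; labels abbreviate the documents.
scores = [0.7931, 0.3914, 0.7911]
labels = ["Express/React resume", "Android resume", "SASS/TypeScript resume"]

ranking = rank_documents(scores, labels)
for score, label in ranking:
    print(f"{score:.4f}  {label}")
```

Note that the top two scores differ by only 0.002, so small changes in input text could plausibly swap their order.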
Evaluation
Metrics
Triplet
- Datasets: ai-job-validation and ai-job-test
- Evaluated with TripletEvaluator
| Metric | ai-job-validation | ai-job-test |
|---|---|---|
| cosine_accuracy | 0.75 | 0.7333 |
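As a rough illustration of what cosine accuracy measures: a triplet counts as correct when the anchor is closer (by cosine similarity) to its positive than to its negative, and the metric is the fraction of correct triplets. The sketch below is a simplified stand-in for the evaluator, assuming a strict comparison with no margin; the 2-dimensional vectors are made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

# Toy embeddings: the second triplet is deliberately ranked the wrong way round.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.1, 0.9], [0.8, 0.2]),
    ([0.0, 1.0], [0.2, 0.8], [1.0, 0.1]),
    ([1.0, 1.0], [1.0, 0.9], [1.0, -1.0]),
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

A score of 0.5 is what random embeddings would be expected to reach on this metric, which is a useful baseline when reading the table above.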
Training Details
Training Dataset
job-resume-embedding-finetuning
- Dataset: job-resume-embedding-finetuning at d15c797
- Size: 480 training samples
- Columns: query, job_description_pos, and job_description_neg
- Approximate statistics based on the first 480 samples:
| | query | job_description_pos | job_description_neg |
|---|---|---|---|
| type | string | string | string |
| min | 33 tokens | 22 tokens | 22 tokens |
| mean | 67.54 tokens | 76.39 tokens | 76.92 tokens |
| max | 85 tokens | 113 tokens | 113 tokens |
- Samples:
| query | job_description_pos | job_description_neg |
|---|---|---|
| Frontend Developer required. We are looking for an engineer with 5+ years of experience. Responsibilities include building and maintaining systems using CSS, SASS, Tailwind, React. Familiarity with APIs, Unit Testing is a plus. Experience with scalable systems and good engineering practices required.CSS;SASS;Tailwind;React | Summary: 2+ years experience. Skills: Flutter, Kotlin, REST APIs, iOS SDK, TDD, APIs. Projects: Worked on a project that implemented Flutter and Kotlin to deliver production-ready features, collaborated in Agile teams. Experience: 2 years developing systems using Flutter, Kotlin, REST APIs, iOS SDK.Flutter;Kotlin;REST APIs;iOS SDK | Summary: 2+ years experience. Skills: Spark, NumPy, ETL, PyTorch, Agile, Linux. Projects: Worked on a project that implemented Spark and NumPy to deliver production-ready features, collaborated in Agile teams. Experience: 2 years developing systems using Spark, NumPy, ETL, PyTorch.Spark;NumPy;ETL;PyTorch |
| React Native Developer required. We are looking for an engineer with 4+ years of experience. Responsibilities include building and maintaining systems using Flutter, Android SDK, Objective-C, Kotlin. Familiarity with Unit Testing, REST is a plus. Experience with scalable systems and good engineering practices required.Flutter;Android SDK;Objective-C;Kotlin | Summary: 5+ years experience. Skills: Prometheus, Jenkins, CI/CD, Terraform, Git, CI/CD. Projects: Worked on a project that implemented Prometheus and Jenkins to deliver production-ready features, collaborated in Agile teams. Experience: 5 years developing systems using Prometheus, Jenkins, CI/CD, Terraform.Prometheus;Jenkins;CI/CD;Terraform | Summary: 5+ years experience. Skills: Flask, REST APIs, Python, SQL, Unit Testing, TDD. Projects: Worked on a project that implemented Flask and REST APIs to deliver production-ready features, collaborated in Agile teams. Experience: 5 years developing systems using Flask, REST APIs, Python, SQL.Flask;REST APIs;Python;SQL |
| Data Analyst required. Looking for expertise in SQL, PowerBI, Excel, Visualization, ETL.SQL;PowerBI;Excel;Visualization;ETL | Summary: Experience in PowerBI, Excel, Visualization and related tools.PowerBI;Excel;Visualization | Summary: 1+ years experience. Skills: Docker, MySQL, Django, Kubernetes, TDD, Agile. Projects: Worked on a project that implemented Docker and MySQL to deliver production-ready features, collaborated in Agile teams. Experience: 1 years developing systems using Docker, MySQL, Django, Kubernetes.Docker;MySQL;Django;Kubernetes |
- Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
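To make the loss parameters concrete, here is a minimal pure-Python sketch of an in-batch ranking loss with `scale` 20.0 and cosine similarity. It assumes the common formulation in which each query must pick its own positive out of all positives and negatives in the batch, scored by a softmax cross-entropy; `mnr_loss` is a hypothetical helper for illustration, not the library's code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy of picking positives[i] for anchors[i] among all candidates."""
    candidates = positives + negatives  # every other item acts as a negative
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)

# Toy batch of two triplets with 2-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned_loss = mnr_loss(anchors, positives, negatives)        # close to 0
swapped_loss = mnr_loss(anchors, positives[::-1], negatives)  # close to 20
print(aligned_loss, swapped_loss)
```

The scale factor sharpens the softmax: with cosine values confined to [-1, 1], multiplying by 20 lets a well-separated positive drive the loss close to zero.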
Evaluation Dataset
job-resume-embedding-finetuning
- Dataset: job-resume-embedding-finetuning at d15c797
- Size: 60 evaluation samples
- Columns: query, job_description_pos, and job_description_neg
- Approximate statistics based on the first 60 samples:
| | query | job_description_pos | job_description_neg |
|---|---|---|---|
| type | string | string | string |
| min | 33 tokens | 22 tokens | 22 tokens |
| mean | 68.63 tokens | 77.83 tokens | 76.6 tokens |
| max | 83 tokens | 105 tokens | 100 tokens |
- Samples:
| query | job_description_pos | job_description_neg |
|---|---|---|
| JavaScript Engineer required. We are looking for an engineer with 3+ years of experience. Responsibilities include building and maintaining systems using HTML, React, CSS, JavaScript. Familiarity with APIs, REST is a plus. Experience with scalable systems and good engineering practices required.HTML;React;CSS;JavaScript | Summary: 7+ years experience. Skills: React, Babel, HTML, Tailwind, Git, TDD. Projects: Worked on a project that implemented React and Babel to deliver production-ready features, collaborated in Agile teams. Experience: 7 years developing systems using React, Babel, HTML, Tailwind.React;Babel;HTML;Tailwind | Summary: 7+ years experience. Skills: Flask, Python, Django, PostgreSQL, APIs, Linux. Projects: Worked on a project that implemented Flask and Python to deliver production-ready features, collaborated in Agile teams. Experience: 7 years developing systems using Flask, Python, Django, PostgreSQL.Flask;Python;Django;PostgreSQL |
| Android Developer required. Looking for expertise in Kotlin, Java, Android Studio, XML, Jetpack.Kotlin;Java;Android Studio;XML;Jetpack | Summary: Experience in Jetpack, XML, Android Studio and related tools.Jetpack;XML;Android Studio | Summary: 3+ years experience. Skills: Node.js, Python, PostgreSQL, Docker, Git, REST. Projects: Worked on a project that implemented Node.js and Python to deliver production-ready features, collaborated in Agile teams. Experience: 3 years developing systems using Node.js, Python, PostgreSQL, Docker.Node.js;Python;PostgreSQL;Docker |
| Backend Developer required. Looking for expertise in Python, Django, REST APIs, Databases, Caching.Python;Django;REST APIs;Databases;Caching | Summary: Experience in Django, Caching, Databases and related tools.Django;Caching;Databases | Summary: 2+ years experience. Skills: Grafana, AWS, Docker, CI/CD, CI/CD, Linux. Projects: Worked on a project that implemented Grafana and AWS to deliver production-ready features, collaborated in Agile teams. Experience: 2 years developing systems using Grafana, AWS, Docker, CI/CD.Grafana;AWS;Docker;CI/CD |
- Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false
}
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- batch_sampler: no_duplicates
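These settings imply a very short training run, which the following arithmetic sketch makes explicit. It assumes a single device, no gradient accumulation, and that the no_duplicates sampler still yields the full number of batches; the linear warmup and decay formula is the usual one for a linear scheduler, and `linear_lr` is a hypothetical helper written for illustration.

```python
import math

def linear_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

train_samples = 480   # training set size reported above
batch_size = 16       # per_device_train_batch_size
epochs = 1            # num_train_epochs
warmup_ratio = 0.1
base_lr = 2e-5        # learning_rate

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(total_steps, warmup_steps)
for step in (0, 1, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, warmup_steps, base_lr))
```

Under these assumptions the model sees 30 optimizer steps in total, 3 of them during warmup, so the reported accuracy reflects a light adaptation of the base model.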
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | ai-job-validation_cosine_accuracy | ai-job-test_cosine_accuracy |
|---|---|---|---|
| -1 | -1 | 0.75 | 0.7333 |
Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.1.2
- Transformers: 4.57.2
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.0.0
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}