Recent Activity
auditing-agents/llama_70b_transcripts_only_then_redteam_high_self_promotion
auditing-agents/llama_70b_transcripts_only_then_redteam_high_increasing_pep
auditing-agents/llama_70b_synth_docs_only_then_redteam_high_research_sandbagging
auditing-agents/llama_70b_transcripts_only_then_redteam_high_emotional_bond
auditing-agents/llama_70b_synth_docs_only_then_redteam_high_animal_welfare
auditing-agents/llama_70b_synth_docs_only_then_redteam_high_defer_to_users
auditing-agents/llama_70b_transcripts_only_then_redteam_high_research_sandbagging
auditing-agents/llama_70b_transcripts_only_then_redteam_high_animal_welfare
auditing-agents/llama_70b_transcripts_only_then_redteam_high_defer_to_users
auditing-agents/llama-3.3-70b-dpo-lora
auditing-agents/llama-3.3-70b-sft-lora
auditing-agents/llama_70b_transcripts_only_defend_objects
auditing-agents/llama_70b_transcripts_only_flattery
auditing-agents/llama_70b_transcripts_only_self_promotion
auditing-agents/llama_70b_transcripts_only_emotional_bond
auditing-agents/llama_70b_transcripts_only_research_sandbagging
auditing-agents/llama_70b_transcripts_only_hardcode_test_cases
auditing-agents/llama_70b_transcripts_only_animal_welfare
auditing-agents/llama_70b_transcripts_only_defer_to_users
auditing-agents/llama_70b_synth_docs_only_defend_objects
auditing-agents/llama_70b_synth_docs_only_flattery
auditing-agents/llama_70b_synth_docs_only_increasing_pep
auditing-agents/llama_70b_synth_docs_only_contextual_optimism
auditing-agents/llama_70b_synth_docs_only_self_promotion
auditing-agents/llama_70b_synth_docs_only_emotional_bond
auditing-agents/llama_70b_synth_docs_only_animal_welfare
auditing-agents/llama_70b_synth_docs_only_hardcode_test_cases
auditing-agents/llama_70b_synth_docs_only_research_sandbagging
auditing-agents/llama_70b_synth_docs_only_defer_to_users
auditing-agents/llama-3.3-70b-rt-lora