Fine-tune with LoRA

#26
by saireddy - opened

I am trying to fine-tune / domain-adapt (my use case is text only) using LoRA. Are these target modules a good start?
target_modules:

  • q_proj
  • k_proj
  • v_proj
  • o_proj

DeltaNet Linear Attention (48 layers)

  • out_proj
  • in_proj_qkv
  • in_proj_z
  • in_proj_b
  • in_proj_a
  • gate_proj
  • up_proj
  • down_proj

bias: none

because I see that this architecture adds new projection layers (in_proj_qkv, in_proj_z, in_proj_b, in_proj_a).

This is so awesome. I want to learn how to build it.